org.w3c.dom.Document doc = builder.parse(new ByteArrayInputStream(result.getBytes("UTF-8"))); // ByteArrayInputStream is needed because builder.parse(String) treats the string as a URI; see https://stackoverflow.com/questions/1706493/java-net-malformedurlexception-no-protocol

This is how I hand the document over for parsing. The problem is that the document is not valid, and it is so invalid that it cannot be repaired. Of course I tried to fix it, but it is hopeless. The document itself

I decided to change the parser. I want a parser that behaves like a browser: the markup is not valid? Well, whatever, parse it anyway!

So the actual question is: can you recommend such a lenient parser?

  • The current link to the document does not work :))) - ProkletyiPirat
  • Yes, the link was broken. Fixed - kandi
  • I'm probably not thinking clearly this late in the evening, but what is wrong with the document? - ProkletyiPirat 6:59 pm
  • If you looked at it through Inspect in Chrome, you won't see anything wrong, because Chrome fixes everything itself :). Run the document through a validator instead (I can't give a link because of how the validator works). First it complains about the doctype. If you add a doctype, it complains about the rest of the junk - kandi

1 answer

Have you tried jsoup? http://jsoup.org/

PS: Just the other day I was fighting with some "sort of XML" using plain StAX, with a preprocessing step that repairs the malformed source before parsing.
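
To illustrate the "preprocess, then StAX" idea from the PS, here is a minimal sketch using only the JDK. It is not the code I actually used: the class name LenientStaxSketch, the helper names preprocess and elementTexts, and the sample input are all made up for the example, and it assumes the only defect in the source is bare "&" characters. Real-world broken markup usually needs more repairs than this.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch: repair one kind of malformation, then parse with StAX.
public class LenientStaxSketch {

    // Matches an '&' that does not start one of XML's five predefined
    // entities or a numeric character reference.
    private static final Pattern BARE_AMP =
            Pattern.compile("&(?!(?:amp|lt|gt|quot|apos|#[0-9]+|#x[0-9A-Fa-f]+);)");

    // Preprocessing step (assumption: bare ampersands are the only defect).
    public static String preprocess(String raw) {
        return BARE_AMP.matcher(raw).replaceAll("&amp;");
    }

    // Collects the text content of every element with the given local name.
    // Assumes those elements contain text only, as getElementText() requires.
    public static List<String> elementTexts(String xml, String name)
            throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        List<String> texts = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && name.equals(reader.getLocalName())) {
                    texts.add(reader.getElementText());
                }
            }
        } finally {
            reader.close();
        }
        return texts;
    }

    public static void main(String[] args) throws XMLStreamException {
        // The first item has a bare '&', which a strict XML parser rejects.
        String raw = "<items><item>Tom & Jerry</item><item>a &lt; b</item></items>";
        System.out.println(elementTexts(preprocess(raw), "item"));
    }
}
```

Without the preprocess call the StAX reader fails on the bare "&" in the first item. This approach only pays off when the damage is predictable; for arbitrary browser-grade tag soup a lenient HTML parser such as jsoup is the more practical route.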