Hey. Please help me figure out how to extract the text Облачно, дымка from the <td>Облачно, дымка</td> cell in the code below, that is, without the surrounding <td> tags:

 <dl class="cloudness"> <dt class="png" title="Облачно, дымка" style="background-image: url(http://s2.gismeteo.ua/static/images/icons/new/n.moon.c3.png)"><br /></dt> <dd><table><tr><td>Облачно, дымка</td></tr></table></dd> </dl> 

    1 answer

    For example, using jsoup:

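     import org.jsoup.Jsoup;
     import org.jsoup.nodes.Document;
     import org.jsoup.select.Elements;

     String html = "<dl class=\"cloudness\"><dt class=\"png\" title=\"Облачно, дымка\" style=\"background-image: url(http://s2.gismeteo.ua/static/images/icons/new/n.moon.c3.png)\"><br /></dt><dd><table><tr><td>Облачно, дымка</td></tr></table></dd></dl>";
     Document doc = Jsoup.parse(html);             // parse the markup into a DOM
     Elements columns = doc.select("td");          // select every <td> element
     String desiredText = columns.first().text();  // text of the first <td>, tags stripped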
    • Ha ha, thank you, what a fool I am) I was already parsing with it myself :) Many thanks :) Just in case, I am still interested in other options. - djbolya
    • @djbolya: In fact, there are no other options. Either way, you take an HTML parser, parse the document, and then pick what you need from the resulting object model (or react to SAX events, depending on the parser). Regular expressions are a sure way to get bruised; do not use them to parse complex languages such as HTML. - VladD
    • Thank you very much - djbolya
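
To illustrate the "react to SAX events" variant mentioned in the comments, here is a minimal sketch that uses the SAX parser bundled with the JDK. It relies on a loud assumption: the snippet from the question happens to be well-formed XML (quoted attributes, self-closed <br />), which real-world HTML usually is not, so an HTML-aware parser such as jsoup is still the safer choice for actual pages. The class name TdTextSax and the helper firstTdText are hypothetical names introduced only for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class TdTextSax {

    // Collects the character data of the first <td> element seen in the stream.
    private static final class FirstTdHandler extends DefaultHandler {
        final StringBuilder text = new StringBuilder();
        boolean insideTd = false; // currently between <td> and </td>
        boolean done = false;     // the first <td> has been fully read

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            if (!done && "td".equals(qName)) {
                insideTd = true;
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // SAX may deliver one text node in several chunks, so accumulate them.
            if (insideTd) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (insideTd && "td".equals(qName)) {
                insideTd = false;
                done = true;
            }
        }
    }

    // Hypothetical helper: returns the text of the first <td>, or null if there is none.
    // Assumes the markup is well-formed XML; throws a SAXException otherwise.
    static String firstTdText(String markup) throws Exception {
        FirstTdHandler handler = new FirstTdHandler();
        SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new InputSource(new StringReader(markup)), handler);
        return handler.done ? handler.text.toString() : null;
    }

    public static void main(String[] args) throws Exception {
        String html = "<dl class=\"cloudness\"><dt class=\"png\" title=\"Облачно, дымка\" "
                + "style=\"background-image: url(http://s2.gismeteo.ua/static/images/icons/new/n.moon.c3.png)\">"
                + "<br /></dt><dd><table><tr><td>Облачно, дымка</td></tr></table></dd></dl>";
        System.out.println(firstTdText(html));
    }
}
```

Compared with the jsoup answer, the event-based version never builds a document tree, which can matter for very large inputs, but it needs explicit state tracking and fails on markup that is not well-formed.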