I have the following HTML structure:

<div class="b-player-skin__header-inner">
  <span>007: СПЕКТР</span>
  <div class="b-player-skin__header-origin">Spectre</div>
</div>
</div>
<div class="b-player-skin-left">
  <div class="b-player-skin__poster-wrap">
    <a class="b-player-skin__poster-image" indx="0" rel="#b-screenshots-gallery" href="#">
      <img src="http://img.dotua.org/fsua_items/cover/00/38/41/1/00384173.jpg" >
    </a>
  </div>
</div>
</div>

I need to get information about the film: the Russian title, the English title, and a link to the image. Is it possible to extract text from a specific class within a specific tag? For example, only from a div tag with a given class, and only this title rather than everything.

My code is titleRus = doc.select("div").select("span").text(); and it returns the text of every span. Is there something like titleRus = doc.select("div" with class="b-player-skin__header-inner" ).select("span").text();

That is, the div tag has class="b-player-skin__header-inner" and I need to get the data from the span tag nested inside it.

I may not have phrased the question quite right, but I think the gist is clear. Thanks in advance, and happy upcoming New Year!!!)))

  • Try something like doc.getElementsByClass("b-player-skin__header-inner").text(). Or step through it in the debugger and dig in that direction. Happy New Year) - Android Android

3 answers

Is it possible to extract text from a specific class in a specific tag?

If I understand you correctly, you can do something like this:

 ArrayList<String> foundText = new ArrayList<>();
 Elements els = doc.getElementsByTag("REQUIRED_TAG_E.G._DIV");
 for (Element el : els) {
     // first element with the required class, in or under this tag
     Element elWithClass = el.getElementsByClass("REQUIRED_CLASS").first();
     if (elWithClass != null) {
         foundText.add(elWithClass.text());
     }
 }

You might also find methods like this one useful:

getElementsByAttributeValue(String key, String value)
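
As a rough, self-contained sketch of the tag-plus-class idea above (not the answerer's exact code): the class name TagClassFilterSketch and the helpers textsIn and textsByAttr are made up for illustration, and the jsoup library is assumed to be on the classpath. The second helper uses getElementsByAttributeValue, which, as far as I know, compares the whole attribute value, so it would not match an element that carries several classes.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class TagClassFilterSketch {

    // Hypothetical helper: text of every <childTag> found under a <tag>
    // element that carries the given class.
    static List<String> textsIn(String html, String tag, String cssClass, String childTag) {
        Document doc = Jsoup.parse(html);
        List<String> found = new ArrayList<>();
        for (Element el : doc.getElementsByTag(tag)) {
            if (!el.hasClass(cssClass)) {
                continue; // right tag, wrong class
            }
            // getElementsByTag searches this element and everything under it
            for (Element child : el.getElementsByTag(childTag)) {
                found.add(child.text());
            }
        }
        return found;
    }

    // Hypothetical helper: same idea, but starting from the attribute value.
    // Assumption: the class attribute holds exactly one class name.
    static List<String> textsByAttr(String html, String cssClass, String childTag) {
        Document doc = Jsoup.parse(html);
        List<String> found = new ArrayList<>();
        for (Element el : doc.getElementsByAttributeValue("class", cssClass)) {
            for (Element child : el.getElementsByTag(childTag)) {
                found.add(child.text());
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String html = "<div class=\"b-player-skin__header-inner\">"
                + "<span>007: СПЕКТР</span>"
                + "<div class=\"b-player-skin__header-origin\">Spectre</div>"
                + "</div>";
        System.out.println(textsIn(html, "div", "b-player-skin__header-inner", "span"));
        System.out.println(textsByAttr(html, "b-player-skin__header-inner", "span"));
    }
}
```

Filtering with hasClass keeps the check on the div itself, which is what the question asks for, rather than on whatever happens to be nested inside it.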

  • You can't call getElementsByClass after getElementsByTag: it's a compilation error. - Dmitry Alexandrov
  • @DmitryAleksandrov, yes, indeed. These are methods of Element, not of Elements. Then everything is somewhat more complicated; see the updated answer. - Yuriy SPb

Jsoup understands CSS selectors, so you can state exactly what you need directly in the query:

 Document doc = Jsoup.parse( html );
 // take all span nodes that are direct children of a div with class b-player...
 // and get their combined text (the span we need is the only one)
 String name = doc.select( "div.b-player-skin__header-inner > span" ).text();
 System.out.println( "selector: " + name );
 // alternatively, pick the node you need and work through it by hand:
 // take all divs with class b-player..., then take the first of them
 Element div = doc.select( "div.b-player-skin__header-inner" ).get( 0 );
 // print each child node
 for ( Element e : div.children() ) {
     System.out.println( ">>> " + e );
 }
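
The same selector approach can be stretched to all three values from the question (Russian title, original title, poster link). This is only a sketch under a few assumptions: the HTML is the fragment from the question, trimmed and hard-coded; the names FilmInfoSketch, firstText, firstAttr and extract are invented for the example; and jsoup is on the classpath. It uses first() rather than get( 0 ), since first() returns null on an empty result while get( 0 ) throws.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class FilmInfoSketch {

    // Fragment from the question, trimmed to the nodes of interest.
    static final String HTML =
            "<div class=\"b-player-skin__header-inner\">"
            + "<span>007: СПЕКТР</span>"
            + "<div class=\"b-player-skin__header-origin\">Spectre</div>"
            + "</div>"
            + "<div class=\"b-player-skin__poster-wrap\">"
            + "<a class=\"b-player-skin__poster-image\" href=\"#\">"
            + "<img src=\"http://img.dotua.org/fsua_items/cover/00/38/41/1/00384173.jpg\">"
            + "</a></div>";

    // Text of the first element matching the selector, or "" if none matches.
    static String firstText(Document doc, String css) {
        Element el = doc.select(css).first();
        return el == null ? "" : el.text();
    }

    // Attribute of the first element matching the selector, or "" if none matches.
    static String firstAttr(Document doc, String css, String attr) {
        Element el = doc.select(css).first();
        return el == null ? "" : el.attr(attr);
    }

    // Returns { Russian title, original title, poster URL }.
    static String[] extract(String html) {
        Document doc = Jsoup.parse(html);
        return new String[] {
            firstText(doc, "div.b-player-skin__header-inner > span"),
            firstText(doc, "div.b-player-skin__header-origin"),
            firstAttr(doc, "a.b-player-skin__poster-image > img", "src")
        };
    }

    public static void main(String[] args) {
        String[] info = extract(HTML);
        System.out.println("titleRus:  " + info[0]);
        System.out.println("titleOrig: " + info[1]);
        System.out.println("poster:    " + info[2]);
    }
}
```

If a selector comes back empty against the live page, it may be worth printing doc.html() to check that the downloaded markup actually contains these classes.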
  • The first variant, String name = doc.select( "div.b-player-skin__header-inner > span" ).text(); System.out.println( "selector: " + name );, printed an empty line, and on the second, under the debugger, Element div = doc.select( "div.b-player-skin__header-inner" ).get( 0 ); for ( Element e : div.children() ) { System.out.println( ">>> " + e ); }, the application crashed (( - Dmitry Alexandrov

You can add an empty class just for the span (a plain class name with no styles attached).
For example: <span class="nujnoe_mne">123</span>.

Then select by that class:

 titleRus = doc.select(".nujnoe_mne").text();

This way, only the spans that have this class are selected.