I have never dealt with parsing websites before, but now I need to parse a VK page. I wrote the code below, which, as I understand it, should print Online. But it prints nothing at all. What is the problem? There are no errors.

    public static void main(String[] args) throws IOException {
        Document doc = Jsoup.connect("https://vk.com/id94283688").get();
        List<String> strings = new ArrayList<>();
        Elements h4Elements = doc.getElementsByAttributeValue("class", "profile_online");
        h4Elements.forEach(h4Element -> {
            Element element = h4Element.child(0);
            strings.add(element.text());
        });
        for (String s : strings) {
            System.out.println(s);
        }
    }

Here is this element:

 <h4 class="profile_online"><div id="profile_online_lv">Online<b class="mob_onl profile_mob_onl unshown" id="profile_mobile_online" onmouseover="mobileOnlineTip(this, {mid: cur.oid, right: 1})" onclick="mobilePromo(); "></b></div> </h4> 

PS: I am new and do not have much knowledge in programming

  • Have you tried debugging? What is in doc, and what is in h4Elements? Obviously, if the code prints nothing, then h4Elements contains no elements. - andreycha
  • In general, it is more correct to use the VK API than to download the page. - andreycha
  • I did not find documentation for it, only for Android. - Nick
  • VK API: vk.com/dev/users.get, field online. - Roman
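
For reference, the users.get call mentioned in the comments can be sketched as a plain HTTPS request URL. This is a minimal sketch, not a verified client: the host api.vk.com/method, the parameter names user_ids, fields, access_token and v, and the version string in the usage example reflect my understanding of the VK API and should be checked against its documentation. The class name `VkUsersGetUrl` and the token value are hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class VkUsersGetUrl {
    // Builds a users.get request URL that asks for the "online" field.
    // Endpoint and parameter names are assumptions to verify against the VK API docs.
    static String build(String userId, String accessToken, String apiVersion) {
        return "https://api.vk.com/method/users.get"
                + "?user_ids=" + enc(userId)
                + "&fields=online"
                + "&access_token=" + enc(accessToken)
                + "&v=" + enc(apiVersion);
    }

    private static String enc(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "YOUR_TOKEN" is a placeholder; a real access token is required.
        System.out.println(build("94283688", "YOUR_TOKEN", "5.131"));
    }
}
```

The response is JSON, so it would be read with a JSON parser instead of Jsoup.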

2 answers

The problem is that your h4Elements is empty.

To get what you need, you can use, for example, the selector div#profile_online_lv.
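
As a minimal sketch of that selector, the snippet below parses the HTML fragment from the question with Jsoup.parse and queries it with a CSS selector. The class name `SelectorSketch` and the method `onlineText` are hypothetical. It only shows that the selector matches that fragment; it says nothing about what VK actually returns to Jsoup.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class SelectorSketch {
    // Returns the text of div#profile_online_lv, or null when the element is absent.
    static String onlineText(String html) {
        Document doc = Jsoup.parse(html);
        Element status = doc.selectFirst("div#profile_online_lv");
        return status == null ? null : status.text();
    }

    public static void main(String[] args) {
        // The element quoted in the question (event handlers omitted).
        String fragment = "<h4 class=\"profile_online\"><div id=\"profile_online_lv\">Online"
                + "<b class=\"mob_onl profile_mob_onl unshown\" id=\"profile_mobile_online\"></b>"
                + "</div> </h4>";
        System.out.println(onlineText(fragment));
        // A page without the element yields null instead of an exception.
        System.out.println(onlineText("<p>Log in to continue</p>"));
    }
}
```

Checking for null avoids the exception you get from calling get(0) on an empty result.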

Upd. Look at what you have in doc. Then set a proper User-Agent and look again at what is in doc.
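
A minimal sketch of setting the User-Agent with Jsoup might look like this. The class name `UserAgentSketch` and the helper `browserLike` are hypothetical, and the User-Agent string is just an example browser value. Whether VK then serves the full profile markup is an assumption you need to verify by inspecting doc.

```java
import java.io.IOException;

import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class UserAgentSketch {
    // Example browser User-Agent; any current browser string could be used instead.
    static final String USER_AGENT =
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
            + "(KHTML, like Gecko) Chrome/47.0.2526.80 Safari/537.36";

    // Prepares the connection without sending the request yet.
    static Connection browserLike(String url) {
        return Jsoup.connect(url).userAgent(USER_AGENT);
    }

    public static void main(String[] args) throws IOException {
        Document doc = browserLike("https://vk.com/id94283688").get();
        // Inspect what the server actually returned before relying on selectors.
        System.out.println(doc.title());
        System.out.println(doc.select("h4.profile_online").size());
    }
}
```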

Upd 1. You can use the method:

    static boolean isOnline(String url) throws IOException {
        Document doc = Jsoup.connect(url).get();
        // True if the word "Online" appears anywhere in the page text
        return doc.text().contains("Online");
    }

But note that there are pages whose contents are not available without authorization. For such pages you will get false, although in reality the user may be online.

This is just an example for you, not a very good one, but still.
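
One way to make the check less fragile is to look only inside the status element and to report "unknown" when that element is missing, instead of treating a missing element as offline. The sketch below is a rough illustration under assumptions: it assumes the status always lives in h4.profile_online (as in the question's markup) and that a page without it is unavailable or uses different markup. The class `OnlineStatusSketch`, the `Status` enum, and the "last seen" sample text are hypothetical.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class OnlineStatusSketch {
    enum Status { ONLINE, NOT_ONLINE, UNKNOWN }

    static Status statusOf(String html) {
        Document doc = Jsoup.parse(html);
        Element block = doc.selectFirst("h4.profile_online");
        if (block == null) {
            // No status block: login page, hidden profile, or changed markup.
            return Status.UNKNOWN;
        }
        // Only the status block is searched, not the whole page text.
        return block.text().contains("Online") ? Status.ONLINE : Status.NOT_ONLINE;
    }

    public static void main(String[] args) {
        // "Online" elsewhere on the page no longer counts as a match.
        System.out.println(statusOf("<p>Online games</p>"));
        System.out.println(statusOf(
                "<h4 class=\"profile_online\"><div id=\"profile_online_lv\">Online</div></h4>"));
        // Hypothetical markup for a user who is not online.
        System.out.println(statusOf("<h4 class=\"profile_online\">last seen yesterday</h4>"));
    }
}
```

The HTML would still be fetched with Jsoup.connect(url); parsing from a string here just keeps the logic easy to try without a network request.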

  • I tried this: Element element2 = doc.getElementsByAttributeValue("id", "profile_online_lv").get(0); strings.add(element2.text()); It throws an NPE. - Nick
  • What is a user agent? - Nick
  • @Nick, surely I do not need to quote the first link from the search results for you? Read up on what a user agent is, and then look at the Jsoup userAgent(...) method. I do not claim this is the only right solution, but it works. - s8am
  • That is, I need to write it like this? Document doc = Jsoup.connect("vk.com/zhenya_silver").userAgent("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.80 Safari/537.36").get(); - Nick
  • @Nick, well, yes, something like that. - s8am

I needed to check whether the person was online. I changed the code to this:

    Document doc = Jsoup.connect("https://vk.com/id94283688").get();
    Element element2 = doc.getAllElements().get(0);
    if (element2.text().contains("Online")) {
        System.out.println("Online");
    } else {
        System.out.println("No");
    }

It seems everything works!

  • The key word is "seems". Try checking other pages. - s8am
  • It prints No for some of them, but I doubt that it works correctly... How can I make it more accurate? - Nick
  • Your doubts are well founded. Read my post. Especially about the user agent. - s8am