I am writing a simple parser.
Depending on the site, I get an error. If the site uses https://, everything seems fine. If it uses http://, I get errors like this:

    Exception in thread "main" java.net.SocketTimeoutException: connect timed out
        at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
        at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
        at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
        at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
        at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
        at java.net.PlainSocketImpl.connect(Unknown Source)
        at java.net.SocksSocketImpl.connect(Unknown Source)
        at java.net.Socket.connect(Unknown Source)
        at sun.security.ssl.SSLSocketImpl.connect(Unknown Source)
        at sun.net.NetworkClient.doConnect(Unknown Source)
        at sun.net.www.http.HttpClient.openServer(Unknown Source)
        at sun.net.www.http.HttpClient.openServer(Unknown Source)
        at sun.net.www.protocol.https.HttpsClient.<init>(Unknown Source)
        at sun.net.www.protocol.https.HttpsClient.New(Unknown Source)
        at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(Unknown Source)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(Unknown Source)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect(Unknown Source)
        at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(Unknown Source)
        at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(Unknown Source)
        at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:571)
        at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:548)
        at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:235)
        at org.jsoup.helper.HttpConnection.get(HttpConnection.java:224)
        at parser1.main(parser1.java:14)

The code itself:

    import org.jsoup.Jsoup;
    import org.jsoup.helper.Validate;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;
    import java.io.IOException;

    /**
     * Example program to list links from a URL.
     */
    public class parser1 {
        public static void main(String[] args) throws Exception {
            Document doc = Jsoup.connect("https://tinko.ru/").get();
            System.out.print(doc.select(".emet_index").text());
            Elements els = doc.select("a");
            for (Element link : els) {
                System.out.println(link.text());
            }
        }
    }

  • Judging by the error, the page fails to load within the timeout (usually 30 or 60 seconds). - pavel
  • Check that the site opens at all - Chubatiy
  • Also try a longer timeout: Jsoup.connect("https://tinko.ru/").timeout(100*1000).get(); - Senior Pomidor
  • This site is not responding at all - Senior Pomidor
  • The correct address is tinko.ru, but it also gives an error. And in the browser it opens normally. The timeout did not help either - Michael

3 answers

The site without www redirects to the version with www. Try this:

 Document doc = Jsoup.connect("https://tinko.ru/").followRedirects(true).get(); 
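To see whether a redirect is really involved, one option is to ask the server without following it. As far as I know, jsoup already follows redirects by default, so `followRedirects(true)` may not change anything on its own. The sketch below uses only `java.net` from the JDK; the class `RedirectProbe` and its methods are hypothetical names for illustration, not part of jsoup, and the 10-second timeout is an arbitrary assumption.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper (not part of jsoup): reports where a URL redirects to.
class RedirectProbe {

    /** True for any 3xx status code. */
    static boolean isRedirect(int status) {
        return status >= 300 && status < 400;
    }

    /** Returns the Location header if the server answers with a 3xx status, otherwise null. */
    static String redirectTarget(String address, int timeoutMillis) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(address).toURL().openConnection();
        conn.setInstanceFollowRedirects(false); // look at the redirect instead of following it
        conn.setConnectTimeout(timeoutMillis);  // fail quickly instead of hanging on connect
        conn.setReadTimeout(timeoutMillis);
        conn.setRequestMethod("HEAD");          // headers are enough, no body needed
        try {
            int status = conn.getResponseCode();
            return isRedirect(status) ? conn.getHeaderField("Location") : null;
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java RedirectProbe <url>");
            return;
        }
        System.out.println(args[0] + " -> " + redirectTarget(args[0], 10_000));
    }
}
```

Run it with the address from the question as the argument. If it prints a target, that address is the one worth passing to `Jsoup.connect`. If it throws `SocketTimeoutException`, the connection itself is the problem and redirects are not the cause.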

    The thing is, the correct address of the page is "http://www.tinko.ru/", and it does not parse in any case. I checked. The http sites I looked at do not connect, while the https ones work fine.

    • "and it does not parse in any case. I checked" - how did you check? - post_zeew
    • Well, out of 1000 attempts, not a single connection. - Michael

    The problem is that your site is not available over https. Use http:

        Document doc = Jsoup.connect("http://www.tinko.ru").get();
        System.out.println(doc.html());

    Result:

        <!doctype html>
        <html>
         <head>
          <meta charset="utf-8">
          <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
          <meta http-equiv="X-UA-Compatible" content="IE=edge">
          <meta name="viewport" content="width=1200">
          <title>Торговый Дом ТИНКО — поставка технических средств безопасности</title>
          ...
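Since the thread disagrees about which scheme and host actually respond, a quick way to settle it is to try all four variants with an explicit timeout and see which one answers first. This is only a rough sketch using the JDK's `HttpURLConnection`. The class `UrlFallback`, its method names and the 10-second timeout are assumptions made for illustration.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: tries scheme/host variants of a site until one responds.
class UrlFallback {

    /** Builds four variants: http and https, each with and without "www.". */
    static List<String> candidates(String host) {
        String bare = host.startsWith("www.") ? host.substring(4) : host;
        List<String> urls = new ArrayList<>();
        for (String scheme : new String[] {"http", "https"}) {
            urls.add(scheme + "://www." + bare + "/");
            urls.add(scheme + "://" + bare + "/");
        }
        return urls;
    }

    /** Returns the first candidate that responds within the timeout, or null if none does. */
    static String firstReachable(List<String> urls, int timeoutMillis) {
        for (String url : urls) {
            try {
                HttpURLConnection conn =
                        (HttpURLConnection) URI.create(url).toURL().openConnection();
                conn.setConnectTimeout(timeoutMillis);
                conn.setReadTimeout(timeoutMillis);
                conn.setRequestMethod("HEAD");
                int status = conn.getResponseCode();
                conn.disconnect();
                System.out.println(url + " -> HTTP " + status);
                return url;
            } catch (IOException e) {
                // SocketTimeoutException is an IOException, so timeouts end up here too
                System.out.println(url + " -> " + e);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("usage: java UrlFallback <host>");
            return;
        }
        System.out.println("first reachable: " + firstReachable(candidates(args[0]), 10_000));
    }
}
```

Whichever URL it reports can then be passed to `Jsoup.connect(...)`, optionally with `.timeout(...)` as suggested in the comments. Note that "responds" here means any HTTP status came back, including error statuses, so check the printed status code as well.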