Hello! Can you tell me how to set the User-Agent in an HTTP request header? An example would be much appreciated. The point is that when my little robot visits a site, I want it to show up in the logs somehow. Maybe there is an article with code examples?

Here is the code for my little crawler:

UPD

    package crawler;

    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileWriter;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.URL;
    import java.util.HashSet;
    import java.util.Scanner;
    import java.util.Set;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    /**
     * @author ivan
     * @version 2.2
     */
    public class Crawler {

        private static final String CRAWLER_SOURCE = "C:\\apache-tomcat-6.0.36\\webapps\\wasks\\WEB-INF\\indexer\\urllist.txt";
        private static final String CRAWLER_WRITE = "C:\\apache-tomcat-6.0.36\\webapps\\wasks\\WEB-INF\\indexer\\urllist.txt";
        // Escape the dot so it matches a literal '.' rather than any character
        private static final Pattern p = Pattern.compile("http://(.+?\\.ru)/");
        private static final Set<String> urls = new HashSet<String>();

        public static void main(String[] args) {
            FileWriter wr = null;
            try {
                // Load the seed URLs from the source file
                Scanner scanner = new Scanner(new File(CRAWLER_SOURCE));
                while (scanner.hasNext()) {
                    urls.add(scanner.nextLine());
                }
                scanner.close();

                // Collect newly found links in a separate set: adding to "urls"
                // while iterating over it throws ConcurrentModificationException
                Set<String> found = new HashSet<String>();
                for (String s : urls) {
                    BufferedReader reader;
                    try {
                        URL url = new URL(s);
                        reader = new BufferedReader(new InputStreamReader(url.openStream()));
                    } catch (IOException e) {
                        e.printStackTrace();
                        continue;
                    }
                    // Read the whole page into one string
                    StringBuilder buffer = new StringBuilder();
                    String sRead;
                    while ((sRead = reader.readLine()) != null) {
                        buffer.append(sRead);
                    }
                    reader.close();
                    // Pull out every link that matches the pattern
                    Matcher m = p.matcher(buffer.toString());
                    while (m.find()) {
                        found.add(m.group());
                        System.out.println(m.group());
                    }
                }
                urls.addAll(found);

                // Write the merged URL list back out
                wr = new FileWriter(CRAWLER_WRITE);
                for (String s : urls) {
                    wr.write(s + "\n");
                }
            } catch (IOException e) {
                e.printStackTrace();
            } finally {
                if (wr != null) {
                    try {
                        wr.flush();
                        wr.close();
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }
            }
        }
    }

As you can see, it reads pages and saves the collected URLs to a file. It would be nice if the webmaster could tell that I had crawled his site. Plus, I want to give the crawler a name, like the big players have.

    1 answer

    See the HTTPRequest class.
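
    For example, with the plain java.net classes the crawler above already uses, the header can be set on the URLConnection before the stream is read. A minimal sketch, where the crawler name "wasks-crawler/2.2" and the URL are placeholders to replace with your own:

        import java.io.BufferedReader;
        import java.io.InputStreamReader;
        import java.net.URL;
        import java.net.URLConnection;

        public class UserAgentExample {
            public static void main(String[] args) throws Exception {
                URL url = new URL("http://example.ru/");
                URLConnection connection = url.openConnection();
                // The User-Agent header is what shows up in the server's access log;
                // "wasks-crawler/2.2" is a placeholder name for illustration
                connection.setRequestProperty("User-Agent",
                        "wasks-crawler/2.2 (+http://example.ru/about)");
                BufferedReader reader = new BufferedReader(
                        new InputStreamReader(connection.getInputStream()));
                String line;
                while ((line = reader.readLine()) != null) {
                    System.out.println(line);
                }
                reader.close();
            }
        }

    In the crawler this would replace the direct url.openStream() call, since openStream() is just shorthand for openConnection().getInputStream().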

    • Could you give a simple example, any kind, for better understanding? Thanks) Here: - vanekk1
    • Show all your code and I will tell you how to insert the UA into it. BTW, you do not have to use the HTTP classes at all; plain sockets work too (see the sketch after these comments): javaportal.ru/java/articles/java_http_web/article04.html - user6550
    • I have added all the code) - vanekk1
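
    On the socket route from the link above, the User-Agent is simply one more line of the request written by hand. A rough sketch, assuming HTTP/1.0 so the server closes the connection when the response is done (host, path, and crawler name are placeholders):

        import java.io.BufferedReader;
        import java.io.InputStreamReader;
        import java.io.PrintWriter;
        import java.net.Socket;

        public class SocketUserAgent {
            public static void main(String[] args) throws Exception {
                Socket socket = new Socket("example.ru", 80);
                PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
                // The request line and headers are written manually;
                // User-Agent is just another header line
                out.print("GET / HTTP/1.0\r\n");
                out.print("Host: example.ru\r\n");
                out.print("User-Agent: wasks-crawler/2.2\r\n");
                out.print("\r\n");
                out.flush();
                BufferedReader in = new BufferedReader(
                        new InputStreamReader(socket.getInputStream()));
                String line;
                while ((line = in.readLine()) != null) {
                    System.out.println(line);
                }
                socket.close();
            }
        }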