Given:

A site runs on the Wikidot engine. It has a button link that leads to a random article on the site:

https://scpfoundation.net/wikidot_random_page

Clicking the link sends a GET request, and the server responds with a redirect to the URL of a page on the site. That URL is generated in some way (presumably by a script on the server).

Task:

Programmatically obtain a random page address from the link above. I have no access to any of the sites or servers mentioned in the question.

Possible Solution:

Somehow find out what happens in the browser when the link is clicked, and build a similar request programmatically to get the final random URL.

I tried:

  • Googling did not help (I searched for how this is done on the site's engine).
  • Inspecting the click in the Chrome developer tools showed only that a GET request was made to the link. What is sent with it was not visible (maybe I did not look carefully).
  • Downloading the link programmatically gives no hint of the random URL.

Question:

How, and with what tools, can I find out what happens in the browser when this link is clicked, and how can I emulate it to get the generated URL that the link redirects to?

  • If you click on the request of interest in Chrome's developer tools, it shows everything: which request was made, with which headers, and so on. - Sergey Mitrofanov
  • @SergeyMitrofanov, indeed, thank you) It did not even occur to me to click there) - YurySPb

3 answers


  1. As @Sergey Mitrofanov's comment says, the headers and other details of the request and response can be viewed by clicking the request of interest in Chrome's developer tools.
  2. Next, make a GET request with the headers from step 1.
  3. And, as @VladD correctly noted in his answer, automatic redirect following must be disabled.

Here is the Java code using OkHttp:

    // Uses okhttp3 (OkHttpClient, Request, Response) and android.util.Log.
    OkHttpClient client = new OkHttpClient.Builder()
            .followRedirects(false) // keep the redirect response instead of following it
            .build();

    String url = "https://scpfoundation.net/wikidot_random_page";
    Request.Builder request = new Request.Builder();
    request.url(url);
    request.addHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
    request.addHeader("Accept-Encoding", "gzip, deflate, br");
    request.addHeader("Accept-Language", "en-US,en;q=0.8,de-DE;q=0.5,de;q=0.3");
    request.addHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0");
    request.get();

    Response response = client.newCall(request.build()).execute();
    Log.i("LOG", response.body().string());
    // prints a different redirect URL each time:
    // <html><body>You are being <a href="http://scpfoundation.ru/scp-851">redirected</a>.</body></html>
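
The redirect target can also be read straight from the `Location` response header rather than picked out of the HTML body. Below is a minimal sketch of that idea using only the JDK's `java.net.HttpURLConnection` instead of OkHttp. The class name `RandomPageResolver` and the method `resolveRedirect` are made up for illustration, and the sketch assumes the server answers with a 3xx status and a `Location` header.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper, not part of any library: the names are illustrative only.
final class RandomPageResolver {

    /**
     * Sends a single GET request without following redirects.
     * Returns the Location header of a 3xx response, or null for any other status.
     */
    static String resolveRedirect(String pageUrl) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(pageUrl).toURL().openConnection();
        try {
            // Keep the 3xx response itself instead of silently following it.
            conn.setInstanceFollowRedirects(false);
            conn.setRequestMethod("GET");
            // Browser-like headers, copied from the answer above.
            conn.setRequestProperty("Accept",
                    "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
            conn.setRequestProperty("Accept-Language", "en-US,en;q=0.8,de-DE;q=0.5,de;q=0.3");
            conn.setRequestProperty("User-Agent",
                    "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0");
            conn.setConnectTimeout(10_000);
            conn.setReadTimeout(10_000);

            int status = conn.getResponseCode();
            if (status >= 300 && status < 400) {
                return conn.getHeaderField("Location");
            }
            return null; // not a redirect
        } finally {
            conn.disconnect();
        }
    }
}
```

Calling `RandomPageResolver.resolveRedirect("https://scpfoundation.net/wikidot_random_page")` should then return a different page address on each call, provided the site still behaves as described in the question.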

    Would C# do?

    I have the following code:

        var address = new Uri("https://scpfoundation.net/wikidot_random_page");
        var client = new HttpClient(new HttpClientHandler() { AllowAutoRedirect = false },
                                    disposeHandler: true)
        {
            DefaultRequestHeaders =
            {
                { "Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8" },
                { "Accept-Encoding", "gzip, deflate, br" },
                { "Accept-Language", "en-US,en;q=0.8,de-DE;q=0.5,de;q=0.3" },
                { "User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0" }
            },
        };
        using (client)
        {
            while (true)
            {
                Console.Write(address);
                var response = await client.GetAsync(address);
                switch (response.StatusCode)
                {
                    case HttpStatusCode.Found:
                    case HttpStatusCode.Moved:
                        // Follow the redirect manually so that each hop is printed.
                        address = response.Headers.Location;
                        Console.Write(" -> ");
                        break;
                    case HttpStatusCode.OK:
                        Console.WriteLine(" (Finished OK)");
                        return;
                    default:
                        Console.WriteLine($" (Finished with status {response.StatusCode})");
                        return;
                }
            }
        }

    It prints:

    https://scpfoundation.net/wikidot_random_page -> http://scpfoundation.ru/scp-958-v (Finished OK)

    (the second address is always different, of course).

    The headers were honestly stolen from Firefox; the AllowAutoRedirect = false flag is required.

    • Oh, thanks) Now I'll try to do this through OkHttp) *hinting not so subtly* =))) - Yuriy SPb
    • @Yuriy SPb: And what is OkHttp? (I'm quite a layman in Java.) - VladD
    • It is a networking library for Android. Everything has already worked (thanks to your code); I'll add my own answer with the code) And yes, disabling automatic redirects is mandatory) - YuriySPb

    Same for PHP:

        $opts = array('http' => array(
            'method'          => 'GET',
            'follow_location' => false, // do not follow the redirect automatically
            'header'          => "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n" .
                                 "Accept-Encoding: gzip, deflate, br\r\n" .
                                 "Accept-Language: en-US,en;q=0.8,de-DE;q=0.5,de;q=0.3\r\n" .
                                 "User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0\r\n"
            )
        );
        $context = stream_context_create($opts);
        $html = file_get_contents('https://scpfoundation.net/wikidot_random_page', false, $context);
        print $html . "\n";