Hello! I need to fetch some data from a site, but the request fails with an error:

Warning: file_get_contents(http://whois.domaintools.com/195.90.131.231) [function.file-get-contents]: failed to open stream: HTTP request failed! HTTP/1.1 403 Forbidden

I am trying to load the page with simple_html_dom:

$site = file_get_html("http://whois.domaintools.com/$ip");

How do I solve this problem?

  • Try fetching the page body over a socket, sending the headers of a real browser, and then pass the result to simple_html_dom - ReinRaus
  • I tried opening it with curl, but the result is the same: $ch = curl_init(); curl_setopt($ch, CURLOPT_URL, "whois.domaintools.com/$ip"); curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:20.0) Gecko/20100101 Firefox/20.0'); curl_setopt($ch, CURLOPT_REFERER, "whois.domaintools.com/$ip"); curl_exec($ch); curl_close($ch); - woland
  • Thanks @ReinRaus, your script did help me after all - woland

2 answers

Here you go.

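<pre><?php
// Headers copied from a real browser, so the request does not look automated.
$headers = "Host: whois.domaintools.com\r\n"
    . "Accept: */*\r\n"
    . "Accept-Language: ru-RU,ru;q=0.8,en-US;q=0.6,en;q=0.4\r\n"
    . "Accept-Charset: windows-1251,utf-8;q=0.7,*;q=0.3\r\n"
    . "Cache-Control: max-age=0\r\n"
    . "Accept-Encoding: gzip,deflate,sdch\r\n"
    . "User-Agent: Mozilla/5.0 (X11; Linux i686) AppleWebKit/535.11 (KHTML, like Gecko) Ubuntu/11.10 Chromium/17.0.963.79 Chrome/17.0.963.79 Safari/535.11\r\n";

$sock = fsockopen('whois.domaintools.com', 80);
// $headers already ends with CRLF, so one more CRLF terminates the request.
$query = "GET /195.90.131.231 HTTP/1.0\r\n" . $headers . "\r\n";
fwrite($sock, $query);

// Read the whole response until the server closes the connection.
$res = "";
while (!feof($sock)) {
    $res .= fread($sock, 2048);
}
fclose($sock);

// Split the response headers from the body at the first blank line only.
$sp = explode("\r\n\r\n", $res, 2);
echo htmlspecialchars(gunzip($sp[1])) . "\r\n";

// Minimal gzip decoder: assumes a plain 10-byte header with no optional fields.
function gunzip($zipped)
{
    $offset = 0;
    if (substr($zipped, 0, 2) == "\x1f\x8b") {
        $offset = 2; // skip the gzip magic bytes
    }
    if (substr($zipped, $offset, 1) == "\x08") {
        // 0x08 means deflate; skip the rest of the header and inflate the payload.
        return gzinflate(substr($zipped, $offset + 8));
    }
    return "Unknown Format";
}
?></pre> 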

By default, cURL and file_get_contents do not send browser-like headers (the User-Agent is either missing or identifies PHP), and the site apparently returns 403 because of that.
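
For comparison, the same idea can be sketched without a raw socket by giving file_get_contents a stream context that carries browser-like headers. This is only a sketch: the helper names `browser_like_options` and `fetch_like_browser` are made up for illustration, the header values are arbitrary, and nothing guarantees that the site will accept the request.

```php
<?php
// Hypothetical helper (not from the answer above): HTTP stream options
// that make the request look more like one sent by a browser.
function browser_like_options(string $userAgent): array
{
    return [
        'http' => [
            'method'        => 'GET',
            'user_agent'    => $userAgent,
            'header'        => "Accept: */*\r\nAccept-Language: en-US,en;q=0.8\r\n",
            'ignore_errors' => true, // return the body even for 4xx/5xx responses
            'timeout'       => 10,   // seconds
        ],
    ];
}

// Hypothetical helper: fetches $url with the options above.
// Returns the response body as a string, or false on failure.
function fetch_like_browser(string $url, string $userAgent)
{
    $context = stream_context_create(browser_like_options($userAgent));
    return file_get_contents($url, false, $context);
}
```

A call would look like `$html = fetch_like_browser("http://whois.domaintools.com/$ip", 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:20.0) Gecko/20100101 Firefox/20.0');`, and the returned string could then be parsed with simple_html_dom. If the site checks more than the headers, the response will still be a refusal.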

  • Right, and who was the API written for, then? Or am I nitpicking? - user6550
  • @klopp, well, the author wanted to pull text out of the HTML, so I gave him an example of how to get the HTML; the fact that he chose the wrong approach is his problem. You pointed him to the right path, and I helped him along the wrong one :) - ReinRaus
  • Well, if you think about it, the more people do things the hard way, the higher our salaries :) - user6550
  • The result is the same - woland
  • If you open the example now, you will see this message there: "Thank you for using the DomainTools for your domain research. To protect registered registrants Whois lookups that are allowed. You can create and log in to your account before doing it." - woland

To begin with, grab the first tool of every spider: wget. Run it and look:

 wget -S "http://whois.domaintools.com/195.90.131.231"
 --17:54:15--  http://whois.domaintools.com/195.90.131.231
            => `195.90.131.231'
 Resolving whois.domaintools.com... done.
 Connecting to whois.domaintools.com[199.93.60.254]:80... connected.
 HTTP request sent, awaiting response...
  1 HTTP/1.1 403 Forbidden
  2 Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
  3 Pragma: no-cache
  4 Set-Cookie: csrftoken=a8346261cd9fdd81fa9af22a80be95ca; path=/; domain=.domaintools.com
  5 Set-Cookie: dtsession=ak67m6mt66fn012ppi629vmdc3; expires=Tue, 11-Apr-2023 13:54:15 GMT; path=/; domain=.domaintools.com
  6 Content-Type: text/html
  7 Expires: Thu, 19 Nov 1981 08:52:00 GMT
  8 Server: lighttpd/1.4.30
  9 Date: Sat, 13 Apr 2013 13:54:16 GMT
 10 Connection: close

 17:54:16 ERROR 403: Forbidden.

Think about it.

But rather than fooling around like this, it is better to read about the domaintools.com API; everything is described in great detail there. As homework, I suggest finding the link to the documentation on their website. If that takes you more than two clicks and 10 seconds, well, that's sad...