 <?php
 $text = file_get_contents("http://example.com");
 echo $text;
 ?>

The essence of the problem is that the site seems to have some kind of protection: the HTML text simply does not load.

From other sites I can get the source code, but not from this one. Question: how can I parse the site in PHP other than through file_get_contents? I searched the internet but didn't find anything suitable; please help. :)

Closed as off-topic by Visman, xaja, Aslan Kussein, Denis Khvorostin, PashaPash Oct 23 '15 at 10:22.

This question does not appear to fit the subject matter of the site. The users who voted to close it gave the following reason:

  • "This question was caused by a problem that can no longer be reproduced, or by a simple typo. While similar questions may be on-topic on this site, resolving this one is unlikely to help future visitors. Such questions can usually be avoided by writing and examining a minimal program that reproduces the problem before posting." - Visman, xaja, Aslan Kussein, Denis Khvorostin, PashaPash
If the question can be reformulated to follow the rules set out in the help center, edit it.

  • Thanks, everyone. I didn't think the problem would be solved so quickly; this really is a good site with competent users on it. Everything worked both through cURL and through file_get_contents. On the internet I also looked at parsing-and-i.blogspot.com/2009/09/curl.html - Yuriyvot

2 Answers

I don't know what the problem is. Try curl. It loads everything.

UPD

It's easy to check. The code below loads everything if you replace example.com with the URL you sent. Just note that it is very important to enable FOLLOWLOCATION so that redirects are followed, otherwise you get an empty response. Your wonderful site apparently redirects immediately, without a single word.

 <?php
 $ch = curl_init("http://example.com");
 curl_setopt($ch, CURLOPT_HTTPGET, 1);        // use a GET request (the default)
 curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the body instead of printing it
 curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // follow redirects, otherwise the response is empty
 $output = curl_exec($ch);
 curl_close($ch);
 echo $output;
 ?>
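If the response still comes back empty, it helps to look at cURL's error state and the final HTTP status before assuming the site is blocking you. A minimal sketch along those lines (the URL is a placeholder):

```php
<?php
$ch = curl_init("http://example.com"); // placeholder URL
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
$output = curl_exec($ch);

if ($output === false) {
    // transport-level failure: DNS, connection refused, timeout, ...
    echo "cURL error: " . curl_error($ch) . "\n";
} else {
    // status code of the final (post-redirect) response
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    echo "HTTP " . $code . ", " . strlen($output) . " bytes received\n";
}
curl_close($ch);
```

This distinguishes "the request never got through" from "the server answered, but with an empty or error page".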
  • Could you write a small curl example for that page? :) - Yuriyvot

Apparently, if no User-Agent header is specified, the site redirects you off into the woods. It's a simple protection against scrapers.

You can change the sent headers like this:

 $opts = array(
     'http' => array(
         'method' => "GET",
         'header' => "User-Agent: Nyaaa\r\n" // any non-empty User-Agent gets past the check
     )
 );
 $context = stream_context_create($opts);
 $text = file_get_contents("http://example.com", false, $context);
 echo $text;
  • And, judging by the fact that I got a page hosted on a laptop that is currently acting as a router, it redirects to the requester's own IP address. - Alexey Sonkin