Good day! Previously, when I parsed Avito, it used a slightly different IP-blocking algorithm, and the ban could be avoided by sleeping for 6-8 seconds. Sometimes the ban was triggered anyway, but after 20-30 minutes everything was fine again. Tolerable!

Now Avito has apparently changed the algorithm slightly, and the ban is triggered more often. The difficulty for me, though, is not the ban itself but getting out of it: you now have to enter a captcha to lift the ban. I see no point in trying to solve the captcha automatically right now, because I have very little time, but entering it manually now and then is fine, since the ban does not occur often during parsing.

The difficulty: when I open the block page via cURL, the captcha is not displayed (error 403). So some additional parameter presumably has to be sent for it to be displayed. Cookies are in place. Where should I dig? Thanks in advance!

    function curl($url){
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_COOKIEJAR, './cookie.txt');
        curl_setopt($ch, CURLOPT_COOKIEFILE, './cookie.txt');
        curl_setopt($ch, CURLOPT_URL, $url);
        //curl_setopt($ch, CURLOPT_POST, true);
        //curl_setopt($ch, CURLOPT_POSTFIELDS, "captcha=8дя1ц&yes");
        curl_setopt($ch, CURLOPT_REFERER, "https://www.avito.ru");
        curl_setopt($ch, CURLOPT_HEADER, 0);
        curl_setopt($ch, CURLOPT_NOBODY, 0);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 0);
        curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
        curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.84 Safari/537.36');
        $page = curl_exec($ch);
        curl_close($ch);
        return $page;
    }

    $content = curl("https://www.avito.ru/blocked");
    echo '<base href="https://www.avito.ru" />';
    echo $content;

  • 403 means access is forbidden. My guess is that they check headers, most likely (Origin: http://..., Referer: http://...), maybe something else. Open the page normally in a browser and look. - E_p
  • Run Fiddler, compare the request sent by curl with the one sent by the browser, and make the first match the second. - Geslot
  • Thanks for the answers. I'm digging through the headers. They differ. Can you tell me why I can't set the Host? curl_setopt($ch, CURLOPT_HTTPHEADER, array('Host: www.avito.ru')); Firebug still shows my own website. - Sarkis Allahverdian
  • stackoverflow.com/questions/29343110/cant-set-host-in-curl-php "Host: " . parse_url($url, PHP_URL_HOST), - E_p
  • The site's programmer is explicitly telling you that they do not want automated retrieval of the data, and you are asking us how to break the site's rules? Your question does not deserve the slightest sympathy. - VladD
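
The header comparison suggested in the comments can also be done from PHP itself, without Fiddler. Below is a minimal sketch, assuming the goal is only to see what curl actually sends and what status comes back. The helper names `buildRequestHeaders` and `fetchWithDiagnostics` are hypothetical, and the `Accept` values are illustrative placeholders, not a list of headers the site is known to check. Note that curl already derives `Host` from the URL, so setting it by hand is normally unnecessary; it is included only to mirror the `parse_url` approach from the linked answer.

```php
<?php
// Hypothetical helper: builds a header list for $url.
// The Accept/Accept-Language values are assumptions for illustration only.
function buildRequestHeaders(string $url): array
{
    $host = parse_url($url, PHP_URL_HOST);
    return [
        'Host: ' . $host,
        'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language: ru-RU,ru;q=0.8,en-US;q=0.6,en;q=0.4',
    ];
}

// Hypothetical helper: performs a GET and returns the HTTP status,
// the exact request headers curl sent, and the response body.
function fetchWithDiagnostics(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_HTTPHEADER, buildRequestHeaders($url));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLINFO_HEADER_OUT, true); // record the outgoing request headers
    $body = curl_exec($ch);
    $result = [
        'status' => curl_getinfo($ch, CURLINFO_HTTP_CODE),
        'sent'   => curl_getinfo($ch, CURLINFO_HEADER_OUT),
        'body'   => $body,
    ];
    curl_close($ch);
    return $result;
}

// Usage (performs a real network request, so it is left commented out):
// $info = fetchWithDiagnostics('https://www.avito.ru/blocked');
// echo $info['status'], "\n", $info['sent'];
```

Printing `$info['sent']` next to the request shown in the browser's network panel makes it easy to spot which headers differ.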
