Good day! Previously, when I parsed Avito, the site used a slightly different IP-blocking algorithm, and the ban could be avoided with a sleep of 6-8 seconds. Sometimes the ban still hit anyway, but after 20-30 minutes everything was OK again. Tolerable!
Now Avito has apparently changed the algorithm slightly, and the ban triggers more often. The problem for me is not the ban itself but getting out of it: you now have to enter a captcha to get out of the ban. I see no reason to try solving the captcha automatically right now, because I have very little time; I'll enter it manually when needed, which is acceptable because the ban does not occur often during parsing.
Difficulty: when I open the block page via cURL, the captcha is not displayed (error 403). That suggests some additional parameter has to be sent for it to show up. Cookies are set up. Where should I dig? Thanks in advance!
function curl($url){
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_COOKIEJAR, './cookie.txt');
    curl_setopt($ch, CURLOPT_COOKIEFILE, './cookie.txt');
    curl_setopt($ch, CURLOPT_URL, $url);
    //curl_setopt($ch, CURLOPT_POST, true);
    //curl_setopt($ch, CURLOPT_POSTFIELDS, "captcha=8дя1ц&yes");
    curl_setopt($ch, CURLOPT_REFERER, "https://www.avito.ru");
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_NOBODY, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 0);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.84 Safari/537.36');
    $page = curl_exec($ch);
    curl_close($ch);
    return $page;
}

$content = curl("https://www.avito.ru/blocked");
echo '<base href="https://www.avito.ru" />';
echo $content;
... Origin: http://... Referer: http://... maybe something else. Open the page normally in the browser and see what it sends. - E_p

curl_setopt($ch, CURLOPT_HTTPHEADER, array('Host: www.avito.ru')); Everything is exactly as in Firebug; my site hangs. - Sarkis Allahverdian

"Host: " . parse_url($url, PHP_URL_HOST), - E_p
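
To make the suggestions above concrete, here is a minimal sketch of sending browser-like headers through CURLOPT_HTTPHEADER and checking the HTTP status code instead of just echoing the body. The helper names buildBrowserHeaders and fetchWithHeaders are hypothetical, and the header set (Host, Origin, Referer, Accept, Accept-Language) is only an assumption taken from the comments; nothing here confirms which headers Avito actually checks, so compare them against what the browser really sends.

```php
<?php
// Hypothetical helper: builds browser-like headers for the given URL.
// The choice of headers is an assumption based on the comments above,
// not a confirmed list of what the site requires.
function buildBrowserHeaders(string $url, string $referer): array
{
    $scheme = parse_url($url, PHP_URL_SCHEME);
    $host   = parse_url($url, PHP_URL_HOST);

    return [
        'Host: ' . $host,
        'Origin: ' . $scheme . '://' . $host,
        'Referer: ' . $referer,
        'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language: ru-RU,ru;q=0.9,en;q=0.8',
    ];
}

// Hypothetical helper: fetches the URL and returns the status code
// together with the body, so a 403 can be told apart from a captcha page.
function fetchWithHeaders(string $url, array $headers): array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_HTTPHEADER     => $headers,
        CURLOPT_COOKIEJAR      => './cookie.txt',
        CURLOPT_COOKIEFILE     => './cookie.txt',
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.84 Safari/537.36',
    ]);

    $body   = curl_exec($ch);                          // false on transport failure
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);   // 0 if no response was received
    $error  = ($body === false) ? curl_error($ch) : null;
    curl_close($ch);

    return ['status' => $status, 'body' => $body, 'error' => $error];
}
```

A possible use: call fetchWithHeaders("https://www.avito.ru/blocked", buildBrowserHeaders("https://www.avito.ru/blocked", "https://www.avito.ru")) and look at the returned status. If it is still 403, add or remove headers one at a time until the response matches what the browser gets.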