Parsing is done using PHP Simple HTML DOM Parser

    include('simple_html_dom.php');

    $urls = array(
        'http://sitename/1.html',
        'http://sitename/2.html',
        // ...
        'http://sitename/98.html',
        'http://sitenamey/100.html'
    );

    $out = ''; // initialize before concatenating, otherwise $out is undefined
    foreach ($urls as $urlsItem) {
        $output = curl_init();
        curl_setopt($output, CURLOPT_URL, $urlsItem); // set the page URL
        curl_setopt($output, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($output, CURLOPT_TIMEOUT, 60);
        curl_setopt($output, CURLOPT_HEADER, 0);
        curl_setopt($output, CURLOPT_CONNECTTIMEOUT, 3);
        $out .= curl_exec($output);
        curl_close($output);
    }

    $html = new simple_html_dom();
    $html->load($out);

The script then works with $html and breaks it into components.

At the moment, the script requests all the links in the array at once and collects the content. Is it possible to make it request the links one at a time, moving on to the next only after the previous one has been processed and a specified delay (in milliseconds or seconds) has passed?

2 Answers

You could do it something like this:

    $out = '';
    foreach ($urls as $urlsItem) {
        $output = curl_init();
        curl_setopt($output, CURLOPT_URL, $urlsItem); // set the page URL
        curl_setopt($output, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($output, CURLOPT_TIMEOUT, 60);
        curl_setopt($output, CURLOPT_HEADER, 0);
        curl_setopt($output, CURLOPT_CONNECTTIMEOUT, 3);
        $out .= curl_exec($output);
        curl_close($output);
        usleep(1000000); // 1-second delay between requests
    }
  • If only whole seconds are needed, you can simply use sleep(1). – Naumov
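For sub-second delays, usleep() takes microseconds, so a delay specified in milliseconds (as the question asks) needs a conversion. A minimal sketch; delay_ms is a hypothetical helper name, not part of cURL or Simple HTML DOM:

```php
<?php
// Hypothetical helper: pause execution for $ms milliseconds.
// usleep() expects microseconds, so multiply by 1000.
function delay_ms(int $ms): void {
    usleep($ms * 1000);
}

// Example: a 500 ms pause, e.g. between requests inside the foreach loop.
delay_ms(500);
```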

Use a DOMDocument for this.

    $doms = array();
    foreach ($urls as $url) {
        // Create a fresh DOMDocument per URL: reusing a single instance
        // would store the same object (with only the last page) repeatedly.
        $dom = new DOMDocument;
        if (@$dom->load($url)) { // @ suppresses warnings from malformed HTML
            $doms[] = $dom;
        }
    }
    echo '<pre>', print_r($doms, true), '</pre>';