Error with a parser in PHP

Question

Here is the parser code:

 ini_set('error_reporting', E_ALL);
 ini_set('display_errors', 1);
 ini_set('display_startup_errors', 1);
 ini_set('max_execution_time', 900000);

 $link = $_POST['page_link']; // page_link is the Google results page
 $i = 0; // index of the first search result to process

 $ch = curl_init();
 curl_setopt($ch, CURLOPT_URL, $link);
 curl_setopt($ch, CURLOPT_USERAGENT, "");
 curl_setopt($ch, CURLOPT_FAILONERROR, 1);
 curl_setopt($ch, CURLOPT_HEADER, 0);
 curl_setopt($ch, CURLOPT_REFERER, "http://www.google.ru/");
 curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
 curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
 curl_setopt($ch, CURLOPT_TIMEOUT, 30);
 curl_setopt($ch, CURLOPT_POST, 0);
 $data = curl_exec($ch);
 preg_match_all("/<cite>(.+?)<\/cite>/is", $data, $matches);
 $result = $matches[1];
 $resultLength = count($result);
 for ($i; $i < $resultLength; $i++) {
     $str_out = strip_tags($result[$i]);
     $str = file_get_contents($str_out, false);
     preg_match_all('#(.+?)\@([a-z0-9-_]+)\.(ru|net|com|ua|in|by|tv|pl|biz)#i', $str, $matches);
     $urls[] = $matches[0];
     $urls_result = implode("", $urls[0]);
     echo $str_out."<br>";
     echo $urls_result."<br>";
 }

If I set

 $link = "https://www.google.com.ua/search?q=odessa+web+studio+contact&oq=odessa+web+studio+contact&gs_l=psy-ab.3..33i160k1.721.721.0.1337.1.1.0.0.0.0.108.108.0j1.1.0....0...1.1.64.psy-ab..0.1.107....0.hXfl1TDkaHc";

Here is the result:

 https://skylogic.com.ua/contacts.html
 sup@skylogic.com
 https://lynx.od.ua/contacts/
 sup@skylogic.com
 https://www.trendline.in.ua/
 sup@skylogic.com
 https://sozdat-sayt.com.ua/contact/
 sup@skylogic.com

and so on.

Question: why is the same parsed email repeated, when a separate email should be parsed from each page? If I change the index of the search result that parsing starts from, the email changes to the one found on that other page, but it is duplicated in the same way.

You are using this purely for practice, and not to send spam, right?

Farkhod Daniyarov · Accepted Answer · 2017-10-10T10:30:52

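It looks like the loop appends every page's matches to the array with $urls[] = $matches[0];
but then prints only the first element, $urls[0], via $urls_result = implode("", $urls[0]); so every iteration outputs the emails found on the first processed page.
Add $urls = []; at the beginning of the loop body and it will work as expected.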
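
The fix can be sketched without any network access. This is only an illustration under stated assumptions: the $pages array, the example.com URLs, the extractEmails() helper and the simplified email pattern are hypothetical stand-ins and not part of the original parser. The point it shows is that each page gets its own fresh result array, so nothing carries over from the previous iteration.

```php
<?php
// Hypothetical page bodies standing in for the file_get_contents() results.
$pages = [
    'https://example.com/a' => 'Write to first@example.com for details.',
    'https://example.com/b' => 'Contact: second@example.net or third@example.net',
];

/**
 * Return the email-like strings found in $html, without duplicates.
 * The pattern is a simplified assumption, not a full address validator.
 */
function extractEmails(string $html): array
{
    preg_match_all('#[a-z0-9._-]+@[a-z0-9-]+(?:\.[a-z0-9-]+)+#i', $html, $matches);
    return array_values(array_unique($matches[0]));
}

$emailsByPage = [];
foreach ($pages as $url => $html) {
    // $emails is rebuilt from scratch for every page, which has the same
    // effect as resetting $urls = []; at the top of the original loop.
    $emails = extractEmails($html);
    $emailsByPage[$url] = $emails;
    echo $url . "\n" . implode(', ', $emails) . "\n";
}
```

Keeping the results keyed by URL in $emailsByPage, rather than in one shared list, also makes it harder to print the wrong page's matches by accident.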
