The title says it all. Here's the code:

function get_url_contents($stream_url){
    $url = $stream_url;
    $crl = curl_init();
    $timeout = 5;
    curl_setopt($crl, CURLOPT_URL, $url);
    curl_setopt($crl, CURLOPT_HEADER, 0);
    curl_setopt($crl, CURLOPT_POST, 0);
    curl_setopt($crl, CURLOPT_COOKIE, 0);
    curl_setopt($crl, CURLOPT_TIMEOUT, 15);
    curl_setopt($crl, CURLOPT_REFERER, $url);
    curl_setopt($crl, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SV1; .NET CLR 1.1.4322)");
    curl_setopt($crl, CURLOPT_RETURNTRANSFER, 0);
    curl_setopt($crl, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_0);
    $ret = curl_exec($crl);
    return $ret;
}

$q = 'запрос';
$url = 'https://www.google.ru/search?q='.$q;
$content = get_url_contents($url);
$content = html_entity_decode($content);
$txt = strip_tags($content);
echo $txt;

After running it, all the tags are still in place.


P.S. I also tried this code:

$q = 'запрос';
$url = 'https://www.google.ru/search?q='.$q;
$content = get_url_contents($url);

function html2txt($document){
    $search = array(
        '@<script[^>]*?>.*?</script>@si',   // Strip out javascript
        '@<style[^>]*?>(.*?)</style>@siU',  // Strip style tags properly
        '@<style>(.*?)</style>@siU',        // Strip style tags properly
        '@<[\/\!]*?[^<>]*?>@si',            // Strip out HTML tags
        '@<![\s\S]*?--[ \t\n\r]*>@'         // Strip multi-line comments including CDATA
    );
    $text = preg_replace($search, '', $document);
    return $text;
}

echo html2txt($content);

The result has not changed.

    2 answers

    Try removing html_entity_decode.

    Here you go, this leaves only the text. If you need to keep any tags, you can add exceptions to the regular expressions: http://pastebin.com/MZewPR4f
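
The pastebin contents are not reproduced in this thread, so the following is only a rough sketch of the same idea (leave the text, optionally keep a few tags), not the linked code. The function name `html_to_text` and its `$allowedTags` parameter are made up for illustration; the built-in calls are `preg_replace`, `strip_tags` and `html_entity_decode`.

```php
<?php
/**
 * Hypothetical helper (not from the thread): reduce an HTML document to text.
 * $allowedTags lists tags to keep, in strip_tags() syntax, e.g. '<a><b>'.
 */
function html_to_text(string $html, string $allowedTags = ''): string
{
    // Drop <script> and <style> elements together with their contents;
    // strip_tags() alone removes the tags but leaves the code between them.
    $html = preg_replace('@<(script|style)\b[^>]*>.*?</\1\s*>@is', '', $html) ?? $html;

    // Remove every remaining tag except the ones listed in $allowedTags.
    $text = strip_tags($html, $allowedTags);

    // Decode entities only when the result is pure text; if some tags are
    // kept, the output is still HTML and its entities should stay escaped.
    if ($allowedTags === '') {
        $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    }

    // Collapse the whitespace left behind by the removed markup.
    return trim(preg_replace('/\s+/u', ' ', $text) ?? $text);
}
```

Calling `html_to_text($content)` would give text only, and `html_to_text($content, '<a>')` would keep the links. This only helps if `$content` really holds the page source as a string.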

    • I tried, it does not help - arashvg
    • Can you show what get_url_contents returns? - Denis Sultanov
    • Here it is: goo.gl/dbz13r. That is the output both straight from get_url_contents and after strip_tags - arashvg
    • See the answer - Denis Sultanov
    • See the question - arashvg
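
One thing worth checking, given that the output looks the same before and after strip_tags: the question's code sets CURLOPT_RETURNTRANSFER to 0. With that setting `curl_exec` prints the response itself and returns `true`, so the later calls never see the HTML. Below is a minimal sketch of a fetch function that returns the body as a string instead. The name `fetch_url` is hypothetical and the option set is an assumption, trimmed down from the original.

```php
<?php
/**
 * Hypothetical variant of get_url_contents(): return the response body
 * as a string instead of letting cURL echo it.
 */
function fetch_url(string $url, int $timeout = 15): string
{
    $crl = curl_init();
    curl_setopt_array($crl, [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,  // make curl_exec() return the body
        CURLOPT_HEADER         => false, // keep response headers out of the body
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects
        CURLOPT_TIMEOUT        => $timeout,
    ]);

    $body = curl_exec($crl);
    if ($body === false) {
        // curl_exec() returns false on failure; surface the reason.
        throw new RuntimeException('cURL request failed: ' . curl_error($crl));
    }

    return $body;
}
```

With this in place, something like `echo strip_tags(fetch_url('https://www.google.ru/search?q=' . urlencode($q)));` would run strip_tags on the actual page source. The `urlencode` call is an addition here, since the query contains non-ASCII characters.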

    strip_tags(CHtml::decode($content));
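
`CHtml::decode` comes from the Yii framework; as far as I know it is a thin wrapper around `htmlspecialchars_decode` with `ENT_QUOTES`, so treat the plain-PHP stand-in below as an assumption. The sketch shows why the order of decoding and stripping matters, which is also what the advice about html_entity_decode is getting at. The sample string is invented for illustration.

```php
<?php
// Sample input: real markup plus escaped tags that are meant to be shown as text.
$html = '<p>Use &lt;b&gt; for bold &amp; &lt;i&gt; for italics</p>';

// Decode first, then strip: the escaped tags turn into real tags and are removed.
$decodedFirst = strip_tags(htmlspecialchars_decode($html, ENT_QUOTES));

// Strip first, then decode: only the real markup is removed.
$strippedFirst = htmlspecialchars_decode(strip_tags($html), ENT_QUOTES);

echo $decodedFirst, "\n";   // Use  for bold &  for italics
echo $strippedFirst, "\n";  // Use <b> for bold & <i> for italics
```

Which order is right depends on the input: decode first when the markup itself arrives escaped, strip first when escaped fragments should survive as visible text.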