Friends, please recommend a simple parser in PHP. The task is very simple: I need to extract one line from a third-party site:

<img class="png" src="//s3.gismeteo.by/static/images/icons/new/d.sun.png" alt="Ясно" title="Ясно" width="55" height="55"> 

If the title is "Ясно" (clear), output 1; if it is "Облачно" (cloudy), output 2; and so on. I searched the Internet for a suitable parser but have not found one yet. Thanks a lot in advance.

4 answers

    <?php
    // Fetch a page over HTTP with cURL and return its body as a string.
    function getPage($url) {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_HEADER, 0);         // do not include response headers
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the body instead of printing it
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_TIMEOUT, 60);
        curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; ru; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12');
        $data_fin = curl_exec($ch);
        curl_close($ch);
        return $data_fin;
    }

    $url = "http://yoursite.com/index.php";
    $page = getPage($url);

    // 1 if the page contains title="Ясно" (clear), otherwise 0
    if (preg_match('/title="Ясно"/', $page)) {
        $result = 1;
    } else {
        $result = 0;
    }
    echo $result;

The code works only if the cURL extension for PHP is installed.

  • For a more precise regular expression, post a link to the page you want to parse. - ActivX
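To cover the "and so on" part of the question, the single check in the answer above can be generalized by capturing the title and looking it up in a table. This is only a minimal sketch: the names `WEATHER_CODES` and `weatherCode` are made up for illustration, only the two titles mentioned in this thread are mapped, and the pattern assumes that `src` comes before `title` inside the tag, as in the sample line. It needs PHP 7 or later.

```php
<?php
// Hypothetical lookup table: title text => numeric code.
// Only "Ясно" and "Облачно" come from the thread; extend as needed.
const WEATHER_CODES = [
    'Ясно'    => 1, // clear
    'Облачно' => 2, // cloudy
];

/**
 * Return the code for the first gismeteo icon found in $html,
 * or 0 if no such tag exists or its title is not in the table.
 * Assumes src precedes title inside the <img> tag.
 */
function weatherCode(string $html): int
{
    $pattern = '/<img\b[^>]*\bsrc="[^"]*gismeteo\.by[^"]*"[^>]*\btitle="([^"]*)"/u';
    if (preg_match($pattern, $html, $m) !== 1) {
        return 0; // no matching tag (or a regex error)
    }
    return WEATHER_CODES[$m[1]] ?? 0;
}

$sample = '<img class="png" src="//s3.gismeteo.by/static/images/icons/new/d.sun.png" alt="Ясно" title="Ясно" width="55" height="55">';
echo weatherCode($sample), PHP_EOL; // prints 1
```

In real use, `$sample` would be replaced by the page fetched with cURL. If the site changes its attribute order, the pattern stops matching and the function returns 0.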

Use SimpleHtmlDom, Luke!
For example:

    require_once 'simple_html_dom.php'; // path to the library file is an assumption

    $html = file_get_html('http://your_site.co.uk/');
    $data = null;
    foreach ($html->find('img.png') as $img) {
        // The image cannot be identified exactly, so we have to check it roughly like this.
        // Note: an <img> tag carries its URL in src, not href.
        if (preg_match("/gismeteo\.by/i", $img->src) && $img->width == 55 && $img->height == 55) {
            switch ($img->title) {
                case "Ясно":    // clear
                    $data = 1;
                    break;
                case "Облачно": // cloudy
                    $data = 2;
                    break;
                // ...as many cases as you need
                default:
                    $data = 0;  // fallback
                    break;
            }
        }
    }

That said, searching for "HTML parser for PHP" turns up a huge pile of material on parsers, some good and some not so good. How could you miss all of it?
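If installing a third-party library is not an option, roughly the same selection can be sketched with PHP's bundled DOM extension (`DOMDocument` and `DOMXPath`). This is a swapped-in technique, not the SimpleHtmlDom API from the answer above. The function name `findWeatherTitle` is hypothetical, the page is assumed to be UTF-8, and the XPath conditions simply mirror the checks in that answer. It needs PHP 7.1 or later.

```php
<?php
/**
 * Return the title of the first <img class="png"> that points to
 * gismeteo.by and is 55x55, or null if there is no such image.
 */
function findWeatherTitle(string $html): ?string
{
    // Collect libxml warnings about sloppy real-world markup instead of printing them.
    libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    // The XML prolog tells libxml to read the input as UTF-8 (an assumption about the page).
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $query = '//img[contains(concat(" ", normalize-space(@class), " "), " png ")'
           . ' and contains(@src, "gismeteo.by")'
           . ' and @width="55" and @height="55"]';

    foreach ($xpath->query($query) as $img) {
        return $img->getAttribute('title'); // first match wins
    }
    return null;
}

$sample = '<img class="png" src="//s3.gismeteo.by/static/images/icons/new/d.sun.png" alt="Ясно" title="Ясно" width="55" height="55">';
var_dump(findWeatherTitle($sample));
```

The returned title can then go through the same `switch` as in the answer above. Unlike a regular expression, this does not depend on the order of attributes inside the tag.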

    How about using a ready-made library: phpQuery?

      Try using regular expressions.

       $str = '<img class="png" src="//s3.gismeteo.by/static/images/icons/new/d.sun.png" alt="Ясно" title="Ясно" width="55" height="55">';
       // preg_match() returns 1 if the pattern matches and 0 if it does not
       $result = preg_match('/title="Ясно"/', $str);
      • If the line comes from a third-party site, use curl to fetch the remote page first, then apply the regular expression. If you can't write the parser yourself, you can always turn to the experts :) - ActivX