I fetch the page content with cURL and pass it through the htmlspecialchars function. The output is a large amount of code, so I will show only the relevant section:

 <script type=\"text\/javascript\" src=\"http:\/\/{domain}\/search.js?p=&query=Hlo9oDDbbm4{ch}&keywords=phone\"><\/script> 

I put the code above in the $text variable and try to extract the value Hlo9oDDbbm4:

 preg_match("/search.js?p=&query=(.*?){ch}&keywords=phone/",$text,$matches); echo $matches[1]; 

but nothing is printed. How should I write the regular expression?

  • s? in your expression means that the letter s is optional (present or absent), so the question mark itself is not expected in the string. You need to put a backslash in front of it so that it stands for the literal character. And since your pattern is in a double-quoted string, you should double the backslash, i.e. \\.js\\? - Mike

2 answers

You forgot to escape . and ? in the pattern. Just correct the search.js?p=&query=(.*?){ch}&keywords=phone pattern:

 preg_match("/search\\.js\\?p=&query=(.*?){ch}&keywords=phone/",$text,$matches); 
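To see why escaping matters, here is a minimal sketch that runs both patterns against a sample string. The contents of $text are an assumption modelled on the fragment from the question, and the patterns are written in single quotes so that one backslash is enough.

```php
<?php
// Sample input: an assumption modelled on the fragment from the question.
$text = '<script type="text/javascript" src="http://{domain}/search.js?p=&query=Hlo9oDDbbm4{ch}&keywords=phone"></script>';

// Unescaped: "s?" makes the "s" optional and "." matches any character,
// so the literal "?" in the URL is never consumed and the match fails.
$broken = preg_match('/search.js?p=&query=(.*?){ch}&keywords=phone/', $text, $m1);

// Escaped: "\." and "\?" match the literal dot and question mark.
$fixed = preg_match('/search\.js\?p=&query=(.*?){ch}&keywords=phone/', $text, $m2);

var_dump($broken);       // int(0): no match
echo $m2[1], PHP_EOL;    // Hlo9oDDbbm4
```

Checking the return value of preg_match before reading $matches[1] avoids an undefined-index notice when nothing matches.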
  • Downvote, because HTML should be parsed through the DOM tree, not with regular expressions. - Naumov
  • @Venta, thank you! Your answer helped. ;) - Cinema Trailers

You already get the page content through curl, so there is no reason to escape it with htmlspecialchars. Instead, load the variable into a parser; here is an example with simple_html_dom:

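 $html = str_get_html($content);
 foreach ($html->find('script') as $element) {
     $src = $element->src;
     // strpos() may return 0, so compare against false explicitly
     if (strpos($src, '/search.js') !== false) {
         parse_str(parse_url($src, PHP_URL_QUERY), $params);
         echo $params['query']; // the value we are looking for
     }
 }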

This approach is more correct.
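The same DOM-based idea can be sketched with PHP's bundled DOMDocument class, for cases where simple_html_dom is not installed. This is only an illustration under assumptions: the sample markup and the example.com host are made up, and the $queries array is a hypothetical name used to collect every matching script rather than just one.

```php
<?php
// Sample markup: an assumption standing in for the page fetched with cURL.
$content = '<html><head>'
    . '<script type="text/javascript" src="http://example.com/search.js?p=&amp;query=Hlo9oDDbbm4{ch}&amp;keywords=phone"></script>'
    . '</head><body></body></html>';

$dom = new DOMDocument();
// Real-world HTML is rarely valid, so suppress libxml warnings here.
@$dom->loadHTML($content);

$queries = [];
foreach ($dom->getElementsByTagName('script') as $script) {
    $src = $script->getAttribute('src');
    if (strpos($src, '/search.js') === false) {
        continue; // not the script we are after
    }
    // Take the query string of the URL and split it into key/value pairs.
    parse_str((string) parse_url($src, PHP_URL_QUERY), $params);
    if (isset($params['query'])) {
        $queries[] = $params['query'];
    }
}

print_r($queries);
```

Note that the query parameter comes back with the {ch} suffix still attached, so it would need to be trimmed separately if only Hlo9oDDbbm4 is wanted.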

  • Upvoted, because I learned something, although it's not entirely clear what to do if search.js is included several times. - Venta
  • @Venta Well, and how would that work with the regex? Besides, a regex walks through the string character by character, finds a match and then searches for the next occurrence, possibly several times, depending on the regular expression ... - Naumov
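For the case raised in these comments, where search.js is included more than once, a regex-based sketch could use preg_match_all to collect every occurrence. The two-script sample string and the second value Abc123 are assumptions made up for illustration.

```php
<?php
// Sample input with two includes: an assumption for illustration only.
$text = '<script src="http://{domain}/search.js?p=&query=Hlo9oDDbbm4{ch}&keywords=phone"></script>'
      . '<script src="http://{domain}/search.js?p=&query=Abc123{ch}&keywords=phone"></script>';

// preg_match_all() keeps searching after each match instead of stopping at the first.
$count = preg_match_all('/search\.js\?p=&query=(.*?){ch}&keywords=phone/', $text, $matches);

// $matches[1] holds the first capture group of every match, in order.
print_r($matches[1]);
```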