How do I parse out the link I need when many elements on the page share the same class and the index of the one I want is always different (I mean [0], etc.)? The class and the beginning of the link are known. Example:

    <?php
    include 'dom.php';

    $html = file_get_html($_POST['name']);
    // Attempted selector: class and beginning of the href in one expression
    $link = $html->find('a[class=classname href=http://site.co/]');
    ?>

The search needs to use two parameters:

  1. the class
  2. the beginning of the link

I need to extract the link itself (the URL), not its text, and then store that link in another variable.

  • This is unlikely to be possible with this parser. - Visman
  • I have already done too much to switch to another language :) - Arthur Kotov
  • It is easier to find the desired link with the desired class on the page with a regular expression than with this parser. Many will probably argue the opposite, but then I would like to see a solution from those people (I wrote this for the approvers) ;) - Visman
  • Regular expressions can be used in the attribute selectors there. Besides, what prevents you from selecting the links by class and then filtering the result by URL by hand? - teran
  • This is not the end of the script: after that I need to insert this link into another variable and then collect information using that variable. - Arthur Kotov

1 answer

In the search expression, specify the link's class in the usual way, a.classname, and check the beginning of the link with the [href^='....'] attribute selector:

    // str_get_html() comes from the same parser library that provides file_get_html()
    $body = <<<HTML
    <div>
      <a class="link" href="http://yandex.ru">yandex</a>
      <a class="link needle" href="http://site.co">site.co</a>
      <a class="link" href="http://google.com">google</a>
      <a class="xxx" href="http://bing.com">bing</a>
    </div>
    HTML;

    $html = str_get_html($body);

    // Class "link" AND an href that starts with http://site.co
    $links = $html->find("a.link[href^='http://site.co']");
    foreach ($links as $l) {
        print_r($l->href);
    }

After this code runs, $links contains an array of the matching links. In this case it holds a single link, the one pointing to site.co.


To get only the first element found, pass index 0 as the second argument to the find method. You can save the link's URL and its text for later use through the href and plaintext properties:

    $lnk = $html->find("a.link[href^='http://site.co']", 0);
    $url = $lnk->href;
    $txt = $lnk->plaintext;
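
For comparison, the same two-condition lookup can be sketched with PHP's bundled DOM extension (DOMDocument and DOMXPath) instead of the parser used above. This is only an illustration under assumptions: the function name findLinkByClassAndPrefix is hypothetical, and the class and prefix values are assumed not to contain single quotes, since they are placed directly into the XPath string.

```php
<?php
// Hypothetical helper: returns the href of the first <a> that has the given
// class and whose href starts with the given prefix, or null if none matches.
function findLinkByClassAndPrefix(string $html, string $class, string $prefix): ?string
{
    $doc = new DOMDocument();
    // Real-world markup is rarely perfect; collect libxml warnings silently.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // The class test matches a whole token in a space-separated class list,
    // so "link" matches class="link needle" but not class="linked".
    // Assumption: $class and $prefix contain no single quotes.
    $query = sprintf(
        "//a[contains(concat(' ', normalize-space(@class), ' '), ' %s ')]"
        . "[starts-with(@href, '%s')]",
        $class,
        $prefix
    );

    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return $nodes->item(0)->getAttribute('href');
}
```

Called as findLinkByClassAndPrefix($body, 'link', 'http://site.co'), it returns the URL as a plain string that can be stored in another variable, which is what the question asks for.
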
  • It really works. The downside is that only one class can be specified. - Visman
  • @Visman, as I wrote in the comments to the question, you can put a regular expression into a[class*=.....], where the dots stand for an expression that filters the class. For example, a[class*=class1.+class2] will match links containing both classes, and a[class*=link\d+] will find link1, link2, and so on. - teran
  • Then I have marked this question, ru.stackoverflow.com/q/561117/186083, as a duplicate of the question with your answer. - Visman
  • Many thanks) - Arthur Kotov
  • @ArturKotov if this is what you need, then mark the answer as correct - teran
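
The alternative raised in the comments, selecting the links and then filtering the result by hand, also covers the case where several classes are required. The following is a minimal sketch of that idea, assuming PHP's bundled DOM extension rather than the parser from the question; the function name filterLinks and its parameters are hypothetical.

```php
<?php
// Hypothetical helper: returns the hrefs of all <a> elements that carry ALL of
// the required classes and whose href starts with the given prefix.
function filterLinks(string $html, array $requiredClasses, string $prefix): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings silently instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $urls = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        // Split the class attribute into individual class names.
        $classes = preg_split('/\s+/', trim($a->getAttribute('class')), -1, PREG_SPLIT_NO_EMPTY);
        $href = $a->getAttribute('href');

        // Every required class must be present on the element.
        $hasAll = count(array_diff($requiredClasses, $classes)) === 0;
        // Prefix check on the URL (strncmp also works on PHP versions before 8).
        $startsWith = strncmp($href, $prefix, strlen($prefix)) === 0;

        if ($hasAll && $startsWith) {
            $urls[] = $href;
        }
    }
    return $urls;
}
```

Because the filtering happens in ordinary PHP, the conditions can be changed freely, for example by replacing the prefix check with preg_match.
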