Hello, I have this HTML:

<input type="hidden" id="tracking__token" name="tracking[_token]" class="form-control" value="Abracadabra111" > 

How can I get "Abracadabra111"? I used the regexp /value=\"/ , but the result also contained extra trailing characters: " > .

  • How about using DOMDocument? Is that (correct) option being considered? - Wiktor Stribiżew
  • @stribizhev it is being considered :) (and if it can be done with simple_html_dom, even better) - misc

5 answers

I'd suggest this regular expression:

 ~value="(.*?)"~ 

It will work. .*? matches any number of characters (any characters at all, in our case), but as few as possible (that is, up to the first " ).
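To illustrate the "as few as possible" part, here is a minimal sketch comparing the greedy and the lazy quantifier. The input tag (with another attribute after value) and the helper name firstGroup are assumptions made for the example, not part of the question.

```php
<?php
// Return capture group 1 of $pattern in $html, or null when nothing matches.
function firstGroup(string $pattern, string $html): ?string
{
    return preg_match($pattern, $html, $m) === 1 ? $m[1] : null;
}

// Hypothetical input: value is followed by one more attribute.
$html = '<input type="hidden" value="Abracadabra111" class="form-control">';

// Greedy .* runs on to the LAST double quote in the string.
echo firstGroup('~value="(.*)"~', $html), "\n";  // Abracadabra111" class="form-control

// Lazy .*? stops at the FIRST double quote after value=".
echo firstGroup('~value="(.*?)"~', $html), "\n"; // Abracadabra111
```

On the tag from the question both variants happen to return the same string, because value is the last attribute there.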

The full code is below. Here is a more elaborate regular expression:

 ~<(?:.*?)value="(.*?)"(?:.*?)>~s 

It restricts the search to the inside of tags, and the s modifier makes . match newlines too, so the pattern also works on multi-line text. See https://regex101.com/r/qB9gS3/1, where the site explains each part in detail (in English, though).

  $str = '<input type="hidden" id="tracking__token" name="tracking[_token]" class="form-control" value="AJKY99B2mC__AP7vPza111" >';
  preg_match_all('~<(?:.*?)value="(.*?)"(?:.*?)>~s', $str, $matches);

  // $matches[0] holds the full matches, $matches[1] the first capture group.
  $i = 0;
  foreach ($matches as $match) {
      foreach ($match as $item) {
          echo "$i: ", htmlentities($item), "<br>";
      }
      $i++;
  }

Output:

 0: <input type="hidden" id="tracking__token" name="tracking[_token]" class="form-control" value="AJKY99B2mC__AP7vPza111" >
 1: AJKY99B2mC__AP7vPza111

UPD: you can also require a whitespace character before value: '~<(?:.*?)\svalue="(.*?)"(?:.*?)>~s'
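A small sketch of why the extra \s can matter. The tag with a data-value attribute and the helper name valueWith are hypothetical, chosen only to show the difference between the two patterns.

```php
<?php
// Return capture group 1 of $pattern in $tag, or null when nothing matches.
function valueWith(string $pattern, string $tag): ?string
{
    return preg_match($pattern, $tag, $m) === 1 ? $m[1] : null;
}

// Hypothetical tag: a data-value attribute comes before the real value.
$tag = '<input data-value="wrong" value="right">';

// Without \s the pattern latches onto the "value" inside "data-value".
echo valueWith('~<(?:.*?)value="(.*?)"(?:.*?)>~s', $tag), "\n";   // wrong

// With \s, "value" must follow whitespace, so it has to be its own attribute.
echo valueWith('~<(?:.*?)\svalue="(.*?)"(?:.*?)>~s', $tag), "\n"; // right
```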

    It is not at all clear how you could get anything with this regexp. In the simplest case, it should be something like this:

      /value=\"(.+)\"/ 

    or better yet:

     /value=\"([[:alnum:]]+)\"/ 

    Also, you do know that an array is returned, and that you need to look at index [1], not [0]?
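A minimal sketch of what the returned array looks like with preg_match, using the second pattern above and the tag from the question. The function name alnumValueMatch is an assumption for illustration.

```php
<?php
// Run the [[:alnum:]] pattern and return the raw $matches array.
function alnumValueMatch(string $str): array
{
    preg_match('/value="([[:alnum:]]+)"/', $str, $matches);
    return $matches; // empty array when nothing matched
}

$str = '<input type="hidden" id="tracking__token" name="tracking[_token]" class="form-control" value="Abracadabra111" >';
$matches = alnumValueMatch($str);

echo $matches[0], "\n"; // whole match:  value="Abracadabra111"
echo $matches[1], "\n"; // first group:  Abracadabra111
```

Note that [[:alnum:]] matches only letters and digits, so a value containing other characters, such as an underscore, will not match with this pattern.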

    • Index 0: <input type="hidden" id="tracking__token" name="tracking[_token]" class="form-control" value="JKY992esIt7vpuT9ZYqwfk99pibSJwB2mC__AP7vPzM" >
    • Well, you should have said that there are characters other than letters and digits in there. And what about the first option I suggested? In general, you will not get anywhere in programming without at least basic knowledge of regexp syntax. You should read at least one book, and for practice here is a resource: regex101.com - toxxxa

    I remember parsing pages with XPath queries. XPath is aimed mainly at XML, but it can process HTML as well. Here is a link to the article on Habrahabr
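For reference, a minimal sketch of the XPath approach using PHP's built-in DOMDocument and DOMXPath. It is not taken from the linked article; the function name tokenValueByXPath and the choice to select the input by its id are assumptions.

```php
<?php
// Return the value attribute of <input id="tracking__token">, or null if absent.
function tokenValueByXPath(string $html): ?string
{
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    $xpath = new DOMXPath($dom);
    // Select the value attribute node of the input with the given id.
    $nodes = $xpath->query('//input[@id="tracking__token"]/@value');

    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return $nodes->item(0)->nodeValue;
}

$html = '<input type="hidden" id="tracking__token" name="tracking[_token]" class="form-control" value="Abracadabra111" >';
echo tokenValueByXPath($html), "\n"; // Abracadabra111
```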

    • Although the answer can be found by following the link, it is better to state the essentials here and give the link as a source. If the linked page changes, the answer may become invalid. - Nick Volynkin
    • @NickVolynkin OK, I won't do that next time) - Roman

    With Simple HTML DOM, try:

     // Requires the Simple HTML DOM library; the include path is an assumption.
     require_once 'simple_html_dom.php';

     $str = '<input type="hidden" id="tracking__token" name="tracking[_token]" class="form-control" value="Abracadabra111" >';
     $html = str_get_html($str);
     foreach ($html->find('input[id=tracking__token]') as $element) {
         echo $element->value;
     }

    Here is another example:

     $html = '<input type="hidden" id="tracking__token" name="tracking[_token]" class="form-control" value="Abracadabra111" >';
     $dom = new DOMDocument('1.0', 'UTF-8');
     $dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
     $searchNode = $dom->getElementsByTagName("input");
     foreach ($searchNode as $s) {
         if ($s->getAttribute("name") == "tracking[_token]") {
             $value = $s->getAttribute('value');
             echo "$value\n";
         }
     }

       /<input\s+[^>]*?\s*value=(["'])?(.*?)\1/ 

      Take the second capture group; the first one holds the opening quote character.