How do I check that an a tag has an href attribute and that it is not empty, and set a correct value if it is missing or empty?

How can I implement this without DOMDocument? After saving, my encodings get broken, and in general I would rather not get involved with it for now.

I am trying to do it the following way, but it does not work:

    $html = <<<HTML
    <a href='' title='title1'>AAA</a>
    <a class='xyz' href='' title='title2'>BBB</a>
    <a title='title1'>CCC</a>
    <a class='xyz' title='title2'>DDD</a>
    HTML;

    $html = preg_replace_callback(
        '/<a(.*?)(href=)?([\'"][\'"])?(.*?)>/',
        function ($matches) {
            return "<a{$matches[1]}href='aaa/bbb'$matches[3]>";
        },
        $html
    );

    echo $html;

Please help.

Thanks.

  • Not on the subject, but as a side note: instead of preg_replace_callback you can simply use preg_replace in this case, where $matches[1] can be referred to simply as $1 - teran
  • Use DOMDocument + DOMXPath. XPath: //a[not(@href and string-length(@href) > 0)]. Then call .setAttribute("href", $val). - Wiktor Stribiżew
  • @WiktorStribiżew I specifically need a regular expression, without DOMDocument. I corrected the question - user216109
  • You need to use DOMDocument. It would be better to ask why the encodings get broken - andreymal

1 answer

So that the question does not go unanswered, I will propose one solution using regular expressions. Two patterns are used.
The first pattern, #<a(.+)href=(\'|\")\\2(.+?)>#, selects links with an empty href attribute. Either single or double quotes may be used, (\'|\"), and the backreference \\2 checks that the closing quote matches the opening one. In the replacement, the groups $1 and $3 carry over all the original attributes of the link.
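
To see the first pattern in isolation, here is a minimal sketch. The helper name fillEmptyHref and its $url parameter are hypothetical and not part of the answer; the sketch assumes one tag per line and a $url that contains no $ or backslash characters, since those would be read as replacement references.

```php
<?php
// Hypothetical helper (not from the answer): fill in empty href attributes.
function fillEmptyHref(string $line, string $url): string
{
    // Matches <a ... href='' ...> or <a ... href="" ...>.
    // Group 2 captures the opening quote; \2 requires the same closing quote.
    $pattern = '#<a(.+)href=(\'|")\2(.+?)>#';

    // $1 = everything before href, $3 = everything after the empty value.
    // preg_replace returns null on error, so fall back to the input.
    return preg_replace($pattern, "<a href='{$url}'\$1\$3>", $line) ?? $line;
}

// Empty href in single or double quotes gets a value.
echo fillEmptyHref("<a class='xyz' href='' title='t'>A</a>", 'first'), "\n";
echo fillEmptyHref('<a class="xyz" href="" title="t">B</a>', 'first'), "\n";

// A filled href does not match the pattern and is left untouched.
echo fillEmptyHref('<a class="xyz" href="/test">C</a>', 'first'), "\n";
```

Because the attributes before and after href are pasted back separately, the rewritten tag can end up with a doubled space between them, which is harmless in HTML.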

The second pattern, #<a.+href=(\'|\")[\w/]+\\1.*?>(*SKIP)(*FAIL)|<a(.+)>#, uses (*SKIP)(*FAIL) (which, by the way, I read about on ruSO a couple of days ago) to skip all links whose href attribute is filled in, leaving only the links where the attribute is missing.
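
The (*SKIP)(*FAIL) idea can be tried on its own with a small sketch. The helper name addMissingHref and its $url parameter are hypothetical; like the answer, the sketch assumes each line holds a single a tag, because the greedy .+ would otherwise run across several tags.

```php
<?php
// Hypothetical helper (not from the answer): add href to <a> tags that lack it.
function addMissingHref(string $line, string $url): string
{
    // Left alternative: a tag with a filled href. Once it matches,
    // (*SKIP)(*FAIL) discards the match and resumes the search after it.
    // Right alternative: any remaining <a ...>, captured in group 2.
    $pattern = '#<a.+href=(\'|")[\w/]+\1.*?>(*SKIP)(*FAIL)|<a(.+)>#';

    return preg_replace($pattern, "<a href='{$url}'\$2>", $line) ?? $line;
}

// No href at all: the attribute is added.
echo addMissingHref("<a title='t'>X</a>", 'second'), "\n";

// Filled href made of word characters and slashes: skipped, line unchanged.
echo addMissingHref('<a class="xyz" href="/test">Y</a>', 'second'), "\n";
```

One caveat, as far as this sketch goes: [\w/]+ only covers letters, digits, underscores and slashes, so a value such as a.html or a full URL with a colon is not recognised as filled and the tag would receive a second href. Widening the class, for example to [^'"]+, is one possible adjustment.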

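    foreach (explode("\n", $html) as $h) {
        $result = preg_replace(
            [
                // 1) href is present but empty
                "#<a(.+)href=(\'|\")\\2(.+?)>#",
                // 2) skip tags with a filled href, match the remaining ones
                "#<a.+href=(\'|\")[\w/]+\\1.*?>(*SKIP)(*FAIL)|<a(.+)>#",
            ],
            ["<a href='first'$1$3>", "<a href='second'$2>"],
            $h
        );
        print_r([$h, $result]); // 0 is the original, 1 is the result
    }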

The output is as follows (0 is the original, 1 is the result):

    Array(
        [0] => <a href='' title='title1'>Single quotes, empty</a>
        [1] => <a href='first' title='title1'>Single quotes, empty</a>
    )
    Array(
        [0] => <a class='xyz' href='' title='title2'>Empty, single quotes, in the middle</a>
        [1] => <a href='first' class='xyz' title='title2'>Empty, single quotes, in the middle</a>
    )
    Array(
        [0] => <a title='title1'>None</a>
        [1] => <a href='second' title='title1'>None</a>
    )
    Array(
        [0] => <a class='xyz' title='title2'>None</a>
        [1] => <a href='second' class='xyz' title='title2'>None</a>
    )
    Array(
        [0] => <a class="xyz" title="title2" href="/test">Double quotes, link</a>
        [1] => <a class="xyz" title="title2" href="/test">Double quotes, link</a>
    )
    Array(
        [0] => <a class="xyz" title="title2" href='/test'>Single quotes, link</a>
        [1] => <a class="xyz" title="title2" href='/test'>Single quotes, link</a>
    )
    Array(
        [0] => <a href="qeqweqwe" class="test">Double quotes, link</a>
        [1] => <a href="qeqweqwe" class="test">Double quotes, link</a>
    )

The first rule sets the attribute to first, the second sets it to second.