I work in PHP. I need to replace every link whose href ends in an image extension (jpg, png, etc.). This is what I do:

$row["message"] = preg_replace('/<a\shref=\"(.+?[jpeg|jpg|png])\"\starget=\"_blank\">(.+?)<\/a>/is', '<a data-fancybox="gallery" href="$1"><img src="$1" alt="" class="tmp_class"></a>', $row["message"]); 

For testing I use the following string:

 <p>С этой формы приходят заявки <a href="https://site.com/lack_tech.php">https://site.com/lack_tech.php</a></p> <p>Или что ты имеешь ввиду?</p> <div class="attachment_files_message"> <p>Прикреплённые файлы:</p> <a href="http://site.com/public/uploads/kylticket/2670/Screenshot_1.png" target="_blank">Screenshot_1.png</a> </div> 

That is, only the second link should be replaced, but my code replaces everything from the start of the first link <a href="... up to the last closing tag </a>.

  • [jpeg|jpg|png] = [jegnp|]. Replace \"(.+?[jpeg|jpg|png])\" with "([^"]*(?:jpe?g|png))". But you had better parse with DOM first, then everything will be easier. - Wiktor Stribiżew
  • @WiktorStribiżew thanks, your solution helped; please post it as a full answer ... - Vladimir

1 answer

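The pattern [jpeg|jpg|png] is identical to [jegnp|], since [...] is a character class that matches exactly one character from the listed set. Also, .+? can capture too much text, because the dot matches any character (including quotes and, with the s flag, line breaks). Laziness does not help here: the match starts at the first <a href=" and simply extends until the rest of the pattern can match.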
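
To make the character-class behaviour concrete, here is a minimal sketch. The sample strings are made up for illustration and are not from the question.

```php
<?php
// [jpeg|jpg|png] is a set of single characters: j, p, e, g, n and a literal |.
$class = '/^[jpeg|jpg|png]$/';

$singleChar = preg_match($class, 'p');   // 1: one character from the set
$pipeChar   = preg_match($class, '|');   // 1: inside [...] the | is a literal
$wholeWord  = preg_match($class, 'png'); // 0: a class matches one character only

// A non-capturing group with alternation matches the whole extensions instead.
$group = '/^(?:jpe?g|png)$/';

$jpeg = preg_match($group, 'jpeg'); // 1
$gif  = preg_match($group, 'gif');  // 0

var_dump($singleChar, $pipeChar, $wholeWord, $jpeg, $gif);
```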

If you replace \"(.+?[jpeg|jpg|png])\" with "([^"]*\.(?:jpe?g|png))", the error goes away, but the expression will still fail in some cases (a different amount of whitespace, or the target=\"_blank\" attribute missing or placed elsewhere).
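
As a rough sketch of that minimal fix, the snippet below applies the corrected pattern to a shortened input. The sample string is hypothetical; only the pattern and the replacement come from the question and the fix above.

```php
<?php
// Hypothetical sample input: one ordinary link followed by one image link.
$message = '<a href="https://site.com/lack_tech.php">page</a> '
         . '<a href="http://site.com/Screenshot_1.png" target="_blank">Screenshot_1.png</a>';

// [^"]* cannot cross a closing quote, so a match stays inside a single href value.
$pattern = '/<a\shref="([^"]*\.(?:jpe?g|png))"\starget="_blank">(.+?)<\/a>/is';
$replacement = '<a data-fancybox="gallery" href="$1"><img src="$1" alt="" class="tmp_class"></a>';

$result = preg_replace($pattern, $replacement, $message);
echo $result;
```

The first link is left untouched because its href does not end in an image extension; only the second link is rewritten.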

DOMDocument seems more appropriate in this case.

See the PHP demo:

    $html = <<<EOD
    <p>С этой формы приходят заявки <a href="https://site.com/lack_tech.php">https://site.com/lack_tech.php</a></p> <p>Или что ты имеешь ввиду?</p> <div class="attachment_files_message"> <p>Прикреплённые файлы:</p> <a href="http://site.com/public/uploads/kylticket/2670/Screenshot_1.png" target="_blank">Screenshot_1.png</a> </div>
    EOD;

    $dom = new DOMDocument(); // Create the DOM
    // Parse the HTML without adding implied <html>/<body> wrappers or a doctype.
    // Note: the 'HTML-ENTITIES' encoding is deprecated as of PHP 8.2.
    $dom->loadHTML(mb_convert_encoding($html, 'HTML-ENTITIES', 'UTF-8'), LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    $xpath = new DOMXPath($dom); // Initialize XPath over the DOM

    foreach ($xpath->query('//a[@target="_blank"]') as $OurNode) {
        // Only touch links whose href ends with an image extension
        if (preg_match('~\.(?:jpe?g|png)$~i', $OurNode->getAttribute('href'))) {
            $fragment = $dom->createDocumentFragment();
            // Build the new <a data-fancybox="gallery" href="...">
            $aNode = $dom->createElement('a');
            $aNode->setAttribute('data-fancybox', 'gallery');
            $aNode->nodeValue = '';
            $aNode->setAttribute('href', $OurNode->getAttribute('href'));
            // Build the <img> child that shows the linked image
            $img = $dom->createElement('img');
            $img->setAttribute("src", $OurNode->getAttribute('href'));
            $img->setAttribute("alt", "");
            $img->setAttribute("class", "tmp_class");
            $aNode->appendChild($img);
            $fragment->appendChild($aNode);
            // Swap the old link for the new one
            $OurNode->parentNode->replaceChild($fragment, $OurNode);
        }
    }
    echo mb_convert_encoding($dom->saveHTML(), 'UTF-8', 'HTML-ENTITIES');

Result

 <p>С этой формы приходят заявки <a href="https://site.com/lack_tech.php">https://site.com/lack_tech.php</a><p>Или что ты имеешь ввиду?</p><div class="attachment_files_message"> <p>Прикреплённые файлы:</p> <a data-fancybox="gallery" href="http://site.com/public/uploads/kylticket/2670/Screenshot_1.png"><img src="http://site.com/public/uploads/kylticket/2670/Screenshot_1.png" alt="" class="tmp_class"></a> </div></p> 

In foreach ($xpath->query('//a[@target="_blank"]') as $OurNode) we select all a tags whose target attribute equals _blank. Then, if the value of the href attribute ends with .jpeg, .jpg or .png, we create a new a tag, set its attributes, append an img child element with its own attributes, and finally replace the old a tag with the new one.
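
The extension check in the loop tests the whole href, so a link such as .../pic.png?size=2 would not qualify. If such links can occur (an assumption; the question shows none), a sketch of an alternative check based on parse_url and pathinfo might look like this. The helper name hasImageExtension is hypothetical.

```php
<?php
// Hypothetical helper: true when the path part of the URL ends with
// .jpg, .jpeg or .png, ignoring letter case and any query string.
function hasImageExtension(string $url): bool
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return false; // no path component, or a URL that could not be parsed
    }
    $extension = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    return in_array($extension, ['jpg', 'jpeg', 'png'], true);
}

var_dump(hasImageExtension('http://site.com/public/uploads/kylticket/2670/Screenshot_1.png'));
var_dump(hasImageExtension('https://site.com/lack_tech.php'));
```

Inside the loop it could stand in for the preg_match call, for example if (hasImageExtension($OurNode->getAttribute('href'))) { ... }.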

If you still want to use a regular expression here, try:

 preg_replace('~<a\s+((?:[^\s<>\'"=]+(?:=(?:"[^"]*"|\'[^\']*\'|[^\s\'">]+))?\s+)*)href=(?|"([^"]*\.(?:jpe?g|png))"|\'([^\']*\.(?:jpe?g|png))\'|([^\s\'">]*\.(?:jpe?g|png))(?=[>\s]))((?:\s+[^\s<>\'"=]+(?:=(?:"[^"]*"|\'[^\']*\'|[^\s\'">]+))?)*)\s*>(.*?)</a>~is', '<a $1$3 data-fancybox="gallery" href="$2"><img src="$2" alt="" class="tmp_class"></a>', $row["message"]) 

Regular expression demo

Details

  • <a - the substring <a
  • \s+ - 1+ whitespace characters
  • ((?:[^\s<>'"=]+(?:=(?:"[^"]*"|'[^']*'|[^\s'">]+))?\s+)*) - capturing group 1: 0 or more attributes with optional values, each followed by whitespace (i.e. there may be something like required, target='_blank', etc.)
  • href=(?|"([^"]*\.(?:jpe?g|png))"|'([^']*\.(?:jpe?g|png))'|([^\s'">]*\.(?:jpe?g|png))(?=[>\s])) - href= followed by a branch reset group (?|...) in which every alternative captures into group 2: the text inside "..." or '...', or an unquoted run of characters other than whitespace, ', " and > (which must be followed by whitespace or >), provided the href value ends with . followed by jpeg, jpg or png
  • ((?:\s+[^\s<>'"=]+(?:=(?:"[^"]*"|'[^']*'|[^\s'">]+))?)*) - capturing group 3: 0 or more attributes with optional values, each preceded by whitespace (i.e. there could be something like required, target='_blank', etc.)
  • \s*> - 0+ whitespace characters and >
  • (.*?) - capturing group 4: 0 or more of any characters, as few as possible, up to the first occurrence of
  • </a> - the substring </a>.
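
The branch reset group is the least familiar piece of the pattern, so here is a small sketch of it in isolation. The pattern is a simplified version without the extension check, and the three sample tags are made up.

```php
<?php
// With (?|...) every alternative numbers its capturing group from the same index,
// so the href value lands in $m[1] whichever quoting style was used.
$pattern = '~href=(?|"([^"]*)"|\'([^\']*)\'|([^\s\'">]+))~';

$values = [];
foreach (['<a href="a.png">', "<a href='b.jpg'>", '<a href=c.jpeg>'] as $tag) {
    preg_match($pattern, $tag, $m);
    $values[] = $m[1];
}

print_r($values); // a.png, b.jpg, c.jpeg
```

Without the branch reset, the three alternatives would be groups 1, 2 and 3, and the replacement string could not refer to the href value with a single reference.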

Replace the match with <a $1$3 data-fancybox="gallery" href="$2"><img src="$2" alt="" class="tmp_class"></a>, because capturing groups 1 and 3 may contain additional attributes.

See also the PHP demo:

    $message = '<p>С этой формы приходят заявки <a href="https://site.com/lack_tech.php">https://site.com/lack_tech.php</a></p> <p>Или что ты имеешь ввиду?</p> <div class="attachment_files_message"> <p>Прикреплённые файлы:</p> <a href="http://site.com/public/uploads/kylticket/2670/Screenshot_1.png" target="_blank">Screenshot_1.png</a> </div>';
    $message = preg_replace('~<a\s+((?:[^\s<>\'"=]+(?:=(?:"[^"]*"|\'[^\']*\'|[^\s\'">]+))?\s+)*)href=(?|"([^"]*\.(?:jpe?g|png))"|\'([^\']*\.(?:jpe?g|png))\'|([^\s\'">]*\.(?:jpe?g|png))(?=[>\s]))((?:\s+[^\s<>\'"=]+(?:=(?:"[^"]*"|\'[^\']*\'|[^\s\'">]+))?)*)\s*>(.*?)</a>~is', '<a $1$3 data-fancybox="gallery" href="$2"><img src="$2" alt="" class="tmp_class"></a>', $message);
    echo $message;

Result

 <p>С этой формы приходят заявки <a href="https://site.com/lack_tech.php">https://site.com/lack_tech.php</a></p> <p>Или что ты имеешь ввиду?</p> <div class="attachment_files_message"> <p>Прикреплённые файлы:</p> <a target="_blank" data-fancybox="gallery" href="http://site.com/public/uploads/kylticket/2670/Screenshot_1.png"><img src="http://site.com/public/uploads/kylticket/2670/Screenshot_1.png" alt="" class="tmp_class"></a> </div>