Removing attributes from all HTML tags except <img> (PHP)

Question

I need to strip from the HTML the styles that a visual editor has written into it. This is what I do:

 $handle = @fopen("new_item.htm", "r");
 if ($handle) {
     while (!feof($handle)) {
         // Read a line, stripping every tag except the allowed ones
         $buffer = fgetss($handle, 4096, '<img>,<title>,</title>,<table>,<tr>,</tr>,<td></td>,</table>');
         // Reduce each remaining opening tag to its bare name (attributes removed)
         $html_clean = trim(preg_replace('/<([a-z][a-z0-9]*)[^>]*?(\/?)>/i', '<$1$2>', $buffer));
         echo($html_clean);
     }
     fclose($handle);
 }

As a result, I get the kind of code I need, but the <img> tags lose their attributes as well.

How can I extend the regular expression so that it leaves the contents of <img> untouched?

I want to get something like:

 <table>
 <tr><td><img src="/path/img.jpg" width="100" height="400"></td></tr>
 </table>

Maybe this: preg_replace('/<img\b[^<]*>(*SKIP)(*F)|<([a-z][a-z0-9]*)[^>]*?(\/?)>/i','<$1$2>',$buffer) ?
If you add your comment as an answer, I can close the question :)
What is a proper HTML parser, and why shouldn't this be done with regular expressions?

Wiktor Stribiżew · Accepted Answer · 2017-02-14T09:24:48

You can capture the whole <img> tag and restore it with a backreference in the replacement pattern, but PCRE offers another way: the match-skipping verbs (*SKIP)(*F).

 preg_replace('/<img\b[^<]*>(*SKIP)(*F)|<([a-z][a-z0-9]*)[^>]*?(\/?)>/i','<$1$2>',$buffer)

<img\b[^<]*>(*SKIP)(*F)| means: match <img , then a word boundary (the g must not be followed by a letter, digit or _ ), then zero or more characters other than < , and then > . Once this part has matched, the whole match is discarded and the search for further matches resumes from the position where the discarded match ended.
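To see the effect on a single string, here is a minimal sketch. The sample markup in $buffer is made up for illustration (it is not the asker's file), and the pattern is the one from the answer.

```php
<?php
// Hypothetical sample input: editor-generated markup with extra attributes.
$buffer = '<table style="width:100%"><tr class="row"><td align="left">'
        . '<img src="/path/img.jpg" width="100" height="400"></td></tr></table>';

// First alternative: an <img ...> tag is matched and then thrown away by
// (*SKIP)(*F), so it is never replaced.
// Second alternative: any other opening tag is reduced to its name plus an
// optional trailing slash. Closing tags do not match and stay as they are.
$pattern = '/<img\b[^<]*>(*SKIP)(*F)|<([a-z][a-z0-9]*)[^>]*?(\/?)>/i';

$html_clean = preg_replace($pattern, '<$1$2>', $buffer);

echo $html_clean, PHP_EOL;
// <table><tr><td><img src="/path/img.jpg" width="100" height="400"></td></tr></table>
```

Note that [^<]* is greedy, so the skipped region runs up to the last > before the next < . That only skips plain text, which the main pattern would not have touched anyway.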

There can be more exceptions; just add them as alternatives in a group: /(?:OTHER_EXCEPTION_PATTERN|<img\b[^<]*>)(*SKIP)(*F)|YOUR_MAIN_PATTERN/i .
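As a rough illustration of that template, the sketch below also protects <a> tags. Choosing <a> as the second exception and the sample string are assumptions made for the example only.

```php
<?php
// Hypothetical sample input; <a> and <img> should keep their attributes.
$buffer = '<p class="x"><a href="/page" style="color:red">link</a> '
        . '<img src="/i.png" alt=""><br style="clear:both"/></p>';

// Exceptions are grouped in (?:...|...) before (*SKIP)(*F).
// <a\b does not match <abbr, because \b requires a word boundary after "a".
$pattern = '/(?:<a\b[^<]*>|<img\b[^<]*>)(*SKIP)(*F)'
         . '|<([a-z][a-z0-9]*)[^>]*?(\/?)>/i';

$html_clean = preg_replace($pattern, '<$1$2>', $buffer);

echo $html_clean, PHP_EOL;
// <p><a href="/page" style="color:red">link</a> <img src="/i.png" alt=""><br/></p>
```

The self-closing slash of <br/> survives because it is captured by the second group and put back by $2.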
