Important: not all tags should be removed. The tags on the whitelist must stay: a, b, i, u, ul, li, ol, img

Here is my code (it removes all tags):

 pole = pole.replace(/<!--[\s\S]*?--!?>/g, '').replace(/<\/?[a-z][^>]*(>|$)/gi, '');

In addition, attributes must be removed from the whitelisted tags.

  • Apparently this is about JavaScript in the browser. That means you already have the perfect tool for the job: the DOM and its API. A little recursion and some very simple logic will solve your problem without regular expressions. And, of course, the obligatory link: stackoverflow.com/a/1732454/1510966 - Alexey Ukolov

2 answers

 text = "<a class=test><i><b><pre></pre></b></div><img src=/test.png>";
 text = text.replace(/<(\/?)([a-z]+)[^>]*(>|$)/gi, function(match, slash, tag) {
   if (["a", "b", "i", "u", "ul", "li", "ol", "img"].indexOf(tag) < 0) {
     // the tag is not on the whitelist: remove it completely
     return '';
   }
   // keep a good tag, but without its attributes
   return '<' + slash + tag + '>';
 });
 console.log(text); // prints <a><i><b></b><img>

Of course, this example needs refining, for instance to handle tags written in capital letters. Also, an <img> without its src attribute is certainly of no use to you.
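The two refinements just mentioned could be sketched roughly like this. It is only an illustration under assumptions: the names ALLOWED and sanitizeTags are made up for this example, and the src extraction is a simple pattern that expects src to appear as a separate attribute, not a full HTML attribute parser.

```javascript
// Hypothetical names (ALLOWED, sanitizeTags), used only for this sketch.
const ALLOWED = ["a", "b", "i", "u", "ul", "li", "ol", "img"];

function sanitizeTags(html) {
  return html.replace(/<(\/?)([a-z]+)([^>]*)(>|$)/gi, (match, slash, tag, attrs) => {
    // Compare in lower case so that <A> and <IMG> are recognised too.
    const name = tag.toLowerCase();
    if (!ALLOWED.includes(name)) {
      return ""; // not on the whitelist: drop the tag entirely
    }
    if (name === "img" && !slash) {
      // Keep only the src attribute (quoted or unquoted); drop an <img> without one.
      const src = /\ssrc\s*=\s*("[^"]*"|'[^']*'|[^\s"'>]+)/i.exec(attrs);
      return src ? `<img src=${src[1]}>` : "";
    }
    // Any other whitelisted tag is kept without attributes.
    return `<${slash}${name}>`;
  });
}

console.log(sanitizeTags('<A CLASS=test><i><B><pre></pre></B></div><IMG SRC="/test.png" onerror="alert(1)">'));
// prints <a><i><b></b><img src="/test.png">
```

The tag name is normalised to lower case on output; if the original case should be preserved, return tag instead of name.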

All of this holds only if you ignore the fact that you are solving the wrong problem with the wrong tools. What if a tag attribute contains < ? ..
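To illustrate the kind of input that trips up the regex approach, here is a small sketch. It uses an attribute value containing >, since that is the character at which the [^>]* part of the pattern stops. The function name stripWithRegex is made up for this example.

```javascript
// Hypothetical helper wrapping the same pattern as in the answer above.
const WHITELIST = ["a", "b", "i", "u", "ul", "li", "ol", "img"];

function stripWithRegex(html) {
  return html.replace(/<(\/?)([a-z]+)[^>]*(>|$)/gi, (match, slash, tag) =>
    WHITELIST.includes(tag.toLowerCase()) ? `<${slash}${tag.toLowerCase()}>` : "");
}

// The regex treats the > inside the attribute value as the end of the tag,
// so the rest of the attribute leaks into the text.
console.log(stripWithRegex('<a title="1 > 0">link</a>'));
// prints <a> 0">link</a>
```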

  • Thanks. An attribute will never contain < ; I am cleaning up text that comes out of an editor, and all it does is insert <p style="..."> and the like. How do I make an exception for an attribute in <img> ? - Abmin

Alas, I do not know JavaScript, so I wrote an example in Ruby:

 # sample input with comments, unpaired tags and attributes
 str = <<'EOT'
 This is <!--sample text--> with <a href="#">some</a> <br>markup <br /> <b class="11">and</b> <amg src="1"> <area>unpaired</area> <strong> tags</strong> and, <!--comments-->, <ul id="w"> <li id="id" class="class">1</li></ul>. <u class="well, even like this" />
 EOT
 # pass 1: remove every tag that is not on the whitelist
 # pass 2: strip attributes from the whitelisted tags
 puts str.gsub(/(<(?!\/?a\b)(?!\/?b\b)(?!\/?i\b)(?!\/?u\b)(?!\/?ul\b)(?!\/?li\b)(?!\/?ol\b)(?!\/?img\b)[^>]+\s*\/?>)/m, '')
         .gsub(/(<(?:a|b|i|u|ul|li|ol|img)\b)(\s[^>]+?)(\s*\/?>)/m, '\\1\\3')

Result:

 This is  with <a>some</a> markup  <b>and</b>  unpaired  tags and, , <ul> <li>1</li></ul>. <u />

In summary

The first global substitution removes every tag that is not on the whitelist:

 /(<(?!\/?a\b)(?!\/?b\b)(?!\/?i\b)(?!\/?u\b)(?!\/?ul\b)(?!\/?li\b)(?!\/?ol\b)(?!\/?img\b)[^>]+\s*\/?>)/m,'' 

The second regexp strips the attributes from the whitelisted tags:

 /(<(?:a|b|i|u|ul|li|ol|img)\b)(\s[^>]+?)(\s*\/?>)/m,'\\1\\3'

Online at ideone

So no loops are needed in the program; the regexps can do everything themselves :)

P.S. Naturally, if case does not matter, add the "i" modifier to the regexps.
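Since the question is about JavaScript, the same two passes might be ported roughly as follows, with the "i" flag already added. This is a sketch, not a tested drop-in: the names NOT_WHITELISTED, ATTRIBUTES and cleanMarkup are invented for the example, and the eight lookaheads are folded into a single alternation.

```javascript
// Pass 1: any tag (or comment) whose name is not on the whitelist.
const NOT_WHITELISTED = /<(?!\/?(?:a|b|i|u|ul|li|ol|img)\b)[^>]+>/gi;
// Pass 2: a whitelisted opening tag followed by attributes.
const ATTRIBUTES = /(<(?:a|b|i|u|ul|li|ol|img)\b)\s[^>]+?(\s*\/?>)/gi;

function cleanMarkup(html) {
  return html
    .replace(NOT_WHITELISTED, "")   // drop everything that is not whitelisted
    .replace(ATTRIBUTES, "$1$2");   // keep the tag name and the closing bracket only
}

console.log(cleanMarkup('<!--note--><a href="#">some</a><br><B class="11">and</B><strong>tags</strong><u class="x y" />'));
// prints <a>some</a><B>and</B>tags<u />
```

Note that the tag name keeps its original case here, because the replacement reuses the matched text.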