Please help me write a macro for cleaning tags. I need to strip unnecessary "garbage" from the tags in parsed descriptions.
There is a macro that removes all attributes in tags:

Sub ЧисткаHTML()
    Dim cell As Range
    ' Clean every cell in the first column of the used range
    For Each cell In ActiveSheet.UsedRange.Columns(1).Cells
        cell.Value = HTML_DeleteAttributes(cell.Value)
    Next cell
End Sub

Function HTML_DeleteAttributes(ByVal txt$)
    On Error Resume Next
    With CreateObject("VBScript.RegExp")
        .Global = True
        ' Drop everything between the tag name and the closing ">"
        .Pattern = "(<[A-Za-z1-6]+)[^<>]*(>)"
        txt$ = .Replace(txt$, "$1$2")
        ' Remove whitespace between adjacent tags
        .Pattern = ">\s*<"
        txt$ = .Replace(txt$, "><")
    End With
    HTML_DeleteAttributes = txt$
End Function

The macro works great, every tag gets cleaned! But my case has one catch: td tags may carry rowspan and colspan attributes, and these must be kept during cleaning.
For example, this tag
<td rowspan=2 style=width:115.55pt,border:solid windowtext 1.0pt, mso-border-alt:solid windowtext .5pt,padding:0cm 5.4pt 0cm 5.4pt width=154 valign=top>
should become <td rowspan=2> after cleaning.
The following variants are possible:
1. <td ...plain garbage, no rowspan or colspan>
2. <td ...colspan=...> - colspan only
3. <td ...rowspan=...> - rowspan only
4. <td ..colspan=4...rowspan=5...> - both rowspan and colspan present.
The value of colspan and rowspan can be any number from 2 to 13 (in my case). I have about 10 thousand such entries, so checking everything manually is impossible. I would appreciate help writing a new macro or editing this one. Thanks!
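
To make the requirement concrete, here is a rough sketch of one way the function might be changed. It is only an illustration, not a tested solution: the name HTML_KeepSpans is hypothetical, and it assumes the rowspan/colspan values are plain digits, optionally wrapped in quotes. Since VBScript.RegExp has no callback replace, the sketch walks the tag matches and rebuilds the string by hand.

```vba
Function HTML_KeepSpans(ByVal txt As String) As String
    ' Hypothetical variant of HTML_DeleteAttributes: removes all attributes,
    ' except rowspan/colspan on td tags (assumed to hold plain digits).
    Dim reTag As Object, reSpan As Object
    Dim m As Object, a As Object
    Dim result As String, kept As String
    Dim pos As Long

    Set reTag = CreateObject("VBScript.RegExp")
    reTag.Global = True
    reTag.Pattern = "<([A-Za-z1-6]+)([^<>]*)>"   ' tag name + raw attribute text

    Set reSpan = CreateObject("VBScript.RegExp")
    reSpan.Global = True
    reSpan.IgnoreCase = True
    reSpan.Pattern = "\b(rowspan|colspan)\s*=\s*[""']?(\d+)"

    pos = 1
    For Each m In reTag.Execute(txt)
        ' Copy the text between the previous tag and this one unchanged
        ' (FirstIndex is 0-based, Mid$ is 1-based)
        result = result & Mid$(txt, pos, m.FirstIndex + 1 - pos)

        kept = ""
        If LCase$(m.SubMatches(0)) = "td" Then
            ' Collect rowspan/colspan in the order they appear in the tag
            For Each a In reSpan.Execute(m.SubMatches(1))
                kept = kept & " " & LCase$(a.SubMatches(0)) & "=" & a.SubMatches(1)
            Next a
        End If

        result = result & "<" & m.SubMatches(0) & kept & ">"
        pos = m.FirstIndex + m.Length + 1
    Next m
    result = result & Mid$(txt, pos)             ' tail after the last tag

    ' Remove whitespace between adjacent tags, as the original macro does
    reTag.Pattern = ">\s*<"
    HTML_KeepSpans = reTag.Replace(result, "><")
End Function
```

With this sketch, the loop in ЧисткаHTML would call HTML_KeepSpans instead of HTML_DeleteAttributes. Closing tags such as </td> are not matched by the tag pattern and pass through untouched, the same as in the original macro.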