In .NET regular expressions, the dot behaves much as it does in other regex libraries apart from POSIX ones: it matches any character except a newline ( "\n" ). The *? quantifier does not find the shortest substring; it matches as many characters as it needs for the overall match to succeed. Since the string (like the pattern) is processed from left to right by default, <td.*?> matches <td , then 0 or more characters other than a line feed, up to the first > that is followed by </tr> . Without the </tr> that has to follow, <td.*?> would match <td collspan="5"> , as expected.
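To see this behaviour in isolation, here is a minimal sketch. It uses Python's `re` module as a stand-in for .NET (the lazy quantifier should behave the same way here), and the sample row and the shortened pattern `<td.*?></tr>` are assumptions made for illustration, not the asker's original data.

```python
import re

# Hypothetical sample row; the attribute spelling follows the text above.
html = '<tr><td collspan="5">text</td></tr>'

# On its own, the lazy quantifier stops at the first ">".
alone = re.search(r'<td.*?>', html).group()

# When "</tr>" must follow, the lazy quantifier keeps expanding
# until it reaches a ">" that is directly followed by "</tr>".
followed = re.search(r'<td.*?></tr>', html).group()

print(alone)     # <td collspan="5">
print(followed)  # <td collspan="5">text</td></tr>
```

The second match runs past the end of the opening tag because "lazy" only means "try the shortest option first", not "return the shortest possible match".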
Solution: to stay within a single tag, use a negated character class: [^<>]* , [^<]*? or [^>]* . If the tag may contain an unescaped < or > , for example <tr><td name="<67"></td></tr> , you will need a tempered greedy token, (?:(?!</?[a-zA-Z]).)* , which refuses to match a character that begins a tag ( <a or </a ).
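A small sketch of the difference between the two approaches, again using Python's `re` as a stand-in for .NET. The inputs are the examples from the text; the variable names are only illustrative.

```python
import re

plain = '<tr><td collspan="5">text</td></tr>'
tricky = '<tr><td name="<67"></td></tr>'

# Negated character class: cannot cross "<" or ">", so it stays inside one tag.
negated = r'<td[^<>]*>'
# Tempered greedy token: consumes any character that does not start a tag.
tempered = r'<td(?:(?!</?[a-zA-Z]).)*>'

plain_negated = re.search(negated, plain).group()
# The unescaped "<" inside the attribute value defeats the negated class.
tricky_negated = re.search(negated, tricky)
# The tempered token accepts "<6" because "6" is not a letter.
tricky_tempered = re.search(tempered, tricky).group()

print(plain_negated)    # <td collspan="5">
print(tricky_negated)   # None
print(tricky_tempered)  # <td name="<67">
```

The tempered token is slower, since it runs a lookahead at every position, so the negated class is the better default when tags are known to be well-formed.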
It is best to use an HTML parser: slow and steady wins the race.
In most cases, this will do:
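As a rough illustration of the parser route, here is a sketch using `html.parser` from Python's standard library. This is an assumption made for the sake of a runnable example: in .NET you would pick an HTML parsing library instead. `CellCollector` is a hypothetical helper name.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collects the attributes of every <td> start tag it sees."""

    def __init__(self):
        super().__init__()
        self.cells = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        if tag == "td":
            self.cells.append(dict(attrs))


collector = CellCollector()
collector.feed('<tr><td collspan="5">text</td></tr>')
collector.close()
cells = collector.cells

print(cells)  # [{'collspan': '5'}]
```

The parser tracks tag boundaries itself, so there is no quantifier to tune.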
<tr[^<]*?>(<td[^<]*?>|</td>|\s)*</tr>
See the demo
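A quick way to try the pattern above, sketched with Python's `re` as a stand-in for .NET. The two-row input is invented for illustration; note that the pattern only allows whitespace between tags, so the cells here are empty.

```python
import re

row_pattern = re.compile(r'<tr[^<]*?>(<td[^<]*?>|</td>|\s)*</tr>')

# Hypothetical input: two rows with empty cells.
html = '<tr><td collspan="5"></td></tr>\n<tr><td></td></tr>'

# finditer is used because findall would return only the capture group.
rows = [m.group() for m in row_pattern.finditer(html)]

for row in rows:
    print(row)
# <tr><td collspan="5"></td></tr>
# <tr><td></td></tr>
```

Each row is matched separately, because [^<]*? cannot run past the start of the next tag.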
<tr[^>]*>(\s*<td[^>]*>\s*<\/td>)*\s*<\/tr> - splash58
am>takes out - splash58
".*?" means "as few as possible". This does not mean that the smallest span will be captured if a larger span of text fits the regex and the smaller one does not. Check in the Regex Debugger at regex101.com/r/lG7eT9/1 how the regex behaves. - Visman