Good afternoon. I have HTML-like template code of this kind:

... {if var > 0} html code here {else}other html code here {endif} ..... {if id < 6}html code here{else}other code here{endif} ... 

I am trying to find all occurrences that match this pattern:

 !{if ([a-z0-9:]{3,20}) ([\=\=|>|<|\!\=]{1,2}) ([0-9a-z]{1,20})}(.*?){else}(.*?){endif}!U 

I use (.*?) because the conditions can contain any content (not only HTML code and text).

The problem is that with a single condition everything is found correctly. But when there are several conditions, the entire piece of code from the first {if to the last {endif} is captured as one match.

  • Problems like this are solved by building an AST, not with regular expressions. Parsing recursive structures with regular expressions is a lost cause (although it is theoretically possible). - Dmitriy Simushev
  • And how do I build such a tree? - terantul
  • Give an example of text on which it finds the desired result. - ReinRaus 1:51 pm
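As a rough illustration of the tree-building approach suggested in the comments, here is a minimal sketch and not a definitive implementation. It assumes the only tags are {if ...}, {else} and {endif}; the function names tokenize and parseNodes and the node shapes ('text' / 'if') are hypothetical, chosen only for this example.

```php
<?php
// Split the template into tags and plain text, then build a nested tree.
function tokenize(string $src): array
{
    // PREG_SPLIT_DELIM_CAPTURE keeps the tags themselves in the result;
    // PREG_SPLIT_NO_EMPTY drops empty strings between adjacent tags.
    return preg_split(
        '~(\{if [^}]*\}|\{else\}|\{endif\})~',
        $src,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );
}

function parseNodes(array $tokens, int &$pos): array
{
    $nodes = [];
    while ($pos < count($tokens)) {
        $tok = $tokens[$pos];
        if ($tok === '{else}' || $tok === '{endif}') {
            break; // left for the caller that opened the enclosing {if}
        }
        $pos++;
        if (strpos($tok, '{if ') === 0) {
            // Strip the leading "{if " and the trailing "}" to get the condition.
            $node = ['type' => 'if', 'cond' => substr($tok, 4, -1), 'then' => [], 'else' => []];
            $node['then'] = parseNodes($tokens, $pos);
            if (($tokens[$pos] ?? null) === '{else}') {
                $pos++;
                $node['else'] = parseNodes($tokens, $pos);
            }
            if (($tokens[$pos] ?? null) !== '{endif}') {
                throw new RuntimeException('Missing {endif}');
            }
            $pos++;
            $nodes[] = $node;
        } else {
            $nodes[] = ['type' => 'text', 'value' => $tok];
        }
    }
    return $nodes;
}

// Hypothetical input with one {if} nested inside another.
$pos = 0;
$tree = parseNodes(tokenize('a{if x > 1}b{if y < 2}c{else}d{endif}{else}e{endif}f'), $pos);
print_r($tree);
```

Unlike a single regular expression, the recursion pairs each {endif} with its own {if}, so nested conditions end up as nested nodes. The sketch does not report a stray {else} or {endif} at the top level; it simply stops parsing there.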

1 answer

The U flag that you applied to the regular expression inverts greediness: it makes the quantifier *? greedy and the quantifier * lazy (minimal).
Simply remove it and everything will work for you:
https://regex101.com/r/kB7nS3/1
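To see the effect of the U modifier in isolation, here is a small sketch. The sample string and the simplified pattern are assumptions made for the demonstration; they are not the exact expression from the question.

```php
<?php
// Hypothetical sample input: two independent {if} blocks on one line.
$html = '{if var > 0}A{else}B{endif} ... {if id < 6}C{else}D{endif}';

// The same pattern with lazy quantifiers, compiled with and without U.
$withU    = '~\{if (.*?)\}(.*?)\{else\}(.*?)\{endif\}~U';
$withoutU = '~\{if (.*?)\}(.*?)\{else\}(.*?)\{endif\}~';

$countWithU    = preg_match_all($withU, $html, $m1);
$countWithoutU = preg_match_all($withoutU, $html, $m2);

// With U, *? turns greedy and the single match runs from the first {if
// to the last {endif}; without U, each block is matched separately.
echo "with U: $countWithU\n";       // with U: 1
echo "without U: $countWithoutU\n"; // without U: 2
```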
Why is there only one match and not two? You yourself required the name to be at least 3 characters long, and id has only two:

 [a-z0-9:]{3,20} 
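The length limit can be checked on its own with a short sketch. The ^ and $ anchors are added here only so that the character class is tested against the whole name; they are not part of the original pattern.

```php
<?php
// 'id' is two characters long, so {3,20} rejects it while {1,20} accepts it.
$name = 'id';

$strict  = preg_match('~^[a-z0-9:]{3,20}$~', $name);
$relaxed = preg_match('~^[a-z0-9:]{1,20}$~', $name);

var_dump($strict);  // int(0)
var_dump($relaxed); // int(1)
```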

Remove the flag and it will work. It will work somewhat clumsily, though, so I corrected your regular expression a little.

 {if +([a-z0-9:]{1,20})+ +(==|>|<|\!=) +([0-9a-z]{1,20})}(.*?){else}(.*?){endif} 

https://regex101.com/r/kB7nS3/2
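For reference, this is roughly how the corrected expression could be used from PHP. It is a sketch under assumptions: the sample text is made up after the question, and ~ is used as the delimiter instead of !. Collecting every match is done with preg_match_all rather than a flag.

```php
<?php
// Sample input modeled on the question; the placeholder texts are assumptions.
$html = '... {if var > 0}html one{else}html two{endif} ..... '
      . '{if id < 6}html three{else}html four{endif} ...';

// The corrected expression from the answer, without the U modifier.
$pattern = '~{if +([a-z0-9:]{1,20})+ +(==|>|<|\!=) +([0-9a-z]{1,20})}(.*?){else}(.*?){endif}~';

// PREG_SET_ORDER groups the captures per match instead of per group.
$count = preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

foreach ($matches as [$whole, $name, $op, $value, $then, $else]) {
    echo "$name $op $value => then: '$then', else: '$else'\n";
}
```

If a block can span several lines, the s modifier would also be needed so that the dot matches newlines.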

  • Unfortunately, PHP knows nothing about the g flag. - terantul
  • Of course it doesn't. That flag is a regex101 feature. In PHP, preg_match_all is used instead of it. - ReinRaus