The regular expression /(1|2|3) (?1) (?1)/ finds 1 1 2 in the string 1 1 2 3 4. But the substring 1 2 3 in the same string also matches the pattern.

Question: how can I get both 1 1 2 and 1 2 3 with a regular expression? I am using PCRE in PHP.

Example: https://regex101.com/r/kO1wD8/2

2 answers

Whatever the pattern is, to find all overlapping matches it is enough to put the pattern inside a lookahead:

 (?=pattern) 

In this case the pattern is tested at each position in the text in turn, so it can find overlapping matches:

 (?=(1|2|3) (?1) (?1)) 

https://regex101.com/r/kO1wD8/3 This regular expression already finds both desired matches. Each match has zero length, because a lookahead checks a position in the text without consuming any text. So to extract the matched text, you need to wrap the pattern in a capturing group.
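To make the capturing-group remark concrete, here is a minimal sketch in PHP. It assumes the whole pattern is wrapped in an extra outer group, which shifts the digit alternation to group 2, so the subroutine calls become (?2). The variable names are only illustrative.

```php
<?php
$str = '1 1 2 3 4';

// Outer group 1 wraps the whole pattern; the alternation is now group 2,
// so the subroutine calls are (?2) instead of (?1).
$pattern = '/(?=((1|2|3) (?2) (?2)))/';

preg_match_all($pattern, $str, $out);

// $out[0] holds the zero-length matches themselves;
// $out[1] holds the text captured by the outer group.
foreach ($out[1] as $match) {
    echo $match, "\n";
}
```

With this input, $out[1] contains the two overlapping matches, 1 1 2 and 1 2 3.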

A word of warning ( https://regex101.com/r/kO1wD8/4 ): if you apply this regular expression to the text:

 1 1 2 3 4 21 2 1 

then the third match (1 2 1, which starts in the middle of 21) may be undesirable. To rule out such matches, add a negative lookbehind asserting that no digit precedes the pattern:

 (?=(?<!\d)(1|2|3) (?1) (?1)) 

https://regex101.com/r/kO1wD8/5
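A quick way to see the difference is to run both variants on the longer text. This is only a sketch: the outer capturing group and the names $plain and $guarded are my additions for illustration, not part of the patterns above.

```php
<?php
$str = '1 1 2 3 4 21 2 1';

// Without the lookbehind: a match may start in the middle of "21".
$plain = '/(?=((1|2|3) (?2) (?2)))/';

// With the lookbehind: a match may not be preceded by a digit.
$guarded = '/(?=(?<!\d)((1|2|3) (?2) (?2)))/';

preg_match_all($plain, $str, $a);
preg_match_all($guarded, $str, $b);

echo implode(', ', $a[1]), "\n"; // three matches, the last one is "1 2 1"
echo implode(', ', $b[1]), "\n"; // only "1 1 2" and "1 2 3"
```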

  • For the latter case \b (word boundary) also works, and the regex becomes shorter because the lookbehind is no longer needed: (?=\b(1|2|3) (?1) (?1)) - Mi Ke Bu
  • I only wrote that as an example. Most likely the OP gave a minimal reproducible example, and the real alternatives are not the digits 1, 2, 3 but other subpatterns. At least I want to believe it :) - ReinRaus

Only one option comes to my mind: shifting the start of the search in the string with the fifth parameter (the offset) of preg_match_all(). Perhaps there are more elegant solutions.

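 $str = '1 1 2 3 4';
 $pattern = '/(1|2|3) (?1) (?1)/';
 // The offset is counted in bytes, hence strlen(); a step of 2 skips one digit and one space.
 for ($i = 0; $i < strlen($str); $i += 2) {
     if (preg_match_all($pattern, $str, $out, PREG_PATTERN_ORDER, $i)) {
         // Print only the first match found from this offset.
         echo $out[0][0] . '<br />';
     }
 }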

Script result

 1 1 2
 1 2 3
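A fixed step of 2 assumes single-digit tokens separated by single spaces, and the same match can be printed twice if nothing matches exactly at the current offset. A possible variant, sketched here with preg_match() and PREG_OFFSET_CAPTURE (the names $found and $offset are only illustrative), restarts the search one byte after the start of the previous match:

```php
<?php
$str = '4 1 1 2 3 4';
$pattern = '/(1|2|3) (?1) (?1)/';

$found = [];
$offset = 0;

// With PREG_OFFSET_CAPTURE, $m[0] is [matched text, byte offset of the match].
while (preg_match($pattern, $str, $m, PREG_OFFSET_CAPTURE, $offset)) {
    $found[] = $m[0][0];
    // Resume one byte after the start of this match so that
    // overlapping matches are still found, each exactly once.
    $offset = $m[0][1] + 1;
}

echo implode('<br />', $found);
```

Here the leading 4 does not cause a duplicate: the loop collects 1 1 2 and then 1 2 3.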