Sample code here:
http://ideone.com/IUAwc5
In the regular expression, you will see two lines:

(?<!\\d(?:р|г|к)\\.) # (?<!\\d[ргк]\\.) 

Common sense says they mean the same thing. But if you uncomment the second and comment out the first, the output shows that a line break was inserted after 101p. 50. 2020 , that is, the second expression does not work.
And I observe this behavior not only in PHP, but also in Python.
Is there a reasonable explanation for this, or is this some kind of magic?

  • Could you say in words what you want? That is, what should match the pattern. As it stands, one alternative of the group contains \\. followed by a bunch of negative lookbehind assertions (which should come before the pattern you are matching, not after it). - alexlz
  • @alexlz, I posted what is happening as a bug here: bugs.exim.org/show_bug.cgi?id=1341 Here is a minimal working example: ideone.com/fdOelO I'll delete the question soon, but first a request: I wrote the bug description using Google Translate. Please rate the quality of the English text: are there any blunders? Everything is clearly visible in the example, but still. - ReinRaus
  • I did not understand the bug description. As for ideone.com/fdOelO , everything seems normal. - alexlz
  • I added this line to the example: echo preg_replace("/$RE2/", $repl, $text); // fail, expected аг бГ вГ гг Description of the bug in Russian: if a lookbehind assertion contains a character class with Unicode characters and anything else besides it, for example (?<=\\s[abc]), then no match is found. If it contains only the character class, (?<=[abc]), or the corresponding alternation (a|b|c), then a match is found. - ReinRaus
  • I inadvertently deleted @avp's comment instead of my own, sorry. - ReinRaus

1 answer

@ReinRaus, a mental blackout, alas. What if you try it this way (RTFM):

 echo preg_replace("/$RE1/u", $repl, $text); // ok
 echo preg_replace("/$RE2/u", $repl, $text); // fail, expected аг бГ вГ гг
 echo preg_replace("/$RE3/u", $repl, $text); // ok
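  • @alexlz, well, yes, I forgot about the u modifier ((( What a shame ((( It becomes clear why this happens if you imagine that each Russian character is two ASCII characters (two bytes in UTF-8): without u, a character class matches only one of those bytes, while an alternation matches both. - ReinRaus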
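
The byte-level explanation in the last comment can be reproduced with a small sketch. It is written in Python 3 rather than PHP, and it rests on one assumption: a bytes pattern in Python's re module stands in for PCRE without the u modifier, and a str pattern stands in for PCRE with it. The sample text and the helper names matches_str and matches_bytes are made up for illustration and are not taken from the linked examples.

```python
import re

# Two spellings of the same negative lookbehind, as in the question.
ALT = r"(?<!\d(?:р|г|к)\.) "   # alternation
CLS = r"(?<!\d[ргк]\.) "       # character class

# Illustrative sample: the space follows "digit, letter, dot",
# so a correct pattern should NOT match it.
text = "5р. x"


def matches_str(pattern: str) -> bool:
    """Character-wise matching (assumed analogue of PCRE with /u)."""
    return re.search(pattern, text) is not None


def matches_bytes(pattern: str) -> bool:
    """Byte-wise matching (assumed analogue of PCRE without /u)."""
    return re.search(pattern.encode("utf-8"), text.encode("utf-8")) is not None


# One Cyrillic letter occupies two bytes in UTF-8.
print(len("р".encode("utf-8")))

# Character-wise: both spellings reject the space.
print(matches_str(ALT), matches_str(CLS))

# Byte-wise: the alternation still consumes both bytes of the letter,
# but the class consumes only one, so the lookbehind never fires
# and the space is matched by mistake.
print(matches_bytes(ALT), matches_bytes(CLS))
```

In byte mode the class [ргк] becomes a set of single bytes, so \\d[ргк]\\. describes three bytes, while the text actually has four bytes between the start of the digit and the end of the dot. The alternation keeps each letter as a two-byte sequence, which is why it keeps working even when the u modifier is forgotten.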