I use this regular expression to filter Russian letters before querying the database:

preg_match('/[а-яА-Я]+/', $_GET['q'], $q); 

But something goes wrong when the input contains the Cyrillic letter "р". I looked at the values the regular expression returns:

 var_dump($q); var_dump($_GET['q']); 

It turned out that the letter "р" is replaced with a diamond with a question mark inside, and "р" misbehaves with any set of characters.


Why does this happen?

  • Use the u flag in the regular expression? - user207618
  • yeah, good idea - user193361

1 answer

The diamond symbol means that a Russian character, which takes 2 bytes in UTF-8, was cut in half: what you see is a single byte, half of the character. You must make sure that every component of your program handles UTF-8 correctly. In regular expressions, for example, add the u modifier to enable UTF-8 support:

 '/[а-яёА-ЯЁ]+/u' 
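A minimal sketch of the difference, assuming the script file is saved as UTF-8 and using a hypothetical sample word in place of $_GET['q']. Without the u modifier PCRE reads both the pattern and the input as raw bytes, so the character class becomes a set of single bytes. The second byte of "р" (0x80) is not in that set, and the match stops in the middle of the letter:

```php
<?php
// Hypothetical input standing in for $_GET['q']; assumes UTF-8 source encoding.
$input = 'привет';

// Byte-wise matching: the class is built from the individual bytes of the pattern.
preg_match('/[а-яА-Я]+/', $input, $bytewise);

// UTF-8 matching: the class is built from whole characters.
preg_match('/[а-яёА-ЯЁ]+/u', $input, $utf8);

// Hex dump of the byte-wise match: "п" (d0 bf) plus the first byte of "р" (d1).
var_dump(bin2hex($bytewise[0]));

// The UTF-8 match covers the whole word.
var_dump($utf8[0] === $input);
```

The dangling d1 byte at the end of the first match is what the browser renders as the diamond.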

To process strings, use the functions of the mbstring extension.
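As a rough illustration, assuming the mbstring extension is loaded and the file is saved as UTF-8 (the sample word is hypothetical), the byte-oriented string functions can split a character in the same way, while their mb_ counterparts work on whole characters:

```php
<?php
// Hypothetical sample string; each Cyrillic letter takes 2 bytes in UTF-8.
$word = 'привет';

var_dump(strlen($word));              // counts bytes: 12
var_dump(mb_strlen($word, 'UTF-8'));  // counts characters: 6

// Cutting after 3 bytes leaves half of the second letter behind.
$broken = substr($word, 0, 3);
var_dump(mb_check_encoding($broken, 'UTF-8')); // false: no longer valid UTF-8

// Cutting after 3 characters keeps every letter intact.
$safe = mb_substr($word, 0, 3, 'UTF-8');
var_dump($safe); // "при"
```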

  • Your regular expression, like the one in the question, lists only letters of the Russian language rather than all of Cyrillic, and not even all of those. /[а-яё]+/ui - all letters of the Russian language, /\p{Cyrillic}+/u - all Cyrillic. - Visman
  • Thanks for the remark, corrected. I wanted to draw attention specifically to the split UTF-8 character. - cheops
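The difference between the patterns mentioned in the comments can be checked with a short sketch. The sample words are hypothetical, and the file is assumed to be saved as UTF-8:

```php
<?php
// Hypothetical samples: a Russian word starting with "Ё" and a Ukrainian word with "ї".
$russian   = 'Ёлка';
$ukrainian = 'їжак';

// "Ё" lies outside the А-Я range, so the pattern from the question misses it even with u.
$questionPattern = preg_match('/^[а-яА-Я]+$/u', $russian);

// The i modifier folds case, so [а-яё] also covers А-Я and Ё.
$russianPattern = preg_match('/^[а-яё]+$/ui', $russian);

// "ї" is Cyrillic but not a Russian letter.
$russianOnUkrainian  = preg_match('/^[а-яё]+$/ui', $ukrainian);
$cyrillicOnUkrainian = preg_match('/^\p{Cyrillic}+$/u', $ukrainian);

var_dump($questionPattern, $russianPattern, $russianOnUkrainian, $cyrillicOnUkrainian);
```

Which pattern fits depends on whether the filter should accept only Russian text or any Cyrillic script.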