I have a Cyrillic string, fetched from the database, that must be truncated to n characters. The collation is utf8_general_ci and the encoding is UTF-8. I wrote this code:

echo substr($string, 0, 20); 

It displays 10 characters. Presumably each Cyrillic character is counted as two, but a space is counted as one.
What should I do?
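
A minimal sketch of what is probably going on, assuming the string is UTF-8 and the mbstring extension is available. The sample text is made up for illustration; substr and strlen count bytes, while mb_strlen counts characters:

```php
<?php
// Hypothetical sample text; assumes this file is saved as UTF-8.
$string = 'Привет мир';

// strlen() and substr() work on bytes, not characters.
echo strlen($string), "\n";             // 19 bytes: 9 letters x 2 + 1 space
echo mb_strlen($string, 'UTF-8'), "\n"; // 10 characters

// Cutting at an odd byte offset can split a 2-byte letter in half,
// which leaves an invalid UTF-8 sequence at the end of the result.
$cut = substr($string, 0, 5);
var_dump(mb_check_encoding($cut, 'UTF-8')); // bool(false)
```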

  • I think I found a solution: mb_substr(). - Alexof
  • Try my version. It should definitely help you. - Node_pro

4 answers

Call mb_internal_encoding("UTF-8"); before echo mb_substr($string, 0, 20); and it should work. Good luck!
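
Put together, this could look like the sketch below. It assumes the mbstring extension is loaded and the data really is UTF-8; the sample text is hypothetical:

```php
<?php
// Tell the mb_* functions which encoding to assume by default.
mb_internal_encoding('UTF-8');

// Hypothetical sample text; assumes this file is saved as UTF-8.
$string = 'Съешь ещё этих мягких французских булок';

// With the internal encoding set, mb_substr() counts characters.
$short = mb_substr($string, 0, 20);
echo $short, "\n";
echo mb_strlen($short), "\n"; // 20
```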

  • mb_internal_encoding affects only the mb_ functions. - Alex Kapustin
  • I use mb_substr() on its own and everything works correctly. Also, looking at the function, its fourth argument lets you pass the encoding. Thanks for the answer. - Alexof

Try this:

 echo mb_substr($string, 0, 20, 'UTF-8'); 
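
For reuse, the same call can be wrapped in a small function. This is only a sketch: truncate_chars is a hypothetical name, and it assumes the input is valid UTF-8.

```php
<?php
// Hypothetical helper: returns at most $n characters of a UTF-8 string.
function truncate_chars(string $text, int $n): string
{
    // The fourth argument names the encoding explicitly, so the result
    // does not depend on mb_internal_encoding() or php.ini settings.
    return mb_substr($text, 0, $n, 'UTF-8');
}

// Assumes this file is saved as UTF-8.
echo truncate_chars('Привет мир', 6), "\n"; // Привет
```

Passing the encoding on every call keeps the behaviour the same even if other code changes the internal encoding.
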
  • Please try to write more detailed answers. I am sure the author of the question would be grateful for your expert commentary on the code above. - Nicolas Chabanovsky

Convert the Cyrillic to Unicode, truncate it in Unicode, then convert it back to Cyrillic.

P.S. Give up Cyrillic encodings for good and use Unicode :)

  • It is already in Unicode. The question says so: the collation is utf8_general_ci and the encoding is UTF-8. - Alex Kapustin
  • And isn't it UTF-8 already? By Cyrillic I mean Russian letters. - Alexof

In UTF-8, Russian letters take 2 bytes each, while Latin letters and the space take 1 byte each. Calculate the length of the part you need in bytes and call substr with that number.
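
One way to get that byte count without guessing where the spaces fall is to walk the bytes and count only those that start a character. This is a rough sketch, assuming valid UTF-8 input; utf8_byte_offset is a hypothetical helper, not a built-in function:

```php
<?php
// Hypothetical helper: byte length of the first $n characters of a
// UTF-8 string, computed without the mbstring extension.
function utf8_byte_offset(string $text, int $n): int
{
    $chars = 0;
    $len = strlen($text);
    for ($i = 0; $i < $len; $i++) {
        // Bytes of the form 10xxxxxx continue a character;
        // any other byte starts a new one.
        if ((ord($text[$i]) & 0xC0) !== 0x80) {
            if ($chars === $n) {
                return $i; // the next character would start here
            }
            $chars++;
        }
    }
    return $len; // the string has no more than $n characters
}

// Assumes this file is saved as UTF-8.
$string = 'Привет мир';
echo substr($string, 0, utf8_byte_offset($string, 8)), "\n"; // Привет м
```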

  • Isn't it better to use the mb_ functions? - Node_pro
  • Russian letters, yes, but a space is 1 byte. So you are left guessing where a space will turn up ;) - Alexof