Question marks appear when extracting substrings in PHP

Question

Hello! I have a problem accessing individual characters of a Russian string. When I access element 0 of the string in the $phrase variable, a question mark is displayed. The only workaround I found is mb_substr($phrase, $i, 1, 'UTF-8'). But this option does not suit me, since each character of the string has to be passed to a function that looks it up as an array key and returns a value.

    function getImageSymbol($symbol)
    {
        $symbol = mb_strtolower($symbol, 'UTF-8');
        $data = array(
            '0' => 'alphabet-0.png',
            '1' => 'alphabet-1.png',
            '2' => 'alphabet-2.png',
            '3' => 'alphabet-3.png',
            '4' => 'alphabet-4.png',
            '5' => 'alphabet-5.png',
            '6' => 'alphabet-6.png',
            '7' => 'alphabet-7.png',
            '8' => 'alphabet-8.png',
            '9' => 'alphabet-9.png',
            'а' => 'alphabet-A.png',
            'б' => 'alphabet-B.png',
            'ь' => 'alphabet-bb.png',
            'ъ' => 'alphabet-bbb.png',
            'ц' => 'alphabet-C.png',
            'ч' => 'alphabet-ch.png',
            'д' => 'alphabet-D.png',
            '-' => 'alphabet-dash.png',
            '.' => 'alphabet-dot1.png',
            ':' => 'alphabet-dot2.png',
            ';' => 'alphabet-dot3.png',
            '?' => 'alphabet-dot4.png',
            '!' => 'alphabet-dot5.png',
            ',' => 'alphabet-dot6.png',
            'е' => 'alphabet-E.png',
            'ё' => 'alphabet-EE.png',
            'э' => 'alphabet-EEE.png',
            'ф' => 'alphabet-F.png',
            'г' => 'alphabet-G.png',
            'х' => 'alphabet-H.png',
            'и' => 'alphabet-i.png',
            'й' => 'alphabet-ii.png',
            'к' => 'alphabet-K.png',
            'л' => 'alphabet-L.png',
            'м' => 'alphabet-M.png',
            'н' => 'alphabet-N.png',
            '--' => 'alphabet-Ndash.png',
            'о' => 'alphabet-O.png',
            'п' => 'alphabet-P.png',
            'р' => 'alphabet-R.png',
            'с' => 'alphabet-S.png',
            'ш' => 'alphabet-SH.png',
            'щ' => 'alphabet-SHCH.png',
            'т' => 'alphabet-T.png',
            'у' => 'alphabet-U.png',
            'в' => 'alphabet-V.png',
            'ы' => 'alphabet-Y.png',
            'я' => 'alphabet-YA.png',
            'ю' => 'alphabet-yy.png',
            'з' => 'alphabet-Z.png',
            'ж' => 'alphabet-zh.png',
        );
        return !empty( $data[$symbol] ) ? $data[$symbol] : 'alphabet-' . $symbol;
    }

    function makePhrase($phrase)
    {
        foreach (explode(' ', $phrase) as $word):
            for ($i = 0; $i < strlen($word); $i++):
                $symbol = mb_substr($word, $i, 1, 'UTF-8');
                ?>
                <img src="assets/images/alphabet-min/<?= getImageSymbol( $symbol ) ?>.png" alt="">
                <?php
            endfor;
        endforeach;
    }

Use mb_strlen($word) in the loop condition and everything will work.

Accepted Answer · 2016-05-09T08:52:31

The question mark is half of a two-byte UTF-8 character. Each Russian letter occupies 2 bytes in UTF-8, while substr() (like $phrase[0]) works on bytes rather than characters, so it can cut a Russian character in the middle.
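The byte/character difference can be checked directly. The snippet below is a minimal sketch; the sample word is made up for illustration, and it assumes the source file is saved as UTF-8 and the mbstring extension is loaded.

```php
<?php
// Hypothetical sample string (not from the question); file assumed to be UTF-8.
$phrase = 'привет';

// strlen() counts bytes: 6 Cyrillic letters, 2 bytes each.
$byteLength = strlen($phrase);

// mb_strlen() counts characters in the given encoding.
$charLength = mb_strlen($phrase, 'UTF-8');

// substr() (like $phrase[0]) returns one byte, i.e. only half of the first letter.
$firstByte = substr($phrase, 0, 1);
$firstByteIsValid = mb_check_encoding($firstByte, 'UTF-8');

// mb_substr() returns the whole first character.
$firstChar = mb_substr($phrase, 0, 1, 'UTF-8');

var_dump($byteLength, $charLength, $firstByteIsValid, $firstChar);
```

The lone first byte is not valid UTF-8 by itself, which is why the browser shows a replacement question mark in its place.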

For UTF-8 you have to either use the mb_ functions mb_substr() and mb_strlen(), or have PHP replace the classic string functions with their mb_ variants at the php.ini level by setting the mbstring.func_overload directive to 2 in the [mbstring] section:

    [mbstring]
    ...
    mbstring.func_overload = 2

After that, you can use the classic substr() and strlen() functions with Russian text in UTF-8. However, changing mbstring.func_overload is not possible on every hosting provider, and it can break other applications, for example phpMyAdmin. The directive was also deprecated in PHP 7.2 and removed in PHP 8.0, so on current PHP versions only the mb_ functions remain an option.
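Without touching php.ini, the loop from the question can be written with the mb_ functions alone. This is a sketch rather than the asker's final code: splitChars() and renderPhrase() are hypothetical helper names, $map is a shortened stand-in for the full table, and the map values are assumed to already include the .png extension, so it is not appended a second time. It also assumes PHP 7 or later (type declarations and the ?? operator).

```php
<?php
// Hypothetical helper: split a UTF-8 string into characters, not bytes.
function splitChars(string $word): array
{
    $chars = [];
    // mb_strlen() gives the character count, so the loop no longer overruns.
    $length = mb_strlen($word, 'UTF-8');
    for ($i = 0; $i < $length; $i++) {
        $chars[] = mb_substr($word, $i, 1, 'UTF-8');
    }
    return $chars;
}

// Hypothetical helper: build the <img> tags for a phrase from a lookup table.
function renderPhrase(string $phrase, array $map): string
{
    $html = '';
    foreach (explode(' ', $phrase) as $word) {
        foreach (splitChars(mb_strtolower($word, 'UTF-8')) as $symbol) {
            // Each whole character is used as an array key; unknown
            // characters fall back to a name built from the character.
            $file = $map[$symbol] ?? 'alphabet-' . $symbol . '.png';
            $html .= '<img src="assets/images/alphabet-min/' . $file . '" alt="">' . "\n";
        }
    }
    return $html;
}

// Shortened stand-in for the table in the question.
$map = ['а' => 'alphabet-A.png', 'д' => 'alphabet-D.png'];
$html = renderPhrase('Да', $map);
echo $html;
```

Because every element returned by splitChars() is a complete character, it can be passed straight to the lookup as an array key, which is what the question asks for.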

@splash58 Yes, you need to either use mb_substr() and mb_strlen(), or switch to mbstring.func_overload = 2 and use substr() and strlen().
