I have a script that tries to detect the encoding of a string and convert it to WINDOWS-1251.
Here is the main part:

 $from = mb_detect_encoding($cell, "auto");
 print_r($from); echo "\n";
 $cell = iconv($from, "WINDOWS-1251//IGNORE", $cell);

As you may have guessed, this runs for every cell of an xls file.
The print_r output shows ASCII.
The result is mojibake, which suggests that the encoding was detected incorrectly and that the conversion based on it is wrong as well.

I googled for a long time, though perhaps not long enough, and ended up checking these two lines:

 print_r(mb_list_encodings()); print_r(mb_detect_order()); 

The second gives a modest array:

 Array ( [0] => ASCII [1] => UTF-8 ) 

The first is much larger, and the list changes slightly depending on the PHP version.
At first I blamed that, but then I opened the file in OpenOffice, which did not even offer a choice of encoding.
I installed some extension, where I chose Latin-1 => Cyrillic, and everything displayed correctly.
So it is Latin-1, I thought; but as far as I understand, that encoding is present in every PHP version.

I wanted to take the list from the first line and feed it to mb_detect_encoding, but it seems the encodings still have to be in some particular order.

In short, I'm thoroughly confused by now...

How do I properly detect the encoding?
Or add to what I've described, if it doesn't read as a complete mess.

Thanks.

UPD1:
Today I tried:

 print_r(mb_check_encoding($cell, "WINDOWS-1251"));
 print_r(mb_check_encoding($cell, "ASCII"));
 print_r(mb_check_encoding($cell, "WINDOWS-1252"));
 print_r(mb_check_encoding($cell, "UTF-8"));
 print_r(mb_check_encoding($cell, "ISO-8859-1"));

All of them print 1.
What does that mean?
Is the detection wrong, or is any of these encodings suitable?
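
One possible reading of the all-1 result, as a minimal sketch: mb_check_encoding only reports whether the bytes are valid in the given encoding, not which encoding they were actually written in. A string made of 7-bit bytes is valid in every encoding listed above, while a string with high bytes usually fails at least the ASCII check. The sample strings and the helper name validEncodings are hypothetical and are not taken from the xls file in question.

```php
<?php
// Hypothetical sample values, not taken from the xls file in question.
$ascii  = "Hello 123";                 // only bytes in the range 0x00-0x7F
$cp1251 = "\xCF\xF0\xE8\xE2\xE5\xF2";  // "Привет" encoded as Windows-1251

$encodings = ["ASCII", "UTF-8", "WINDOWS-1251", "ISO-8859-1"];

// Hypothetical helper: return every candidate encoding in which $s is valid.
function validEncodings(string $s, array $encodings): array
{
    $valid = [];
    foreach ($encodings as $enc) {
        if (mb_check_encoding($s, $enc)) {
            $valid[] = $enc;
        }
    }
    return $valid;
}

print_r(validEncodings($ascii, $encodings));   // all four encodings pass
print_r(validEncodings($cp1251, $encodings));  // only WINDOWS-1251 and ISO-8859-1 pass
```

If the ASCII check passes for a cell, that cell contains no bytes above 0x7F, and converting it to WINDOWS-1251 would leave its bytes unchanged.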

  • You have it right: take the list and pass it to mb_detect_encoding, which goes through your encodings in order; then use the first value from the array for the conversion. The very first entry in the list should be the most "inclusive" encoding (the one that includes the others). - Daniel Protopopov
  • It is better not to detect the encoding at all but to know it in advance, because many encodings overlap. For example, the first 128 characters of UTF-8 match ASCII one to one, i.e. one byte per character, so in the end mb_detect_encoding does not know what to choose. Hence my advice: first determine the source encoding manually, and then convert from it to the required one. - fens
  • If my memory serves me, Latin-1 does not support Cyrillic at all. - fens
  • See what happens if you do this: mb_detect_order(mb_list_encodings()); print_r(mb_detect_order()); - ilyaplot
  • @ilyaplot, that's not it (and it didn't work); as far as I understand, the encodings need to be in at least some meaningful order. - borodatych
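
The advice from fens above (determine the source encoding manually, then convert from it) could look roughly like the following sketch. It uses mb_convert_encoding rather than the iconv call from the question, and the helper name toCp1251 and the sample cell value are hypothetical. It assumes you already know the source encoding, for example from the documentation of whatever library reads the xls file.

```php
<?php
// Hypothetical helper: $knownSource is an assumption that you supply yourself;
// nothing here tries to guess it.
function toCp1251(string $cell, string $knownSource): string
{
    // Refuse input that is not even valid in the encoding we claim it has.
    if (!mb_check_encoding($cell, $knownSource)) {
        throw new InvalidArgumentException("Cell is not valid $knownSource");
    }
    // Argument order: string, target encoding, source encoding.
    return mb_convert_encoding($cell, "Windows-1251", $knownSource);
}

// Hypothetical sample: "Привет" written as UTF-8 via code point escapes (PHP 7+).
$utf8Cell = "\u{041F}\u{0440}\u{0438}\u{0432}\u{0435}\u{0442}";
echo bin2hex(toCp1251($utf8Cell, "UTF-8")), "\n";  // prints cff0e8e2e5f2
```

The validity check does not prove that the assumed source encoding is right, since single-byte encodings accept almost any byte sequence; it only catches input that cannot be in that encoding at all.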
