I wrote code that parses an xlsx file and writes it out as CSV. In the exported file, English text is displayed correctly, but Cyrillic comes out as mojibake (krakozyabry). As far as I understand, this is an encoding problem.

This is the export file: Export. This is the import file: Import.

An online decoder suggests that the exported text is encoded as Windows-1251.

For example, you can take one of the columns: "R Γ‘β€š "

It also reports that the text has to be interpreted as UTF-8 to display properly.
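To make the symptom concrete, here is a minimal sketch of how this kind of mojibake typically arises: UTF-8 bytes are decoded as if they were Windows-1251. The class and method names (`MojibakeDemo`, `ReadUtf8BytesAsWin1251`, `UndoMisread`) and the sample word are made up for illustration, and the sketch assumes .NET Core / .NET 5+, where code page 1251 has to be registered first (on .NET Framework the `RegisterProvider` line should not be needed).

```csharp
using System;
using System.Text;

public static class MojibakeDemo
{
    // Assumption: running on .NET Core / .NET 5+, where Windows-1251
    // is only available after the code pages provider is registered.
    private static Encoding GetWin1251()
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(1251);
    }

    // Simulates a program that reads UTF-8 bytes as Windows-1251.
    public static string ReadUtf8BytesAsWin1251(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return GetWin1251().GetString(utf8Bytes);
    }

    // Reverses the misreading: recover the bytes, then decode them as UTF-8.
    public static string UndoMisread(string garbled)
    {
        byte[] originalBytes = GetWin1251().GetBytes(garbled);
        return Encoding.UTF8.GetString(originalBytes);
    }

    public static void Main()
    {
        string original = "Болт"; // hypothetical sample value
        string garbled = ReadUtf8BytesAsWin1251(original);

        Console.WriteLine(garbled);                          // mojibake, two characters per Cyrillic letter
        Console.WriteLine(UndoMisread(garbled) == original); // the round trip restores this sample
    }
}
```

If the real file behaves like this sketch, the bytes on disk are valid UTF-8 and only the program reading them is choosing the wrong encoding.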

The problem is that I cannot tell which part of my code is at fault, the import or the export, because when debugging there is no mojibake in the worksheet cells: everything is displayed correctly.

Here is my import code (the EPPlus library is used for import):

    try
    {
        FileInfo existingFile = new FileInfo(path);
        using (ExcelPackage package = new ExcelPackage(existingFile))
        {
            ExcelWorksheet worksheet = package.Workbook.Worksheets[1]; // the data is always on the first sheet
            int row = GetDimensionRows(worksheet); // find out how many rows the table has
            for (int r = 1; r <= row; r++) // read all rows
            {
                string[] buff = new string[18]; // buffer for one Excel row of 18 cells (18 columns)
                for (int i = 1; i < 19; i++) // scan the 18 columns (1 through 18)
                {
                    try
                    {
                        Value = null; // clear leftover garbage
                        Value = worksheet.Cells[r, i].Value.ToString(); // get the cell value
                        // Value = GetRightEncoding(Value); // convert to the correct encoding
                    }
                    catch (NullReferenceException)
                    {
                        Value = "empty";
                        System.Console.WriteLine("JDC: empty cell value handled");
                    }
                    buff[i-1] = Value.ToString(); // put the scanned columns into the buffer
                }

Here is my export code (the jitbit / CsvExport library is used):

    try
    {
        export.AddRow();
        export["handleId"] = input[i].massive[0];
        export["fieldType"] = input[i].massive[1];
        export["name"] = input[i].massive[2];
        export["description"] = input[i].massive[3];
        export["productImageUrl"] = input[i].massive[4];
        export["collection"] = input[i].massive[5];
        export["sku"] = input[i].massive[6];
        export["ribbon"] = input[i].massive[7];
        export["price"] = input[i].massive[8];
        export["surcharge"] = input[i].massive[9];
        export["visible"] = input[i].massive[10];
        export["discountValue"] = input[i].massive[11];
        export["inventory"] = input[i].massive[12];
        export["weight"] = input[i].massive[13];
        export["productOptionName1"] = input[i].massive[14];
        export["productOptionDescription1"] = input[i].massive[15];
    }
    catch (ArgumentOutOfRangeException)
    {
        Console.WriteLine("JDC: while trying to write, the method accessed a non-existent array element");
    }
    export.ExportToFile(MakePath(Program.GlobalPath) + "Export.csv", 8);

In case it is relevant, here is the code of the library's export method, slightly modified:

    public void ExportToFile(string path, int encodingType)
    {
        if (encodingType == 8)
            try
            {
                File.WriteAllLines(path, ExportToLines(), Encoding.UTF8);
            }
            catch (System.IO.IOException)
            {
                // No access to file
            }
        if (encodingType == 32)
            try
            {
                File.WriteAllLines(path, ExportToLines(), Encoding.UTF32);
            }
            catch (System.IO.IOException)
            {
                // No access to file
            }
    }
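One way to check the export side independently of any viewer is to look at the bytes that actually land on disk. The sketch below is only an illustration: `CsvFileCheck`, its two helper methods and the temporary file name are hypothetical. It relies on documented .NET behaviour: `File.WriteAllLines` with `Encoding.UTF8` writes a UTF-8 byte order mark (BOM), while the overload without an encoding argument writes UTF-8 without one.

```csharp
using System;
using System.IO;
using System.Text;

public static class CsvFileCheck
{
    // True if the data starts with the UTF-8 byte order mark EF BB BF.
    public static bool StartsWithUtf8Bom(byte[] bytes)
    {
        return bytes.Length >= 3 && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF;
    }

    // True if the bytes decode as UTF-8 without any invalid sequences.
    public static bool IsValidUtf8(byte[] bytes)
    {
        var strictUtf8 = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true);
        try
        {
            strictUtf8.GetString(bytes);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }

    public static void Main()
    {
        // Hypothetical scratch file; point this at the real Export.csv to inspect it instead.
        string path = Path.Combine(Path.GetTempPath(), "bom-check.csv");
        string[] lines = { "name", "Болт" };

        File.WriteAllLines(path, lines, Encoding.UTF8); // explicit Encoding.UTF8: BOM is written
        byte[] withEncodingArgument = File.ReadAllBytes(path);

        File.WriteAllLines(path, lines);                // no encoding argument: UTF-8 without BOM
        byte[] withoutEncodingArgument = File.ReadAllBytes(path);

        File.Delete(path);

        Console.WriteLine(StartsWithUtf8Bom(withEncodingArgument));
        Console.WriteLine(StartsWithUtf8Bom(withoutEncodingArgument));
        Console.WriteLine(IsValidUtf8(withEncodingArgument) && IsValidUtf8(withoutEncodingArgument));
    }
}
```

If both checks pass for the real Export.csv, the file itself is valid UTF-8, and the garbled display would then come from how the file is opened rather than from how it was written.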

Help me find the root of the problem. Ideally, if the problem is in how the data is interpreted at the import stage, it could be fixed with a transcoding method called right after the line: Value = worksheet.Cells[r, i].Value.ToString(); That is what you can see in the import code (the commented-out method call is my attempt from yesterday).

Method code:

    public static string GetRightEncoding(string input)
    {
        string output = null;
        // Encoding utf8 = Encoding.GetEncoding("UTF-8");
        // Encoding win1251 = Encoding.GetEncoding("Windows-1251");
        // byte[] utf8Bytes = win1251.GetBytes(input);
        // byte[] win1251Bytes = Encoding.Convert(utf8, win1251, utf8Bytes);
        // output = win1251.GetString(win1251Bytes);
        // byte[] win1251Bytes = win1251.GetBytes(input);
        // byte[] utf8Bytes = Encoding.Convert(win1251, utf8, win1251Bytes);
        // var fromEncodind = Encoding.GetEncoding("UTF-8"); // from which encoding
        // var bytes = fromEncodind.GetBytes(input);
        // var toEncoding = Encoding.GetEncoding("Windows-1251"); // to which encoding
        // output = toEncoding.GetString(bytes);
        // string s = new string(Encoding.GetEncoding("Windows-1251").GetChars(input.toByte));
        return output;
    } // Converts the string to the required encoding

No matter what I try, when the method is enabled (I tested several variants of the code in it), everything only gets worse. I seem to be at a dead end.
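A .NET string is always UTF-16 in memory, so a string that already shows correctly in the debugger has no "wrong encoding" left to fix, which would be consistent with the transcoding attempts only making things worse. To check the import stage directly, a heuristic like the following could be called on each cell value. It is only a sketch under assumptions: `StringCheck` and `LooksLikeUtf8ReadAsWin1251` are hypothetical names, the `RegisterProvider` call assumes .NET Core / .NET 5+, and the test can be fooled by unusual text.

```csharp
using System;
using System.Text;

public static class StringCheck
{
    // Heuristic: if the string can be encoded to Windows-1251 and those bytes
    // form valid UTF-8 that decodes to different text, the string was probably
    // produced by reading UTF-8 bytes as Windows-1251.
    public static bool LooksLikeUtf8ReadAsWin1251(string value)
    {
        // Assumption: needed on .NET Core / .NET 5+ to make code page 1251 available.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        Encoding strictWin1251 = Encoding.GetEncoding(
            1251, EncoderFallback.ExceptionFallback, DecoderFallback.ExceptionFallback);
        var strictUtf8 = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true);

        try
        {
            byte[] bytes = strictWin1251.GetBytes(value);
            string decoded = strictUtf8.GetString(bytes);
            return decoded != value; // pure ASCII decodes to itself and is not flagged
        }
        catch (EncoderFallbackException)
        {
            return false; // contains characters that Windows-1251 cannot represent
        }
        catch (DecoderFallbackException)
        {
            return false; // the bytes are not valid UTF-8, as with ordinary Cyrillic text
        }
    }

    public static void Main()
    {
        string correct = "Болт"; // hypothetical sample value
        string garbled = Encoding.GetEncoding(1251, EncoderFallback.ExceptionFallback, DecoderFallback.ExceptionFallback) == null
            ? null
            : null; // placeholder replaced below once the provider is registered

        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        garbled = Encoding.GetEncoding(1251).GetString(Encoding.UTF8.GetBytes(correct));

        Console.WriteLine(LooksLikeUtf8ReadAsWin1251(correct));
        Console.WriteLine(LooksLikeUtf8ReadAsWin1251(garbled));
    }
}
```

If this returns false for every value read from the worksheet, the strings were most likely imported correctly and the in-memory data does not need transcoding.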

  • I opened your export file in FAR as UTF-8, and all the valves, bolts and other hydraulic compensators are visible in Russian. Judging by the code, isn't that what you were trying to achieve? - kodv
  • I don't understand how you did that... I open it in Excel and get mojibake, although Excel's default encoding seems to be UTF-8. For some reason, when I open the export.csv file in Excel, it shows gibberish. I decided the problem was in my code and not in Excel, because I took a cell with mojibake, fed it to an online converter, and it showed that for the text not to be mojibake it has to be interpreted as win1251. That confused me, so I came to the forum looking for a solution. - Julian Del Campo
  • This is one of the best-known gotchas: a CSV should not be opened by clicking on it, but imported with the format specified. habr.com/en/company/mailru/blog/129476 - AK ♦
  • Clearly, if I open it through Excel's text import and select the encoding, I get readable text. What I need is for it to display correctly when opened normally. - Julian Del Campo
