I wrote code that parses an xlsx file and writes it out as CSV. On export, English text is displayed correctly, but Cyrillic comes out as mojibake (krakozyabry). As far as I understand, this is an encoding problem.
This is the exported file: Export. This is the imported file: Import.
An online decoder (Decoder) suggests that the exported text is encoded as WIN1251.
For example, you can take one of the columns: "R Γβ "
It also reports that UTF-8 is what is actually required for the text to be displayed (interpreted?) properly.
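To make the decoder's verdict concrete: garbling of this kind typically appears when bytes written as UTF-8 are read back as Windows-1251. The following is only a minimal sketch of that effect, not a diagnosis of the code in question. The class and method names (`MojibakeDemo`, `Garble`, `Recover`) are made up for illustration, and the `RegisterProvider` call assumes modern .NET (.NET Core or .NET 5+), where Windows-1251 is not available by default.

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only.
public static class MojibakeDemo
{
    // Assumption: running on .NET Core / .NET 5+, where legacy code pages
    // such as Windows-1251 must be registered before use.
    private static Encoding Win1251()
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(1251);
    }

    // Encodes the text as UTF-8, then misreads those bytes as Windows-1251.
    public static string Garble(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return Win1251().GetString(utf8Bytes);
    }

    // Reverses Garble: turns the garbled characters back into the original
    // bytes and decodes them as UTF-8.
    public static string Recover(string garbled)
    {
        byte[] originalBytes = Win1251().GetBytes(garbled);
        return Encoding.UTF8.GetString(originalBytes);
    }
}
```

For a six-letter word such as "Привет", `Garble` returns twelve characters, one per UTF-8 byte, and `Recover` turns them back into the original word. If the exported text behaves the same way, the bytes in the file may already be correct UTF-8 and only the program reading them is picking the wrong encoding.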
The problem is that I cannot tell which part of my code is misbehaving, the import or the export, because when debugging there is no mojibake in the lists holding the cell values. Everything is displayed correctly.
Here is my import code (the EPPlus library is used for import):
    try
    {
        FileInfo existingFile = new FileInfo(path);
        using (ExcelPackage package = new ExcelPackage(existingFile))
        {
            ExcelWorksheet worksheet = package.Workbook.Worksheets[1]; // the data is always on the first sheet
            int row = GetDimensionRows(worksheet); // find out how many rows the table has
            for (int r = 1; r <= row; r++) // read all rows
            {
                string[] buff = new string[18]; // buffer for one Excel row of 18 cells (18 columns)
                for (int i = 1; i < 19; i++) // scan the 18 columns (1 through 18)
                {
                    try
                    {
                        Value = null; // clear the leftover value
                        Value = worksheet.Cells[r, i].Value.ToString(); // get the cell value
                        // Value = GetRightEncoding(Value); // convert to the correct encoding
                    }
                    catch (NullReferenceException)
                    {
                        Value = "empty";
                        System.Console.WriteLine("JDC: handled an empty cell value");
                    }
                    buff[i-1] = Value.ToString(); // put the scanned columns into the buffer
                }
Here is my export code (using the jitbit/CsvExport library):
    try
    {
        export.AddRow();
        export["handleId"] = input[i].massive[0];
        export["fieldType"] = input[i].massive[1];
        export["name"] = input[i].massive[2];
        export["description"] = input[i].massive[3];
        export["productImageUrl"] = input[i].massive[4];
        export["collection"] = input[i].massive[5];
        export["sku"] = input[i].massive[6];
        export["ribbon"] = input[i].massive[7];
        export["price"] = input[i].massive[8];
        export["surcharge"] = input[i].massive[9];
        export["visible"] = input[i].massive[10];
        export["discountValue"] = input[i].massive[11];
        export["inventory"] = input[i].massive[12];
        export["weight"] = input[i].massive[13];
        export["productOptionName1"] = input[i].massive[14];
        export["productOptionDescription1"] = input[i].massive[15];
    }
    catch(ArgumentOutOfRangeException)
    {
        Console.WriteLine("JDC: while writing, the method accessed a nonexistent array element");
    }
    export.ExportToFile(MakePath(Program.GlobalPath) + "Export.csv", 8);
In case it helps, here is the export method from the library, slightly modified:
    public void ExportToFile(string path, int encodingType)
    {
        if (encodingType == 8)
            try
            {
                File.WriteAllLines(path, ExportToLines(), Encoding.UTF8);
            }
            catch (System.IO.IOException)
            {
                // No access to file
            }
        if (encodingType == 32)
            try
            {
                File.WriteAllLines(path, ExportToLines(), Encoding.UTF32);
            }
            catch (System.IO.IOException)
            {
                // No access to file
            }
    }
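One way to narrow down whether the export step is at fault is to look at the raw bytes of the written file instead of at how a viewer renders them. The following is a minimal sketch under that assumption. The class and method names (`CsvEncodingCheck`, `HasUtf8Bom`, `IsValidUtf8`) are hypothetical and not part of EPPlus or CsvExport.

```csharp
using System;
using System.Text;

// Hypothetical diagnostic helper, for illustration only.
public static class CsvEncodingCheck
{
    // True if the data starts with the UTF-8 byte order mark (EF BB BF).
    public static bool HasUtf8Bom(byte[] bytes)
    {
        return bytes.Length >= 3
            && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF;
    }

    // True if the whole byte sequence decodes as UTF-8 without errors.
    public static bool IsValidUtf8(byte[] bytes)
    {
        // A strict decoder throws on invalid sequences instead of
        // silently substituting replacement characters.
        var strict = new UTF8Encoding(
            encoderShouldEmitUTF8Identifier: false,
            throwOnInvalidBytes: true);
        try
        {
            strict.GetString(bytes);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }
}
```

Both checks can be run on `File.ReadAllBytes(...)` of the exported file. If the bytes are valid UTF-8, the file itself is probably fine and the garbling is more likely introduced by whatever opens it. If they are not, the strings were probably damaged before or during the write.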
Help me figure out where the root of the problem lies. Ideally, if the problem is in how the data is interpreted at the import stage, a transcoding method called right after this line would be enough: Value = worksheet.Cells[r, i].Value.ToString();
That is in fact what you can see there (the commented-out method call is my attempt from yesterday).
Method code:
    // Converts the string to the required encoding
    public static string GetRightEncoding(string input)
    {
        string output = null;
        // Encoding utf8 = Encoding.GetEncoding("UTF-8");
        // Encoding win1251 = Encoding.GetEncoding("Windows-1251");
        //byte[] utf8Bytes = win1251.GetBytes(input);
        //byte[] win1251Bytes = Encoding.Convert(utf8, win1251, utf8Bytes);
        //output = win1251.GetString(win1251Bytes);
        // byte[] win1251Bytes = win1251.GetBytes(input);
        //byte[] utf8Bytes = Encoding.Convert(win1251, utf8, win1251Bytes);
        // var fromEncodind = Encoding.GetEncoding("UTF-8"); // source encoding
        // var bytes = fromEncodind.GetBytes(input);
        // var toEncoding = Encoding.GetEncoding("Windows-1251"); // target encoding
        // output = toEncoding.GetString(bytes);
        //string s = new string(Encoding.GetEncoding("Windows-1251").GetChars(input.toByte));
        return output;
    }
No matter what I try, whenever the method is enabled (I tested several variants of the code in it), things only get worse. I seem to be at a dead end.