Specify the encoding when reading the contents of the file. That is, to read (and thereby "transcode") text from code page 866, it is enough to pass the Encoding:
    File.WriteAllText(@"c:\temp\test.txt", "тест!", Encoding.GetEncoding(866));
    var text = File.ReadAllText(@"c:\temp\test.txt", Encoding.GetEncoding(866));
If you have a special case, for example you received already-corrupted text as a string, it is enough to encode it back to bytes using the wrong encoding and then decode those bytes using the correct one:
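    static void Main(string[] args)
    {
        string bad = "ЇаЁўҐв";
        string good = Convert(bad, 1251, 866); // "привет"
    }

    // Encodes the string with the wrong encoding (from) to recover the
    // original bytes, then decodes them with the correct encoding (to).
    static string Convert(string source, int from, int to)
    {
        byte[] bytes = Encoding.GetEncoding(from).GetBytes(source);
        return Encoding.GetEncoding(to).GetString(bytes);
    }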
Admittedly, this only works if decoding the bytes with the wrong encoding happens (by a lucky coincidence) to be reversible. An example where it is not reversible follows later.
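Whether a particular misread is reversible can be checked directly. The following is a minimal sketch, not part of the original answer: `RoundTripCheck` and `IsReversible` are hypothetical names, and the check simply decodes the bytes with the wrong encoding, encodes the result back, and compares.

```csharp
using System;
using System.Linq;
using System.Text;

static class RoundTripCheck
{
    // Hypothetical helper, not part of .NET: returns true when decoding
    // `original` with `wrong` and encoding the result back with the same
    // encoding reproduces the original bytes exactly.
    public static bool IsReversible(byte[] original, Encoding wrong)
    {
        string misread = wrong.GetString(original);
        byte[] restored = wrong.GetBytes(misread);
        return original.SequenceEqual(restored);
    }
}
```

For the bytes of "привет" in code page 866, this should report true for 1251 and false for UTF-8. Note that on .NET Core and later, `Encoding.GetEncoding(866)` only works after calling `Encoding.RegisterProvider(CodePagesEncodingProvider.Instance)`.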
Regarding "transcoding":
You are trying to fix the consequences, not the problem itself.
How the problem arises:
- You have an old file encoded in 866.
- You read it into a string without specifying an encoding. The system finds no BOM and decodes the file with a default encoding (UTF-8 for File.ReadAllText; some other APIs use Encoding.Default) instead of 866.
- You then try to "transcode" the string you have read.
Example:
    // Create an old file with content in code page 866
    File.WriteAllText("test.txt", "тест!", Encoding.GetEncoding(866));
    // Open it without specifying an encoding and see mojibake:
    Console.WriteLine(File.ReadAllText("test.txt"));
The solution you are trying to apply is to "convert the string". That is, you hope the following code works:
    static void Main(string[] args)
    {
        // Create an old file with content in code page 866
        File.WriteAllText(@"c:\temp\test.txt", "тест!", Encoding.GetEncoding(866));
        // Open it without specifying an encoding and see mojibake:
        var text = File.ReadAllText(@"c:\temp\test.txt");
        Console.WriteLine(text);
        // Attempt to "repair" the already-decoded string (does not work)
        text = Convert(text, 866, 1251);
        Console.WriteLine(text);
    }

    static string Convert(string source, int from, int to)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(source);
        byte[] newBytes = Encoding.Convert(Encoding.UTF8, Encoding.GetEncoding(from), bytes);
        string newStr = Encoding.GetEncoding(to).GetString(newBytes);
        return newStr;
    }
This solution has a weak point: it assumes that strings in .NET are just another kind of byte array. That is, no matter how the text was decoded, the string can supposedly always be converted back into the same bytes it was read from. In reality this is not so, and the example above does not work.
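The loss becomes visible when the bytes are dumped before and after the misread. This is a minimal sketch, assuming the file ends up decoded as UTF-8 (which is what File.ReadAllText does without an encoding argument when it finds no BOM); `LossyDecodeDemo` and its members are hypothetical names. Bytes that do not form valid UTF-8 are replaced with U+FFFD during decoding, so the original values are gone from the string.

```csharp
using System;
using System.Text;

static class LossyDecodeDemo
{
    // Decodes the bytes as UTF-8; invalid sequences become U+FFFD.
    public static string Misread(byte[] original)
    {
        return Encoding.UTF8.GetString(original);
    }

    public static void Run()
    {
        // "тест!" in code page 866, written out by hand so that no
        // code page provider is needed for this demo.
        byte[] original = { 0xE2, 0xA5, 0xE1, 0xE2, 0x21 };

        string misread = Misread(original);
        byte[] reencoded = Encoding.UTF8.GetBytes(misread);

        Console.WriteLine(BitConverter.ToString(original));  // E2-A5-E1-E2-21
        Console.WriteLine(BitConverter.ToString(reencoded)); // a different byte sequence
    }
}
```

Once the replacement characters are in the string, no later conversion can tell which bytes they stood for.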
If you guess the file's encoding wrong when reading, you cannot even write the same content back:
    File.WriteAllText(@"c:\temp\test.txt", "тест!", Encoding.GetEncoding(866));
    var text = File.ReadAllText(@"c:\temp\test.txt");
    File.WriteAllText(@"test2.txt", text);
Surprisingly, this code produces two different files, even though no "transcoding" was involved.
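For contrast, here is a minimal sketch of a copy that does preserve the content, assuming the source file really is in the encoding you pass in. `FileTranscoder.Transcode` is a hypothetical helper, not a .NET API; it only combines the File.ReadAllText and File.WriteAllText overloads that take an Encoding.

```csharp
using System.IO;
using System.Text;

static class FileTranscoder
{
    // Hypothetical helper: decodes `sourcePath` using `from` and writes
    // the resulting text to `targetPath` encoded as `to`.
    public static void Transcode(string sourcePath, string targetPath,
                                 Encoding from, Encoding to)
    {
        string text = File.ReadAllText(sourcePath, from);
        File.WriteAllText(targetPath, text, to);
    }
}
```

Passing `Encoding.GetEncoding(866)` for both arguments should reproduce the file byte for byte; passing a different target encoding performs an actual transcoding. The decisive part is the `from` argument: the bytes are interpreted correctly at the moment they become a string.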