I cannot read text containing Cyrillic characters from a UTF-8 encoded file and print it to the console. Here is the code:

    #include <iostream>
    #include <fstream>
    #include <string>
    #include <locale>
    #include <clocale>

    int main(int argc, wchar_t* argv[])
    {
        std::wcout.imbue(std::locale("rus_rus.866"));
        std::wcin.imbue(std::locale("rus_rus.866"));

        std::wfstream fout;
        fout.open(L"cd.txt", std::ios::in);
        if (fout.is_open())
        {
            fout.imbue(std::locale("rus_rus.1251"));
            wchar_t ch;
            std::wstring inputT;
            while (fout.get(ch))
                inputT += ch;
            std::wcout << inputT;
            std::wcout << std::endl;
            fout.close();
        }
        std::cin.get();
        return 0;
    }

If the file is ANSI encoded, all of the text, including the Russian letters, is displayed correctly, but if the file is UTF-8 encoded, the output is wrong. How can this be fixed?

  • We had a big answer somewhere about printing Cyrillic to the console. Have you seen it? - VladD
  • Here it is: www.stackoverflow.com/a/459299 (but this is for Visual Studio / Windows). What is your platform / compiler? It is important. - VladD
  • @VladD Visual Studio on Windows. I read the answer and used _setmode: lines typed from the keyboard are displayed correctly, but when I read text from a UTF-8 encoded file, strange characters still appear in place of the Cyrillic ones, while English words are displayed correctly. - Max Yu
  • Then the problem is apparently in the reading, not in the output. - VladD
  • Well, yes: you write fout.imbue(std::locale("rus_rus.1251")); , but your file is in UTF-8. - VladD

1 answer

The problem has two parts: how to output the text and how to read it.

For output, it makes sense to follow the recommendation from here: call _setmode(_fileno(...), _O_U16TEXT); and use std::wstring.

This means that we need to read the UTF-8 text from the file into a std::wstring. This is done like this:

 fout.imbue(std::locale(std::locale::empty(), new std::codecvt_utf8<wchar_t>)); 

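(this should work regardless of the width of wchar_t; std::codecvt_utf8 is declared in the <codecvt> header). You are using std::locale("rus_rus.1251"), which makes the stream decode the file as Windows-1251, so a UTF-8 file cannot be read correctly.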
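
To make the reading half concrete, here is a minimal sketch of a helper that loads a UTF-8 file into a std::wstring. The function name read_utf8_file and its char-pointer path parameter are hypothetical and not part of the original code. It also assumes that std::locale() (a copy of the global locale) is an acceptable base locale; it is used in place of std::locale::empty(), which is a Microsoft extension, so that the sketch also compiles with other compilers. Note that std::codecvt_utf8 is deprecated since C++17, so a compiler may emit a warning.

```cpp
#include <codecvt>   // std::codecvt_utf8 (deprecated since C++17)
#include <fstream>
#include <locale>
#include <string>

// Hypothetical helper (not from the original post): reads a whole
// UTF-8 encoded file into a std::wstring.
// Returns an empty string if the file cannot be opened.
std::wstring read_utf8_file(const char* path)
{
    std::wifstream in;

    // Install the UTF-8 conversion facet before opening the file, so that
    // it is already in place for the very first read.
    // Assumption: std::locale() is used as the base locale here instead of
    // the MSVC-only std::locale::empty().
    in.imbue(std::locale(std::locale(), new std::codecvt_utf8<wchar_t>));

    in.open(path, std::ios::in);
    if (!in.is_open())
        return std::wstring();

    // Same character-by-character loop as in the question.
    std::wstring text;
    wchar_t ch;
    while (in.get(ch))
        text += ch;
    return text;
}
```

With this helper, the body of the if block in the question reduces to something like std::wcout << read_utf8_file("cd.txt") << std::endl; once the console has been switched to wide output as recommended above. If the file starts with a byte order mark, the std::consume_header mode of std::codecvt_utf8 can be used to skip it; the sketch leaves the default mode unchanged.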