How to make isupper and similar functions work with Cyrillic in Visual C++

Question

How can I make the following code work correctly?

    #include <stdio.h>
    #include <clocale>
    #include <ctype.h>

    int main(int argc, char** argv)
    {
        setlocale(LC_ALL, "Russian");
        isupper('П');
        return 0;
    }

The problem is that in Release it crashes, and in Debug it triggers an ASSERT: Expression: c >= -1 && c <= 255.

I have several solutions, but none of them quite fits:

1. I cannot switch to UNICODE, because this is a maintained project with a large amount of such code.
2. One can use the overloaded isupper function that takes a locale as its second argument. I cannot, for the same reason: I do not want to rewrite the calls everywhere.
3. Surprisingly, the variant isupper((unsigned char)'П') works. So the RTL does honor setlocale and handles Russian letters. (This is confirmed by the fact that if you remove setlocale, the code still compiles, but isupper returns incorrect results.) The ASSERT, however, knows nothing about this and fires regardless of the locale, which is understandable, but bad.
4. The option similar to (3), setting the compiler switch /J (make char unsigned), does not fit: it is incompatible with libraries such as MFC.
5. Writing my own isupper_rus functions and swapping them in with #define is a bad approach, since it amounts to rewriting the RTL.
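Option 3 from the list above can be sketched as a small helper. This is only an illustration: the name is_upper_safe is hypothetical (it is not part of the CRT), and whether Cyrillic letters are actually reported as uppercase still depends on a Russian single-byte locale having been set with setlocale.

```cpp
#include <cassert>
#include <cctype>

// Hypothetical helper, not part of the CRT.
// Converting to unsigned char first maps a negative plain char
// (for example the byte 0xCF) into 0..255 instead of sign-extending it.
inline bool is_upper_safe(char c)
{
    const int arg = static_cast<unsigned char>(c);
    assert(arg >= 0 && arg <= 255);  // always inside the range the debug CRT checks
    return std::isupper(arg) != 0;   // result still depends on the current C locale
}
```

In the default "C" locale this returns true only for 'A'..'Z'; the cast removes the out-of-range argument, not the dependence on the locale.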

Question: Is it possible to solve the problem without significantly rewriting the code, unlike the options above? That is, can isupper itself be made to work as is?

I would suggest, as a long-term solution, switching to Unicode and wide strings.
And then you start having problems with console output too.
I would be happy to, but as I wrote in item 1, the project is large and there is a lot of code.
Plus, it processes large amounts of text; doubling that volume means losing performance, if only because the data falls out of the caches.
And one more nuisance with UNICODE: writing wchar_t everywhere instead of the usual char, and wstring instead of string, is unpleasant.
It is strange that this is not done at the level of a compiler option.
I understand the problem, you described it in item 1. That is why I am not offering this as an answer, only as a direction for the future.
(About 10 years ago, it seems, it was customary to use TCHAR everywhere. It would have been easier with that.)

Alex Titov · Answer 1 · 2017-10-03T15:04:05

The problem is that the function accepts an int but assumes that character codes are non-negative (apart from EOF), while the character 'П', stored in a signed char, is less than zero! To work with Cyrillic, we set the /J compiler switch, which makes plain char unsigned (0..255). There used to be an item for it in the project settings; now you can simply add it to the command line. It does not seem to cause any side problems.
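To make the sign issue concrete, here is a minimal sketch. It assumes the source file is saved in Windows-1251, where 'П' is the byte 0xCF, and a two's-complement platform. The helper names are made up for illustration; passes_crt_check simply mirrors the assertion expression quoted in the question.

```cpp
// Hypothetical helpers, named here only for illustration.

// What isupper() receives when plain char is signed (the MSVC default):
// the byte is sign-extended on conversion to int.
constexpr int arg_with_signed_char(unsigned char byte)
{
    return static_cast<signed char>(byte);
}

// What isupper() receives when plain char is unsigned (the effect of /J).
constexpr int arg_with_unsigned_char(unsigned char byte)
{
    return byte;
}

// Mirrors the assertion expression quoted in the question.
constexpr bool passes_crt_check(int c)
{
    return c >= -1 && c <= 255;
}
```

For the byte 0xCF, arg_with_signed_char gives -49, which fails the check, while arg_with_unsigned_char gives 207, which passes.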

By the way, instead of wchar_t (which still does not cover all of Unicode!) you can use UTF-8. In that case, passing text between modules does not change at all, but the text-processing code changes considerably.
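As a rough illustration of how the processing changes with UTF-8: 'П' (U+041F) is stored as the two bytes 0xD0 0x9F, so a per-byte isupper call no longer sees a whole letter. The sketch below uses hypothetical helper names, handles only two-byte sequences, and checks only the basic Russian uppercase range; it does not reject overlong encodings and is not a general Unicode classifier.

```cpp
// Hypothetical helper: decodes one two-byte UTF-8 sequence.
// Returns 0 if the bytes are not a lead byte (110xxxxx) followed by
// a continuation byte (10xxxxxx). Overlong forms are not rejected.
constexpr char32_t decode_two_byte(unsigned char b0, unsigned char b1)
{
    return ((b0 & 0xE0) == 0xC0 && (b1 & 0xC0) == 0x80)
        ? static_cast<char32_t>(((b0 & 0x1F) << 6) | (b1 & 0x3F))
        : static_cast<char32_t>(0);
}

// Hypothetical helper: true for the Russian capitals U+0410..U+042F and U+0401 (Ё).
constexpr bool is_russian_upper(char32_t cp)
{
    return (cp >= 0x0410 && cp <= 0x042F) || cp == 0x0401;
}
```

For example, is_russian_upper(decode_two_byte(0xD0, 0x9F)) is true for 'П', while the bytes 0xD0 0xBF ('п', U+043F) give false.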
