Why is the Russian letter 'а' listed as 160 in the ASCII table, yet when I write:

int a = 'а'; cout << a << endl; 

the output is -32?

  • The char type is 8 bits and signed, with values from -128 to 127. When it is converted to int, the most significant bit (which determines the sign) is propagated to the left, so the result is -32. - avp
  • @avp: I think that is worth posting as an answer. I had just started writing almost the same thing. - VladD
  • @avp: In C, as far as I remember, the type of the literal 'а' is int, so there would be no such effect there. - VladD
  • @VladD, my comment is not enough for an answer; you write one (if you are not too lazy). - avp
  • @avp And how do I make it stay 160? unsigned char or unsigned int? - Vorobey.A

2 answers

The result of running your code depends on several factors, in particular on the encoding of the source file, the compiler used, and the rules for encoding character and string literals.

For example, with UTF-8 source encoding and the gcc compiler, the program may print the number 53424 and the compiler may emit the warning:

multi-character character constant [-Wmultichar]

In that case 'а' is not a single character at all but an integer of type int (for more information on multi-character literals, read here), as can be seen by checking sizeof, which returns the same value as for int rather than for char (1 by definition).

With clang we get a compilation error instead:

character too large for enclosing character literal type

To get rid of this error, you have to change the type of the character literal (for example, to wchar_t, by adding the prefix L). In that case the output is a different number: 1072, which is the UTF-16 code of the Russian letter а.

To get the same result as yours with gcc (for example, in an online compiler), you can use the -fexec-charset switch with code page 1251.

Note, however, that for the letter а the code 160 (0xA0) belongs to the cp866 encoding, not to cp1251, where the code is 224 (0xE0). Therefore -32 cannot be turned into 160 simply by converting the signed number to an unsigned one.

If you really want the output to be exactly 160, then in addition to converting to an unsigned type you must also use the cp866 code page, i.e. compile the following code with -fexec-charset=cp866:

 #include <iostream>

 int main() {
     int a = static_cast<unsigned char>('а');
     std::cout << a << std::endl;
 }
  • good analysis, +1 - ixSci
  • @ixSci Yes, I learned a lot of new things myself while writing the answer :) - αλεχολυτ
  • I did not know about -fexec-charset=... Thank you (+1). - avp

Because a number greater than 127 does not fit into signed char, whose range is -128 to 127. It does fit into unsigned char, but apparently char is signed on your system, so a conversion from signed char to signed int takes place. Since every char value can be represented in int, the negative value held by the char arrives in the int unchanged.

To get a positive number, simply use any unsigned type as the destination and convert the original char to unsigned char:

 unsigned int a = static_cast<unsigned char>('а'); 
  • It would be nice to see sample code :) - αλεχολυτ
  • @alexolut, why? Is it so hard to substitute unsigned? Apparently it was not hard for the author, since he accepted the answer. - ixSci
  • I mean converting -32 to 160. - αλεχολυτ
  • @alexolut, that is on the author's conscience. I did not work out which encoding is involved; in my opinion that is not relevant to the question. The question, as I see it, is why the person gets a negative number rather than a positive one. - ixSci
  • The point is not the encodings but the fact that (unsigned char)(-32) gives 224, not 160. - αλεχολυτ