I started working with Python, and here is the problem I got using pymssql on Windows:

A query returns a row containing Cyrillic text, and we fetch it. If we try to print that string to the console or save it to a file, we get:

 File "C:\py\lib\encodings\cp866.py", line 19, in encode
   return codecs.charmap_encode(input,self.errors,encoding_map)[0]
 UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-8: character maps to <undefined>

If we call .encode('utf8') on the string, there is no error, but the output is unreadable:

 b'\xc3\x87\xc3\xa0\xc3\xa2\xc3\xae\xc3\xa4\xc3\xb1\xc3\xaa\xc3\xa0\xc3\xbf' 

I tried setting the charset parameter to utf-8 when connecting to the MSSQL server, but it did not help.

Meanwhile, the same code works fine on Linux, and everything is displayed without any problems. I use Python 3.4.

  • Isn't this about the same thing? - aleksandr barakin

1 answer

The error occurs because the Windows console does not use Unicode but, as the error text shows, the cp866 encoding, which obviously does not contain every Unicode character. Specifically, the string in question reads «Çàâîäñêàÿ» (and not «Заводская», "Factory", as the asker may mistakenly think), and none of those characters exist in cp866 :)
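To make the point above concrete, here is a minimal sketch that checks which characters of a string cp866 cannot represent. The helper name `unencodable_chars` is made up for illustration, and the "intended" string is the assumed correct value, not something taken from the asker's database. Only counts are printed, so the check itself cannot trip over the console encoding.

```python
def unencodable_chars(text, encoding):
    """Return the characters of `text` that `encoding` cannot represent."""
    missing = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            missing.append(ch)
    return missing


garbled = 'Çàâîäñêàÿ'    # what the driver actually returned
intended = 'Заводская'   # assumption: what the asker expected to see

# Print counts only, so nothing non-ASCII is written to the console
print(len(unencodable_chars(garbled, 'cp866')), 'of', len(garbled))
print(len(unencodable_chars(intended, 'cp866')), 'of', len(intended))
```

If the string had arrived as real Cyrillic, cp866 would handle it, which suggests the console encoding is only the place where the problem becomes visible.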

Alas, I don't know a proper solution to this problem and can only suggest a workaround (maybe someone will suggest something better):

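 import sys

 # errors='replace' writes '?' for characters the console encoding lacks
 # instead of raising UnicodeEncodeError
 sys.stdout.buffer.write('Çàâîäñêàÿ'.encode(sys.stdout.encoding, errors='replace') + b'\n')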

Before that, however, you need to double-check all the database connection parameters and the contents of the database itself, so that you get «Заводская» rather than «Çàâîäñêàÿ»: there is a bug somewhere else as well. One possibility is that the connection is for some reason using latin-1 (is that the library's default?), while the data itself is stored in cp1251.
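The latin-1 versus cp1251 guess can be checked offline. This is only a sketch under that assumption: it reverses the suspected wrong decoding and decodes the same bytes as cp1251. It is a diagnostic, not a replacement for fixing the connection settings.

```python
garbled = 'Çàâîäñêàÿ'

# Undo the suspected wrong decoding (assumed: latin-1) to get the raw bytes back
raw = garbled.encode('latin-1')

# Decode the same bytes with the encoding the data is assumed to be stored in
recovered = raw.decode('cp1251')

# Compare instead of printing Cyrillic, so this is safe on any console
print(recovered == 'Заводская')  # prints True
```

Since the round trip yields a meaningful Russian word, the hypothesis looks plausible for this particular string.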

  • I decided to use pyodbc on Windows. The problem went away with it, and the existing code did not need to be rewritten, apart from a few lines for the connection. You are probably right about the connection parameters: pymssql lets you pass a charset parameter when connecting, but when I tried cp1251, windows-1251 and other variants, Python simply crashed without printing any error to the console, showing only the usual Windows error dialog. - arimanov
  • Try fetching rows from the database that contain, for example, the characters ©, „, ± or an em dash. They exist in cp1251 but not in cp866, so when they show up the problem should reappear. - andreymal
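The last comment can be verified without a database. A small sketch, using a hypothetical helper `can_encode` and only two of the characters, © and ±, written as escapes so the script does not depend on the source file encoding:

```python
def can_encode(ch, encoding):
    """Return True if `encoding` has a code point for `ch`."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


for ch in ['\u00a9', '\u00b1']:  # the copyright sign and the plus-minus sign
    # ascii() keeps the output printable on any console
    print(ascii(ch), 'cp1251:', can_encode(ch, 'cp1251'), 'cp866:', can_encode(ch, 'cp866'))
```

Both characters are reported as encodable in cp1251 and not in cp866, so correctly decoded data can still fail to print on a cp866 console.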