I have this code:

    #!/usr/bin/python
    # -*- coding: utf-8 -*-
    hw = 'мир'
    print hw
    string = []
    string.append(hw)
    print string

When run, it prints:

    мир
    ['\xd0\xbc\xd0\xb8\xd1\x80']

The same happens with tuples. How can I fix it?
In general, in Python 2 there is no way.
You are trying to get the string representation of a list (in your case, this is equivalent to calling repr). This is where the problem comes from: repr returns a 'str' object (actually a byte string) holding the UTF-8 bytes of the list's elements, and because the default encoding in Python 2 is ASCII, the non-ASCII bytes are shown escaped.
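The effect can be reproduced in a short sketch. It is written for Python 3 (an assumption, since the question uses Python 2), where the byte string carries an explicit b prefix; the names word_bytes and items are illustrative only.

```python
# -*- coding: utf-8 -*-
# Sketch: str() of a list uses repr() for each element,
# so a byte string inside the list is shown with escaped bytes.
word_bytes = u'мир'.encode('utf-8')   # UTF-8 bytes, like 'мир' in a Python 2 source file
items = [word_bytes]

list_text = str(items)                # the text that print(items) would display
element_text = repr(word_bytes)       # escaped form of the single element

print(list_text)
print(word_bytes.decode('utf-8'))     # decoding the element gives readable text
```

The data inside the list is intact; only its representation is escaped.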
You can try printing it like this:

    print u'[%s]' % u','.join(unicode(x) for x in [u'привет', u'мир'])

In Python 3 there is no such problem, because all strings are Unicode and the default encoding is UTF-8, so everything works as you expect.
    $ python3
    Python 3.5.1+ (default, Mar 30 2016, 22:46:26)
    >>> print(['привет', 'мир'])
    ['привет', 'мир']
    >>> repr(['привет', 'мир'])
    "['привет', 'мир']"
    >>> # same as
    >>> ['привет', 'мир'].__str__()
    "['привет', 'мир']"

unicode(x) is either useless here (the input in the example is already of type unicode) or harmful: if you want to convert bytes to Unicode, you should use the .decode() method with an explicit encoding. - jfs

Use Unicode instead of bytes to work with text in Python. For example, add from __future__ import unicode_literals so that string constants create unicode objects even without an explicit u'' prefix. When reading text from a file, use io.open() to get unicode. When retrieving data from the network, decode the bytes to Unicode according to the protocol, for example, if the encoding is specified in the Content-Type HTTP header:
    text = data.decode(response.headers.getparam('charset'))

See the answer on how to get text if the data is returned by an external process.
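Both decoding steps can be sketched with the standard library alone. This is an illustration under assumptions, not the answer's exact code: a temporary file and a hand-built header object stand in for a real file and a real HTTP response, and the charset is read with email.message.Message.get_content_charset() rather than getparam(). The UTF-8 fallback is also an assumption.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile
from email.message import Message

# 1. Files: io.open() with an explicit encoding returns Unicode text.
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')   # throwaway file for the demo
with io.open(path, 'w', encoding='utf-8') as f:
    f.write(u'привет, мир')
with io.open(path, encoding='utf-8') as f:
    file_text = f.read()

# 2. Network: decode the raw bytes using the charset from Content-Type.
headers = Message()                                    # stand-in for response headers
headers['Content-Type'] = 'text/plain; charset=utf-8'
data = u'мир'.encode('utf-8')                          # stand-in for the response body
charset = headers.get_content_charset() or 'utf-8'     # the fallback is an assumption
net_text = data.decode(charset)

print(file_text)
print(net_text)
```

In both cases the bytes are converted to Unicode once, at the boundary, and the rest of the program works only with text.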
Print lists / tuples directly only for debugging, because in that case repr() is called for each element. Its task is to produce an unambiguous representation of the object: for example, ['\xd0\xbc\xd0\xb8\xd1\x80'] is the text representation of a list containing a byte string. In Python 3, you would get [b'\xd0\xbc\xd0\xb8\xd1\x80'] (with an explicit b'' for the byte constant). See "What makes __repr__ different from __str__?"
Format lists / tuples / other collections explicitly:
    >>> print ', '.join([u'мир'])
    мир

In Python 2, repr() leaves unescaped only "printable characters" (in the C locale, these are ASCII characters), that is, those for which isprint() returns a non-zero value (such characters are their own textual representation). The remaining characters are escaped:
    >>> print([u'мир'])
    [u'\u043c\u0438\u0440']

In Python 3, str(some_list) also calls repr() on the elements of some_list, but characters that are printable in the current environment can be displayed as they are (мир) instead of being escaped ('\u043c\u0438\u0440').
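The two representations can be compared side by side in Python 3, where the built-in ascii() produces an escaped form similar to what Python 2's repr() shows. This is a minimal sketch: format_items is a hypothetical helper introduced here for illustration, not something from the answers.

```python
# -*- coding: utf-8 -*-
def format_items(items, sep=u', '):
    """Join the items as text instead of relying on the list's repr()."""
    return sep.join(u'%s' % (item,) for item in items)

words = [u'привет', u'мир']

escaped = ascii(words)        # Python 3 only: repr() with non-ASCII characters escaped
readable = repr(words)        # Python 3 keeps printable characters as they are
joined = format_items(words)  # explicit formatting, no quotes or brackets

print(escaped)
print(readable)
print(joined)
```

Explicit joining is the only one of the three that gives the same result whatever the interpreter's repr() rules are.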
Source: https://ru.stackoverflow.com/questions/538520/
u'мир'. What you see inside the list is your UTF-8 encoding. - insolor

print string[0], and do not pay attention to the print output of the list, since repr is applied to the elements of the list, which makes Cyrillic strings unreadable. Admittedly, under Windows that output is not readable either, because the console encoding is cp866 and not UTF-8. But it works fine with unicode strings. - insolor

repr spoils the output, but inside, the string is stored normally. - insolor