u"\u043b\u043e\u043c, \u0432\u044b\u0440\u0435\u0437\u043a\u0430, to scrap, \u043e\u0442\u043a\u0430\u0437\u0430\u0442\u044c\u0441\u044f"
How do I convert it to an ordinary str in Python?
I can print it:
>>> print s
лом, вырезка, to scrap, отказаться
But if I just dump s at the prompt:
>>> s
u'\u043b\u043e\u043c, \u0432\u044b\u0440\u0435\u0437\u043a\u0430, to scrap, \u043e\u0442\u043a\u0430\u0437\u0430\u0442\u044c\u0441\u044f'
And I cannot convert it to str:
>>> str(s)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)
This is what I managed to find out:
>>> for i in xrange(0,len(s)): print i, s[i], ord(s[i])
0 л 1083
1 о 1086
2 м 1084
3 , 44
4   32
5 в 1074
6 ы 1099
7 р 1088
8 е 1077
9 з 1079
10 к 1082
11 а 1072
12 , 44
13   32
14 t 116
15 o 111
16   32
17 s 115
18 c 99
19 r 114
20 a 97
21 p 112
22 , 44
23   32
24 о 1086
25 т 1090
26 к 1082
27 а 1072
28 з 1079
29 а 1072
30 т 1090
31 ь 1100
32 с 1089
33 я 1103
It seems the best option is to re-encode it as ASCII with character references, which the web can read:
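>>> str(s.encode('ascii', 'xmlcharrefreplace'))
'&#1083;&#1086;&#1084;, &#1074;&#1099;&#1088;&#1077;&#1079;&#1082;&#1072;, to scrap, &#1086;&#1090;&#1082;&#1072;&#1079;&#1072;&#1090;&#1100;&#1089;&#1103;'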
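
For comparison, here is a minimal sketch that puts the failing conversion next to two explicit encodings. It assumes that str(s) on Python 2 goes through the default 'ascii' codec, which matches the traceback above. The variable names (ascii_failed, utf8_bytes, round_trip, char_refs) are illustrative and not from the original session. The snippet is written so that it should also run on Python 3, where encode() returns bytes rather than str.

```python
# Sketch only: the variable names below are illustrative assumptions.
s = u"\u043b\u043e\u043c, \u0432\u044b\u0440\u0435\u0437\u043a\u0430, to scrap, \u043e\u0442\u043a\u0430\u0437\u0430\u0442\u044c\u0441\u044f"

# Encoding with the plain ASCII codec fails; on Python 2 this is
# what str(s) attempts implicitly.
try:
    s.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Explicit UTF-8 encoding keeps every character and can be decoded back.
utf8_bytes = s.encode('utf-8')
round_trip = utf8_bytes.decode('utf-8')

# xmlcharrefreplace swaps each non-ASCII character for a numeric
# character reference, so the result is pure ASCII.
char_refs = s.encode('ascii', 'xmlcharrefreplace')

print(ascii_failed)
print(round_trip == s)
print(char_refs)
```

Which one fits depends on the consumer: the UTF-8 form is a plain byte string that decodes back to the original text, while the character-reference form is only meaningful to something that interprets HTML/XML references.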