Turning a repr of a string into a normal string

Question

Suppose there is a file in which a line like this is written:

"Hello World!\n"

I read this line from the file as is, and the variable ends up containing:

 '"Hello World!\\n"'

What is the easiest way to convert this string back to its original "normal" form (strip the quotes, expand the escape sequences) without using eval?

Stripping the quotes is not a problem, and in principle there is also a solution for the escape sequences (roughly speaking, while the string contains any of '\\n', '\\r', '\\t', make the corresponding substitutions), but I would like the simplest and shortest possible solution without non-standard dependencies (such as parse).

I need a solution for Python 3.

In Python 2 this works: '"Hello World!\\n"'.strip('"').decode("string-escape"). In Python 3, however, str has no decode method, and the decode method of bytes does not expand the escape sequences (or am I doing something wrong?).

What prevents you from doing this: print((b'%s' % line).decode('unicode_escape'))?
This is what worked: line.encode().decode('unicode_escape')
This also works fine in Python 3.2.3: bytes(line, 'utf-8').decode('unicode_escape')
Would line.encode('cp1252', 'backslashreplace').decode('unicode-escape') be suitable?

jfs · Accepted Answer · 2016-04-19T03:09:40

First consider whether the way the data is saved can be changed so that repr() is not used when writing text: either write the text directly, dropping the repr() call, or use the JSON format. Both options are more efficient and more portable.
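
As a rough illustration of the JSON option (the file name strings.json below is only a placeholder for this sketch, not something from the question), text written with json.dump can be read back with json.loads without any manual unescaping:

```python
import json
import os
import tempfile

text = "Hello World!\n"

# Hypothetical file name, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "strings.json")

# Write: json.dump adds the double quotes and escapes the newline as \n.
with open(path, "w", encoding="utf-8") as f:
    json.dump(text, f)

# The file now holds the quoted, escaped form of the string.
with open(path, encoding="utf-8") as f:
    raw = f.read()

# Read back: json.loads undoes the quoting and the escaping.
restored = json.loads(raw)
```

Here raw looks like the line from the question, and restored is the original text again.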

If the input format cannot be changed, ast.literal_eval() can be used:

    #!/usr/bin/env python3
    import ast

    text = ast.literal_eval(text_repr)  # where text_repr = '"Привет!\\n"'
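
A slightly fuller sketch of the same idea; the helper name parse_string_literal is made up for this example. ast.literal_eval() accepts any Python literal (numbers, lists, dicts and so on), so checking the result type guards against lines that are not string literals:

```python
import ast


def parse_string_literal(text_repr):
    """Parse a quoted Python string literal without eval()."""
    # literal_eval only evaluates literals, never arbitrary expressions.
    value = ast.literal_eval(text_repr.strip())
    if not isinstance(value, str):
        raise ValueError("not a string literal: %r" % (text_repr,))
    return value


line = '"Hello World!\\n"'  # what was read from the file
text = parse_string_literal(line)
```

Malformed input makes ast.literal_eval() raise ValueError or SyntaxError, so the caller may want to catch both.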

Thanks for the function, I thought there had to be something out of the box. The data format cannot be changed: I have my own homegrown tool for handling .pot / .po files (gettext).
For output I do not actually use repr() (double quotes are always needed, plus other nuances), but ast.literal_eval() looks like it will suit me.

Community spirit ♦ · Answer 2 · 2016-04-18T19:46:54

At the moment I use this option:

    def unescape_string(s):
        # strip_once() is a helper that is not shown in this answer.
        return (strip_once(s, '"')
                .replace(r'\\', '\\')
                .replace(r'\t', '\t')
                .replace(r'\r', '\r')
                .replace(r'\n', '\n')
                .replace(r'\"', '\"'))
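
Chained replace() calls handle one kind of escape at a time, so an escaped backslash followed by n (written \\n in the file) can end up as a newline instead of a backslash and the letter n. A possible single-pass variant is sketched below. It assumes only the five escapes listed above need handling, and the strip_once shown here is a guess at what the helper does, since the original is not shown:

```python
import re

# Escapes handled by this sketch: character after the backslash -> result.
_ESCAPES = {"\\": "\\", "t": "\t", "r": "\r", "n": "\n", '"': '"'}


def strip_once(s, quote):
    """Remove one matching pair of surrounding quote characters, if present."""
    if len(s) >= 2 and s.startswith(quote) and s.endswith(quote):
        return s[1:-1]
    return s


def unescape_string(s):
    """Expand backslash escapes in a single left-to-right pass."""
    body = strip_once(s, '"')
    # Each match consumes the backslash and the character after it,
    # so an escaped backslash is never re-read as part of another escape.
    return re.sub(r'\\([\\trn"])', lambda m: _ESCAPES[m.group(1)], body)
```

Unknown escapes such as \x41 are left untouched by this version.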

You can also use something like what BOPOH suggested:

    line.encode(codepage, 'backslashreplace').decode('unicode-escape')

where codepage should be ascii (or latin-1): unicode-escape decodes unescaped bytes as Latin-1, so with other encodings (cp1251, cp1252, utf-8) only ASCII text is guaranteed to survive the round trip. For example:

    >>> 'Привет!\\n'.encode('ascii', 'backslashreplace').decode('unicode-escape')
    'Привет!\n'
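
To see how the codec-based variant behaves, here is a small sketch. It assumes the escapes are expanded by the unicode-escape codec itself, passed to decode() as the encoding rather than as the error handler; the helper name unescape_via_codec is invented for this example. The codec reads unescaped bytes as Latin-1, so the sketch encodes to ASCII with backslashreplace first, which turns non-ASCII characters into escapes the codec can restore:

```python
def unescape_via_codec(s):
    """Expand backslash escapes using the unicode-escape codec."""
    # Non-ASCII characters become \xNN or \uXXXX escapes here ...
    raw = s.encode("ascii", "backslashreplace")
    # ... and are restored, together with \n, \t and so on, here.
    return raw.decode("unicode-escape")


line = '"Привет!\\n"'
text = unescape_via_codec(line.strip('"'))

# Encoding to UTF-8 instead garbles non-ASCII text, because the codec
# treats each unescaped byte as a separate Latin-1 character.
garbled = "Привет".encode("utf-8").decode("unicode-escape")
```

Note that str.strip('"') removes every leading and trailing double quote, not just one pair.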
