I used to call a bash script from PHP, and everything worked fine:

    ob_implicit_flush(true);
    ob_end_flush();
    system("sudo /path/start.sh $name_user 2>&1");

The script ran on the server without any problems, but it took a long time.

I rewrote it in Python. Run from the console, the script works fine and prints Russian letters.

But when it is called from PHP:

    header("Content-Type: text/html; charset=utf-8");
    ob_implicit_flush(true);
    ob_end_flush();
    echo "<pre>";
    system("sudo /path/start.py $name_user 2>&1");

the following error appears:

    Traceback (most recent call last):
      File "/path/start.py", line 250, in <module>
        print ("\u041f\u043e\u0435\u0445\u0430\u043b\u0438 )")
    UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-6: ordinal not in range(128)

The error occurs at the first print that contains Russian characters.

In the bash script I explicitly set a UTF-8 locale at the top, because despite the header the variables were not passed as expected. Here, though, the error occurs before any variables are involved: even if all the passed variables contain only Latin characters, Python complains about the Russian words in the script itself.

What should I do, and how can I force the encoding?

I tried the script on Python 2 with # -*- coding: utf -*- (and # -*- coding: utf-8 -*-).

I also tried Python 3, which handles Russian letters properly by default, but the problem remains. At the same time, everything works fine from the console.

Of course, I am tempted to rewrite all of the PHP in Python, but the question still stands and the problem is not yet solved.

What should I do?

Added later:

So far I have not managed to get a result; maybe I am looking in the wrong place or doing something wrong?

I added PYTHONIOENCODING to the global environment in /etc/profile:

 export PYTHONIOENCODING="UTF-8" 

I created a test file with the following content:

    #!/usr/bin/env python3.4
    # -*- coding: utf -*-
    import os
    #print("Русский текст весь")
    print(os.environ['PYTHONIOENCODING'])

When it is called from PHP:

 system ("sudo /path/start.py 2>&1"); 

the error is:

    Traceback (most recent call last):
      File "/etc/openvpn/easy-rsa/easyrsa3/vpn.py", line 7, in <module>
        print(os.environ['PYTHONIOENCODING'])
      File "/usr/lib64/python3.4/os.py", line 633, in __getitem__
        raise KeyError(key) from None
    KeyError: 'PYTHONIOENCODING'

The result is the same with:

 system ("PYTHONIOENCODING=utf-8 && sudo /path/start.py 2>&1"); 

This is with Python 3; with Python 2 there are now no problems with Russian letters.

So it turns out, as already noted, that Python 3 picks its own encoding; but if utf-8 is given explicitly, will it still not use it?

Added:

I also tried this code:

    import locale
    import os
    import sys

    envname = "PYTHONIOENCODING"
    print("{}:\t{}".format(envname, os.environ.get(envname)))
    for set_locale in [False]:
        print("locale({}):\t{}".format(set_locale, locale.getpreferredencoding(set_locale)))
    for streamname in "stdout stderr stdin".split():
        stream = getattr(sys, streamname)
        print("device({}):\t{}".format(streamname, os.device_encoding(stream.fileno())))
        print("{}.encoding:\t{}".format(streamname, stream.encoding))
    for set_locale in [False, True]:
        print("locale({}):\t{}".format(set_locale, locale.getpreferredencoding(set_locale)))

Output when run from the console:

    PYTHONIOENCODING: UTF-8
    locale(False): UTF-8
    device(stdout): UTF-8
    stdout.encoding: UTF-8
    device(stderr): UTF-8
    stderr.encoding: UTF-8
    device(stdin): UTF-8
    stdin.encoding: UTF-8
    locale(False): UTF-8
    locale(True): UTF-8
    Поехали )

And when called from PHP:

    PYTHONIOENCODING: None
    locale(False): ANSI_X3.4-1968
    device(stdout): None
    stdout.encoding: ANSI_X3.4-1968
    device(stderr): None
    stderr.encoding: ANSI_X3.4-1968
    device(stdin): None
    stdin.encoding: ANSI_X3.4-1968
    locale(False): ANSI_X3.4-1968
    locale(True): ANSI_X3.4-1968
    Traceback (most recent call last):
      File "/etc/openvpn/easy-rsa/easyrsa3/vpn.py", line 261, in <module>
        print ("\u041f\u043e\u0435\u0445\u0430\u043b\u0438 )")
    UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-6: ordinal not in range(128)

That is, the encoding is not being passed through.

I tried adding:

 sys.stdout.buffer.write((" Русский ").encode('utf8')) 

In the browser there was no error and the Russian text appeared... but surely I should not have to write it this way for every sentence I output?

  • First of all, you should show the Python script, especially line 250. Are you sure you are working with a unicode string? - tutankhamun
  • The line is print("Поехали )")... whether that counts as a unicode string or not, I am not sure... I tried adding coding: utf, writing u'' before the text, and name.unicode; it looks like the encoding gets lost on the way from PHP to Python. - sober

1 answer

If the script output is redirected, then you need to set the PYTHONIOENCODING environment variable:

    $ python -c "print(u'\N{EURO SIGN}')" >output.txt
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    UnicodeEncodeError: 'ascii' codec can't encode character u'\u20ac' in position 0: ordinal not in range(128)

vs:

 $ PYTHONIOENCODING=utf-8 python -c "print(u'\N{EURO SIGN}')" >output.txt # bash 

output.txt now contains the character encoded as utf-8.
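The same effect can be reproduced from Python itself. This is only a minimal sketch, assuming Python 3.5+ for subprocess.run: the child's stdout is a pipe (so it counts as redirected, like >output.txt), and PYTHONIOENCODING is set for the child only.

```python
import os
import subprocess
import sys

# Child program: prints one non-ASCII character and nothing else.
child_code = r"print(u'\N{EURO SIGN}')"

# Copy the current environment and force the I/O encoding for the child only.
env = dict(os.environ, PYTHONIOENCODING="utf-8")

# stdout=PIPE means the child's output is redirected, not a terminal.
result = subprocess.run(
    [sys.executable, "-c", child_code],
    env=env,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)

# Raw bytes written by the child, without the trailing newline.
raw = result.stdout.strip()
print(raw)
```

The bytes in raw are the utf-8 encoding of the euro sign, whatever locale the parent process happens to run under.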

By default, Python 3 uses the encoding from the locale (LC_ALL, LC_CTYPE, LANG) if the output is redirected:

    $ python3 -c "import sys; print(sys.stdout.encoding)" | cat
    UTF-8
    $ python3 -c "import locale; print(locale.getpreferredencoding(False))" | cat
    UTF-8

The default locale (when the locale settings are broken or missing) is C/POSIX, which implies the ascii encoding. So when running init scripts (services), logging in over ssh, or running cron jobs, if you need to output characters outside the ascii range, either select a utf-8 locale such as C.UTF-8 (if it is available on the system) or set PYTHONIOENCODING explicitly, even on Python 3.

In Python 2, no encoding is set at all when the output is redirected, so sys.getdefaultencoding() is used, which should always be ascii on Python 2. If the output does not go to a terminal, set PYTHONIOENCODING explicitly.

To set the PYTHONIOENCODING environment variable directly in the command when sudo is used (sudo may not preserve the environment of the parent process), you can use the sudo VAR=value command syntax:
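To see which encoding a given environment produces, something like the following can help. It is a rough sketch: child_stdout_encoding and LOCALE_VARS are names made up for this example, and the result for LC_ALL=C depends on the Python version (newer releases may switch the C locale to UTF-8 on their own), so no output is claimed for that case.

```python
import os
import subprocess
import sys

# Variables that influence the stream encoding; removed so each run starts clean.
LOCALE_VARS = ("LC_ALL", "LC_CTYPE", "LANG", "PYTHONIOENCODING", "PYTHONUTF8")


def child_stdout_encoding(**overrides):
    """Return sys.stdout.encoding of a child Python whose stdout is a pipe."""
    env = {k: v for k, v in os.environ.items() if k not in LOCALE_VARS}
    env.update(overrides)
    out = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
        stdout=subprocess.PIPE,
        check=True,
    ).stdout
    return out.decode("ascii").strip()


# Locale only: version dependent (ascii-like on older Python 3 releases).
print("LC_ALL=C:", child_stdout_encoding(LC_ALL="C"))

# An explicit PYTHONIOENCODING wins over the locale.
forced = child_stdout_encoding(LC_ALL="C", PYTHONIOENCODING="utf-8")
print("LC_ALL=C + PYTHONIOENCODING=utf-8:", forced)
```

Running this once from an interactive shell and once from the service (cron, PHP, init script) shows quickly whether the two environments really differ.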

 sudo PYTHONIOENCODING=utf-8 /path/to/start.py 
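The VAR=value prefix matters even without sudo. A sketch of the difference, assuming a POSIX sh is available; DEMO_IOENCODING is a made-up variable name used to avoid clashing with real settings. An assignment followed by && only creates an unexported shell variable, so the child process never sees it, which is one reason the "PYTHONIOENCODING=utf-8 && sudo ..." attempt in the question could not work.

```python
import os
import shlex
import subprocess
import sys

VAR = "DEMO_IOENCODING"  # hypothetical name, used only for this demonstration

# Shell command that prints the variable as seen by a child Python process.
py = shlex.quote(sys.executable)
probe = f"{py} -c \"import os; print(os.environ.get('{VAR}'))\""


def run_in_sh(command):
    """Run a command line through sh with VAR removed from the environment."""
    env = {k: v for k, v in os.environ.items() if k != VAR}
    out = subprocess.run(
        ["sh", "-c", command], env=env, stdout=subprocess.PIPE, check=True
    ).stdout
    return out.decode().strip()


# Plain assignment, then a separate command: the variable is not exported.
unexported = run_in_sh(f"{VAR}=utf-8 && {probe}")

# Assignment as a prefix of the command: placed in that command's environment.
prefixed = run_in_sh(f"{VAR}=utf-8 {probe}")

print(unexported)  # None
print(prefixed)    # utf-8
```

sudo adds its own filtering on top of this, which the sketch does not cover; that is why the variable goes after sudo in the command above.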

The whole environment can be preserved, if necessary, with:

 sudo -E /path/to/start.py 

For security reasons, not every user is allowed to use this option.
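If the environment cannot be changed at all, the script can fix its own output stream instead of calling sys.stdout.buffer.write() for every line, as was tried in the question. This is a sketch of that idea, not something the environment-based approach requires; utf8_writer is a hypothetical helper name, and the in-memory buffer only stands in for sys.stdout.buffer.

```python
import io


def utf8_writer(binary_stream):
    """Wrap a binary stream so that text written to it is encoded as UTF-8."""
    return io.TextIOWrapper(binary_stream, encoding="utf-8", line_buffering=True)


# Demonstration against an in-memory buffer instead of the real stdout.
buf = io.BytesIO()
out = utf8_writer(buf)
print("Русский текст", file=out)
out.flush()

# Bytes that would have reached the browser or the file.
data = buf.getvalue()
print(data.decode("utf-8").strip())
```

In the real script the equivalent would be a single line near the top, sys.stdout = utf8_writer(sys.stdout.buffer), after which ordinary print() calls no longer depend on the locale or on PYTHONIOENCODING.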

  • Thanks for the answer. I tried PYTHONIOENCODING=utf-8 python -c "print(u'\N{EURO SIGN}')" >output.txt: in the console everything is fine, as is Russian text in the console in general, but when the same command is run from PHP an empty file is created (the permissions to run it are there). My locale is UTF-8 throughout (yes, LC_ALL=ru_RU.UTF-8). $ python3.4 -c "import sys; print(sys.stdout.encoding)" prints UTF-8, and $ python3.4 -c "import locale; print(locale.getpreferredencoding(False))" prints UTF-8. If I use Python 2 and set sys.setdefaultencoding("utf-8"), it complains at the first Russian character in the script. - Sober
  • @Sober: do not use sys.setdefaultencoding(). I see that you are using sudo: check that the necessary environment variables are set; for example, try sudo -E (for debugging). It is also clear that the environment in which the PHP script runs may differ from the environment of an interactive user (when you are sitting at the keyboard). Check the value of os.environ['PYTHONIOENCODING'] inside the Python script when it is called from PHP. - jfs
  • For this case I created a new empty file with text in Russian. Python 2 handles it fine, but Python 3 does not. Here is the output with os.environ['PYTHONIOENCODING']: Русский текст весь Traceback (most recent call last): File "/etc/openvpn/easy-rsa/easyrsa3/vpn.py", line 7, in <module> os.environ['PYTHONIOENCODING'] File "/usr/lib64/python2.7/UserDict.py", line 23, in __getitem__ raise KeyError(key) KeyError: 'PYTHONIOENCODING', and the same for Python 3. - Sober
  • More precisely, as I understand it, there is no such environment variable. So I need to create it after all?))) - Sober
  • @Sober: yes, create it (you can practice with a variable A and a shell script: echo =$A= ). For future reference: put error messages in the question (to preserve the formatting) together with the code that caused them, and describe step by step, in words, what you expected and what actually happens. - jfs