Good day!

Please help me solve an encoding problem. I have a script written in Python 3.2 that processes data received from a form. The data is simply stored in a PostgreSQL database.

Here is a piece of form code:

    <form name="form1" action="/cgi-bin/add_dolz.py" method="GET">
    Наименование: <input type="text" name="d_name"><br>
    Тарифный разряд: <input type="text" name="d_razr">
    <br><br>
    <input type="submit" value="Добавить">

Here is a piece of the script processing the form: (Python 3.2)

    def Main():
        # Get the script parameters
        f = cgi.FieldStorage()
        a = f["d_name"].value
        b = f["d_razr"].value
        # Build the query that adds a row to the table
        quer = "INSERT INTO DOLZ(NAME,RAZR) VALUES ('%s',%s)" % (a,b)
        # Connect to the server
        conn = psycopg2.connect(host=HOST, database=DBASE,user=USER,password=PASS)
        cur = conn.cursor()
        # Execute the query
        cur.execute(quer)
        # Save the query results to the database
        conn.commit()
        # Close the database connections
        cur.close()
        conn.close()
        # Build the HTML response page
        thepage = '''<html>
        <head>
        <title>Сообщение</title>
        </head>
        <body>Результаты сохранены в БД PostgreSQL</body>
        </html>'''
        # Send the page to the server
        PrintPage(thepage)

    # ********************************************************
    if __name__ == '__main__':
        Main()

When I fill in the form using the English keyboard layout, everything works fine and the data is stored in the database.

When I fill in the fields in Cyrillic, the following error appears in the browser:

    Traceback (most recent call last):
      File "c:\shttps\www\cgi-bin\add_dolz.py", line 68, in <module>
        Main()
      File "c:\shttps\www\cgi-bin\add_dolz.py", line 45, in Main
        cur.execute(quer)
      File "C:\Program Files\Python 3.2.1\lib\encodings\cp1251.py", line 12, in encode
        return codecs.charmap_encode(input, errors, encoding_table)
    UnicodeEncodeError: 'charmap' codec can't encode characters in position 37-41: character maps to <undefined>

As I understand it, the form data arrives in cp1251 encoding and must somehow be converted to Unicode before being written to the database?

Thank you in advance.

    1 answer

    Just specify

     <meta http-equiv="content-type" content="text/html; charset=utf-8" /> 

    in the <head> block of the form page
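A small sketch of why declaring UTF-8 on the form page plausibly makes the error go away. It assumes the browser percent-encodes the submitted fields in the page's charset and that the server side decodes the query string as UTF-8, which is the default for `urllib.parse.parse_qs`. The helper names `submit_and_parse` and `can_encode` and the sample value are made up for illustration, and the cp1251 target codec is inferred from the `cp1251.py` frame in the traceback.

```python
from urllib.parse import quote, parse_qs


def submit_and_parse(value, page_charset):
    """Simulate a GET submission from a page in `page_charset` (assumption:
    the browser percent-encodes in that charset), then parse the query
    string the way parse_qs does by default: UTF-8 with errors='replace'."""
    query = "d_name=" + quote(value, encoding=page_charset)
    return parse_qs(query)["d_name"][0]


def can_encode(text, codec="cp1251"):
    """Return True if `text` can be encoded with `codec` (hypothetical helper)."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


name = "Инженер"  # sample Cyrillic value, not taken from the question

from_cp1251_page = submit_and_parse(name, "cp1251")
from_utf8_page = submit_and_parse(name, "utf-8")

print(repr(from_cp1251_page), can_encode(from_cp1251_page))
print(repr(from_utf8_page), can_encode(from_utf8_page))
```

With the cp1251 page, the parsed value is made of U+FFFD replacement characters, which cp1251 cannot represent, so a later encode step fails much like the one in the traceback. With the UTF-8 page, the value round-trips unchanged.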

    • This option works fine: Cyrillic is saved in the database, but in the browser the field labels and buttons are displayed as gibberish... The strings have to be converted somehow using Python tools. - Ivan Babintsev
    • Switch to an editor that supports UTF-8, or check the current encoding of the script text. You need "UTF-8 without BOM" or plain UTF-8. - qnub
    • Specify the source encoding if it is not already specified, on the first or second line: # coding: utf-8 . Another option is import sys; reload(sys).setdefaultencoding("utf-8") . - mrDoctorWho
    • That is, the script text itself (the HTML page) is saved in cp1251, so with the meta tag declaring UTF-8 the browser tries to display cp1251 text as UTF-8. Python 3 works with UTF-8 strings, and the browser sends strings back in that format because the meta tag says so. All that remains is to convert the script text to the required encoding (UTF-8). - qnub
    • Glad to help! Please mark my answer as correct. - qnub
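If changing the editor setting is inconvenient, the conversion suggested in the comments can also be done with a few lines of Python. This is only a sketch under assumptions: the file is currently saved in cp1251, `convert_to_utf8` is a hypothetical helper name, and the demo runs on a throwaway file rather than the real form page or script.

```python
import os
import tempfile
from pathlib import Path


def convert_to_utf8(path, source_encoding="cp1251"):
    """Re-encode a file in place as UTF-8; return True if it was rewritten.

    Files that already decode as UTF-8 (including plain ASCII) are left
    untouched, so running the function twice does not corrupt the text.
    """
    raw = Path(path).read_bytes()
    try:
        raw.decode("utf-8")
        return False
    except UnicodeDecodeError:
        pass
    text = raw.decode(source_encoding)
    # The "utf-8" codec does not emit a byte order mark.
    Path(path).write_bytes(text.encode("utf-8"))
    return True


# Demo on a temporary file standing in for the form page.
demo_dir = tempfile.mkdtemp()
demo_file = os.path.join(demo_dir, "form.html")
Path(demo_file).write_bytes("<title>Сообщение</title>".encode("cp1251"))

first = convert_to_utf8(demo_file)   # rewrites the cp1251 file
second = convert_to_utf8(demo_file)  # already UTF-8, nothing to do
converted = Path(demo_file).read_text(encoding="utf-8")
print(first, second, converted)
```

Working on raw bytes avoids any newline translation. Keep a backup before running something like this on real files, since a wrong guess for `source_encoding` produces unreadable text.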