Good day. Trying to learn from python, plugging with parsing.

import lxml.html, urllib page = urllib.urlopen('http://site.ru/').read() #сделали запрос на сайт и сохранили в переменную doc = lxml.html.document_fromstring(page) advice = doc.xpath('//title') #нашли значение тега и сохранили в переменную print (advice) 

All this displays the title in this format:

 [<Element title at 0x7f7ecaa11ba8>] 

How can I get a normal Russian text instead of a certain hex?

  • 2
    Logically, you are trying to print not the contents of the object, but himself. But he does not know how to print himself beautifully. Try this print (advice.text_content()) - KoVadim
  • one
    @KoVadim judging by the conclusion, advice this list. The list does not have a text_content() method. You can [0] try. Related question: How can I retrieve the page title of a webpage using Python? - jfs
  • Yes, I have not done this for a long time. Corrected - somewhere like that advice[0].text_content() - KoVadim
  • @jfs advice [0] .text_content () - required, thanks KoVadim - Bubuka

1 answer 1

How can I get a normal Russian text instead of a certain hex?

 title_text = advice[0].text_content() 

Gives what you need , thanks @KoVadim comment above .