I want to write a simple program that finds every div on a page, copies the text inside each one, and saves it to a file. For parsing the site I found the lxml library. I skimmed the documentation and decided to try it. The code runs fine, but there is a problem: the result is an empty string. At first I assumed I had botched something myself, as usual, but even the Habr example does not work. That is, it runs, but it produces the same empty string.
Here is the code:
import urllib
import lxml.html

page = urllib.urlopen("http://habrahabr.ru/")  # open the Habrahabr site
doc = lxml.html.document_fromstring(page.read())  # read the page
for topic in doc.cssselect('a.topic'):  # find all <a> elements with class topic
    print topic.text
outFile = open('output.txt', 'w')  # create the file
doc.write(outFile, encoding='utf-16')  # write out the result

And voilà! An empty file is created. Please explain what the problem is. Thanks.
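
For comparison, here is a minimal sketch of the goal described above: collect the text inside each div and save it to a file. It rests on several assumptions. It is written for Python 3, not the Python 2 used in the code above. It uses the standard library's html.parser as a stand-in for lxml, so that it runs without extra packages. It parses a made-up inline HTML string (sample_html) instead of the live site. The names DivTextCollector, extract_div_texts, and save_lines are hypothetical helpers invented for this illustration.

```python
import io
from html.parser import HTMLParser


class DivTextCollector(HTMLParser):
    """Collects the text found inside each outermost <div> element."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # number of <div> elements currently open
        self.chunks = []  # text pieces of the div being read
        self.texts = []   # finished text, one entry per outermost div

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                # Join the pieces and collapse runs of whitespace.
                text = " ".join("".join(self.chunks).split())
                self.texts.append(text)
                self.chunks = []

    def handle_data(self, data):
        # Keep only text that sits somewhere inside a div.
        if self.depth > 0:
            self.chunks.append(data)


def extract_div_texts(html):
    """Return a list with the text of every outermost div in the HTML."""
    parser = DivTextCollector()
    parser.feed(html)
    parser.close()
    return parser.texts


def save_lines(lines, path, encoding="utf-8"):
    """Write each string on its own line, using an explicit encoding."""
    with io.open(path, "w", encoding=encoding) as out_file:
        for line in lines:
            out_file.write(line + "\n")


# Made-up sample page (an assumption, not the real Habr markup).
sample_html = (
    "<html><body>"
    "<div>First <b>topic</b></div>"
    "<p>skip</p>"
    "<div>Second<div> nested</div></div>"
    "</body></html>"
)

div_texts = extract_div_texts(sample_html)
save_lines(div_texts, "output.txt")
```

The point of the sketch is that the strings collected in the loop are the same strings passed to the file-writing step, so what gets printed and what gets saved cannot differ. Nested divs are folded into their outermost parent here. That is a simplification chosen for the example, not a requirement.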