At work, I needed a page parser. I know there are already plenty of ready-made solutions, such as Grab, but I wanted to build my own crutch for practice. I wrote the code that logs in to the site and fetches the page, but it behaves a bit strangely.
Code:
import pycurl
from StringIO import StringIO

c = pycurl.Curl()
url = 'https://site.ru/index.php'
url1 = 'https://site.ru/index.php?_m=tickets&_a=manage&departmentid=17&ticketstatusid=1'

# Login part: credentials to POST and the cookie file
c.setopt(pycurl.URL, url)
c.setopt(pycurl.POSTFIELDS, 'username=user&password=pass&_ca=login')
c.setopt(pycurl.COOKIEJAR, "/tmp/cookie.txt")
c.setopt(pycurl.COOKIEFILE, "/tmp/cookie.txt")

def __list(url):
    # Fetch the given URL and return the response body
    c.setopt(pycurl.URL, url)
    c.setopt(pycurl.COOKIEJAR, "/tmp/cookie.txt")
    c.setopt(pycurl.COOKIEFILE, "/tmp/cookie.txt")
    c.bodyio = StringIO()
    c.setopt(pycurl.WRITEFUNCTION, c.bodyio.write)
    c.get_body = c.bodyio.getvalue
    c.perform()
    return c.get_body()

print __list(url1)

As a result, I should get the HTML of the ticket list. However, in the browser a redirect happens after logging in, and the code in the form shown above returns the redirect page. If I comment out the part responsible for logging in and creating the cookies and use a previously saved cookie instead, I get the page I need.
Please tell me why this happens and how to get rid of it. To be honest, this is the first time I have seen Python.
c.setopt(c.FOLLOWLOCATION, True) - andreymal
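A possible way to put this suggestion to work is sketched below. Judging only by the code in the question, c.perform() is never called after the login options are set, so the first transfer is the one inside __list, and it sends the login POST to url1 rather than to url. That would explain why the redirect page comes back. The sketch performs the login as its own request, switches the handle back to GET, and enables FOLLOWLOCATION. The helper names make_handle, request and fetch_tickets are hypothetical, and the code is written for Python 3 (BytesIO instead of StringIO), which is an assumption, since the question uses Python 2. Note that FOLLOWLOCATION only follows HTTP 3xx redirects; if the site redirects through HTML or JavaScript, it will not help.

```python
import pycurl
from io import BytesIO

COOKIE_PATH = "/tmp/cookie.txt"  # same cookie file as in the question


def make_handle(cookie_path=COOKIE_PATH):
    """Create a curl handle that keeps cookies and follows HTTP redirects."""
    c = pycurl.Curl()
    c.setopt(pycurl.COOKIEJAR, cookie_path)   # cookies are written here when the handle is closed
    c.setopt(pycurl.COOKIEFILE, cookie_path)  # enables the cookie engine and loads existing cookies
    c.setopt(pycurl.FOLLOWLOCATION, True)     # follow 3xx redirects, as andreymal suggests
    return c


def request(c, url, postfields=None):
    """Perform one transfer on handle c and return the response body as bytes."""
    body = BytesIO()
    c.setopt(pycurl.URL, url)
    c.setopt(pycurl.WRITEFUNCTION, body.write)
    if postfields is not None:
        c.setopt(pycurl.POSTFIELDS, postfields)  # setting POSTFIELDS makes this a POST
    else:
        c.setopt(pycurl.HTTPGET, True)           # reset to GET after an earlier POST
    c.perform()
    return body.getvalue()


def fetch_tickets(login_url, tickets_url, credentials):
    """Log in first, then fetch the ticket page on the same handle."""
    c = make_handle()
    try:
        request(c, login_url, postfields=credentials)  # login: cookies are kept on the handle
        return request(c, tickets_url)                 # plain GET that sends the session cookies
    finally:
        c.close()
```

With the values from the question, the call would look like fetch_tickets(url, url1, 'username=user&password=pass&_ca=login'), and the result would be bytes that still need decoding before printing. Whether the site accepts the session this way has not been verified here.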