I have logged in, and now I need to parse data from another page. When I go to that other page, I am no longer authorized. I assume the problem is with the cookies? Could anyone help? How do I get the cookies after sending a request, and how do I use them when requesting another page? (A code snippet would be appreciated.)

    session = requests.Session()
    r = session.post(url, dann)

    1 answer

        import requests

        with requests.Session() as session:
            # requests needs the scheme (https://) in the URL
            url = "https://www.example.com"  # your URL with the login form
            LOGIN = "username"  # your login
            PASSWORD = "password"  # your password
            # Data sent in the POST, as a dictionary. "pass" is a Python keyword,
            # so dict(pass=...) is a syntax error; use a dict literal instead.
            dann = {"username": LOGIN, "pass": PASSWORD}
            session.get(url)  # fetch the page with the login form
            session.post(url, data=dann)  # send the data in a POST; the session stores our cookies
            url2 = "https://www.example.com/data_for_parsing"  # your second URL, the one to parse data from
            # That's it: because the authorization cookies are stored in the session,
            # calling get() on the same session sends them with the request.
            r = session.get(url2)
            print(r.text)  # from here on, do whatever you want with the data

    In this snippet I used the with ... as construct so that the session and its connections are closed once the code finishes, even if an unexpected error occurs.
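
    To see which cookies a session holds, and what it will attach to the next request, something like the following sketch may help. It makes no network call: the cookie name sessionid, its value, and the example.com URL are made-up placeholders, and the cookie is set by hand only to stand in for what a real login POST would store.

```python
import requests

# Hypothetical URL, used only for illustration.
url2 = "https://www.example.com/data_for_parsing"

with requests.Session() as session:
    # After a real session.post(...) login, cookies from the server's
    # Set-Cookie headers end up in session.cookies. Here one is set by
    # hand (made-up name and value) to stand in for that.
    session.cookies.set("sessionid", "abc123", domain="www.example.com", path="/")

    # All cookies currently stored in the session, as a plain dict.
    stored = session.cookies.get_dict()
    print(stored)

    # Build (but do not send) the next request to see what the session attaches.
    prepared = session.prepare_request(requests.Request("GET", url2))
    cookie_header = prepared.headers.get("Cookie")
    print(cookie_header)
```

    On a real response, r.cookies holds the cookies set by that particular response, while session.cookies accumulates everything received over the lifetime of the session.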

    In the dictionary that we send in the POST (here, the variable dann), the keys must match the field names that the login form sends in the request body.

    How to get these keys:

    • Use the code inspector to find the input field element; most often it is an <input> tag.
    • Find its name attribute; its value is the required key.
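
    The same lookup can be done in code rather than by eye. The sketch below uses only the standard library and a made-up login form; with a real site the HTML would come from something like session.get(url).text, and the field names would be whatever that site uses.

```python
from html.parser import HTMLParser


class InputNameCollector(HTMLParser):
    """Collects the name attribute of every <input> tag it sees."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            name = dict(attrs).get("name")
            if name:  # skip inputs without a name, e.g. a bare submit button
                self.names.append(name)


# Made-up login form, for illustration only.
sample_html = """
<form action="/login" method="post">
  <input type="text" name="username">
  <input type="password" name="pass">
  <input type="submit" value="Log in">
</form>
"""

collector = InputNameCollector()
collector.feed(sample_html)
field_names = collector.names
print(field_names)
```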

    But authorization does not always send just two values in the POST. Sites use various protections, such as CSRF tokens and so on.
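
    One common pattern, though by no means the only one, is a token placed in a hidden form field that has to be sent back together with the credentials. The sketch below assumes exactly that: the field name csrf_token, its value, and the helper build_login_payload are hypothetical, and some sites deliver the token in a cookie, a meta tag, or a request header instead.

```python
from html.parser import HTMLParser


class HiddenFieldCollector(HTMLParser):
    """Collects name/value pairs of <input type="hidden"> tags."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        is_hidden = (attrs.get("type") or "").lower() == "hidden"
        if tag == "input" and is_hidden and attrs.get("name"):
            self.fields[attrs["name"]] = attrs.get("value") or ""


def build_login_payload(login_page_html, credentials):
    """Merge the form's hidden fields with the credentials we type in."""
    collector = HiddenFieldCollector()
    collector.feed(login_page_html)
    payload = dict(collector.fields)
    payload.update(credentials)
    return payload


# Made-up login page; "csrf_token" and its value are hypothetical.
sample_html = """
<form method="post">
  <input type="hidden" name="csrf_token" value="d41d8cd9">
  <input type="text" name="username">
  <input type="password" name="pass">
</form>
"""

payload = build_login_payload(sample_html, {"username": "username", "pass": "password"})
print(payload)
```

    With a real site, the HTML would be the text of the session.get(url) response, and the resulting dictionary would be passed to session.post(url, data=payload) on the same session.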

    To find out all the data that is sent in the POST during authorization (in Google Chrome):

    1. Go to the login page. If you are already logged in, log out.
    2. Press F12 and go to the Network tab.
    3. Fill in the login fields, then click the login button.
    4. After the data is sent, you will most likely be redirected. Wait for the page to load, then scroll to the top of the request list. The request we need is usually the very first one on the timeline; its name (the Name column) usually contains the word "login", and its method is POST.
    5. Click the request you found and, in the Headers tab, find the Form Data section (newer Chrome versions show it in the Payload tab). It contains the keys you need.
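
    If you copy the form data in its raw URL-encoded form, it can be turned into the dictionary for the POST without retyping each key. A minimal sketch, assuming a made-up body string (the field names and values here are hypothetical):

```python
from urllib.parse import parse_qsl

# Hypothetical raw body copied from the browser; the field names are made up.
raw_form_data = "username=username&pass=password&csrf_token=d41d8cd9&remember=on"

# parse_qsl decodes percent-encoding and "+" and returns (key, value) pairs;
# keep_blank_values=True keeps fields that were sent empty.
dann = dict(parse_qsl(raw_form_data, keep_blank_values=True))
print(dann)
```

    Values such as tokens usually change on every visit, so a copied value is only useful for seeing which keys are expected; the fresh value has to be read from the login page each time.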

    After that, everything depends on the particular site and has to be worked out by experiment.