Hello. I need to parse a site, and one of the fields is the client's phone number. The phone number is initially hidden and looks like this:

<div class="object-builder-phone" blst="true">+7 495 626-...</div> 

Next comes the button:

 <div class="toggle-button" id="show-phone_button" blst="313548" lst1="313548" lst="0">Показать телефон</div> 

When you click it, the phone number appears and the blst attribute of its div is set to false.

Question:

How can I simulate pressing this button? I use

  • If you parse the page and get the original HTML with the phone number in it, what difference does it make whether the number is displayed or not? - retorta
  • @retorta when parsing, the phone is hidden, as shown in the question - Alexander Kurmazov
  • If the number is present in the HTML that you receive, then you already have it and everything is fine. If it is not present in the page source, then look for the request that loads it and execute that request to get the phone. - retorta
  • @retorta clicking this button sends "addphoneclick?id=141492840" by the POST method. But I do not know how to get the new HTML in which the full phone number is present. - Alexander Kurmazov
  • Then simulate this request and take the phone number from its response. Yes, to parse a user you will have to make one request to their main page and one more request to get the phone. Why do you need new HTML? Collect the necessary data in parts. - retorta
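
A minimal sketch of replaying the request described in the comments above, using only the standard library. The endpoint name and the id come from the comment; the base URL `https://example.com`, the assumption that the endpoint sits at the site root, the UTF-8 decoding, and the helper names `build_phone_request` and `fetch_phone_response` are hypothetical. The real site may also require cookies or headers that are not shown here.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "https://example.com"  # hypothetical; use the real site's origin


def build_phone_request(object_id, base_url=BASE_URL):
    """Build the POST request that the button click was observed to send."""
    url = f"{base_url}/addphoneclick?{urlencode({'id': object_id})}"
    # An empty body keeps this a POST; the id travels in the query string.
    return Request(url, data=b"", method="POST")


def fetch_phone_response(object_id):
    """Send the request and return the raw response body as text."""
    with urlopen(build_phone_request(object_id)) as response:
        return response.read().decode("utf-8")  # assumes a UTF-8 response


# Building the request does not touch the network, so it is safe to inspect.
request = build_phone_request(141492840)
print(request.get_method(), request.full_url)
```

Calling `fetch_phone_response(141492840)` would actually send the request; what the response contains (HTML fragment, JSON, or plain text) has to be checked in the browser's network tab first.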

3 answers

I recognized the attribute names, and since I once played around with a site like this, some of it stuck.

Judging by the attributes, I know which site this is (or one like it), and I assume that the site does not send any requests to get the hidden phone number. Instead, it keeps the number hidden in attributes such as blst and lst1.

The algorithm for recovering the number is in the JS scripts of the page, and it is not difficult to rewrite:

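    def decrypt_phone(clipped_phone, value):
        # Undo the multiplication by 17 that hides the digits.
        decrypt = value // 17
        # Split off the last two digits from the leading part.
        p1, p2 = divmod(decrypt, 100)
        # Drop the first digit of p1 and zero-pad p2 to two digits.
        t1 = str(p1)[1:] + '-' + str(p2).zfill(2)
        return clipped_phone.replace("...", "") + t1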

Take the author's example and substitute the values:

    clipped_phone = '+7 495 626-...'
    value = 313548
    print(decrypt_phone(clipped_phone, value))

We get:

 +7 495 626-84-44 
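
To tie this back to the markup in the question, here is a rough sketch that pulls both inputs out of the HTML with the standard library `html.parser` and then applies the same arithmetic. The class name `PhoneParser` is made up for illustration, and reading the value from `blst` on the button (rather than `lst1`) is an assumption; in the example both attributes hold the same number.

```python
from html.parser import HTMLParser


def decrypt_phone(clipped_phone, value):
    # Same arithmetic as in the answer: divide by 17, split off the last two digits.
    p1, p2 = divmod(value // 17, 100)
    return clipped_phone.replace("...", "") + str(p1)[1:] + "-" + str(p2).zfill(2)


class PhoneParser(HTMLParser):
    """Collects the clipped phone text and the numeric blst value of the button."""

    def __init__(self):
        super().__init__()
        self.clipped_phone = None
        self.value = None
        self._in_phone_div = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("class") == "object-builder-phone":
            self._in_phone_div = True
        elif attrs.get("id") == "show-phone_button":
            # Assumption: blst on the button carries the encoded digits.
            self.value = int(attrs["blst"])

    def handle_data(self, data):
        if self._in_phone_div:
            self.clipped_phone = data.strip()
            self._in_phone_div = False


# The two elements from the question, joined into one fragment.
html = (
    '<div class="object-builder-phone" blst="true">+7 495 626-...</div>'
    '<div class="toggle-button" id="show-phone_button" blst="313548" '
    'lst1="313548" lst="0">Показать телефон</div>'
)

parser = PhoneParser()
parser.feed(html)
phone = decrypt_phone(parser.clipped_phone, parser.value)
print(phone)
```

With this approach no click and no extra request is needed, as long as the site really stores the digits in the attributes the way the example suggests.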
  • It took me a year to get around to solving this, thanks. I did something like this :) The algorithm for obtaining the number really was in the JS scripts - Alexander Kurmazov

Open the browser's developer tools: there is a console and a log of HTTP requests. Look through all the requests and responses. Most likely the phone arrives via Ajax. On some sites you do not even need to download the HTML to get the number, and on others the Ajax request will not go through without it.

At best, you will get a JSON response. At worst, BeautifulSoup will have to be replaced with a Gecko or WebKit engine that executes all the JavaScript on the page.

Also, with a browser engine you will be able to click buttons and links programmatically.

  • PhantomJS can be used via Selenium for automatic page actions - Andrew Che
    import requests
    r = requests.post("http://httpbin.org/post", params={"did":123, "ng":456})