BeautifulSoup is actually useful here: with it you can find the required script element in the HTML tree. Once the element is found and you have its text, you need to decide how to parse the JS code and extract the value of the desired variable.
One practical and simple option is a regular expression. Moreover, you can use the same compiled regular expression both to find the element and to extract the data object as a string, which can then be passed through json.loads() to get a Python data structure (a dictionary in the example below). Note that this only works when the JS object literal is also valid JSON (double-quoted keys and strings, no trailing commas).
Working example:
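    import json
    import re

    from bs4 import BeautifulSoup

    data = """
    <html>
        <head>
            <script type="text/javascript">
                data = {"url":" haha.com", "id": "12345", "name": "haha"};
                function() {
                    // something here
                });
            </script>
        </head>
    </html>"""

    soup = BeautifulSoup(data, "html.parser")

    # Capture the object literal assigned to `data`, up to the ";" that ends the line
    pattern = re.compile(r"data = (\{.*?\});$", re.MULTILINE | re.DOTALL)

    # The same pattern selects the <script> whose text contains the assignment
    # (`string=` replaces the older `text=` argument since Beautiful Soup 4.4.0)
    script = soup.find("script", string=pattern)
    if script:
        obj = pattern.search(script.text).group(1)
        obj = json.loads(obj)  # the literal is valid JSON, so this yields a dict
        print(obj)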
This prints:
{'url': ' haha.com', 'id': '12345', 'name': 'haha'}
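A non-greedy pattern like the one above stops at the first "};" that ends a line, which could cut the object short if a string value happened to contain that sequence. As a possible variant (a sketch, not part of the original answer), the regular expression can be used only to locate the start of the assignment, leaving the parsing to json.JSONDecoder.raw_decode(), which reads one JSON value from a given index and ignores whatever follows. The helper name extract_js_object and the sample script text are hypothetical, and the approach still assumes the literal is valid JSON.

```python
import json
import re


def extract_js_object(script_text, var_name="data"):
    """Return the JSON object assigned to var_name in script_text, or None.

    Hypothetical helper: assumes the assigned literal is valid JSON.
    """
    # Find `<var_name> = ` immediately followed by an opening brace
    match = re.search(r"\b%s\s*=\s*(?=\{)" % re.escape(var_name), script_text)
    if match is None:
        return None
    # raw_decode parses a single JSON value starting at the given index
    # and returns it together with the index where the value ends
    obj, _end = json.JSONDecoder().raw_decode(script_text, match.end())
    return obj


# Illustrative script text; in practice this would be script.text from BeautifulSoup
script_text = """
data = {"url": "haha.com", "meta": {"note": "contains }; inside"}};
function f() { /* something here */ }
"""

print(extract_js_object(script_text))
```

If the literal is not valid JSON (single quotes, unquoted keys), raw_decode() raises json.JSONDecodeError, and a real JS parser is the safer choice.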
See also this StackOverflow post, where a similar task is discussed; besides regular expressions, it includes an example of using the slimit JS parser: