How to parse code snippet <script type [duplicate]

Question

This question has already been answered:

How to get information from a JSON string embedded in JavaScript code inside an HTML page, using Python 3.x? (2 answers)

Task: get the link to the image from code that looks like this:

<script type="text/javascript">window._sharedData = {"activity_counts":null,"config": __lots of code__ "dimensions":{"height":1350,"width":1080},"display_url":"https://scontent-arn2-1.cdninstagram.com/vp/d60f1dbd5f2a609e6a653569a8a13a11/5B807715/t51.2885-15/e35/31297890_370834860094260_8326112321717927936_n.jpg"

The link I need is the value of the "display_url" key.

I could not get it with BeautifulSoup. Is it possible to do this with regular expressions?

The error message alone tells us nothing; please show the code that produces it.

ss_beer · Answer 1 · 2018-05-14T03:21:53

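Use BeautifulSoup to pull the relevant script tag out of the page source. The content of that tag is a JavaScript assignment whose right-hand side is a JSON object, so a regular expression can isolate the object and json.loads can parse it.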

    from bs4 import BeautifulSoup
    import json
    import re

    src = ('<script type="text/javascript">window._sharedData = '
           '{"activity_counts":null,"config":"",'
           '"dimensions":{"height":1350,"width":1080},'
           '"display_url":"https://scontent-arn2-1.cdninstagram.com/vp/d60f1dbd5f2a609e6a653569a8a13a11/5B807715/t51.2885-15/e35/31297890_370834860094260_8326112321717927936_n.jpg"}'
           '</script>')

    soup = BeautifulSoup(src, 'lxml')
    # Add extra search conditions here to select your specific tag.
    script = soup.find('script')
    # Capture the object literal assigned to window._sharedData.
    json_text = re.findall(r'^\s*window\._sharedData\s*=\s*({.*?})\s*$',
                           script.string, flags=re.DOTALL | re.MULTILINE)[0]
    js = json.loads(json_text)
    print(js['display_url'])
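Since the question asks whether regular expressions alone are enough, here is a minimal sketch that uses only the standard library. It assumes the page contains the text `window._sharedData = ` followed directly by a valid JSON object. The helper names `extract_shared_data` and `find_key`, the `media` wrapper key, and the `example.com` URL are made up for illustration; only `window._sharedData` and `display_url` come from the question. Instead of a lazy `{.*?}` pattern, it lets `json.JSONDecoder.raw_decode` find the end of the object, and then searches every nesting level for the key, because on a real page `display_url` is probably not at the top level.

```python
import json
import re

# Matches the assignment prefix; the JSON object starts right after it.
SHARED_DATA_RE = re.compile(r"window\._sharedData\s*=\s*")


def extract_shared_data(html):
    """Return the object assigned to window._sharedData, or None if absent."""
    match = SHARED_DATA_RE.search(html)
    if match is None:
        return None
    # raw_decode parses exactly one JSON value starting at the given index
    # and ignores whatever follows it (";", "</script>", more markup).
    # It raises json.JSONDecodeError if the text there is not valid JSON.
    obj, _end = json.JSONDecoder().raw_decode(html, match.end())
    return obj


def find_key(obj, key):
    """Yield every value stored under `key` at any nesting depth."""
    if isinstance(obj, dict):
        for k, v in obj.items():
            if k == key:
                yield v
            yield from find_key(v, key)
    elif isinstance(obj, list):
        for item in obj:
            yield from find_key(item, key)


# Shortened, hypothetical page fragment (the "media" key and URL are invented).
html = (
    '<script type="text/javascript">window._sharedData = '
    '{"activity_counts":null,"config":"",'
    '"media":{"dimensions":{"height":1350,"width":1080},'
    '"display_url":"https://example.com/image.jpg"}};</script>'
)

data = extract_shared_data(html)
urls = list(find_key(data, "display_url"))
print(urls[0])  # prints https://example.com/image.jpg
```

The regular expression only locates the start of the object; the JSON parser does the rest, so nested braces and a trailing semicolon do not need special handling.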

Marked as a duplicate by jfs on May 14 '18 at 7:21.