How to get @src from an iframe using XPath

Question

There is a page

The page contains a YouTube video embedded in an iframe:

<iframe id="player" frameborder="0" allowfullscreen="1" title="YouTube video player" width="640" height="360" src="https://www.youtube.com/embed/vvKYhcUSrY4?autoplay=0&amp;rel=0&amp;showinfo=0&amp;controls=1&amp;modestbranding=1&amp;enablejsapi=1&amp;origin=http%3A%2F%2Fvkino.ua"></iframe>

Please help me get the link to the video from src.

When I copy the XPath in Chrome, it gives only //*[@id="player"]. No matter what I tried, it does not work. I am using lxml (Python).

I also tried checking with print(tree.xpath('name(//*[@id="trailer-holder"]/*[1])')), and the result is not an iframe but a div.

gil9red · Accepted Answer · 2016-02-25T09:44:33

It's quite simple. By the way, it is better not to parse HTML pages with an XML parser: real-world HTML often contains structural errors (unclosed tags, attributes without values), so a strict XML parser may refuse to parse it.
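To illustrate the point about strict XML parsing, here is a minimal sketch. The `broken` markup is a made-up sample (an assumption, not the page from the question) with an unclosed `<br>` and a valueless `allowfullscreen` attribute; both are acceptable in HTML but not well-formed XML:

```python
from lxml import etree

# Hypothetical sample markup, not taken from the real page:
# '<br>' is never closed and 'allowfullscreen' has no value.
broken = (
    '<html><body><p>Trailer<br>'
    '<iframe id="player" allowfullscreen '
    'src="https://www.youtube.com/embed/vvKYhcUSrY4"></iframe>'
    '</p></body></html>'
)

try:
    etree.fromstring(broken)  # strict XML parsing
    xml_error = None
except etree.XMLSyntaxError as exc:
    xml_error = exc

print("XML parser failed:", xml_error is not None)
```

The exact error message depends on the libxml2 version, so the sketch only checks that an exception was raised.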

For lxml there is a simple solution: parse the page with lxml.html instead of lxml.etree, that is, write from lxml import html and call html.fromstring:

    text = """
    <html><body>
    <iframe id="player" frameborder="0" allowfullscreen="1" title="YouTube video player" width="640" height="360" src="https://www.youtube.com/embed/vvKYhcUSrY4?autoplay=0&amp;rel=0&amp;showinfo=0&amp;controls=1&amp;modestbranding=1&amp;enablejsapi=1&amp;origin=http%3A%2F%2Fvkino.ua"></iframe>
    </body>
    </html>
    """

    from lxml import html
    root = html.fromstring(text)

    # Find, anywhere in the document, the 'src' attribute that belongs to
    # an 'iframe' tag whose 'id' attribute equals 'player':
    match = root.xpath('//iframe[@id="player"]/@src')
    if match:
        print(match[0])

    # Find, anywhere in the document, the 'src' attribute that belongs to
    # any tag whose 'id' attribute equals 'player':
    match = root.xpath('//*[@id="player"]/@src')
    if match:
        print(match[0])

    # Find, anywhere in the document, an 'iframe' tag whose 'id' attribute
    # equals 'player', then read 'src' from the element itself:
    match = root.xpath('//iframe[@id="player"]')
    if match:
        print(match[0].attrib['src'])

Output to console:

    https://www.youtube.com/embed/vvKYhcUSrY4?autoplay=0&rel=0&showinfo=0&controls=1&modestbranding=1&enablejsapi=1&origin=http%3A%2F%2Fvkino.ua
    https://www.youtube.com/embed/vvKYhcUSrY4?autoplay=0&rel=0&showinfo=0&controls=1&modestbranding=1&enablejsapi=1&origin=http%3A%2F%2Fvkino.ua
    https://www.youtube.com/embed/vvKYhcUSrY4?autoplay=0&rel=0&showinfo=0&controls=1&modestbranding=1&enablejsapi=1&origin=http%3A%2F%2Fvkino.ua
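The lookup can also be wrapped in a small helper. This is only a sketch: the name `iframe_src` and the sample `page` string are made up for illustration, and it assumes parsing with `lxml.html.fromstring`. It returns `None` instead of raising when no matching iframe exists:

```python
from lxml import html


def iframe_src(page_source, iframe_id):
    """Return the src of the iframe with the given id, or None if absent.

    Hypothetical helper for illustration; not part of lxml.
    """
    root = html.fromstring(page_source)
    # The id is passed as an XPath variable, so it needs no manual quoting.
    match = root.xpath('//iframe[@id=$iframe_id]/@src', iframe_id=iframe_id)
    return match[0] if match else None


# Made-up sample document, loosely modelled on the markup in the question.
page = (
    '<html><body><div id="trailer-holder">'
    '<iframe id="player" src="https://www.youtube.com/embed/vvKYhcUSrY4"></iframe>'
    '</div></body></html>'
)

found = iframe_src(page, "player")
missing = iframe_src(page, "no-such-id")
print(found)
print(missing)
```

If the helper returns `None` for a real page, the iframe is simply not present in the HTML that lxml received. One possible reason, which is only a guess here, is that the iframe is created later by a script in the browser.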
