There are links like:

<a chapter="1" href="/url1/">1</a>
<a chapter="2" href="/url2/">2</a>
<a chapter="3" href="/url3/">3</a>
<a name="n1" href="/url1/">1</a>
<a name="n2" href="/url2/">2</a>

How can I get the href values of only those links that have the "chapter" attribute?

  • Use XPath or a CSS selector that tests for the presence of the attribute. Or the brute-force way: get a list of all a tags and keep those that have the chapter attribute - gil9red
  • How do I filter them? I can't work out how to access this attribute. Please show the code - MyNick
  • Getting the list of tags: a_list = root.select('a') - gil9red
  • Thank you brother. - MyNick

2 answers

    from bs4 import BeautifulSoup

    r = '''
    <a chapter="1" href="/url1/">1</a>
    <a chapter="2" href="/url2/">2</a>
    <a chapter="3" href="/url3/">3</a>
    <a name="n1" href="/url1/">1</a>
    <a name="n2" href="/url2/">2</a>'''

    soup = BeautifulSoup(r, 'html.parser')

    # chapter=True matches every <a> that has a chapter attribute, whatever its value
    for a in soup.find_all('a', chapter=True):
        print(a)  # or print(a['href']) to print only the URL
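Since the question asks for the href values rather than the tags themselves, here is a minimal sketch that collects them into a list. It also shows the brute-force variant suggested in the comments (take every a tag, then filter), which should give the same result. The variable names `html`, `hrefs` and `hrefs_filtered` are only illustrative.

```python
from bs4 import BeautifulSoup

html = '''
<a chapter="1" href="/url1/">1</a>
<a chapter="2" href="/url2/">2</a>
<a chapter="3" href="/url3/">3</a>
<a name="n1" href="/url1/">1</a>
<a name="n2" href="/url2/">2</a>
'''

soup = BeautifulSoup(html, 'html.parser')

# Let find_all do the filtering: chapter=True means "the attribute is present".
hrefs = [a['href'] for a in soup.find_all('a', chapter=True)]

# Brute-force variant: fetch every <a>, then keep those that have the attribute.
hrefs_filtered = [a['href'] for a in soup.find_all('a') if a.has_attr('chapter')]

print(hrefs)  # ['/url1/', '/url2/', '/url3/']
```

Note that `a['href']` raises KeyError if a matched tag has no href; `a.get('href')` returns None instead.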

    An alternative and slightly more concise way is to use CSS selectors. At the time of writing, BeautifulSoup supports only a limited set of selectors, but that is enough for most everyday tasks:

     for a in soup.select('a[chapter]'):
         print(a)  # or print(a.get_text()) to print the link texts
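If you also need to know which chapter each URL belongs to, the same selector can feed a dictionary. This is just a sketch under the assumption that the chapter values are unique; `html` and `chapter_links` are made-up names for the example.

```python
from bs4 import BeautifulSoup

html = '''
<a chapter="1" href="/url1/">1</a>
<a chapter="2" href="/url2/">2</a>
<a chapter="3" href="/url3/">3</a>
<a name="n1" href="/url1/">1</a>
<a name="n2" href="/url2/">2</a>
'''

soup = BeautifulSoup(html, 'html.parser')

# 'a[chapter]' selects <a> tags that carry a chapter attribute, with any value.
# Attribute values come back as strings, so the keys are '1', '2', '3'.
chapter_links = {a['chapter']: a.get('href') for a in soup.select('a[chapter]')}

print(chapter_links)  # {'1': '/url1/', '2': '/url2/', '3': '/url3/'}
```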