For example, I have the following HTML:

    <html>
    <head>
        <title>Example</title>
    </head>
    <body>
        <div class='General'>
            <div class='First-class'>
                <p><a href='links what I need'>Link - links what I need</a></p>
                <p><a href='links what I'>Link - links what I</a></p>
                <p><a href='links what'>Link - links what</a></p>
                <p><a href='links'>Link - links</a></p>
            </div>
        </div>
    </body>
    </html>

To get the href value of every <a href='...'></a> tag, I use:

    import urllib.request
    from bs4 import BeautifulSoup

    html = urllib.request.urlopen('http://your/url')
    soup = BeautifulSoup(html, 'html.parser').find('div', class_='First-class')
    for i in soup.find_all('a', href=True):
        print(i['href'])

Result:

    links what I need
    links what I
    links what
    links

How do I write this result to a file? I tried this code:

    import urllib.request
    from bs4 import BeautifulSoup

    html = urllib.request.urlopen('http://your/url')
    soup = BeautifulSoup(html, 'html.parser').find('div', class_='First-class')
    for i in soup.find_all('a', href=True):
        print(i['href'])
    with open("file.txt", "a") as file_1:
        file_1.write(i['href'] + "\n")
    input()

As a result, only the last link is written to the file, but I need to write all of them.

  • Put the write inside the loop, obviously - andreymal

1 answer

    from bs4 import BeautifulSoup

    html = """
    <html>
    <head>
        <title>Example</title>
    </head>
    <body>
        <div class='General'>
            <div class='First-class'>
                <p><a href='links what I need'>Link - links what I need</a></p>
                <p><a href='links what I'>Link - links what I</a></p>
                <p><a href='links what'>Link - links what</a></p>
                <p><a href='links'>Link - links</a></p>
            </div>
        </div>
    </body>
    </html>
    """

    soup = BeautifulSoup(html, 'html.parser').find('div', class_='First-class')

    # Collect every href, then write them all at once, one per line.
    with open("file.txt", "w") as file_1:
        file_1.write("\n".join(a['href'] for a in soup.find_all('a', href=True)))
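
The comment on the question suggests a different fix: keep the loop and move the write inside it. Below is a minimal sketch of that idea, not the only way to do it. The helper name `save_links` and the temporary output path are assumptions made for illustration, and the `hrefs` list stands in for the values that `soup.find_all('a', href=True)` would yield, so the snippet runs without BeautifulSoup or a network request.

```python
import os
import tempfile


def save_links(links, path):
    """Write each link on its own line and return how many were written.

    Hypothetical helper: `links` can be any iterable of strings.
    """
    count = 0
    # Open the file once, before the loop, so it stays open for every iteration.
    with open(path, "w", encoding="utf-8") as out:
        for href in links:
            out.write(href + "\n")  # the write happens inside the loop
            count += 1
    return count


# Stand-in for: [a['href'] for a in soup.find_all('a', href=True)]
hrefs = ["links what I need", "links what I", "links what", "links"]

# Assumed output location; replace with "file.txt" or any other path.
path = os.path.join(tempfile.gettempdir(), "file.txt")
written = save_links(hrefs, path)
print(written, "links written to", path)
```

With the real page, `save_links((a['href'] for a in soup.find_all('a', href=True)), "file.txt")` should behave the same way. Note that mode `"w"` replaces the file on every run, while the `"a"` mode used in the question appends to whatever is already there.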