<article>
  <div class="article-content">
    <h4 class="link-title" style="text-transform:capitalize">
      <!--<a rel="nofollow" href="/view-site/viewframe.asp?url=http://www.ama-assn.org"></a> -->
    </h4>
    <!--<a rel="nofollow" href="/view-site/viewframe.asp?url=http://www.ama-assn.org">http://www.ama-assn.org</a>-->
    Website : <span class="">http://www.ama-assn.org</span>
    <br>American Medical Association</span><br><br>
    <a href="ratingsite.asp?id=1&sta=indian" class="flat-blue">Rate</a>
    <a href="../regis/chk4login.asp?from=../medicalwebsite/rating_comments.asp?id=1" class="flat-blue">Comments</a>
    <a href="broken_links.asp?id=1&sta=indian" class="flat-blue"> Submit broken link</a>
    <!--<a href="edit.asp?urlid=1&sta=indian&pages=Medical&id=1" class="flat-blue">Edit</a>-->
  </div>
</article>

From this HTML I need to extract the site URL, http://www.ama-assn.org, and its description, "American Medical Association".

I get this far and then I am stuck: I cannot pull out the data. My code:

    import urllib.request
    from bs4 import BeautifulSoup


    def get_html(url):
        response = urllib.request.urlopen(url)
        return response.read()


    def parse(html):
        soup = BeautifulSoup(html, 'html.parser')
        site = soup.find('div', class_='article-content')
        span = site.find('span', class_="")
        print(span)


    def main():
        parse(get_html('https://www.website.net'))


    if __name__ == '__main__':
        main()
  • What exactly does not work? Please show the error. - Alexander
  • There is no error. I managed to get the site URL with span = site.find('span', class_='').get_text(), but I still cannot extract "American Medical Association". Please help. - Ivan Petrov

1 answer

    import requests
    from bs4 import BeautifulSoup

    url = 'https://www.ama-assn.org/'
    r = requests.get(url)
    soup = BeautifulSoup(r.text, "lxml")
    title = soup.find('title').text
    print(title)

result:

 American Medical Association | AMA 

or, to keep only the name:

    >>> print(title.partition(' | ')[0])
    American Medical Association
  • MaxU, thank you, but are you suggesting that I visit the site behind every link? About half of those sites have bot protection. I need to get the description from this page itself. How can I do that? - Ivan Petrov
  • @IvanPetrov, I suggest you first post valid HTML in the question. The closing </span> tag has no matching opening tag ... - MaxU
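
For the HTML posted in the question, one possible approach is to treat the description as a bare text node that follows the span, and to walk the span's siblings until the first non-empty piece of text. The sketch below is only that: it assumes the real page has the same layout as the snippet, the names `extract_site` and `HTML` are hypothetical, and the markup is a trimmed copy of the question's HTML with the stray `</span>` left in. It uses the built-in `html.parser` backend, which appears to tolerate the unmatched closing tag here.

```python
from bs4 import BeautifulSoup
from bs4.element import Comment, NavigableString

# Trimmed copy of the HTML from the question (stray </span> kept on purpose).
HTML = """
<article> <div class="article-content">
<h4 class="link-title"> <!--<a href="/view-site/viewframe.asp?url=http://www.ama-assn.org"></a> --></h4>
Website : <span class="">http://www.ama-assn.org</span>
<br>American Medical Association</span><br><br>
<a href="ratingsite.asp?id=1&sta=indian" class="flat-blue">Rate</a>
</div> </article>
"""


def extract_site(html):
    """Return (url, description) for one article block, or None if it is missing.

    Hypothetical helper: assumes the layout shown in the question.
    """
    soup = BeautifulSoup(html, "html.parser")
    block = soup.find("div", class_="article-content")
    if block is None:
        return None
    span = block.find("span")
    if span is None:
        return None
    url = span.get_text(strip=True)

    # The description is not wrapped in its own tag, so look at the nodes
    # that follow the span and take the first non-empty text among them.
    description = None
    for node in span.next_siblings:
        if isinstance(node, Comment):
            continue  # skip commented-out markup
        if isinstance(node, NavigableString) and node.strip():
            description = node.strip()
            break
    return url, description


if __name__ == "__main__":
    print(extract_site(HTML))
```

On the trimmed snippet this returns the pair ('http://www.ama-assn.org', 'American Medical Association'). If the page lists many sites, the same function could presumably be applied to each `article-content` block found with `find_all`, without visiting the linked sites at all.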