<article>
<div class="article-content">
<h4 class="link-title" style="text-transform:capitalize"> <!--<a rel="nofollow" href="/view-site/viewframe.asp?url=http://www.ama-assn.org"></a> --></h4>
<!--<a rel="nofollow" href="/view-site/viewframe.asp?url=http://www.ama-assn.org">http://www.ama-assn.org</a>-->
Website : <span class="">http://www.ama-assn.org</span> <br>American Medical Association</span><br><br>
<a href="ratingsite.asp?id=1&sta=indian" class="flat-blue">Rate</a>
<a href="../regis/chk4login.asp?from=../medicalwebsite/rating_comments.asp?id=1" class="flat-blue">Comments</a>
<a href="broken_links.asp?id=1&sta=indian" class="flat-blue"> Submit broken link</a>
<!--<a href="edit.asp?urlid=1&sta=indian&pages=Medical&id=1" class="flat-blue">Edit</a>-->
</div>
</article>

I need to parse out the site name (http://www.ama-assn.org) and its description (American Medical Association).
I get this far and then get stuck; I cannot extract the data. Code:
import urllib.request
from bs4 import BeautifulSoup


def get_html(url):
    response = urllib.request.urlopen(url)
    return response.read()


def parse(html):
    soup = BeautifulSoup(html, 'html.parser')
    site = soup.find('div', class_='article-content')
    span = site.find('span', class_="")
    print(span)


def main():
    parse(get_html('https://www.website.net'))


if __name__ == '__main__':
    main()
Using span = site.find('span', class_='').get_text() I can pull out the name of the site, but then I cannot pull out the description, American Medical Association. Please help. - Ivan Petrov
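
One possible approach, sketched under the assumption that the real page matches the snippet in the question: the description is not inside any tag of its own. It is a bare text node that follows the `<br>` after the `<span>`, so it can be reached by walking the span's following siblings and taking the first non-empty text node. The function name `parse_entry` and the trimmed `SAMPLE_HTML` are made up for illustration, and the span is located with a plain `site.find('span')` on the assumption that it is the first span in the div.

```python
from bs4 import BeautifulSoup, Comment, NavigableString

# Trimmed copy of the markup from the question (assumption: the live page looks the same).
SAMPLE_HTML = """
<article>
<div class="article-content">
Website : <span class="">http://www.ama-assn.org</span> <br>American Medical Association</span><br><br>
<a href="ratingsite.asp?id=1&sta=indian" class="flat-blue">Rate</a>
</div>
</article>
"""


def parse_entry(html):
    """Return (url, description) for one article-content block."""
    soup = BeautifulSoup(html, 'html.parser')
    site = soup.find('div', class_='article-content')

    # The URL is the text of the first <span> inside the div.
    span = site.find('span')
    url = span.get_text(strip=True)

    # The description is a bare text node after the span, so walk the
    # span's following siblings and take the first non-empty string,
    # skipping tags such as <br> and any HTML comments.
    description = None
    for node in span.next_siblings:
        if isinstance(node, Comment):
            continue
        if isinstance(node, NavigableString) and node.strip():
            description = node.strip()
            break
    return url, description


if __name__ == '__main__':
    print(parse_entry(SAMPLE_HTML))
```

Walking the siblings rather than calling `.next_sibling` a fixed number of times should tolerate small changes in whitespace between the tags. To use it with the original script, `parse_entry(get_html(...))` can replace the call to `parse`.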