Hello everyone. I wrote a simple program that scrapes product data from one online store. The first few runs work fine, but when I launch it frequently it starts failing. After searching online I realized this is caused by too many requests to the server; restarting the router made it work again, but after a while it broke once more. Is there a way to work around this problem reliably? I would be very grateful. Here is a piece of the test script:

    import requests
    from bs4 import BeautifulSoup

    def get_html(url):
        r = requests.get(url)
        print(r)  # prints the HTTP status, e.g. <Response [503]>
        return r.text

    def parse(html):
        soup = BeautifulSoup(html, "lxml")
        pages = soup.find("li", id="result_59")
        print(pages)
  • Error 503 is returned by that very "one online store"; the problem is on its side. It is probably caused by your script hitting the site too often, so the fix is to make requests less frequently, not to restart the router. It is also possible that you have nothing to do with it and these are the site administrators' problems, not yours - andreymal
  • Try replacing the user-agent when you get the error, and if that does not help, switch proxies (see the sketch after these comments) - danilshik
  • Thank you very much for the tips, I will look into it - Karlson21
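
To make the comments concrete, here is a minimal sketch of both suggestions: retry on a 503, waiting and switching the User-Agent between attempts. The agent strings, retry count, and delay are placeholder assumptions, not values from the question (proxy rotation, the other suggestion, would plug into the proxies= argument of requests.get):

    import time
    import requests

    # Placeholder User-Agent strings; rotate through them on retries.
    USER_AGENTS = [
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
        "Mozilla/5.0 (X11; Linux x86_64)",
    ]

    def get_html(url, retries=3, delay=5):
        for attempt in range(retries):
            headers = {"User-Agent": USER_AGENTS[attempt % len(USER_AGENTS)]}
            r = requests.get(url, headers=headers)
            if r.status_code != 503:
                return r.text
            # The server is throttling us: wait before the next attempt.
            time.sleep(delay)
        r.raise_for_status()  # still 503 after all retries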

2 answers

Try this method.

    from urllib.request import Request, urlopen

    def get_html(url):
        # Present a browser-like User-Agent so the server treats the
        # request as a normal visitor rather than a bot. (Note: Request
        # and urlopen live in urllib.request, not urllib.response.)
        req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
        webpage = urlopen(req).read().decode('utf8')
        return webpage
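
A quick hypothetical check of this helper (the URL is a placeholder, not the store from the question):

    # Hypothetical usage; substitute a real product page URL.
    html = get_html("https://example.com/product/123")
    print(html[:200])  # first 200 characters as a sanity check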

The advice from floydya apparently led to the final solution. Everything worked for me like this:

    import urllib.request

    def get_html(url):
        # Build the request with a browser-like User-Agent header so the
        # server does not reject the script outright.
        r = urllib.request.Request(
            url,
            data=None,
            headers={'User-Agent': 'Mozilla/5.0'}
        )
        f = urllib.request.urlopen(r)
        return f.read().decode("utf-8")
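
If the 503 really was rate limiting, the User-Agent header alone may stop helping at some point. A minimal extension of the same helper (the 2-second delay is an arbitrary assumption, not from the answers) is to pause after each call so requests arrive more slowly:

    import time
    import urllib.request

    def get_html(url, delay=2.0):
        req = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
        with urllib.request.urlopen(req) as f:
            html = f.read().decode("utf-8")
        # Arbitrary pause so consecutive calls do not hammer the server;
        # tune the value for the target site.
        time.sleep(delay)
        return html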