Hello everyone. I wrote a simple program that scrapes product data from one online store. The first few runs work fine, but when I launch it frequently it starts failing. After searching online I realized this is caused by too many requests to the server; restarting the router made it work again, but after a while it broke once more. Is there a way to work around this problem reliably? I would be very grateful. Here is a piece of the test script:

    import requests
    from bs4 import BeautifulSoup

    def get_html(url):
        r = requests.get(url)
        print(r)  # prints the HTTP status, e.g. <Response [503]>
        return r.text

    def parse(html):
        soup = BeautifulSoup(html, "lxml")
        pages = soup.find("li", id="result_59")
        print(pages)
  • Error 503 is returned by that very "one online store"; the problem is on its side. It is probably caused by your script hitting the site too often, so the fix is to make requests less frequently, not to restart the router. It is also possible that you have nothing to do with it and these are the site administrators' problems, not yours - andreymal
  • Try replacing the user-agent when you get the error, and if that does not help, switch proxies (see the sketch after these comments) - danilshik
  • Thank you very much for the tips, I will look into it - Karlson21
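
To make the comments concrete, here is a minimal sketch of both suggestions: retry on a 503, waiting and switching the User-Agent between attempts. The agent strings, retry count, and delay are placeholder assumptions, not values from the question (proxy rotation, the other suggestion, would plug into the proxies= argument of requests.get):

    import time
    import requests

    # Placeholder User-Agent strings; rotate through them on retries.
    USER_AGENTS = [
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
        "Mozilla/5.0 (X11; Linux x86_64)",
    ]

    def get_html(url, retries=3, delay=5):
        for attempt in range(retries):
            headers = {"User-Agent": USER_AGENTS[attempt % len(USER_AGENTS)]}
            r = requests.get(url, headers=headers)
            if r.status_code != 503:
                return r.text
            # The server is throttling us: wait before the next attempt.
            time.sleep(delay)
        r.raise_for_status()  # still 503 after all retries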

2 answers

Try this method.

    from urllib.request import Request, urlopen

    def get_html(url):
        # Present a browser-like User-Agent so the server treats the
        # request as a normal visitor rather than a bot. (Note: Request
        # and urlopen live in urllib.request, not urllib.response.)
        req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
        webpage = urlopen(req).read().decode('utf8')
        return webpage
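
A quick hypothetical check of this helper (the URL is a placeholder, not the store from the question):

    # Hypothetical usage; substitute a real product page URL.
    html = get_html("https://example.com/product/123")
    print(html[:200])  # first 200 characters as a sanity check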

The advice from floydya apparently led to the final solution. Everything worked for me like this:

    import urllib.request

    def get_html(url):
        # Build the request with a browser-like User-Agent header so the
        # server does not reject the script outright.
        r = urllib.request.Request(
            url,
            data=None,
            headers={'User-Agent': 'Mozilla/5.0'}
        )
        f = urllib.request.urlopen(r)
        return f.read().decode("utf-8")
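
If the 503 really was rate limiting, the User-Agent header alone may stop helping at some point. A minimal extension of the same helper (the 2-second delay is an arbitrary assumption, not from the answers) is to pause after each call so requests arrive more slowly:

    import time
    import urllib.request

    def get_html(url, delay=2.0):
        req = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
        with urllib.request.urlopen(req) as f:
            html = f.read().decode("utf-8")
        # Arbitrary pause so consecutive calls do not hammer the server;
        # tune the value for the target site.
        time.sleep(delay)
        return html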