I'm just learning Python and can't figure out what I'm doing wrong. Here is the code snippet:

    import requests
    from bs4 import BeautifulSoup

    captcha = None
    url = "https://www.amazon.com/s/?page=" + str(k) + "&keywords=" + req
    headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}

    # First request: fetch the search results page
    response = requests.get(url, headers=headers)
    html = response.text
    soup = BeautifulSoup(html, 'html.parser')
    captcha = soup.find(text="Robot Check")

    # Keep retrying while Amazon shows the "Robot Check" page
    while captcha != None:
        cap = open('cap.html', 'w')
        cap.write(str(soup))
        cap.close()
        capin = input("Captcha required. Enter the captcha from cap.html: ")
        s = requests.Session()
        data = {"captchacharacters": capin}
        r = s.post(url, data=data)
        s.cookies
        response = requests.get(url, headers=headers)
        html = response.text
        soup = BeautifulSoup(html, 'html.parser')
        captcha = soup.find(text="Robot Check")

What I do is save the HTML to a file whenever Amazon asks for a captcha. The remaining step is to submit the captcha back to Amazon, but I'm apparently doing something wrong, because Amazon just keeps asking for a captcha over and over.
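One thing that stands out in the snippet (a guess, not a confirmed diagnosis): the captcha answer is POSTed through a brand-new `requests.Session()`, but the retry is then done with a plain `requests.get()`, so any cookies Amazon sets in response to the POST are discarded. Below is a minimal sketch that keeps every request on a single session; whether the captcha form also needs hidden fields or a different action URL than `url` is an assumption you would have to verify (see the comments below about the developer tools):

    import requests
    from bs4 import BeautifulSoup

    k, req = 1, "laptop"  # placeholders for the page number and search term

    # One Session object keeps cookies between the captcha POST and later GETs.
    session = requests.Session()
    session.headers.update({
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) '
                      'AppleWebKit/537.36 (KHTML, like Gecko) '
                      'Chrome/39.0.2171.95 Safari/537.36'
    })

    def fetch(url):
        """GET a page through the shared session and parse it."""
        response = session.get(url)
        return BeautifulSoup(response.text, 'html.parser')

    url = "https://www.amazon.com/s/?page=" + str(k) + "&keywords=" + req
    soup = fetch(url)

    while soup.find(text="Robot Check") is not None:
        with open('cap.html', 'w') as cap:
            cap.write(str(soup))
        capin = input("Captcha required. Enter the captcha from cap.html: ")
        # Assumption: the form accepts a POST to the same URL with only
        # "captchacharacters"; the real form may post to another endpoint
        # and include hidden fields, so check the Network tab to be sure.
        session.post(url, data={"captchacharacters": capin})
        soup = fetch(url)  # retried with whatever cookies the POST set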

  • Submitting just that one page is unlikely to be enough: there are also the HTTP request headers (Request Headers) and the cookies (Cookie) to take care of (see the sketch after this list). - gil9red
  • I'd advise opening the developer tools in the browser and watching the Network tab to see which requests are made when the captcha appears and when its value is submitted. - gil9red
  • The captcha is a hint that this interface is not meant to be consumed by bots. Amazon has plenty of official APIs; try those instead of trying to circumvent the protection measures. - jfs
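To illustrate the first two comments: whatever request headers and cookies the browser sends when it submits the captcha can be copied from the Network tab onto a `requests.Session`, which then reuses them for every request. The header and cookie values below are placeholders, not the actual ones Amazon expects:

    import requests

    session = requests.Session()

    # Placeholder values: copy the real ones from the browser's Network tab
    # (Request Headers and Cookie) for the request that submits the captcha.
    session.headers.update({
        'User-Agent': 'copy the browser User-Agent here',
        'Accept-Language': 'en-US,en;q=0.9',
        'Referer': 'https://www.amazon.com/',
    })
    session.cookies.set('session-id', 'value copied from the Cookie header')

    # Every request made through this session now carries those headers and
    # cookies, and any Set-Cookie responses are stored back on the session.
    response = session.get('https://www.amazon.com/s/?keywords=laptop')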
