Good afternoon. I have the following JSON embedded in a page:
var model = {"ALLSKUS":["84664020","07961015","84664113","84664116"],"NBR":"137127","PRICERANGE":"$186.99 - $189.99","GENDER_AGE":"Men's","PRICEADJUSTDATE":"","AVAILABLE_SIZES":[" 07.5"," 08.0"," 08.5"," 09.0"," 09.5"," 10.0"," 10.5"," 11.0"," 11.5"," 12.0"," 12.5"," 13.0","14.0","15.0"],"DISCOUNT_PERCENT":"15","isFieldTestable":false,"SORT":"152","HASCUSTOMPRODUCTTEMPLATE":false,"PR_LIST":"224.99","SPORTS":[{"ID":"3","NM":"Basketball"},{"ID":"39","NM":"Casual"}],"SIZECHART_CD":"S0584","HASSIZES":true,"PR_SALE":"189.99","LOCALIZATION":{},"MODELTEMPLATE":{"ISMODELTEMPLATEACTIVE":"N","MODELTEMPLATE_IMAGE":""},"ISCUSTOMPRODUCT":false,"INTRODUCTIONDATE":"","SKU":"84664020","ISINTANGIBLE":false,"PROD_TP":"Shoes","CUSTPROD_CD":"","NM":"Jordan Retro 6 - Men's","REVIEWS":
In it I am looking for this part:
"AVAILABLE_SIZES":[" 07.5"," 08.0"," 08.5"," 09.0"," 09.5"," 10.0"," 10.5"," 11.0"," 11.5"," 12.0"," 12.5"," 13.0"," 14.0"," 15.0"]
Then I strip out everything I don't need.
The output should be a CSV table (table.csv):
|size|size|size|size|size|size|size|size|size|size|
|07.0|07.5|08.0|10.0|10.5|11.5|12.0|13.0|14.0|15|
This is what I write to the CSV.
I look for the data with a regular expression:
ad = requests.get('http://www.footlocker.com/product/model:132512/sku:A1781919/timberland-roll-top-mens/tan/tan/').text  # example link
bb = re.findall(r'"AVAILABLE_SIZES":(.*)"DISCOUNT_PERCENT"', ad)
out: ['[" 07.0"," 07.5"," 08.0"," 10.0"," 10.5"," 11.5"," 12.0"," 13.0"," 14.0"," 15.0"],']
Then I try to strip the extra characters from the result.
How do I remove the extra characters now? It complains when I call replace. Is the error caused by the spaces in the JSON output?
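For what it's worth, re.findall returns a list of strings, so string methods such as replace have to be called on an element of that list, not on the list itself. A minimal sketch, assuming the page text contains a fragment shaped like the one shown above: capture only the bracketed array, take the first match, and let json.loads turn it into a Python list, so that no replace calls are needed. The sample string assigned to `ad` and the names `matches` and `sizes` are stand-ins for illustration, not part of the original script.

```python
import json
import re

# Stand-in for the downloaded page text (assumption: same shape as the real page).
ad = '"PRICEADJUSTDATE":"","AVAILABLE_SIZES":[" 07.0"," 07.5"," 15.0"],"DISCOUNT_PERCENT":"10"'

# re.findall returns a list of strings, so index into it before using str methods.
# The non-greedy group captures only the bracketed array.
matches = re.findall(r'"AVAILABLE_SIZES":(\[.*?\])', ad)

sizes = []
if matches:
    # The captured text is a valid JSON array; strip the leading spaces afterwards.
    sizes = [s.strip() for s in json.loads(matches[0])]

print(sizes)
```

Parsing the captured text as JSON avoids hand-removing brackets and quotes, and strip() deals with the leading space in each size.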
bb = re.findall(r'"AVAILABLE_SIZES":(.*)],"DISCOUNT_PERCENT"', ad).str(var).replace('[', ' ')
out: AttributeError: 'list' object has no attribute 'str'
Update:
bb_strings = re.findall(r'var model = ({.*})', ad)
bp = {}
if bb_strings:
    bp = json.loads(bb_strings[0])
out: {'ALLSKUS': ['A1781919', '6635A001', '6634A'], 'NBR': '132512', 'PRICERANGE': '$99.99 - $125.99', 'GENDER_AGE': "Men's", 'PRICEADJUSTDATE': '', 'AVAILABLE_SIZES': [' 07.0', ' 07.5', ' 08.0', ' 10.0', ' 10.5', ' 11.5', ' 12.0', ' 13.0', ' 14.0', ' 15.0'], 'DISCOUNT_PERCENT': '10', 'isFieldTestable': False, 'SORT': '1036', 'HASCUSTOMPRODUCTTEMPLATE': False, 'PR_LIST': '139.99', 'SPORTS': [{'ID': '31', 'NM': 'Snow'}, {'ID': '39', 'NM': 'Casual'}], 'SIZECHART_CD': 'S0629', 'HASSIZES': True, 'PR_SALE': '125.99', 'LOCALIZATION': {}, 'MODELTEMPLATE': {'ISMODELTEMPLATEACTIVE': 'N', 'MODELTEMPLATE_IMAGE': ''}, 'ISCUSTOMPRODUCT': False, 'INTRODUCTIONDATE': '', 'SKU': '6635A001', 'ISINTANGIBLE': False, 'PROD_TP': 'Shoes', 'CUSTPROD_CD': '', 'NM': "Timberland Roll-Top - Men's", 'REVIEWS': {'HASREVIEWS': True, 'TOTALREVIEWCOUNT': '17', 'WEIGHTEDAVERAGERATING': '4.82', 'WEIGHTEDAVERAGERECOMMENDED': '16'}, 'BRAND': 'Timberland', 'INET_COPY': 'A style unlike any other. The Timberland Roll Top Boot rolls down for a little built-in air conditioning and a whole lotta style. Premium, full-grain leather upper provides comfort, durability and abrasion resistance. Direct-attach seam construction promises lasting durability. Padded collar provides a comfortable fit around the ankle and keeps out debris. Rubber lug sole for traction and durability. Embossed Timberland tree logo on the side.'}
for bl in bp['AVAILABLE_SIZES']:
    footlocker.append(('size', bl))
Everything works now. How do I make all of the data get written to the CSV, and not just the first values?
Now I just need to pull out the data I need.
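On the CSV question, here is a minimal sketch using the standard csv module, assuming `footlocker` is the list of (header, value) tuples built in the loop above. It writes one header row and one row holding every size, which matches the table layout shown earlier. The sample data is a shortened stand-in, and the helper name `write_sizes` is hypothetical.

```python
import csv

def write_sizes(pairs, path):
    """Write one header row and one value row from (header, value) pairs."""
    headers = [header for header, _ in pairs]
    values = [value.strip() for _, value in pairs]
    # newline='' prevents blank lines between rows on Windows.
    with open(path, 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        writer.writerow(headers)
        writer.writerow(values)

# Stand-in for the list built from bp['AVAILABLE_SIZES'].
footlocker = [('size', ' 07.0'), ('size', ' 07.5'), ('size', ' 15.0')]
write_sizes(footlocker, 'table.csv')
```

Collecting all values first and calling writerow once per row is what puts every size on the same line. Calling writerow inside the loop would instead produce one row per size.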