Hello. I am writing code that analyzes images. I need to extract the text recognized in a screenshot and save it to a file. For a single image everything works:

    import json
    import requests

    def ocr_space_file(filename, overlay=True, api_key='myAPI', language='eng'):
        payload = {'isOverlayRequired': overlay,
                   'apikey': api_key,
                   'language': language,
                   }
        # send the image to the OCR service
        with open(filename, 'rb') as f:
            r = requests.post('https://api.ocr.space/parse/image',
                              files={filename: f},
                              data=payload,
                              )
        Info = r.content.decode()
        obj = json.loads(Info)
        # write every recognized word with its bounding box to Data.txt
        f = open('Data.txt', 'w')
        for i in obj['ParsedResults']:
            for j in range(0, len(i["TextOverlay"]['Lines'])):
                allInfo = i["TextOverlay"]['Lines'][j]['Words']
                for k in allInfo:
                    text = k['WordText']
                    coords = [k['Left'], k['Top'], k['Width'], k['Height']]
                    f1 = ','.join(map(str, coords))
                    f.write(text + " " + f1 + '\n')
        f.close()
        return r.content.decode()

    test_file = ocr_space_file(filename='1my.png', language='eng')

But when I try to process all the images in a directory the same way, everything is for some reason written to a single Data.txt file instead of a new file being created each time. The leftovers of my attempts to save the output the way one would with images are commented out; for str they turned out to be irrelevant. I also tried this approach: https://stackoverflow.com/questions/8024248/telling-python-to-save-a-txt-file-to-a-certain-directory-on-windows-and-mac , which does not work either. Please tell me what exactly I am doing wrong. Evidently the function overwrites the file instead of creating a new one, but I do not know how to fix it.

    import glob
    import os

    imagePaths = [f for f in glob.glob('cropped_images/*.png')]
    save_path = 'Labeled_images/'

    def ocr_all_screenshots():
        for pic in imagePaths:
            #src_fname, ext = os.path.splitext(pic) # split filename and extension
            # construct output filename, basename to remove input directory
            #save_fname = os.path.join(save_path, os.path.basename(src_fname) + '_labeled.png')
            os.chdir(save_path)
            ocr_space_file(pic)

    ocr_all_screenshots()
  • Replace: open('Data.txt', 'w') -> open(filename, 'w') - MaxU
  • Thanks, but that doesn't work either; it raises TypeError: 'NoneType' object is not iterable - TheDoctor

1 answer

For each new input file you need to open a new output file with a different name. Something like this:

    def ocr_space_file(filename, overlay=True, api_key='my API', language='eng'):
        ...  # JSON processing
        for count, i in enumerate(obj['ParsedResults']):
            f = open('Data' + str(count) + '.txt', 'w')
            ...
            f1 = ','.join(map(str, coords))
            f.write(f1 + '\n')
            f.close()
        return r.content.decode()

which is essentially the same as this (except that the numbering starts at 1 instead of 0):

    def ocr_space_file(filename, overlay=True, api_key='my API', language='eng'):
        ...  # JSON processing
        count = 0
        for i in obj['ParsedResults']:
            count += 1
            f = open('Data' + str(count) + '.txt', 'w')
            ...
            f1 = ','.join(map(str, coords))
            f.write(f1 + '\n')
            f.close()
        return r.content.decode()
  • OSError: [Errno 63] File name too long: it tries to put the entire contents of Data into the file name - TheDoctor
  • Thanks for the advice, I tried it. It does not solve the problem, because the first line, for i in obj['ParsedResults'], iterates within a single JSON response (its structure is ParsedResults {TextOverlay {Lines {...}}}), not over several images. - TheDoctor
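
Following up on the last comment: since obj['ParsedResults'] belongs to a single response, one way to get a separate file per image is to build the output name from the input image name rather than from a counter inside that loop. Below is a minimal sketch of that idea, not a tested fix for the original script. The helper names output_path_for, words_to_lines and save_words are hypothetical, the response layout is taken from the question's code, and treating a missing or None 'ParsedResults' as "no words" is an assumption about failed requests. Joining paths with os.path.join also avoids calling os.chdir inside the loop.

```python
import os


def output_path_for(image_path, out_dir):
    """Return out_dir/<image name without extension>.txt (hypothetical helper)."""
    stem = os.path.splitext(os.path.basename(image_path))[0]
    return os.path.join(out_dir, stem + '.txt')


def words_to_lines(obj):
    """Flatten one decoded response into 'word left,top,width,height' strings."""
    lines = []
    # Assumption: a missing or None 'ParsedResults' means there is nothing to write.
    for result in obj.get('ParsedResults') or []:
        for line in result['TextOverlay']['Lines']:
            for word in line['Words']:
                coords = [word['Left'], word['Top'], word['Width'], word['Height']]
                lines.append(word['WordText'] + ' ' + ','.join(map(str, coords)))
    return lines


def save_words(obj, image_path, out_dir):
    """Write the words of one response to a text file named after the image."""
    os.makedirs(out_dir, exist_ok=True)
    out_path = output_path_for(image_path, out_dir)
    with open(out_path, 'w', encoding='utf-8') as out:
        for text_line in words_to_lines(obj):
            out.write(text_line + '\n')
    return out_path
```

With something like this, ocr_space_file could return the decoded obj, and the loop over imagePaths could call save_words(obj, pic, save_path) once per image, so cropped_images/1my.png would end up in Labeled_images/1my.txt.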