Colleagues, I need to find all entries in a file that have the form "data\model\folder\folder\file.extension". There may be several. There is no time to reverse-engineer the file structure (the file is not a text file), so I decided to search for the first match of the prefix, then find the first occurrence of a file extension after it (tga or dds, in either case), and then extract the slice (file[start:end]). This script returns too many matches, from places where no such path exists. Where could I have made a mistake?

    # -*- coding: utf-8 -*-
    import os
    import glob

    folder = r"C:\files\data\model"
    final_folder = r"C:\files\data\model_patched"
    files = glob.glob(folder + r'\**\*.fskin', recursive=True)
    filecount = 0
    oc_count = 0

    for file in files:
        with open(file, 'r+b') as f:
            fcontent = f.read()
            fcontent = fcontent.decode("utf-16", errors='ignore')
            cur_pos = 0
            extensions = ['.TGA', '.DDS', '.tga', '.dds']
            occurences = []
            print("file {}:".format(file))
            while cur_pos != -1:
                string_begin = fcontent.find(r"data\model", cur_pos)
                cur_pos = string_begin  # start from 0 or from the end of the previous occurrence
                string_end = 0
                cur_pos_iter = int(cur_pos)  # so as not to overwrite cur_pos
                while string_begin != -1:
                    ch = fcontent[cur_pos_iter:cur_pos_iter+4]  # take 4 characters from the current position
                    if ch not in extensions:
                        # if not found, increase the counter by 1 and move on
                        cur_pos_iter += 1
                    else:
                        # if found, take the whole occurrence, put the cursor
                        # at the end of the occurrence and stop the loop
                        string_end = cur_pos_iter+4
                        occurence = fcontent[string_begin:string_end]
                        cur_pos = string_end
                        print("Found {} [{}:{}]".format(occurence, string_begin, string_end))
                        oc_count += 1
                        break
                else:
                    filecount += 1
                    filename_relative = file[file.find('model')+5:]
                    new_filename = final_folder+filename_relative
                    # os.makedirs(os.path.dirname(new_filename), exist_ok=True)
                    # with open(new_filename, 'wb') as fw:
                    #     fw.write(bytearray(fcontent, 'utf-16'))
                    #     fw.close()
            f.close()

    print("Done. Replaced {} occurrences in {} files.".format(oc_count, filecount))
  • It is not clear what the input is and what the output should be. What does "by format" mean? Which part of the string is fixed and which is not? (Why not use a regular expression?) What do you want to get as output? Just the position in the file? What do you want to replace? - jfs
  • @jfs regexps are not my strong suit. The input is a file whose contents look like some kind of Chinese text, and the file paths in it look the same way. There may be several paths in different parts of the file. As output, I want to replace these paths with generated ones and save the result in a separate file, following the folder structure (the commented-out part of the code at the end). - Twen Shin
  • You have several tasks here that are giving you trouble. Try to solve them one at a time. First, find out whether the binary format allows you to change the length of the paths you want to replace (for example, are there zero bytes after the path string, or is there a stored length that you can adjust?). If replacing the string is possible in principle, then temporarily forget about the binary file and manually create a simple test file containing examples of possible paths, and learn how to find all the required paths in that text file. Then change those paths. If you need separate SO questions for the subtasks, ask them. - jfs
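
The "test text first, regular expression second" suggestion from the comments could be sketched roughly as follows. This is only an illustration under assumptions: the name `find_paths`, the sample text, and the exact set of characters allowed in a path component are made up here and are not taken from the real .fskin format.

```python
import re

# Assumed shape of a path: the fixed prefix "data\model\", then one or more
# backslash-separated components, ending in .tga or .dds (any case).
# The character class excludes backslashes, control characters and characters
# that are illegal in Windows file names, so a match cannot run across
# unrelated data to some distant extension.
PATH_RE = re.compile(
    r'data\\model\\(?:[^\\/:*?"<>|\x00-\x1f]+\\)*[^\\/:*?"<>|\x00-\x1f]+?\.(?:tga|dds)',
    re.IGNORECASE,
)


def find_paths(text):
    """Return (start, end, path) for every path-like match in text."""
    return [(m.start(), m.end(), m.group(0)) for m in PATH_RE.finditer(text)]


# Hypothetical test text standing in for the decoded file contents.
sample = (
    "junk\x00data\\model\\npc\\orc\\body.TGA\x00more junk "
    "data\\model\\item\\sword\\blade.dds\x01 data\\model broken .dds"
)

for start, end, path in find_paths(sample):
    print(start, end, path)
```

Restricting which characters may appear between the prefix and the extension is the main design choice here. A plain "first prefix, then first extension anywhere after it" scan can join a real prefix to an unrelated extension much later in the data, which is one plausible source of matches from places where no path exists.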