I need to replace the id in content2 (TU7D2IH8P2D001) with the id from content1 (TYA1G2C9HMD001), but only if the rest of the string is the same. As a result, the elements at index 0 of the two lists should become equal, while the elements at index 1 should stay as they are. I have the following code:

    import re

    content1 = ['MY SECRET CODE IS TYA1G2C9HMD001(\n YEAH IT IS EASY!\n',
                "HIS SECRET CODE IS TU2Z3D43D4D002(\n THIS IS DIFFERENT PART\n"]
    content2 = ['MY SECRET CODE IS TU7D2IH8P2D001(\n YEAH IT IS EASY!\n',
                "HIS SECRET CODE IS TU2A3C83D4D002(\n THIS IS DIFFERENT PARTS\n"]

    pattern = re.compile("^(.)*T[A-Z0-9]{10}[0-9]{3}[(](.)*$")

    for c1, c2 in zip(content1, content2):
        # somehow compare the ids from c1 and c2 here
        pass

I understand that this somehow needs to be done with re.sub, but I don’t understand how to add a check that the rest of the string is equal. The Python version is 2.7. Thank you in advance!

  • Can there be more than one id in one list item? - MaxU
  • Each list item can contain only one id - nick_gabpe

1 answer

Here is a primitive, straightforward solution:

    pattern = r'\b(T[A-Z0-9]{10}[0-9]{3})\b'

    for i, _ in enumerate(content1):
        m = re.search(pattern, content1[i])
        # re.search returns None when there is no id, so test the match itself
        if m:
            content2[i] = re.sub(pattern, m.group(0), content2[i])

Result:

    In [52]: content2
    Out[52]:
    ['MY SECRET CODE IS TYA1G2C9HMD001(\n YEAH IT IS EASY!\n',
     'HIS SECRET CODE IS TU2Z3D43D4D002(\n THIS IS DIFFERENT PARTS\n']

Check:

    In [54]: [c1==c2 for c1, c2 in zip(content1, content2)]
    Out[54]: [True, False]
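
Note that the loop above copies the id into every item, including the second one, whose remaining text differs. If the id should be replaced only when the rest of the string matches, as the question asks, one possible approach is to compare the text before and after the id in both items. The following is a minimal sketch, not a definitive implementation: the names `ID_PATTERN` and `sync_ids` are hypothetical, and it assumes (per the comments) that each item contains at most one id. It builds a new list instead of modifying content2 in place.

```python
import re

content1 = ['MY SECRET CODE IS TYA1G2C9HMD001(\n YEAH IT IS EASY!\n',
            "HIS SECRET CODE IS TU2Z3D43D4D002(\n THIS IS DIFFERENT PART\n"]
content2 = ['MY SECRET CODE IS TU7D2IH8P2D001(\n YEAH IT IS EASY!\n',
            "HIS SECRET CODE IS TU2A3C83D4D002(\n THIS IS DIFFERENT PARTS\n"]

# Same id shape as in the answer above: "T", 10 alphanumerics, 3 digits.
ID_PATTERN = re.compile(r'\bT[A-Z0-9]{10}[0-9]{3}\b')


def sync_ids(source, target):
    """Return a copy of target in which an item's id is taken from the
    matching source item only if the two items are otherwise identical."""
    result = []
    for s, t in zip(source, target):
        m_s = ID_PATTERN.search(s)
        m_t = ID_PATTERN.search(t)
        if m_s and m_t:
            # Compare everything before and after the id in both strings.
            same_rest = (s[:m_s.start()] == t[:m_t.start()]
                         and s[m_s.end():] == t[m_t.end():])
            if same_rest:
                t = t[:m_t.start()] + m_s.group(0) + t[m_t.end():]
        result.append(t)
    return result


updated = sync_ids(content1, content2)
print([c1 == c2 for c1, c2 in zip(content1, updated)])
```

With the sample data, the first item takes the id from content1, while the second item is left untouched because "PART" and "PARTS" differ. The sketch uses only `re.search` and slicing, so it should behave the same on Python 2.7 and Python 3.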