Python: how to create a third CSV file containing the values common to two other CSV files

Question

I have two files in CSV format with approximately the following content:

123; 456; 789;

All values are numeric and the same size. I need to compare the lines of the first file with those of the second and, if the values match, write the line to a third file.

MaxU · Answer 1 · 2018-05-11T21:18:17

Example:

File 1.csv:

    a;b;c
    1;2;3
    4;5;6
    7;8;9

File 2.csv:

    a;b;c
    7;8;9
    1;1;1
    4;5;6
    2;2;2

Code:

    import pandas as pd

    d1 = pd.read_csv('1.csv', sep=';')
    d2 = pd.read_csv('2.csv', sep=';')
    d1.merge(d2).to_csv(r'res.csv', sep=';', index=False)

Result:

    a;b;c
    4;5;6
    7;8;9
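
For readers without pandas, the same row-wise idea can be roughly sketched with the standard csv module. This is an illustration under assumptions, not a drop-in replacement for merge: the function name common_rows is hypothetical, both files are assumed to share the same header, and the sample data is the example above held in strings instead of files. Unlike merge, it does not multiply duplicate rows.

```python
import csv
import io


def common_rows(text1, text2, delimiter=';'):
    """Return the header and the rows of text1 that also occur in text2."""
    rows1 = list(csv.reader(io.StringIO(text1), delimiter=delimiter))
    rows2 = list(csv.reader(io.StringIO(text2), delimiter=delimiter))
    header, body1 = rows1[0], rows1[1:]
    # Tuples are hashable, so the rows of the second file can go into a set.
    seen = {tuple(row) for row in rows2[1:]}
    # Keep the order of the first file.
    return header, [row for row in body1 if tuple(row) in seen]


csv1 = "a;b;c\n1;2;3\n4;5;6\n7;8;9\n"
csv2 = "a;b;c\n7;8;9\n1;1;1\n4;5;6\n2;2;2\n"
header, rows = common_rows(csv1, csv2)

out = io.StringIO()
writer = csv.writer(out, delimiter=';', lineterminator='\n')
writer.writerow(header)
writer.writerows(rows)
result_text = out.getvalue()
print(result_text)
```

For real files, the two strings would be replaced by the contents of 1.csv and 2.csv, and result_text would be written to the output file.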

Answer 2 · 2018-05-11T22:25:11

Here is the author's answer, modified to take the comments into account:

    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument('csvfile')
    parser.add_argument('csvfile2')
    args = parser.parse_args()
    filename1 = args.csvfile
    filename2 = args.csvfile2

    with open('result.csv', 'w') as response_file:
        with open(filename1) as f:
            msisdn1_lines = f.readlines()
        with open(filename2) as f:
            msisdn2_lines = f.readlines()
        # For each line in msisdn1_lines
        for msisdn1 in msisdn1_lines:
            # Iterate over the lines of the other file
            for msisdn2 in msisdn2_lines:
                if msisdn1 == msisdn2:
                    response_file.write(msisdn1)

The comparison can be simplified by using the intersection method of set, which returns the elements common to both sets:

    with open('result.csv', 'w') as response_file:
        with open(filename1) as f:
            msisdn1_lines = set(f.readlines())
        with open(filename2) as f:
            msisdn2_lines = set(f.readlines())
        # Get the set of common elements
        common_lines = msisdn1_lines.intersection(msisdn2_lines)
        for line in common_lines:
            response_file.write(line)

P.S. The intersection method can be replaced by the & operator, so you can simply write:

 common_lines = msisdn1_lines & msisdn2_lines

P.P.S. Instead of the for line in common_lines loop, you can write to the file in one go by joining the lines into a single string.

Before:

    for line in common_lines:
        response_file.write(line)

After:

 response_file.write(''.join(common_lines))

You can also get rid of the for line in common_lines: loop: response_file.write('\n'.join(common_lines))
readlines returns lines that already end with '\n', therefore: response_file.write(''.join(common_lines))
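
One detail the set-based version depends on: lines are compared together with their line endings, so a last line without a trailing newline will not match the same value followed by '\n' in the other file. Below is a minimal sketch that normalizes the endings before intersecting. The helper name common_lines and the sample values are made up for illustration, and the result is sorted only because sets have no defined order.

```python
def common_lines(lines1, lines2):
    """Return the sorted lines common to both iterables, ignoring line endings."""
    # After rstrip, 'x\n', 'x\r\n' and a final 'x' with no newline compare equal.
    set1 = {line.rstrip('\r\n') for line in lines1}
    set2 = {line.rstrip('\r\n') for line in lines2}
    # Sets are unordered, so sort for a reproducible result.
    return sorted(set1 & set2)


first = ['123;\n', '456;\n', '789;']       # the last line has no trailing newline
second = ['789;\n', '000;\n', '123;\n']
result = common_lines(first, second)

# Re-attach exactly one newline per line before writing.
output_text = ''.join(line + '\n' for line in result)
print(output_text)
```

With files, first and second would be the file objects themselves (or the result of readlines), and output_text would be passed to response_file.write.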

Accepted Answer · 2018-05-11T09:01:58

I wrote it myself, so I am answering my own question :)

    import requests
    import argparse

    def main():
        parser = argparse.ArgumentParser()
        parser.add_argument('csvfile')
        parser.add_argument('csvfile2')
        args = parser.parse_args()
        filename1 = args.csvfile
        filename2 = args.csvfile2
        with open('result.csv', 'w') as response_file:
            with open(filename1) as msisdn1_file:
                for line in msisdn1_file.readlines():
                    MSISDN1 = line.strip()
                    b=False
                    with open(filename2) as msisdn2_file:
                        for line in msisdn2_file.readlines():
                            MSISDN2 = line.strip()
                            if MSISDN1==MSISDN2:
                                b=True
                    if b==True:
                        response_file.write(MSISDN1+'\n')

    if __name__ == '__main__':
        main()

Your code is very inefficient: for every line of the first file you re-read the entire second file and compare it line by line.
The import requests line is not needed, so delete it. Also, in Python, uppercase variable names conventionally denote constants, and MSISDN1 and MSISDN2 are not constants. This is not critical, but it is better to avoid it.
@PyLam, I thought it would be useful to show an example of the code modified according to the comments: ru.stackoverflow.com/a/826655/201445
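
A minimal sketch of how the points raised in these comments might be addressed: the second file is read once into a set, the first file is streamed line by line, the unused import is gone and the variable names are lowercase. The function name write_common_lines and the demo file contents are assumptions for illustration; the output order and the handling of repeated lines follow the first file, as in the code above.

```python
import os
import tempfile


def write_common_lines(filename1, filename2, result_name):
    """Write the lines of filename1 that also occur in filename2, keeping their order."""
    # Read the second file only once; set membership tests are O(1) on average.
    with open(filename2) as f:
        known = {line.strip() for line in f}
    with open(filename1) as src, open(result_name, 'w') as dst:
        for line in src:
            msisdn = line.strip()
            if msisdn in known:
                dst.write(msisdn + '\n')


# Small demonstration with temporary files (the contents are made up).
with tempfile.TemporaryDirectory() as tmp:
    path1 = os.path.join(tmp, '1.csv')
    path2 = os.path.join(tmp, '2.csv')
    path3 = os.path.join(tmp, 'result.csv')
    with open(path1, 'w') as f:
        f.write('123;\n456;\n789;\n')
    with open(path2, 'w') as f:
        f.write('789;\n000;\n123;\n')
    write_common_lines(path1, path2, path3)
    with open(path3) as f:
        demo_result = f.read()

print(demo_result)
```

In the original script, the call would be write_common_lines(args.csvfile, args.csvfile2, 'result.csv').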

jfs · Answer 4 · 2018-05-11T21:21:06

For small files, to find the numbers common to two files, you can load the numbers from each file into a set() and output their intersection (not tested):

    #!/usr/bin/env python3
    """Usage: common-numbers <file>..."""
    import re
    import sys
    from pathlib import Path


    def read_numbers(filename):
        return set(map(int, re.findall(br'\d+', Path(filename).read_bytes())))


    print(*set.intersection(*map(read_numbers, sys.argv[1:])))

Example run:

 $ common-numbers a.csv b.csv
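
To make the behaviour of this approach concrete, here is a small sketch that applies the same regular expression to in-memory bytes instead of files. The helper name numbers_in and the sample data are made up. Note two consequences: every run of digits counts as a separate number regardless of how the lines are laid out, and converting to int drops leading zeros, which may matter if the values are identifiers such as phone numbers.

```python
import re


def numbers_in(data):
    """Return the set of integers found in a bytes object."""
    # br'\d+' matches each run of ASCII digits; int() accepts bytes directly.
    return set(map(int, re.findall(br'\d+', data)))


a = numbers_in(b'123; 456; 789;\n')
b = numbers_in(b'789;\n0123;\n555;\n')

# 0123 and 123 become the same integer, so 123 counts as common.
common = sorted(a & b)
print(common)
```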
