How to use python to collect information from two csv files?

Question

The problem is this: in the 1.csv file there is a column with unique values of the type "OR-SDR-28HD-OLEC". Each cell of this column corresponds to the cell value of the adjacent column. There is a 2.csv file in which there are values from the first column of the 1.csv file. It is necessary for the values in the first column in the 2.csv file to pull in the corresponding values from the second column.

1.csv looks like this:

Overwrite=No,,"* Changing the setting to ?Overwrite=Yes? will have the added effect of deactivating all of your items from the website except for those listed on this datafeed. If this is not intended, keep ?Overwrite=No?.",,,,,,,, Part #,Item #,Currency,MSRP,MAP,Checkout MAP,Selling Price,Inventory,Fulfillment Option,Shipping,Activation Mark PS2-BEC-5780,9SIAFDG6V86915,USD,,0.00,False,26.98,37,Seller,free,True

2.csv must in the second column contain Item # from 1.csv

 Overwrite = Yes,,* Changing the setting to ?Overwrite=Yes? will have the added effect of deactivating all of your items from the website except for those listed on this datafeed. If this is not intended, keep ?Overwrite=No?. Part #,Item #,Currency,MSRP,MAP,Checkout MAP,Selling Price,Inventory,Fulfillment Option,Shipping,Activation Mark PS2-BEC-5780,,USD,,0.00,False,26.98,37,Seller,free,True

second: the number of cells in the above columns is approximately 11k and will continue to grow
11K lines for Pandas is "about nothing" ... ;-) With English, they will hardly help you here.
Give the code (attempts) and small examples of input data and what you want to get as a result in question ...
The content of the first column will change and each Part # equates its Item #

Accepted Answer · 2018-02-13T14:29:30

If I correctly understood the condition of the problem:

 import pandas as pd d1 = pd.read_csv(filename1, skiprows=1, usecols=['Part #','Item #']) d2 = pd.read_csv(filename2, skiprows=1).drop('Item #',axis=1) res = d2.merge(d1, on='Part #', how='left')

if you want to add only one column from another CSV file (DataFrame), it will be more efficient to use the .map() method:

 d1 = pd.read_csv(filename1, skiprows=1, usecols=['Part #','Item #']) d2 = pd.read_csv(filename2, skiprows=1) d2['Item #'] = d2['Part #'].map(d1.set_index('Part #')['Item #'])

Example:

 In [106]: d2['Item #'] = d2['Part #'].map(d1.set_index('Part #')['Item #']) In [107]: d2 Out[107]: Part # Item # Currency MSRP MAP Checkout MAP Selling Price Inventory Fulfillment Option Shipping \ 0 PS2-BEC-5780 9SIAFDG6V86915 USD NaN 0.0 False 26.98 37 Seller free Activation Mark 0 True

UPDATE: pandas.DataFrame is a tabular structure in memory. To save a DataFrame in CSV, you can use the DataFrame.to_csv () method

 d2.to_csv(r'/path/to/file.csv', index=False)

Unfortunately, when executing the script, the "Item #" column remains empty.
@ Dmitriy Vladimirovich, give the question a few lines (4-8) from each CSV file as text so that we can copy it ...
I apologize for the untidy look of the text, while mastering the resource
@ Dmitry Vladimirovich, in your examples there is not a single coincident 'Part #' :( The Python version in this case does not play a big role - the Pandas methods will work for Python 2 and for Python 3 in the same way ...
Imitated the coincidence of columns, corrected the data in the question.

jfs jfs 44.5k eight 53 199 · Answer 2 · 2018-02-13T13:35:41

Unique values can be used as an index pandas.DataFrame , then you can get the necessary values, you can simply use df.loc[] :

 #!/usr/bin/env python3 import pandas as pd # use unique keys as index df = pd.read_csv('1.csv', index_col=0, names=['key', 'value'], header=None) print(df) # read as series keys = pd.read_csv('2.csv', squeeze=True, names=['key']) print('\nKeys:', *keys) print('\n', df.loc[keys])

Example output:

  value key a 1 b 2 c 3 d 4 Keys: bdb value key b 2 d 4 b 2

I understand that the author wants to combine the data from the two CSV, and not to filter ... Although it is probably worth waiting for what the author of the question says ...
@MaxU: judging by the example, the author wants to add a column with the found values.
I will leave an answer for other people who, like me, understood the verbal description of the task (the author is one, Google is big).

How to use python to collect information from two csv files?

2 answers 2

More articles: