From the problem statement: combine the frames, keeping only the records that occur in all frames (by a specific column).
Suppose we have not two years but N of them, and we still need to combine these unique values two years at a time. That is, say we have the period 2005-2015. The following code has to be repeated:
    res = (pd.concat(items, ignore_index=True)
             .groupby('FIRM')
             .filter(lambda x: x['YEAR'].nunique() == len(items))
             .sort_values(['FIRM', 'YEAR']))
This has to be done for every pair of consecutive years, with each result saved to an Excel file on a sheet named after its years. I.e. we should end up with 10 DataFrames and, accordingly, 10 sheets in the output Excel file, named "2005-2006", "2006-2007", ..., "2014-2015".
So the input looks something like this:
FIRM  YEAR  x1  x2  x3
A     2005  10   3   9
B     2005   3   4   5
D     2005   6   4   2
A     2006   2   1   5
A     2007   9   7   9
B     2006   1   3   9
C     2008   9  10  10
C     2007   6   4   7
C     2006   7   5   4
D     2006   4   2   1
B     2007   6   8   8
The result should be an Excel file with sheets named 2005-2006, 2006-2007 and 2007-2008.
So the output contains a sheet named 2005-2006 with this data:
FIRM  YEAR  x1  x2  x3
A     2005  10   3   9
A     2006   2   1   5
B     2005   3   4   5
B     2006   1   3   9
D     2005   6   4   2
D     2006   4   2   1
2006-2007
FIRM  YEAR  x1  x2  x3
A     2006   2   1   5
A     2007   9   7   9
B     2006   1   3   9
B     2007   6   8   8
C     2006   7   5   4
C     2007   6   4   7
2007-2008
FIRM  YEAR  x1  x2  x3
C     2007   6   4   7
C     2008   9  10  10
I think this has to be done with a loop, because in reality there are many more years.
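Roughly, the loop I have in mind might look like the sketch below. It is only a sketch under some assumptions: YEAR holds integers, each sheet covers two consecutive calendar years (y and y + 1), and all the data sits in one frame rather than a list of per-year frames. The helper names consecutive_year_pairs and write_pairs and the file name are made up for illustration, and writing .xlsx assumes an Excel engine such as openpyxl is installed.

```python
import pandas as pd

# Sample input from the question.
df = pd.DataFrame(
    [["A", 2005, 10, 3, 9], ["B", 2005, 3, 4, 5], ["D", 2005, 6, 4, 2],
     ["A", 2006, 2, 1, 5], ["A", 2007, 9, 7, 9], ["B", 2006, 1, 3, 9],
     ["C", 2008, 9, 10, 10], ["C", 2007, 6, 4, 7], ["C", 2006, 7, 5, 4],
     ["D", 2006, 4, 2, 1], ["B", 2007, 6, 8, 8]],
    columns=["FIRM", "YEAR", "x1", "x2", "x3"],
)


def consecutive_year_pairs(df, firm_col="FIRM", year_col="YEAR"):
    """Hypothetical helper: map "y-(y+1)" to the rows of firms present in both years."""
    first_year = int(df[year_col].min())
    last_year = int(df[year_col].max())
    sheets = {}
    for year in range(first_year, last_year):
        # Keep only the rows of the two years in this window.
        pair = df[df[year_col].isin([year, year + 1])]
        # Same filter as in the question, with 2 in place of len(items).
        res = (pair.groupby(firm_col)
                   .filter(lambda g: g[year_col].nunique() == 2)
                   .sort_values([firm_col, year_col])
                   .reset_index(drop=True))
        sheets[f"{year}-{year + 1}"] = res
    return sheets


def write_pairs(sheets, path):
    """Hypothetical helper: write each frame to its own sheet of one Excel file."""
    with pd.ExcelWriter(path) as writer:
        for name, frame in sheets.items():
            frame.to_excel(writer, sheet_name=name, index=False)


sheets = consecutive_year_pairs(df)
# write_pairs(sheets, "pairs.xlsx")  # assumed file name; needs an Excel engine
```

On the sample data this should give three frames keyed 2005-2006, 2006-2007 and 2007-2008. Building the dictionary first and writing afterwards keeps the filtering separate from the Excel step, so the frames can be checked before anything is written to disk.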