I have a CSV file with two columns: user_id and started_at. The started_at values are purchase dates; each id has one (at most two) purchases per month. I want to compare ids by month. For this, I thought of building a table with one row per id and one column per month holding the dates. What is the best way to reshape the data like this? However I try to do it with basic operations, the result is not very good. Here is the version I have so far, but it loses part of the data.
dat.index = pd.to_datetime(dat['started_at'])
dat5 = dat[:'2015-05-31']
dat6 = dat['2015-06-01':'2015-06-30']
dat7 = dat['2015-07-01':'2015-07-31']
dat8 = dat['2015-08-01':]
dat5.index = dat5['user_id']
dat6.index = dat6['user_id']
dat7.index = dat7['user_id']
dat8.index = dat8['user_id']
data = dat6.merge(dat5, 'right', on='user_id')
data1 = dat7.merge(data, 'right', on='user_id')
data2 = dat8.merge(data1, 'right', on='user_id')
data2
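For reference, here is a minimal sketch of the kind of table described above, built by reshaping instead of merging. The sample data, the helper columns `month` and `n`, and the names `wide` and `counts` are assumptions made for illustration and are not part of the original file. One likely reason the merge version loses data is that every step is a right merge that starts from dat5, so only ids with a purchase in May survive.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for the real CSV file.
csv_text = """user_id,started_at
1,2015-05-03
1,2015-06-10
1,2015-06-25
2,2015-06-01
2,2015-08-15
3,2015-07-20
"""
dat = pd.read_csv(io.StringIO(csv_text), parse_dates=["started_at"])

# Label each purchase with its calendar month (e.g. 2015-06).
dat["month"] = dat["started_at"].dt.to_period("M")

# Number each user's purchases within a month (1 = first, 2 = second),
# so two purchases in the same month do not collide when reshaping.
dat = dat.sort_values(["user_id", "started_at"])
dat["n"] = dat.groupby(["user_id", "month"]).cumcount() + 1

# One row per user, one column per (month, purchase number).
# Missing purchases show up as NaT instead of dropping the user.
wide = (
    dat.set_index(["user_id", "month", "n"])["started_at"]
       .unstack(["month", "n"])
       .sort_index(axis=1)
)

# A simpler table for comparison: purchases per user per month.
counts = dat.groupby(["user_id", "month"]).size().unstack(fill_value=0)

print(wide)
print(counts)
```

With this layout every id from the file keeps its row, whichever months it bought in, and `counts` may be enough on its own if only the number of purchases per month matters.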