Example csv:

    Date,1,2,3,4,A,B
    2010-01-04,213.429998,214.499996,212.38000099999996,214.009998,123432400,27.727039
    2010-01-05,214.599998,215.589994,213.249994,214.379993,150476200,27.774976000000002
    2010-01-06,214.379993,215.23,210.750004,210.969995,138040000,27.333178000000004
    2010-01-07,211.75,212.000006,209.050005,210.58,119282800,27.28265
    2010-01-08,210.299994,212.000006,209.06000500000002,211.98000499999998,111902700,27.464034

I need to keep only the values that fall on Mondays, and then sort those Monday values from largest to smallest. For example, if the csv contains the dates 2010-01-01 through 2010-01-31, the only Mondays are the 4th, 11th, 18th and 25th. Only the rows with these dates, and the computed column 'total', should remain.

    import os
    import glob

    import pandas as pd
    import matplotlib
    matplotlib.style.use('ggplot')

    file_mask = r'C:/Users/II/Downloads/*.csv'
    files = glob.glob(file_mask)

    for f in files:
        df = pd.read_csv(f, index_col='Date', encoding='latin1')
        df['total'] = df['1'] - df['2']
        # now the values need to be sorted based on dates, by Mondays
        # sort the column from largest to smallest
        # (sort_values replaces the removed DataFrame.sort)
        df = df.sort_values('total', ascending=False)
        new_fn = '{0[0]}_total{0[1]}'.format(os.path.splitext(f))
        df.to_csv(new_fn)

  • I honestly did not understand your question at all: "the values need to be sorted based on dates, by Mondays". Can you give an example of your data (or a piece of code to create it) and the expected result? - MaxU
  • Could you also clarify "sort the Monday values from largest to smallest": sort by which field? By total? - MaxU
  • Did I understand correctly that you want to filter out (discard) all the data that does not fall on Mondays? Another clarification: does column "1" correspond to "Open" and column "2" to "High"? - MaxU

2 answers

Here is an example with classic (original) column names for a single DataFrame:

    import pandas as pd
    import pandas_datareader.data as wb
    import matplotlib.pyplot as plt
    import matplotlib
    matplotlib.style.use('ggplot')

    a = wb.DataReader('GOOG', 'yahoo', '2017-01-01')

    # I still don't understand why `total = Open - High`? ;)
    a['total'] = a['Open'] - a['High']

    # weekday: 0 - Monday, ..., 6 - Sunday
    a.loc[a.index.weekday == 0, 'total'].plot()

    # save the plot to a file; do this before plt.show(),
    # otherwise the saved figure may come out empty
    plt.savefig('/path/to/filename.png')
    plt.show()

P.S. Sorting by total will not change anything in the plot (the graph will stay the same), since the values are drawn along the X axis, i.e. ordered by Date.
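If the goal is a sorted table rather than a plot, the filter-then-sort step can be sketched as below. This is only a minimal sketch: the data is randomly generated for January 2010 (it is not the asker's file), and the column names '1' and '2' are taken from the question's csv header.

```python
import numpy as np
import pandas as pd

# Made-up daily data for January 2010; only the column names follow the question.
rng = np.random.default_rng(0)
idx = pd.date_range("2010-01-01", "2010-01-31", freq="D", name="Date")
df = pd.DataFrame(
    {"1": rng.uniform(200, 220, len(idx)), "2": rng.uniform(200, 220, len(idx))},
    index=idx,
)
df["total"] = df["1"] - df["2"]

# Keep only Mondays (weekday == 0) and only the 'total' column.
mondays = df.loc[df.index.weekday == 0, ["total"]]

# Sort the Monday rows from largest to smallest total.
mondays_sorted = mondays.sort_values("total", ascending=False)
print(mondays_sorted)
```

When reading a real file, the index would need to be a DatetimeIndex for `.weekday` to work, for example by passing `parse_dates=True` together with `index_col='Date'` to `pd.read_csv`.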

Result:

[figure: line plot of total for Mondays only, with Date on the X axis]

Example of filtered data:

    In [69]: a.loc[a.index.weekday == 0, 'total']
    Out[69]:
    Date
    2017-01-09     0.250000
    2017-01-23    12.059998
    2017-01-30   -12.339966
    2017-02-06     1.640015
    2017-02-13     3.239990
    Name: total, dtype: float64

  • If you want to draw graphs, then you need to combine the date and time in one column - MaxU
  • And where does time come from? - MaxU
  • @Alex, what exactly is the problem? - MaxU
  • Right now I don't have enough time to work through all of your code, especially without understanding where exactly the problem is. So I advise you to debug your code in parts, and if you run into trouble with one part, ask a question about that part. Alternatively, tag the question "code inspection" and hope that someone responds... - MaxU

The resample function, together with an offset alias, is responsible for changing the frequency of the dates. The resample syntax is simple and is documented here: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.resample.html?highlight=resample#pandas.DataFrame.resample . The list of available offset aliases is here: http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases .

Specifically for Mondays, the resample function should be given the offset alias "W-MON" (weekly frequency anchored on Mondays).
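A minimal sketch of what "W-MON" does, using made-up daily values for January 2010 (each value is simply the day of the month, not data from the question). Note that resample groups rows into weekly bins and needs an aggregation such as sum(), so the result is one aggregated value per week; a plain weekday filter is shown next to it for comparison.

```python
import pandas as pd

# Made-up series: the value for each day equals the day of the month.
idx = pd.date_range("2010-01-01", "2010-01-31", freq="D")
s = pd.Series(range(1, 32), index=idx, name="total", dtype="float64")

# Weekly bins ending on Mondays; every bin is aggregated (here: summed).
weekly_sum = s.resample("W-MON").sum()

# Boolean filter: keeps the original Monday rows untouched.
mondays_only = s[s.index.weekday == 0]

print(weekly_sum)
print(mondays_only)
```

Which of the two is appropriate depends on whether a per-week aggregate or the original Monday rows are wanted.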

  • It seems to me that resample is not quite suitable in this case (given the question as clarified by the author), because after resample() the data will be aggregated, not filtered. Although, maybe that is exactly what the author wants? - MaxU