How can I perform operations between columns or rows of a CSV file? I would like to read the relevant documentation, and I would prefer simple methods. Examples of what I want to do:

  1. In the Open column, add the value of the first row to the value of the second row and write the result into a new column. Then the second with the third, etc.
  2. In the Open column, add the value of the first row to the value of the fifth row and write the result into a new column. Then the second with the sixth, etc.
  3. In the Open column, add the value of the first row to the value of the fifth row and write the result into a new column. Then the first with the sixth, the seventh, etc.
  4. If the Close value in a row is less than the Open value in the previous row, put the letter A in the new column.
  5. If the Close value in a row is less than the Open value in the previous row, put into the new column the Close value of this row plus the Open value from the cell 2 rows lower.
  6. If the Close value in a row is less than the value in the previous row and greater than the value in the next row, write into the new column the value from the cell 5 rows lower plus the value from the cell 3 rows higher in the Close column.

The examples are somewhat contrived; I just want to figure out how this works.

    import os
    import glob
    import pandas as pd
    import pandas_datareader.data as wb

    a = r'C:/Users/II/Downloads/*.csv'
    files = glob.glob(a)
    # f was not defined in the original snippet; assumption: read the first matching file
    f = files[0]
    df = pd.read_csv(f, index_col='Date', encoding='latin1', parse_dates=['Date'])

Data

    Date,Open,High,Low,Close,Volume,Adj Close
    2011-04-15,15.0,15.0,15.0,15.0,300.0,10.263332
    2011-04-18,14.78,14.85,14.24,14.24,1800.0,9.743323
    2011-04-19,14.85,14.85,14.85,14.85,400.0,10.160698
    2011-04-20,15.49,15.49,14.71,14.71,1100.0,10.064907
    2011-04-21,15.14,15.56,14.81,15.01,18600.0,10.270174
    2011-04-25,15.01,15.01,15.01,15.01,0.0,10.270174
    2011-04-26,15.01,15.01,15.01,15.01,0.0,10.270174
    2011-04-27,15.0,15.63,15.0,15.63,3300.0,10.694392
    2011-04-28,15.0,15.0,15.0,15.0,500.0,10.263332
    2011-04-29,15.18,15.18,14.99,14.99,1700.0,10.256489
    2011-05-02,15.0,15.05,14.99,14.99,2000.0,10.256489
    2011-05-03,15.05,15.05,14.99,14.99,700.0,10.256489
    2011-05-04,14.76,14.76,14.2,14.2,1700.0,9.715954

Example with df = pd.DataFrame({'a': [1,2,3,4], 'b': [11,12,13,14], 'c': [21,22,23,24]}):

  1. In column "a", add 1 and 2 and write the result into a new column. Then 2 and 3, etc.
  2. In column "a", add 1 and 3 and write the result into a new column. Then 2 and 4, etc.
  3. In column "a", add 1 and 3 and write the result into a new column. Then 1 and 4, etc.
  4. If the value 22 in column "c" is less than the value 1 in the previous row of column "a", put the letter A in the new column.
  5. If the value 22 in column "c" is less than the value 1 in the previous row of column "a", put into the new column the value 22 from "c" plus the value from the cell in "a" that is 2 positions lower (the number 4).
  6. If the value 22 in column "c" is less than the value in the previous row and greater than the value in the next row (21 > 22 < 23), write into the new column the value 24 (two rows down) plus 21 (the cell one row higher).
  • I suggest you make another example, e.g. df = pd.DataFrame({'a':[1,2,3], 'b':[11,12,13], 'c':[21,22,23]}), and show what you want to get based on this DataFrame, i.e. give the resulting DataFrame for each question. - MaxU
  • Yes, otherwise it is not entirely clear. - MaxU
  • I meant showing the values. For example, for the first question the new column is expected to contain the following values: [3,5,7,???] - MaxU

1 answer


Source DataFrame:

    In [21]: df
    Out[21]:
       a   b   c
    0  1  11  21
    1  2  12  22
    2  3  13  23
    3  4  14  24

Solutions:

1

    In [17]: df['n1'] = df['a'].shift(-1) + df['a']

    In [18]: df
    Out[18]:
       a   b   c   n1
    0  1  11  21  3.0
    1  2  12  22  5.0
    2  3  13  23  7.0
    3  4  14  24  NaN

    In [19]: df['n1'] = df['a'].shift(-1).fillna(0) + df['a']

    In [20]: df
    Out[20]:
       a   b   c   n1
    0  1  11  21  3.0
    1  2  12  22  5.0
    2  3  13  23  7.0
    3  4  14  24  4.0
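To make it clearer what shift(-1) does in the solution above, here is a minimal standalone sketch. It also shows the fill_value argument of Series.shift as a possible alternative to fillna(0); the column name n1_int is made up for this illustration and is not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4]})

# shift(-1) moves every value one row up, so row i sees the value of row i + 1;
# the last row has no successor and becomes NaN (which turns the column into floats)
next_a = df['a'].shift(-1)

# fill_value=0 fills the vacated last slot directly, so the column stays integer
# ('n1_int' is a hypothetical column name used only in this sketch)
df['n1_int'] = df['a'] + df['a'].shift(-1, fill_value=0)

print(next_a)
print(df)
```

Whether the last row should hold NaN or the unchanged value is a matter of what "no next row" should mean for your data; NaN is usually the safer default.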

2

    In [22]: df['n2'] = df['a'].shift(-2) + df['a']

    In [23]: df
    Out[23]:
       a   b   c   n1   n2
    0  1  11  21  3.0  4.0
    1  2  12  22  5.0  6.0
    2  3  13  23  7.0  NaN
    3  4  14  24  4.0  NaN

3

    In [25]: df['n3'] = df.loc[0, 'a'] + df['a'].shift(-2)

    In [26]: df
    Out[26]:
       a   b   c   n1   n2   n3
    0  1  11  21  3.0  4.0  4.0
    1  2  12  22  5.0  6.0  5.0
    2  3  13  23  7.0  NaN  NaN
    3  4  14  24  4.0  NaN  NaN
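The same three patterns can be carried over to the Open column of the CSV data from the question. The sketch below is only an illustration under assumptions: it inlines the first seven rows of the sample data instead of reading a file, it reads "first with fifth" as an offset of 4 rows, and the names ohlc, open_, n1, n2 and n3 are chosen here for the example.

```python
import io
import pandas as pd

# First rows of the sample data from the question, inlined so the sketch is self-contained
csv_text = """Date,Open,High,Low,Close,Volume,Adj Close
2011-04-15,15.0,15.0,15.0,15.0,300.0,10.263332
2011-04-18,14.78,14.85,14.24,14.24,1800.0,9.743323
2011-04-19,14.85,14.85,14.85,14.85,400.0,10.160698
2011-04-20,15.49,15.49,14.71,14.71,1100.0,10.064907
2011-04-21,15.14,15.56,14.81,15.01,18600.0,10.270174
2011-04-25,15.01,15.01,15.01,15.01,0.0,10.270174
2011-04-26,15.01,15.01,15.01,15.01,0.0,10.270174
"""
ohlc = pd.read_csv(io.StringIO(csv_text), index_col='Date', parse_dates=['Date'])

open_ = ohlc['Open']

# Item 1: each Open value plus the Open value of the next row
ohlc['n1'] = open_ + open_.shift(-1)

# Item 2 (assumed offset of 4 rows): first with fifth, second with sixth, ...
ohlc['n2'] = open_ + open_.shift(-4)

# Item 3 (assumed offset of 4 rows): always the first Open value plus the fifth, sixth, ...
ohlc['n3'] = open_.iloc[0] + open_.shift(-4)

print(ohlc[['Open', 'n1', 'n2', 'n3']])
```

Rows that have no partner 1 or 4 rows further down end up as NaN, just as in the small a/b/c example.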
  4. Not clear; please indicate the expected result in the question.
  5. Not clear; please indicate the expected result in the question.
  6. Not clear; please indicate the expected result in the question.
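Although the conditional items are ambiguous as written, one possible reading of items 4 and 5 of the question can be sketched by combining shift with numpy.where and Series.where. This is an assumption about the intent, not a confirmed solution: "previous row" is read as shift(1), "2 cells lower" as shift(-2), and the names ohlc, cond, flag and n5 are invented for this example. The data are the first seven rows of the question's sample.

```python
import io
import numpy as np
import pandas as pd

# Date, Open and Close from the first rows of the question's sample data
csv_text = """Date,Open,Close
2011-04-15,15.0,15.0
2011-04-18,14.78,14.24
2011-04-19,14.85,14.85
2011-04-20,15.49,14.71
2011-04-21,15.14,15.01
2011-04-25,15.01,15.01
2011-04-26,15.01,15.01
"""
ohlc = pd.read_csv(io.StringIO(csv_text), index_col='Date', parse_dates=['Date'])

# Assumed condition: Close of this row is less than Open of the previous row.
# Comparisons with NaN are False, so the first row never matches.
cond = ohlc['Close'] < ohlc['Open'].shift(1)

# Item 4 (assumed reading): letter 'A' where the condition holds, empty string otherwise
ohlc['flag'] = np.where(cond, 'A', '')

# Item 5 (assumed reading): Close of this row plus Open from 2 rows lower,
# kept only where the condition holds (NaN elsewhere)
ohlc['n5'] = (ohlc['Close'] + ohlc['Open'].shift(-2)).where(cond)

print(ohlc)
```

If the intended comparison or offsets differ, only the shift arguments and the condition need to change; the overall structure stays the same.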