There is a code that their excel file reads data on a column label all non-zero values:

import pandas as pd import numpy as np tab=pd.read_excel('primer.xlsx') n_well=len(tab.columns)/5-1 for i in range(1,n_well+2): i=(i-1)*5+1 skv_2=pd.read_excel('primer.xlsx','er',parse_cols=[i+1,i+2,i+3,i+4,i+5]) skv_2drop=skv_2.drop([0,1,2]) met_skv_2=skv_2drop[skv_2drop[i]!=0] met_skv_3=met_skv_2[met_skv_2[i]!=9999] met_skv_3.to_excel('metki1.xlsx',sheet_name='er') 

enter image description here

How to record all the data from the cycle? And then when using met_skv_3.to_excel('metki1.xlsx',sheet_name='er') , the last elements of the loop are written to the file: enter image description here

  • I think it would be easier to merge all DataFrame's into one and write this combined DF to Excel - MaxU
  • I would also not reread Excel in a chain - you can simply select the desired range of columns from tab DF. If you put your Excel file somewhere, I could sketch a working version ... - MaxU
  • @ Maxu dropmefiles.com/KJfVT here is the file. - Ramil Fazliahmetov
  • @MaxU, the point is that the task is such that these ranges may not be 19 pieces, like mine, but maybe more. Besides the cycle, I have no idea how to do it, and it is desirable that the data of each of the ranges are output as well as in the source file - Ramil Fazliahmetov
  • You want to record the result in a vertical form (i.e., blocks of five columns one above the other) - I understand you correctly? - MaxU

1 answer 1

Here is a working version that reads an Excel file only once:

 import pandas as pd fn = r'C:\Temp\.data\primer.xlsx' cols = ['label','x','y','z','value'] df = pd.read_excel(fn, skiprows=4, header=None, parse_cols='C:XFD') dfs = [] for i in range(df.columns.size//5): lbl_col = 5*i x = df.ix[(df[lbl_col] != 0) & (df[lbl_col] != 9999), lbl_col:lbl_col+4] #x.columns = pd.MultiIndex.from_tuples(list(product([i+1], cols))) x.columns = cols dfs.append(x.reset_index(drop=True)) result = pd.concat(dfs, axis=1) #result.to_excel('c:/temp/result.xlsx', startrow=3, index=True) result.to_excel('c:/temp/result.xlsx', startrow=3, index=False) 

Result:

enter image description here

PS I wanted to add the numbers of five-column blocks (as in your source file: 1-19), i.e. create multilevel columns :

 x.columns = pd.MultiIndex.from_tuples(list(product([i+1], cols))) 

but this is not yet implemented:

 NotImplementedError: Writing to Excel with MultiIndex columns and no index ('index'=False) is not yet implemented. 

Those. one has to choose - either multi-level (beautiful) column names and an unnecessary index, or no index and single-level column names ...

PPS if you need a "cap", then you can do it in a similar way.

  • But you can also ask how to check for the presence of all labels from the resulting table (that is, the first column, and if there is no neo - Ramil Fazliahmetov
  • Thank you very much, figured out. Can I have another question? How to make a check from the resulting table for the presence of all labels (that is, the first column) and if some labels are missing, then in their place by x, y, z, value add a line with this label, but with some value (t Is it a mistake, for example 9999)? - Ramil Fazliahmetov
  • @RamilFazliahmetov, always please :) I don’t quite understand what it means to "check for all labels" ... Can you open a new question and give a small (5-7 lines) example? PS I do not want to "overload" the answer ... - MaxU