For each row, I need to get the minimum taken only from the columns named 1, but such that it exceeds the threshold value given in the max_noise column.

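Here is the code that creates my demo matrix:

    import pandas as pd

    x = pd.Series([1, 4, 3, 2, 1, 5], [7, 1, 3, 6, 7, 9])
    df = pd.DataFrame({
        "A": x**2 + 8,
        "B": x * 9,
        "C": x + 24,
        "D": (x * x) + 3,
        "E": (x * 2) + 5,
        "F": (3**x) - x,
        'idx': [1, 0, 1, 8, 2, 1]})
    # use the 'idx' values as the (non-unique) column labels
    df.columns = df.pop('idx')
    # row-wise maximum over every column that is not labelled 1
    df['max_noise'] = df.loc[:, df.columns != (1)].max(axis=1)
    df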

I can select the columns named 1 like this: df.loc[:, (df.columns == 1 | 0)], but I can't figure out how to apply min() together with that condition.

Here is a screenshot of the result I want in the end:

[screenshot of the desired result, not preserved]

  • I have already advised you to use unique column names and index values. The solution is likely to be cumbersome and inefficient simply because you will have to work around these pitfalls everywhere ... - MaxU
  • @MaxU, I need some kind of labels. Here, 1 means that a particular vector belongs to a cluster. But it is quite possible to add a suffix to make the names unique (for example, 1_a, 1_b, 1_c ...), or to use a MultiIndex. - Mavar 2:38 pm
  • Labels can be stored separately from the column names. You can use masks, etc. - MaxU
  • @MaxU, I am not familiar with masks. Can you give an example? - Mavar 2:41 pm
  • An example of using a mask for an index ... In your case, the mask would be applied to the columns. - MaxU
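To illustrate the suggestion in the comments above, here is a minimal sketch of keeping the labels apart from the column names and selecting columns with a boolean mask. The small frame, the unique column names A, B, C and the names `labels` and `is_cluster` are assumptions made up for illustration; they are not from the original thread.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with unique column names (assumption, for illustration only)
df = pd.DataFrame({"A": [9, 24], "B": [9, 36], "C": [25, 28]})

# Cluster labels stored separately from the column names, one label per column
labels = np.array([1, 0, 1])

# Boolean mask over the columns: True where the column belongs to cluster 1
is_cluster = labels == 1

# Select only the masked columns, then take the row-wise minimum
cluster_min = df.loc[:, is_cluster].min(axis=1)
```

Because the mask is an ordinary array, it can be inverted with `~is_cluster` to address the remaining columns without touching the column names at all.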

1 answer

    In [281]: x = pd.DataFrame(df.loc[:, df.columns == 1].values)

    In [282]: x
    Out[282]:
        0   1    2
    0   9  25    2
    1  24  28   77
    2  17  27   24
    3  12  26    7
    4   9  25    2
    5  33  29  238

    In [283]: df['porog'] = (x[x.gt(df['max_noise'].values, axis=0)]
         ...:                .min(axis=1)
         ...:                .fillna(0)
         ...:                .astype('int')
         ...:                .values)

    In [284]: df
    Out[284]:
    idx   1   0   1   8   2    1  max_noise  porog
    7     9   9  25   4   7    2          9     25
    1    24  36  28  19  13   77         36     77
    3    17  27  27  12  11   24         27      0
    6    12  18  26   7   9    7         18     26
    7     9   9  25   4   7    2          9     25
    9    33  45  29  28  15  238         45    238
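The session above masks every value that does not exceed the row's `max_noise`, takes the row-wise minimum of what is left, and writes 0 where nothing passes. The same idea can be sketched on plain NumPy arrays, shown here only as a possible variant, not as the answer's own method. The demo data is rebuilt from an array with an explicit index, column `F` is assumed to be `3**x - x` (which is what the `Out[282]` values imply), and the names `is_one`, `vals`, `thr` and `candidates` are illustrative.

```python
import numpy as np
import pandas as pd

# Demo data from the question, rebuilt from a plain array (F assumed to be 3**x - x)
x = np.array([1, 4, 3, 2, 1, 5])
df = pd.DataFrame(
    {"A": x**2 + 8, "B": x * 9, "C": x + 24,
     "D": x * x + 3, "E": x * 2 + 5, "F": 3**x - x},
    index=[7, 1, 3, 6, 7, 9],
)
df.columns = [1, 0, 1, 8, 2, 1]

# Column mask computed once, before any new columns are added
is_one = df.columns == 1
vals = df.loc[:, is_one].to_numpy()              # values of the columns labelled 1
thr = df.loc[:, ~is_one].to_numpy().max(axis=1)  # per-row threshold (max_noise)

# Values that do not exceed the threshold become +inf so they cannot win the minimum
candidates = np.where(vals > thr[:, None], vals, np.inf)
row_min = candidates.min(axis=1)

# Rows where nothing exceeded the threshold get 0, as in the answer
porog = np.where(np.isinf(row_min), 0, row_min).astype(int)

df["max_noise"] = thr
df["porog"] = porog
```

Working on arrays sidesteps label alignment on the duplicated index and column names. Whether it behaves better on the real data is not something this sketch can show.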
  • I can't use the expression (x[x.gt(df['max_noise'].values, axis=0)] .min(axis=1) .fillna(0) .astype('int') .values): it eats up all the memory. - Mavar
  • @Mavar, these are the consequences of working around the pitfalls that I mentioned in the comments ... - MaxU
  • If I rename the columns, for example following the pattern 1_a, 1_b, 1_c ..., will that help? - Mavar
  • Try to create a DataFrame with unique column names and unique index values. - MaxU February 2:23 pm
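A minimal sketch of what that could look like with the suffix scheme proposed in the comments (1_a, 1_b, 1_c ...) and a default unique index. The concrete column names, the `startswith("1_")` test and the variable names are assumptions for illustration, and column `1_c` is again assumed to be `3**x - x`.

```python
import numpy as np
import pandas as pd

x = np.array([1, 4, 3, 2, 1, 5])

# Unique column names: cluster label plus a suffix; the index is a default RangeIndex
df = pd.DataFrame(
    {"1_a": x**2 + 8, "0_a": x * 9, "1_b": x + 24,
     "8_a": x * x + 3, "2_a": x * 2 + 5, "1_c": 3**x - x}
)

# Boolean mask over the columns whose label part is 1
is_one = df.columns.str.startswith("1_")
cluster = df.loc[:, is_one]

# Threshold: row-wise maximum over all other columns
max_noise = df.loc[:, ~is_one].max(axis=1)

# Keep only values above the row threshold (others become NaN), then take the row minimum
above = cluster.where(cluster.gt(max_noise, axis=0))
porog = above.min(axis=1).fillna(0).astype(int)

result = df.assign(max_noise=max_noise, porog=porog)
```

With unique labels on both axes, pandas can align `max_noise` with the rows directly, so the `.values` workarounds from the answer are not needed here. This sketch does not demonstrate any effect on memory use with the real data.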