Replacing values in a pandas DataFrame: replace all values in column "A" with per-class averages

Question

I have a pandas DataFrame, df, with two columns: column "A" holds a variable and column "B" holds a class label.

How can I replace each value in column "A" with the average of "A" for the corresponding class?

Calculating the per-class means themselves is easy:

df.groupby('B')['A'].mean()

But how do I then replace all the values in column "A" with the calculated averages?

Answer 1 · 2016-11-22T18:51:16

Use the GroupBy.transform() function.

Example:

Source DataFrame:

 In [36]: df = pd.DataFrame({'A':np.random.randint(0, 10, 10), 'B':np.random.choice(list('XYZ'), 10)})

 In [37]: df
 Out[37]:
    A  B
 0  8  Y
 1  8  Z
 2  3  Z
 3  1  Y
 4  3  Y
 5  5  Z
 6  7  Z
 7  1  X
 8  4  X
 9  2  Y

Solution:

 In [39]: df['avg_A'] = df.groupby('B')['A'].transform('mean')

Result:

 In [40]: df
 Out[40]:
    A  B  avg_A
 0  8  Y   3.50
 1  8  Z   5.75
 2  3  Z   5.75
 3  1  Y   3.50
 4  3  Y   3.50
 5  5  Z   5.75
 6  7  Z   5.75
 7  1  X   2.50
 8  4  X   2.50
 9  2  Y   3.50
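The new column lines up row by row because transform returns a result indexed like the original DataFrame, whereas a plain mean() returns one row per class. The sketch below contrasts the two. It assumes the same ten rows as the example, typed in by hand so the run is reproducible; the names per_class and per_row are illustrative only.

```python
import pandas as pd

# Same values as the example above, entered by hand (assumption: no random
# generation, so the result is reproducible).
df = pd.DataFrame({
    'A': [8, 8, 3, 1, 3, 5, 7, 1, 4, 2],
    'B': list('YZZYYZZXXY'),
})

# Aggregation: one value per class, indexed by the class label.
per_class = df.groupby('B')['A'].mean()

# Transformation: one value per original row, indexed like df.
per_row = df.groupby('B')['A'].transform('mean')

print(len(per_class), len(per_row))
print(per_row.index.equals(df.index))
```

Because per_row shares df's index, it can be assigned straight to a column without any merge or reindexing.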

If you need to replace the values in column A itself:

 df['A'] = df.groupby('B')['A'].transform('mean')
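As a hedged illustration of the in-place replacement, the sketch below uses the same ten rows (entered by hand, an assumption made for reproducibility) and compares the result with an alternative route: computing the per-class means once and mapping them onto column "B" with Series.map. The names class_means and via_map are illustrative and not part of the original answer.

```python
import pandas as pd

# Same values as the example above, entered by hand.
df = pd.DataFrame({
    'A': [8, 8, 3, 1, 3, 5, 7, 1, 4, 2],
    'B': list('YZZYYZZXXY'),
})

# Alternative route: per-class means looked up by class label.
class_means = df.groupby('B')['A'].mean()
via_map = df['B'].map(class_means)

# Route from the answer: overwrite column A with the per-class mean.
df['A'] = df.groupby('B')['A'].transform('mean')

# Column A was integer before; after the replacement it holds floats.
print(df['A'].dtype)
print((df['A'] == via_map).all())
```

Note that the original values of A are lost after the assignment, so keep a copy or write to a new column if they are still needed.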
