I have a DataFrame that contains different values.

    import pandas as pd

    df = pd.DataFrame({"data": [1, 1, 1, 1, 0, 0, 0, 2, 2, 3]})

I want to calculate what percentage of the total each value accounts for, that is, to get a table like this:

    value | percent
    ---------------
    0     | 30 (or 0.3)
    1     | 40 (or 0.4)
    2     | 20 (or 0.2)
    3     | 10 (or 0.1)

I can count it like this:

    # Add one more column so that count() has something to count
    df['column'] = 1
    df2 = df.groupby('data').count()
    df2['percent'] = df2['column'] / len(df.index)

And I get what I want:

          column  percent
    data
    0          3      0.3
    1          4      0.4
    2          2      0.2
    3          1      0.1

However, I can't shake the feeling that I'm doing this all wrong and that a problem like this should have a much simpler solution. What is the best way to do this?

    1 answer

    You can use the GroupBy.size() method; in that case you don't need to create a new column:

        In [4]: df.groupby('data').size() / len(df)
        Out[4]:
        data
        0    0.3
        1    0.4
        2    0.2
        3    0.1
        dtype: float64
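
As a possible alternative to the groupby approach, `Series.value_counts` with `normalize=True` should give the same shares in one call. The sketch below is not part of the original answer. It also reshapes the result into the `value | percent` layout from the question; the names `shares` and `table` are just illustrative.

```python
import pandas as pd

df = pd.DataFrame({"data": [1, 1, 1, 1, 0, 0, 0, 2, 2, 3]})

# normalize=True returns each value's share of the total instead of raw counts;
# sort_index() orders the result by value rather than by frequency.
shares = df["data"].value_counts(normalize=True).sort_index()

# Convert the shares to percentages and reshape into a two-column table
# with the column names used in the question.
table = (
    shares.mul(100)
    .rename_axis("value")
    .reset_index(name="percent")
)
print(table)
```

If fractions are enough, `shares` on its own already answers the question. The multiplication by 100 and the renaming are only there to match the requested table.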