Bootstrap

python DataFrame数据分组统计groupby()函数,值得推荐

  • 4. 通过 字典 和 Series 对象进行分组统计

    • 4.1通过一个字典
  • 4.2通过一个Series

1. groupby基本用法

=====================================================================================

1.1 一级分类_分组求和


import pandas as pd

data = [[‘a’, ‘A’, 109], [‘b’, ‘B’, 112], [‘c’, ‘A’, 125], [‘d’, ‘C’, 120],

[‘e’, ‘C’, 126], [‘f’, ‘B’, 133], [‘g’, ‘A’, 124], [‘h’, ‘B’, 134],

[‘i’, ‘C’, 117], [‘j’, ‘C’, 128]]

index = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

columns = [‘name’, ‘class’, ‘num’]

df = pd.DataFrame(data=data, index=index, columns=columns)

print(df)

print("=================================================")

df1 = df.groupby(‘class’).sum() # 分组统计求和

print(df1)

在这里插入图片描述


1.2 二级分类_分组求和


给groupby()传入一个列表,列表中的元素为分类字段,从左到右分类级别增大。(一级分类、二级分类…)

import pandas as pd

data = [[‘a’, ‘A’, ‘1等’, 109], [‘b’, ‘B’, ‘1等’, 112], [‘c’, ‘A’, ‘1等’, 125], [‘d’, ‘B’, ‘2等’, 120],

[‘e’, ‘B’, ‘1等’, 126], [‘f’, ‘B’, ‘2等’, 133], [‘g’, ‘A’, ‘2等’, 124], [‘h’, ‘B’, ‘1等’, 134],

[‘i’, ‘A’, ‘2等’, 117], [‘j’, ‘A’, ‘2等’, 128], [‘h’, ‘A’, ‘1等’, 130], [‘i’, ‘B’, ‘2等’, 122]]

index = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

columns = [‘name’, ‘class_1’, ‘class_2’, ‘num’]

df = pd.DataFrame(data=data, index=index, columns=columns)

print(df)

print("=================================================")

df1 = df.groupby([‘class_1’, ‘class_2’]).sum() # 分组统计求和

print(df1)

;