I am struggling with the following.
This is my DataFrame. The last two columns are calculations I was already able to create in a query.
Day Month Result Vol FirstCol Second Col
0 26 5 Good 123 1% 0%
1 26 5 Bad 716 0% 2%
2 26 5 Other 36 0% 0%
3 26 6 Good 4721 26% 11%
4 26 6 Bad 7148 0% 16%
5 26 6 Other 1387 0% 3%
6 27 5 Good 196 1% 0%
7 27 5 Bad 627 0% 1%
8 27 5 Other 60 0% 0%
9 27 6 Good 6188 34% 14%
10 27 6 Bad 6688 0% 15%
11 27 6 Other 1068 0% 2%
12 28 5 Good 339 2% 1%
13 28 5 Bad 1114 0% 3%
14 28 5 Other 72 0% 0%
15 28 6 Good 6524 36% 15%
16 28 6 Bad 6103 0% 14%
17 28 6 Other 820 0% 2%
The first column calculates each row's percentage of the total Vol across all rows where Result is Good:
df['FirstCol'] = np.where(df['Result'].isin(['Good']),df['Vol']/df[df['Result']=='Good']['Vol'].sum(),0)
The second column calculates each row's percentage of the total Vol across all Result values:
df['SecondCol'] = df['Vol']/df['Vol'].sum()
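For anyone who wants to run the two lines above, a self-contained version might look like this. It is only a sketch: it assumes pandas and NumPy are installed, uses just the first six rows of the sample (so the numbers differ from the full table), and keeps the results as fractions rather than formatted percentage strings. The `is_good` name is a helper introduced here, not part of the original code.

```python
import numpy as np
import pandas as pd

# First six rows of the sample data (Day 26, Months 5 and 6).
df = pd.DataFrame({
    "Day": [26] * 6,
    "Month": [5, 5, 5, 6, 6, 6],
    "Result": ["Good", "Bad", "Other"] * 2,
    "Vol": [123, 716, 36, 4721, 7148, 1387],
})

# Boolean mask for the "Good" rows (hypothetical helper name).
is_good = df["Result"].eq("Good")

# Share of each Good row in the total Good volume; 0 for every other row.
df["FirstCol"] = np.where(is_good, df["Vol"] / df.loc[is_good, "Vol"].sum(), 0)

# Share of each row in the total volume of the whole frame.
df["SecondCol"] = df["Vol"] / df["Vol"].sum()
```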
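For the other columns the code has to be more dynamic, and this is where I am stuck. The third column should give the percentage based on each month, so rows 0-8 should add up to 100%, and rows 9-17 should also add up to 100%. The fourth column should give the percentage based on each day and month, so rows 0-2 should add up to 100%, rows 3-5 likewise, and so on. I want the query to be dynamic, because I don't want to change it every month.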
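One way this kind of per-group percentage is commonly expressed in pandas is `groupby(...).transform("sum")`, which returns the group total aligned to every row, so nothing has to be hard-coded per month. The sketch below assumes the sample data shown earlier; the column names `ThirdCol` and `FourthCol` are placeholders chosen here. Note that it groups by the values in Month, not by row position, so the third column will not reproduce the third column of the sample output, which appears to be computed over the row blocks 0-8 and 9-17. The fourth column, grouped by Day and Month, does match the sample.

```python
import pandas as pd

# Sample data copied from the question's table.
df = pd.DataFrame({
    "Day": [26] * 6 + [27] * 6 + [28] * 6,
    "Month": [5, 5, 5, 6, 6, 6] * 3,
    "Result": ["Good", "Bad", "Other"] * 6,
    "Vol": [123, 716, 36, 4721, 7148, 1387,
            196, 627, 60, 6188, 6688, 1068,
            339, 1114, 72, 6524, 6103, 820],
})

# Share of each row within its Month; each month's rows sum to 1.0.
df["ThirdCol"] = df["Vol"] / df.groupby("Month")["Vol"].transform("sum")

# Share of each row within its (Day, Month) pair; each pair sums to 1.0.
df["FourthCol"] = df["Vol"] / df.groupby(["Day", "Month"])["Vol"].transform("sum")
```

The results are fractions. If percentage strings are wanted for display, something like `df["FourthCol"].map("{:.0%}".format)` should give values such as `14%` for row 0.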
Desired Output
Day Month Result Vol FirstCol Second Col 0-17 Third Col 1-9 (Month) Fourth Col 1-3 (Day)
0 26 5 Good 123 1% 0% 1% 14%
1 26 5 Bad 716 0% 2% 5% 82%
2 26 5 Other 36 0% 0% 0% 4%
3 26 6 Good 4721 26% 11% 31% 36%
4 26 6 Bad 7148 0% 16% 48% 54%
5 26 6 Other 1387 0% 3% 9% 10%
6 27 5 Good 196 1% 0% 1% 22%
7 27 5 Bad 627 0% 1% 4% 71%
8 27 5 Other 60 0% 0% 0% 7%
9 27 6 Good 6188 34% 14% 21% 44%
10 27 6 Bad 6688 0% 15% 23% 48%
11 27 6 Other 1068 0% 2% 4% 8%
12 28 5 Good 339 2% 1% 1% 22%
13 28 5 Bad 1114 0% 3% 4% 73%
14 28 5 Other 72 0% 0% 0% 5%
15 28 6 Good 6524 36% 15% 23% 49%
16 28 6 Bad 6103 0% 14% 21% 45%
17 28 6 Other 820 0% 2% 3% 6%