python dataframe: create a column that dynamically computes percentages by day and month

Posted on July 18

I am struggling with the following.

This is my dataframe. The last two columns are calculations I was already able to create in a query.

    Day Month   Result  Vol     FirstCol    Second Col
0    26    5      Good   123    1%          0%
1    26    5      Bad    716    0%          2%
2    26    5      Other   36    0%          0%
3    26    6      Good  4721    26%         11%
4    26    6      Bad   7148    0%          16%
5    26    6      Other 1387    0%          3%
6    27    5      Good   196    1%          0%
7    27    5      Bad    627    0%          1%
8    27    5      Other   60    0%          0%
9    27    6      Good  6188    34%         14%
10   27    6      Bad   6688    0%          15%
11   27    6      Other 1068    0%          2%
12   28    5      Good   339    2%          1%
13   28    5      Bad   1114    0%          3%
14   28    5      Other   72    0%          0%
15   28    6      Good  6524    36%         15%
16   28    6      Bad   6103    0%          14%
17   28    6      Other  820    0%          2%

The first column computes each row's share of the total Vol across all Good Results:

df['FirstCol'] = np.where(df['Result'].isin(['Good']),df['Vol']/df[df['Result']=='Good']['Vol'].sum(),0)

The second column computes each row's share of the total Vol across all Results:

df['SecondCol'] = df['Vol']/df['Vol'].sum()
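
For reference, the two snippets above can be run end to end with something like the sketch below. It assumes pandas and numpy are imported as pd and np, and it rebuilds the sample table by hand; the variable name good is only an illustrative helper. Note that it produces fractions between 0 and 1 rather than the formatted percent strings shown in the table.

```python
import numpy as np
import pandas as pd

# Sample data copied by hand from the table above.
df = pd.DataFrame({
    'Day': [26] * 6 + [27] * 6 + [28] * 6,
    'Month': [5, 5, 5, 6, 6, 6] * 3,
    'Result': ['Good', 'Bad', 'Other'] * 6,
    'Vol': [123, 716, 36, 4721, 7148, 1387,
            196, 627, 60, 6188, 6688, 1068,
            339, 1114, 72, 6524, 6103, 820],
})

# Boolean mask for the rows whose Result is Good (hypothetical helper name).
good = df['Result'] == 'Good'

# Share of the total Good volume; non-Good rows get 0.
df['FirstCol'] = np.where(good, df['Vol'] / df.loc[good, 'Vol'].sum(), 0)

# Share of the total volume over all rows.
df['SecondCol'] = df['Vol'] / df['Vol'].sum()

print(df)
```

Multiplying by 100 and rounding would give numbers comparable to the percentages in the table.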

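For the other columns the code has to be more dynamic, and this is what I am struggling with. The third column should give the percentage within each month, so the percentages in rows 0-8 should add up to 100%, and those in rows 9-17 should add up to 100% as well. The fourth column should give the percentage within each day and month, so rows 0-2 should add up to 100%, rows 3-5 likewise, and so on. I want a dynamic query, because I don't want to change it every month.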

Desired Output

    Day Month   Result  Vol FirstCol    Second Col 0-17 Third Col 1-9 (Month)   Fourth Col 1-3 (Day)
0    26 5   Good    123     1%          0%                  1%                  14%
1    26 5   Bad     716     0%          2%                  5%                  82%
2    26 5   Other    36     0%          0%                  0%                  4%
3    26 6   Good    4721    26%         11%                 31%                 36%
4    26 6   Bad     7148    0%          16%                 48%                 54%
5    26 6   Other   1387    0%          3%                  9%                  10%
6    27 5   Good     196    1%          0%                  1%                  22%
7    27 5   Bad      627    0%          1%                  4%                  71%
8    27 5   Other     60    0%          0%                  0%                  7%
9    27 6   Good    6188    34%         14%                 21%                 44%
10   27 6   Bad     6688    0%          15%                 23%                 48%
11   27 6   Other   1068    0%          2%                  4%                  8%
12   28 5   Good     339    2%          1%                  1%                  22%
13   28 5   Bad     1114    0%          3%                  4%                  73%
14   28 5   Other     72    0%          0%                  0%                  5%
15   28 6   Good    6524    36%         15%                 23%                 49%
16   28 6   Bad     6103    0%          14%                 21%                 45%
17   28 6   Other    820    0%          2%                  3%                  6%

# sort the dataframe to have nicer output:
df = df.sort_values(by=['Month', 'Day'])
df['Third Col'] = df.groupby('Month')['Vol'].transform(lambda x: (x / x.sum()) * 100)
df['Fourth Col'] = df.groupby(['Day', 'Month'])['Vol'].transform(lambda x: (x / x.sum()) * 100)
print(df)

    Day  Month Result   Vol FirstCol Second Col  Third Col  Fourth Col
0    26      5   Good   123       1%         0%   3.746573   14.057143
1    26      5    Bad   716       0%         2%  21.809321   81.828571
2    26      5  Other    36       0%         0%   1.096558    4.114286
6    27      5   Good   196       1%         0%   5.970149   22.197055
7    27      5    Bad   627       0%         1%  19.098386   71.007928
8    27      5  Other    60       0%         0%   1.827597    6.795017
12   28      5   Good   339       2%         1%  10.325921   22.229508
13   28      5    Bad  1114       0%         3%  33.932379   73.049180
14   28      5  Other    72       0%         0%   2.193116    4.721311
3    26      6   Good  4721      26%        11%  11.614633   35.614062
4    26      6    Bad  7148       0%        16%  17.585554   53.922752
5    26      6  Other  1387       0%         3%   3.412306   10.463186
9    27      6   Good  6188      34%        14%  15.223756   44.377510
10   27      6    Bad  6688       0%        15%  16.453859   47.963282
11   27      6  Other  1068       0%         2%   2.627500    7.659208
15   28      6   Good  6524      36%        15%  16.050385   48.516398
16   28      6    Bad  6103       0%        14%  15.014638   45.385588
17   28      6  Other   820       0%         2%   2.017369    6.098014
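
A self-contained variant of the same idea, offered only as a sketch: it rebuilds the sample data by hand and divides Vol by the per-group total obtained with transform('sum') instead of a lambda. The results should match the lambda version; the sanity check at the end is an addition for illustration and not part of the original answer.

```python
import pandas as pd

# Sample data copied by hand from the question's table.
df = pd.DataFrame({
    'Day': [26] * 6 + [27] * 6 + [28] * 6,
    'Month': [5, 5, 5, 6, 6, 6] * 3,
    'Result': ['Good', 'Bad', 'Other'] * 6,
    'Vol': [123, 716, 36, 4721, 7148, 1387,
            196, 627, 60, 6188, 6688, 1068,
            339, 1114, 72, 6524, 6103, 820],
})

# Sort so rows of the same month and day sit together in the output.
df = df.sort_values(by=['Month', 'Day'])

# transform('sum') returns the group total aligned to every row of the group,
# so the division works row by row without hard-coding any month or day.
month_total = df.groupby('Month')['Vol'].transform('sum')
day_total = df.groupby(['Day', 'Month'])['Vol'].transform('sum')

df['Third Col'] = df['Vol'] / month_total * 100
df['Fourth Col'] = df['Vol'] / day_total * 100

# Sanity check: percentages within each group should add up to 100.
print(df.groupby('Month')['Third Col'].sum())
print(df.groupby(['Day', 'Month'])['Fourth Col'].sum())
print(df)
```

Because the grouping keys come from the data itself, new months or days are picked up without any change to the code.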

