I have a DataFrame df, as shown below:
import pandas as pd
data = {
    'A': ['XYZ', 'XYZ', 'XYZ', 'XYZ', 'PQR', 'PQR', 'PQR', 'PQR', 'CVB', 'CVB', 'CVB', 'CVB'],
    'B': ['2022-02-16 14:00:31', '2022-02-16 16:11:26', '2022-02-16 17:31:26',
          '2022-02-16 22:47:46', '2022-02-17 07:11:11', '2022-02-17 10:43:36',
          '2022-02-17 15:05:11', '2022-02-18 18:06:12', '2022-02-19 09:05:46',
          '2022-02-19 13:02:16', '2022-02-19 18:05:26', '2022-02-19 22:05:26'],
    'C': [1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1],
}
df = pd.DataFrame(data)
df['B'] = pd.to_datetime(df['B'])
df
| A | B | C |
+-------+----------------------+------------+
| XYZ | 2022-02-16 14:00:31 | 1 |
| XYZ | 2022-02-16 16:11:26 | 0 |
| XYZ | 2022-02-16 17:31:26 | 0 |
| XYZ | 2022-02-16 22:47:46 | 0 |
| PQR | 2022-02-17 07:11:11 | 1 |
| PQR | 2022-02-17 10:43:36 | 0 |
| PQR | 2022-02-17 15:05:11 | 1 |
+-------+----------------------+------------+
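What I want to do is count the occurrences of 1s and 0s in column C, assign that count to a new column Count in df, and add an ID as another new column, so that the output looks like the table below. Each group starts at a 1 and includes the 0s that follow it, up to the next 1. For example, the pattern 1,0,0,0 in the first four rows of column C has a count of 4, and the last row, which contains only the value 1, has a count of 1.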
Expected output:
| A | B | C | Count | ID |
+-------+----------------------+------------+----------+---------+
| XYZ | 2022-02-16 14:00:31 | 1 | 4 | ABC_1 |
| XYZ | 2022-02-16 16:11:26 | 0 | NaN | |
| XYZ | 2022-02-16 17:31:26 | 0 | NaN | |
| XYZ | 2022-02-16 22:47:46 | 0 | NaN | |
| PQR | 2022-02-17 07:11:11 | 1 | 2 | ABC_2 |
| PQR | 2022-02-17 10:43:36 | 0 | NaN | |
| PQR | 2022-02-17 15:05:11 | 1 | 1 | ABC_3 |
+-------+----------------------+------------+----------+---------+
Currently I am trying to achieve this with the code below, but I am not getting the expected result.
one_index = df[df['C'] == 1].index
zero_index = df[df['C'] == 0].index
df.loc[0, 'Count'] = len(df)
df.loc[one_index, 'ID'] = "ABC_1"
Actual output:
| A | B | C | Count | ID |
+-------+----------------------+------------+----------+--------+
| XYZ | 2022-02-16 14:00:31 | 1 | 7 | ABC_1 |
| XYZ | 2022-02-16 16:11:26 | 0 | NaN | |
| XYZ | 2022-02-16 17:31:26 | 0 | NaN | |
| XYZ | 2022-02-16 22:47:46 | 0 | NaN | |
| PQR | 2022-02-17 07:11:11 | 1 | NaN | ABC_1 |
| PQR | 2022-02-17 10:43:36 | 0 | NaN | |
| PQR | 2022-02-17 15:05:11 | 1 | NaN | ABC_1 |
+-------+----------------------+------------+----------+--------+
How can I count the occurrences of 1s and 0s in a pandas DataFrame in this way?
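
One possible direction, as a minimal sketch rather than a definitive answer: if each group is assumed to start at a row where C == 1 and to run until the next 1, a cumulative sum over C can label the groups, and the size of each group can then be written only on its first row. The names run_id, starts and run_size are hypothetical helpers introduced for illustration, the ID format "ABC_<group number>" is inferred from the expected output, and the sample uses only the seven rows shown in the tables.

```python
import pandas as pd

# Seven-row sample matching the tables in the question (column B omitted for brevity).
df = pd.DataFrame({
    'A': ['XYZ', 'XYZ', 'XYZ', 'XYZ', 'PQR', 'PQR', 'PQR'],
    'C': [1, 0, 0, 0, 1, 0, 1],
})

# Assumption: every 1 starts a new group, so a cumulative sum numbers the groups 1, 2, 3, ...
run_id = df['C'].cumsum().rename('run')
starts = df['C'].eq(1)

# Size of each group, broadcast to all of its rows, then kept only on the starting row.
run_size = df['C'].groupby(run_id).transform('size')
df['Count'] = run_size.where(starts)

# Hypothetical ID scheme "ABC_<group number>" on starting rows, empty string elsewhere.
df['ID'] = ('ABC_' + run_id.astype(str)).where(starts, '')

print(df)
```

Note that Count becomes a float column because the non-starting rows hold NaN. If C could begin with 0s before the first 1, those rows would fall into a group numbered 0 with no starting row, so they would simply get NaN and an empty ID under this sketch.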