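I want to remove repeated, unchanged values from a specific column of a pandas DataFrame, handled separately per group. In other words, a value should be dropped only when it directly follows the same value within its group; all other values are kept.
The specific column is `value` in my case, and the group is `node`.
I got it working with a loop, but loops in Python are very slow.
Is there a way to achieve the same result in pandas without a loop?
Table sorted by time ASC: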
time | node | value | comment (not in df) |
---|---|---|---|
2024-05-07 13:39:31.315437 | ns=4;i=6 | NaN | ok |
2024-05-07 13:39:31.327564 | ns=4;i=7 | 5,514E+09 | ok |
2024-05-07 13:39:31.328585 | ns=4;i=8 | 1 | ok |
2024-05-07 13:39:31.425523 | ns=4;i=9 | 33 | ok |
2024-05-07 13:39:31.561920 | ns=4;i=10 | False | ok |
... | ... | ... | |
2024-05-07 14:30:31.425454 | ns=4;i=9 | 33 | remove |
... | ... | ... | |
2024-05-07 15:20:45.445578 | ns=4;i=9 | 34 | ok |
... | ... | ... | |
2024-05-07 18:24:34.142277 | ns=4;i=10 | 33 | ok |
2024-05-07 18:24:40.245277 | ns=4;i=9 | 33 | ok |
2024-05-07 18:24:45.845477 | ns=4;i=9 | 33 | remove |
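node_values = {}      # last value seen for each node
rows_to_delete = []   # index labels of rows that repeat the previous value
for index, row in df.iterrows():
    # Mark the row if this node's previous value equals the current one.
    if row['node'] in node_values and node_values[row['node']] == row['value']:
        rows_to_delete.append(index)
    # Always remember the latest value for this node.
    node_values[row['node']] = row['value']
df = df.drop(index=rows_to_delete)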
Example before:
time | node | value |
---|---|---|
2024-05-07 13:39:31.315437 | ns=4;i=6 | NaN |
2024-05-07 13:39:31.327564 | ns=4;i=7 | 5,514E+09 |
2024-05-07 13:39:31.328585 | ns=4;i=8 | 1 |
2024-05-07 13:39:31.425523 | ns=4;i=9 | 33 |
2024-05-07 13:39:31.561920 | ns=4;i=10 | False |
2024-05-07 13:39:31.625523 | ns=4;i=9 | 33 |
2024-05-07 13:39:31.725523 | ns=4;i=9 | 34 |
2024-05-07 13:39:31.825523 | ns=4;i=50 | 34 |
2024-05-07 13:39:31.925523 | ns=4;i=9 | 34 |
2024-05-07 13:39:32.125523 | ns=4;i=9 | 33 |
2024-05-07 13:39:31.425523 | ns=4;i=100 | True |
After:
time | node | value |
---|---|---|
2024-05-07 13:39:31.315437 | ns=4;i=6 | NaN |
2024-05-07 13:39:31.327564 | ns=4;i=7 | 5,514E+09 |
2024-05-07 13:39:31.328585 | ns=4;i=8 | 1 |
2024-05-07 13:39:31.425523 | ns=4;i=9 | 33 |
2024-05-07 13:39:31.561920 | ns=4;i=10 | False |
2024-05-07 13:39:31.725523 | ns=4;i=9 | 34 |
2024-05-07 13:39:31.825523 | ns=4;i=50 | 34 |
2024-05-07 13:39:32.125523 | ns=4;i=9 | 33 |
2024-05-07 13:39:33.225523 | ns=4;i=100 | True |
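
For illustration, one loop-free way to go from the "before" rows to the "after" rows might be to compare each value with the previous value of the same `node` and keep only the rows where it changed. This is only a sketch under some assumptions: the frame is already sorted by time, the helper name `drop_consecutive_repeats` and its parameters `group_col` and `value_col` are made up for this example, and the last timestamp in the sample data is taken from the "after" table.

```python
import numpy as np
import pandas as pd

# Sample data modelled on the "before" example, already sorted by time.
# Assumption: the last timestamp follows the "after" table.
df = pd.DataFrame(
    {
        "time": pd.to_datetime(
            [
                "2024-05-07 13:39:31.315437",
                "2024-05-07 13:39:31.327564",
                "2024-05-07 13:39:31.328585",
                "2024-05-07 13:39:31.425523",
                "2024-05-07 13:39:31.561920",
                "2024-05-07 13:39:31.625523",
                "2024-05-07 13:39:31.725523",
                "2024-05-07 13:39:31.825523",
                "2024-05-07 13:39:31.925523",
                "2024-05-07 13:39:32.125523",
                "2024-05-07 13:39:33.225523",
            ]
        ),
        "node": [
            "ns=4;i=6", "ns=4;i=7", "ns=4;i=8", "ns=4;i=9", "ns=4;i=10",
            "ns=4;i=9", "ns=4;i=9", "ns=4;i=50", "ns=4;i=9", "ns=4;i=9",
            "ns=4;i=100",
        ],
        "value": [np.nan, "5,514E+09", 1, 33, False, 33, 34, 34, 34, 33, True],
    }
)


def drop_consecutive_repeats(frame, group_col="node", value_col="value"):
    """Keep a row only if its value differs from the previous value in its group.

    Hypothetical helper for this sketch; it assumes `frame` is sorted by time.
    """
    # Previous value within each group; the first row of a group gets NaN.
    previous = frame.groupby(group_col, sort=False)[value_col].shift()
    # A missing value never compares equal, so the first row of each group
    # (and any repeated NaN) is kept, which mirrors the loop's behaviour.
    changed = frame[value_col].ne(previous)
    return frame[changed]


result = drop_consecutive_repeats(df)
print(result)
```

If the frame is not sorted yet, calling `df.sort_values("time")` first should give the same starting point. Because `value` holds mixed types, the comparison follows Python equality, so `1` and `True` for the same node would count as a repeat, just as they do in the loop.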