Python: iterate over nested lists in a pandas column and build a dictionary from the values in the nested lists

Posted on September 21

I have a DataFrame like this:

id        some_binary_col        some_amount_col        nested_lists
123       0                      100                    ['email_rule','phone_rule','score_rule']
456       1                      500                    ['address_rule','zip_rule']
121       1                      300                    ['zip_rule','phone_rule']
122       0                      100                    ['score_rule','phone_rule','new_rule']
133       1                      200                    ['email_rule','address_rule','zip_rule']

Reproducible example:

import pandas as pd

ids = [123,456,121,122,133]
some_binary_col = [0,1,1,0,1]
some_amount_col = [100,500,300,100,200]
nested_lists = [
    ['email_rule','phone_rule','score_rule']
    ,['address_rule','zip_rule']
    ,['zip_rule','phone_rule']
    ,['score_rule','phone_rule','new_rule']
    ,['email_rule','address_rule','zip_rule']
]

df = pd.DataFrame()
df['id'] = ids
df['some_binary_col'] = some_binary_col
df['some_amount_col'] = some_amount_col
df['nested_lists'] = nested_lists

I am trying to build a new dictionary that counts, for each rule in the nested_lists column, the number of rows where some_binary_col == 1, like this:

rule_binary_col_dict = {
    'email_rule': 1
    ,'phone_rule': 1
    ,'score_rule': 0
    ,'address_rule': 2
    ,'zip_rule': 3
    ,'new_rule': 0
}

The nested_lists column can contain any number of unique lists and elements.

The part I am struggling with is iterating over each element of the nested lists, for example:

for i in df['nested_lists']:
    for j in i:
        ...  # some condition

I don't know how to access each element of the nested lists.

Recommended answer

rule_binary_col_dict = {}
for row in df.itertuples(index=False):
    # Each row carries its own list of rules
    for rule in row.nested_lists:
        # Initialize the count to 0 if the rule is not in the dictionary yet
        if rule not in rule_binary_col_dict:
            rule_binary_col_dict[rule] = 0
        # Increment the count if some_binary_col is 1
        if row.some_binary_col == 1:
            rule_binary_col_dict[rule] += 1

# Print the result
print(rule_binary_col_dict)
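For comparison, the same counts can probably also be obtained without an explicit Python loop. The following is only a sketch of an alternative, not the method from the answer: it uses `DataFrame.explode` to give every rule its own row and then sums a boolean flag per rule. The names `exploded`, `is_flagged`, and `rule_counts` are illustrative and do not appear in the question.

```python
import pandas as pd

# Rebuild the sample DataFrame from the question.
df = pd.DataFrame({
    "id": [123, 456, 121, 122, 133],
    "some_binary_col": [0, 1, 1, 0, 1],
    "some_amount_col": [100, 500, 300, 100, 200],
    "nested_lists": [
        ["email_rule", "phone_rule", "score_rule"],
        ["address_rule", "zip_rule"],
        ["zip_rule", "phone_rule"],
        ["score_rule", "phone_rule", "new_rule"],
        ["email_rule", "address_rule", "zip_rule"],
    ],
})

# One row per (original row, rule) pair; the other columns are repeated.
exploded = df.explode("nested_lists")

# True where the originating row had some_binary_col == 1.
is_flagged = exploded["some_binary_col"].eq(1)

# Summing booleans per rule counts the flagged rows; sort=False keeps
# the rules in order of first appearance.
rule_counts = (
    is_flagged.groupby(exploded["nested_lists"], sort=False)
    .sum()
    .to_dict()
)

print(rule_counts)
```

Rules that only occur in rows with some_binary_col == 0 still show up with a count of 0, because every rule gets a group. One caveat under this approach: a row whose list is empty explodes to a missing value, which `groupby` drops by default.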
