How can I get the union of unique lists from two and three columns of a DataFrame?

This is the DataFrame I am working with:

Col1 Extract              Col2 Extract           Col3 Extract      
------------              ------------           ------------
['unclassified']          ['sink', 'fridge']     ['unclassified']
['fridge', 'microwave']   ['fridge', 'stove']    ['sink']          
['unclassified']          ['unclassified']       ['unclassified']

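I want a pandas way to combine the lists in ('Col1 Extract' + 'Col2 Extract') and in ('Col1 Extract' + 'Col2 Extract' + 'Col3 Extract') into lists of unique values, row by row. This is what I am looking for: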

Col1+Col2                             Col1+Col2+Col3
------------                          ---------------             
['unclassified', 'sink', 'fridge']    ['unclassified', 'sink', 'fridge']      
['fridge', 'microwave', 'stove']      ['fridge', 'microwave', 'stove', 'sink']          
['unclassified']                      ['unclassified']  

Recommended answer

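Join the columns and remove duplicates with set. A set is unordered, so the order of the elements in each resulting list is arbitrary: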

df['Col1+Col2'] = (df['Col1 Extract'] + df['Col2 Extract']).apply(lambda x: list(set(x)))
df['Col1+Col2+Col3'] = (df['Col1 Extract'] + df['Col2 Extract'] + df['Col3 Extract']).apply(lambda x: list(set(x)))
print(df)
          Col1 Extract     Col2 Extract    Col3 Extract  \
0       [unclassified]   [sink, fridge]  [unclassified]   
1  [fridge, microwave]  [fridge, stove]          [sink]   
2       [unclassified]   [unclassified]  [unclassified]   

                      Col1+Col2                    Col1+Col2+Col3  
0  [fridge, unclassified, sink]      [fridge, unclassified, sink]  
1    [stove, fridge, microwave]  [stove, fridge, microwave, sink]  
2                [unclassified]                    [unclassified] 
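
Because set ordering can differ between runs, the output above may not match exactly on another machine. If any deterministic order is acceptable, one possible variation is to sort each de-duplicated list. The sketch below rebuilds the first two columns of the sample data; the column name 'Col1+Col2 sorted' is made up for this illustration and does not appear in the question.

```python
import pandas as pd

# Sample data copied from the question (first two columns only).
df = pd.DataFrame({
    "Col1 Extract": [["unclassified"], ["fridge", "microwave"], ["unclassified"]],
    "Col2 Extract": [["sink", "fridge"], ["fridge", "stove"], ["unclassified"]],
})

# Adding two object-dtype Series of lists concatenates the lists row by row.
combined = df["Col1 Extract"] + df["Col2 Extract"]

# sorted(set(...)) removes duplicates and returns an alphabetically ordered list.
# "Col1+Col2 sorted" is a hypothetical column name used only in this sketch.
df["Col1+Col2 sorted"] = combined.apply(lambda x: sorted(set(x)))

print(df["Col1+Col2 sorted"].tolist())
```

This gives alphabetical order, not the order in which the values first appear.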

If order matters, use the dict.fromkeys trick, which keeps the first occurrence of each value in its original position:

df['Col1+Col2'] = (df['Col1 Extract'] + df['Col2 Extract']).apply(lambda x: list(dict.fromkeys(x)))
df['Col1+Col2+Col3'] = (df['Col1 Extract'] + df['Col2 Extract'] + df['Col3 Extract']).apply(lambda x: list(dict.fromkeys(x)))
print(df)
          Col1 Extract     Col2 Extract    Col3 Extract  \
0       [unclassified]   [sink, fridge]  [unclassified]   
1  [fridge, microwave]  [fridge, stove]          [sink]   
2       [unclassified]   [unclassified]  [unclassified]   

                      Col1+Col2                    Col1+Col2+Col3  
0  [unclassified, sink, fridge]      [unclassified, sink, fridge]  
1    [fridge, microwave, stove]  [fridge, microwave, stove, sink]  
2                [unclassified]                    [unclassified]  
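
For any number of columns, the same idea can be wrapped in a small helper. This is only a sketch: ordered_union is a hypothetical name, not a pandas function, and it assumes every cell already holds a Python list. Instead of adding the Series together, it walks the rows with zip and chains the lists.

```python
from itertools import chain

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Col1 Extract": [["unclassified"], ["fridge", "microwave"], ["unclassified"]],
    "Col2 Extract": [["sink", "fridge"], ["fridge", "stove"], ["unclassified"]],
    "Col3 Extract": [["unclassified"], ["sink"], ["unclassified"]],
})


def ordered_union(frame, columns):
    """Hypothetical helper: per-row union of list cells, keeping first-seen order."""
    # zip pairs up the cells of the selected columns row by row.
    rows = zip(*(frame[name] for name in columns))
    # chain.from_iterable flattens each row's lists; dict.fromkeys drops
    # duplicates while preserving insertion order (Python 3.7+).
    values = [list(dict.fromkeys(chain.from_iterable(row))) for row in rows]
    return pd.Series(values, index=frame.index)


df["Col1+Col2"] = ordered_union(df, ["Col1 Extract", "Col2 Extract"])
df["Col1+Col2+Col3"] = ordered_union(
    df, ["Col1 Extract", "Col2 Extract", "Col3 Extract"]
)

print(df[["Col1+Col2", "Col1+Col2+Col3"]])
```

Returning a Series aligned on frame.index keeps the assignment correct even when the DataFrame does not use a default integer index.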
