Python: efficiently filtering a pandas DataFrame by values in different columns

Posted on August 30

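I'm doing some work with pandas DataFrames and need to filter the data based on several different columns. This is easy to do with pandas syntax, as the minimal working example below shows: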

import pandas as pd

#Create DataFrame
df = pd.DataFrame({'col1':['one','two','three','four', 'five'],'col2':[1,2,3,4,5],'col3':[0.3,0.4,0.5,0.6,0.7], 'col4':[0.4,0.5,0.6,0.7,0.8]})
print(df)
#Filter wanted values
print(df[(df['col1'] == 'one') & (df['col2'] == 1)])

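The problem is that when the number of columns to filter on is large (more than 4), this syntax quickly turns into a very long chain of & and | with many redundant elements, which is really hard to read afterwards. Here is one such example:

data_raw[(data_raw['Metales'] == 'MoO3') | (data_raw['Metales_0'] == 'MoO3') | (data_raw['Metales_1'] == 'MoO3') | (data_raw['Metales_2'] == 'MoO3')]

(This is only meant to show how long and uncomfortable to read it becomes; in principle I don't need to test the same column against two different values.)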

Is there a simpler, more concise way to select values across a set of different columns?

You can put the wanted values in a dictionary and compare them against the matching columns in one go:

import pandas as pd

#Create DataFrame
df = pd.DataFrame({'col1':['one','two','three','four', 'five'],'col2':[1,2,3,4,5],'col3':[0.3,0.4,0.5,0.6,0.7], 'col4':[0.4,0.5,0.6,0.7,0.8]})
print(df)

#Create a dictionary with the column names as keys and the wanted values as values
filterv = {'col1': 'one', 'col2': 1, 'col3': 0.3, 'col4': 0.4}

#Filter the DataFrame according to the values
print(df.loc[(df[list(filterv)] == pd.Series(filterv)).all(axis=1)])

This works by:

1. Creating a dictionary `filterv` where the column names are the keys and the desired values are the values.
2. Creating a `pd.Series` out of that dictionary.
3. Selecting the corresponding columns of the df with `df[list(filterv)]`.
4. Comparing the columns of the df with the series we created.
5. Keeping only the rows where every compared column matches, using `.all(axis=1)`, and selecting them with `df.loc`.
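To make the steps easier to follow, here is a minimal sketch that splits the one-liner into its intermediate pieces. It also shows an "any column matches" variant for the long `|` chain from the question. The names `wanted`, `subset`, `matches`, `result`, `cols` and `any_match` are made up for illustration, and the `data_raw` contents are invented sample data, since the question does not show the real table.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': ['one', 'two', 'three', 'four', 'five'],
    'col2': [1, 2, 3, 4, 5],
    'col3': [0.3, 0.4, 0.5, 0.6, 0.7],
    'col4': [0.4, 0.5, 0.6, 0.7, 0.8],
})

# Steps 1-2: wanted values keyed by column name, turned into a Series
filterv = {'col1': 'one', 'col2': 1}
wanted = pd.Series(filterv)

# Step 3: keep only the columns we want to test (same order as the dict keys)
subset = df[list(filterv)]

# Step 4: column-wise comparison; the Series index lines up with the column labels
matches = subset == wanted

# Step 5: a row is kept only if every tested column matched (logical AND)
result = df.loc[matches.all(axis=1)]
print(result)

# "OR" variant for the question's chain of |: is one value present in ANY column?
# The data below is invented sample data, not the asker's real table.
data_raw = pd.DataFrame({
    'Metales':   ['MoO3', 'Fe2O3', 'CuO'],
    'Metales_0': ['CuO', 'MoO3', 'Fe2O3'],
    'Metales_1': ['ZnO', 'ZnO', 'ZnO'],
})
cols = ['Metales', 'Metales_0', 'Metales_1']
any_match = data_raw[cols].eq('MoO3').any(axis=1)
print(data_raw.loc[any_match])
```

Swapping `.all(axis=1)` for `.any(axis=1)` is what turns the implicit `&` chain into an `|` chain. One caveat: the float columns are compared with exact equality, so values that come out of a calculation may need a tolerance-based check instead.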

