I have a DataFrame that looks like this:

import numpy as np
import pandas as pd

data = {'Region': ['Africa','Africa','Africa','Africa','Africa','Africa','Africa','Africa','Asia','Asia','Asia','Asia'],
         'Country': ['South Africa','South Africa','South Africa','South Africa','South Africa','South Africa','South Africa','South Africa','Japan','Japan','Japan','Japan'],
         'Product': ['ABC','ABC','ABC','ABC','XYZ','XYZ','XYZ','XYZ','DEF','DEF','DEF','DEF'],
         'Year': [2016, 2017, 2018, 2019,2016, 2017, 2018, 2019,2016, 2017, 2018, 2019],
         'Price': [500, 400, 0,450,750,0,0,890,0,0,415,0],
         'Quantity': [1200,1700,0,330,500,0,0,120,300,0,50,0],
         'Value': [600000,680000,0,148500,350000,0,0,106800,0,0,20750,0]}

df = pd.DataFrame(data)

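I want to replace all numeric values (for example, the values in the Year, Price, Quantity and Value columns) with NaN, but I can't figure out a good way to do it.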

Recommended answer

Use DataFrame.select_dtypes to select the numeric columns, then set them to missing values:

df[df.select_dtypes(np.number).columns] = np.nan
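
As a minimal, self-contained sketch of the same idea: the frame `sample` and the name `numeric_cols` below are made up for illustration and are not part of the question. It assumes every value to be blanked sits in a column with a numeric dtype.

```python
import numpy as np
import pandas as pd

# Small hypothetical frame, shaped like the one in the question.
sample = pd.DataFrame({
    "Region": ["Africa", "Asia"],
    "Year": [2016, 2017],
    "Price": [500.0, 0.0],
})

# Names of the columns whose dtype is numeric (int, float, ...).
numeric_cols = sample.select_dtypes(include="number").columns

# Overwrite every value in those columns with NaN.
# Columns that are not numeric (here: Region) are left untouched.
sample[numeric_cols] = np.nan

print(sample)
```

Because the selection is by column dtype, a number stored as a string in an object column would not be picked up by this approach.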

Alternatively, if some rows may contain numbers, or numbers stored as strings, test each value with to_numeric and use DataFrame.where to set the NaNs:

df = df.where(df.apply(pd.to_numeric, errors='coerce').isna())
print(df)
    Region       Country Product  Year  Price  Quantity  Value
0   Africa  South Africa     ABC   NaN    NaN       NaN    NaN
1   Africa  South Africa     ABC   NaN    NaN       NaN    NaN
2   Africa  South Africa     ABC   NaN    NaN       NaN    NaN
3   Africa  South Africa     ABC   NaN    NaN       NaN    NaN
4   Africa  South Africa     XYZ   NaN    NaN       NaN    NaN
5   Africa  South Africa     XYZ   NaN    NaN       NaN    NaN
6   Africa  South Africa     XYZ   NaN    NaN       NaN    NaN
7   Africa  South Africa     XYZ   NaN    NaN       NaN    NaN
8     Asia         Japan     DEF   NaN    NaN       NaN    NaN
9     Asia         Japan     DEF   NaN    NaN       NaN    NaN
10    Asia         Japan     DEF   NaN    NaN       NaN    NaN
11    Asia         Japan     DEF   NaN    NaN       NaN    NaN
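
To see how the to_numeric test behaves on mixed data, here is a small sketch. The frame `mixed` and the names `is_not_number` and `cleaned` are hypothetical and only serve the illustration; the question's own data has no numbers stored as strings.

```python
import pandas as pd

# Hypothetical frame: Price holds a number saved as a string next to a real string.
mixed = pd.DataFrame({
    "Country": ["Japan", "Japan"],
    "Price": ["415", "n/a"],
    "Quantity": [50, 0],
})

# to_numeric(..., errors="coerce") turns anything it cannot parse into NaN,
# so isna() is True where the original value is not a number.
# Note: isna() is also True for cells that were already missing.
is_not_number = mixed.apply(pd.to_numeric, errors="coerce").isna()

# where() keeps values where the mask is True and puts NaN everywhere else.
cleaned = mixed.where(is_not_number)

print(cleaned)
```

Here the test is applied value by value, so "415" is replaced while "n/a" in the same column is kept.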
