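I am looking for a way to do a modified pandas interpolation, so that runs of consecutive NaN values longer than the limit are not filled in the data frame at all.
If this is the data frame I start with: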
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [0, np.nan, np.nan, np.nan, 3, 4],
                   'col2': [np.nan, 1, 2, np.nan, 4, np.nan],
                   'col3': [4, np.nan, np.nan, 7, 10, 11]})
df
   col1  col2  col3
0   0.0   NaN   4.0
1   NaN   1.0   NaN
2   NaN   2.0   NaN
3   NaN   NaN   7.0
4   3.0   4.0  10.0
5   4.0   NaN  11.0
and I specify that I want to interpolate with a limit of two, with an inside limit area, as seen below:
df.interpolate(method="linear", limit=2, limit_area="inside")
This is the result:
   col1  col2  col3
0  0.00   NaN   4.0
1  0.75   1.0   5.0
2  1.50   2.0   6.0
3   NaN   3.0   7.0
4  3.00   4.0  10.0
5  4.00   NaN  11.0
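However, I am looking for a different solution, one where a gap is filled by interpolation only if the number of consecutive NaNs in that particular column is less than or equal to the limit. The result I expect would therefore look like this: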
   col1  col2  col3
0  0.00   NaN   4.0
1   NaN   1.0   5.0
2   NaN   2.0   6.0
3   NaN   3.0   7.0
4  3.00   4.0  10.0
5  4.00   NaN  11.0
The first column is not filled, because it has more than the limit (2) of NaNs in a row.
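For reference, here is a minimal sketch of one way the behaviour described above could probably be obtained: interpolate without any limit, then blank out every position that belongs to a NaN run longer than the limit. The helper names `gap_lengths` and `interpolate_short_gaps` are made up for illustration and are not part of pandas; the sketch also assumes linear interpolation with `limit_area="inside"`, as in the call shown earlier.

```python
import numpy as np
import pandas as pd


def gap_lengths(col: pd.Series) -> pd.Series:
    """For each position, return the length of the NaN run it belongs to (0 for non-NaN)."""
    isna = col.isna()
    # A new run starts wherever the NaN/non-NaN state changes.
    run_id = isna.astype(int).diff().ne(0).cumsum()
    # Summing the boolean flags per run gives the run length for NaN runs, 0 otherwise.
    return isna.groupby(run_id).transform("sum")


def interpolate_short_gaps(df: pd.DataFrame, limit: int = 2) -> pd.DataFrame:
    """Hypothetical helper: fill only interior gaps of at most `limit` consecutive NaNs."""
    # Fill every interior gap first, regardless of its length.
    filled = df.interpolate(method="linear", limit_area="inside")
    # Then restore NaN wherever the original gap was longer than the limit.
    too_long = df.apply(gap_lengths) > limit
    return filled.mask(too_long)


df = pd.DataFrame({'col1': [0, np.nan, np.nan, np.nan, 3, 4],
                   'col2': [np.nan, 1, 2, np.nan, 4, np.nan],
                   'col3': [4, np.nan, np.nan, 7, 10, 11]})

result = interpolate_short_gaps(df, limit=2)
print(result)
```

On the sample frame this should leave the three-NaN run in col1 untouched while still filling the shorter gaps in col2 and col3. The leading and trailing NaNs in col2 stay as they are because of `limit_area="inside"`.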