Python Pandas: shift the entries of a row to the right (end)

Posted on July 17

I have the following DataFrame (the number of "Date" columns can vary):

  Customer  Date1  Date2  Date3  Date4
0        A     10   40.0    NaN   60.0
1        B     20   50.0    NaN    NaN
2        C     30    NaN    NaN    NaN

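If a row has NaN in the last column (as noted above, the number of columns can vary), I want to shift that row's values to the right, toward the end of the DataFrame, so that it looks like this: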

  Customer  Date1  Date2  Date3  Date4
0        A     10   40.0    NaN   60.0
1        B    NaN    NaN   20.0   50.0
2        C    NaN    NaN    NaN   30.0

Any positions left empty can be set to NaN.

How can I do this in Python?

I tried the following code, but it does not work:

import numpy as np
import pandas as pd

data = {
    'Customer': ['A', 'B', 'C'],
    'Date1': [10, 20, 30],
    'Date2': [40, 50, np.nan],
    'Date3': [np.nan, np.nan, np.nan],
    'Date4': [60, np.nan, np.nan]
}

df = pd.DataFrame(data)


for i in range(1, len(df.columns)):
    df.iloc[:, i] = df.iloc[:, i-1].shift(fill_value=np.nan)

print(df)

Recommended answer

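You can temporarily set the non-target columns as the index (or drop them), then use sorting to push the non-NaN values to the right, updating only the rows that match a specific mask (here, NaN in the last column):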

out = (df
   .set_index('Customer', append=True)
   .pipe(lambda d: d.mask(d.iloc[:, -1].isna(),
                          d.transform(lambda x : sorted(x, key=pd.notnull), axis=1)
                         )
        )
   .reset_index('Customer')
)

Alternative:

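other_cols = ['Customer']
# work only on the value columns
out = df.drop(columns=other_cols)
# rows whose last column is NaN
m = out.iloc[:, -1].isna()
# sorting by pd.notnull puts NaNs first (False < True); the sort is stable, so values keep their order
out.loc[m, :] = out.loc[m, :].transform(lambda x : sorted(x, key=pd.notnull), axis=1)
# re-attach the other columns and restore the original column order
out = df[other_cols].join(out)[df.columns]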

NB. There are several ways to shift non-NaN values; sorting is one of them, but non-sorting-based methods are possible if this becomes a bottleneck.

Output:

  Customer  Date1  Date2  Date3  Date4
0        A   10.0   40.0    NaN   60.0
1        B    NaN    NaN   20.0   50.0
2        C    NaN    NaN    NaN   30.0
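
As a follow-up to the note about non-sorting methods, here is one possible sketch using NumPy boolean masks. It is not part of the original answer: the function name `shift_right` and its `other_cols` parameter are made up for illustration, and it assumes every value column can be converted to float. The idea is to count the non-NaN values per row, build a mask that selects the last positions of each row, and copy the values across in row order.

```python
import numpy as np
import pandas as pd


def shift_right(df, other_cols=("Customer",)):
    """Right-align non-NaN values in rows whose last value column is NaN.

    Hypothetical helper for illustration; assumes numeric value columns.
    """
    other_cols = list(other_cols)
    value_cols = [c for c in df.columns if c not in other_cols]
    vals = df[value_cols].to_numpy(dtype=float)

    present = ~np.isnan(vals)          # True where a value exists
    counts = present.sum(axis=1)       # number of values per row
    n_cols = vals.shape[1]

    # target[i, j] is True for the last counts[i] positions of row i
    target = np.arange(n_cols) >= (n_cols - counts)[:, None]

    # Boolean indexing walks both masks row by row, left to right,
    # so each row's values keep their original order.
    shifted = np.full_like(vals, np.nan)
    shifted[target] = vals[present]

    # Only replace rows whose last value column is NaN
    rows = np.isnan(vals[:, -1])
    result = df.copy()
    result[value_cols] = np.where(rows[:, None], shifted, vals)
    return result


df = pd.DataFrame({
    'Customer': ['A', 'B', 'C'],
    'Date1': [10, 20, 30],
    'Date2': [40, 50, np.nan],
    'Date3': [np.nan, np.nan, np.nan],
    'Date4': [60, np.nan, np.nan],
})

out = shift_right(df)
print(out)
```

This avoids a Python-level call per row, which may help on wide or long frames, though whether it is actually faster would need to be measured on your own data.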