Python Pandas: Move values from one column to the appropriate columns

Posted on March 14

My Google-fu has failed me. I have a simple DataFrame that looks like this:

Sample	Subject	Person	Place	Thing
1-1	Janet
1-1	Boston
1-1	Hat
1-2	Chris
1-2	Austin
1-2	Scarf

I'd like the values in the Subject column to move into their appropriate columns, so that I end up with something like this:

Sample	Subject	Person	Place	Thing
1-1	Janet	Janet	Boston	Hat
1-2	Chris	Chris	Austin	Scarf

I've looked at pivot and transpose, but neither seems quite right.

Any ideas would be appreciated! :)

Recommended answer

If the groups are sorted and the pattern is always the same (no missing values), use numpy to reshape:

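cols = ['Person', 'Place', 'Thing']

# keep one 'Sample' label per group (every len(cols)-th row)
out = df.loc[::len(cols), ['Sample']].reset_index(drop=True)

# fold the flat 'Subject' values into rows of len(cols) items
out[cols] = df['Subject'].to_numpy().reshape(-1, len(cols))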
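To try the reshape approach end to end, here is a minimal self-contained sketch. It assumes the frame holds only the 'Sample' and 'Subject' columns with a default integer index, and it adds a length check that is not part of the answer above, so that a group with a missing row fails loudly instead of silently misaligning values.

```python
import pandas as pd

# Rebuild the question's frame (assumption: only 'Sample' and 'Subject'
# are populated, three consecutive rows per Sample).
df = pd.DataFrame({
    'Sample': ['1-1', '1-1', '1-1', '1-2', '1-2', '1-2'],
    'Subject': ['Janet', 'Boston', 'Hat', 'Chris', 'Austin', 'Scarf'],
})

cols = ['Person', 'Place', 'Thing']

# Extra guard (not in the answer): reshape needs a row count that is
# an exact multiple of the number of target columns.
if len(df) % len(cols) != 0:
    raise ValueError('row count is not a multiple of len(cols)')

# One 'Sample' label per group, taken from every len(cols)-th row.
out = df.loc[::len(cols), ['Sample']].reset_index(drop=True)

# Fold the flat 'Subject' values into rows of len(cols) items each.
out[cols] = df['Subject'].to_numpy().reshape(-1, len(cols))

print(out)
```

Note that the guard only catches a total row count that does not divide evenly; it cannot detect two incomplete groups that happen to add up to a multiple of three.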

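For a more generic approach, assuming only that the categories always appear in the same order within a group, identify each row's position within its group with groupby.cumcount, map the positions to the column names, then pivot: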

order = ['Person', 'Place', 'Thing']

# cumcount gives 0, 1, 2 within each Sample; map turns that into a column name
out = (df.assign(col=df.groupby('Sample').cumcount()
                       .map(dict(enumerate(order))))
         .pivot(index='Sample', columns='col', values='Subject')
         .reset_index().rename_axis(columns=None)
      )

A variant using rename:

order = ['Person', 'Place', 'Thing']

# pivot on the integer positions first, then rename 0, 1, 2 to the column names
out = (df.assign(col=df.groupby('Sample').cumcount())
         .pivot(index='Sample', columns='col', values='Subject')
         .rename(columns=dict(enumerate(order)))
         .reset_index().rename_axis(columns=None)
      )

Output:

  Sample Person   Place  Thing
0    1-1  Janet  Boston    Hat
1    1-2  Chris  Austin  Scarf
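As a rough illustration of why the cumcount/pivot route is more forgiving than reshape, the sketch below drops the last row of the second group. The sample data and the extra reindex step are assumptions added here, not part of the answer: reindex simply pins the column order and keeps a column even if no group reaches that position. This only gives sensible results when the missing items are at the end of a group; a missing middle item would shift the later values into the wrong columns.

```python
import pandas as pd

# Hypothetical input: group '1-2' has no third row (no 'Thing').
df = pd.DataFrame({
    'Sample': ['1-1', '1-1', '1-1', '1-2', '1-2'],
    'Subject': ['Janet', 'Boston', 'Hat', 'Chris', 'Austin'],
})

order = ['Person', 'Place', 'Thing']

# Position of each row inside its Sample group: 0, 1, 2, 0, 1
pos = df.groupby('Sample').cumcount()

out = (df.assign(col=pos.map(dict(enumerate(order))))
         .pivot(index='Sample', columns='col', values='Subject')
         .reindex(columns=order)          # fix column order, keep absent ones
         .reset_index().rename_axis(columns=None)
      )

print(out)
```

The missing 'Thing' for sample 1-2 shows up as NaN, whereas the reshape version would either raise or misalign values on the same input.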

Finally, if you really want the "Subject" column, insert it:

out.insert(1, 'Subject', out['Person'])

print(out)

  Sample Subject Person   Place  Thing
0    1-1   Janet  Janet  Boston    Hat
1    1-2   Chris  Chris  Austin  Scarf
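One thing to keep in mind: DataFrame.insert modifies the frame in place and raises a ValueError if the column already exists, so re-running the line above in a notebook cell fails the second time. A small sketch of a guarded version follows; the hard-coded frame stands in for the pivoted result and is only there to make the snippet runnable on its own.

```python
import pandas as pd

# Stand-in for the pivoted result from the steps above.
out = pd.DataFrame({
    'Sample': ['1-1', '1-2'],
    'Person': ['Janet', 'Chris'],
    'Place': ['Boston', 'Austin'],
    'Thing': ['Hat', 'Scarf'],
})

# Insert 'Subject' as the second column only if it is not there yet,
# so the cell can be executed repeatedly without raising.
if 'Subject' not in out.columns:
    out.insert(1, 'Subject', out['Person'])

print(out)
```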