I have a dataframe a:
   TagID        Genre
0      0         rock
1      1          pop
2      2    favorites
3      3  alternative
4      4         love
and a dataframe b:
          Tags
0          154
1  20 35 40 65
I want a result like this:
                        Genre
0                     wjlb-fm
1  chill, rnb, loved, hip hop
Split and explode the Tags column before mapping it against the first dataframe (here df1 is a and df2 is b):
df2['Genre'] = (df2['Tags'].str.split().explode().astype(df1['TagID'].dtype)
.map(df1.set_index('TagID')['Genre'])
.groupby(level=0).agg(', '.join))
print(df2)
# Output
    Tags                 Genre
0      3           alternative
1  1 4 2  pop, love, favorites
Step by step:
# 1. Explode your column
>>> out = df2['Tags'].str.split().explode().astype(df1['TagID'].dtype)
0    3
1    1
1    4
1    2
Name: Tags, dtype: int64
# 2. Match genre by tag id
>>> out = out.map(df1.set_index('TagID')['Genre'])
0    alternative
1            pop
1           love
1      favorites
Name: Tags, dtype: object
# 3. Reshape your dataframe
>>> out = out.groupby(level=0).agg(', '.join)
0             alternative
1    pop, love, favorites
Name: Tags, dtype: object
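For reference, here is a self-contained sketch of the same approach that can be run as-is. The sample frames are rebuilt from the output shown above, not from the asker's full data. The third row with tag 99 is a hypothetical addition to show one possible way of handling IDs that are missing from df1: without the dropna() step, map() would leave NaN for that tag and ', '.join would fail on it.

```python
import pandas as pd

# Sample data reconstructed from the example above.
df1 = pd.DataFrame({
    "TagID": [0, 1, 2, 3, 4],
    "Genre": ["rock", "pop", "favorites", "alternative", "love"],
})
# The third row is hypothetical: tag 99 does not exist in df1.
df2 = pd.DataFrame({"Tags": ["3", "1 4 2", "1 99"]})

# Lookup Series: index is TagID, values are Genre.
lookup = df1.set_index("TagID")["Genre"]

df2["Genre"] = (
    df2["Tags"].str.split()            # "1 4 2" -> ["1", "4", "2"]
    .explode()                         # one tag per row, original index repeated
    .astype(df1["TagID"].dtype)        # match the dtype of the lookup index
    .map(lookup)                       # tag id -> genre, NaN if the id is unknown
    .dropna()                          # assumption: silently skip unknown ids
    .groupby(level=0).agg(", ".join)   # rebuild one string per original row
)
print(df2)
```

Dropping unknown IDs is only one option. If a missing tag should be visible in the result, something like .fillna('unknown') in place of .dropna() would keep a placeholder instead.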