Use GroupBy.cumcount together with DataFrame.pivot:
# number the rows within each dataset, then pivot with that counter as the index
out = (anscombe_long.assign(g=anscombe_long.groupby('dataset').cumcount())
       .pivot(index='g', columns='dataset'))
print(out)
x y
dataset I II III IV I II III IV
g
0 10.0 10.0 10.0 8.0 8.04 9.14 7.46 6.58
1 8.0 8.0 8.0 8.0 6.95 8.14 6.77 5.76
2 13.0 13.0 13.0 8.0 7.58 8.74 12.74 7.71
3 9.0 9.0 9.0 8.0 8.81 8.77 7.11 8.84
4 11.0 11.0 11.0 8.0 8.33 9.26 7.81 8.47
5 14.0 14.0 14.0 8.0 9.96 8.10 8.84 7.04
6 6.0 6.0 6.0 8.0 7.24 6.13 6.08 5.25
7 4.0 4.0 4.0 19.0 4.26 3.10 5.39 12.50
8 12.0 12.0 12.0 8.0 10.84 9.13 8.15 5.56
9 7.0 7.0 7.0 8.0 4.82 7.26 6.42 7.91
10 5.0 5.0 5.0 8.0 5.68 4.74 5.73 6.89
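To see why the counter is needed, here is a minimal sketch on a hypothetical two-dataset frame (the name long_df and its values are made up for illustration, not taken from the question). cumcount numbers the rows inside each group from 0, so every (g, dataset) pair is unique and pivot has an unambiguous cell for each value:

```python
import pandas as pd

# Hypothetical miniature of a long-format frame; the values are illustrative only.
long_df = pd.DataFrame({
    "dataset": ["I", "I", "II", "II"],
    "x": [10.0, 8.0, 10.0, 8.0],
    "y": [8.04, 6.95, 9.14, 8.14],
})

# cumcount numbers the rows within each group, starting at 0.
counter = long_df.groupby("dataset").cumcount()

# With the counter as the index, each (g, dataset) pair occurs exactly once,
# so pivot can spread x and y across one column per dataset.
wide = long_df.assign(g=counter).pivot(index="g", columns="dataset")
print(wide)
```

Without such a counter there is no column that identifies a row within its dataset, which is what pivot needs for its index.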
Then convert the Roman numerals to integers in a list comprehension:
# pip install roman
import roman

# each column is a tuple such as ('x', 'IV'); join it into 'x4'
out.columns = [f'{a}{roman.fromRoman(b)}' for a, b in out.columns]
print(out)
x1 x2 x3 x4 y1 y2 y3 y4
g
0 10.0 10.0 10.0 8.0 8.04 9.14 7.46 6.58
1 8.0 8.0 8.0 8.0 6.95 8.14 6.77 5.76
2 13.0 13.0 13.0 8.0 7.58 8.74 12.74 7.71
3 9.0 9.0 9.0 8.0 8.81 8.77 7.11 8.84
4 11.0 11.0 11.0 8.0 8.33 9.26 7.81 8.47
5 14.0 14.0 14.0 8.0 9.96 8.10 8.84 7.04
6 6.0 6.0 6.0 8.0 7.24 6.13 6.08 5.25
7 4.0 4.0 4.0 19.0 4.26 3.10 5.39 12.50
8 12.0 12.0 12.0 8.0 10.84 9.13 8.15 5.56
9 7.0 7.0 7.0 8.0 4.82 7.26 6.42 7.91
10 5.0 5.0 5.0 8.0 5.68 4.74 5.73 6.89
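If installing a third-party package is not an option, the conversion can be sketched in plain Python. The helper roman_to_int below is a hypothetical name, not part of the roman package, and it assumes well-formed numerals (it does not reject malformed input such as 'IIII'):

```python
def roman_to_int(numeral: str) -> int:
    """Convert a well-formed Roman numeral to an integer (no validation)."""
    values = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}
    total = 0
    for i, ch in enumerate(numeral):
        v = values[ch]
        # A smaller symbol placed before a larger one is subtracted (e.g. IV = 4).
        if i + 1 < len(numeral) and v < values[numeral[i + 1]]:
            total -= v
        else:
            total += v
    return total


# Column tuples shaped like the ones produced by the pivot step.
cols = [("x", "I"), ("x", "II"), ("x", "III"), ("x", "IV"),
        ("y", "I"), ("y", "II"), ("y", "III"), ("y", "IV")]
flat = [f"{a}{roman_to_int(b)}" for a, b in cols]
print(flat)
```

In the real frame the same comprehension would run over out.columns instead of the hard-coded cols list.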
If the set of dataset labels is always known in advance, a mapping dictionary can be used in place of the roman step:
# explicit mapping from each dataset label to its number
d = {'I': 1, 'II': 2, 'III': 3, 'IV': 4}
out.columns = [f'{a}{d[b]}' for a, b in out.columns]
print(out)
x1 x2 x3 x4 y1 y2 y3 y4
g
0 10.0 10.0 10.0 8.0 8.04 9.14 7.46 6.58
1 8.0 8.0 8.0 8.0 6.95 8.14 6.77 5.76
2 13.0 13.0 13.0 8.0 7.58 8.74 12.74 7.71
3 9.0 9.0 9.0 8.0 8.81 8.77 7.11 8.84
4 11.0 11.0 11.0 8.0 8.33 9.26 7.81 8.47
5 14.0 14.0 14.0 8.0 9.96 8.10 8.84 7.04
6 6.0 6.0 6.0 8.0 7.24 6.13 6.08 5.25
7 4.0 4.0 4.0 19.0 4.26 3.10 5.39 12.50
8 12.0 12.0 12.0 8.0 10.84 9.13 8.15 5.56
9 7.0 7.0 7.0 8.0 4.82 7.26 6.42 7.91
10 5.0 5.0 5.0 8.0 5.68 4.74 5.73 6.89
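The dictionary does not have to be typed by hand. As a rough sketch, it can be derived from the labels themselves, assuming the labels first appear in the long data in the order they should be numbered (long_df and its contents are hypothetical here):

```python
import pandas as pd

# Hypothetical long-format column; only the order of first appearance matters.
long_df = pd.DataFrame({"dataset": ["I", "I", "II", "II", "III", "IV", "IV"]})

# drop_duplicates keeps the first occurrence of each label, preserving order.
labels = long_df["dataset"].drop_duplicates().tolist()

# Number the labels from 1 in that order.
d = {label: i for i, label in enumerate(labels, start=1)}
print(d)

# The mapping is then used exactly like the hand-written dictionary.
flat = [f"{a}{d[b]}" for a, b in [("x", "I"), ("y", "IV")]]
print(flat)
```

This only works when the appearance order matches the intended numbering; if it does not, the explicit dictionary is the safer choice.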