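I have two DataFrames that I need to concatenate. Both have a MultiIndex with the same levels, but the levels are in a different order.
The index of the first DataFrame (df) looks like this: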
MultiIndex([(11, 1, 1),
            (11, 1, 2),
            (11, 1, 3),
            ...
            (11, 24, 5),
            (11, 24, 6),
            (11, 24, 7)],
           names=['id_a', 'id_b', 'id_c'], length=168)
The index of the second DataFrame (df2) looks like this:
MultiIndex([(11, 1, 1),
            (11, 2, 1),
            (11, 3, 1),
            (11, 3, 2),
            ...
            (11, 5, 23),
            (11, 6, 23),
            (11, 7, 23),
            (11, 7, 24)],
           names=['id_a', 'id_c', 'id_b'], length=168)
As you can see, the index levels are in a different order.
Now, when I run pd.concat([df, df2]).index.names, I get the following result:
FrozenList(['id_a', None, None])
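To confirm that this is the situation at hand (same level names, different level order), a quick check along these lines might help. This is only a sketch: the helper name levels_match_but_order_differs and the tiny two-row frames are made up for illustration and are not part of my real data.

```python
import pandas as pd


def levels_match_but_order_differs(left, right):
    """Return True if both indexes use the same level names in a different order."""
    left_names = list(left.index.names)
    right_names = list(right.index.names)
    same_names = (
        len(left_names) == len(right_names)
        and set(left_names) == set(right_names)
    )
    return same_names and left_names != right_names


# Hypothetical miniature versions of the two frames described above
left = pd.DataFrame(
    {"value": [1, 2]},
    index=pd.MultiIndex.from_tuples(
        [(11, 1, 1), (11, 1, 2)], names=["id_a", "id_b", "id_c"]
    ),
)
right = pd.DataFrame(
    {"value": [3, 4]},
    index=pd.MultiIndex.from_tuples(
        [(11, 1, 1), (11, 2, 1)], names=["id_a", "id_c", "id_b"]
    ),
)

print(levels_match_but_order_differs(left, right))
```

If the check returns True, the two frames only disagree on level order, which is the case described in this question.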
How to reproduce
import pandas as pd
# Create the first DataFrame
idx = pd.MultiIndex.from_product(
    [['A1', 'A2', 'A3'], ['B1', 'B2', 'B3'], ['C1', 'C2', 'C3']],
    names=['a', 'b', 'c'])
cols = ['2010', '2020']
df = pd.DataFrame(1, index=idx, columns=cols)
# Create the second DataFrame with the index levels in a different order
idx = pd.MultiIndex.from_product(
    [['A1', 'A2', 'A3'], ['C1', 'C2', 'C3'], ['B1', 'B2', 'B3']],
    names=['a', 'c', 'b'])
df2 = pd.DataFrame(2, index=idx, columns=cols)
# Concatenate both DataFrames along the rows
result = pd.concat([df, df2])
Output
> df
          2010  2020
a  b  c
A1 B1 C1     1     1
      C2     1     1
      C3     1     1
   B2 C1     1     1
      C2     1     1
      C3     1     1
   B3 C1     1     1
      C2     1     1
      C3     1     1
A2 B1 C1     1     1
      C2     1     1
      C3     1     1
   B2 C1     1     1
      C2     1     1
      C3     1     1
   B3 C1     1     1
      C2     1     1
      C3     1     1
A3 B1 C1     1     1
      C2     1     1
      C3     1     1
   B2 C1     1     1
      C2     1     1
      C3     1     1
   B3 C1     1     1
      C2     1     1
      C3     1     1
> df2
          2010  2020
a  c  b
A1 C1 B1     2     2
      B2     2     2
      B3     2     2
   C2 B1     2     2
      B2     2     2
      B3     2     2
   C3 B1     2     2
      B2     2     2
      B3     2     2
A2 C1 B1     2     2
      B2     2     2
      B3     2     2
   C2 B1     2     2
      B2     2     2
      B3     2     2
   C3 B1     2     2
      B2     2     2
      B3     2     2
A3 C1 B1     2     2
      B2     2     2
      B3     2     2
   C2 B1     2     2
      B2     2     2
      B3     2     2
   C3 B1     2     2
      B2     2     2
      B3     2     2
> result
          2010  2020
a
A1 B1 C1     1     1
      C2     1     1
      C3     1     1
   B2 C1     1     1
      C2     1     1
      C3     1     1
   B3 C1     1     1
      C2     1     1
      C3     1     1
A2 B1 C1     1     1
      C2     1     1
      C3     1     1
   B2 C1     1     1
      C2     1     1
      C3     1     1
   B3 C1     1     1
      C2     1     1
      C3     1     1
A3 B1 C1     1     1
      C2     1     1
      C3     1     1
   B2 C1     1     1
      C2     1     1
      C3     1     1
   B3 C1     1     1
      C2     1     1
      C3     1     1
A1 C1 B1     2     2
      B2     2     2
      B3     2     2
   C2 B1     2     2
      B2     2     2
      B3     2     2
   C3 B1     2     2
      B2     2     2
      B3     2     2
A2 C1 B1     2     2
      B2     2     2
      B3     2     2
   C2 B1     2     2
      B2     2     2
      B3     2     2
   C3 B1     2     2
      B2     2     2
      B3     2     2
A3 C1 B1     2     2
      B2     2     2
      B3     2     2
   C2 B1     2     2
      B2     2     2
      B3     2     2
   C3 B1     2     2
      B2     2     2
      B3     2     2
> result.index.names
FrozenList(['a', None, None])
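The index level names "b" and "c" are gone, and the rows from df2 are appended with their levels still in (a, c, b) order, so the second and third levels of the result mix "b" and "c" values.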
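One possible workaround, as far as I can tell, is to bring the levels of the second DataFrame into the same order as the first with DataFrame.reorder_levels before concatenating. The sketch below uses a smaller version of the example above (two values per level instead of three), and the name df2_aligned is just for illustration. I am not sure this is the intended way to handle it.

```python
import pandas as pd

cols = ['2010', '2020']

# First DataFrame: levels in the order (a, b, c)
idx = pd.MultiIndex.from_product(
    [['A1', 'A2'], ['B1', 'B2'], ['C1', 'C2']],
    names=['a', 'b', 'c'])
df = pd.DataFrame(1, index=idx, columns=cols)

# Second DataFrame: same levels, but in the order (a, c, b)
idx2 = pd.MultiIndex.from_product(
    [['A1', 'A2'], ['C1', 'C2'], ['B1', 'B2']],
    names=['a', 'c', 'b'])
df2 = pd.DataFrame(2, index=idx2, columns=cols)

# Reorder the levels of df2 to match df, then concatenate
df2_aligned = df2.reorder_levels(list(df.index.names))
result = pd.concat([df, df2_aligned])

print(result.index.names)
```

Note that reorder_levels only rearranges the levels and does not sort the rows, so a sort_index() call afterwards may be wanted if row order matters.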