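I am trying to cross merge two DataFrames, but restrict the merge so that it only produces combinations within the same group. The pandas documentation says "When performing a cross merge, no column specifications to merge on are allowed". Currently, to achieve this, I use a for loop and concatenate the resulting DataFrames, but is there a more efficient way?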
Example input data:
import pandas as pd
df1 = pd.DataFrame({
    'group': [1, 1, 2, 2],
    'field_a': ['apple', 'pear', 'banana', 'papaya']
})
df2 = pd.DataFrame({
    'group': [1, 1, 2, 2],
    'field_b': ['apple', 'strawberry', 'coconut', 'papaya']
})
Example of desired output:
pd.DataFrame({
    'group': [1, 1, 1, 1, 2, 2, 2, 2],
    'field_a': ['apple', 'apple', 'pear', 'pear', 'banana', 'banana', 'papaya', 'papaya'],
    'field_b': ['apple', 'strawberry', 'apple', 'strawberry', 'coconut', 'papaya', 'coconut', 'papaya']
})
Current approach:
cols = ['group', 'field_a', 'field_b']
# Start from an empty frame with the expected columns
all_possible_matches = pd.DataFrame({
    col: [] for col in cols
})
for group in [1, 2]:
    # Cross merge only the rows that belong to the current group
    combined = df1[df1['group'] == group].merge(
        df2[df2['group'] == group][['field_b']], how='cross')
    all_possible_matches = pd.concat([all_possible_matches, combined])
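For comparison, here is a minimal sketch of one possible loop-free alternative. It assumes that an ordinary inner merge on `group` is acceptable in place of `how='cross'`: when the key repeats on both sides, an inner merge pairs every `df1` row with every `df2` row that shares the same `group` value, which is a cross product within each group. The data is the same as in the example above; nothing here is specific to this dataset beyond the column name `group`.

```python
import pandas as pd

df1 = pd.DataFrame({
    'group': [1, 1, 2, 2],
    'field_a': ['apple', 'pear', 'banana', 'papaya']
})
df2 = pd.DataFrame({
    'group': [1, 1, 2, 2],
    'field_b': ['apple', 'strawberry', 'coconut', 'papaya']
})

# Inner merge on the shared key: each df1 row is paired with every
# df2 row that has the same 'group' value (many-to-many join).
all_possible_matches = df1.merge(df2, on='group', how='inner')
print(all_possible_matches)
```

The result size is the sum over groups of (rows in `df1`) × (rows in `df2`), so very large groups can still produce a large frame. If a particular row order matters, sorting the result explicitly is safer than relying on the merge order.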