我有一个数据源,看起来大致像这样:
location available sold name local_id more_data
0 1001 - BBB 1 0 Alpha 24 'DJQ3DD3y'
1 1001 - BBB 1 0 Alpha 24 'aB3joXQy'
2 1001 - BBB 1 0 Alpha 24 'ZJve572B'
3 1001 - BBB 1 0 Alpha 24 'DJEkx8Dy'
4 1001 - BBB 1 0 Alpha 24 'VyVaWLYp'
5 1001 - BBB 1 0 Bravo 19 'Rpr7AvVy'
6 1001 - BBB 1 0 Bravo 19 'ZJlO0VmB'
7 1001 - BBB 1 0 Bravo 19 'OBb6NrrB'
8 1001 - BBB 1 0 Bravo 19 'ZJvPPEXy'
9 1001 - BBB 1 0 Charlie 6 'Vy9MMOEy'
10 1001 - BBB 1 0 Charlie 6 'MJ8AALKp'
11 1001 - BBB 1 0 Delta 17 'vpWmN1kB'
12 1001 - BBB 1 0 Delta 17 'DJEb9qQp'
13 1001 - BBB 1 0 Echo 7 'qyZ1zn1p'
14 1001 - BBB 1 0 Echo 7 'bBqaYoMB'
15 1001 - BBB 1 0 Golf 22 'AJgLr9qp'
16 1001 - BBB 1 0 Golf 22 'vBdV57Ap'
17 1001 - BBB 1 0 Golf 22 'VJYxnLZB'
18 1001 - GG 1029 237 Charlie 6 'VJYxnGXB'
19 1001 - GG 1029 237 Charlie 6 'Vy9Mo52y'
20 1001 - GG 1029 237 Delta 17 'aB3zxYWy'
21 1001 - GG 1029 237 Delta 17 'MJ8A3z0p'
22 1001 - GG 1029 237 Echo 7 'YpLMPwNy'
23 1001 - GG 1029 237 Echo 7 '8Bwev1ep'
24 1001 - GG 1029 237 Golf 22 'MJXm6bLp'
25 1001 - GG 1029 237 Golf 22 'oye7XR0J'
26 1001 - GG 1029 237 Golf 22 'vpWmDDYB'
27 1001 - P 873 375 Charlie 6 'DJEbjjkp'
28 1001 - P 873 375 Charlie 6 'aB3z66zy'
29 1001 - P 873 375 Delta 17 'Kp4zrrKB'
30 1001 - P 873 375 Delta 17 'oyxqMMAB'
31 1001 - P 873 375 Echo 7 'zJ1KMMZy'
32 1001 - P 873 375 Echo 7 'ZJlOzz6B'
33 1001 - P 873 375 Foxtrot 20 'YpLMbbay'
34 1001 - P 873 375 Foxtrot 20 'ZJnmzzYB'
35 1001 - P 873 375 Golf 22 'Kp4zr5LB'
36 1001 - P 873 375 Golf 22 'oye7jg8J'
37 1001 - P 873 375 Golf 22 'OBb6jE3B'
38 1002 - GG 37 11 Charlie 6 'EyGMWPbJ'
39 1002 - GG 37 11 Charlie 6 'aB3zOoDy'
40 1002 - GG 37 11 Delta 17 'DJQ4laLB'
41 1002 - GG 37 11 Delta 17 'ZJlOvNXB'
42 1002 - GG 37 11 Echo 7 'Rpr7a8Dy'
43 1002 - GG 37 11 Echo 7 'zJjYNR4B'
44 1002 - GG 37 11 Golf 22 'Vy9MqkRy'
45 1002 - GG 37 11 Golf 22 'oye7Y0kJ'
46 1002 - GG 37 11 Golf 22 '8BweZbnp'
47 1002 - P 1854 826 Charlie 6 'Rpr7Z5by'
48 1002 - P 1854 826 Charlie 6 'vBdVK1Ap'
49 1002 - P 1854 826 Delta 17 '4Jkae8Dy'
50 1002 - P 1854 826 Delta 17 'YpLM3nxy'
51 1002 - P 1854 826 Echo 7 'VB7vD6Py'
52 1002 - P 1854 826 Echo 7 'ZJlOXbzB'
53 1002 - P 1854 826 Foxtrot 20 'MpNqezKJ'
54 1002 - P 1854 826 Foxtrot 20 '9pOWo39p'
55 1002 - P 1854 826 Golf 22 'MJXm5qnp'
56 1002 - P 1854 826 Golf 22 'oy5vxd4B'
57 1002 - P 1854 826 Golf 22 'DJQ4qz3B'
如您所见,available
和sold
列中的数据与location
列相关地呈现,并且与该列中数字和字母的独特组合相关. 我试图做的是按照location
列的第一部分将这些分组(即,"available
1 - XXX"的"available
1"部分),并对available
和sold
的唯一值进行相加,同时保留各行唯一的其他数据,并包括属于键的数字类型,我不想更改.
因此,输出如下所示:
location available sold name local_id more_data
0 1001 1903 612 Alpha 24 'DJQ3DD3y'
1 1001 1903 612 Alpha 24 'aB3joXQy'
2 1001 1903 612 Alpha 24 'ZJve572B'
3 1001 1903 612 Alpha 24 'DJEkx8Dy'
4 1001 1903 612 Alpha 24 'VyVaWLYp'
5 1001 1903 612 Bravo 19 'Rpr7AvVy'
6 1001 1903 612 Bravo 19 'ZJlO0VmB'
7 1001 1903 612 Bravo 19 'OBb6NrrB'
8 1001 1903 612 Bravo 19 'ZJvPPEXy'
9 1001 1903 612 Charlie 6 'Vy9MMOEy'
10 1001 1903 612 Charlie 6 'MJ8AALKp'
11 1001 1903 612 Delta 17 'vpWmN1kB'
12 1001 1903 612 Delta 17 'DJEb9qQp'
13 1001 1903 612 Echo 7 'qyZ1zn1p'
14 1001 1903 612 Echo 7 'bBqaYoMB'
15 1001 1903 612 Golf 22 'AJgLr9qp'
16 1001 1903 612 Golf 22 'vBdV57Ap'
17 1001 1903 612 Golf 22 'VJYxnLZB'
18 1001 1903 612 Charlie 6 'VJYxnGXB'
19 1001 1903 612 Charlie 6 'Vy9Mo52y'
20 1001 1903 612 Delta 17 'aB3zxYWy'
21 1001 1903 612 Delta 17 'MJ8A3z0p'
22 1001 1903 612 Echo 7 'YpLMPwNy'
23 1001 1903 612 Echo 7 '8Bwev1ep'
24 1001 1903 612 Golf 22 'MJXm6bLp'
25 1001 1903 612 Golf 22 'oye7XR0J'
26 1001 1903 612 Golf 22 'vpWmDDYB'
27 1001 1903 612 Charlie 6 'DJEbjjkp'
28 1001 1903 612 Charlie 6 'aB3z66zy'
29 1001 1903 612 Delta 17 'Kp4zrrKB'
30 1001 1903 612 Delta 17 'oyxqMMAB'
31 1001 1903 612 Echo 7 'zJ1KMMZy'
32 1001 1903 612 Echo 7 'ZJlOzz6B'
33 1001 1903 612 Foxtrot 20 'YpLMbbay'
34 1001 1903 612 Foxtrot 20 'ZJnmzzYB'
35 1001 1903 612 Golf 22 'Kp4zr5LB'
36 1001 1903 612 Golf 22 'oye7jg8J'
37 1001 1903 612 Golf 22 'OBb6jE3B'
38 1002 1891 837 Charlie 6 'EyGMWPbJ'
39 1002 1891 837 Charlie 6 'aB3zOoDy'
40 1002 1891 837 Delta 17 'DJQ4laLB'
41 1002 1891 837 Delta 17 'ZJlOvNXB'
42 1002 1891 837 Echo 7 'Rpr7a8Dy'
43 1002 1891 837 Echo 7 'zJjYNR4B'
44 1002 1891 837 Golf 22 'Vy9MqkRy'
45 1002 1891 837 Golf 22 'oye7Y0kJ'
46 1002 1891 837 Golf 22 '8BweZbnp'
47 1002 1891 837 Charlie 6 'Rpr7Z5by'
48 1002 1891 837 Charlie 6 'vBdVK1Ap'
49 1002 1891 837 Delta 17 '4Jkae8Dy'
50 1002 1891 837 Delta 17 'YpLM3nxy'
51 1002 1891 837 Echo 7 'VB7vD6Py'
52 1002 1891 837 Echo 7 'ZJlOXbzB'
53 1002 1891 837 Foxtrot 20 'MpNqezKJ'
54 1002 1891 837 Foxtrot 20 '9pOWo39p'
55 1002 1891 837 Golf 22 'MJXm5qnp'
56 1002 1891 837 Golf 22 'oy5vxd4B'
57 1002 1891 837 Golf 22 'DJQ4qz3B'
我知道我错过了一些超级简单的事情,因为这种问题是Pandas 的主导权.但遗憾的是,这正是我处于学习曲线上的阶段;我希望有人能 bootstrap 我走向正确的方向.