Input Dataframe With TerminalID,TName,XY cordinate, PeopleID

import pandas as pd

data = {
        'TerminalID': ['5','21','21','2','21','2','5','22','22','22','2','32','41','41','42','50','50'],
        'TName': ['AD','AMBO','AMBO','PS','AMBO','PS','AD','AM','AM','AM','PS','BO','BA','BA','BB','AZ','AZ'],
        'xy': ['1.12731,1.153756','0.12731,0.153757','0.12731,0.153757','1.989385,1.201941','0.12731,0.153757','1.989385,1.201941','1.12731,1.153756','2.12731,1.153756','2.12731,1.153756','2.12731,1.153756','1.989385,1.201941','1.989385,1.201941','2.989385,1.201941','2.989385,1.201941','2.989385,3.201941','3.989385,3.201941','3.989385,3.201941'],
        'Pcode': [ 'None','Z014','Z015','Z016','Z017','Z018','None','Z020','Z021','Z022','Z023','Z024','Z025','Z026','Z027','Z028','Z029']
    }

df = pd.DataFrame.from_dict(data)

Out[55]:

output of DF1

   TerminalID TName                 xy Pcode
0           5    AD   1.12731,1.153756  None
1          21  AMBO   0.12731,0.153757  Z014
2          21  AMBO   0.12731,0.153757  Z015
3           2    PS  1.989385,1.201941  Z016
4          21  AMBO   0.12731,0.153757  Z017
5           2    PS  1.989385,1.201941  Z018
6           5    AD   1.12731,1.153756  None
7          22    AM   2.12731,1.153756  Z020
8          22    AM   2.12731,1.153756  Z021
9          22    AM   2.12731,1.153756  Z022
10          2    PS  1.989385,1.201941  Z023
11         32    BO  1.989385,1.201941  Z024
12         41    BA  2.989385,1.201941  Z025
13         41    BA  2.989385,1.201941  Z026
14         42    BB  2.989385,3.201941  Z027
15         50    AZ  3.989385,3.201941  Z028
16         50    AZ  3.989385,3.201941  Z029

DF2,

T\u cap是终端ID的容量要求,T\u load是负载细节,Tcap是运行计数增量,T\u load是终端的实际请求,

data2= {
        'BusID': ['18','18','18','18','18','18','18','18','18'],
        'Tcap': ['0','2','3','6','7','8','10','12','12'],
        'T_Load': ['0','2','1','2','2','1','2','2','0'],
        'TerminalID': [ '5','21','33','2','32','42','41','50','5'],
        
        'TName':['AD','AMBO','AM','PS','BO','BB','BA','AZ','AD']
    }

df2 = pd.DataFrame.from_dict(data2)

Out[59]:

  BusID Tcap T_Load TerminalID TName
0    18    0      0          5    AD
1    18    2      2         21  AMBO
2    18    3      1         33    AM
3    18    6      2          2    PS
4    18    7      2         32    BO
5    18    8      1         42    BB
6    18   10      2         41    BA
7    18   12      2         50    AZ
8    18   12      0          5    AD
    

Data Frame # the Final output requested

输出基于T\U负载连续应变.

data3 = {
        'BusID': ['18','18','18','18','18','18','18','18','18'],
        'Tcap': ['0','2','3','6','7','8','10','12','12'],
        'T_Load': ['0','2','1','3','1','1','2','2','0'],
        'TerminalID': [ '5','21','33','2','32','42','41','50','5'],
        
        'TName':['AD','AMBO','AM','PS','BO','BB','BA','AZ','AD'],
        'Pcode':['None','Z013,Z019','Z020','Z016,Z018,Z023','Z024','Z027','Z025,Z026','Z028,Z029','None']
    }
    
    df3 = pd.DataFrame.from_dict(data3)

Out[61]:

  BusID Tcap T_Load TerminalID TName           Pcode
0    18    0      0          5    AD            None
1    18    2      2         21  AMBO       Z013,Z019
2    18    3      1         33    AM            Z020
3    18    6      3          2    PS  Z016,Z018,Z023
4    18    7      1         32    BO            Z024
5    18    8      1         42    BB            Z027
6    18   10      2         41    BA       Z025,Z026
7    18   12      2         50    AZ       Z028,Z029
8    18   12      0          5    AD            None

谢谢你

推荐答案

您可以 for each TName添加map个聚合字符串:

df2['Plist'] = df2['TName'].map(df.groupby('TName')['Pcode'].agg(','.join))

或者,如果要将多个字符串None替换为单个字符串:

df2['Plist'] = df2['TName'].map(df.groupby('TName')['Pcode']
                                  .agg(lambda x: ','.join(e for e in x if e != 'None'))
                                  .replace('', 'None')
                                )

输出:

  BusID Tcap T_Load TerminalID TName           Plist
0    18    0      0          5    AD            None
1    18    2      2         21  AMBO  Z014,Z015,Z017
2    18    3      1         22    AM  Z020,Z021,Z022
3    18    6      3          2    PS  Z016,Z018,Z023
4    18    7      1         32    BO            Z024
5    18    8      1         42    BB            Z027
6    18   10      2         41    BA       Z025,Z026
7    18   12      2         50    AZ       Z028,Z029
8    18   12      0          5    AD            None
update: limiting the 输出:

然后可以使用正则表达式修剪列,我们可以使用groupby从每个组中的矢量化字符串操作中获益(如果组很少,行很多,这最有意思):

df2['P_list'] = (df2.groupby('T_Load')['P_list']
                    .apply(lambda c: c.str.extract(rf'((?:[^,]+,?){{,{str(c.name)}}})',
                                                   expand=False)
                                    .str.strip(',')
                          )
                    .replace('', 'None')
                )

输出:

  BusID Tcap T_Load TerminalID TName          P_list
0    18    0      0          5    AD            None
1    18    2      2         21  AMBO       Z014,Z015
2    18    3      1         22    AM            Z020
3    18    6      3          2    PS  Z016,Z018,Z023
4    18    7      1         32    BO            Z024
5    18    8      1         42    BB            Z027
6    18   10      2         41    BA       Z025,Z026
7    18   12      2         50    AZ       Z028,Z029
8    18   12      0          5    AD            None

Python-3.x相关问答推荐

Numpy argmin()以查找最近的元组

数据类对象列表的字典获取方法-在数据类列表中查找具有特定变量值的数据类

如何在 python 中将带有时区信息的时间戳转换为 utc 时间

基于组/ID从原始数据框中创建两个子数据框

Pandas groupby 然后 for each 组添加新行

如何在不使用循环的情况下根据另一个数组的索引值将 numpy 数组中不同通道的值设置为零?

是否可以将多个 if 转换为数组?

如何并行化文件下载?

python asyncio - 如何等待取消的屏蔽任务?

在 Python 3 中获取所有超类

在python中,如果一个函数没有return语句,它会返回什么?

将 args、kwargs 传递给 run_in_executor

如何等待 create_task() 创建的任务完成?

Pyodbc:登录超时错误

TypeError: write() 参数必须是 str,而不是字节(Python 3 vs Python 2)

当默认 pip 为 pip2 时,升级 pip3 的正确格式是什么?

如何在python中创建代码对象?

print(... sep='', '\t' ) 是什么意思?

如何在 Python 3.4 中使用 pip 3?

如何使用 Celery 和 Django 将任务路由到不同的队列