I am iterating over the records in a DataFrame and need to convert each row into its own nested JSON object:
from pprint import pprint
pprint(df.to_dict())  # df is the DataFrame in question
{'id':{0: 'A'},
'col1':{0: 'B'},
'address_id':{0: ['123','ABC']},
'address_1':{0: ['Street 123','Street ABC']},
'address_2':{0: ['Road 123','Road ABC']},
'city':{0: ['Dallas','Houston']},
'state':{0: ['Texas','Texas']},
'addition_details':{0: ['XYZ','LMP']},
}
The expected JSON format for each record is shown below. I need help converting the data into this output:
{
'id': 'A',
'col1': 'B',
'address': [{
'address_id': '123',
'address_1': 'Street 123',
'address_2': 'Road 123',
'city': 'Dallas',
'state': 'Texas'
},
{
'address_id': 'ABC',
'address_1': 'Street ABC',
'address_2': 'Road ABC',
'city': 'Houston',
'state': 'Texas'
}
],
'criteria': [{
'addition_details': 'XYZ'
},
{
'addition_details': 'LMP'
},
]
}
I tried combining the address fields:
import json

json_output = (df.groupby(['id', 'col1'])
               .apply(lambda x: x[['address_id', 'address_1', 'address_2', 'city', 'state']].to_dict('list'))
               .reset_index(name='address')
               .to_json(orient='records'))
print(json.dumps(json.loads(json_output), indent=2, sort_keys=True))
However, I am not getting the required output:
[
{
"id":"A",
"col1":"B",
"address":{
"address_id":[
[
'123',
'ABC'
]],
"address_1":[
[
'Street 123',
'Street ABC'
]],
"address_2":[
[
'Road 123',
'Road ABC'
]],
....
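
For reference, here is a minimal sketch of one way the expected layout might be produced. It assumes every list-valued column in a row has the same length, so the lists can be walked in parallel. The names `ADDRESS_COLS` and `row_to_record` are hypothetical helpers introduced for illustration, and the DataFrame is rebuilt from the `to_dict()` dump above. The extra level of list nesting in the attempt probably arises because each cell already holds a list and `to_dict('list')` wraps the column values in another list.

```python
import json
import pandas as pd

# Rebuild the one-row frame from the to_dict() dump in the question.
df = pd.DataFrame({
    'id': ['A'],
    'col1': ['B'],
    'address_id': [['123', 'ABC']],
    'address_1': [['Street 123', 'Street ABC']],
    'address_2': [['Road 123', 'Road ABC']],
    'city': [['Dallas', 'Houston']],
    'state': [['Texas', 'Texas']],
    'addition_details': [['XYZ', 'LMP']],
})

# Hypothetical constant: the columns that belong inside each address entry.
ADDRESS_COLS = ['address_id', 'address_1', 'address_2', 'city', 'state']


def row_to_record(row):
    """Turn one row (a dict of column -> value) into the nested layout."""
    # zip(*lists) walks the parallel lists element by element, so each
    # tuple holds the values for a single address.
    addresses = [dict(zip(ADDRESS_COLS, values))
                 for values in zip(*(row[c] for c in ADDRESS_COLS))]
    criteria = [{'addition_details': d} for d in row['addition_details']]
    return {'id': row['id'], 'col1': row['col1'],
            'address': addresses, 'criteria': criteria}


# One nested dict per DataFrame row; each can be serialised on its own.
records = [row_to_record(r) for r in df.to_dict('records')]
for rec in records:
    print(json.dumps(rec, indent=2))
```

Working from `df.to_dict('records')` keeps each row separate, which matches the one-JSON-per-row requirement without needing `groupby`.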