You have to extract the data manually with a nested (double) list comprehension:
L = [b['Conversion'] for k, v in input['data'].items() for a, b in v.items()]
print (L)
[{'id': '1', 'datetime': '2024-03-26 08:30:00'},
{'id': '50', 'datetime': '2024-03-27 09:00:00'}]
out = pd.json_normalize(L)
print (out)
   id             datetime
0   1  2024-03-26 08:30:00
1  50  2024-03-27 09:00:00
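For readers who want to run the extraction step in isolation, here is a minimal sketch. The question's input is not reproduced in this answer, so the `payload` dictionary below is an assumption reconstructed from the printed outputs (a `data` key wrapping a second `data` key, then one entry per ID). It is named `payload` rather than `input` to avoid shadowing the Python built-in.

```python
# Hypothetical input, reconstructed from the outputs shown in this answer.
payload = {
    "data": {
        "data": {
            "1": {"Conversion": {"id": "1", "datetime": "2024-03-26 08:30:00"}},
            "50": {"Conversion": {"id": "50", "datetime": "2024-03-27 09:00:00"}},
        }
    }
}

# The outer loop walks the wrapper level, the inner loop walks the per-ID
# entries. The keys are not needed, so they are bound to throwaway names.
records = [
    entry["Conversion"]
    for _, group in payload["data"].items()
    for _, entry in group.items()
]
print(records)
```

The result is a plain list of flat dictionaries, which is the shape both `pd.DataFrame` and `pd.json_normalize` accept directly.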
json_normalize is not necessary here; the DataFrame constructor works as well:
out = pd.DataFrame(L)
print (out)
   id             datetime
0   1  2024-03-26 08:30:00
1  50  2024-03-27 09:00:00
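The two constructors agree here because the extracted records are already flat. The sketch below checks that, and then shows where they would diverge. The `meta` field in the second list is purely hypothetical and is not part of the question's data.

```python
import pandas as pd

# Flat records, as produced by the list comprehension above.
flat = [
    {"id": "1", "datetime": "2024-03-26 08:30:00"},
    {"id": "50", "datetime": "2024-03-27 09:00:00"},
]

# For flat dictionaries both approaches build the same frame.
same = pd.DataFrame(flat).equals(pd.json_normalize(flat))
print(same)

# Hypothetical record with one nested dictionary ("meta" is an assumption).
nested = [{"id": "1", "meta": {"source": "web"}}]

# DataFrame keeps the inner dict as a single object column ...
frame_columns = list(pd.DataFrame(nested).columns)
# ... while json_normalize flattens it into dotted column names.
normalized_columns = list(pd.json_normalize(nested).columns)
print(frame_columns)
print(normalized_columns)
```

So json_normalize only earns its keep when the records themselves still contain nested dictionaries.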
Thanks to chepner for another idea, using .values():
out = pd.json_normalize((b['Conversion'] for v in input['data'].values()
for b in v.values()))
print (out)
   id             datetime
0   1  2024-03-26 08:30:00
1  50  2024-03-27 09:00:00
out = pd.DataFrame((b['Conversion'] for v in input['data'].values()
for b in v.values()))
print (out)
   id             datetime
0   1  2024-03-26 08:30:00
1  50  2024-03-27 09:00:00
json_normalize does have a max_level parameter, but it works differently:
Max number of levels (depth of dict) to normalize. If None, normalizes all levels.
out = pd.json_normalize(input['data'], max_level=1)
print (out)
data.1 \
0 {'Conversion': {'id': '1', 'datetime': '2024-0...
data.50
0 {'Conversion': {'id': '50', 'datetime': '2024-...
out = pd.json_normalize(input['data'], max_level=2)
print (out)
data.1.Conversion \
0 {'id': '1', 'datetime': '2024-03-26 08:30:00'}
data.50.Conversion
0 {'id': '50', 'datetime': '2024-03-27 09:00:00'}
out = pd.json_normalize(input['data'], max_level=3)
print (out)
data.1.Conversion.id data.1.Conversion.datetime data.50.Conversion.id \
0 1 2024-03-26 08:30:00 50
data.50.Conversion.datetime
0 2024-03-27 09:00:00
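To compare the three calls side by side without the wrapped console output, the following sketch collects only the column names at each depth. As before, `payload` is an assumed reconstruction of the question's input, inferred from the column names printed above.

```python
import pandas as pd

# Hypothetical input, reconstructed from the column names printed above.
payload = {
    "data": {
        "data": {
            "1": {"Conversion": {"id": "1", "datetime": "2024-03-26 08:30:00"}},
            "50": {"Conversion": {"id": "50", "datetime": "2024-03-27 09:00:00"}},
        }
    }
}

# Collect the column names produced at each flattening depth.
columns_by_level = {
    level: list(pd.json_normalize(payload["data"], max_level=level).columns)
    for level in (1, 2, 3)
}

for level, columns in columns_by_level.items():
    print(level, columns)

# Every variant stays a single wide row; the IDs never become separate rows.
row_counts = [
    len(pd.json_normalize(payload["data"], max_level=level)) for level in (1, 2, 3)
]
print(row_counts)
```

In every case the frame has exactly one row, with the IDs folded into the column names. max_level only controls how deep the flattening goes, which is why the records have to be pulled out with a comprehension first to get one row per ID.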