I have a list of nested dictionaries (each item is a single-key dictionary whose value is another dictionary) that looks like this:

lis = [{'Health and Welfare Plan + Change Notification': {'evidence_capture': 'null',
   'test_result_justification': 'null',
   'latest_test_result_date': 'null',
   'last_updated_by': 'null',
   'test_execution_status': 'Not Started',
   'test_result': 'null'}},
 {'Health and Welfare Plan + Computations': {'evidence_capture': 'null',
   'test_result_justification': 'null',
   'latest_test_result_date': 'null',
   'last_updated_by': 'null',
   'test_execution_status': 'Not Started',
   'test_result': 'null'}},
 {'Health and Welfare Plan + Data Agreements': {'evidence_capture': 'null',
   'test_result_justification': 'Due to the Policy',
   'latest_test_result_date': '2019-10-02',
   'last_updated_by': 'null',
   'test_execution_status': 'In Progress',
   'test_result': 'null'}},
 {'Health and Welfare Plan + Data Elements': {'evidence_capture': 'null',
   'test_result_justification': 'xxx',
   'latest_test_result_date': '2019-10-02',
   'last_updated_by': 'null',
   'test_execution_status': 'In Progress',
   'test_result': 'null'}},
 {'Health and Welfare Plan + Data Quality Monitoring': {'evidence_capture': 'null',
   'test_result_justification': 'xxx',
   'latest_test_result_date': '2019-08-09',
   'last_updated_by': 'null',
   'test_execution_status': 'Completed',
   'test_result': 'xxx'}},
 {'Health and Welfare Plan + HPU Source Reliability': {'evidence_capture': 'null',
   'test_result_justification': 'xxx.',
   'latest_test_result_date': '2019-10-02',
   'last_updated_by': 'null',
   'test_execution_status': 'In Progress',
   'test_result': 'null'}},
 {'Health and Welfare Plan + Lineage': {'evidence_capture': 'null',
   'test_result_justification': 'null',
   'latest_test_result_date': 'null',
   'last_updated_by': 'null',
   'test_execution_status': 'Not Started',
   'test_result': 'null'}},
 {'Health and Welfare Plan + Metadata': {'evidence_capture': 'null',
   'test_result_justification': 'Valid',
   'latest_test_result_date': '2020-07-02',
   'last_updated_by': 'null',
   'test_execution_status': 'Completed',
   'test_result': 'xxx'}},
 {'Health and Welfare Plan + Usage Reconciliation': {'evidence_capture': 'null',
   'test_result_justification': 'Test out of scope',
   'latest_test_result_date': '2019-10-02',
   'last_updated_by': 'null',
   'test_execution_status': 'In Progress',
   'test_result': 'null'}}]

I would like to convert the list into a dataframe that looks like this:

                        evidence_capture last_updated_by latest_test_result_date test_execution_status test_result test_result_justification            test_category
Change Notification                 null            null                    null           Not Started        null                      null  Health and Welfare Plan
Computations                        null            null                    null           Not Started        null                      null  Health and Welfare Plan
Data Agreements                     null            null              2019-10-02           In Progress        null         Due to the Policy  Health and Welfare Plan
Data Elements                       null            null              2019-10-02           In Progress        null                       xxx  Health and Welfare Plan
Data Quality Monitoring             null            null              2019-08-09             Completed         xxx                       xxx  Health and Welfare Plan
HPU Source Reliability              null            null              2019-10-02           In Progress        null                      xxx.  Health and Welfare Plan
Lineage                             null            null                    null           Not Started        null                      null  Health and Welfare Plan
Metadata                            null            null              2020-07-02             Completed         xxx                     Valid  Health and Welfare Plan
Usage Reconciliation                null            null              2019-10-02           In Progress        null         Test out of scope  Health and Welfare Plan

My code builds the dataframe with a for-loop that concatenates the records column by column. It then cleans up the column names and transposes the result. In the final output, the repeated string "Health and Welfare Plan" is removed from each row index and added as a new column instead.

import pandas as pd

df3 = pd.DataFrame(lis[0])
for i in range(1, len(lis)):
    df3 = pd.concat([df3, pd.DataFrame(lis[i])], axis=1)
df3.columns = [col.split(' + ')[1] for col in df3.columns]
df3 = df3.T
df3['test_category'] = 'Health and Welfare Plan'
print(df3)

The code produces the desired output, but it relies on two "expensive" operations: a for-loop and repeated pd.concat calls. Is there a better way to get the same result?

Recommended answer

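Use a dict comprehension to flatten the list of dictionaries into a single dictionary keyed by the part of each name after ' + ', then transpose the result: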

pd.DataFrame({k.split(' + ')[1]: v for d in lis for k, v in d.items()}).T

                        evidence_capture test_result_justification latest_test_result_date last_updated_by test_execution_status test_result
Change Notification                 null                      null                    null            null           Not Started        null
Computations                        null                      null                    null            null           Not Started        null
Data Agreements                     null         Due to the Policy              2019-10-02            null           In Progress        null
Data Elements                       null                       xxx              2019-10-02            null           In Progress        null
Data Quality Monitoring             null                       xxx              2019-08-09            null             Completed         xxx
HPU Source Reliability              null                      xxx.              2019-10-02            null           In Progress        null
Lineage                             null                      null                    null            null           Not Started        null
Metadata                            null                     Valid              2020-07-02            null             Completed         xxx
Usage Reconciliation                null         Test out of scope              2019-10-02            null           In Progress        null
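
The output above does not yet include the test_category column from the question. One possible way to get it in the same pass, without a transpose, is sketched below. It assumes every key contains exactly one ' + ' separator, and the names sample, names, and records are made up for this illustration (sample is a shortened stand-in for lis).

```python
import pandas as pd

# Shortened stand-in for `lis` from the question (hypothetical sample data).
sample = [
    {'Health and Welfare Plan + Lineage': {
        'test_execution_status': 'Not Started', 'test_result': 'null'}},
    {'Health and Welfare Plan + Metadata': {
        'test_execution_status': 'Completed', 'test_result': 'xxx'}},
]

names, records = [], []
for d in sample:
    for key, fields in d.items():
        # Split 'Category + Name' once; assumes the separator is present.
        category, name = key.split(' + ', 1)
        names.append(name)
        # Copy the inner dict and attach the category as an extra field.
        records.append({**fields, 'test_category': category})

# One row per record, indexed by the name part of the original key.
df = pd.DataFrame(records, index=names)
print(df)
```

Because the category is read from each key rather than hard-coded, this would also cope with a list that mixes several categories.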
