我有一个文本文件,其中的数据由缩进分隔.我想从数据中创建一个数据帧.缩进分隔id,然后在每个id中有不同的集合,每个集合都有其数据,其中的字段如"Main\u id"、"Secondary\u id"、"Type"、"Value"、"Start\u Date"、"End Date"将重复"Category"和"Capacity"(如果多次出现).
我需要一个循环,但我不知道如何重复一些数据,而不是其他数据.请问有什么指示/建议吗?
with open('test.txt') as file:
data = file.read()
list_sep=data.split('\i')
list_1=list_sep[1].split('\n')
q_columns=["Id", "set","Main_id", "Secondary_id","Quantity", "Type",
"Value", "Category", "Capacity", "Start Date", "End Date"]
df_q=pd.DataFrame(columns=q_columns)
示例文本文件:
Total ids entered : 3
Id : 6050
set count = 1
set : 256
Main_id : 5677002
Secondary_id : 34248
Quantity : 6
Type : AB12
Value : crypt
Category : DFR
Capacity : 100
Category: BAS
Capacity: 500
Start_Date : 04/01/2022
End_Date : 1/31/2023
Id : 123
set count = 2
set : 456
Main_id : 123456789
Secondary_id : 101112131
Quantity : 4
Type : AB12
Value : crypt
Category : DFR
Capacity : 100
Start_Date : 04/01/2022
End_Date : 1/31/2023
set : 789
Main_id : 675497072432
Secondary_id : 265902648927
Quantity : 1
Type : MAC12
Value : nocl
Category : DFR
Capacity : 200
Quantity : 2
Type : GC56
Value : crypt
Category : BAS
Capacity : 300
Start_Date : 01/01/2023
End_Date : 01/31/2023
Id : 897
set count = 1
set : 956
Main_id : gy4567890
Secondary_id : ky234248
Quantity : 6
Type : MAB13
Value : nocl
Category : gcl
Capacity : 100
Start_Date : 04/01/2022
End_Date : 1/31/2023