I have the following .txt file, which I want to read into a pandas DataFrame:
#Date run angle NAME
#----- _______ ________ _______
2023-02-15 10:00:00 120716 -1.75493 4.5x10-4 Al 40um
2023-02-15 10:38:48 120716 -1.75493 JD70-103 50um 0/90 deg
2023-02-15 18:25:41 120723 -0.658 JD70-103 50um 45/135 deg
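I am trying to read the .txt file with a regex separator that matches any amount of whitespace, except when that whitespace is followed by a time ("\d\d:\d\d:\d\d"), "Al", "\d\dum", "deg", or something of the form "[0-999]/[0-999]":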
import pandas as pd
df = pd.read_csv("file.txt", engine='python', sep=r'\s+(?!\d\d:\d\d:\d\d|Al|\d\dum|deg|((\d|\d\d|\d\d\d)\/(\d|\d\d|\d\d\d)))')
For some reason, this creates a DataFrame with three columns of NaN values inserted between each of the columns I actually want:
Date NaN None.1 None.2 run None.3 None.4 None.5 angle ...
0 2023-02-15 10:00:00 NaN NaN NaN 120716 NaN NaN NaN -1.75493 ...
1 2023-02-15 10:38:48 NaN NaN NaN 120716 NaN NaN NaN -1.75493 ...
2 2023-02-15 18:25:41 NaN NaN NaN 120723 NaN NaN NaN -0.658 ...
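Do you know why this happens? My best guess is that the separator is splitting on each of several consecutive spaces separately, but that is not the behavior I would expect from the regex above.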
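One possible explanation, sketched here with the standard `re` module only: the separator contains three capturing groups, and `re.split` inserts the contents of every capturing group into its result for each separator it finds. Because the groups sit inside a negative lookahead, they never hold a match, so each split adds three `None` values. This assumes that pandas' python engine splits lines with semantics equivalent to `re.split`, which I believe is the case but have not verified against the pandas source. The names `capturing`, `non_capturing`, and `line` are illustrative only, and `\d{1,3}/\d{1,3}` is my shorter rewrite of the original alternation, not the pattern from the question.

```python
import re

# One data row from the file above (single spaces, as shown in the question).
line = "2023-02-15 10:38:48 120716 -1.75493 JD70-103 50um 0/90 deg"

# Separator from the question: three capturing groups inside the lookahead.
capturing = r"\s+(?!\d\d:\d\d:\d\d|Al|\d\dum|deg|((\d|\d\d|\d\d\d)\/(\d|\d\d|\d\d\d)))"

# Same intent without capturing groups; \d{1,3} stands in for (\d|\d\d|\d\d\d).
non_capturing = r"\s+(?!\d\d:\d\d:\d\d|Al|\d\dum|deg|\d{1,3}/\d{1,3})"

# re.split appends every capturing group after each separator match;
# groups that did not participate in the match show up as None.
with_groups = re.split(capturing, line)
without_groups = re.split(non_capturing, line)

print(with_groups)
print(without_groups)
```

If that assumption holds, passing the non-capturing pattern as `sep` should stop the extra columns from appearing. The two `#`-prefixed header lines would still need separate handling, which this sketch does not cover.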