I have a file containing the following lines:

/home/Plugins/file1 e:222 k:dir (327/1)
/home/Plugins/file2 e:100 k:dir (326/1)

I want to extract the path and the element id from each line.

import logging

with open('output_file.txt', 'r') as output_file:
    for line in output_file:
        file_path = line.split()[0]
        eId = line.split()[1].split(":")[1]
        logging.info("file path:"+file_path)
        logging.info("eId:"+eId)

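The problem is that the path (the first element) may itself contain spaces, because folders and files on disk are often created with spaces in their names (a common situation).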

/home/tools/AMS Provider/file3.txt e:224 k:dir (127/1)
/home/account validator e:227 k:dir (247/1)

So the path is always the first element, but sometimes it contains spaces. My script above fails on these examples:

AMS Provider (subfolder name)

account validator (file name at the end of the path)
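
To make the failure concrete, here is a minimal sketch that applies the same `split()` logic to the sample lines. The helper name `naive_parse` and the list `sample_lines` are made up for illustration. The first line parses, while the other two raise `IndexError`, because the second whitespace-separated field is then `Provider/file3.txt` or `validator` rather than `e:<id>`.

```python
def naive_parse(line):
    """Mimic the script above: keep the first two whitespace-separated fields."""
    fields = line.split()
    file_path = fields[0]
    e_id = fields[1].split(":")[1]  # IndexError if fields[1] contains no ":"
    return file_path, e_id


sample_lines = [
    "/home/Plugins/file1 e:222 k:dir (327/1)",
    "/home/tools/AMS Provider/file3.txt e:224 k:dir (127/1)",
    "/home/account validator e:227 k:dir (247/1)",
]

for line in sample_lines:
    try:
        print("parsed: %r" % (naive_parse(line),))
    except IndexError:
        # The path was cut at its first space, so fields[1] is not "e:<id>".
        print("failed: path truncated to %r" % line.split()[0])
```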

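In these cases the path contains spaces (in a subfolder name, but also in the file name at the end of the path). How can I still retrieve the file's path?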

Note: unfortunately, I can only use Python 2.7 on the server.

Recommended answer

I would use a regular expression:

# coding=utf8
# the above tag defines encoding for this document and is for Python 2.x compatibility

import re

regex = r"(.*) (e:\d*) (k:.*) \((\d{3}/\d)\)$"

test_str = ("/home/Plugins/file1 e:222 k:dir (327/1)\n"
            "/home/Plugins/file2 e:100 k:dir (326/1)\n"
            "/home/tools/AMS Provider/file3.txt e:224 k:dir (127/1)\n"
            "/home/account validator e:227 k:dir (247/1)")

matches = re.finditer(regex, test_str, re.MULTILINE)

for matchNum, match in enumerate(matches, start=1):
    
    print ("Match {matchNum} was found at {start}-{end}: {match}".
           format(matchNum = matchNum, start = match.start(),
                  end = match.end(), match = match.group()))
    
    for groupNum in range(1, len(match.groups()) + 1):
        
        print ("Group {groupNum} found at {start}-{end}: {group}".
               format(groupNum = groupNum, start = match.start(groupNum),
                      end = match.end(groupNum), group = match.group(groupNum)))

# Note: for Python 2.7 compatibility, use ur"" to prefix the regex
#       and u"" to prefix the test string and substitution.

Output:

Match 1 was found at 0-39: /home/Plugins/file1 e:222 k:dir (327/1)
Group 1 found at 0-19: /home/Plugins/file1
Group 2 found at 20-25: e:222
Group 3 found at 26-31: k:dir
Group 4 found at 33-38: 327/1
Match 2 was found at 40-79: /home/Plugins/file2 e:100 k:dir (326/1)
Group 1 found at 40-59: /home/Plugins/file2
Group 2 found at 60-65: e:100
Group 3 found at 66-71: k:dir
Group 4 found at 73-78: 326/1
Match 3 was found at 80-134: /home/tools/AMS Provider/file3.txt e:224 k:dir (127/1)
Group 1 found at 80-114: /home/tools/AMS Provider/file3.txt
Group 2 found at 115-120: e:224
Group 3 found at 121-126: k:dir
Group 4 found at 128-133: 127/1
Match 4 was found at 135-178: /home/account validator e:227 k:dir (247/1)
Group 1 found at 135-158: /home/account validator
Group 2 found at 159-164: e:227
Group 3 found at 165-170: k:dir
Group 4 found at 172-177: 247/1
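
To get from the demo above to the path and element id asked for in the question, the same pattern can be applied line by line while reading the file. This is only a sketch: the names `LINE_RE`, `parse_line`, and `parse_file` are hypothetical, and the pattern is copied unchanged from the answer, so it still assumes the trailing field looks like three digits, a slash, and one digit, as in the sample lines. It avoids Python 3-only syntax and should run on Python 2.7 as well.

```python
import re

# Pattern copied from the answer above: path, "e:<id>", "k:<kind>", "(nnn/n)".
LINE_RE = re.compile(r"(.*) (e:\d*) (k:.*) \((\d{3}/\d)\)$")


def parse_line(line):
    """Return (file_path, element_id), or None when the line does not match."""
    match = LINE_RE.match(line.rstrip("\r\n"))
    if match is None:
        return None
    file_path = match.group(1)
    element_id = match.group(2).split(":", 1)[1]  # "e:224" -> "224"
    return file_path, element_id


def parse_file(filename):
    """Collect (file_path, element_id) pairs from every matching line of a file."""
    results = []
    with open(filename, "r") as handle:
        for line in handle:
            parsed = parse_line(line)
            if parsed is not None:
                results.append(parsed)
    return results
```

Calling `parse_file('output_file.txt')` would then return a list of pairs that can be passed to `logging.info` as in the original script. Lines that do not match are skipped silently here; logging them instead may be preferable.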

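A different technique from the regular expression, not part of the answer above, is to split from the right. This sketch assumes that the last three fields (`e:...`, `k:...`, and the parenthesised counts) never contain whitespace, which holds for the sample lines but is not confirmed for the real file. The function name `split_from_right` is hypothetical.

```python
def split_from_right(line):
    """Split off the last three whitespace-separated fields; the rest is the path."""
    # rsplit(None, 3) splits on runs of whitespace, at most 3 times, from the right.
    # Unpacking raises ValueError if the line has fewer than four fields.
    file_path, e_field, k_field, counts = line.strip().rsplit(None, 3)
    element_id = e_field.split(":", 1)[1]  # "e:227" -> "227"
    return file_path, element_id


for sample in ("/home/Plugins/file1 e:222 k:dir (327/1)",
               "/home/account validator e:227 k:dir (247/1)"):
    print("%s | %s" % split_from_right(sample))
```

Because everything to the left of the last three fields is kept as one string, spaces inside the path are preserved. `str.rsplit` with a `maxsplit` argument is available in Python 2.7.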