如何使用re.group()
方法对从调用input_text
的字符串中提取的每个子串应用所指示的修改过程,然后将它们替换为原始字符串input_text
中的子串.
我想我应该放一个for
循环来迭代名为civil_time_unit_list
的列表中的列表
import re
def numerical_time_corrector(input_text):
hh, mm, am_pm, output = "", "", "", ""
input_text = re.sub(r"([0-9])h.s.", r"\1 ", input_text)
input_text = re.sub(r"([0-9])\s*h.s.", r"\1 ", input_text)
input_text = re.sub(r"([0-9])hs", r"\1 ", input_text)
input_text = re.sub(r"([0-9])\s*hs", r"\1 ", input_text)
civil_time_pattern = r'(\d{1,2})[\s|:]*(\d{0,2})\s*(am|pm)?'
civil_time_unit_list = re.findall(civil_time_pattern, input_text)
print(civil_time_unit_list)
# civil_time_unit_list[list of detected schedules][lists with the 3 elements of each of the schedules]
#process to be applied to all times detected within the input string and stored in the list
#-----------------------------------------
try:
hh = civil_time_unit_list[0][0]
if (hh == ""): hh = "00"
except IndexError: hh = "00"
try:
mm = civil_time_unit_list[0][1]
if (mm == ""): mm = "00"
except IndexError: mm = "00"
try:
am_pm = civil_time_unit_list[0][2]
if (am_pm == ""): am_pm = "am"
except IndexError: am_pm = "am"
if (len(hh) < 2):
hh = "0" + hh
if (len(mm) < 2):
mm = "0" + mm
output = (hh + ":" + mm + " " + am_pm).strip()
#-----------------------------------------
#Here the program should replace this new value in the original input string
#input_text = ""
#return input_text #It should return the new input_string with all values replaced
return output #I am returning the output just to test that it works, since it should return the original string but with the replacements already done.
input_text = "el cine esta abierto hasta 23:30 pm o 01:00 am, 1:00 hs am 1:00 pm, : p m, 1: pm 5: h.s. pm"
input_text = numerical_time_corrector(input_text)
print(repr(input_text))
您可以看到它是如何错误地只打印原始字符串中要更正的一个值,而不是打印原始字符串及其所有更正的值.这是错误的输出:
[('23', '30', 'pm'), ('01', '00', 'am'), ('1', '00', 'am'), ('1', '00', 'pm'), ('1', '', 'pm'), ('5', '', 'pm')]
'23:30 pm'
这是我真正需要的输出:
[('23', '30', 'pm'), ('01', '00', 'am'), ('1', '00', 'am'), ('1', '00', 'pm'), ('1', '', 'pm'), ('5', '', 'pm')]
'el cine esta abierto hasta 23:30 pm o 01:00 am, 01:00 am 01:00 pm, 00:00 pm, 01:00 pm 05:00 pm'
我应该在代码中做什么更改,以便将此修复应用于所有提取的子字符串,然后替换input_text
字符串中的所有内容,然后从函数返回它并将其打印到控制台?