I expected the code below to avoid splitting inside double quotes, but it splits there anyway:
import csv
from io import StringIO
contents = """
gene "Tagln2"; note "putative; transgelin 2 (MGD|MGI:1312985 GB|BC049861, evidence: BLASTN, 99%, match=1379)"; product "transgelin-2"; protein_id "NP_848713.1"; tag "RefSeq Select"; exon_number "4";
"""
for l in csv.reader(StringIO(contents), delimiter=";", quotechar='"', skipinitialspace=True, quoting=csv.QUOTE_MINIMAL):
    print(l)
Output:
['gene "Tagln2"', 'note "putative', 'transgelin 2 (MGD|MGI:1312985 GB|BC049861, evidence: BLASTN, 99%, match=1379)"', 'product "transgelin-2"', 'protein_id "NP_848713.1"', 'tag "RefSeq Select"', 'exon_number "4"', '']
As you can see, it splits inside the double quotes, so note "putative; transgelin 2"
becomes ['note "putative', 'transgelin 2'].
How can I fix this?
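One possible workaround, offered as a sketch rather than a definitive fix: as far as I understand, the csv module only treats quotechar as a quote when it appears at the very start of a field, and here each field starts with a key such as note, so the quotes are read as ordinary characters. Splitting with a regular expression that treats a quoted run as one unit avoids that. The names FIELD and split_outside_quotes are hypothetical, and the sketch assumes quotes are balanced and never escaped inside a value.

```python
import re

# One field = a run of characters that are neither ';' nor '"',
# mixed with complete double-quoted strings (which may contain ';').
# Assumption: quotes are balanced and there are no escaped quotes.
FIELD = re.compile(r'(?:[^;"]|"[^"]*")+')

def split_outside_quotes(text):
    """Split text on ';' but leave semicolons inside double quotes alone."""
    fields = (m.group().strip() for m in FIELD.finditer(text))
    # Drop empty pieces such as the whitespace after the trailing ';'.
    return [f for f in fields if f]

line = 'gene "Tagln2"; note "putative; transgelin 2"; exon_number "4";'
print(split_outside_quotes(line))
# ['gene "Tagln2"', 'note "putative; transgelin 2"', 'exon_number "4"']
```

Each resulting item can then be split once on the first space to separate the key from its quoted value, for example with item.split(" ", 1).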