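My problem is similar to "merging-multiple-files-based-on-the-common-column" (https://superuser.com/questions/1245094/merging-multiple-files-based-on-the-common-column). I am very close to a solution, but I am still new to Python and need help adjusting the code so that it joins multiple files. My individual files, with their IDs and columns, look like this:
File1.txt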
id SRR1071717
chr1:15039:-::chr1:15795:- 2
chr1:15948:-::chr1:16606:- 6
File2.txt
id SRR1079830
chr1:11672:+::chr1:12009:+ 10
chr1:11845:+::chr1:12009:+ 7
chrY:9756574:+::chrY:9757796:+ 0
Desired output:
id SRR1071717 SRR1079830
chr1:15039:-::chr1:15795:- 2 0
chr1:15948:-::chr1:16606:- 6 0
chr1:11672:+::chr1:12009:+ 0 10
chr1:11845:+::chr1:12009:+ 0 7
chrY:9756574:+::chrY:9757796:+ 0 0
My code (Matrix.py):
import sys

columns = []
data = {}
ids = set()
for filename in sys.argv[1:]:
    with open(filename, 'rU') as f:
        key = next(f).strip().split()[1]
        columns.append(key)
        data[key] = {}
        for line in f:
            if line.strip():
                id, value = line.strip().split()
                try:
                    data[key][int(id)] = value
                except ValueError as exc:
                    raise ValueError(
                        "Problem in line: '{}' '{}' '{}'".format(
                            id, value, line.rstrip()))
                ids.add(int(id))

print('\t'.join(['ID'] + columns))
for id in sorted(ids):
    line = []
    for column in columns:
        line.append(data[column].get(id, '0'))
    print('\t'.join([str(id)] + line))
I ran the Python code shown above, but it does not work properly (I am new to Python). The current output has only two lines!
python3 matrix.py File*.txt
Current output:
id SRR1071717 SRR1079830
chrY:9756574:+::chrY:9757796:+ 0 0
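For reference, here is a minimal sketch of one way the merge could be adjusted. It assumes that the IDs (such as chr1:15039:-::chr1:15795:-) should be kept as plain strings instead of being passed to int(), and that rows should appear in the order they are first seen across the files, as in the desired output. The function names merge_counts and format_rows are hypothetical and chosen only for illustration. The file is opened in the default text mode because the 'U' flag is no longer accepted by recent Python 3 releases.

```python
import sys


def merge_counts(paths):
    """Read two-column count files and collect them into one table."""
    columns = []  # sample names, one per file, in argument order
    ids = []      # row IDs in first-seen order
    table = {}    # table[row_id][sample] = count (kept as a string)
    for path in paths:
        with open(path) as handle:
            # Header line looks like "id SRR1071717"; keep the sample name.
            sample = next(handle).split()[1]
            columns.append(sample)
            for line in handle:
                fields = line.split()
                if len(fields) != 2:
                    continue  # skip blank or malformed lines
                row_id, value = fields
                if row_id not in table:
                    table[row_id] = {}
                    ids.append(row_id)
                table[row_id][sample] = value
    return columns, ids, table


def format_rows(columns, ids, table):
    """Build tab-separated output lines, filling missing counts with '0'."""
    rows = ["\t".join(["id"] + columns)]
    for row_id in ids:
        counts = [table[row_id].get(sample, "0") for sample in columns]
        rows.append("\t".join([row_id] + counts))
    return rows


if __name__ == "__main__":
    for row in format_rows(*merge_counts(sys.argv[1:])):
        print(row)
```

Saved under a name of your choice, it would be run the same way as the original script, for example with File*.txt as the arguments. The output is tab-separated, matching the '\t'.join calls in the original code.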