After some research, and after trying sub and gsub, I have not found what I am looking for.
Input:
structure(list(submitter_id = c("TCGA-B6-A0RH-01A-21R-A115-07",
"TCGA-BH-A1FU-11A-23R-A14D-07", "TCGA-BH-A1FU-01A-11R-A14D-07",
"TCGA-AR-A0TX-01A-11R-A084-07", "TCGA-A1-A0SE-01A-11R-A084-07",
"TCGA-BH-A1FC-11A-32R-A13Q-07", "TCGA-OL-A5D6-01A-21R-A27Q-07",
"TCGA-E2-A1IK-01A-11R-A144-07", "TCGA-AC-A2FM-11B-32R-A19W-07",
"TCGA-AN-A0FT-01A-11R-A034-07"), sample_type = c("Primary Tumor",
"Solid Tissue Normal", "Primary Tumor", "Primary Tumor", "Metastatic",
"Solid Tissue Normal", "Primary Tumor", "Primary Tumor", "Solid Tissue Normal",
"Primary Tumor")), row.names = c(NA, 10L), class = "data.frame")
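What I want to do: if a string contains "Tumor" or "Normal", keep only the word "Tumor" or "Normal" and remove everything else. In addition, I only want to select the rows whose sample_type contains "Tumor" or "Normal".
Expected output: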
structure(list(submitter_id = c("TCGA-B6-A0RH-01A-21R-A115-07",
"TCGA-BH-A1FU-11A-23R-A14D-07", "TCGA-BH-A1FU-01A-11R-A14D-07",
"TCGA-AR-A0TX-01A-11R-A084-07", "TCGA-BH-A1FC-11A-32R-A13Q-07",
"TCGA-OL-A5D6-01A-21R-A27Q-07", "TCGA-E2-A1IK-01A-11R-A144-07",
"TCGA-AC-A2FM-11B-32R-A19W-07", "TCGA-AN-A0FT-01A-11R-A034-07"
), sample_type = c("Tumor", "Normal", "Tumor", "Tumor", "Normal",
"Tumor", "Tumor", "Normal", "Tumor")), row.names = c(NA, 9L), class = "data.frame")
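For reference, the transformation described above could be sketched in base R roughly as follows. This is only a minimal sketch under a few assumptions: the names df, keep, and result are illustrative, and the pattern "Tumor|Normal" assumes each sample_type contains at most one of the two words. It matches by pattern rather than by position, so the differing string lengths should not matter.

```r
# Input data from the question; `df` is an illustrative name.
df <- data.frame(
  submitter_id = c("TCGA-B6-A0RH-01A-21R-A115-07", "TCGA-BH-A1FU-11A-23R-A14D-07",
                   "TCGA-BH-A1FU-01A-11R-A14D-07", "TCGA-AR-A0TX-01A-11R-A084-07",
                   "TCGA-A1-A0SE-01A-11R-A084-07", "TCGA-BH-A1FC-11A-32R-A13Q-07",
                   "TCGA-OL-A5D6-01A-21R-A27Q-07", "TCGA-E2-A1IK-01A-11R-A144-07",
                   "TCGA-AC-A2FM-11B-32R-A19W-07", "TCGA-AN-A0FT-01A-11R-A034-07"),
  sample_type = c("Primary Tumor", "Solid Tissue Normal", "Primary Tumor",
                  "Primary Tumor", "Metastatic", "Solid Tissue Normal",
                  "Primary Tumor", "Primary Tumor", "Solid Tissue Normal",
                  "Primary Tumor"),
  stringsAsFactors = FALSE
)

# Assumed pattern: the two words to keep.
pattern <- "Tumor|Normal"

# Keep only the rows whose sample_type mentions "Tumor" or "Normal".
keep <- grepl(pattern, df$sample_type)
result <- df[keep, , drop = FALSE]

# Extract just the matched word, independent of where it sits in the string.
result$sample_type <- regmatches(result$sample_type,
                                 regexpr(pattern, result$sample_type))

# Renumber the rows 1..n after dropping the non-matching ones.
rownames(result) <- NULL

print(result)
```

An alternative with the same intent would be sub(".*(Tumor|Normal).*", "\\1", x) applied after the filtering step; regmatches is used here only because it avoids writing the surrounding ".*" parts.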
Thank you.
I tried gsub, sub, and substr, but they failed because the strings have different lengths.