I have the following data frame:

df1 <- structure(list(group = c("KO", "WT", "KO", "KO", "KO", "KO", 
"WT", "KO", "KO", "WT", "WT", "WT", "WT", "WT", "WT", "WT", "WT", 
"WT", "WT", "KO", "KO"), name = c("rike", "rabe", "smake", "rike", 
"rike", "rike", "rabe", "rike", "rike", "due", "rabe", "ene", 
"ene", "due", "ene", "rabe", "due", "rabe", "due", "smake", "kum"
), type = c("C", "A", "A", "A", "C", "B", "A", "B", "B", "A", 
"B", "A", "C", "C", "C", "C", "B", "C", "A", "C", "A"), posit = c(10, 
2, 21, 5, 12, 22, 18, 19, 81, 22, 33, 31, 80, 40, 16, 16, 7, 
9, 26, 27, 7)), row.names = c(NA, -21L), class = "data.frame")

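I want to combine two columns, one character ("type") and one numeric ("posit"), so that each category (letter) is joined to its corresponding position (number), e.g. "A" and "37" become "A37". All type-posit pairs for a given "name" should then be pasted into a new column in ascending order of posit (smallest to largest), separated by ":". The desired output looks like this: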

df2 <-structure(list(group = c("WT", "WT", "WT", "KO", "KO", "KO"), 
    name = c("ene", "due", "rabe", "kum", "rike", "smake"), type_posit = c("C16:A31:C80", 
    "B7:A22:A26:C40", "A2:C9:C16:A18:B33", "A7", "A5:C10:C12:B19:B22:B81", 
    "A21:C27")), class = "data.frame", row.names = c(NA, -6L))

I can achieve this by chaining several dplyr functions and creating intermediate data frames, like this:

library(dplyr)  # provides %>%

df2 <- df1 %>% 
  dplyr::mutate(t_p = paste0(type, posit)) %>% 
  dplyr::arrange(name, posit) %>% 
  dplyr::select(-type, -posit) %>% 
  dplyr::group_by(group, name) %>% 
  dplyr::summarise(tag_pos = paste0(t_p, collapse = ":"))

However, I wonder whether there is a more efficient and/or cleaner way to do this. I want to write clean, understandable code.

Recommended answer

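You need to sort the rows by posit within each group, then summarise each group with paste0(..., collapse = ':'). Because paste0() is vectorised, paste0(type, posit, collapse = ':') builds each type-posit tag and joins the tags in one call, so no intermediate column is needed.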

library(dplyr)

df1 %>%
  group_by(group, name) %>%
  arrange(posit, .by_group = TRUE) %>% 
  summarise(type_posit = paste0(type, posit, collapse = ':'), .groups = 'drop')

# # A tibble: 6 × 3
#   group name  type_posit            
#   <chr> <chr> <chr>                 
# 1 KO    kum   A7                    
# 2 KO    rike  A5:C10:C12:B19:B22:B81
# 3 KO    smake A21:C27               
# 4 WT    due   B7:A22:A26:C40        
# 5 WT    ene   C16:A31:C80           
# 6 WT    rabe  A2:C9:C16:A18:B33
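
For comparison, the same two steps (sort, then collapse per group) can be sketched in base R. This is only a minimal sketch and not part of the answer above: toy is a hypothetical five-row subset of df1, and aggregate() stands in for group_by()/summarise().

```r
# Hypothetical toy data: the "ene" and "smake" rows taken from df1
toy <- data.frame(
  group = c("KO", "KO", "WT", "WT", "WT"),
  name  = c("smake", "smake", "ene", "ene", "ene"),
  type  = c("C", "A", "A", "C", "C"),
  posit = c(27, 21, 31, 80, 16)
)

# Step 1: sort rows by posit so tags are later pasted in ascending order
toy <- toy[order(toy$posit), ]

# Step 2: build one tag per row, then collapse the tags within each group/name pair
toy$type_posit <- paste0(toy$type, toy$posit)
res <- aggregate(type_posit ~ group + name, data = toy,
                 FUN = paste, collapse = ":")
print(res)
```

The dplyr version is arguably easier to read; the base R variant mainly shows that the ordering has to happen before the collapse, whichever tools are used.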
