R: Vectorized solution for iterating over a list of indices and updating the corresponding matrix

Posted on August 23

I have a list of the following form:

xx = list("a_1" = list("A", "C"), 
          "a_2" = list("B", "C"), 
          "a_3" = list("B", "B"), 
          "a_4" = list("C", "B"), 
          "a_5" = list("B", "A"),
          "a_6" = list("B", "A"))

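Note that the list contains repeated sub-lists, such as "a_5" and "a_6" in the example above. Using this list of lists, I want to update a matrix of the form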

m = matrix(data = 0, nrow = 3, ncol = 3)         # initialise matrix of zeros
rownames(m) = c("A", "B", "C")                   # name rows
colnames(m) = c("A", "B", "C")                   # name columns

such that 1 is added to the matrix entry corresponding to each given (row, column) pair. This is easy to do with a for loop:

for (item in xx) {
  
  # add one to the matrix index if item in xx
  m[item[[1]], item[[2]]] = m[item[[1]], item[[2]]] + 1
  
}

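This yields the expected result.

However, this can become cumbersome for large lists. I was hoping to do it in a vectorized way with the apply() family of functions, but I could not get a nested combination of lapply() (to iterate over the list) and sapply() (to update the values in the matrix) to work.

So I would like to know: how can this be done with some variant of apply()?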
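One point worth illustrating before reaching for apply(): a direct matrix-indexed assignment does not accumulate over repeated pairs, so some counting step is needed. The following is a minimal sketch, assuming the small xx and m from the question; the names idx, naive and counted are illustrative and not part of the original post.

```r
xx <- list("a_1" = list("A", "C"), "a_2" = list("B", "C"),
           "a_3" = list("B", "B"), "a_4" = list("C", "B"),
           "a_5" = list("B", "A"), "a_6" = list("B", "A"))
m <- matrix(0, 3, 3, dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# Two-column character matrix: one (row name, column name) pair per row
idx <- do.call(rbind, lapply(xx, unlist))

# Naive attempt: a repeated pair is only incremented once
naive <- m
naive[idx] <- naive[idx] + 1
naive["B", "A"]    # 1, although the pair (B, A) occurs twice

# Counting the pairs first handles duplicates
counted <- m + unclass(table(factor(idx[, 1], levels = rownames(m)),
                             factor(idx[, 2], levels = colnames(m))))
counted["B", "A"]  # 2
```

The explicit factor levels are a design choice in this sketch: they keep the table the same shape as m even when some label never occurs.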

Benchmark

Define the various approaches as functions.

floop <- function(m, xx) {
  for (item in xx) m[item[[1]], item[[2]]] = m[item[[1]], item[[2]]] + 1L
  m
}

fsapply <- function(mm, xx) {
  sapply(xx, \(x) mm[x[[1]], x[[2]]] <<- mm[x[[1]], x[[2]]] + 1L)
  mm
}

faggregate <- function(m, xx) {
  sel <- data.frame(t(sapply(xx, unlist)))
  sel <- aggregate(cbind(sel[0], value=1), sel, FUN=sum)
  m[as.matrix(sel[1:2])] <- m[as.matrix(sel[1:2])] + sel$value
  m
}

fvectorized <- function(m, xx) {
  x <- matrix(unlist(xx), length(xx), 2, 1)
  m + tabulate(
    match(x[,1], row.names(m)) + nrow(m)*(match(x[,2], colnames(m)) - 1L),
    length(m)
  )
}

ftable <- function(m, xx) {
  m + c(table(as.data.frame(matrix(unlist(xx), length(xx), 2, 1))))
}
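The arithmetic inside fvectorized relies on R storing matrices in column-major order: the entry in row i and column j sits at linear position i + nrow(m) * (j - 1), and tabulate() then counts how often each position occurs. A small sketch of that idea on a 3-by-3 matrix follows; the names rows, cols, lin, counts and res are hypothetical and chosen only for illustration.

```r
m <- matrix(0L, 3, 3, dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

rows <- c("A", "B", "B", "C", "B", "B")   # first element of each pair
cols <- c("C", "C", "B", "B", "A", "A")   # second element of each pair

i <- match(rows, rownames(m))    # row positions
j <- match(cols, colnames(m))    # column positions
lin <- i + nrow(m) * (j - 1L)    # column-major linear index

# tabulate() counts occurrences of each integer in 1..length(m)
counts <- tabulate(lin, nbins = length(m))

res <- m + counts   # counts lines up with m in column-major order
res["B", "A"]       # 2
```

Because counts has exactly length(m) entries, adding it to m updates every cell in one step, with no per-pair assignment.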

Create a larger test example:

xx <- lapply(1:1e4, \(i) as.list(sample(LETTERS, 2, 1)))
m <- matrix(0L, 26, 26, 0, rep(list(LETTERS), 2))

Run the benchmark:

microbenchmark::microbenchmark(
  floop = floop(m, xx),
  fsapply = fsapply(m, xx),
  faggregate = faggregate(m, xx),
  fvectorized = fvectorized(m, xx),
  ftable = ftable(m, xx),
  check = "equal"
)
#> Unit: microseconds
#>         expr     min       lq      mean   median       uq     max neval
#>        floop  8611.9  9051.65 11519.122  9877.15 12723.70 54497.1   100
#>      fsapply 15160.5 15813.15 17868.492 17069.90 19383.50 28591.2   100
#>   faggregate 14888.5 15540.90 16733.166 16100.20 17810.55 20828.0   100
#>  fvectorized   910.6  1019.10  1201.803  1078.90  1226.75  5886.9   100
#>       ftable  1297.0  1573.40  1765.658  1691.40  1823.95  4364.7   100
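One caveat that may matter outside this benchmark: the table()-based approach appears to depend on every row and column label occurring at least once in xx, since otherwise the table has fewer cells than m and the counts may no longer line up. A possible variant with explicit factor levels is sketched below; ftable_levels is a hypothetical name, not a function from the original answer.

```r
m <- matrix(0L, 3, 3, dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# "C" never appears as a first element and "A" never as a second
xx <- list(list("A", "B"), list("B", "C"), list("B", "C"))

# Hypothetical variant of ftable that fixes the factor levels up front
ftable_levels <- function(m, xx) {
  x <- matrix(unlist(xx), length(xx), 2, byrow = TRUE)
  m + c(table(factor(x[, 1], levels = rownames(m)),
              factor(x[, 2], levels = colnames(m))))
}

res <- ftable_levels(m, xx)
res["B", "C"]   # 2
```

With the levels taken from the dimnames of m, the table always has one cell per entry of m, at the cost of two factor() calls.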
