Grouping functions (tapply, by, aggregate) and the *apply family

Posted on August 22

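Whenever I want to do something "map"-like in R, I usually try to use a function from the apply family.

However, I have never fully understood the differences between them: how {sapply, lapply, etc.} apply the function to the input or grouped input, what the output will look like, or even what the input can be. So I often just try each of them in turn until I get what I want.

Can someone explain which one to use when?

My current (probably incorrect/incomplete) understanding is...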

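sapply(vec, f): input is a vector. Output is a vector/matrix, where element i is f(vec[i]); you get a matrix if f has a multi-element output.
lapply(vec, f): same as sapply, but the output is a list?
apply(matrix, 1/2, f): input is a matrix. Output is a vector, where element i is f(row/column i of the matrix).
tapply(vector, grouping, f): output is a matrix/array, where each element is the value of f at a grouping g of the vector, and g is pushed to the row/column names.
by(dataframe, grouping, f): let g be a grouping. Apply f to each column of the group/data frame. Pretty-print the grouping and the value of f at each column.
aggregate(matrix, grouping, f): similar to by, but instead of pretty-printing the output, aggregate puts everything into a data frame.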
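
To make the list above concrete, here is a minimal sketch in base R. The inputs (`vec`, `m`, `df`) and the anonymous functions are hypothetical toy values chosen for illustration, not taken from the question. One point worth checking against the description above: as far as base R goes, `by` passes the whole sub-data-frame of each group to f rather than applying f column by column.

```r
# Toy inputs (hypothetical, for illustration only)
vec <- c(a = 1, b = 2, c = 3)
m   <- matrix(1:6, nrow = 2)                      # 2 rows, 3 columns
df  <- data.frame(g = c("x", "x", "y"), v = c(1, 2, 10))

# sapply simplifies to a vector when f returns a single value ...
s1 <- sapply(vec, function(x) x^2)
# ... and to a matrix (one column per input element) when f returns
# several values of the same length
s2 <- sapply(vec, function(x) c(x, x^2))

# lapply always returns a list, one entry per input element
l1 <- lapply(vec, function(x) x^2)

# apply works over rows (MARGIN = 1) or columns (MARGIN = 2) of a matrix
row_sums <- apply(m, 1, sum)
col_sums <- apply(m, 2, sum)

# tapply applies f to the values of v within each level of g;
# the group labels become the names of the result
t1 <- tapply(df$v, df$g, sum)

# by splits the data frame by group and hands each sub-data-frame to f
b1 <- by(df, df$g, function(d) sum(d$v))

# aggregate computes the same grouped summary but returns a data frame
a1 <- aggregate(v ~ g, data = df, FUN = sum)
```

Printing `s1`, `l1`, `t1`, `b1` and `a1` side by side is a quick way to see how the same computation comes back in different containers.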

Side question: I still haven't learned plyr or reshape. Would plyr or reshape replace all of these entirely?

```r
# Two dimensional matrix
M <- matrix(seq(1,16), 4, 4)

# apply min to rows
apply(M, 1, min)
#> [1] 1 2 3 4

# apply max to columns
apply(M, 2, max)
#> [1]  4  8 12 16

# 3 dimensional array
M <- array( seq(32), dim = c(4,4,2))

# Apply sum across each M[*, , ] - i.e. sum across 2nd and 3rd dimension
apply(M, 1, sum)
# Result is one-dimensional
#> [1] 120 128 136 144

# Apply sum across each M[*, *, ] - i.e. sum across 3rd dimension
apply(M, c(1,2), sum)
# Result is two-dimensional
#>      [,1] [,2] [,3] [,4]
#> [1,]   18   26   34   42
#> [2,]   20   28   36   44
#> [3,]   22   30   38   46
#> [4,]   24   32   40   48
```

```r
x <- list(a = 1, b = 1:3, c = 10:100)

# Note that since the advantage here is mainly speed, this
# example is only for illustration. We're telling R that
# everything returned by length() should be an integer of
# length 1.
vapply(x, FUN = length, FUN.VALUE = 0L)
#>  a  b  c
#>  1  3 91
```
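
Besides speed, `vapply` also checks each result against the template given in `FUN.VALUE`. The sketch below reuses the same list `x`; the use of `range` and the `tryCatch` wrapper are illustrative assumptions added here to show the check failing, not part of the original example.

```r
x <- list(a = 1, b = 1:3, c = 10:100)

# Succeeds: length() returns exactly one integer per element
lens <- vapply(x, FUN = length, FUN.VALUE = integer(1))

# Fails: range() returns two values per element, but FUN.VALUE promises one.
# tryCatch turns the resulting error into a marker string so the script continues.
res <- tryCatch(
  vapply(x, FUN = range, FUN.VALUE = numeric(1)),
  error = function(e) "error"
)
```

With `sapply`, the second call would quietly return a 2 x 3 matrix instead, which is why `vapply` tends to be the safer choice inside functions.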

```r
# Sums the 1st elements, the 2nd elements, etc.
mapply(sum, 1:5, 1:5, 1:5)
#> [1]  3  6  9 12 15

# To do rep(1,4), rep(2,3), etc.
mapply(rep, 1:4, 4:1)
#> [[1]]
#> [1] 1 1 1 1
#>
#> [[2]]
#> [1] 2 2 2
#>
#> [[3]]
#> [1] 3 3
#>
#> [[4]]
#> [1] 4
```
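
The two `mapply` calls above return different containers (a vector and a list) because `mapply` simplifies its result when it can. A small sketch of how to control that, using a hypothetical helper `add2` that is not part of the original example:

```r
# Hypothetical two-argument helper, for illustration only
add2 <- function(x, y) x + y

# Default: results of length 1 are simplified to a vector
m1 <- mapply(add2, 1:3, 4:6)

# SIMPLIFY = FALSE always returns a list, whatever the result lengths
m2 <- mapply(add2, 1:3, 4:6, SIMPLIFY = FALSE)

# Map() is the base R wrapper around mapply that never simplifies
m3 <- Map(add2, 1:3, 4:6)
```

Asking for a list explicitly avoids surprises when the result lengths vary from one call to the next.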

```r
# Append ! to string, otherwise increment
myFun <- function(x){
    if(is.character(x)){
        return(paste(x,"!",sep=""))
    }
    else{
        return(x + 1)
    }
}

# A nested list structure
l <- list(a = list(a1 = "Boo", b1 = 2, c1 = "Eeek"),
          b = 3, c = "Yikes",
          d = list(a2 = 1, b2 = list(a3 = "Hey", b3 = 5)))

# Result is named vector, coerced to character
rapply(l, myFun)

# Result is a nested list like l, with values altered
rapply(l, myFun, how="replace")
```
