Defining functions in R using an alternative syntax

Posted on May 4

In my work, I have always written user-defined functions in R like this:

f <- function(x){
  x ^ 2
}

f(10)
# [1] 100

I recently came across another way to define and call a function in R:

(function(x) x ^ 2)(10)

# [1] 100

I wasn't sure what was going on, so after some searching I found a wonderful answer provided by Allan Cameron that even a non-programmer like me could understand:

The R parser recognizes this as meaning "call the function with these arguments".
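To make the quoted explanation concrete, here is a minimal sketch in base R. The names `square`, `named_result`, `anon_result`, `same`, and `no_parens` are illustrative only and do not come from the original post. It shows that a `function(...)` expression evaluates to a function object, which can either be bound to a name or called on the spot.

```r
# A `function(...)` expression evaluates to a function object.
# Named style: bind the object to a name, then call it through that name.
square <- function(x) x^2
named_result <- square(10)

# Anonymous style: wrap the expression in parentheses and call it directly.
anon_result <- (function(x) x^2)(10)

# Both styles compute the same value.
same <- identical(named_result, anon_result)

# The outer parentheses matter. Without them, `(10)` becomes part of the
# function body, so the expression below only defines a function and
# never calls it.
no_parens <- function(x) x^2(10)
is.function(no_parens)
```

In R 4.1 and later, the same immediately-called form can also be written with the backslash shorthand for `function`.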

This clarified what is happening, but not why, or whether I should choose one syntax over the other (aside from personal preference).

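I use user-defined functions (UDFs) a lot in simulations and models that generate large amounts of data and sometimes run for a while, so I am always looking to optimize. Beyond it being just an alternative syntax, I wanted to see whether there is a functional, programmatic, or machine-based reason to write one form rather than the other. After comparing a few simple functions (shown below), it appears that my "usual" way of writing (f <- function(x)(...)) is faster on several types of simplified functions, by roughly a factor of two in two of the three simple examples.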

Beyond syntax and personal preference, is there a reason, use case, or relevant example where the "new-to-me" way of writing a function, (function(x) ...)(...), would be superior to the "usual" way, f <- function(x)(...)?

In other words: why does this option exist, and why would anyone use this syntax?

After searching in a few ways and reading several linked posts, I couldn't find anything; in fact, I could find almost nothing online about the "alternative" syntax.

Functions being compared

# Function 1
f1 <- function(x) {
  x <- as.numeric(x)
  x[x < 10] <- x[x < 10] ^ 2 / pi
  x
}

# Function 2
f2 <- Vectorize(function(x) {
  paste0("num_", 1:x)
  })

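# Function 3
f3 <- Vectorize(function(x) {
  # NOTE: `x < x + 1` is a comparison whose value is discarded, so this `if`
  # has no effect; `x <- x + 1` may have been intended.
  if (x < 10)
    x < x + 1
  while (x < 10) {
    x <- x + 1
  }
  x
})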

Comparison

microbenchmark::microbenchmark(
  `f1` =  f1(c("10", 20, "5")),
  `(function (x)(f1))` = (function(x) {
                            x <- as.numeric(x)
                            x[x < 10] <- x[x < 10] ^ 2 / pi
                            x
                          })(c("10", 20, "5")),
  `f2` =  f2(c(5,10)),
  `(function (x)(f2))` = Vectorize((function(x) paste0("num_", 1:x)))(c(5,10)),
  `f3` = f3(1:15),
  `(function (x)(f3))` = Vectorize((function(x) {
                            if (x < 10)
                              x < x + 1
                            while (x < 10) {
                              x <- x + 1
                            }
                            x
                          }))(1:15),
  times = 1e4
)

Results

Unit: microseconds
               expr    min      lq      mean  median      uq      max neval
                 f1  2.446  3.9700  5.220236  5.1720  6.0460  113.914 10000
 (function (x)(f1))  3.270  5.3725  6.741182  6.6260  7.7385   46.308 10000
                 f2 26.388 30.1885 34.328340 32.7105 37.1725  227.455 10000
 (function (x)(f2)) 53.808 60.6005 71.588443 65.7905 75.2055 5997.770 10000
                 f3 30.294 34.5735 42.077120 37.5160 42.4010 6121.705 10000
 (function (x)(f3)) 58.492 65.1845 78.551417 70.5040 80.6610 6945.062 10000
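One possible reading of these timings, offered as a hedged sketch rather than a definitive diagnosis: in the inline variants, the function object (and, for f2 and f3, the `Vectorize()` wrapper) is rebuilt inside the timed expression on every iteration, while the named variants are built once before timing starts. The counter `builds` and the helper `build_f2` below are hypothetical names introduced only to count how often the wrapper is constructed.

```r
# Hypothetical counter: how many times has the Vectorize() wrapper been built?
builds <- 0
build_f2 <- function() {
  builds <<- builds + 1
  Vectorize(function(x) paste0("num_", 1:x))
}

# Named style: build the wrapper once, then call it repeatedly.
f2 <- build_f2()
for (i in 1:100) f2(c(5, 10))
builds_named <- builds

# Inline style: the wrapper is rebuilt each time the expression is evaluated.
builds <- 0
for (i in 1:100) build_f2()(c(5, 10))
builds_inline <- builds

c(named = builds_named, inline = builds_inline)
```

Under this reading, the gap reflects construction cost inside the timed expression rather than a slower calling convention.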

# `func` is not defined in this excerpt; presumably it is the named
# equivalent of the anonymous function used in `anon`.
bench::mark(
  anon = sapply(mtcars, function(z) sum(z %in% c(4,6,21))),
  named = sapply(mtcars, func),
  iterations = 100000
)
# # A tibble: 2 × 13
#   expression      min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result     memory     time       gc
#   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list>     <list>     <list>     <list>
# 1 anon           27µs     32µs    29343.   12.92KB     9.39 99968    32      3.41s <int [11]> <Rprofmem> <bench_tm> <tibble>
# 2 named        27.1µs   32.6µs    28943.    3.27KB     8.98 99969    31      3.45s <int [11]> <Rprofmem> <bench_tm> <tibble>
