Boost实现为什么比R语言慢510倍优化标题：比较Boost实现和R语言的性能差异

发布于06月10日

我正在构建一个应用程序，计算regularized incomplete beta function很重.这款应用是用C++编写的，可以调用R::pbeta().当我试图将这款应用程序多线程时，来自R::pbeta()的一些警告消息打破了堆栈.

所以我转到了boost::math::ibeta().一切都很好，直到我测量了速度.下面的C++文件whyIsBoostSlower.cpp使用R::pbeta()或boost::math::ibeta()实现正则化的不完全Beta函数.

// [[Rcpp::plugins(cpp17)]]
#include <boost/math/special_functions/beta.hpp>
// [[Rcpp::depends(BH)]]
#include <Rcpp.h>
using namespace Rcpp;


// Compute the regularized incomplete Beta function.
// [[Rcpp::export]]
NumericVector RIBF(NumericVector q, NumericVector a, NumericVector b, 
                   bool useboost = false)
{
  NumericVector rst(q.size());
  for (int i = 0, iend = q.size(); i < iend; ++i)
  {
    if (useboost) rst[i] = boost::math::ibeta( a[i], b[i], q[i] );
    else          rst[i] = R::pbeta( q[i], a[i], b[i], 1, 0 );
  }
  return rst;
}

在R中，我们测量对随机数调用函数300000次的速度:

Rcpp::sourceCpp("whyIsBoostSlower.cpp")


set.seed(123)
N = 300000L
q = runif(N) # Generate quantiles.
a = runif(N, 0, 10) # Generate a in (0, 10)
b = runif(N, 0, 10) # Generate b in (0, 10)


# Use R's pbeta(). This function calls a C wrapper of toms708.c:
#   https://svn.r-project.org/R/trunk/src/nmath/toms708.c
system.time({ Rrst = RIBF(q, a, b, useboost = F) })
# Windows 10 (seconds):
# user  system elapsed 
# 0.11    0.00    0.11 

# RedHat Linux:
# user  system elapsed 
# 0.097   0.000   0.097 


# Use Boost's implementation, which also depends on TOMS 708 by their claim:
#  https://www.boost.org/doc/libs/1_41_0/libs/math/doc/sf_and_dist/html/math_toolkit/special/sf_beta/ibeta_function.html
system.time({ boostRst = RIBF(q, a, b, useboost = T) })
# Windows 10:
# user  system elapsed 
# 0.52    0.00    0.52 

# RedHat Linux:
# user  system elapsed 
# 0.988   0.001   0.993 


range(Rrst - boostRst)
# -1.221245e-15  1.165734e-15

要重现该示例，需要安装R和包Rcpp.在Windows上，用户还需要安装包含GCC发行版的Rtools.优化标志默认为-O2.

R::pbeta()和boost::math::ibeta()都基于ACM TOMS 708，但boost::math::ibeta()在Windows上慢5倍，在Linux上慢10倍.

我认为这可能与在boost::math::ibeta()中设置Policy参数有关，但如何呢？

谢谢!

仅供参考，R-4.2.3/src/nmath/pbeta.c中定义了R::pbeta().R::pbeta()调用在R-4.2.3/src/nmath/toms708.c中定义的bratio()，即https://svn.r-project.org/R/trunk/src/nmath/toms708.c.里面的代码是TOMS 708 Fortran代码的C翻译.翻译是由R的核心团队完成的.

相反，Boost则声明"此实现紧密基于"算法708；"不完全贝塔函数比率的有效位数计算"，DiDonato和Morris，ACM，1992.在boost::math::ibeta()上

// [[Rcpp::depends(BH)]] #include <boost/math/special_functions/beta.hpp> // this also ensure linking with the GSL // [[Rcpp::depends(RcppGSL)]] #include <gsl/gsl_sf_gamma.h> #include <Rcpp.h> using namespace Rcpp; // [[Rcpp::export]] NumericVector bfR(NumericVector a, NumericVector b) { int n = a.size(); NumericVector rst(n); for (int i = 0; i<n; i++) { rst[i] = R::beta(a[i], b[i]); } return rst; } // [[Rcpp::export]] NumericVector bfB(NumericVector a, NumericVector b) { int n = a.size(); NumericVector rst(n); for (int i = 0; i<n; i++) { rst[i] = boost::math::beta( a[i], b[i] ); } return rst; } // [[Rcpp::export]] NumericVector bfG(NumericVector a, NumericVector b) { int n = a.size(); NumericVector rst(n); for (int i = 0; i<n; i++) { rst[i] = gsl_sf_beta( a[i], b[i] ); } return rst; } /*** R set.seed(123) N <- 300000L a <- runif(N, 0, 10) # Generate a in (0, 10) b <- runif(N, 0, 10) # Generate b in (0, 10) summary(bfR(a,b) - bfB(a,b)) summary(bfR(a,b) - bfG(a,b)) microbenchmark::microbenchmark(R = bfR(a, b), Boost = bfB(a, b), GSL = bfG(a, b), times=10) */

> Rcpp::sourceCpp("answer.cpp") > set.seed(123) > N <- 300000L > a <- runif(N, 0, 10) # Generate a in (0, 10) > b <- runif(N, 0, 10) # Generate b in (0, 10) > summary(bfR(a,b) - bfB(a,b)) Min. 1st Qu. Median Mean 3rd Qu. Max. -3.64e-12 0.00e+00 0.00e+00 -5.00e-17 0.00e+00 1.36e-12 > summary(bfR(a,b) - bfG(a,b)) Min. 1st Qu. Median Mean 3rd Qu. Max. -1.18e-11 -4.00e-17 0.00e+00 2.10e-16 0.00e+00 1.09e-11 > microbenchmark::microbenchmark(R = bfR(a, b), Boost = bfB(a, b), GSL = bfG(a, b), times=10) Unit: milliseconds expr min lq mean median uq max neval cld R 44.9314 45.2773 46.9782 46.1237 49.0056 50.6273 10 a Boost 166.0146 167.2552 171.0441 169.5741 175.0108 180.6520 10 b GSL 58.3259 58.5101 61.0364 59.6556 62.4862 67.5316 10 c >

Boost实现为什么比R语言慢510倍优化标题：比较Boost实现和R语言的性能差异

推荐答案

Code

Output

R相关问答推荐

在ComplexHeatmap中，如何更改anno_barplot()标题的Angular ？

将Multilinetring合并到一个线串中，使用sf生成规则间隔的点

从多个前置日期中获取最长日期

根据模式将一列拆分为多列，并在R中进行拆分

在特定Quarto(reveal.js)幻灯片上隐藏徽标

如何将移除事件分配给动态创建的按钮？

在另存为PNG之前隐藏htmlwidget绘图元素

如果可能，将数字列转换为整数，否则保留为数字

线性模型斜率在减少原始数据时提供NA

用两种 colored颜色填充方框图

从多层嵌套列表构建Tibble？

如何在使用箭头R包(箭头：：OPEN_DATASSET)和dplyr谓词时编写具有整齐计算的函数？

'使用`purrr：：pwalk`从嵌套的嵌套框架中的列表列保存ggplots时出现未使用的参数错误

减少雨云面之间的间距并绘制所有统计数据点

为什么将负值向量提升到分数次方会得到NaN

排序R矩阵的行和列

R dplyr：：带有名称注入(LHS of：=)的函数，稍后在：=的RHS上引用

具有由向量定义的可变步长的序列

Ggplot2：添加更多特定 colored颜色的线条

图中显示错误 colored颜色的图例geom_sf