Suppose I have a set of functions for data processing, for example:
procA <- function(input){
  cat('\n Now processing #A') # message just to log pipeline flow
  # Actual data processing, may include some diagnostic messaging:
  cat('\n #A: ', dim(input))
  input$procA <- 'procA'
  return(input)
}
procB <- function(input){
  cat('\n Now processing #B') # message just to log pipeline flow
  # Actual data processing, may include some diagnostic messaging:
  cat('\n #B: ', dim(input))
  input$procB <- 'procB'
  return(input)
}
procC <- function(input){
  cat('\n Now processing #C') # message just to log pipeline flow
  # Actual data processing, may include some diagnostic messaging:
  cat('\n #C: ', dim(input))
  input$procC <- 'procC'
  return(input)
}
I combine them into a pipeline, for example:
library(magrittr) # provides the %>% pipe
data(iris)
iris_processed <-
  iris %>%
  procA %>%
  procB %>%
  procC
The messages come out like this:
Now processing #C
Now processing #B
Now processing #A
#A: 150 5
#B: 150 6
#C: 150 7
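Because of lazy evaluation, the "Now processing" log messages are printed in reverse order: each function logs before its input (the result of the previous step) has been evaluated. This makes the pipeline harder to debug. My workaround so far is to add input <- eval(input) at the beginning of each function. Is there a better solution, an established good practice, etc.?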
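For reference, here is a minimal sketch of the same workaround written with base R's force() instead of eval(input). The names step_a, step_b and toy are hypothetical stand-ins for the functions and data above, and nested calls are used in place of the pipe because they have the same lazy-argument behaviour. I am not claiming this is the recommended practice, only that it produces the ordering I expect:

```r
# Hypothetical stand-ins for the pipeline steps in the question.
step_a <- function(input) {
  force(input)  # evaluate the upstream step before logging anything
  cat("Now processing #A\n")
  input$procA <- "procA"
  input
}

step_b <- function(input) {
  force(input)  # same idea: the argument is evaluated first
  cat("Now processing #B\n")
  input$procB <- "procB"
  input
}

# Small toy data frame (an assumption, used instead of iris).
toy <- data.frame(x = 1:3)

# Nested calls: step_b receives step_a(toy) as an unevaluated promise,
# so without force() the "#B" message would be printed first.
result <- step_b(step_a(toy))
```

With force() as the first statement, each step's log line is printed only after the previous step has finished, so the messages follow the pipeline order.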