I'm testing data.table::groupingsets, but I keep getting an unexpected warning.
First, I create a data table to use as input:
library(data.table)

## create "long" iris data set
long_iris <- melt(as.data.table(iris),
                  measure.vars = c('Sepal.Width', 'Sepal.Length', 'Petal.Width', 'Petal.Length'),
                  id.vars = 'Species',
                  variable.factor = FALSE,
                  value.factor = FALSE)
I then pass it as an argument to groupingsets:
groupingsets(long_iris,
             .(N = .N,
               Min = min(value),
               Max = max(value),
               Mean = mean(value),
               SD = sd(value)),
             by = c('Species', 'variable'),
             ## group by 'variable' and 'Species' separately as well as the interaction, plus the overall stats (character())
             sets = list('variable', 'Species', c('variable', 'Species'), character()))
The result looks fine:
       Species     variable   N Min Max     Mean        SD
 1:       <NA>  Sepal.Width 150 2.0 4.4 3.057333 0.4358663
 2:       <NA> Sepal.Length 150 4.3 7.9 5.843333 0.8280661
 3:       <NA>  Petal.Width 150 0.1 2.5 1.199333 0.7622377
 4:       <NA> Petal.Length 150 1.0 6.9 3.758000 1.7652982
 5:     setosa         <NA> 200 0.1 5.8 2.535500 1.8483429
 6: versicolor         <NA> 200 1.0 7.0 3.573000 1.7623850
 7:  virginica         <NA> 200 1.4 7.9 4.285000 1.9153899
 8:     setosa  Sepal.Width  50 2.3 4.4 3.428000 0.3790644
 9: versicolor  Sepal.Width  50 2.0 3.4 2.770000 0.3137983
10:  virginica  Sepal.Width  50 2.2 3.8 2.974000 0.3224966
11:     setosa Sepal.Length  50 4.3 5.8 5.006000 0.3524897
12: versicolor Sepal.Length  50 4.9 7.0 5.936000 0.5161711
13:  virginica Sepal.Length  50 4.9 7.9 6.588000 0.6358796
14:     setosa  Petal.Width  50 0.1 0.6 0.246000 0.1053856
15: versicolor  Petal.Width  50 1.0 1.8 1.326000 0.1977527
16:  virginica  Petal.Width  50 1.4 2.5 2.026000 0.2746501
17:     setosa Petal.Length  50 1.0 1.9 1.462000 0.1736640
18: versicolor Petal.Length  50 3.0 5.1 4.260000 0.4699110
19:  virginica Petal.Length  50 4.5 6.9 5.552000 0.5518947
20:       <NA>         <NA> 600 0.1 7.9 3.464500 1.9754900
But I also get the following warnings:
Warning messages:
1: In min(value) : no non-missing arguments to min; returning Inf
2: In max(value) : no non-missing arguments to max; returning -Inf
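I'm using the iris data frame, so there are no missing values. I know these warnings are usually caused by passing a zero-length vector to min or max, but I don't see how that could be happening here.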
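For reference, here is a minimal base R sketch of the two facts I'm relying on: iris contains no NA values, and calling min or max on an empty vector produces exactly this pair of warnings. The variable names are just for illustration, and this only shows the zero-length case in isolation; it is not a claim about what groupingsets does internally.

```r
# Base R only; `iris` ships with the datasets package that R loads by default.

# Check that the source data really has no missing values.
has_missing <- any(is.na(iris))

# An empty numeric vector stands in for a group that contains no rows.
empty <- numeric(0)

# Capture the warning text instead of letting it print.
min_warning <- tryCatch(min(empty), warning = function(w) conditionMessage(w))
max_warning <- tryCatch(max(empty), warning = function(w) conditionMessage(w))

# The values returned once the warnings are suppressed.
min_value <- suppressWarnings(min(empty))
max_value <- suppressWarnings(max(empty))

print(has_missing)
print(min_warning)
print(max_warning)
print(c(min_value, max_value))
```

So the warnings themselves are easy to trigger with an empty input, even though none of the groups in my result has zero rows.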
I get the expected result, but I don't understand why I'm getting this warning.