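I want to filter a nested data frame for specific combinations of variables using a named list, but I can't exclude some unwanted combinations. Here is an example: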
library(tidyverse)
# Create fake data
set.seed(1234)
data <- tibble(
  c1 = rep(letters[1:3], each = 10),
  c2 = sample(letters[4:6], size = 30, replace = TRUE),
  var1 = rnorm(30),
  var2 = rnorm(30)
)
nested_data <- data %>%
  nest(.by = c(c1, c2))
# Create list of the specific combinations I want
criteria <- list(a = c("d", "e"), b = "d")
I tried using the functions names() and unique() to do this, but the result does not exclude the unwanted combinations when the criteria overlap.
# Filter for the specific combinations
c1_criteria <- names(criteria)
c2_criteria <- unique(unlist(criteria))
nested_data %>%
  filter(c1 %in% c1_criteria,
         c2 %in% c2_criteria) %>%
  unnest(data)
This is the output:
# A tibble: 4 × 3
c1 c2 data
<chr> <chr> <list>
1 a e <tibble [5 × 2]>
2 a d <tibble [3 × 2]>
3 b e <tibble [6 × 2]>
4 b d <tibble [1 × 2]>
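To make the overlap concrete, here is a small sketch of what the two independent %in% tests in the filter above effectively allow: any name in criteria paired with any value in criteria, in other words a cross product. The name allowed_pairs is only illustrative and is not part of my code above.

```r
library(tidyverse)

criteria <- list(a = c("d", "e"), b = "d")

# Two separate %in% tests accept any c1 taken from the names
# combined with any c2 taken from the pooled values.
allowed_pairs <- expand_grid(
  c1 = names(criteria),
  c2 = unique(unlist(criteria))
)

# Four pairs are accepted, including b/e, which is not in criteria.
allowed_pairs
```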
I intended to keep only the following combinations: c1 == "a" & c2 == "d", c1 == "a" & c2 == "e", and c1 == "b" & c2 == "d".
However, the output also includes the combination c1 == "b" & c2 == "e". The expected output is therefore:
# A tibble: 3 × 3
c1 c2 data
<chr> <chr> <list>
1 a e <tibble [5 × 2]>
2 a d <tibble [3 × 2]>
3 b d <tibble [1 × 2]>
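I think there may be a way to generate a list of specific logical conditions from the named list criteria and supply it as arguments to filter(), but I'm not sure how to do this.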
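One possible way to do this, as a minimal sketch rather than a definitive answer: build one condition per element of criteria, combine them with OR, and inject the result into filter(). The names toy_nested, conditions, combined, by_conditions, allowed and by_join are hypothetical, and toy_nested is a deterministic stand-in for nested_data so the result does not depend on the random sample. The sketch assumes rlang is installed, which it is as a tidyverse dependency.

```r
library(tidyverse)

criteria <- list(a = c("d", "e"), b = "d")

# Hypothetical stand-in for nested_data: every c1/c2 pair appears once.
toy_nested <- expand_grid(c1 = letters[1:3], c2 = letters[4:6]) %>%
  mutate(var1 = row_number()) %>%
  nest(.by = c(c1, c2))

# One condition per list element: c1 equals the element's name
# and c2 is one of that element's values.
conditions <- imap(
  criteria,
  function(values, name) rlang::expr(c1 == !!name & c2 %in% !!values)
)

# Join the per-element conditions with OR into a single expression.
combined <- reduce(conditions, function(x, y) rlang::call2("|", x, y))

# Inject the combined expression into filter().
by_conditions <- toy_nested %>% filter(!!combined)

# Alternative: expand the list into a table of allowed pairs and semi-join.
allowed <- enframe(criteria, name = "c1", value = "c2") %>% unnest(c2)
by_join <- toy_nested %>% semi_join(allowed, by = c("c1", "c2"))
```

Both by_conditions and by_join should keep only the a/d, a/e and b/d rows. The semi-join variant avoids building expressions altogether, which may be easier to read when criteria grows.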