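We can group by country, create logical columns that flag whether the count of NA elements equals the group size, ungroup, replace the corresponding columns with NA based on those logical columns, and then drop those columns in select.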
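library(dplyr)
library(stringr)
df1 %>%
  group_by(country) %>%
  # flag, per country, each column whose values are all NA in that group
  mutate(across(everything(), ~ sum(is.na(.x)) == n(),
                .names = "{.col}_lgl")) %>%
  ungroup %>%
  # set a column to NA everywhere if any group flagged it
  mutate(across(names(df1)[-1], ~ if(any(get(str_c(cur_column(),
                "_lgl")))) NA else .x)) %>%
  # drop the logical columns (the flags and the columns just set to NA)
  select(c(where(~ !is.logical(.x) && any(complete.cases(.x)))))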
-output
# A tibble: 4 × 3
  country sector data1
  <chr>    <int> <int>
1 France       1     7
2 France       2    10
3 belgium      1    12
4 belgium      2    14
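To see which columns the grouped step flags, it may help to collapse each country to one row of logicals. This is only an illustrative sketch, not part of the original approach: the name flags is made up here, and df1 is rebuilt inline (same values as in the data section) so the snippet runs on its own.

```r
library(dplyr)

# same values as the df1 in the data section, repeated so this runs standalone
df1 <- data.frame(
  country = c("France", "France", "belgium", "belgium"),
  sector = c(1L, 2L, 1L, 2L),
  data1 = c(7L, 10L, NA, 14L),
  data2 = c(NA, NA, 7L, 8L)
)

# one row per country; TRUE where a column is entirely NA within that country
# ('flags' is a hypothetical name used only for this illustration)
flags <- df1 %>%
  group_by(country) %>%
  summarise(across(everything(), ~ all(is.na(.x))))

flags
```

Any column with at least one TRUE in flags is a candidate for removal.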
If we don't use group_by, the steps can be simplified, as shown in Maël's post, by doing the grouping with a base R function inside select. Either tapply or ave would work.
df1 %>%
  select(where(~ !any(tapply(is.na(.x), df1[["country"]],
                             FUN = all))))
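For completeness, here is a minimal sketch of the ave variant, which the code above does not show. It assumes the same df1 as in the data section (rebuilt inline so the snippet is self-contained); res_ave is just a name chosen for this example. Unlike tapply, ave returns one value per row rather than one per group, but wrapping it in any() answers the same question: is some country entirely NA in this column?

```r
library(dplyr)

# same values as the df1 in the data section, repeated so this runs standalone
df1 <- data.frame(
  country = c("France", "France", "belgium", "belgium"),
  sector = c(1L, 2L, 1L, 2L),
  data1 = c(7L, 10L, NA, 14L),
  data2 = c(NA, NA, 7L, 8L)
)

# ave() broadcasts the per-country all(is.na()) result back to every row;
# a column is kept only if no country is entirely NA in it
res_ave <- df1 %>%
  select(where(~ !any(ave(is.na(.x), df1[["country"]], FUN = all))))

names(res_ave)
```

With this data, data2 should be the only column dropped, since it is all NA for France.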
data
df1 <- structure(list(country = c("France", "France", "belgium", "belgium"
), sector = c(1L, 2L, 1L, 2L), data1 = c(7L, 10L, NA, 14L), data2 = c(NA,
NA, 7L, 8L)), row.names = c(NA, -4L), class = "data.frame")