I need to do three things:
1. Count the non-NA values in each row of a table and store the count in a single column, "na_check".
[My solution is below. I'd be interested if anyone can solve this with map. I have already looked at the answers to https://stackoverflow.com/questions/50680413/count-na-in-given-columns-by-rows]
2. For the values that are not NA, concatenate the unique values into a new column, "block_detail".
[I don't know how to do this]
3. If "na_check" is greater than zero, pull in the name(s) of the non-NA column(s) and concatenate them into a new column, "block_type".
[I don't know how to do this]
This is what the final product should look like:
      w x     y     z     na_check block_detail block_type
  <dbl> <chr> <chr> <chr>    <int> <chr>        <chr>
1    NA a     NA    NA           1 a            x
2    NA NA    b     b            2 b            y|z
3    NA NA    b     c            2 b|c          y|z
4    NA NA    NA    NA           0 NA           NA
5    NA NA    NA    b            1 b            z
Here is the sample data and my solution to part 1:
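library(dplyr)  # provides tibble(), mutate(), across(), c_across(), %>%

# sample data
df <- tibble(w = rep(NA_real_, 5),
             x = c(1, rep(NA_real_, 4)),
             y = c(NA_real_, 1, rep(NA_real_, 3)),
             z = c(NA_real_, 1, rep(NA_real_, 2), 1)
)
# My solution to the first part. I'm interested if someone can do this more
# efficiently, or with map, as I have 100s of columns that I need to do this with.
df_na_check <- df %>%
  # add one logical helper column per input column: TRUE where the value is not NA
  mutate(across(everything(),
                list(na_check = ~ !is.na(.)),
                .names = "{.col}_{.fn}")) %>%
  rowwise() %>%
  # sum the helper columns within each row
  mutate(na_check = sum(c_across(contains("na_check")))) %>%
  ungroup() %>%  # drop the rowwise grouping so later steps are not row-by-row
  select(w:z, na_check)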
Thanks for your help. Ideally the solution would use the tidyverse, but other approaches (data.table or base R) are fine too.
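For reference, here is a minimal sketch of how the three steps might be combined, using dplyr together with base R's `rowSums()` and `apply()`. It is only one possible approach, not a definitive solution. The input is a hypothetical reconstruction of the character data shown in the expected output (not the numeric sample data above), and `cols`, `m`, `collapse_non_na()` and `result` are illustrative names.

```r
library(dplyr)

# Hypothetical input, reconstructed from the expected-output table (assumption)
df <- tibble(
  w = rep(NA_real_, 5),
  x = c("a", NA, NA, NA, NA),
  y = c(NA, "b", "b", NA, NA),
  z = c(NA, "b", "c", NA, "b")
)

# Illustrative helper: join the unique non-NA entries with "|",
# or return NA when nothing is left
collapse_non_na <- function(v) {
  v <- unique(v[!is.na(v)])
  if (length(v) == 0) NA_character_ else paste(v, collapse = "|")
}

# Columns to inspect; with 100s of columns this could be names(df) or a selection
cols <- c("w", "x", "y", "z")

# Convert every column to character first so the matrix holds plain strings
m <- df %>%
  select(all_of(cols)) %>%
  mutate(across(everything(), as.character)) %>%
  as.matrix()

result <- df %>%
  mutate(
    # step 1: number of non-NA values per row
    na_check     = as.integer(rowSums(!is.na(m))),
    # step 2: unique non-NA values per row
    block_detail = unname(apply(m, 1, collapse_non_na)),
    # step 3: names of the columns that hold a non-NA value
    block_type   = unname(apply(m, 1, function(r) collapse_non_na(cols[!is.na(r)])))
  )

print(result)
```

Working on a matrix avoids `rowwise()`, which may matter with hundreds of columns, at the cost of converting every inspected column to character.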