I have data from a household roster, as shown in the data frame below:

hhroster <- data.frame(HHID = c(1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4, 5, 5, 6),
                       INDID = c(1, 2, 3, 1, 2, 1, 2, 3, 4, 1, 2, 3, 1, 2, 1),
                       response_1 = c("yes", "no", "yes", "yes", "no", "no", "no", "no", "no", "yes", "yes", "no", "yes", "yes", "no"),
                       response_2 = c("no", "no", "yes", "no", "no", "no", "yes", "no", "no", "no", "no", "no", "yes", "yes", "no"))

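I want to create a dummy variable at the household level, where a value of 1 means that at least one individual in the household gave a "yes" response. The desired output is: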

hh <- data.frame(HHID = c(1, 2, 3, 4, 5, 6),
                 HH_response_1 = c(1, 1, 0, 1, 1, 0),
                 HH_response_2 = c(1, 0, 1, 0, 1, 0))

Recommended answer

Here is a solution.
Use across() to select all columns of interest and, within each household, check whether there is at least one "yes" by testing whether the sum of the logical vector .x == "yes" is greater than zero.
You can keep the results as logical; R will coerce FALSE/TRUE to 0/1 if and when necessary.

hhroster <- data.frame(HHID = c(1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4, 5, 5, 6),
                       INDID = c(1, 2, 3, 1, 2, 1, 2, 3, 4, 1, 2, 3, 1, 2, 1),
                       response_1 = c("yes", "no", "yes", "yes", "no", "no", "no", "no", "no", "yes", "yes", "no", "yes", "yes", "no"),
                       response_2 = c("no", "no", "yes", "no", "no", "no", "yes", "no", "no", "no", "no", "no", "yes", "yes", "no"))

suppressPackageStartupMessages(
  library(dplyr)
)

hhroster %>%
  summarise(
    across(starts_with("response"), ~ sum(.x == "yes") > 0L),
    .by = HHID
  )
#>   HHID response_1 response_2
#> 1    1       TRUE       TRUE
#> 2    2       TRUE      FALSE
#> 3    3      FALSE       TRUE
#> 4    4       TRUE      FALSE
#> 5    5       TRUE       TRUE
#> 6    6      FALSE      FALSE
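
If the 0/1 columns and the HH_ prefix from the question are needed explicitly, one possible variation is sketched below. It is not part of the original answer: it swaps sum(...) > 0L for any(), wraps the result in as.integer(), and uses the .names argument of across() to build the new column names. It assumes dplyr 1.1.0 or later (for .by) and that the response columns contain no missing values; with NA values, any(..., na.rm = TRUE) may be preferable.

```r
library(dplyr)  # assumes dplyr >= 1.1.0, which introduced .by in summarise()

hhroster <- data.frame(
  HHID = c(1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4, 5, 5, 6),
  INDID = c(1, 2, 3, 1, 2, 1, 2, 3, 4, 1, 2, 3, 1, 2, 1),
  response_1 = c("yes", "no", "yes", "yes", "no", "no", "no", "no", "no",
                 "yes", "yes", "no", "yes", "yes", "no"),
  response_2 = c("no", "no", "yes", "no", "no", "no", "yes", "no", "no",
                 "no", "no", "no", "yes", "yes", "no")
)

hh <- hhroster %>%
  summarise(
    across(
      starts_with("response"),
      # TRUE if any household member answered "yes", stored as 1/0
      ~ as.integer(any(.x == "yes")),
      # prefix the output columns: response_1 -> HH_response_1, etc.
      .names = "HH_{.col}"
    ),
    .by = HHID
  )

hh
```

The values match the desired output in the question, although the dummy columns are stored as integers rather than doubles.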

Created on 2024-02-10 with reprex v2.0.2
