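I am trying to evaluate a set of injury data. The data come from 4 sources (hospital, GP, self-report, death), and each source records the time of injury in years (a continuous variable). A person may have injuries reported by one or more sources. I want to know whether each hospital injury is also reported by any other source (injuries whose times are within 0.25 of each other are considered the same injury).
So I want to create a column called Hospital_Elsewhere_1. If Hospital_1 contains a time, Hospital_Elsewhere_1 should read "Hospital", and if a time in any other (non-hospital) column is within 0.25 of it, the name of that source should be appended, with sources separated by |.
For example, if the age at injury is 65.44 in Hospital_1, 65.42 in GP_1, and 65.43 in self_report_1, the result should be "Hospital|GP|Self_Report".
I want to do this for every hospital column, so that there is a Hospital_Elsewhere_i for each Hospital_i.
Here is an example dataset: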
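The rule in this example can be sketched for a single hospital time as follows. This is only a minimal illustration: `label_sources` and its `tol` argument are hypothetical names, and it assumes that "within 0.25" includes the boundary and that a missing time never counts as a match.

```r
# Hypothetical helper: label one hospital injury with every source that
# reports an injury within `tol` years of it.
label_sources <- function(hospital_time, other_times, tol = 0.25) {
  # No hospital injury in this slot, so there is nothing to label.
  if (is.na(hospital_time)) return(NA_character_)
  # TRUE where another source's time is within the tolerance; NA is no match.
  close <- !is.na(other_times) & abs(other_times - hospital_time) <= tol
  # Keep each matching source name once, in the order given.
  paste(c("Hospital", unique(names(other_times)[close])), collapse = "|")
}

label_sources(65.44, c(GP = 65.42, Self_Report = 65.43, Death = NA))
# [1] "Hospital|GP|Self_Report"
```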
library(tibble)
set.seed(123)
example_data <- tibble(
  id = 1:30,
  Hospital_1 = sample(c(NA, round(runif(25, 1, 80), 2)), 30, replace = TRUE),
  Hospital_2 = sample(c(NA, round(runif(25, 1, 80), 2)), 30, replace = TRUE),
  Hospital_3 = sample(c(NA, round(runif(25, 1, 80), 2)), 30, replace = TRUE),
  Hospital_4 = sample(c(NA, round(runif(25, 1, 80), 2)), 30, replace = TRUE),
  GP_1 = sample(c(NA, round(runif(25, 1, 80), 2)), 30, replace = TRUE),
  GP_2 = sample(c(NA, round(runif(25, 1, 80), 2)), 30, replace = TRUE),
  GP_3 = sample(c(NA, round(runif(25, 1, 80), 2)), 30, replace = TRUE),
  GP_4 = sample(c(NA, round(runif(25, 1, 80), 2)), 30, replace = TRUE),
  GP_5 = sample(c(NA, round(runif(25, 1, 80), 2)), 30, replace = TRUE),
  GP_6 = sample(c(NA, round(runif(25, 1, 80), 2)), 30, replace = TRUE),
  GP_7 = sample(c(NA, round(runif(25, 1, 80), 2)), 30, replace = TRUE),
  GP_8 = sample(c(NA, round(runif(25, 1, 80), 2)), 30, replace = TRUE),
  GP_9 = sample(c(NA, round(runif(25, 1, 80), 2)), 30, replace = TRUE),
  GP_10 = sample(c(NA, round(runif(25, 1, 80), 2)), 30, replace = TRUE),
  self_report_1 = sample(c(NA, round(runif(15, 1, 80), 2)), 30, replace = TRUE),
  self_report_2 = sample(c(NA, round(runif(15, 1, 80), 2)), 30, replace = TRUE),
  self_report_3 = sample(c(NA, round(runif(15, 1, 80), 2)), 30, replace = TRUE),
  self_report_4 = sample(c(NA, round(runif(15, 1, 80), 2)), 30, replace = TRUE),
  death_1 = sample(c(NA, round(runif(25, 1, 80), 2)), 30, replace = TRUE)
)
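# Plant some guaranteed matches: 10 times, pick a random row, give GP_1..GP_4
# the same time, and set Hospital_1..Hospital_4 to a time within 0.25 of it.
for (i in 1:10) {
  index <- sample(1:30, 1)
  gp_value <- round(runif(1, 1, 80), 2)
  example_data[index, paste0("GP_", 1:4)] <- gp_value
  example_data[index, paste0("Hospital_", 1:4)] <- gp_value + runif(1, -0.25, 0.25)
}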
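To make the desired output concrete, here is a rough base R sketch of the whole operation, not a definitive solution. `add_hospital_elsewhere`, the `tol` argument, and the label-to-prefix mapping in `sources` are hypothetical names chosen for illustration. It assumes that the columns follow the naming pattern above and that "within 0.25" includes the boundary.

```r
# Hypothetical function: add one Hospital_Elsewhere_<i> per Hospital_<i> column.
add_hospital_elsewhere <- function(df, tol = 0.25) {
  hosp_cols <- grep("^Hospital_[0-9]+$", names(df), value = TRUE)
  # Assumed mapping from output label to column-name prefix.
  sources <- c(GP = "^GP_", Self_Report = "^self_report_", Death = "^death_")
  for (h in hosp_cols) {
    h_time <- df[[h]]
    # Start with "Hospital" where a hospital time exists, NA otherwise.
    label <- ifelse(is.na(h_time), NA_character_, "Hospital")
    for (s in names(sources)) {
      cols <- grep(sources[[s]], names(df), value = TRUE)
      if (length(cols) == 0) next
      # Row-wise: is any column of this source within `tol` of the hospital time?
      diffs <- abs(as.matrix(df[cols]) - h_time)
      hit <- rowSums(diffs <= tol, na.rm = TRUE) > 0
      label <- ifelse(!is.na(label) & hit, paste(label, s, sep = "|"), label)
    }
    df[[sub("^Hospital_", "Hospital_Elsewhere_", h)]] <- label
  }
  df
}

# Small hand-made check, separate from the simulated data.
demo <- data.frame(
  Hospital_1    = c(65.44, 30.00, NA),
  GP_1          = c(65.42, 40.00, 12.00),
  self_report_1 = c(65.43, NA,    12.10),
  death_1       = c(NA,    30.20, NA)
)
out <- add_hospital_elsewhere(demo)
out$Hospital_Elsewhere_1
# [1] "Hospital|GP|Self_Report" "Hospital|Death"          NA
```

The same call on the simulated data would be `add_hospital_elsewhere(example_data)`. Rows with no hospital time are left as NA rather than an empty string, which is a design choice that may need adjusting.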