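I have some left-censored data for C-reactive protein (CRP), and I would like to know how to impute the values below the detection limit so that the imputed values fall inside a desired range (here: 0 < imputed_value < 0.2).
I am trying to use the package imputeLCMD for this, but since installing imputeLCMD and all of its dependencies is a little involved, I am also open to other approaches.
Please consider the following MWE: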
# Load libraries
library(dplyr)
library(imputeLCMD)
# Assign the dputted random data to a data frame
df <- structure(list(participant_id = 1:10, CRP = c("2.9", "<0.2",
"<0.2", "8.8", "9.4", "0.5", "5.3", "8.9", "5.5", "<0.2"), LDL_cholesterol = c(195.7,
145.3, 167.8, 157.3, 110.3, 190, 124.6, 104.2, 132.8, 195.5),
fasting_glucose = c(114.5, 104.6, 102, 119.7, 102.8, 105.4,
97.2, 99.7, 84.5, 77.4), creatinine = c(1.5, 1.4, 1.2, 1.3,
0.5, 1, 1.3, 0.7, 0.8, 0.7)), row.names = c(NA, -10L), class = c("tbl_df",
"tbl", "data.frame"))
The mock data frame entered above and displayed below resembles my real data.
# View the random data
head(df, n=5)
#> # A tibble: 5 × 5
#>   participant_id CRP   LDL_cholesterol fasting_glucose creatinine
#>            <int> <chr>           <dbl>           <dbl>      <dbl>
#> 1              1 2.9              196.            114.        1.5
#> 2              2 <0.2             145.            105.        1.4
#> 3              3 <0.2             168.            102         1.2
#> 4              4 8.8              157.            120.        1.3
#> 5              5 9.4              110.            103.        0.5
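However, I am a bit lost about how to proceed from here. To impute these left-censored values with the imputeLCMD package, I assume that the left-censored values must first be converted to NAs: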
df <- df %>%
  mutate(CRP = na_if(CRP, "<0.2")) %>%
  mutate(CRP = as.numeric(CRP))
head(df, n=5)
#> # A tibble: 5 × 5
#>   participant_id   CRP LDL_cholesterol fasting_glucose creatinine
#>            <int> <dbl>           <dbl>           <dbl>      <dbl>
#> 1              1   2.9            196.            114.        1.5
#> 2              2  NA              145.            105.        1.4
#> 3              3  NA              168.            102         1.2
#> 4              4   8.8            157.            120.        1.3
#> 5              5   9.4            110.            103.        0.5
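Since converting "<0.2" to NA throws away the information about which values were censored, it may help to record a flag first. The following is only a minimal base R sketch on a small stand-in vector; the names crp_raw, crp_censored and crp_num are hypothetical and not part of my real data.

```r
# Stand-in for the CRP column: character values with a "<0.2" censoring marker
crp_raw <- c("2.9", "<0.2", "<0.2", "8.8", "9.4")

# Flag the censored entries before the marker is lost
crp_censored <- crp_raw == "<0.2"

# Convert to numeric; the censored entries become NA
crp_num <- as.numeric(ifelse(crp_censored, NA, crp_raw))
```

With such a flag, censored values can still be told apart from values that are missing for other reasons.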
If I now run one of the wrappers from the imputeLCMD package, I do get some sort of result:
# Impute the missing data
df_imputed <- impute.wrapper.SVD(df, K = 4) %>% as.data.frame()
# Round the result
df_imputed <- df_imputed %>% mutate_at(vars(CRP), ~round(., digits = 1))
# Place the original CRP next to the imputed one for comparison
df_imputed <- df_imputed %>% mutate(original_CRP = df$CRP)
df_imputed <- df_imputed %>% select(1,2,6,3,4,5)
# Display the result
head(df_imputed, n=5)
#>   participant_id CRP original_CRP LDL_cholesterol fasting_glucose creatinine
#> 1              1 2.9          2.9           195.7           114.5        1.5
#> 2              2 5.8           NA           145.3           104.6        1.4
#> 3              3 5.9           NA           167.8           102.0        1.2
#> 4              4 8.8          8.8           157.3           119.7        1.3
#> 5              5 9.4          9.4           110.3           102.8        0.5
Created on 2023-05-27 with reprex v2.0.2
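To make the problem with this result explicit, here is a small base R check, using the first five rows of the output above, of which imputed values fall outside the desired range. The variable names (crp_imputed, crp_original, lod, violations) are only illustrative.

```r
# Values copied from the first five rows of the output above
crp_imputed  <- c(2.9, 5.8, 5.9, 8.8, 9.4)
crp_original <- c(2.9, NA, NA, 8.8, 9.4)
lod <- 0.2  # detection limit implied by the "<0.2" marker

# Positions that were imputed, and whether each value lies in (0, lod)
was_imputed <- is.na(crp_original)
in_range <- crp_imputed > 0 & crp_imputed < lod

# Imputed positions whose value falls outside the desired range
violations <- which(was_imputed & !in_range)
```

Both imputed values in these rows (5.8 and 5.9) lie well above the detection limit, which is what I would like to avoid.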
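My questions:
- I do not know how to set the parameters of the imputeLCMD package so that the imputed values satisfy: > 0 AND < 0.2.
- How can I make sure that imputeLCMD does not include participant_id itself as numeric data in the imputation calculation?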
I have already seen some great alternative approaches for left-censored data on SO, but it would be great to know what I am doing wrong (or right) with imputeLCMD.