I’d like to calculate the internal consistency (alpha and omega) of a set of items within each combination of grouping variables (e.g., age and raterType). Ideally I’d do this with dplyr/tidyverse. My question is similar to another one (Using dplyr to nest or group two variables, then perform the Cronbach's alpha function or other statistics to the data), but I can’t get that solution to work in my case.

Here is a small example:

library("tidyverse")
library("psych")
library("MBESS")

mydata <- expand.grid(ID = 1:100,
                      age = 1:5,
                      raterType = c("self",
                                    "friend",
                                    "parent"))

set.seed(12345)

mydata$item1 <- sample(1:7, nrow(mydata), replace = TRUE)
mydata$item2 <- sample(1:7, nrow(mydata), replace = TRUE)
mydata$item3 <- sample(1:7, nrow(mydata), replace = TRUE)
mydata$item4 <- sample(1:7, nrow(mydata), replace = TRUE)
mydata$item5 <- sample(1:7, nrow(mydata), replace = TRUE)
mydata$item6 <- sample(1:7, nrow(mydata), replace = TRUE)

mydata$item1[sample(nrow(mydata), 100)] <- NA
mydata$item2[sample(nrow(mydata), 100)] <- NA
mydata$item3[sample(nrow(mydata), 100)] <- NA
mydata$item4[sample(nrow(mydata), 100)] <- NA
mydata$item5[sample(nrow(mydata), 100)] <- NA
mydata$item6[sample(nrow(mydata), 100)] <- NA

itemNames <- paste("item", 1:6, sep = "")

To calculate internal consistency for the whole dataset, I compute alpha and omega separately with the following code:

alpha(mydata[,itemNames])$total$raw_alpha
ci.reliability(mydata[,itemNames], type = "omega", interval.type = "none")$est

However, I want to calculate alpha and omega for each combination of age and raterType.

Here is my attempt:

mydata %>%
  pivot_longer(cols = c(-age, -raterType, -ID)) %>%
  select(-ID) %>% 
  nest_by(age, raterType) %>%
  mutate(alpha = alpha(data)$total$raw_alpha,
         omega = ci.reliability(data, type = "omega", interval.type = "none")$est)

This fails. For some reason, the code gives the wrong estimates of omega and throws an error for alpha:

> # This provides the wrong estimates:
> mydata %>%
+     pivot_longer(cols = c(-age, -raterType, -ID)) %>%
+     select(-ID) %>% 
+     nest_by(age, raterType) %>%
+     mutate(omega = ci.reliability(data, type = "omega", interval.type = "none")$est)
# A tibble: 15 × 4
# Rowwise:  age, raterType
     age raterType               data omega
   <int> <fct>     <list<tibble[,2]>> <dbl>
 1     1 self               [600 × 2] 0.218
 2     1 friend             [600 × 2] 0.257
 3     1 parent             [600 × 2] 0.261
 4     2 self               [600 × 2] 0.196
 5     2 friend             [600 × 2] 0.257
 6     2 parent             [600 × 2] 0.209
 7     3 self               [600 × 2] 0.179
 8     3 friend             [600 × 2] 0.225
 9     3 parent             [600 × 2] 0.247
10     4 self               [600 × 2] 0.224
11     4 friend             [600 × 2] 0.252
12     4 parent             [600 × 2] 0.218
13     5 self               [600 × 2] 0.248
14     5 friend             [600 × 2] 0.218
15     5 parent             [600 × 2] 0.202
> 
> # This throws an error:
> mydata %>%
+     pivot_longer(cols = c(-age, -raterType, -ID)) %>%
+     select(-ID) %>% 
+     nest_by(age, raterType) %>%
+     mutate(alpha = alpha(data)$total$raw_alpha)
Number of categories should be increased  in order to count frequencies. 
Error in `mutate()`:
! Problem while computing `alpha = alpha(data)$total$raw_alpha`.
ℹ The error occurred in row 1.
Caused by error in `FUN()`:
! only defined on a data frame with all numeric-alike variables
Run `rlang::last_error()` to see where the error occurred.
Warning messages:
1: Problem while computing `alpha = alpha(data)$total$raw_alpha`.
ℹ NAs introduced by coercion
ℹ The warning occurred in row 1. 
2: Problem while computing `alpha = alpha(data)$total$raw_alpha`.
ℹ Item = name had no variance and was deleted but still is counted in the score
ℹ The warning occurred in row 1.

The omega values above do not match the values obtained by running ci.reliability() on the corresponding subset of the data:

> alpha(mydata[which(mydata$age == 3 & mydata$raterType == "self"), itemNames])$total$raw_alpha
[1] -0.3018416
> ci.reliability(mydata[which(mydata$age == 3 & mydata$raterType == "self"), itemNames], type = "omega", interval.type = "none")$est
[1] 0.00836356

Recommended answer

Maybe this helps:

out1 <- mydata %>%
  group_by(age, raterType) %>%
  summarise(alpha = alpha(across(all_of(itemNames)))$total$raw_alpha,
            omega = ci.reliability(across(all_of(itemNames)),
                                   type = "omega", interval.type = "none")$est,
            .groups = 'drop')

-output

> out1
# A tibble: 15 × 4
     age raterType   alpha     omega
   <int> <fct>       <dbl>     <dbl>
 1     1 self      -0.135    2.76   
 2     1 friend     0.138    0.231  
 3     1 parent    -0.229  255.     
 4     2 self      -0.421   NA      
 5     2 friend     0.0650  58.7    
 6     2 parent     0.153   NA      
 7     3 self      -0.302    0.00836
 8     3 friend     0.147    0.334  
 9     3 parent     0.196    0.132  
10     4 self      -0.0699  NA      
11     4 friend     0.118    0.214  
12     4 parent    -0.0303  31.1    
13     5 self      -0.0166   0.246  
14     5 friend    -0.192    0.0151 
15     5 parent     0.0847  NA      
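
A side note on the across(all_of(itemNames)) call above: called without a function, across() simply returns the selected columns of the current group as a tibble, and that tibble is what alpha() and ci.reliability() receive. Assuming dplyr 1.1.0 or later, pick() is the dedicated helper for this. The sketch below uses a small hypothetical data frame (toy, toyItems) and only checks the shape of what each group hands over; it does not call psych or MBESS.

```r
library(dplyr)

# Hypothetical toy data: 4 IDs x 2 ages x 2 rater types, two items
toy <- expand.grid(ID = 1:4, age = 1:2, raterType = c("self", "friend"))
toy$item1 <- rep(1:4, times = 4)
toy$item2 <- rep(4:1, times = 4)
toyItems <- c("item1", "item2")

# pick() returns the selected columns of the current group as a tibble,
# so each group passes a 4 x 2 block of item scores to whatever wraps it
shape <- toy %>%
  group_by(age, raterType) %>%
  summarise(n_rows = nrow(pick(all_of(toyItems))),
            n_cols = ncol(pick(all_of(toyItems))),
            .groups = "drop")

shape
```

If that holds in your dplyr version, replacing across(all_of(itemNames)) with pick(all_of(itemNames)) in out1 should be a drop-in change.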

Or perhaps this:

out2 <- mydata %>%
  nest_by(age, raterType) %>%
  mutate(alpha = alpha(data[, itemNames])$total$raw_alpha,
         omega = ci.reliability(data[, itemNames], type = "omega",
                                interval.type = "none")$est)

-output

> out2
# A tibble: 15 × 5
# Rowwise:  age, raterType
     age raterType               data   alpha     omega
   <int> <fct>     <list<tibble[,7]>>   <dbl>     <dbl>
 1     1 self               [100 × 7] -0.135    2.76   
 2     1 friend             [100 × 7]  0.138    0.231  
 3     1 parent             [100 × 7] -0.229  255.     
 4     2 self               [100 × 7] -0.421   NA      
 5     2 friend             [100 × 7]  0.0650  58.7    
 6     2 parent             [100 × 7]  0.153   NA      
 7     3 self               [100 × 7] -0.302    0.00836
 8     3 friend             [100 × 7]  0.147    0.334  
 9     3 parent             [100 × 7]  0.196    0.132  
10     4 self               [100 × 7] -0.0699  NA      
11     4 friend             [100 × 7]  0.118    0.214  
12     4 parent             [100 × 7] -0.0303  31.1    
13     5 self               [100 × 7] -0.0166   0.246  
14     5 friend             [100 × 7] -0.192    0.0151 
15     5 parent             [100 × 7]  0.0847  NA     
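
As for why the attempt in the question misbehaves: pivot_longer() stacks the six items into a name/value pair, so each nested tibble is 600 × 2 (as the printed output in the question shows) instead of 100 rows with one column per item. alpha() and ci.reliability() then see a character column called name plus a single value column, which matches the "Item = name had no variance" warning. A minimal sketch on hypothetical toy data, comparing the two nested shapes without calling psych or MBESS:

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: 4 IDs x 2 ages x 2 rater types, two items
toy <- expand.grid(ID = 1:4, age = 1:2, raterType = c("self", "friend"))
toy$item1 <- rep(1:4, times = 4)
toy$item2 <- rep(4:1, times = 4)

# Pivoting first: every group ends up with a long name/value tibble
long_nested <- toy %>%
  pivot_longer(cols = c(-age, -raterType, -ID)) %>%
  select(-ID) %>%
  nest_by(age, raterType)

# Nesting the wide data: every group keeps one column per item
wide_nested <- toy %>%
  nest_by(age, raterType)

names(long_nested$data[[1]])  # item labels and scores stacked in two columns
dim(long_nested$data[[1]])
names(wide_nested$data[[1]])  # ID plus the item columns
```

Keeping the data wide and subsetting the item columns inside mutate(), as in out2, gives the reliability functions the respondents-by-items layout they expect.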
