I have a data frame that looks like this:
Sample | Value | Domain |
---|---|---|
S1 | 12 | Domain_Identified_X13_A |
S1 | 25 | Domain_Identified_X28_B |
S1 | 18 | Domain_Unidentified |
I want to sum the values of the rows whose Domain contains the string "Identified", to get this final data frame:
Sample | Value | Domain |
---|---|---|
S1 | 37 | Domain_Identified |
S1 | 18 | Domain_Unidentified |
Thank you.
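Use `summarize` from `dplyr`, with `sub` to derive the groups. The pattern keeps everything up to and including "dentified" and drops the rest, so both "Identified" rows end up with the same group key.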
    library(dplyr)

    df %>%
      group_by(Sample, Domain = sub("(.*dentified).*", "\\1", Domain)) %>%
      summarize(Value = sum(Value), .groups = "drop")

    # A tibble: 2 × 3
      Sample Domain              Value
      <chr>  <chr>               <int>
    1 S1     Domain_Identified      37
    2 S1     Domain_Unidentified    18
An alternative base R approach using `aggregate`:
    setNames(
      aggregate(df$Value,
                by = list(df$Sample, sub("(.*dentified).*", "\\1", df$Domain)),
                \(x) sum(x)),
      c(colnames(df)[c(1, 3)], "Value")
    )

      Sample              Domain Value
    1     S1   Domain_Identified    37
    2     S1 Domain_Unidentified    18
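
To try either approach, here is a minimal sketch that rebuilds the example data and shows the grouping key on its own. The column types (character and integer) are assumed from the printed output, and the names `key` and `result` are only illustrative. The last step uses the formula interface of `aggregate`, which is a variation on the call above rather than the same code.

```r
# Rebuild the example data from the question (column types are assumed)
df <- data.frame(
  Sample = c("S1", "S1", "S1"),
  Value  = c(12L, 25L, 18L),
  Domain = c("Domain_Identified_X13_A",
             "Domain_Identified_X28_B",
             "Domain_Unidentified")
)

# Keep everything up to and including "dentified" and drop the suffix:
# the first two rows collapse to "Domain_Identified",
# the third stays "Domain_Unidentified"
key <- sub("(.*dentified).*", "\\1", df$Domain)
print(key)

# Replace Domain with the shortened key, then sum Value per Sample/Domain
result <- aggregate(Value ~ Sample + Domain,
                    data = transform(df, Domain = key),
                    FUN = sum)
print(result)
```

Because `.*` is greedy, the pattern matches up to the last occurrence of "dentified" in each string, so it handles both "Identified" and "Unidentified" without needing two separate patterns.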