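Use dplyr to create a summary df containing the C+D totals, then bind it back to your original df. Note: your example data has multiple entries for some industries within the same year/country; I assumed this was an error, so I created new example data.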
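set.seed(123)
library(dplyr)

# New example data: one row per Country/Year/industry combination (3 x 4 x 4 = 48 rows)
df <- expand.grid(
  Country = c("DEU", "FRA", "ITA"),
  Year = 1:4,
  industry = c("A", "B", "C", "D")
)
df$h_emp <- rnorm(48, 15, 3.5)

df <- df %>%
  # Keep only the two industries to be combined
  filter(industry %in% c("C", "D")) %>%
  # Sum h_emp per Country/Year; the .by argument requires dplyr >= 1.1.0
  summarize(
    industry = "C+D",
    h_emp = sum(h_emp),
    .by = c(Country, Year)
  ) %>%
  # Append the summary rows to the original df, then sort
  bind_rows(df, .) %>%
  arrange(Country, Year)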
Result:
#> head(df, 15)
   Country Year industry     h_emp
1      DEU    1        A 13.038335
2      DEU    1        B 16.402700
3      DEU    1        C 12.812363
4      DEU    1        D 16.938712
5      DEU    1      C+D 29.751074
6      DEU    2        A 15.246779
7      DEU    2        B 21.254196
8      DEU    2        C 15.536806
9      DEU    2        D 13.668351
10     DEU    2      C+D 29.205157
11     DEU    3        A 16.613207
12     DEU    3        B 17.454746
13     DEU    3        C 16.492625
14     DEU    3        D 10.571113
15     DEU    3      C+D 27.063738
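
Since the answer assumes the repeated Country/Year/industry entries in the original data were a mistake, it may be worth checking for them before summing. The following is only a minimal sketch of one way to do that with `count()`; the `toy` data frame, its values, and the `n_rows` column name are made up for illustration and are not taken from the question.

```r
library(dplyr)

# Hypothetical toy data with one deliberately repeated Country/Year/industry row
toy <- data.frame(
  Country  = c("DEU", "DEU", "DEU", "FRA"),
  Year     = c(1, 1, 1, 1),
  industry = c("C", "C", "D", "C"),
  h_emp    = c(10, 12, 8, 9)
)

# Count rows per key combination and keep only the combinations
# that appear more than once
dupes <- toy %>%
  count(Country, Year, industry, name = "n_rows") %>%
  filter(n_rows > 1)

dupes
```

If `dupes` has zero rows for your real data, each key is unique and the C+D sum adds exactly one C row and one D row per Country/Year. If it is not empty, `sum(h_emp)` will include every repeated row, so you would need to decide first whether those repeats are genuine.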