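The code posted at the bottom uses the tidyr package to pad a data frame so that every ID ends up with the same number of periods, where a period is defined as a count of months ("Period_1" in the code below). In the base data frame testDF, ID 1 has 5 periods, while IDs 50 and 60 have only 3 each. The tidyr code creates the extra periods ("Period_1") for IDs 50 and 60 so that they also have 5 Period_1 values each, and it copies the "Bal" and "State" fields down into the new rows so that all IDs end up with the same number of Period_1 rows. This part works correctly.
However, how can I extend "Period_2", the calendar-month expression of the same periods shown in the code below, in the same way?
Code: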
library(tidyr)
testDF <-
data.frame(
ID = as.numeric(c(rep(1,5),rep(50,3),rep(60,3))),
Period_1 = as.numeric(c(1:5,1:3,1:3)),
Period_2 = c("2012-06","2012-07","2012-08","2012-09","2012-10","2013-06","2013-07","2013-08","2012-01","2012-02","2012-03"),
Bal = as.numeric(c(rep(10,5),21:23,36:34)),
State = c("XX","AA","BB","CC","XX","AA","BB","CC","SS","XX","AA")
)
testDFextend <-
testDF %>%
tidyr::complete(ID, nesting(Period_1)) %>%
tidyr::fill(Bal, State, .direction = "down")
testDFextend
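To make the goal for Period_2 concrete, here is a minimal base R sketch of the month arithmetic involved. The helper name next_months is hypothetical (it is not part of tidyr or the original post), and it assumes the labels are always in "YYYY-MM" form. It only shows how the missing labels could be generated from the last observed one; it is not wired into the pipeline above.

```r
# Hypothetical helper (not from the original post): given the last observed
# "YYYY-MM" label, return the labels of the next n calendar months.
next_months <- function(last_label, n) {
  # Anchor the label on the first day of the month so it parses as a Date
  start <- as.Date(paste0(last_label, "-01"))
  # seq() with by = "month" steps by calendar month and crosses year boundaries;
  # drop the first element because it is the last observed month itself
  following <- seq(start, by = "month", length.out = n + 1)[-1]
  format(following, "%Y-%m")
}

next_months("2013-08", 2)  # the two periods missing for ID 50
next_months("2012-03", 2)  # the two periods missing for ID 60
```

Working with Date values rather than splitting the string avoids having to handle the December to January transition by hand.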
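Edit: rolling from one year to the next
A better example in the original post would have been Period_2 = c("2012-06","2012-07","2012-08","2012-09","2012-10","2013-06","2013-07","2013-08","2012-10","2012-11","2012-12"), which gives a case where extending Period_2 rolls over into the next year. Below I add to the tidyr/dplyr answer so that it rolls over to the next year correctly: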
library(tidyr)
library(dplyr)
testDF <-
data.frame(
ID = as.numeric(c(rep(1,5),rep(50,3),rep(60,3))),
Period_1 = as.numeric(c(1:5,1:3,1:3)),
Period_2 = c("2012-06","2012-07","2012-08","2012-09","2012-10","2013-06","2013-07","2013-08","2012-10","2012-11","2012-12"),
Bal = as.numeric(c(rep(10,5),21:23,36:34)),
State = c("XX","AA","BB","CC","XX","AA","BB","CC","SS","XX","AA")
)
testDFextend <-
testDF %>%
tidyr::complete(ID, nesting(Period_1)) %>%
tidyr::fill(Bal, State, .direction = "down")
testDFextend %>%
separate(Period_2, into = c("year", "month"), convert = TRUE) %>%
fill(year) %>%
group_by(ID) %>%
mutate(month = sprintf("%02d", zoo::na.spline(month))) %>%
unite("Period_2", year, month, sep = "-") %>%
# Now I add the below lines: split Period_2 again and roll months past 12 into the following year.
# Integer division and modulo on (month - 1) also handle exact multiples of 12 (e.g. month 24).
separate(Period_2, into = c("year", "month"), convert = TRUE) %>%
mutate(year1 = year + (month - 1L) %/% 12L) %>%
mutate(month1 = sprintf("%02d", (month - 1L) %% 12L + 1L)) %>%
unite("Period_2", year1, month1, sep = "-") %>%
select("ID","Period_1","Period_2","Bal","State")
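The year rollover can be checked in isolation with a small base R sketch. The function name roll_period is hypothetical and the inputs are assumed to be whole numbers, with month allowed to run past 12 as it does after extrapolation. It maps a base year and a running month count to a "YYYY-MM" label using integer division and modulo.

```r
# Hypothetical helper (not from the original post): convert a base year and a
# running month count (which may exceed 12) into a "YYYY-MM" label.
roll_period <- function(year, month) {
  # Shift to a zero-based month so that 12, 24, ... stay in the correct year
  year_out  <- year + (month - 1) %/% 12
  month_out <- (month - 1) %% 12 + 1
  sprintf("%d-%02d", as.integer(year_out), as.integer(month_out))
}

roll_period(2012, c(12, 13, 14, 24))
# "2012-12" "2013-01" "2013-02" "2013-12"
```

Subtracting 1 before dividing is what keeps December (month 12, 24, and so on) from being pushed into the following year.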