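I have a dataset with a binary indicator of whether an event occurred. From it, I want to create a count of the consecutive time steps in which no event has occurred. For example (TS = time step, EV = event indicator, C = count):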
TS1 -> TS2 -> TS3 -> TS4 -> TS5 ->...
EV0 -> EV0 -> EV1 -> EV0 -> EV0 ->...
C0 -> C1 -> C0 -> C0 -> C1 ->...
As an example, consider:
labs <- c("A", "A", "A", "A", "B", "B", "B", "B", "C", "C", "C", "C", "D", "D", "D", "D", "D")
time <- c(1,2,3,4 ,1,2,3,4 ,1,2,3,4 ,1,2,3,4,5)
event <- c(0,0,0,0, 0,1,0,0, 1,1,0,0, NA,0,0,1,0)
desiredOutcome <- c(0,1,2,3,0,0,0,1,0,0,0,1,NA,0,1,0,0) # goal
exDF <- data.frame(labs,time, event, desiredOutcome)
Working backward from the end goal, I ended up with the following code:
library(dplyr)
exDF <- exDF %>%
group_by(labs) %>%
mutate(pe1 = lag(event, order_by=time)) # create new variable for prior event
exDF$count2 <- ifelse(
((exDF$pe1 == 1) & (exDF$event == 0)), # condition checks for rows where previous timestep is included & had event WHERE event is not ongoing in this timestep
0, # True val
NA) # False val
exDF$count <- ifelse(
(is.na(exDF$pe1) & (exDF$event == 0)), # condition checks for rows where previous timestep is not included & no current event
0, # True val
exDF$count2) # False val
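This appears to fill in all of the zeros correctly. However, I don't know a good way to get from a column with the appropriate 0s filled in and NA everywhere else to my desired outcome.
Most of my experiments have involved combining mutate and lag, but they only fill in the next set of values (a one if the input column had a zero, a two if it had a one). The example below does not attempt to handle resetting the count, and it produces the behavior described above: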
exDF <- exDF %>%
group_by(labs) %>%
mutate(countFinal = lag(count, order_by=time) + 1)
So my challenge is with the order in which things are evaluated. With a command like the mutate call here, the order seems to be:
Pull all cell values by label -> Look at their lags -> Add 1 -> Done, but incorrectly
When I need it to be:
Pull first cell value by label -> Look at lag -> Add 1 or reset -> Pull second cell (filled in prior step) value by label -> Look at their lags -> Add 1 or reset -> Pull third... -> Done
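To make the sequential order above concrete, here is a minimal sketch of it as a plain base R loop applied per label with `ave()`. The function name `count_since_event` is hypothetical, and the reset rules (NA stays NA, the count is 0 on an event row, on the first row, and on the row right after an event or an NA) are my reading of `desiredOutcome`, not a confirmed specification. It also assumes rows are already sorted by time within each label.

```r
# Same labels and events as the example data above.
labs  <- c("A", "A", "A", "A", "B", "B", "B", "B", "C", "C", "C", "C", "D", "D", "D", "D", "D")
event <- c(0,0,0,0, 0,1,0,0, 1,1,0,0, NA,0,0,1,0)

# Hypothetical helper: walk one group's events in order, so each step
# can see the count that was filled in on the previous step.
count_since_event <- function(event) {
  n <- length(event)
  out <- rep(NA_real_, n)
  for (i in seq_len(n)) {
    if (is.na(event[i])) next               # unknown event: leave NA
    if (event[i] == 1) {                    # event in this step: count is 0
      out[i] <- 0
      next
    }
    # No event in this step: reset if there is no usable previous step
    # or the previous step had an event, otherwise add 1.
    if (i == 1 || is.na(event[i - 1]) || event[i - 1] == 1) {
      out[i] <- 0
    } else {
      out[i] <- out[i - 1] + 1
    }
  }
  out
}

# Apply the helper within each label (assumes rows are time-ordered).
result <- ave(event, labs, FUN = count_since_event)
print(result)
```

This only illustrates the order of operations I am after; it does not use dplyr, and a grouped, vectorized equivalent is what I am really looking for.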
Is there a good way to do this with existing packages?