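I'm very new to R, so I'd like to know how to do this better. I have a data table with two columns (Day and SleepStatus). How can I find the first occurrence of Sleeping and of Awake within each day, and add another column that marks when the person starts sleeping (the first Sleeping row) and when they stop sleeping (the first Awake row)? For the remaining rows, the column should show NA.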

Day SleepStatus
1 Sleeping
1 Sleeping
1 Sleeping
1 Awake
2 Sleeping
2 Sleeping
2 Sleeping
2 Awake

Desired output

Day SleepStatus Final Status
1 Sleeping Start Sleep
1 Sleeping NA
1 Sleeping Stop Sleep
1 Awake NA
2 Sleeping Start Sleep
2 Sleeping NA
2 Sleeping Stop Sleep
2 Awake NA

Recommended answer

Here is one potential solution:

library(data.table)

dt <- data.table::data.table(
          Day = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
  SleepStatus = c("Sleeping","Sleeping","Sleeping",
                  "Awake","Sleeping","Sleeping","Sleeping","Awake")
)

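# Label each row: "Stop Sleep" where the running count of non-"Sleeping"
# rows increases (i.e. on an Awake row), otherwise "Start Sleep"
dt[, `Final Status` := {ifelse(
  cumsum(SleepStatus != "Sleeping") != shift(cumsum(SleepStatus != "Sleeping"), fill = 0, type = "lag"),
  "Stop Sleep", "Start Sleep")}]
# Keep a label only where it differs from the previous row's label; otherwise NA
dt[, `Final Status` := {ifelse(
  `Final Status` == shift(`Final Status`, fill = "NA", type = "lag"),
  NA, `Final Status`)}]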
dt
#>    Day SleepStatus Final Status
#> 1:   1    Sleeping  Start Sleep
#> 2:   1    Sleeping         <NA>
#> 3:   1    Sleeping         <NA>
#> 4:   1       Awake   Stop Sleep
#> 5:   2    Sleeping  Start Sleep
#> 6:   2    Sleeping         <NA>
#> 7:   2    Sleeping         <NA>
#> 8:   2       Awake   Stop Sleep

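The code makes more sense if you break it down into smaller chunks. I've used tidyverse functions below because I find them easier to follow, but I can change this to data.table syntax if you'd prefer.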

library(data.table)

dt <- data.table::data.table(
          Day = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
  SleepStatus = c("Sleeping","Sleeping","Sleeping",
                  "Awake","Sleeping","Sleeping","Sleeping","Awake")
)

library(tidyverse)
df <- as.data.frame(dt)

# When SleepStatus is not "Sleeping", increment the counter by one
df2 <- df %>%
  mutate(Sleeping = cumsum(SleepStatus != "Sleeping"))
df2
#>   Day SleepStatus Sleeping
#> 1   1    Sleeping        0
#> 2   1    Sleeping        0
#> 3   1    Sleeping        0
#> 4   1       Awake        1
#> 5   2    Sleeping        1
#> 6   2    Sleeping        1
#> 7   2    Sleeping        1
#> 8   2       Awake        2

# If the previous value in "Sleeping" is different to the current value,
# add the "stop sleeping" flag (i.e. show when "Sleeping" changes)
df3 <- df2 %>%
  mutate(Sleep_label = ifelse(Sleeping != lag(Sleeping, default = 0), "Stop sleeping", "Start sleeping"))
df3
#>   Day SleepStatus Sleeping    Sleep_label
#> 1   1    Sleeping        0 Start sleeping
#> 2   1    Sleeping        0 Start sleeping
#> 3   1    Sleeping        0 Start sleeping
#> 4   1       Awake        1  Stop sleeping
#> 5   2    Sleeping        1 Start sleeping
#> 6   2    Sleeping        1 Start sleeping
#> 7   2    Sleeping        1 Start sleeping
#> 8   2       Awake        2  Stop sleeping

# Then, if the value in Sleep_label is equal to the previous label,
# change it to NA
df4 <- df3 %>%
  mutate(Final_status = ifelse(Sleep_label == lag(Sleep_label, default = "NA"), NA, Sleep_label))
df4
#>   Day SleepStatus Sleeping    Sleep_label   Final_status
#> 1   1    Sleeping        0 Start sleeping Start sleeping
#> 2   1    Sleeping        0 Start sleeping           <NA>
#> 3   1    Sleeping        0 Start sleeping           <NA>
#> 4   1       Awake        1  Stop sleeping  Stop sleeping
#> 5   2    Sleeping        1 Start sleeping Start sleeping
#> 6   2    Sleeping        1 Start sleeping           <NA>
#> 7   2    Sleeping        1 Start sleeping           <NA>
#> 8   2       Awake        2  Stop sleeping  Stop sleeping

Created on 2022-05-20 by the reprex package (v2.0.1)
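
For anyone who prefers data.table, the three tidyverse steps above could be sketched roughly as follows. This is only an illustrative translation, not a tested replacement: the intermediate column names (`Sleeping`, `Sleep_label`, `Final_status`) are simply carried over from the walkthrough, and `shift()` is used here as the data.table counterpart of `lag()`.

```r
library(data.table)

dt <- data.table(
  Day = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
  SleepStatus = c("Sleeping", "Sleeping", "Sleeping", "Awake",
                  "Sleeping", "Sleeping", "Sleeping", "Awake")
)

# Step 1: running count of rows that are not "Sleeping"
dt[, Sleeping := cumsum(SleepStatus != "Sleeping")]

# Step 2: where the counter differs from the previous row, flag "Stop sleeping"
dt[, Sleep_label := ifelse(Sleeping != shift(Sleeping, fill = 0L),
                           "Stop sleeping", "Start sleeping")]

# Step 3: a label that repeats the previous row's label becomes NA
dt[, Final_status := ifelse(Sleep_label == shift(Sleep_label, fill = "NA"),
                            NA, Sleep_label)]
dt
```

On the sample data, `Final_status` should match the last column of `df4` above.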

Does that make sense? Or have I just made things more confusing?
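
Note that the approach above does not group by Day; it relies on an Awake row separating one day's sleep from the next. If the labels should instead restart within each day, one possible variant (a sketch under that assumption, not part of the original answer) is to number consecutive runs of the same status per day with `data.table::rleid()` and label only the first row of each run. The helper columns `run` and `first_in_run` are names made up for this illustration.

```r
library(data.table)

dt <- data.table(
  Day = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
  SleepStatus = c("Sleeping", "Sleeping", "Sleeping", "Awake",
                  "Sleeping", "Sleeping", "Sleeping", "Awake")
)

# Number consecutive runs of identical SleepStatus values within each Day
dt[, run := rleid(SleepStatus), by = Day]

# TRUE only on the first row of each (Day, run) block
dt[, first_in_run := seq_len(.N) == 1L, by = .(Day, run)]

# Label the first row of a run by its status; every other row gets NA
dt[, `Final Status` := fifelse(
  !first_in_run, NA_character_,
  fifelse(SleepStatus == "Sleeping", "Start Sleep", "Stop Sleep")
)]

# Drop the helper columns (hypothetical names used only in this sketch)
dt[, c("run", "first_in_run") := NULL]
dt
```

Because the runs are computed per day, a day that begins with Sleeping gets a "Start Sleep" label even if the previous day also ended with Sleeping.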
