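Good evening, I have a question about R. I have a data frame of camera-trap records containing the camera trap (FT), the date and time (DATAORA), and the time difference in seconds between consecutive occasions (DIFFTIME).

Data frame: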

input <- structure(list(FT = structure(c(4L, 4L, 1L, 2L, 1L, 4L, 3L, 3L, 
1L, 1L, 2L, 1L, 3L, 3L, 2L, 1L, 3L, 2L, 1L, 2L, 1L, 1L, 4L, 3L, 
2L, 1L), levels = c("T1", "C1a", "C1b", "C1c", "T2", "C2b", "C2c", 
"T3", "C3a", "C3b", "T4", "C4a", "C4b", "C4c", "T5", "C5a", "C5b", 
"C5c", "T6", "C6a", "C6b", "C6c", "T7", "C7a", "C7b", "C7c", 
"T8", "C8a", "C8b", "C8c", "T9", "C9a", "C9b", "C9c", "T10", 
"C10a", "C10b", "C10c", "T11", "C11a", "C11b", "C11c", "T12", 
"C12a"), class = "factor"), DATAORA = structure(c(1668717618, 
1668717644, 1668717750, 1668717790, 1668717806, 1668719092, 1668719110, 
1668719142, 1668719156, 1668719182, 1668719192, 1668719214, 1668800778, 
1668800808, 1668800832, 1668800846, 1668800856, 1668800860, 1668800872, 
1668800888, 1668800898, 1668800926, 1668809108, 1668809170, 1668809226, 
1668809238), tzone = "", class = c("POSIXct", "POSIXt")), DIFFTIME = c(0, 
26, 106, 40, 16, 1286, 18, 32, 14, 26, 10, 22, 81564, 30, 24, 
14, 10, 4, 12, 16, 10, 28, 8182, 62, 56, 12)), row.names = c(NA, 
-26L), class = c("tbl_df", "tbl", "data.frame"))

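I would like to create a column that assigns each row of the data frame to a group/sequence. Rows should be grouped together as long as they are no more than 45 seconds apart, but when a row has an FT that already appears in the current group, it should start the next group instead. See below.

Expected result: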

expected <- structure(list(FT = c("C1c", "C1c", "T1", "C1a", "T1", "C1c", 
"C1b", "C1b", "T1", "T1", "C1a", "T1", "C1b", "C1b", "C1a", "T1", 
"C1b", "C1a", "T1", "C1a", "T1", "T1", "C1c", "C1b", "C1a", "T1"
), DATAORA = c("2022-11-17 21:40:18", "2022-11-17 21:40:44", 
"2022-11-17 21:42:30", "2022-11-17 21:43:10", "2022-11-17 21:43:26", 
"2022-11-17 22:04:52", "2022-11-17 22:05:10", "2022-11-17 22:05:42", 
"2022-11-17 22:05:56", "2022-11-17 22:06:22", "2022-11-17 22:06:32", 
"2022-11-17 22:06:54", "2022-11-18 20:46:18", "2022-11-18 20:46:48", 
"2022-11-18 20:47:12", "2022-11-18 20:47:26", "2022-11-18 20:47:36", 
"2022-11-18 20:47:40", "2022-11-18 20:47:52", "2022-11-18 20:48:08", 
"2022-11-18 20:48:18", "2022-11-18 20:48:46", "2022-11-18 23:05:08", 
"2022-11-18 23:06:10", "2022-11-18 23:07:06", "2022-11-18 23:07:18"
), DIFFTIME = c(0, 26, 106, 40, 16, 1286, 18, 32, 14, 26, 10, 
22, 81564, 30, 24, 14, 10, 4, 12, 16, 10, 28, 8182, 62, 56, 12
), SEQ = c(1, 2, 3, 3, 4, 5, 5, 6, 6, 7, 7, 8, 9, 10, 10, 10, 
11, 11, 11, 12, 12, 13, 14, 15, 16, 16)), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -26L))

Can anyone help me?

Recommended answer

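Here is something to try. I can edit and clarify further details of how this works if that helps. You could use input instead of expected as the data source; I used expected only so that I could create a new column, new_seq, and compare it with SEQ to check that they are identical.

Starting with the sequence number seq_no at 1, you check each row of data in turn. For each row, check whether its FT is already in the currently active group, and whether DIFFTIME exceeds 45 seconds. If either is true, advance seq_no to the next sequence number and reset the active group to contain only that FT. If both are false, just add the FT to the current group. The last statement in the loop stores the sequence number in the corresponding row.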

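seq_no <- 1
current_grp <- character(0)  # FT values seen in the currently active group

for (i in seq_len(nrow(expected))) {
  # Extract scalars; as.character() guards against FT being a factor (as in input)
  ft <- as.character(expected$FT[i])
  gap <- expected$DIFFTIME[i]

  if (ft %in% current_grp || gap > 45) {
    # Duplicate FT or a gap over 45 seconds: start a new sequence with this FT
    seq_no <- seq_no + 1
    current_grp <- ft
  } else {
    # Otherwise the row joins the current sequence
    current_grp <- c(current_grp, ft)
  }
  expected[i, "new_seq"] <- seq_no
}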

expected

Output

# A tibble: 26 × 5
   FT    DATAORA             DIFFTIME   SEQ new_seq
   <chr> <chr>                  <dbl> <dbl>   <dbl>
 1 C1c   2022-11-17 21:40:18        0     1       1
 2 C1c   2022-11-17 21:40:44       26     2       2
 3 T1    2022-11-17 21:42:30      106     3       3
 4 C1a   2022-11-17 21:43:10       40     3       3
 5 T1    2022-11-17 21:43:26       16     4       4
 6 C1c   2022-11-17 22:04:52     1286     5       5
 7 C1b   2022-11-17 22:05:10       18     5       5
 8 C1b   2022-11-17 22:05:42       32     6       6
 9 T1    2022-11-17 22:05:56       14     6       6
10 T1    2022-11-17 22:06:22       26     7       7
11 C1a   2022-11-17 22:06:32       10     7       7
12 T1    2022-11-17 22:06:54       22     8       8
13 C1b   2022-11-18 20:46:18    81564     9       9
14 C1b   2022-11-18 20:46:48       30    10      10
15 C1a   2022-11-18 20:47:12       24    10      10
16 T1    2022-11-18 20:47:26       14    10      10
17 C1b   2022-11-18 20:47:36       10    11      11
18 C1a   2022-11-18 20:47:40        4    11      11
19 T1    2022-11-18 20:47:52       12    11      11
20 C1a   2022-11-18 20:48:08       16    12      12
21 T1    2022-11-18 20:48:18       10    12      12
22 T1    2022-11-18 20:48:46       28    13      13
23 C1c   2022-11-18 23:05:08     8182    14      14
24 C1b   2022-11-18 23:06:10       62    15      15
25 C1a   2022-11-18 23:07:06       56    16      16
26 T1    2022-11-18 23:07:18       12    16      16
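
If you want to reuse this on other data, the same logic could be wrapped in a function that takes the two columns as plain vectors. This is only a sketch: the function name assign_seq and the max_gap argument are names I made up for illustration, and the FT and DIFFTIME values below are copied from the question so the snippet runs on its own.

```r
# Hypothetical helper (name and arguments are illustrative, not from any package).
# Returns one sequence number per row, following the rule described above.
assign_seq <- function(ft, difftime, max_gap = 45) {
  ft <- as.character(ft)           # compare factor FT values by label, not by code
  seq_id <- integer(length(ft))
  seq_no <- 1L
  current_grp <- character(0)      # FT values in the currently active sequence

  for (i in seq_along(ft)) {
    if (ft[i] %in% current_grp || difftime[i] > max_gap) {
      seq_no <- seq_no + 1L        # duplicate FT or long gap: new sequence
      current_grp <- ft[i]
    } else {
      current_grp <- c(current_grp, ft[i])
    }
    seq_id[i] <- seq_no
  }
  seq_id
}

# FT and DIFFTIME values copied from the question
ft <- c("C1c", "C1c", "T1", "C1a", "T1", "C1c", "C1b", "C1b", "T1", "T1",
        "C1a", "T1", "C1b", "C1b", "C1a", "T1", "C1b", "C1a", "T1", "C1a",
        "T1", "T1", "C1c", "C1b", "C1a", "T1")
difftime <- c(0, 26, 106, 40, 16, 1286, 18, 32, 14, 26, 10, 22, 81564, 30,
              24, 14, 10, 4, 12, 16, 10, 28, 8182, 62, 56, 12)

result <- assign_seq(ft, difftime)
```

With the question's data frame this would be called as input$SEQ <- assign_seq(input$FT, input$DIFFTIME). Note that the sketch assumes DIFFTIME has no missing values; an NA would make the if condition fail.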
