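I need help identifying the longest run of consecutive values (= 1) within groups of observations in R.

I have monthly rainfall data for several towns. For each town and year, I need to identify the longest period in which monthly rainfall was above the annual average (rain_above = 1). If a year has two periods of equal length, I want the one with the larger total rainfall.

Some example data: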

df1 <- data.frame(cbind(town=c("A","A","A","A","A","A","A","A","A","A","A","A",
                               "A","A","A","A","A","A","A","A","A","A","A","A",
                               "B","B","B","B","B","B","B","B","B","B","B","B",
                               "B","B","B","B","B","B","B","B","B","B","B","B"), 
                        year=c(2000,2000,2000,2000,2000,2000,2000,2000,2000,2000,2000,2000,
                               2001,2001,2001,2001,2001,2001,2001,2001,2001,2001,2001,2001,
                               2000,2000,2000,2000,2000,2000,2000,2000,2000,2000,2000,2000,
                               2001,2001,2001,2001,2001,2001,2001,2001,2001,2001,2001,2001), 
                        month=c(1,2,3,4,5,6,7,8,9,10,11,12,
                                1,2,3,4,5,6,7,8,9,10,11,12,
                                1,2,3,4,5,6,7,8,9,10,11,12,
                                1,2,3,4,5,6,7,8,9,10,11,12), 
                        rain_above =c(0,0,0,1,1,1,1,1,0,0,0,0,
                                      0,0,0,0,1,1,1,1,1,0,0,0,
                                      0,1,1,1,1,0,0,0,1,1,0,0,
                                      1,1,1,0,0,0,1,1,1,0,0,0),
                        rain = c(4.5,4,5,7.1,7.7,8,7.4,7.9,5.1,4.9,4.6,4.4,
                                 4.4,4,4.8,5.1,7.2,7.4,7.4,7.1,7.6,5.4,5.1,5,
                                 7.3,11.3,11.5,11.6,11.1,6.5,6.4,6.2,9.9,10.2,5.4,5.5,
                                 10.4,10.9,11.4,7.8,7.3,7.2,9.8,9.9,10,7.2,6.9,6.6)))

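In df1, town A's rainy season in 2000 runs from month 4 to month 8. It is the only period that year with rain_above = 1.

Town B's rainy season in 2001 runs from month 1 to month 3. Although there are two periods of equal length (3 months), the first one has the larger total rainfall that year.

View(df1)
df1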
   town year month rain_above rain
1     A 2000     1          0  4.5
2     A 2000     2          0    4
3     A 2000     3          0    5
4     A 2000     4          1  7.1
5     A 2000     5          1  7.7
6     A 2000     6          1    8
7     A 2000     7          1  7.4
8     A 2000     8          1  7.9
9     A 2000     9          0  5.1
10    A 2000    10          0  4.9
11    A 2000    11          0  4.6
12    A 2000    12          0  4.4
13    A 2001     1          0  4.4
14    A 2001     2          0    4
15    A 2001     3          0  4.8
16    A 2001     4          0  5.1
17    A 2001     5          1  7.2
18    A 2001     6          1  7.4
19    A 2001     7          1  7.4
20    A 2001     8          1  7.1
21    A 2001     9          1  7.6
22    A 2001    10          0  5.4
23    A 2001    11          0  5.1
24    A 2001    12          0    5
25    B 2000     1          0  7.3
26    B 2000     2          1 11.3
27    B 2000     3          1 11.5
28    B 2000     4          1 11.6
29    B 2000     5          1 11.1
30    B 2000     6          0  6.5
31    B 2000     7          0  6.4
32    B 2000     8          0  6.2
33    B 2000     9          1  9.9
34    B 2000    10          1 10.2
35    B 2000    11          0  5.4
36    B 2000    12          0  5.5
37    B 2001     1          1 10.4
38    B 2001     2          1 10.9
39    B 2001     3          1 11.4
40    B 2001     4          0  7.8
41    B 2001     5          0  7.3
42    B 2001     6          0  7.2
43    B 2001     7          1  9.8
44    B 2001     8          1  9.9
45    B 2001     9          1   10
46    B 2001    10          0  7.2
47    B 2001    11          0  6.9
48    B 2001    12          0  6.6

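I would like to create an indicator variable for the rainy season that equals 1 in the months of the longest above-average period (choosing the period with the largest total rainfall when lengths tie) and 0 otherwise: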

df1
   town year month rain_above rain season
1     A 2000     1          0  4.5      0
2     A 2000     2          0    4      0
3     A 2000     3          0    5      0
4     A 2000     4          1  7.1      1
5     A 2000     5          1  7.7      1
6     A 2000     6          1    8      1
7     A 2000     7          1  7.4      1
8     A 2000     8          1  7.9      1
9     A 2000     9          0  5.1      0
10    A 2000    10          0  4.9      0
11    A 2000    11          0  4.6      0
12    A 2000    12          0  4.4      0
13    A 2001     1          0  4.4      0
14    A 2001     2          0    4      0
15    A 2001     3          0  4.8      0
16    A 2001     4          0  5.1      0
17    A 2001     5          1  7.2      1
18    A 2001     6          1  7.4      1
19    A 2001     7          1  7.4      1
20    A 2001     8          1  7.1      1
21    A 2001     9          1  7.6      1
22    A 2001    10          0  5.4      0
23    A 2001    11          0  5.1      0
24    A 2001    12          0    5      0
25    B 2000     1          0  7.3      0
26    B 2000     2          1 11.3      1
27    B 2000     3          1 11.5      1
28    B 2000     4          1 11.6      1
29    B 2000     5          1 11.1      1
30    B 2000     6          0  6.5      0
31    B 2000     7          0  6.4      0
32    B 2000     8          0  6.2      0
33    B 2000     9          1  9.9      0
34    B 2000    10          1 10.2      0
35    B 2000    11          0  5.4      0
36    B 2000    12          0  5.5      0
37    B 2001     1          1 10.4      1
38    B 2001     2          1 10.9      1
39    B 2001     3          1 11.4      1
40    B 2001     4          0  7.8      0
41    B 2001     5          0  7.3      0
42    B 2001     6          0  7.2      0
43    B 2001     7          1  9.8      0
44    B 2001     8          1  9.9      0
45    B 2001     9          1   10      0
46    B 2001    10          0  7.2      0
47    B 2001    11          0  6.9      0
48    B 2001    12          0  6.6      0

Thanks for your help!

Recommended answer

You can try data.table with rleid, as below:

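library(data.table)

# data.frame(cbind(...)) in the question stores every column as character,
# so convert the numeric columns back before summing.
num_cols <- c("year", "month", "rain_above", "rain")
setDT(df1)[, (num_cols) := lapply(.SD, as.numeric), .SDcols = num_cols]

df1[
    ,
    # total rainfall and length of each run of identical rain_above values
    `:=`(sum_rain = sum(rain), grplen = .N),
    .(town, year, rleid(rain_above))
][
    # flag the run that has both the largest total and the greatest length
    , rain_season := +(sum_rain == max(sum_rain) & grplen == max(grplen)),
    .(town, year)
][
    ,
    grplen := NULL
][]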

This gives

    town year month rain_above rain sum_rain rain_season
 1:    A 2000     1          0  4.5     13.5           0
 2:    A 2000     2          0  4.0     13.5           0
 3:    A 2000     3          0  5.0     13.5           0
 4:    A 2000     4          1  7.1     38.1           1
 5:    A 2000     5          1  7.7     38.1           1
 6:    A 2000     6          1  8.0     38.1           1
 7:    A 2000     7          1  7.4     38.1           1
 8:    A 2000     8          1  7.9     38.1           1
 9:    A 2000     9          0  5.1     19.0           0
10:    A 2000    10          0  4.9     19.0           0
11:    A 2000    11          0  4.6     19.0           0
12:    A 2000    12          0  4.4     19.0           0
13:    A 2001     1          0  4.4     18.3           0
14:    A 2001     2          0  4.0     18.3           0
15:    A 2001     3          0  4.8     18.3           0
16:    A 2001     4          0  5.1     18.3           0
17:    A 2001     5          1  7.2     36.7           1
18:    A 2001     6          1  7.4     36.7           1
19:    A 2001     7          1  7.4     36.7           1
20:    A 2001     8          1  7.1     36.7           1
21:    A 2001     9          1  7.6     36.7           1
22:    A 2001    10          0  5.4     15.5           0
23:    A 2001    11          0  5.1     15.5           0
24:    A 2001    12          0  5.0     15.5           0
25:    B 2000     1          0  7.3      7.3           0
26:    B 2000     2          1 11.3     45.5           1
27:    B 2000     3          1 11.5     45.5           1
28:    B 2000     4          1 11.6     45.5           1
29:    B 2000     5          1 11.1     45.5           1
30:    B 2000     6          0  6.5     19.1           0
31:    B 2000     7          0  6.4     19.1           0
32:    B 2000     8          0  6.2     19.1           0
33:    B 2000     9          1  9.9     20.1           0
34:    B 2000    10          1 10.2     20.1           0
35:    B 2000    11          0  5.4     10.9           0
36:    B 2000    12          0  5.5     10.9           0
37:    B 2001     1          1 10.4     32.7           1
38:    B 2001     2          1 10.9     32.7           1
39:    B 2001     3          1 11.4     32.7           1
40:    B 2001     4          0  7.8     22.3           0
41:    B 2001     5          0  7.3     22.3           0
42:    B 2001     6          0  7.2     22.3           0
43:    B 2001     7          1  9.8     29.7           0
44:    B 2001     8          1  9.9     29.7           0
45:    B 2001     9          1 10.0     29.7           0
46:    B 2001    10          0  7.2     20.7           0
47:    B 2001    11          0  6.9     20.7           0
48:    B 2001    12          0  6.6     20.7           0
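
One caveat: the condition above compares each run against every run in the town-year, including runs of zeros. It works for the sample data because the longest above-average run is also the longest and wettest run overall. A minimal sketch of a variant that ranks only the runs with rain_above == 1 (longest first, ties broken by total rainfall) might look like the following. The table `dt`, town "C", and its values are made up for illustration and are not part of the question's data.

```r
library(data.table)

# Hypothetical town-year: the longest run overall is a run of zeros
# (months 1-5), followed by two equal-length runs of ones.
dt <- data.table(
  town       = "C",
  year       = 2002,
  month      = 1:12,
  rain_above = c(0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1),
  rain       = c(4, 4, 4, 4, 4, 9, 9, 9, 4, 9, 10, 9)
)

# Label each run of identical rain_above values within a town-year.
dt[, run := rleid(rain_above), by = .(town, year)]

# Summarise only the runs of ones: length and total rainfall per run.
runs <- dt[rain_above == 1,
           .(len = .N, total = sum(rain)),
           by = .(town, year, run)]

# Longest run first, ties broken by total rainfall; keep one run per town-year.
setorder(runs, town, year, -len, -total)
best <- runs[, .SD[1], by = .(town, year)]

# Flag the months of the winning run; every other month stays 0.
dt[, season := 0L]
dt[best, on = .(town, year, run), season := 1L]
dt[, run := NULL]

print(dt)
```

Here the two runs of ones both last three months, so the second one (total 28 versus 27) is flagged. Because `max(grplen)` in the one-step condition would pick up the five-month run of zeros, that condition would flag no month at all for this town-year.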
