I am trying to calculate the cumulative sum of Rates from DATE to DATE_following.

For example:

library(tidyverse)
library(bizdays)
library(lubridate)

set.seed(1)
dat <- seq.Date(from = as.Date(as.Date("2023-04-06")- days(10)),
                to = as.Date(as.Date("2023-04-06")),
                by = "day")  %>% 
  data.frame(DATE = .) %>% 
  mutate(Rates = sample(seq(from=1,to=10,by=1), size = length(DATE),replace=TRUE),
         DATE_following = modified.following(DATE %m+% days(3)))

dat
        DATE Rates DATE_following
1  2023-03-27     9     2023-03-30
2  2023-03-28     4     2023-03-31
3  2023-03-29     7     2023-04-01
4  2023-03-30     1     2023-04-02
5  2023-03-31     2     2023-04-03
6  2023-04-01     7     2023-04-04
7  2023-04-02     2     2023-04-05
8  2023-04-03     3     2023-04-06
9  2023-04-04     1     2023-04-07
10 2023-04-05     5     2023-04-08
11 2023-04-06     5     2023-04-09

The output I would like to get is:

  1. Result: 9+4+7+1 = 21 (sum of Rates from 2023-03-27 to 2023-03-30)
  2. Result: 4+7+1+2 = 14 …
         DATE Rates DATE_following Results
1  2023-03-27     9     2023-03-30      21
2  2023-03-28     4     2023-03-31      14
3  2023-03-29     7     2023-04-01      17
4  2023-03-30     1     2023-04-02      12
5  2023-03-31     2     2023-04-03      14
6  2023-04-01     7     2023-04-04      13
7  2023-04-02     2     2023-04-05      11
8  2023-04-03     3     2023-04-06      14
9  2023-04-04     1     2023-04-07      NA
10 2023-04-05     5     2023-04-08      NA
11 2023-04-06     5     2023-04-09      NA

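Is it possible to get this result with dplyr functions such as rowwise() and cumsum()? My main problem is that I don't know how to define this condition inside those functions.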

Recommended answer

If you want a rolling sum over four consecutive Rates, you can use zoo::rollsum():

library(dplyr)
library(zoo)

dat %>% 
  mutate(Result = rollsum(Rates, k = 4, fill = NA_real_, align = "left"))

This returns

# A tibble: 11 × 5
      no DATE       Rates DATE_following Result
   <dbl> <date>     <dbl> <date>          <dbl>
 1     1 2023-03-27     9 2023-03-30         21
 2     2 2023-03-28     4 2023-03-31         14
 3     3 2023-03-29     7 2023-04-01         17
 4     4 2023-03-30     1 2023-04-02         12
 5     5 2023-03-31     2 2023-04-03         14
 6     6 2023-04-01     7 2023-04-04         13
 7     7 2023-04-02     2 2023-04-05         11
 8     8 2023-04-03     3 2023-04-06         14
 9     9 2023-04-04     1 2023-04-07         NA
10    10 2023-04-05     5 2023-04-08         NA
11    11 2023-04-06     5 2023-04-09         NA
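
Since the question mentions cumsum(), here is a minimal base R sketch of the same left-aligned four-row window written as a difference of cumulative sums. It is not part of the original answer; the names k, cs, starts and ends are introduced only for illustration, and the Rates vector is copied from the data shown above.

```r
# Rates copied from the example data
Rates <- c(9, 4, 7, 1, 2, 7, 2, 3, 1, 5, 5)
k <- 4                                  # window length in rows (assumed fixed)

n  <- length(Rates)
cs <- c(0, cumsum(Rates))               # cs[i + 1] is the sum of the first i values

starts <- seq_len(n)                    # first row of each window
ends   <- starts + k - 1                # last row of each left-aligned window

# Sum of rows starts..ends is cs[ends + 1] - cs[starts];
# windows that run past the end of the data get NA.
Result <- ifelse(ends <= n, cs[pmin(ends, n) + 1] - cs[starts], NA_real_)
Result
```

Like rollsum() with k = 4, this counts rows, not dates, so it only matches the date range when there is exactly one row per day.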

Based on 勒马克's comments, here is a slightly more general answer:

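dat2 %>% 
  # per-row window width: calendar days from DATE to DATE_following, inclusive
  mutate(days = as.integer(DATE_following - DATE) + 1,
         # passing a vector as `width` gives each row its own left-aligned window
         res = rollapply(data = Rates, width = days, FUN = sum, align = "left", fill = NA_real_))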

This returns

# A tibble: 11 × 6
      no DATE       Rates DATE_following  days   res
   <dbl> <date>     <dbl> <date>         <dbl> <dbl>
 1     1 2023-03-27     9 2023-03-30         4    21
 2     2 2023-03-28     4 2023-03-31         4    14
 3     3 2023-03-29     7 2023-04-01         4    17
 4     4 2023-03-30     1 2023-04-02         4    12
 5     5 2023-03-31     2 2023-04-10        11    NA
 6     6 2023-04-01     7 2023-04-04         4    13
 7     7 2023-04-02     2 2023-04-05         4    11
 8     8 2023-04-03     3 2023-04-06         4    14
 9     9 2023-04-04     1 2023-04-07         4    NA
10    10 2023-04-05     5 2023-04-08         4    NA
11    11 2023-04-06     5 2023-04-09         4    NA

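Because the DATE_following in row 5 is not in the data, this version returns NA for that row. Also, instead of summing four consecutive days, this version counts the number of days between DATE and DATE_following and uses that count as the window width of the rolling sum.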
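
As an alternative sketch (not the method used in the answer above), the window can also be selected by date instead of by row count, which is closer to the row-wise condition the question asks about. The code below uses base R only; it rebuilds a small copy of dat2 by hand with only the columns it needs, and it assumes that a row should be NA whenever its DATE_following does not occur in DATE.

```r
# Hand-built copy of dat2 (only the columns needed here)
dat2 <- data.frame(
  DATE  = as.Date("2023-03-27") + 0:10,
  Rates = c(9, 4, 7, 1, 2, 7, 2, 3, 1, 5, 5)
)
dat2$DATE_following <- dat2$DATE + 3
dat2$DATE_following[5] <- as.Date("2023-04-10")   # row 5 points outside the data

# For each row, sum Rates over all rows whose DATE lies in
# [DATE, DATE_following]; return NA if the end date is not in the data.
dat2$res <- sapply(seq_len(nrow(dat2)), function(i) {
  start <- dat2$DATE[i]
  end   <- dat2$DATE_following[i]
  if (!any(dat2$DATE == end)) return(NA_real_)
  sum(dat2$Rates[dat2$DATE >= start & dat2$DATE <= end])
})

dat2
```

On this data the res column agrees with the rollapply() output above. Because rows are matched on the dates themselves, the sketch does not rely on there being exactly one row per calendar day, but it scans the whole column for every row and may be slow on large data.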

Data

dat <- structure(list(no = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11), DATE = structure(c(19443, 
19444, 19445, 19446, 19447, 19448, 19449, 19450, 19451, 19452, 
19453), class = "Date"), Rates = c(9, 4, 7, 1, 2, 7, 2, 3, 1, 
5, 5), DATE_following = structure(c(19446, 19447, 19448, 19449, 
19450, 19451, 19452, 19453, 19454, 19455, 19456), class = "Date")), class = c("spec_tbl_df", 
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -11L), spec = structure(list(
    cols = list(no = structure(list(), class = c("collector_double", 
    "collector")), DATE = structure(list(format = ""), class = c("collector_date", 
    "collector")), Rates = structure(list(), class = c("collector_double", 
    "collector")), DATE_following = structure(list(format = ""), class = c("collector_date", 
    "collector"))), default = structure(list(), class = c("collector_guess", 
    "collector")), skip = 1L), class = "col_spec"))

dat2 <- structure(list(no = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11), DATE = structure(c(19443, 
19444, 19445, 19446, 19447, 19448, 19449, 19450, 19451, 19452, 
19453), class = "Date"), Rates = c(9, 4, 7, 1, 2, 7, 2, 3, 1, 
5, 5), DATE_following = structure(c(19446, 19447, 19448, 19449, 
19457, 19451, 19452, 19453, 19454, 19455, 19456), class = "Date")), class = c("spec_tbl_df", 
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -11L), spec = structure(list(
    cols = list(no = structure(list(), class = c("collector_double", 
    "collector")), DATE = structure(list(format = ""), class = c("collector_date", 
    "collector")), Rates = structure(list(), class = c("collector_double", 
    "collector")), DATE_following = structure(list(format = ""), class = c("collector_date", 
    "collector"))), default = structure(list(), class = c("collector_guess", 
    "collector")), skip = 1L), class = "col_spec"))
