How to aggregate and at the same time replace values in one dataset with values from a second dataset in R

Posted on April 3

Here is part of my dataset:

mydata=structure(list(sales_point_id = c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), calendar_id_operday = c(20210102L, 
20210102L, 20210102L, 20210102L, 20210102L, 20210102L, 20210102L, 
20210102L, 20210102L, 20210102L, 20210102L, 20210102L, 20210102L, 
20210102L, 20210102L, 20210102L, 20210102L, 20210102L, 20210102L, 
20210102L, 20210102L, 20210102L, 20210102L, 20210102L, 20210102L, 
20210102L, 20210102L, 20210102L, 20210102L, 20210102L, 20210102L, 
20210102L, 20210102L, 20210102L, 20210102L, 20210102L, 20210102L, 
20210102L, 20210102L, 20210102L, 20210102L, 20210102L, 20210102L, 
20210102L, 20210102L, 20210102L, 20210102L), line_fact_amt = c(23749L, 
1000L, 3050L, 1550L, 8900L, 1550L, 0L, 300L, 0L, 499L, 5450L, 
300L, 0L, 499L, 599L, 599L, 6050L, 300L, 599L, 1400L, 300L, 0L, 
2000L, 700L, 0L, 5990L, 8877L, 1999L, 257L, 200L, 361L, 300L, 
1990L, 2453L, 3140L, 0L, 0L, 199L, 599L, 10990L, 7990L, 773L, 
400L, 6000L, 2269L, 2000L, 1999L)), class = "data.frame", row.names = c(NA, 
-47L))

calendar_id_operday is the operating day, encoded as YYYYMMDD (for example, 20210102). I need to sum line_fact_amt for each calendar_id_operday.

mydata2=structure(list(sales_point_id = c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), calendar_id_operday = c(20220102L, 
20220102L, 20220102L, 20220102L, 20220102L, 20220102L, 20220102L, 
20220102L, 20220102L, 20220102L, 20220102L, 20220102L, 20220102L, 
20220102L, 20220102L, 20220102L, 20220102L, 20220102L, 20220102L, 
20220102L, 20220102L, 20220102L, 20220102L, 20220102L, 20220102L, 
20220102L, 20220102L, 20220102L, 20220102L, 20220102L, 20220102L, 
20220102L, 20220102L, 20220102L, 20220102L, 20220102L, 20220102L, 
20220102L, 20220102L, 20220102L, 20220102L, 20220102L, 20220102L, 
20220102L, 20220102L, 20220102L, 20220102L), line_fact_amt = c(38586L, 
15837L, 17887L, 16387L, 23737L, 16387L, 14837L, 15137L, 14837L, 
15336L, 20287L, 15137L, 14837L, 15336L, 15436L, 15436L, 20887L, 
15137L, 15436L, 16237L, 15137L, 14837L, 16837L, 15537L, 14837L, 
20827L, 23714L, 16836L, 15094L, 15037L, 15198L, 15137L, 16827L, 
17290L, 17977L, 14837L, 14837L, 15036L, 15436L, 25827L, 22827L, 
15610L, 15237L, 20837L, 17106L, 16837L, 16836L)), class = "data.frame", row.names = c(NA, 
-47L))

This second dataset (2022) needs the same aggregation: line_fact_amt summed by calendar_id_operday.
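For illustration, the daily aggregation can be sketched with base R's `aggregate`. The block rebuilds the 2021 sample compactly (the 47 amounts are copied from `mydata`; the name `daily_2021` is only an assumption for this sketch) and sums `line_fact_amt` per sales point and day. The same call would apply to `mydata2`.

```r
# 2021 amounts copied from the mydata sample in the question.
amt_2021 <- c(23749L, 1000L, 3050L, 1550L, 8900L, 1550L, 0L, 300L, 0L, 499L,
              5450L, 300L, 0L, 499L, 599L, 599L, 6050L, 300L, 599L, 1400L,
              300L, 0L, 2000L, 700L, 0L, 5990L, 8877L, 1999L, 257L, 200L,
              361L, 300L, 1990L, 2453L, 3140L, 0L, 0L, 199L, 599L, 10990L,
              7990L, 773L, 400L, 6000L, 2269L, 2000L, 1999L)

# All sample rows share one sales point and one day, so the scalars recycle.
mydata <- data.frame(sales_point_id      = 2L,
                     calendar_id_operday = 20210102L,
                     line_fact_amt       = amt_2021)

# One row per (sales point, day) with the summed amount.
daily_2021 <- aggregate(line_fact_amt ~ sales_point_id + calendar_id_operday,
                        data = mydata, FUN = sum)
print(daily_2021)
```

For this sample the result has a single row whose total matches the 118180 quoted in the question.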

The main difficulty for me: if the aggregated value for a day in the next year (2022) exceeds the aggregated value for the same day in the previous year (2021) by more than 40%, the 2022 value should be replaced with the previous year's value for that date plus 10%. For example, for 20220102 the sum is 815519, while for the same day in 2021 it is 118180. The difference is more than 40%, so 815519 is replaced by 118180 + 10% of it (129998). How can I do this for each sales_point_id separately?
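One possible way to express this rule in base R is sketched below. It is not a confirmed answer, only an illustration under these assumptions: "difference of more than 40%" is read as the 2022 total exceeding the 2021 total by more than 40% of the 2021 total; the "same day" is matched by subtracting 10000 from the YYYYMMDD integer; days with no match in the previous year (including 29 February) are left unchanged. The function name `cap_yoy`, its parameters, the column `adjusted_amt`, and sales point 3 are hypothetical. The two rows for sales point 2 reuse the daily totals quoted in the question.

```r
# Sketch: cap year-over-year growth per sales point and calendar day.
cap_yoy <- function(prev, curr, threshold = 0.40, uplift = 0.10) {
  # Daily totals per sales point; as.numeric avoids integer overflow in sum().
  f <- line_fact_amt ~ sales_point_id + calendar_id_operday
  total <- function(x) sum(as.numeric(x))
  p  <- aggregate(f, data = prev, FUN = total)
  cu <- aggregate(f, data = curr, FUN = total)

  # Same day one year earlier: YYYYMMDD - 10000 shifts the year by one.
  cu$prev_day <- cu$calendar_id_operday - 10000L
  names(p) <- c("sales_point_id", "prev_day", "prev_amt")
  out <- merge(cu, p, by = c("sales_point_id", "prev_day"), all.x = TRUE)

  # Replace only where a previous-year total exists and growth exceeds the threshold.
  too_high <- !is.na(out$prev_amt) &
    out$line_fact_amt > out$prev_amt * (1 + threshold)
  out$adjusted_amt <- ifelse(too_high,
                             out$prev_amt * (1 + uplift),
                             out$line_fact_amt)
  out[, c("sales_point_id", "calendar_id_operday",
          "line_fact_amt", "prev_amt", "adjusted_amt")]
}

# Point 2 uses the totals from the question; point 3 is a made-up control
# whose growth (400 -> 500, i.e. 25%) stays under the threshold.
prev <- data.frame(sales_point_id      = c(2L, 3L),
                   calendar_id_operday = c(20210102L, 20210102L),
                   line_fact_amt       = c(118180L, 400L))
curr <- data.frame(sales_point_id      = c(2L, 3L),
                   calendar_id_operday = c(20220102L, 20220102L),
                   line_fact_amt       = c(815519L, 500L))

result <- cap_yoy(prev, curr)
print(result)
```

With the full datasets, the call would be `cap_yoy(mydata, mydata2)`. Because the grouping includes `sales_point_id`, each sales point is compared only against its own previous-year total.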

Thank you for your valuable help.
