R: Apply a function with 3 consecutive formulas to a robust data frame with the same variables

Posted on December 23

To reproduce the problem, I used the following data frame:


#Step 1. Load libraries and data frame
library(tidyverse)
library(lubridate)

df <- data.frame(
  stringsAsFactors = FALSE,
  check.names = FALSE,
  Date = c("01/11/1876","01/12/1876",
           "01/01/1877","01/02/1877","01/03/1877",
           "01/04/1877","01/05/1877","01/06/1877",
           "01/07/1877","01/08/1877","01/09/1877",
           "01/10/1877","01/11/1877","01/12/1877",
           "01/01/1878"),
  `Att-Bissen P [mm]` = c(48.5,111.2,29.7,139.4,90.1,25.9,
                          216,94.6,40.5,NA,64.4,68.8,44.7,
                          34.8,71.9),
  `Att-Bissen PET [mm]` = c(88.4,88.3,80.5,53.4,36.7,20.2,
                            21.6,21.7,21.3,37.6,46.1,66.5,89.8,
                            121.5,87.7),
  `Att-Bissen Q [mm]` = c(13.5,12.6,11.3,12.9,44.6,21.3,
                          194.9,NA,49.1,46.7,63.6,25.4,19.8,
                          15.3,16),
  `Rau. Merl P [mm]` = c(43.7,104.2,25.5,131.3,83.7,21.9,
                         205.2,88.1,35.9,61,59,63.2,40,
                         30.4,66.2),
  `Rau. Merl PET [mm]` = c(91.4,91.3,83.2,54.9,37.5,20.3,
                           21.8,21.8,21.4,38.4,47.3,68.6,NA,
                           125.9,90.7),
  `Rau. Merl Q [mm]` = c(8.7,10.6,8.4,14.3,23.7,14.1,
                         131.6,106.7,40.1,42.4,50.3,24.6,16.7,
                         11.3,13.7),
  `Syre Felsmuhle/Mertert P [mm]` = c(37.8,89.5,22.3,112.7,72,19.2,
                                      175.8,75.8,31.2,52.6,50.9,54.5,34.7,
                                      26.5,57.1),
  `Syre Felsmuhle/Mertert PET [mm]` = c(95.6,95.6,86.9,57.2,38.8,20.7,
                                        22.3,22.3,21.9,39.8,49.2,71.6,97.2,
                                        132,94.9),
  `Syre Felsmuhle/Mertert Q [mm]` = c(16,22,17.9,24,23.1,11.4,91,NA,
                                      NA,45.2,65.6,NA,NA,NA,NA),
  `Wiltz-Winseler P [mm]` = c(50.1,106.9,33,132.4,87.7,29.7,
                              201.8,91.8,42.8,66.4,64.5,68.5,46.7,
                              37.7,71.3),
  `Wiltz-Winseler PET [mm]` = c(87.4,87.3,79.5,52.5,35.8,19.4,
                                20.8,20.8,20.4,36.7,NA,NA,88.8,
                                120.4,86.7),
  `Wiltz-Winseler Q [mm]` = c(7.2,6.3,5,8.6,33.9,32.2,234.2,
                              148.1,68.5,51.5,101.4,25.7,18.7,
                              14.3,12.1))

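The data frame consists of four sites, each with three parameters: P, PET and Q. In step 2, I created a function with three formulas that need to be applied to each site. Keep in mind that these formulas apply to every time step.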

# Step 2: Create Anomalies
# Calculate anomalies for P, PET, and Q
formula_1 <- function(P, PET, Q) {
  Anomaly_P = P - mean(P, na.rm = TRUE)
  Anomaly_PET = PET - mean(PET, na.rm = TRUE)
  Anomaly_Q = Q - mean(Q, na.rm = TRUE)
  return(list(Anomaly_P = Anomaly_P, Anomaly_PET = Anomaly_PET, Anomaly_Q = Anomaly_Q))
}
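As a quick sanity check of what each line inside formula_1 computes, here is a minimal sketch on a toy vector. The values are made up for illustration and are not taken from the data above. Each anomaly is the value minus the mean of the non-missing values, and a missing input stays missing.

```r
# Toy vector (hypothetical values) with one missing observation
x <- c(10, 20, NA, 30)

# Mean of the non-missing values: (10 + 20 + 30) / 3 = 20
x_mean <- mean(x, na.rm = TRUE)

# Anomaly = value minus mean; the NA position remains NA
anomaly <- x - x_mean
anomaly
# -10   0  NA  10
```

Because the result has the same length as the input, every time step keeps its own anomaly value.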

Step 3 extracts the name of each site.

#Step 3: Extract the site names from the column names
site_names <- sub(" P \\[mm\\]| PET \\[mm\\]| Q \\[mm\\]", "", names(df)[-1]) |>
  unique()
site_names

#Step 4: Loop through each site and calculate the formula

results <- list()
for (site in site_names) {
  site_data <- df[, grepl(site, names(df))]
  results[[site]] <- formula_1(site_data[[paste0(site, " P [mm]")]], 
                                  site_data[[paste0(site, " PET [mm]")]], 
                                  site_data[[paste0(site, " Q [mm]")]])
}

#Step 5: unlist results
results_sum <- data.frame(Site = names(results), unlist(results))

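I don't know where I went wrong. The code produces a data frame with only 2 columns and 180 entries. What I want is a data frame in which three additional columns are added for each site, containing the anomalies of P, PET and Q (per time step).

Any help would be greatly appreciated.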
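The 180 entries are consistent with how unlist() treats a nested list: 4 sites x 3 anomaly vectors x 15 time steps = 180 values, flattened into one named vector. The sketch below reproduces the effect on a smaller scale. The list `toy_results` and its site names are hypothetical stand-ins for `results`, with only two time steps per vector.

```r
# Hypothetical nested list shaped like `results`: one element per site,
# each holding three numeric vectors (here only 2 time steps each)
toy_results <- list(
  SiteA = list(Anomaly_P = c(1, -1), Anomaly_PET = c(2, -2), Anomaly_Q = c(3, -3)),
  SiteB = list(Anomaly_P = c(4, -4), Anomaly_PET = c(5, -5), Anomaly_Q = c(6, -6))
)

# unlist() flattens everything into a single named numeric vector
flat <- unlist(toy_results)
length(flat)       # 2 sites * 3 variables * 2 time steps = 12
names(flat)[1:2]   # "SiteA.Anomaly_P1" "SiteA.Anomaly_P2"

# data.frame() recycles the 2 site names along the 12 values,
# so the result has only two columns
toy_sum <- data.frame(Site = names(toy_results), value = flat)
dim(toy_sum)       # 12 rows, 2 columns
```

Note that the recycled `Site` column simply cycles through the site names, so it does not line up with the site each value actually came from.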

EDIT: The following is what I would like to end up with: a data frame where the anomalies of P, PET and Q (per time step) are added after each site. (The brown/red columns are the result of the anomaly calculation, x - mean(x).)

# In step 2: Change the formula to get back a data frame:
formula_1 <- function(P, PET, Q) {
  Anomaly_P = P - mean(P, na.rm = TRUE)
  Anomaly_PET = PET - mean(PET, na.rm = TRUE)
  Anomaly_Q = Q - mean(Q, na.rm = TRUE)
  return(data.frame(Anomaly_P = Anomaly_P, Anomaly_PET = Anomaly_PET, Anomaly_Q = Anomaly_Q))
}

# Step 3
site_names <- sub(" P \\[mm\\]| PET \\[mm\\]| Q \\[mm\\]", "", names(df)[-1]) |>
  unique()
site_names

# In step 4 store results in your data frame
results <- list()
for (site in site_names) {
  site_data <- df[, grepl(site, names(df))]
  anomalies <- formula_1(site_data[[paste0(site, " P [mm]")]],
                         site_data[[paste0(site, " PET [mm]")]],
                         site_data[[paste0(site, " Q [mm]")]])
  anomalies$Date = df$Date  # Add the Date column to each site's anomalies
  anomalies$Site = site     # Add the Site column to each site's anomalies
  results[[site]] <- anomalies
}

# Step 5: combine all results
do.call(rbind, results)

                Anomaly_P Anomaly_PET   Anomaly_Q       Date       Site
Att-Bissen.1   -28.678571   29.646667  -25.571429 01/11/1876 Att-Bissen
Att-Bissen.2    34.021429   29.546667  -26.471429 01/12/1876 Att-Bissen
Att-Bissen.3   -47.478571   21.746667  -27.771429 01/01/1877 Att-Bissen
Att-Bissen.4    62.221429   -5.353333  -26.171429 01/02/1877 Att-Bissen
Att-Bissen.5    12.921429  -22.053333    5.528571 01/03/1877 Att-Bissen
Att-Bissen.6   -51.278571  -38.553333  -17.771429 01/04/1877 Att-Bissen
Att-Bissen.7   138.821429  -37.153333  155.828571 01/05/1877 Att-Bissen
Att-Bissen.8    17.421429  -37.053333          NA 01/06/1877 Att-Bissen
Att-Bissen.9   -36.678571  -37.453333   10.028571 01/07/1877 Att-Bissen
Att-Bissen.10          NA  -21.153333    7.628571 01/08/1877 Att-Bissen
Att-Bissen.11  -12.778571  -12.653333   24.528571 01/09/1877 Att-Bissen
Att-Bissen.12   -8.378571    7.746667  -13.671429 01/10/1877 Att-Bissen
Att-Bissen.13  -32.478571   31.046667  -19.271429 01/11/1877 Att-Bissen
Att-Bissen.14  -42.378571   62.746667  -23.771429 01/12/1877 Att-Bissen
...........
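The output above is in long format (one block of rows per site). If the layout described in the EDIT is wanted instead, with the anomaly columns placed right after each site's original columns, one possible base R sketch is shown below. The data frame `toy`, the site names `A` and `B`, and the output column names such as `A Anomaly_P` are assumptions made for illustration; only the " P [mm]" / " PET [mm]" / " Q [mm]" naming pattern comes from the question.

```r
# Small stand-in for `df` (hypothetical values, same column naming pattern)
toy <- data.frame(
  check.names = FALSE,
  Date = c("01/01/1877", "01/02/1877", "01/03/1877"),
  `A P [mm]`   = c(10, 20, 30),
  `A PET [mm]` = c(1, 2, 3),
  `A Q [mm]`   = c(5, NA, 7),
  `B P [mm]`   = c(2, 4, 6),
  `B PET [mm]` = c(3, 3, 3),
  `B Q [mm]`   = c(0, 1, 2)
)

# Site names: strip the " P [mm]" / " PET [mm]" / " Q [mm]" suffix
sites <- unique(sub(" (P|PET|Q) \\[mm\\]$", "", names(toy)[-1]))

# Start from the Date column and append, site by site,
# the three original columns followed by their three anomaly columns
wide <- toy["Date"]
for (site in sites) {
  cols <- paste0(site, c(" P [mm]", " PET [mm]", " Q [mm]"))
  orig <- toy[cols]
  # Anomaly = value minus the column mean (ignoring NA)
  anom <- as.data.frame(lapply(orig, function(x) x - mean(x, na.rm = TRUE)))
  # Hypothetical naming scheme for the new columns
  names(anom) <- paste0(site, " Anomaly_", c("P", "PET", "Q"))
  wide <- cbind(wide, orig, anom)
}
names(wide)
```

With the real data, replacing `toy` by `df` should give one row per date and six columns per site, though this has only been reasoned through on the toy input here.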

