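I have a data frame that represents a two-year daily temperature time series for a river. For this river, I would like to know the day of the year (doy) on which: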
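1. Temperature is sustained at or above 10 degrees
   - "Sustained" means the temperature does not drop below 10 again until after the annual peak temperature, for example in the fall or winter
2. Temperature is sustained at or below 10 degrees
   - "Sustained" means the temperature does not rise above 10 again until the following year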
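When I try to calculate 2, I get an error because there are multiple TRUE answers for the code to select from. I would like to know how I can make the code take the first TRUE answer when there are multiple TRUE answers.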
Example dataset
library(ggplot2)
library(lubridate)
library(dplyr)
library(dataRetrieval)
siteNumber <- "01417500"
parameterCd <- "00010" # water temperature
statCd <- "00003" # mean
startDate <- "2015-01-01"
endDate <- "2016-12-31"
dat <- readNWISdv(siteNumber, parameterCd, startDate, endDate, statCd=statCd)
dat <- dat[,c(2:4)]
colnames(dat)[3] <- "temperature"
# Visually inspect the time series
ggplot(data = dat, aes(x = Date, y = temperature)) +
  geom_point() +
  theme_bw()
Code for 1 & 2, where 2 has a problem because there are multiple TRUE statements to select from
dat %>%
  mutate(year = year(Date),
         doy = yday(Date)) %>%
  group_by(year) %>%
  mutate(gt_10 = temperature >= 10, # greater than or equal to 10 degrees
         lt_10 = temperature <= 10, # less than or equal to 10 degrees
         peak_doy = doy[which.max(temperature)], # what doy is max temperature
         below_peak = doy < peak_doy, # is the observed doy less than the peak temperature doy
         after_peak = doy > peak_doy, # is the observed doy greater than the peak temperature doy
         test_above = ave(gt_10, cumsum(!gt_10), FUN = cumsum), # counts number of consecutive days at or above the 10 degree threshold
         test_below = ave(lt_10, cumsum(!lt_10), FUN = cumsum)) %>% # counts number of consecutive days at or below the 10 degree threshold
  summarise(first_above_10_sustained = doy[below_peak == T & test_above == 14] - 13, # answer to 1
            first_below_10_sustained = doy[after_peak == T & test_below == 14] - 13) # answer to 2
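- To answer 2, the code looks for the days that fall after the annual peak temperature (i.e., after_peak == T) and on which the temperature has been at or below the 10 degree threshold for 14 consecutive days (i.e., test_below == 14). The test_below == 14 condition is where the error comes from, because it is met many times. Yes, you could change the number of consecutive days to some other value, but that is not the point. If there are multiple TRUE answers, how do I get the code to accept the first TRUE answer?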
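To make the problem concrete, here is a minimal base R sketch on made-up data, with a run length of 3 standing in for the 14 days used above. The vectors `lt_10_toy`, `doy_toy` and `run_len` are hypothetical names introduced only for illustration. It shows the consecutive-day counter reaching the target more than once, and one possible way (subsetting the matches with `[1]`) to keep only the first hit. It is a sketch under these assumptions, not a tested solution for the full dataset.

```r
# Hypothetical toy data: TRUE marks a day at or below the threshold.
lt_10_toy <- c(TRUE, TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, FALSE, TRUE)
doy_toy   <- seq_along(lt_10_toy)

# Running count of consecutive TRUE days; the count restarts after every FALSE.
# as.integer() is used here so the counter is explicitly an integer vector.
run_len <- ave(as.integer(lt_10_toy), cumsum(!lt_10_toy), FUN = cumsum)

# The counter equals 3 on two different days, so this returns two values.
all_starts <- doy_toy[run_len == 3] - 2

# Taking element [1] of the matches keeps only the first run that reaches 3 days.
first_start <- doy_toy[run_len == 3][1] - 2

print(run_len)      # 1 2 3 4 0 1 2 3 0 1
print(all_starts)   # 1 6
print(first_start)  # 1
```

Note that `[1]` on an empty set of matches gives NA, so a year in which the condition is never met would produce NA rather than an error.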
My approach is based on a similar SO question here, but it only works when there are not multiple TRUE answers to select from.