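I need to summarize a huge CSV file (nrow = 1102300). It contains daily climate data from a variety of climate models.
First, I want to summarize all columns with "historical" in their name. My goal is the maximum value per year (i.e. 1950, 1951, etc.), taken from the Date column, for every unique combination of "lat" and "lon".
Any help would be greatly appreciated.
Here is what the data looks like: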
df = read.csv(text = '"lat","lon","Date","pr_CMCC.ESM2_historical","pr_GFDL.ESM4_historical_ssp126","pr_BCC.CSM2.MR_historical_ssp126","pr_INM.CM4.8_historical_ssp126","pr_FGOALS.g3_historical_ssp126","pr_TaiESM1_historical_ssp126","pr_NorESM2.MM_historical_ssp126","pr_CanESM5_historical_ssp126","pr_KIOST.ESM_historical_ssp126","pr_NorESM2.LM_historical_ssp126","pr_INM.CM5.0_historical_ssp126"
46.29166646,-62.62500314,1/1/1950 12:00,1.7243347,6.10E-05,6.10E-05,2.5483093,1.7853699,6.10E-05,1.846405,6.10E-05,1.4954529,1.4496765,3.769043
46.29166646,-62.62500314,1/2/1950 12:00,6.10E-05,6.10E-05,6.10E-05,9.24704,6.10E-05,12.741333,6.10E-05,6.424103,0.56463623,6.10E-05,1.1139832
46.29166646,-62.62500314,1/3/1950 12:00,6.10E-05,6.10E-05,6.10E-05,6.10E-05,6.10E-05,1.052948,6.10E-05,1.1445007,6.10E-05,6.10E-05,6.10E-05
46.29166646,-62.62500314,1/4/1950 12:00,7.965271,6.10E-05,6.10E-05,6.5919495,1.9684753,6.10E-05,6.10E-05,1.4191589,6.10E-05,0.70196533,3.9368896',header = TRUE)
I would like my final output data frame to be arranged like this:
lat | lon | Value
where Value = the maximum value for each year.
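To make the target concrete, here is a minimal base R sketch of one possible reading of the goal: take the largest value across all "historical" columns on each day, then keep the yearly maximum per lat/lon pair. The toy data frame, the `Year` helper column, and the names `hist_cols` and `yearly_max` are assumptions for illustration only, and the month/day/year date format is inferred from the sample rows.

```r
# Toy stand-in for the real file (assumed values): two grid cells, two model columns.
df <- data.frame(
  lat  = c(46.29, 46.29, 46.29, 47.00, 47.00),
  lon  = c(-62.63, -62.63, -62.63, -63.00, -63.00),
  Date = c("1/1/1950 12:00", "1/2/1950 12:00", "1/1/1951 12:00",
           "1/1/1950 12:00", "1/2/1950 12:00"),
  pr_CMCC.ESM2_historical        = c(1.72, 6.1e-05, 7.97, 0.5, 2.0),
  pr_GFDL.ESM4_historical_ssp126 = c(6.1e-05, 3.0, 1.0, 4.0, 0.1)
)

# Extract the year; the date format is assumed to be month/day/year,
# and the trailing time of day is ignored by as.Date().
df$Year <- as.integer(format(as.Date(df$Date, format = "%m/%d/%Y"), "%Y"))

# Select every column whose name contains "historical".
hist_cols <- grep("historical", names(df), value = TRUE)

# Daily maximum across the selected model columns (one reading of "Value").
df$Value <- do.call(pmax, c(df[hist_cols], na.rm = TRUE))

# Yearly maximum for each unique lat/lon pair.
yearly_max <- aggregate(Value ~ lat + lon + Year, data = df, FUN = max)
print(yearly_max)
```

If a separate maximum per model is wanted instead, `aggregate(df[hist_cols], by = df[c("lat", "lon", "Year")], FUN = max)` would keep one column per model rather than collapsing them into a single Value.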