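I am trying to use the np.histogram function to sample one variable (SST) as a function of another (TCWV), by setting the weights to the sampled variable, like this: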
# average sst over bins
num, _ = np.histogram(tcwv, bins=bins)
sstsum, _ = np.histogram(tcwv, bins=bins,weights=sst)
out=np.zeros_like(sstsum)
out[:]=np.nan
sstav = np.divide(sstsum,num,out=out, where=num>100)
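The full code to reproduce this is given below. My problem is that when I draw a scatter plot of the raw data and then overplot the calculated bin averages, the averages lie well outside the data "cloud" (see the points on the right):
I can't think of any reason why this would happen, unless it is perhaps a round-off error?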
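One way to test the round-off idea without the real file is sketched below. It uses synthetic float32 data (the sample size, value ranges, and the names `sum32`/`sum64` are assumptions for illustration, not the WRF fields) and compares the weighted sums from np.histogram when the weights are left as float32 against the same call with the weights promoted to float64. If the two sets of averages differ noticeably, accumulation precision is a plausible suspect.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for the real fields (assumed values, not the WRF data):
# TCWV spread over 40-80 and SST clustered near 301 K, both stored as float32.
n = 2_000_000
tcwv = rng.uniform(40.0, 80.0, n).astype(np.float32)
sst = rng.normal(301.0, 0.3, n).astype(np.float32)

bins = np.linspace(40, 80, 21)
num, _ = np.histogram(tcwv, bins=bins)

# Weighted sums with the weights left as float32, then promoted to float64.
sum32, _ = np.histogram(tcwv, bins=bins, weights=sst)
sum64, _ = np.histogram(tcwv, bins=bins, weights=sst.astype(np.float64))

av32 = sum32 / num
av64 = sum64 / num

print("sum dtype with float32 weights:", sum32.dtype)
print("sum dtype with float64 weights:", sum64.dtype)
print("largest difference between the two averages:", np.abs(av32 - av64).max())
```

In the real script the equivalent experiment would be to pass `weights=sst.astype(np.float64)` and see whether the red points move back inside the cloud.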
This is my full code:
import numpy as np
import matplotlib.pyplot as plt
from netCDF4 import Dataset
# if you have recent netCDF libraries you can access the file directly here
url = ('http://clima-dods.ictp.it/Users/tompkins/CRM/data/WRF_1min_mem3_grid4.nc#mode=bytes')
ds=Dataset(url)
### otherwise you need to download the file and use this:
###ifile="WRF_1min_mem3_grid4.nc"
###ds=Dataset(idir+ifile)
# axis bins
bins=np.linspace(40,80,21)
iran1,iran2=40,60
# can put in dict and loop
sst=ds.variables["sst"][iran1:iran2+1,:,:]
tcwv=ds.variables["tcwv"][iran1:iran2+1,:,:]
# don't need to flatten, just tried it to see if helps (it doesn't)
sst=sst.flatten()
tcwv=tcwv.flatten()
# average sst over bins
num, _ = np.histogram(tcwv, bins=bins)
sstsum, _ = np.histogram(tcwv, bins=bins,weights=sst)
out=np.zeros_like(sstsum)
out[:]=np.nan
sstav = np.divide(sstsum,num,out=out,where=num>100)
# bins centroids
avbins=(np.array(bins[1:])+np.array(bins[:-1]))/2
#plot
subsam=2
fig,(ax)=plt.subplots()
plt.scatter(tcwv.flatten()[::subsam],sst.flatten()[::subsam],s=0.05,marker=".")
plt.scatter(avbins,sstav,s=3,color="red")
plt.ylim(299,303)
plt.savefig("scatter.png")
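As an independent cross-check of the weighted-histogram averages, the same binned mean can be computed with np.digitize and np.bincount, accumulating in float64. This is only a sketch: `binned_mean` and its `min_count` argument are hypothetical names introduced here, and it assumes plain (unmasked) arrays. Unlike np.histogram, np.digitize does not count values exactly equal to the last bin edge.

```python
import numpy as np

def binned_mean(x, y, bins, min_count=100):
    """Mean of y in each x-bin; NaN where a bin has min_count samples or fewer."""
    x = np.asarray(x, dtype=np.float64).ravel()
    y = np.asarray(y, dtype=np.float64).ravel()
    nbins = len(bins) - 1
    # np.digitize returns 1..nbins for values inside [bins[0], bins[-1]).
    idx = np.digitize(x, bins) - 1
    inside = (idx >= 0) & (idx < nbins)
    count = np.bincount(idx[inside], minlength=nbins)
    total = np.bincount(idx[inside], weights=y[inside], minlength=nbins)
    mean = np.full(nbins, np.nan)
    # Same threshold logic as the script: only divide where count > min_count.
    np.divide(total, count, out=mean, where=count > min_count)
    return mean

# Tiny usage example with made-up numbers.
demo = binned_mean([0.5, 0.5, 1.5], [1.0, 3.0, 10.0], bins=[0, 1, 2], min_count=0)
print(demo)
```

With the real data this would be called as `binned_mean(tcwv, sst, bins)` and overplotted against `sstav` to see whether the two agree.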