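import heapq

import matplotlib.pyplot as plt
from astropy.io import fits

# Open the FITS catalogue and take the table stored in extension 1
datos = fits.open('/home/citlali/Documentos/Servicio/Lista.fits')
data = datos[1].data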


# [SIII] 9532 line: keep rows with signal-to-noise > 5
Mask_1 = data['flux_[SIII]9531.1_Re_fit'] / data['e_flux_[SIII]9531.1_Re_fit'] > 5
newdata1 = data[Mask_1]
dat_flux = newdata1['flux_[SIII]9069.0_Re_fit']
dat_eflux = newdata1['e_flux_[SIII]9069.0_Re_fit']
Mask_2 = dat_flux / dat_eflux > 5
newdata2 = newdata1[Mask_2]


H1_alpha = newdata1['log_NII_Ha_Re']
H1_beta = newdata1['log_OIII_Hb_Re']
H2_alpha = newdata2['log_NII_Ha_Re']
H2_beta = newdata2['log_OIII_Hb_Re']


M = H1_alpha < -0.9
newx = H1_alpha[M]
newy = H1_beta[M]  
ex = newx 
ey = newy 
#print("Number of SIII [9532] elements: ", len(newx))
m = H2_alpha < -0.9
newxm = H2_alpha[m]
newym = H2_beta[m] 
#print("Number of SIII [9069] elements: ", len(newxm))

sm = heapq.nsmallest(3000, zip(newx, newy)) # zip them to sort together
newx, newy = zip(*sm) # unzip them

plt.figure()
plt.plot(H1_alpha, H1_beta, '*', color='darkred', markersize=7, label="SIII [9532]")
plt.plot(H2_alpha, H2_beta, '.', color='rosybrown', markersize=3, label="SIII [9069]")
plt.xlim(-1.5, 0.75)
plt.ylim(-1, 1) 
plt.title('Diagnostic diagram')
plt.ylabel('log([OIII]/Hbeta)')
plt.xlabel('log([NII]/Halpha)')
plt.grid()
plt.legend()
fig = plt.gcf()
fig.set_size_inches(8, 6)
plt.show()

The figure shows my plot; the black line is what I want to obtain.

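The code reads the downloaded data and plots the galaxies with a signal-to-noise ratio greater than 5. The line must be drawn from the H1_alpha, H1_beta and/or H2_alpha, H2_beta data.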

Recommended answer

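The example below uses a test dataset to show how to derive sample weights from the point density and then pass those weights to a curve-fitting algorithm.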

The example dataset is a roughly quadratic curve (a dense band of points) embedded in noise.

We can model the density with scikit-learn's GaussianMixture. It gives each sample a score, where a higher score corresponds to a more probable (i.e. denser) region.

Use SciPy's curve_fit() with the previously computed sample weights to fit a quadratic curve (SciPy interprets them as error bars, so pass 1/sample_weights).
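The relationship between weights and error bars can be checked on a small synthetic example. The sketch below is illustrative only: the data, the random weights, and the helper name `quadratic` are assumptions, not part of the answer. It compares curve_fit() with sigma = 1/weights against np.polyfit(), whose `w` argument takes the weights directly; both minimise the same weighted sum of squares, so the coefficients should agree.

```python
import numpy as np
from scipy.optimize import curve_fit

# Synthetic data (assumption): a noisy quadratic with arbitrary per-point weights
rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, 200)
y = 0.5 * x**2 - x + 1 + rng.normal(0, 0.3, 200)
weights = rng.uniform(0.1, 1.0, 200)  # higher weight = more trusted point


def quadratic(x, A, B, C):
    """Quadratic model A*x^2 + B*x + C."""
    return A * x**2 + B * x + C


# curve_fit minimises sum(((y - f(x)) / sigma)**2), so sigma = 1 / weights
coef_curve_fit, _ = curve_fit(quadratic, x, y, sigma=1 / weights)

# np.polyfit minimises sum((w * (y - p(x)))**2), so w = weights
coef_polyfit = np.polyfit(x, y, deg=2, w=weights)

print(coef_curve_fit)
print(coef_polyfit)
```

Both calls return the coefficients in the order A, B, C (highest power first).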

Alternatively, use scikit-learn's SVR, which takes sample weights directly (the results are almost identical).

Overlaying the individual steps:

import numpy as np
import matplotlib.pyplot as plt

#Make some test data - a quadratic curve embedded in noise
from sklearn.datasets import make_moons
X1, y1 = make_moons(500, noise=0.7, random_state=0)
X2, y2 = make_moons(500, noise=0.4, random_state=1)
X3, y3 = make_moons(600, noise=0.08, random_state=2)
X = np.concatenate([X1, X2, X3[y3==0]], axis=0)

#View the data
plt.scatter(X[:, 0], X[:, 1], marker='.', s=10)
plt.gca().set(xlabel='x0', ylabel='x1')
plt.gcf().set_size_inches(5, 3)

#Model the density
from sklearn.mixture import GaussianMixture
gmm = GaussianMixture(n_components=20, random_state=0).fit(X)

xx, yy = np.meshgrid(np.linspace(-2, 3), np.linspace(-3, 3))
log_scores = gmm.score_samples(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
scores = np.exp(log_scores)

plt.contour(xx, yy, scores, levels=8, cmap='plasma_r', alpha=1)

#Get the density score for each datapoint
data_scores = np.exp(gmm.score_samples(X))
# can scale scores to max value = 1
data_scores /= data_scores.max()

#Fit quadratic curve using scipy's curve_fit
# supply 1/data_scores as the sample errors
from scipy.optimize import curve_fit

#Fit a quadratic curve
(A, B, C), _ = curve_fit(lambda x, A, B, C: A*x**2 + B*x + C, X[:, 0], X[:, 1], sigma=1 / data_scores)

x_axis = X[:, 0][np.argsort(X[:, 0])].reshape(-1, 1)

plt.plot(
    x_axis,
    A * x_axis**2 + B * x_axis + C,
    color='tab:brown', lw=10, alpha=0.3, label='curve_fit()'
)

#Alternatively, using support vector regressor from sklearn
# Takes data_scores directly as relative sample weights
from sklearn.svm import SVR

#Quadratic curve, as before
svr = SVR(kernel='poly', degree=2).fit(X[:, [0]], X[:, 1], sample_weight=data_scores)
plt.plot(
    x_axis,
    svr.predict(x_axis),
    color='tab:purple', lw=10, alpha=0.35, label='SVR'
)

plt.gcf().legend()
plt.show()
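To apply the same idea to the arrays from the question, the steps can be wrapped in one function. This is a minimal sketch, not the answer's own code: the helper names `quadratic` and `fit_density_weighted_quadratic`, the NaN/inf filtering, the small floor on the weights, and the synthetic stand-in arrays are all assumptions added for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit
from sklearn.mixture import GaussianMixture


def quadratic(x, A, B, C):
    """Quadratic model A*x^2 + B*x + C."""
    return A * x**2 + B * x + C


def fit_density_weighted_quadratic(x, y, n_components=20, random_state=0):
    """Hypothetical helper: fit a quadratic to (x, y), weighting points by local density."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Assumption: drop NaN/inf entries, which the fitting routines cannot handle
    ok = np.isfinite(x) & np.isfinite(y)
    x, y = x[ok], y[ok]

    # Density score for each point, scaled so the densest point has weight 1
    pts = np.column_stack([x, y])
    gmm = GaussianMixture(n_components=n_components, random_state=random_state).fit(pts)
    weights = np.exp(gmm.score_samples(pts))
    weights /= weights.max()

    # Assumption: floor the weights so 1 / weights never divides by zero
    weights = np.clip(weights, 1e-12, None)

    # curve_fit treats sigma as an error bar, so pass the inverse weights
    popt, _ = curve_fit(quadratic, x, y, sigma=1 / weights)
    return popt


# Synthetic stand-in for the catalogue columns, with a few missing values
rng = np.random.default_rng(0)
x_demo = rng.uniform(-1.5, 0.5, 500)
y_demo = -0.8 * x_demo**2 - 1.0 * x_demo + 0.1 + rng.normal(0, 0.05, 500)
y_demo[:5] = np.nan

A, B, C = fit_density_weighted_quadratic(x_demo, y_demo, n_components=5)
print(A, B, C)
```

With the question's data, the call would presumably be fit_density_weighted_quadratic(H1_alpha, H1_beta), and the curve could then be drawn with plt.plot over a sorted x-axis as in the answer's code. Whether a quadratic is the right shape for the black line in the question is not something this sketch settles.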
