When l1_ratio = 0, the elastic net reduces to ridge regression. However, I cannot reconcile the results I get from sklearn's RidgeCV and ElasticNetCV: the best alpha values they produce appear to be very different:

import numpy as np
from sklearn.linear_model import ElasticNetCV, RidgeCV
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt

#data generation
np.random.seed(123)
beta = 0.35
N = 120
p = 30

X = np.random.normal(1, 2, (N, p))
y = np.random.normal(5, size=N) + beta * X[:, 0]
#lambdas to try:
l = np.exp(np.linspace(-2, 8, 80))

ridge1 = RidgeCV(alphas = l, store_cv_values=True).fit(X, y)
MSE_cv = np.mean(ridge1.cv_values_, axis =0)#.shape
y_pred = ridge1.predict(X=X)
MSE = mean_squared_error(y_true=y,y_pred=y_pred)

print(f"best alpha: {np.round(ridge1.alpha_,3)}")
print(f"MSE: {np.round(MSE,3)}")

which yields

best alpha: 305.368
MSE: 0.952

While ElasticNetCV ends up with a similar MSE, its penalty parameter seems to live on a different scale (which, in fact, agrees with the R implementation):

ridge2 = ElasticNetCV(cv=10, alphas=l, random_state=0, l1_ratio=0)
ridge2.fit(X, y) 
y_pred = ridge2.predict(X=X)
MSE = mean_squared_error(y_true=y,y_pred=y_pred)

print(f"best alpha: {np.round(ridge2.alpha_,3)}")
print(f"MSE: {np.round(MSE,3)}")

which yields

best alpha: 2.192
MSE: 0.934

Is the penalty defined differently? Is one of them divided by N? Or is this due to the very different cross-validation strategies?

Accepted Answer

Is the penalty defined differently? Is one of them divided by N?

Yes, that is what causes the difference. In the elastic net, the regularization part of the cost function is scaled by the number of samples relative to the error term; this is not the case for RidgeCV. So to make the cost functions equal, we need to divide the ElasticNetCV alphas by the size of the training folds.
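For reference (this derivation is not in the original answer, but follows from the objectives documented for sklearn's estimators): ElasticNet minimizes ||y - Xw||^2 / (2 * n_samples) + alpha * l1_ratio * ||w||_1 + 0.5 * alpha * (1 - l1_ratio) * ||w||^2, while Ridge minimizes ||y - Xw||^2 + alpha * ||w||^2. With l1_ratio = 0, multiplying the ElasticNet objective by 2 * n_samples shows the two match when the ElasticNet alpha equals the Ridge alpha divided by n_samples. A minimal sketch checking this directly on the same data (the near-zero l1_ratio stands in for exactly 0, which the coordinate-descent solver discourages):

import numpy as np
from sklearn.linear_model import Ridge, ElasticNet

# same data-generating process as in the question
np.random.seed(123)
N, p = 120, 30
X = np.random.normal(1, 2, (N, p))
y = np.random.normal(5, size=N) + 0.35 * X[:, 0]

a = 300.0                               # a Ridge-style alpha
ridge = Ridge(alpha=a).fit(X, y)

# ElasticNet divides the squared-error term by (2 * n_samples),
# so the equivalent penalty is alpha / n_samples
enet = ElasticNet(alpha=a / N, l1_ratio=1e-10, max_iter=100_000).fit(X, y)

print(np.max(np.abs(ridge.coef_ - enet.coef_)))  # near zero, up to solver tolerance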

Is this due to the very different cross-validation strategies?

RidgeCV uses an efficient internal leave-one-out (LOO) scheme. We can control the CV scheme of ElasticNetCV by setting cv=LeaveOneOut(), which means both models will use LOO; otherwise ElasticNetCV defaults to 5-fold CV.

When ElasticNetCV uses cv=LeaveOneOut(), the training fold size is n_samples - 1, so that is the factor we need to scale by.
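As a quick arithmetic check against the results below (using only numbers already reported in this post): with N = 120 samples, a LOO training fold contains 119 observations, and dividing the Ridge-style alpha by that fold size reproduces the scaled ElasticNetCV alpha:

# 305.368 is RidgeCV's best alpha; 120 - 1 is the LOO training-fold size
print(round(305.368 / (120 - 1), 3))  # 2.566, matching the scaled ElasticNetCV alpha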

Unlike RidgeCV, ElasticNetCV does not retain the CV score for each alpha. I have added a "manual" version of ElasticNetCV, where I combine ElasticNet with GridSearchCV using LOO; this gives me access to the MSE for each alpha (for comparison against RidgeCV).

After applying the necessary scaling, the results line up:

RidgeCV
  best alpha: 305.368 <---
  MSE: 0.952

ElasticNetCV using LOO
  best alpha | scaled: 2.566 | unscaled: 305.368 <---
  MSE: 0.953

ElasticNet + GridSearchCV using LOO
  best alpha | scaled: 2.566 | unscaled: 305.368 <---
  MSE: 0.953

(Figure: CV MSE vs. alpha for RidgeCV and for ElasticNet(L1≈0) + LOO CV, produced by the plotting code below.)


import numpy as np
from sklearn.linear_model import ElasticNetCV, RidgeCV
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt
np.set_printoptions(suppress=True)

#data generation
np.random.seed(123)
beta = 0.35
N = 120
p = 30

X = np.random.normal(1, 2, (N, p))
y = np.random.normal(5, size=N) + beta * X[:, 0]

#lambdas to try:
alphas = np.exp(np.linspace(-2, 8, 80))
#
# RidgeCV with internal efficient LOO CV
#
ridge1 = RidgeCV(alphas=alphas, store_cv_values=True).fit(X, y)
MSE_cv = np.mean(ridge1.cv_values_, axis=0)#.shape
y_pred = ridge1.predict(X=X)
MSE = mean_squared_error(y_true=y,y_pred=y_pred)

print('RidgeCV')
print("  best alpha:", round(ridge1.alpha_, 3), '<---')
print("  MSE:", round(MSE, 3))

#
# ElasticNetCV with LOO CV
#
from sklearn.model_selection import LeaveOneOut

fold_size = len(X) - 1
alphas_scaled = alphas / fold_size #for equivalent cost func to RidgeCV

ridge2 = ElasticNetCV(
    cv=LeaveOneOut(), alphas = alphas_scaled,
    l1_ratio=1e-10, random_state=0
)
ridge2.fit(X, y)
y_pred = ridge2.predict(X=X)
MSE = mean_squared_error(y_true=y, y_pred=y_pred)

print('\nElasticNetCV using LOO')
print(
    '  best alpha | scaled:', round(ridge2.alpha_, 3),
    '| unscaled:', round(ridge2.alpha_ * fold_size, 3), '<---'
)
print("  MSE:", round(MSE, 3))

#
# "Manual" version of ElasticNetCV
# i.e. ElasticNet + GridSearchCV using LOO
# Used for getting CV score for each alpha
#
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import GridSearchCV

search = GridSearchCV(
    ElasticNet(l1_ratio=1e-10),
    param_grid=dict(alpha=list( alphas / fold_size )),
    scoring='neg_mean_squared_error',
    cv=LeaveOneOut(),
    n_jobs=-1
).fit(X, y)

ridge3 = search.best_estimator_
y_pred = ridge3.predict(X)
MSE = mean_squared_error(y, y_pred)

print('\nElasticNet + GridSearchCV using LOO')
print(
    "  best alpha | scaled:", round(ridge3.alpha, 3),
    "| unscaled:", round(ridge3.alpha * fold_size, 3), '<---'
)
print("  MSE:", round(MSE, 3))

#
# Plot CV scores vs alpha
#
plt.scatter(
    alphas, ridge1.cv_values_.mean(axis=0),
    color='black', marker='s', s=50, label='RidgeCV'
)
plt.scatter(
    alphas, -search.cv_results_['mean_test_score'],
    color='gray', marker='.', label=r'ElasticNet(L1$\approx$0) + LOO CV'
)
plt.legend(fontsize=10)
plt.xlabel(r'$\alpha$')
plt.ylabel('CV MSE')
plt.gcf().set_size_inches(8, 3)

#Extra formatting
[plt.gca().spines[s].set_visible(False) for s in ['right', 'top']]
plt.gca().spines['bottom'].set_bounds(0, 3000)
plt.gca().spines['left'].set_bounds(1.35, 1.5)
