While implementing the SGD algorithm for my ML assignment, I got an unexpected output.
My training data normally has 320 rows.
My dataset: https://github.com/Jangrae/csv/blob/master/carseats.csv
I first did some data preprocessing:
import pandas as pd
from sklearn.preprocessing import StandardScaler
import numpy as np
train_data = pd.read_csv('carseats_train.csv')
train_data.replace({'Yes': 1, 'No': 0}, inplace=True)
onehot_tr = pd.get_dummies(train_data['ShelveLoc'], dtype=int, prefix_sep='_', prefix='ShelveLoc')
train_data = train_data.drop('ShelveLoc', axis=1)
train_data = train_data.join(onehot_tr)
train_data_Y = train_data.iloc[:, 0]
train_data_X = train_data.drop('Sales', axis=1)
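A side note on the preprocessing above: `StandardScaler` is imported but never applied, so columns on very different scales reach the SGD loop unchanged. The sketch below shows what standardization does, using plain NumPy on a small made-up array (the values and the name `demo_X` are hypothetical, not taken from `carseats_train.csv`). With default settings, `StandardScaler().fit_transform` should give the same result: each column ends up with mean 0 and population standard deviation 1.

```python
import numpy as np

# Hypothetical stand-in for three numeric feature columns (values are made up).
demo_X = np.array([
    [138.0, 73.0, 276.0],
    [111.0, 48.0, 260.0],
    [113.0, 35.0, 269.0],
    [117.0, 100.0, 466.0],
])

# Standardize each column: subtract its mean, divide by its population std (ddof=0).
col_mean = demo_X.mean(axis=0)
col_std = demo_X.std(axis=0)
demo_X_scaled = (demo_X - col_mean) / col_std

print(demo_X_scaled.mean(axis=0))  # approximately 0 for every column
print(demo_X_scaled.std(axis=0))   # approximately 1 for every column
```

If the real training frame were scaled this way, the same `col_mean` and `col_std` would have to be reused on any test data.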
Then I implemented the algorithm as follows:
learning_rate = 0.01
epoch_num = 50
initial_w = 0.1
intercept = 0.1
w_matrix = np.ones((12, 1)) * initial_w
for e in range(epoch_num):
    for i in range(len(train_data_X)):
        x_i = train_data_X.iloc[i].to_numpy()
        y_i = train_data_Y.iloc[i]
        y_estimated = np.dot(x_i, w_matrix) + intercept
        grad_w = x_i.reshape(-1, 1) * (y_i - y_estimated)
        grad_intercept = (y_i - y_estimated)
        w_matrix = w_matrix - 2 * learning_rate * grad_w
        intercept = intercept - 2 * learning_rate * grad_intercept
print("Final weights:\n", w_matrix)
print("Final intercept:", intercept)
But the output is:
Final weights:
[[nan]
[nan]
[nan]
[nan]
[nan]
[nan]
[nan]
[nan]
[nan]
[nan]
[nan]
[nan]]
Final intercept: [nan]
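I ran it with different learning rates and also tried a convergence threshold, but I still got the same result. I can't figure out why my code gives me NaN.
Can anyone spot the problem?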
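For reference, here is a minimal sketch of the per-sample update for squared error, which may help narrow down the NaN. For the loss `(y_estimated - y_i)**2`, the gradient with respect to the weights is `2 * (y_estimated - y_i) * x_i`. The loop above uses `(y_i - y_estimated)` and then subtracts it, so each step moves uphill and the error can grow until it overflows. Unscaled features can cause the same blow-up even with the sign corrected. The sketch runs on synthetic, already standardized data, not the carseats file, and names such as `true_w` and `true_b` are made up for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, already-standardized features (an assumption; not the carseats data).
n_samples, n_features = 200, 3
X = rng.standard_normal((n_samples, n_features))
true_w = np.array([1.5, -2.0, 0.5])   # hypothetical ground-truth weights
true_b = 4.0                          # hypothetical ground-truth intercept
y = X @ true_w + true_b + 0.01 * rng.standard_normal(n_samples)

learning_rate = 0.01
epoch_num = 50
w = np.full(n_features, 0.1)
b = 0.1

for _ in range(epoch_num):
    for i in range(n_samples):
        x_i, y_i = X[i], y[i]
        # Error is prediction minus target, so subtracting the gradient descends.
        error = x_i @ w + b - y_i
        w -= 2 * learning_rate * error * x_i
        b -= 2 * learning_rate * error

print("weights:", w)
print("intercept:", b)
```

Under these assumptions the printed weights and intercept should end up close to `true_w` and `true_b`. Whether the same holds on the real data would depend on scaling the features first.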