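I am trying to do binary classification with a multilayer perceptron on my own dataset, but I always get the same accuracy when I change the number of epochs and the learning rate.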

My Multilayer Perceptron class

class MyMLP(nn.Module):
    def __init__(self, num_input_features, num_hidden_neuron1, num_hidden_neuron2, num_output_neurons):
        super(MyMLP, self).__init__()
        self.hidden_layer1 = nn.Linear(num_input_features, num_hidden_neuron1)
        self.hidden_layer2 = nn.Linear(num_hidden_neuron1, num_hidden_neuron2)
        self.output_layer = nn.Linear(num_hidden_neuron2, num_output_neurons)
        self.relu = nn.ReLU()
        self.sigmoid = nn.Sigmoid()

    def forward(self, X):
        X = torch.tensor(X, dtype=torch.float)
        hidden_res1 = self.relu(self.hidden_layer1(X))
        hidden_res2 = self.relu(self.hidden_layer2(hidden_res1))
        output = self.sigmoid(self.output_layer(hidden_res2))
        return output

My Dataset class

class PrincessDataset(Dataset):
    def __init__(self,dataName):
        #dataloading
        xy = np.loadtxt(dataName, delimiter=',', dtype=np.float32, skiprows=1)
        self.x = torch.from_numpy(xy[0:, :-1])
        self.y = torch.from_numpy(xy[:,-1])
        self.n_samples = xy.shape[0]
    def __getitem__(self, index):
        return self.x[index] , self.y[index]
    def __len__(self):
        return self.n_samples

My Code

batch_size = 16
num_workers = 2
test_data = PrincessDataset('cure_the_princess_test.csv')
train_data = PrincessDataset('cure_the_princess_train.csv')
validation_data = PrincessDataset('cure_the_princess_validation.csv')

train_loader = torch.utils.data.DataLoader(train_data, batch_size=batch_size, shuffle=True, num_workers=num_workers)
test_loader = torch.utils.data.DataLoader(test_data, batch_size=batch_size, shuffle=False, num_workers=num_workers)
validation_loader = torch.utils.data.DataLoader(validation_data, batch_size=batch_size, shuffle=False, num_workers=num_workers)

# func parameters num_input_features, num_hidden_neuron1, num_hidden_neuron2, num_output_neurons

num_input_features = 13
num_hidden_neuron1 = 100
num_hidden_neuron2 = 50
num_output_neuron = 1 #binary classification
####
num_epochs = 200
learning_rate = 0.001
patience = 5
patience_counter = 0
###
model = MyMLP(num_input_features,num_hidden_neuron1, num_hidden_neuron2,num_output_neuron)

criterion = nn.BCELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
list_train_loss, list_val_loss = [], []
best_val_loss = None

for epoch in range(num_epochs):
    train_loss = 0.0
    train_count = 0.0
    for inputs, labels in train_loader:
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs.squeeze(), labels)
        loss.backward()
        optimizer.step()
        train_count += 1.0
        train_loss += loss.item()

    validation_loss = 0.0
    with torch.no_grad():
        model.eval()
        for inputs, labels in validation_loader:
            outputs = model(inputs)
            loss = criterion(outputs.squeeze(), labels)
            validation_loss += loss.item()

    model.train()
    train_loss /= train_count
    validation_loss /= len(validation_loader)
    print("Epoch", epoch, "Training loss", train_loss,"Validation Loss :",validation_loss)

    list_train_loss.append(train_loss)
    list_val_loss.append(validation_loss)
    
    val_score = validation_loss
    if best_val_loss is None:
        best_val_loss = val_score # start tracking the best loss for early stopping
        torch.save(model.state_dict(), "bestval.pt")
    elif best_val_loss < val_score: # patience counter
        patience_counter += 1
        print("Earlystopping Patience Counter:",patience_counter)
        if patience_counter == patience:
            break
    else:
        best_val_loss = val_score
        torch.save(model.state_dict(), "bestval.pt") # to keep the best model
        patience_counter = 0
                   

sns.set_style("darkgrid")
plt.plot(list_train_loss, label="Training loss")
plt.plot(list_val_loss, label="Validation loss")
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.legend()
plt.show()

Accuracy calculation

model = MyMLP(num_input_features,num_hidden_neuron1, num_hidden_neuron2,num_output_neuron)
model.load_state_dict(torch.load('bestval.pt'))
model.eval()
predicts =[]
real_labels = list()

n_correct = 0
n_samples = 0
with torch.no_grad():
    for inputs, labels in test_loader:
        outputs = model(inputs)
        _,predict = torch.max(outputs.data,1)
        n_samples += labels.size(0)
        predicts.extend(predict.tolist())
        real_labels.extend(labels.tolist())

from sklearn.metrics import f1_score,accuracy_score,classification_report
print("Accuracy score of this model: {}".format(accuracy_score(real_labels,predicts)))
print(classification_report(real_labels,predicts))

Accuracy Result :

Accuracy score of this model: 0.49740932642487046
              precision    recall  f1-score   support

         0.0       0.50      1.00      0.66       384
         1.0       0.00      0.00      0.00       388

    accuracy                           0.50       772
   macro avg       0.25      0.50      0.33       772
weighted avg       0.25      0.50      0.33       772

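I get the same accuracy score when I change the number of epochs or the learning rate. I have been trying to solve this for three days. Can you help me?

My CSV file looks like this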

Phoenix Feather,Unicorn Horn,Dragon's Blood,Mermaid Tears,Fairy Dust,Goblin Toes,Witch's Brew,Griffin Claw,Troll Hair,Kraken Ink,Minotaur Horn,Basilisk Scale,Chimera Fang,Cured
10.0,15.3,27.1,13.3,18.1,12.3,4.8,24.0,10.0,17.5,5.9,27.6,8.6,0
31.6,1.9,25.2,17.9,16.4,2.4,4.2,6.4,32.5,21.9,19.7,12.4,17.4,1
22.4,9.2,23.7,14.9,18.2,10.5,6.8,15.3,21.0,16.8,31.6,19.4,11.6,0
24.5,2.3,2.2,26.2,7.3,2.8,20.6,7.8,23.0,17.0,2.7,7.6,26.0,1
3.2,20.2,12.9,13.3,7.7,29.6,2.6,12.9,12.7,13.8,8.9,6.5,9.1,0
15.7,17.5,14.4,12.2,11.9,4.2,1.7,6.4,20.9,12.5,21.1,15.6,12.4,1
.
.
.

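The first row holds the column names, the last column is the class (0 or 1), and the other columns are the input values.

Recommended answer

This is binary classification (your output is one-dimensional), so you should not use torch.max: with a single output column, the index it returns is always 0. Instead, compare the output against a threshold, as follows: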

threshold = 0.5
# outputs are sigmoid probabilities; values above the threshold are class 1
preds = (outputs > threshold).to(labels.dtype)
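
To see the difference concretely, here is a minimal sketch on a made-up batch. The `outputs` and `labels` values are invented for illustration and are not from the question's dataset; only the shapes match the model above, which has a single sigmoid output per sample.

```python
import torch

# Hypothetical sigmoid outputs for a batch of 4 samples, shape (4, 1),
# as produced by a model whose last layer has one output neuron.
outputs = torch.tensor([[0.9], [0.2], [0.7], [0.4]])
labels = torch.tensor([1.0, 0.0, 1.0, 1.0])

# torch.max over dim=1 returns (values, indices). With only one column,
# the index of the maximum is 0 for every row, so every sample is
# "predicted" as class 0.
_, argmax_preds = torch.max(outputs, 1)

# Thresholding the probability gives the actual class prediction.
threshold = 0.5
preds = (outputs.squeeze(1) > threshold).to(labels.dtype)

accuracy = (preds == labels).float().mean().item()
print(argmax_preds.tolist())  # [0, 0, 0, 0]
print(preds.tolist())         # [1.0, 0.0, 1.0, 0.0]
print(accuracy)               # 0.75
```

The `squeeze(1)` call flattens the predictions to shape (4,), so `preds.tolist()` yields a flat list that can be passed to `accuracy_score` alongside the labels.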
