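# Imports assumed for this snippet (TensorFlow 2.x Keras API and scikit-learn)
from sklearn.model_selection import train_test_split
from tensorflow.keras import backend as K
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.layers import Dense, Dropout, Embedding, GlobalMaxPool1D
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.preprocessing import sequence
from tensorflow.keras.preprocessing.text import Tokenizer

X_train = df_train["Base_Reviews"].values
X_test  = df_test["Base_Reviews"].values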

y_train = df_train['category'].values
y_test  = df_test['category'].values

num_words = 20000     # vocabulary size kept by the tokenizer (most frequent words)
max_features = 15000  # max. number of unique words in the embedding layer
max_len = 200         # max. number of tokens per review (longer reviews are truncated, shorter ones padded)
embedding_dims = 128  # embedding vector output dimension
num_epochs = 5        # number of epochs (full passes over the training dataset)
val_split = 0.2
batch_size2 = 256

tokenizer = Tokenizer(num_words=num_words, lower=False)
tokenizer.fit_on_texts(list(X_train))


X_train = tokenizer.texts_to_sequences(X_train)
X_test = tokenizer.texts_to_sequences(X_test)
X_train = sequence.pad_sequences(X_train, max_len)
X_test  = sequence.pad_sequences(X_test,  max_len)
print('X_train shape:', X_train.shape)
print('X_test shape: ', X_test.shape)

This is the shape of our dataset: X_train shape: (11419, 200), X_test shape: (893, 200)

X_tra, X_val, y_tra, y_val = train_test_split(X_train, y_train, train_size =0.8, random_state=233)
early = EarlyStopping(monitor="val_loss", mode="min", patience=4)

nn_model = Sequential([
    # input_dim must cover every index the tokenizer can emit (num_words), which max_features does not
    Embedding(input_dim=num_words, input_length=max_len, output_dim=embedding_dims),
    GlobalMaxPool1D(),
    Dense(50, activation = 'relu'),
    Dropout(0.2),
    Dense(5, activation = 'softmax')
])

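def mean_pred(y_true, y_pred):
    return K.mean(y_pred)

# fmeasure, precision, auroc and recall are custom metric functions defined elsewhere (not shown here)
nn_model.compile(loss="categorical_crossentropy", optimizer=Adam(0.01), metrics=['accuracy', mean_pred, fmeasure, precision, auroc, recall])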

When I feed the data into the NN model with nn_model.fit, I get the error below. How can I resolve it? This is the error:

ValueError                                Traceback (most recent call last)
<ipython-input-51-a3721a91aa0b> in <module>
----> 1 nn_model_fit = nn_model.fit(X_tra, y_tra, batch_size=batch_size2, epochs=num_epochs, validation_data=(X_val, y_val), callbacks=[early])

~\anaconda3\lib\site-packages\keras\utils\traceback_utils.py in error_handler(*args, **kwargs)
     65     except Exception as e:  # pylint: disable=broad-except
     66       filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67       raise e.with_traceback(filtered_tb) from None
     68     finally:
     69       del filtered_tb

~\anaconda3\lib\site-packages\tensorflow\python\framework\func_graph.py in autograph_handler(*args, **kwargs)
   1145           except Exception as e:  # pylint:disable=broad-except
   1146             if hasattr(e, "ag_error_metadata"):
-> 1147               raise e.ag_error_metadata.to_exception(e)
   1148             else:
   1149               raise

ValueError: in user code:
**ValueError: Shapes (None, 1) and (None, 5) are incompatible**

Recommended answer

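The model ends in Dense(5, activation='softmax') and is compiled with categorical_crossentropy, which expects one-hot targets of shape (None, 5). Your y_train holds a single label per row, so its shape is (None, 1), hence the error. Once the labels are integers, either one-hot encode them or switch the loss to sparse_categorical_crossentropy, which accepts integer labels directly. First, the labels must be mapped to integer values: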

import numpy as np

labels_index = dict(zip(["issue", "supporting", "decision", "neutral", "attacking"], np.arange(5)))

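# Convert to NumPy arrays so Keras receives arrays rather than Python lists
y_train = np.array([labels_index[y] for y in y_train])
y_test = np.array([labels_index[y] for y in y_test])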
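
As a rough illustration of the remaining step, the sketch below turns integer class ids into one-hot rows whose second dimension matches the 5-unit softmax output. It uses only NumPy, and sample_labels is made-up data for demonstration, not taken from the question's dataset.

```python
import numpy as np

# Label names taken from the mapping above; the sample labels are hypothetical.
labels_index = dict(zip(["issue", "supporting", "decision", "neutral", "attacking"], np.arange(5)))
sample_labels = np.array(["neutral", "issue", "attacking", "decision"])  # made-up data

# Step 1: string labels -> integer class ids, shape (4,)
y_int = np.array([labels_index[y] for y in sample_labels])

# Step 2: integer ids -> one-hot rows, shape (4, 5), matching Dense(5, activation='softmax')
num_classes = len(labels_index)
y_onehot = np.eye(num_classes, dtype="float32")[y_int]

print(y_int.shape, y_onehot.shape)
```

With Keras available, tensorflow.keras.utils.to_categorical(y_int, num_classes=5) should give the same matrix. Alternatively, keep the integer labels and compile the model with loss="sparse_categorical_crossentropy"; remember that y_val needs the same treatment as y_tra.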
