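A confusion matrix is the simplest way to measure the performance of a classification problem whose output can be two or more classes. It is simply a table with two dimensions, "Actual" and "Predicted". For a binary classifier, crossing the two dimensions gives four cells: True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN).
The terms associated with the confusion matrix are explained below: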
True Positives (TP): the actual class and the predicted class of the data point are both 1.
True Negatives (TN): the actual class and the predicted class of the data point are both 0.
False Positives (FP): the actual class of the data point is 0, but the predicted class is 1.
False Negatives (FN): the actual class of the data point is 1, but the predicted class is 0.
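The four definitions above can be checked by counting outcomes directly. The following is a minimal sketch in plain Python; the helper name `count_outcomes` and the toy label lists are made up for illustration and do not come from the classifier discussed in this tutorial.

```python
def count_outcomes(actual, predicted):
    """Count TP, TN, FP, FN for binary labels, treating 1 as the positive class."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1          # actual 1, predicted 1
        elif a == 0 and p == 0:
            tn += 1          # actual 0, predicted 0
        elif a == 0 and p == 1:
            fp += 1          # actual 0, predicted 1
        else:
            fn += 1          # actual 1, predicted 0
    return tp, tn, fp, fn

# Hypothetical toy labels, for illustration only.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]
tp, tn, fp, fn = count_outcomes(actual, predicted)
print(tp, tn, fp, fn)
```

Every data point falls into exactly one of the four cells, so the four counts always add up to the number of data points.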
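We can compute the confusion matrix with the confusion_matrix() function from sklearn. The following script finds the confusion matrix of the binary classifier built above: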
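from sklearn.metrics import confusion_matrix
print(confusion_matrix(y_true, y_pred))  # y_true, y_pred: assumed names for the actual and predicted test labels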
Output
[[ 73   7]
 [  4 144]]
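When reading a printed matrix, it helps to know which cell is which. scikit-learn documents entry (i, j) of `confusion_matrix` as the number of samples whose actual class is i and whose predicted class is j, with classes in sorted order. The sketch below rebuilds that layout in plain Python on made-up labels; `build_confusion_matrix` is a hypothetical helper written for this example, not a scikit-learn function.

```python
def build_confusion_matrix(actual, predicted, labels=(0, 1)):
    """Return a nested list where row = actual class and column = predicted class."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix

# Hypothetical toy labels, for illustration only.
actual = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
predicted = [0, 0, 0, 1, 1, 1, 1, 1, 0, 0]
matrix = build_confusion_matrix(actual, predicted)
for row in matrix:
    print(row)
```

With labels 0 and 1 and class 1 treated as positive, this layout reads as [[TN, FP], [FN, TP]]. Which cell counts as TP therefore depends on which class is treated as positive, so it is worth confirming that before plugging cell values into the formulas.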
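Accuracy may be defined as the fraction of all predictions that our ML model got right. We can easily calculate it from the confusion matrix with the following formula:
$$Accuracy = \frac{TP+TN}{TP+FP+FN+TN}$$
For the binary classifier built above, TP + TN = 73 + 144 = 217 and TP + FP + FN + TN = 73 + 7 + 4 + 144 = 228.
Hence, Accuracy = 217/228 = 0.951754385965, which is the same value we calculated after creating our binary classifier.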
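The accuracy arithmetic can be reproduced in a few lines. This is a minimal sketch that takes the four cell counts as read in the worked example; the function name `accuracy` is chosen for illustration.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that were correct."""
    return (tp + tn) / (tp + tn + fp + fn)

# Cell values as read in the worked example above.
acc = accuracy(tp=73, tn=144, fp=7, fn=4)
print(round(acc, 6))
```

Because accuracy only uses the sum TP + TN, it comes out the same whichever of the two classes is labeled positive.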
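Precision, used in document retrieval, may be defined as the fraction of the documents returned by our ML model that are actually correct, that is, the share of predicted positives that are true positives. We can easily calculate it from the confusion matrix with the following formula:
$$Precision = \frac{TP}{TP+FP}$$
For the binary classifier built above, TP = 73 and TP + FP = 73 + 7 = 80.
Hence, Precision = 73/80 = 0.9125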
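As a quick check of the precision arithmetic, here is a small sketch using the same cell values. The function name `precision` and the choice to return 0.0 when the model predicts no positives at all are assumptions made for this example.

```python
def precision(tp, fp):
    """Fraction of predicted positives that are truly positive."""
    predicted_positive = tp + fp
    if predicted_positive == 0:
        return 0.0  # assumption: report 0.0 when nothing was predicted positive
    return tp / predicted_positive

# Cell values as read in the worked example above.
print(precision(tp=73, fp=7))
```

The explicit guard avoids a division by zero for a model that never predicts the positive class.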
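Recall may be defined as the fraction of actual positives that our ML model returns as positive. We can easily calculate it from the confusion matrix with the following formula:
$$Recall = \frac{TP}{TP+FN}$$
For the binary classifier built above, TP = 73 and TP + FN = 73 + 4 = 77.
Hence, Recall = 73/77 = 0.94805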
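Recall can also be computed straight from the labels, which makes its meaning concrete: look only at the data points that are actually positive and ask how many of them the model found. The sketch below uses made-up labels; `recall_from_labels` is a hypothetical helper, and returning 0.0 when there are no actual positives is an assumption.

```python
def recall_from_labels(actual, predicted, positive=1):
    """Fraction of actual positives that the model also predicted as positive."""
    preds_for_positives = [p for a, p in zip(actual, predicted) if a == positive]
    if not preds_for_positives:
        return 0.0  # assumption: report 0.0 when there are no actual positives
    hits = sum(1 for p in preds_for_positives if p == positive)
    return hits / len(preds_for_positives)

# Hypothetical toy labels, for illustration only.
actual = [1, 1, 1, 1, 0, 0]
predicted = [1, 1, 0, 1, 0, 1]
print(recall_from_labels(actual, predicted))
```

Here three of the four actual positives are found, so the false positive on the last data point does not affect recall at all; it would only lower precision.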
Specificity, in contrast to recall, may be defined as the fraction of actual negatives that our ML model correctly returns as negative. We can easily calculate it from the confusion matrix with the following formula:
$$Specificity = \frac{TN}{TN+FP}$$
For the binary classifier built above, TN = 144 and TN + FP = 144 + 7 = 151.
Hence, Specificity = 144/151 = 0.95364
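The specificity arithmetic can be sketched the same way. The snippet also computes the false positive rate, FP / (TN + FP), which is the complement of specificity by definition; the function name and the zero-denominator handling are assumptions made for this example.

```python
def specificity(tn, fp):
    """Fraction of actual negatives that were predicted negative."""
    actual_negative = tn + fp
    if actual_negative == 0:
        return 0.0  # assumption: report 0.0 when there are no actual negatives
    return tn / actual_negative

# Cell values as read in the worked example above.
spec = specificity(tn=144, fp=7)
false_positive_rate = 7 / (144 + 7)
print(round(spec, 5), round(false_positive_rate, 5))
```

Specificity and the false positive rate always sum to 1, since both share the denominator TN + FP.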