Confusion Matrix in Classification Algorithms

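The confusion matrix is the simplest way to measure the performance of a classification problem whose output can be of two or more classes. It is nothing more than a two-dimensional table with the dimensions "Actual" and "Predicted", whose cells hold the counts of True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN).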

The terms associated with the confusion matrix are explained below:

True Positives (TP) − The actual class and the predicted class of the data point are both 1.
True Negatives (TN) − The actual class and the predicted class of the data point are both 0.
False Positives (FP) − The actual class of the data point is 0, but the predicted class is 1.
False Negatives (FN) − The actual class of the data point is 1, but the predicted class is 0.
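
The four definitions above can be sketched as a small counting routine. This is only an illustration in plain Python: the function name `count_outcomes` and the label lists are made up for the example, and class 1 is assumed to be the positive class.

```python
def count_outcomes(actual, predicted):
    """Count TP, TN, FP, FN for binary labels, with 1 as the positive class."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1          # actual 1, predicted 1
        elif a == 0 and p == 0:
            tn += 1          # actual 0, predicted 0
        elif a == 0 and p == 1:
            fp += 1          # actual 0, predicted 1
        else:
            fn += 1          # actual 1, predicted 0
    return tp, tn, fp, fn

# Made-up labels, for illustration only
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]
tp, tn, fp, fn = count_outcomes(actual, predicted)
print(tp, tn, fp, fn)
```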

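We can compute the confusion matrix with sklearn's confusion_matrix() function. With the following script, we can find the confusion matrix of the binary classifier built above: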

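from sklearn.metrics import confusion_matrix
print(confusion_matrix(y_test, y_pred))  # y_test, y_pred: assumed names for the true and predicted labels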

Output

[[ 73   7]
 [  4 144]]
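
The calculations in the rest of this article read the matrix as [[TP, FP], [FN, TN]]. The sketch below makes that reading explicit, using a plain nested list rather than the array sklearn returns. Note that scikit-learn itself documents entry (i, j) as the number of samples with true label i that were predicted as label j, so with labels 0 and 1 its own layout is [[TN, FP], [FN, TP]]. It is worth checking which class is treated as positive before reusing these numbers.

```python
# The matrix printed above, written as a plain nested list
cm = [[73, 7],
      [4, 144]]

# Layout assumed by this article's calculations:
# [[TP, FP],
#  [FN, TN]]
(tp, fp), (fn, tn) = cm
total = tp + fp + fn + tn
print(tp, fp, fn, tn, total)
```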

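Accuracy

Accuracy may be defined as the proportion of correct predictions made by our ML model. We can easily calculate it from the confusion matrix with the following formula:

$$Accuracy =\frac{TP+TN}{TP+FP+FN+TN}$$

For the binary classifier built above, TP + TN = 73 + 144 = 217 and TP + FP + FN + TN = 73 + 7 + 4 + 144 = 228.

Hence, Accuracy = 217/228 = 0.951754385965, which is the same value we calculated after creating our binary classifier.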
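
As a quick check of the arithmetic, the accuracy formula can be written as a small function. The function name `accuracy` is chosen for this sketch, and the counts are the ones used in the article's example.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that were correct."""
    return (tp + tn) / (tp + tn + fp + fn)

acc = accuracy(tp=73, tn=144, fp=7, fn=4)
print(round(acc, 6))
```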

Link: https://www.learnfk.com/python-machine-learning/machine-learning-with-python-confusion-matrix.html

Source: LearnFk无涯教程网

Precision

Precision, used in document retrieval, may be defined as the fraction of the documents returned by our ML model that are actually correct. We can easily calculate it from the confusion matrix with the following formula:

$$Precision =\frac{TP}{TP+FP}$$

For the binary classifier built above, TP = 73 and TP + FP = 73 + 7 = 80.

Hence, Precision = 73/80 = 0.9125
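
The precision formula can be sketched the same way. One detail the formula leaves open is what to do when the model predicts no positives at all (TP + FP = 0). Returning 0.0 in that case is a convention chosen for this sketch, not something the article specifies.

```python
def precision(tp, fp):
    """TP / (TP + FP); returns 0.0 when the model predicted no positives."""
    predicted_positives = tp + fp
    if predicted_positives == 0:
        return 0.0  # convention assumed for this sketch
    return tp / predicted_positives

prec = precision(tp=73, fp=7)
print(prec)
```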

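Recall or Sensitivity

Recall may be defined as the fraction of the actual positives that our ML model correctly returns. We can easily calculate it from the confusion matrix with the following formula:

$$Recall =\frac{TP}{TP+FN}$$

For the binary classifier built above, TP = 73 and TP + FN = 73 + 4 = 77.

Hence, Recall = 73/77 = 0.94805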
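
Recall depends only on the data points whose actual class is positive. The sketch below illustrates this with synthetic labels built to match the article's counts (73 positives found, 4 missed). The lists are made up for the example and are not the classifier's real data.

```python
# Synthetic labels covering only the actual positives: 73 found, 4 missed
actual = [1] * 77
predicted = [1] * 73 + [0] * 4

tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
recall = tp / (tp + fn)
print(tp, fn, round(recall, 5))
```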

Specificity

Specificity, in contrast to recall, may be defined as the fraction of the actual negatives that our ML model correctly identifies as negative. We can easily calculate it from the confusion matrix with the following formula:

$$Specificity =\frac{TN}{TN+FP}$$

For the binary classifier built above, TN = 144 and TN + FP = 144 + 7 = 151.

Hence, Specificity = 144/151 = 0.95364
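
For completeness, the specificity formula as a small function, again using the article's counts. The function name `specificity` is chosen for this sketch.

```python
def specificity(tn, fp):
    """TN / (TN + FP): share of actual negatives predicted as negative."""
    return tn / (tn + fp)

spec = specificity(tn=144, fp=7)
print(round(spec, 5))
```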
