Confusion Matrix

  • A confusion matrix is a tabular summary of a classifier’s prediction outcomes.
  • It shows counts of true positives, true negatives, false positives, and false negatives.
  • Use it to identify where a classifier makes correct predictions and where it makes mistakes.

A confusion matrix is a tool for evaluating the performance of a classification algorithm. It is a table that displays the number of true positive, true negative, false positive, and false negative predictions the algorithm makes.

The confusion matrix organizes predictions by actual class versus predicted class, making it straightforward to see which instances were classified correctly (true positives and true negatives) and which were misclassified (false positives and false negatives). By examining these counts, you can identify specific types of errors the classifier makes and target improvements.
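As a minimal sketch of this counting (the label lists below are made-up examples, not from the article), the four cells can be tallied directly by comparing actual and predicted labels:

```python
# Tally confusion-matrix counts from actual vs. predicted labels.
# 1 = positive class, 0 = negative class; data is hypothetical.
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)  # correct positives
tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)  # correct negatives
fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)  # predicted positive, actually negative
fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)  # predicted negative, actually positive

print(f"TP={tp} TN={tn} FP={fp} FN={fn}")  # TP=3 TN=3 FP=1 FN=1
```

Libraries such as scikit-learn provide the same counts via `sklearn.metrics.confusion_matrix`, but the bookkeeping is just this comparison.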

                    Predicted Spam    Predicted Not Spam
  Actual Spam       TP                FN
  Actual Not Spam   FP                TN
  • TP: emails correctly identified as spam.
  • FN: emails that were spam but not identified as spam.
  • FP: emails incorrectly identified as spam but actually not spam.
  • TN: emails correctly identified as not spam.
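The spam table above can be sketched in code by keying each cell on its (actual, predicted) pair; the email labels below are hypothetical:

```python
from collections import Counter

# Hypothetical spam-filter outcomes, not from the article.
actual    = ["spam", "spam", "spam", "not spam", "not spam", "not spam"]
predicted = ["spam", "spam", "not spam", "spam", "not spam", "not spam"]

# Each (actual, predicted) pair is one cell of the confusion matrix.
cells = Counter(zip(actual, predicted))

tp = cells[("spam", "spam")]              # correctly identified as spam
fn = cells[("spam", "not spam")]          # spam that slipped through
fp = cells[("not spam", "spam")]          # legitimate email flagged as spam
tn = cells[("not spam", "not spam")]      # correctly identified as not spam
print(f"TP={tp} FN={fn} FP={fp} TN={tn}")
```

Keying on the pair generalizes cleanly to more than two classes, where the matrix has one cell per (actual, predicted) combination.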
                       Predicted Disease    Predicted No Disease
  Actual Disease       TP                   FN
  Actual No Disease    FP                   TN
  • TP: patients correctly identified as having the disease.
  • FN: patients who had the disease but were not identified.
  • FP: patients incorrectly identified as having the disease but did not.
  • TN: patients correctly identified as not having the disease.
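The four counts also feed the standard summary metrics. As a short sketch with hypothetical counts (the formulas are the standard definitions, not values from the article):

```python
# Hypothetical confusion-matrix counts for a disease test.
tp, fn, fp, tn = 80, 20, 10, 90

accuracy  = (tp + tn) / (tp + tn + fp + fn)  # fraction of all predictions that are correct
precision = tp / (tp + fp)                   # of predicted positives, fraction actually positive
recall    = tp / (tp + fn)                   # of actual positives, fraction correctly caught

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

For a disease test, recall matters when missing a sick patient (FN) is costly, while precision matters when a false alarm (FP) is costly; the confusion matrix exposes both error types separately where accuracy alone would not.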
A confusion matrix is useful for:

  • Evaluating the performance of a classification algorithm by revealing where it makes correct predictions and where it makes mistakes.
  • Identifying areas for improvement and fine-tuning the algorithm to achieve more accurate predictions.
Related terms:

  • True positive (TP)
  • True negative (TN)
  • False positive (FP)
  • False negative (FN)
  • Classification algorithm