Accuracy Score
- Measures the proportion of a classification model’s predictions that are correct.
- Simple to compute and interpret but can be misleading when classes are imbalanced.
- When class distribution is skewed, also consider precision, recall, and F1 score for a more complete evaluation.
Definition
Accuracy score is a metric used to evaluate the performance of a classification model. It is the ratio of the number of correct predictions made by the model to the total number of predictions made.
Explanation
Accuracy quantifies how many predictions a model gets right out of all predictions. A higher accuracy score indicates the model makes more correct predictions overall. However, accuracy does not account for the underlying class distribution in the dataset; when one class dominates, a model can achieve a high accuracy by always predicting the majority class while neglecting the minority class. In such situations, complementary metrics—precision, recall, and F1 score—should be used to assess model performance more thoroughly.
Examples
Churn prediction example
A classification model trained to predict whether a customer will churn makes 100 predictions, and 90 of those predictions are correct. The accuracy score is:
- 90/100 = 0.9
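The churn calculation above can be sketched in plain Python. The labels below are hypothetical, arranged so that 90 of 100 predictions match:

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Hypothetical churn labels: 90 matches and 10 mismatches out of 100.
y_true = [1] * 100
y_pred = [1] * 90 + [0] * 10
print(accuracy(y_true, y_pred))  # 0.9
```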
Imbalanced classes example
A dataset has two classes: Class A with 90% of the data and Class B with 10% of the data. A model that always predicts Class A will have an accuracy score of:
- 0.9

This high accuracy masks the model’s failure to predict the minority class (Class B).
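A minimal sketch of this trap, using a made-up dataset with the 90/10 split described above and a naive model that always predicts the majority class:

```python
# Imbalanced dataset: 90 samples of class "A", 10 of class "B".
y_true = ["A"] * 90 + ["B"] * 10

# A naive "model" that always predicts the majority class.
y_pred = ["A"] * 100

correct = sum(t == p for t, p in zip(y_true, y_pred))
print(correct / len(y_true))  # 0.9, despite never predicting class B
```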
Spam detection example (precision, recall, F1)
A model predicts whether an email is spam. The model makes 100 predictions and predicts 80 as spam. Of those 80 spam predictions, 70 are actually spam.
- Precision = 70/80 = 0.875
Out of 100 emails, 80 are actually spam. The model correctly predicts 70 of them.
- Recall = 70/80 = 0.875
F1 score is the harmonic mean of precision and recall. With precision = 0.875 and recall = 0.875:
- F1 = 2 * (0.875 * 0.875) / (0.875 + 0.875) = 0.875
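The three calculations above can be reproduced directly from the confusion-matrix counts implied by the example (the variable names below are the standard true-positive/false-positive/false-negative abbreviations, not from the original text):

```python
# Counts from the spam example:
tp = 70  # predicted spam and actually spam
fp = 10  # predicted spam but not spam (80 predicted - 70 correct)
fn = 10  # actual spam the model missed (80 actual - 70 caught)

precision = tp / (tp + fp)  # 70/80 = 0.875
recall = tp / (tp + fn)     # 70/80 = 0.875

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(precision, recall, f1)  # 0.875 0.875 0.875
```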
Notes or pitfalls
- Accuracy can be misleading on imbalanced datasets because it does not consider class distribution.
- In cases of class imbalance, use precision, recall, and F1 score for a more comprehensive evaluation.
Related terms
- Precision
- Recall
- F1 score