Log Loss
- A classification metric that evaluates predicted probabilities, not just hard labels.
- Penalizes confident incorrect predictions more than uncertain ones; lower values are better.
- Commonly used in machine learning competitions because it is sensitive to probabilities near 0 and 1.
Definition
Log loss, also known as cross entropy loss, is a performance metric used in classification tasks. It measures the discrepancy between the predicted probabilities and the actual outcomes.
For a binary classification example where the actual outcome is “positive”, the log loss is:

log loss = -log(p)

where p is the predicted probability assigned to the positive class and log is the natural logarithm.
For a multi-class classification task, log loss is given as:

log loss = -Σ y_k · log(p_k)

where the sum runs over the classes k, y_k is 1 for the actual class and 0 for every other class, and p_k is the predicted probability of class k.
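A minimal Python sketch of these single-example formulas (the function names are illustrative, not taken from any particular library); in practice the per-example losses are averaged over the whole dataset:

```python
import math

def binary_log_loss(y_true, p_pred):
    # y_true is 0 or 1; p_pred is the predicted probability of the positive class.
    # Covers both the positive (y_true = 1) and negative (y_true = 0) cases.
    return -(y_true * math.log(p_pred) + (1 - y_true) * math.log(1 - p_pred))

def multiclass_log_loss(actual_class, probs):
    # probs maps each class label to its predicted probability; only the
    # probability assigned to the actual class contributes to the loss.
    return -math.log(probs[actual_class])
```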
Explanation
Log loss evaluates how well predicted probability distributions match the actual outcomes. A lower log loss indicates predicted probabilities are closer to the true outcome. Because the metric uses the logarithm of predicted probabilities, predictions that assign probabilities near 0 to the true class incur large penalties, while assigning high probability to the true class yields a small loss. This sensitivity to probabilities close to 0 and 1 makes log loss especially informative in settings where calibrated probability estimates matter.
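In practice, log loss is usually reported as the mean per-example loss over a dataset, for example with scikit-learn's log_loss function (the labels and probabilities below are made up for illustration):

```python
from sklearn.metrics import log_loss

# True binary labels and two hypothetical sets of predicted probabilities
# for the positive class.
y_true      = [1, 0, 1, 1]
p_confident = [0.9, 0.2, 0.8, 0.7]   # mostly correct and fairly confident
p_hedged    = [0.6, 0.5, 0.6, 0.5]   # hedged toward 0.5

print(log_loss(y_true, p_confident))  # ≈ 0.227
print(log_loss(y_true, p_hedged))     # ≈ 0.602
```

The confident, mostly correct probabilities achieve the lower (better) log loss.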
Examples
Binary classification example
If the actual outcome is “positive”, log loss is computed as:

log loss = -log(p)

where p is the predicted probability of the positive class.
If the predicted probability of the positive class is 0.9, the log loss is:

-log(0.9) ≈ 0.105
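This value can be checked directly with Python's standard library:

```python
import math

p_positive = 0.9              # predicted probability of the positive class
print(-math.log(p_positive))  # ≈ 0.105
```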
Multi-class classification example
For 3 classes (A, B, C), log loss is:

log loss = -(y_A · log(p_A) + y_B · log(p_B) + y_C · log(p_C))

where y_A, y_B, y_C are 1 for the actual class and 0 otherwise, and p_A, p_B, p_C are the predicted probabilities for each class.
If the actual outcome is “A” and the predicted probabilities are [0.1, 0.3, 0.6], then:

log loss = -log(0.1) ≈ 2.303

The loss is large because only a probability of 0.1 was assigned to the true class.
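The same computation in Python, keeping the class order A, B, C from above:

```python
import math

probs = {"A": 0.1, "B": 0.3, "C": 0.6}  # predicted probabilities per class
actual = "A"
print(-math.log(probs[actual]))  # ≈ 2.303
```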
Use cases
- Commonly used in machine learning competitions as a performance metric because it assesses the quality of probability estimates and is sensitive to extreme probabilities.
Notes or pitfalls
- Log loss penalizes confident incorrect predictions heavily (predicted probabilities close to 0 or 1 for the wrong outcome); see the sketch below.
- It rewards correct predictions made with high confidence by producing low loss values.
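A quick sketch of this asymmetry for a single example whose actual class is “positive” (the probability values are chosen only for illustration):

```python
import math

# p is the probability assigned to the true (positive) class.
for p in (0.99, 0.9, 0.5, 0.1, 0.01):
    print(f"p = {p:<4}: log loss = {-math.log(p):.3f}")

# p = 0.99: log loss = 0.010  (confident and correct -> tiny loss)
# p = 0.5 : log loss = 0.693  (maximally uncertain)
# p = 0.01: log loss = 4.605  (confident but wrong -> heavy penalty)
```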
Related terms
- Cross entropy loss