Mean Squared Error (MSE)

  • A loss metric that averages the squared differences between model predictions and true values.
  • Commonly used for regression and easily optimized because it is differentiable.
  • Straightforward to interpret but sensitive to outliers, and its value is in the squared units of the target (e.g., dollars squared for stock prices).

Mean squared error (MSE) is a loss function defined as the average squared difference between a model's predicted values and the true values.
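Formally, for a dataset of n observations with true values y_i and predictions ŷ_i:

\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2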

MSE is calculated by squaring the error (the difference between the predicted and true value) for each observation, then averaging those squared errors across the dataset. Because the errors are squared, MSE is expressed in the squared units of the predicted and true values (for example, dollars squared for stock prices). MSE is differentiable and computationally simple, which makes it suitable for optimization algorithms such as gradient descent. However, squaring the errors makes MSE sensitive to outliers and less robust when the true-value distribution is heavily skewed.
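As a minimal sketch, MSE can be computed directly with NumPy; the function and array names below are illustrative, not a fixed API:

```python
import numpy as np

def mse(y_true, y_pred):
    """Average squared difference between true values and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)

# Illustrative values: a true price of 100 with predictions of 105 and 90.
print(mse([100, 100], [105, 90]))  # (25 + 100) / 2 = 62.5
```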

If the true stock price is 100 and the model predicts 105, the squared error is:

(100 - 105)^2 = 25

If the model predicts 90, the squared error is:

(100 - 90)^2 = 100

For a dataset of 10 true stock prices where the model's next-day predictions are 105, 90, 110, 95, 100, 105, 100, 95, 105, and 110, squaring each prediction's error against its true price gives the squared errors below, whose average is the MSE:

(25 + 100 + 16 + 9 + 1 + 0 + 0 + 9 + 1 + 16)/10 = 17.7
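The arithmetic can be verified in a few lines of Python, using the squared errors from the example above:

```python
squared_errors = [25, 100, 16, 9, 1, 0, 0, 9, 1, 16]
mse = sum(squared_errors) / len(squared_errors)
print(mse)  # 17.7
```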
Common uses

  • Regression problems that predict continuous values (examples given: stock price, temperature).
  • Loss function for models trained with optimization methods like gradient descent, as the sketch after this list illustrates.
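Because MSE is differentiable, its gradient with respect to model parameters has a simple closed form, which is what gradient descent exploits. The sketch below fits a one-parameter linear model by gradient descent on MSE; the data, learning rate, and iteration count are all illustrative choices:

```python
import numpy as np

# Illustrative data: y is roughly 3 * x plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=100)
y = 3.0 * x + rng.normal(scale=0.1, size=100)

w = 0.0    # single weight to learn
lr = 0.5   # learning rate
for _ in range(200):
    y_pred = w * x
    # Gradient of mean((w*x - y)^2) with respect to w.
    grad = np.mean(2.0 * (y_pred - y) * x)
    w -= lr * grad

print(w)  # converges near 3.0
```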
Limitations

  • Sensitive to outliers: a single large error can disproportionately increase MSE, as the example after this list shows.
  • Not robust to skewed data: errors on larger true values can dominate the average, potentially misrepresenting overall performance.
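A quick way to see the outlier sensitivity is to append one large residual to an otherwise accurate set of predictions; the residual values here are made up for illustration:

```python
errors = [1.0, -2.0, 1.5, -0.5, 2.0]  # typical residuals
mse = sum(e ** 2 for e in errors) / len(errors)
print(mse)  # 2.3

errors_with_outlier = errors + [20.0]  # one large outlier
mse_out = sum(e ** 2 for e in errors_with_outlier) / len(errors_with_outlier)
print(round(mse_out, 2))  # 68.58 -- the single outlier dominates
```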
Related terms

  • Loss function
  • Regression
  • Gradient descent