Hosmer-Lemeshow test
- A procedure to check how well a binary logistic regression model’s predicted probabilities match observed outcomes.
- The sample is grouped (commonly into deciles) by predicted probability and observed vs. predicted counts are compared.
- A small test statistic (large p-value) gives no evidence of poor fit; a large statistic (small p-value) indicates poor fit.
Definition
The Hosmer-Lemeshow test is a statistical method used to evaluate the goodness of fit of a binary logistic regression model.
Explanation
The test assesses how well a logistic model’s predicted probabilities agree with observed outcomes. The sample is sorted by predicted probability and divided into a predetermined number of groups (commonly ten, using deciles). For each group, the observed number of events is compared with the expected number, which is the sum of the predicted probabilities in that group; the group-level comparisons are then summed into a single test statistic.
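The grouping step can be sketched in plain Python. This is a minimal illustration rather than a reference implementation: the function name `hosmer_lemeshow_groups` and its parameters are hypothetical, and groups are cut by rank into nearly equal sizes, which is only one of several conventions.

```python
def hosmer_lemeshow_groups(probs, outcomes, n_groups=10):
    """Sort by predicted probability and split into n_groups nearly equal groups.

    Returns one (size, observed_events, expected_events) tuple per group.
    Hypothetical helper for illustration only.
    """
    if len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must have the same length")
    if not 1 <= n_groups <= len(probs):
        raise ValueError("n_groups must be between 1 and the sample size")

    # Pair each prediction with its 0/1 outcome and sort by predicted probability.
    ranked = sorted(zip(probs, outcomes), key=lambda pair: pair[0])
    n = len(ranked)

    table = []
    for g in range(n_groups):
        # Integer arithmetic spreads any remainder across the groups.
        start = g * n // n_groups
        stop = (g + 1) * n // n_groups
        members = ranked[start:stop]
        observed = sum(y for _, y in members)  # events actually seen in the group
        expected = sum(p for p, _ in members)  # sum of predicted probabilities
        table.append((len(members), observed, expected))
    return table
```

Tied predicted probabilities can fall on either side of a group boundary under this rank-based cut; statistical packages differ in how they handle ties, so group tables may not match exactly across tools.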
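The test is built on squared differences between observed and predicted values. For an individual, the difference is the observed outcome (0 or 1) minus the predicted probability, squared; summing these over the sample gives a total referred to here as the deviance (strictly a sum of squared residuals, not the likelihood-based deviance of logistic regression). The Hosmer-Lemeshow statistic applies the same idea at group level: within each group, the squared difference between observed and expected event counts is scaled by its variance under the model, and the scaled terms are summed across groups.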
Example expressions for the individual and total deviance as described in the source:
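With $y_i$ the observed outcome (0 or 1) and $\hat{p}_i$ the predicted probability for individual $i$ in a sample of size $n$ (notation assumed here for illustration), the individual and total quantities described above can be written as:

$$d_i = (y_i - \hat{p}_i)^2, \qquad D = \sum_{i=1}^{n} d_i$$

The Hosmer-Lemeshow statistic itself is usually written at group level. Assuming $G$ groups, with $n_g$ observations, $O_g$ observed events, and $E_g = \sum_{i \in g} \hat{p}_i$ expected events in group $g$:

$$\hat{C} = \sum_{g=1}^{G} \frac{(O_g - E_g)^2}{E_g \left(1 - E_g / n_g\right)}$$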
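The statistic is compared with a chi-square distribution, usually with the number of groups minus two degrees of freedom. If the statistic is small, so that the p-value is above the chosen significance level, there is no evidence of lack of fit and the logistic regression model is considered to fit well, meaning its predicted probabilities agree with the observed likelihood of the outcome. If the statistic is large, so that the p-value falls below the significance level, the model is considered not to have a good fit and may not accurately predict the likelihood of the outcome.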
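The decision rule can be sketched as follows, using only the standard library. The names `chi2_sf` and `hosmer_lemeshow` are hypothetical, the input is assumed to be a list of `(size, observed_events, expected_events)` tuples per group, and the degrees of freedom are assumed to be the number of groups minus two, the usual convention when the test is run on the sample used to fit the model.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)             # Q(1, z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))  # Q(1/2, z)
    # Recurrence for the regularized upper incomplete gamma function:
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def hosmer_lemeshow(table):
    """Return (statistic, df, p_value) from per-group (size, observed, expected)."""
    table = list(table)
    if len(table) < 3:
        raise ValueError("at least three groups are needed for df = groups - 2")
    statistic = 0.0
    for size, observed, expected in table:
        mean_prob = expected / size
        # Variance of the group's event count under the model.
        variance = size * mean_prob * (1.0 - mean_prob)
        if variance <= 0:
            raise ValueError("a group has expected events of 0 or equal to its size")
        statistic += (observed - expected) ** 2 / variance
    df = len(table) - 2
    return statistic, df, chi2_sf(statistic, df)


# Hypothetical group table, for illustration only.
stat, df, p_value = hosmer_lemeshow(
    [(10, 1, 1.0), (10, 2, 3.0), (10, 5, 5.0), (10, 9, 7.0)]
)
print(f"statistic={stat:.3f}, df={df}, p={p_value:.3f}")
```

A p-value above the chosen significance level (0.05 is a common choice) would be read as no evidence of poor fit. The chi-square tail is computed by hand here only to keep the sketch dependency-free; a statistics library would normally supply it.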
Examples
Predicting diabetes
A logistic regression model is developed using predictors such as age, body mass index, family history, and lifestyle habits. Predicted probabilities of developing diabetes are calculated for a sample of individuals. The Hosmer-Lemeshow test is used to assess the model’s ability to accurately predict the likelihood of diabetes development in that sample.
Predicting mortality in hospitalized patients
A logistic regression model is developed using predictors such as age, medical history, and severity of illness. Predicted probabilities of mortality are calculated for a sample of hospitalized patients. The Hosmer-Lemeshow test is used to assess the model’s ability to accurately predict the likelihood of mortality in that sample.
Use cases
- Commonly used in medical research to check whether a model that classifies patients into outcome categories (for example, diseased vs. non-diseased) produces predicted probabilities that match observed outcome rates.
Notes or pitfalls
- The test requires dividing the sample into a predetermined number of groups (commonly deciles) based on predicted probabilities and comparing observed and expected event counts within those groups.
- Interpretation: a test statistic that is not statistically significant (large p-value) gives no evidence of poor fit; a significant statistic (small p-value) indicates poor fit.
Related terms
- Logistic regression
- Deviance
- Deciles
- Predicted probability