Regression
- Uses observed data to estimate how one or more predictor variables relate to a continuous outcome.
- Commonly implemented with a linear functional form and coefficients estimated by methods like least squares or maximum likelihood.
- Enables predictions such as house sale price or employee salary from predictor values.
Definition
Regression is a statistical technique that is used to predict a continuous outcome based on one or more predictor variables.
Explanation
Regression involves choosing a functional form that expresses the dependent variable as a function of the predictor(s) and unknown coefficients. A common choice is the linear model:
y = β0 + β1x
In this formulation, y is the dependent variable, x is the predictor variable, and β0 and β1 are coefficients to be estimated. β0 is the intercept (the value of y when x = 0) and β1 is the slope (the change in y for a one-unit change in x).
Coefficients are estimated from data using techniques such as least squares or maximum likelihood, the latter also being the usual way to fit generalized linear models. Least squares, for example, chooses the coefficient values that minimize the sum of squared differences between the predicted values of y (from the model) and the observed values in the data. Once estimated, the model can produce predictions for new predictor values, subject to the chosen functional form and the assumption that the estimated coefficients accurately capture the relationship.
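As a minimal sketch of least squares estimation for the one-predictor linear model, the closed-form solution can be written in plain Python. The function name `fit_line` and the sample data are illustrative assumptions, not part of the original text.

```python
def fit_line(xs, ys):
    """Return (b0, b1) minimizing the sum of squared residuals for y = b0 + b1 * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of squared deviations of x; zero means every x is identical
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    # Sum of cross-deviations of x and y
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b1 = sxy / sxx             # slope
    b0 = y_mean - b1 * x_mean  # intercept
    return b0, b1


# Made-up data lying exactly on y = 1 + 2x
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
b0, b1 = fit_line(xs, ys)
print(b0, b1)
```

Because the sample points lie exactly on a line, the estimated intercept and slope recover that line. With noisy data they would instead be the values that make the squared residuals as small as possible.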
Examples
Real estate: predicting sale price from square footage
Collect sale price and square footage for a sample of houses, fit a regression model, then predict sale price from square footage.
This prediction assumes a linear relationship and that the estimated coefficients capture that relationship.
Employee salary: predicting salary from years of experience
Collect salaries and years of experience for a sample of employees, fit a regression model, then predict salary from years of experience.
This prediction likewise assumes a linear relationship and that the estimated coefficients accurately capture it.
Use cases
- Economics
- Finance
- Marketing
- Applied examples in real estate pricing and employee salary prediction (as illustrated above)
Notes or pitfalls
- Regression predictions rely on the chosen functional form (e.g., linear); if the true relationship is not captured by that form, predictions may be biased.
- Results depend on the assumption that the estimated coefficients accurately represent the relationship between predictors and the outcome.
Related terms
- Least squares
- Maximum likelihood
- Generalized linear models
- Predictor variable
- Dependent variable
- Intercept (β0)
- Slope (β1)