Regression

  • Uses observed data to estimate how one or more predictor variables relate to a continuous outcome.
  • Commonly implemented with a linear functional form and coefficients estimated by methods like least squares or maximum likelihood.
  • Enables predictions such as house sale price or employee salary from predictor values.

Regression is a statistical technique used to predict a continuous outcome from one or more predictor variables.

Regression involves choosing a functional form that expresses the dependent variable as a function of predictor(s) and unknown coefficients. A common choice is the linear model:

y = β0 + β1x

In this formulation, y is the dependent variable, x is the predictor variable, and β0 and β1 are coefficients to be estimated. β0 is the intercept (the value of y when x = 0) and β1 is the slope (the change in y for a one-unit change in x).

Coefficients are estimated from data using techniques such as least squares, maximum likelihood, or generalized linear models. Estimation finds coefficient values that minimize the discrepancy between the model's predicted values of y and the observed values in the data (for least squares, the sum of squared differences). Once estimated, the model can produce predictions for new predictor values, subject to the chosen functional form and the assumption that the coefficients accurately capture the relationship.
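The least-squares case can be sketched in a few lines of Python. For simple linear regression, the closed-form solution gives the slope as the sample covariance of x and y divided by the variance of x; the toy data below are made up and lie exactly on the line y = 2 + 3x, so the fit recovers those coefficients:

```python
# Minimal sketch of ordinary least squares for simple linear regression,
# using the closed-form normal equations. Data are invented for illustration.

def fit_ols(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Toy data generated from y = 2 + 3x (no noise)
xs = [1, 2, 3, 4]
ys = [5, 8, 11, 14]
b0, b1 = fit_ols(xs, ys)
print(b0, b1)  # → 2.0 3.0
```

With noisy real data the recovered coefficients would only approximate the underlying relationship, which is the point of the assumptions discussed above.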

Real estate: predicting sale price from square footage

Collect sale price and square footage for a sample of houses, fit a regression model, then predict sale price from square footage. For example:

y = β0 + β1(1,000) = $250,000

This prediction assumes a linear relationship and that the estimated coefficients capture that relationship.

Employee salary: predicting salary from years of experience

Collect salaries and years of experience for a sample of employees, fit a regression model, then predict salary from years of experience. For example:

y = β0 + β1(5) = $60,000

This prediction likewise assumes a linear relationship and that the estimated coefficients accurately capture it.

Common application areas

  • Economics
  • Finance
  • Marketing
  • Applied examples in real estate pricing and employee salary prediction (as illustrated above)

Limitations

  • Regression predictions rely on the chosen functional form (e.g., linear); if the true relationship is not captured by that form, predictions may be biased.
  • Results depend on the assumption that the estimated coefficients accurately represent the relationship between predictors and the outcome.

Related concepts

  • Least squares
  • Maximum likelihood
  • Generalized linear models
  • Predictor variable
  • Dependent variable
  • Intercept (β0)
  • Slope (β1)