Linear Regression
- Models how a dependent variable changes with one or more independent variables using a straight-line relationship.
- Fits the line that best captures that relationship from data, then uses it to make predictions.
- Assumes the relationship is linear; alternatives exist when it is not.
Definition
Linear regression is a statistical method that models the relationship between a dependent variable and one or more independent variables. The fitted model is then used to predict the value of the dependent variable from the values of the independent variables.
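In conventional notation, the model with p independent variables is usually written as shown here. The symbols are assumptions introduced for illustration and do not come from the definition above: y is the dependent variable, x_1 through x_p are the independent variables, the beta coefficients are estimated from data, and epsilon is the error not explained by the line.

```latex
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \varepsilon
```

With a single independent variable this reduces to a straight line with intercept \beta_0 and slope \beta_1.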
Explanation
Linear regression uses collected data on the variables of interest to fit the line that best describes the relationship between the dependent and independent variables, typically the line that minimizes the sum of squared prediction errors (ordinary least squares). Once the model is fitted, it produces a prediction when new independent-variable values are plugged into it. The method assumes the relationship is linear: a given change in an independent variable is associated with a proportional change in the dependent variable, regardless of the starting value.
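The fit-then-predict workflow can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: it assumes a single independent variable and assumes that "best" means ordinary least squares. The names `fit_line` and `predict` and the sample data are hypothetical.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Plug a new independent-variable value into the fitted line."""
    return intercept + slope * x


# Invented data that lies exactly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept, predict(slope, intercept, 4.0))
```

Because the sample points lie exactly on a line, the fit recovers a slope of 2 and an intercept of 1. With noisy real data the fitted line passes near the points rather than through them.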
Examples
Predicting house price from size
To predict the price of a house from its size, the dependent variable is the price and the independent variable is the size. Linear regression models the relationship between the two. Once the model is fitted, entering a size of 2,000 square feet, for example, yields a predicted price for a house of that size.
Predicting GPA from study hours
To predict a student's grade point average (GPA) from the number of hours they spend studying, the dependent variable is the GPA and the independent variable is the number of study hours. Linear regression models this relationship and predicts a GPA for a given amount of study time.
Use cases
- Modeling and predicting the value of a dependent variable from one or more independent variables when the relationship is assumed to be linear.
Notes or pitfalls
- Linear regression assumes a linear relationship between the dependent and independent variables. This assumption does not always hold, and when it fails the fitted line can describe the data poorly.
- When the relationship is non-linear, methods such as polynomial regression or non-parametric regression may be more appropriate.
Related terms
- Polynomial regression
- Non-parametric regression