Censored Regression Models
- Models for situations where the dependent variable is only partially observed because of lower or upper limits.
- Estimate the relationship between dependent and independent variables by combining exactly observed values with the information that censored values lie at or beyond a limit.
- The Tobit model is the most common approach; truncated regression is a related model for samples in which out-of-range observations are missing entirely.
Definition
Censored regression models are statistical models for data in which the dependent variable is censored, that is, not fully observed. Censoring occurs when values beyond a lower or upper limit are recorded only as being at or beyond that limit, for example when a survey offers a limited range of response options and the highest option also stands in for anything above it.
Explanation
When the dependent variable is censored, some observations are only known to lie at or beyond a limit (upper or lower) rather than observed exactly. Censored regression models treat censored observations differently from fully observed ones: an exactly observed value contributes its probability density to the likelihood, while a censored value contributes the probability of lying at or beyond the limit. Estimating the model this way recovers the underlying relationship between the dependent and independent variables, which an ordinary regression on the recorded values would distort.
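To make the effect of censoring concrete, the following minimal sketch simulates a dependent variable that is recorded at an upper limit and compares the least-squares slope on the latent values with the slope on the recorded values. The slope, noise level, limit, and the helper `ols_slope` are all assumptions chosen for illustration; they do not come from any particular dataset.

```python
import random

def ols_slope(xs, ys):
    """Least-squares slope of ys on xs (one regressor plus an intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

rng = random.Random(0)   # fixed seed so the run is repeatable
TRUE_SLOPE = 2.0         # assumed slope of the latent relationship
UPPER_LIMIT = 12.0       # assumed upper censoring limit

xs = [rng.uniform(0, 10) for _ in range(5000)]
latent = [1.0 + TRUE_SLOPE * x + rng.gauss(0, 2) for x in xs]

# Censoring: any latent value above the limit is recorded as the limit.
observed = [min(y, UPPER_LIMIT) for y in latent]

slope_latent = ols_slope(xs, latent)
slope_censored = ols_slope(xs, observed)
share_censored = sum(y >= UPPER_LIMIT for y in latent) / len(latent)

print(f"share censored:           {share_censored:.2f}")
print(f"slope on latent values:   {slope_latent:.2f}")
print(f"slope on recorded values: {slope_censored:.2f}")
```

In this setup the slope fitted to the recorded values comes out well below the slope of the latent relationship, which is the distortion a censored regression model is meant to correct.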
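One common censored regression approach is the Tobit model (named after James Tobin). In a Tobit model, an underlying (latent) dependent variable is assumed to follow a normal distribution whose mean is a linear function of the independent variables and whose standard deviation is constant. The recorded value equals the latent value when it lies within the limits and equals the limit otherwise. The coefficients and the standard deviation are estimated by maximum likelihood from all observations, with censored observations contributing the probability that the latent value lies at or beyond the limit.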
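The Tobit likelihood can be sketched in a few lines. The function below is an illustrative assumption rather than a library API: it handles a single regressor and right-censoring at a known `upper_limit`, and the names `tobit_loglik`, `intercept`, `slope`, and `sigma` are hypothetical. The toy data are made up for the example.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for its cdf

def tobit_loglik(intercept, slope, sigma, xs, ys, upper_limit):
    """Log-likelihood of a one-regressor Tobit model, right-censored at upper_limit."""
    total = 0.0
    for x, y in zip(xs, ys):
        mean = intercept + slope * x
        if y >= upper_limit:
            # Censored: we only know the latent value is at or above the limit.
            prob = 1.0 - STD_NORMAL.cdf((upper_limit - mean) / sigma)
            total += math.log(max(prob, 1e-300))  # guard against log(0)
        else:
            # Exactly observed: normal log-density at the recorded value.
            z = (y - mean) / sigma
            total += -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2 * math.pi)
    return total

# Toy data (made up): roughly y = 2x, with the last value recorded at the limit 8.0.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.2, 3.9, 6.1, 7.8, 8.0]

ll_slope_2 = tobit_loglik(0.0, 2.0, 0.5, xs, ys, upper_limit=8.0)
ll_slope_1_6 = tobit_loglik(0.0, 1.6, 0.5, xs, ys, upper_limit=8.0)
print(ll_slope_2 > ll_slope_1_6)
```

A full estimator would maximize this function over `intercept`, `slope`, and `sigma` with a numerical optimizer; the comparison above only shows that the likelihood favors the steeper slope even though the last recorded value sits at the limit.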
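A related but distinct approach is the truncated regression model, which applies when the sample itself is truncated at an upper or lower limit: observations whose dependent variable falls beyond the limit are missing from the data entirely, so neither their dependent nor their independent variables are recorded. Truncated regression estimates the relationship using only the observations within the truncation bounds and adjusts the likelihood for the fact that values beyond the bounds could not have been sampled. This differs from censoring, where such observations remain in the data with the dependent variable recorded at the limit.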
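The adjustment that truncated regression makes can be sketched as a log-likelihood in which each observed density is divided by the probability of falling inside the sample. The function below is a hypothetical illustration for one regressor and a known lower truncation point; `truncated_loglik` and its parameter names are assumptions, not a library API.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for its cdf

def truncated_loglik(intercept, slope, sigma, xs, ys, lower_limit):
    """Log-likelihood of a one-regressor normal model truncated below at lower_limit."""
    total = 0.0
    for x, y in zip(xs, ys):
        if y <= lower_limit:
            # Such observations cannot occur in a truncated sample.
            raise ValueError("observation at or below the truncation point")
        mean = intercept + slope * x
        z = (y - mean) / sigma
        log_density = -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2 * math.pi)
        # Probability that a draw with this mean lands above the truncation point.
        prob_in_sample = 1.0 - STD_NORMAL.cdf((lower_limit - mean) / sigma)
        total += log_density - math.log(max(prob_in_sample, 1e-300))
    return total

# With the truncation point at the mean, half of the distribution is unobservable,
# so the log-likelihood exceeds the untruncated one by log(2).
with_truncation = truncated_loglik(0.0, 0.0, 1.0, [0.0], [1.0], lower_limit=0.0)
nearly_untruncated = truncated_loglik(0.0, 0.0, 1.0, [0.0], [1.0], lower_limit=-100.0)
print(with_truncation - nearly_untruncated)
```

Unlike the Tobit likelihood, there is no separate term for observations at the limit, because those observations never enter the sample.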
Examples
Survey rating scale
An example of censored data is a survey asking participants to rate their satisfaction with a product on a scale of 1 to 5, with 1 being very dissatisfied and 5 being very satisfied. A participant whose satisfaction exceeds what a 5 expresses can still only answer 5, so that response is censored at the upper limit of the scale: the recorded value shows only that the true level is at least 5.
Tobit model: income and education
In a Tobit model examining the relationship between income and education level, suppose incomes above a reporting cap are recorded at the cap. Incomes below the cap enter the estimation as exact values, while capped incomes enter as the probability that income is at or above the cap. The fitted model then gives the expected underlying income at each education level, including for individuals whose recorded income sits at the cap.
Truncated regression: salary and job experience
In a truncated regression model examining the relationship between salary and job experience, suppose the sample contains only individuals whose salary is above a minimum threshold; individuals below it do not appear in the data at all. The model is estimated from the observed salaries, with each one weighted by the probability of exceeding the threshold, so that the estimated relationship between salary and job experience accounts for the missing low-salary individuals.
Use cases
- Survey research
- Clinical trials
- Environmental studies
Related terms
- Tobit model
- Truncated regression model