Canonical Correlation Analysis

  • Finds relationships between two multivariate sets by creating paired linear combinations (canonical variates) that have maximum mutual correlation.
  • Useful when variables within each set are not perfectly correlated and you want to identify which variables in each set relate most strongly to the other set.
  • Has limitations: assumes linear relationships, assumes normality within sets, and applies only to two sets of variables.

Canonical Correlation Analysis (CCA) is a statistical technique for investigating the relationship between two sets of variables. It constructs pairs of new variables, called canonical variates, each of which is a linear combination of the original variables in one set. The first pair is chosen so that its two variates are maximally correlated with each other. Each later pair is again maximally correlated, subject to being uncorrelated with all earlier canonical variates. With p variables in one set and q in the other, there are at most min(p, q) such pairs.
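The definition above can be written as an optimization problem. The notation here is an assumption introduced for illustration and does not come from the original text: x and y are the two variable sets as column vectors, a and b are weight vectors, and Σ_xx, Σ_yy, Σ_xy are the within-set and between-set covariance matrices.

```latex
\rho_1
  = \max_{a,\,b}\ \operatorname{corr}\!\left(a^\top x,\; b^\top y\right)
  = \max_{a,\,b}\
    \frac{a^\top \Sigma_{xy}\, b}
         {\sqrt{a^\top \Sigma_{xx}\, a}\;\sqrt{b^\top \Sigma_{yy}\, b}}
```

Here ρ_1 is the first canonical correlation, and u_1 = a^T x and v_1 = b^T y at the maximum are the first pair of canonical variates. Later pairs solve the same problem with the added constraint that the new variates are uncorrelated with the earlier ones.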

CCA examines the association between two variable sets by forming paired linear combinations, one from each set, so that the correlation between the members of each pair is as large as possible. Because each canonical variate is a weighted sum of the variables in its set, the weights (and the correlations between the variate and the original variables) indicate which original variables contribute most to the relationship across sets. This approach is particularly helpful when variables within a set are correlated with one another, a situation in which a series of separate regressions or pairwise correlations can be hard to interpret.
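As a rough illustration of how the paired combinations can be computed, the sketch below uses NumPy with a QR decomposition followed by a singular value decomposition. This is one standard numerical route, not necessarily the one any particular package uses. The function name `canonical_correlations` and the synthetic data are assumptions made for this example, and the code assumes each data matrix has full column rank and more rows than columns.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Sketch: canonical correlations and weights for X (n x p) and Y (n x q)."""
    # Center each column so cross-products are proportional to covariances.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Thin QR gives an orthonormal basis for each centered set
    # (assumes full column rank).
    Qx, Rx = np.linalg.qr(Xc)
    Qy, Ry = np.linalg.qr(Yc)
    # The singular values of Qx^T Qy are the canonical correlations,
    # returned in descending order.
    U, s, Vt = np.linalg.svd(Qx.T @ Qy, full_matrices=False)
    # Map back from the orthonormal bases to weights on the original variables.
    A = np.linalg.solve(Rx, U)
    B = np.linalg.solve(Ry, Vt.T)
    # Clip to guard against tiny floating-point overshoot above 1.
    return np.clip(s, 0.0, 1.0), A, B

# Purely synthetic data: one shared signal links the first column of each set.
rng = np.random.default_rng(0)
n = 500
latent = rng.normal(size=n)
X = np.column_stack([latent + rng.normal(size=n),
                     rng.normal(size=n),
                     rng.normal(size=n)])
Y = np.column_stack([latent + rng.normal(size=n),
                     rng.normal(size=n)])

rho, A, B = canonical_correlations(X, Y)

# First pair of canonical variates: weighted sums of the centered variables.
u = (X - X.mean(axis=0)) @ A[:, 0]
v = (Y - Y.mean(axis=0)) @ B[:, 0]

print("canonical correlations:", rho)
print("correlation of first variate pair:", np.corrcoef(u, v)[0, 1])
```

The correlation between `u` and `v` should match the first entry of `rho`, which is a quick check that the weights really produce the maximally correlated pair. Large weights in a column of `A` or `B` point to the variables that drive that pair, although weights are sensitive to the scale of each variable.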

Advantages:

  • Allows examination of relationships between two sets even when variables within each set are not perfectly correlated.
  • Identifies specific variables within each set that are most strongly related to the other set.

Limitations:

  • Assumes the relationship between the two sets is linear.
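  • Assumes variables within each set are normally distributed. This matters mainly for significance tests of the canonical correlations; the canonical variates themselves can be computed without it.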
  • Can only investigate relationships between two sets of variables (not between individual variables within a set in isolation).

Cognitive abilities and personality traits

For example, consider a study investigating the relationship between cognitive abilities and personality traits in a sample of individuals. The researchers measure each individual's scores on a cognitive ability test and a personality questionnaire, which gives two sets of variables. In CCA, the researchers would construct a pair of canonical variates, one a weighted sum of the cognitive scores and the other a weighted sum of the personality scores, such that the two are maximally correlated with each other.

In this study, the researchers may find that the canonical variate representing cognitive abilities is most strongly related to the personality trait of conscientiousness. Such a finding would suggest that, in this sample, individuals who score high on conscientiousness tend to have higher cognitive ability scores. It describes an association and does not by itself show that one causes the other.
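The kind of reading described above, in which one variate is "most strongly related" to a particular original variable, can be sketched by correlating each original variable with a canonical variate. Everything in the example below is hypothetical: the data are synthetic, the variable names (`reasoning`, `memory`, `conscientiousness`, `extraversion`) are invented for illustration, and the link between reasoning and conscientiousness is built into the simulation, so the output says nothing about real people.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000

# Synthetic data: a shared factor links 'reasoning' and 'conscientiousness'
# by construction. These are NOT real measurements.
shared = rng.normal(size=n)
cognitive = np.column_stack([
    shared + rng.normal(size=n),   # hypothetical 'reasoning' score
    rng.normal(size=n),            # hypothetical 'memory' score
])
personality = np.column_stack([
    shared + rng.normal(size=n),   # hypothetical 'conscientiousness' score
    rng.normal(size=n),            # hypothetical 'extraversion' score
])

def first_canonical_pair(X, Y):
    """Return the first pair of canonical variates and their correlation."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for each centered set (assumes full column rank).
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    U, s, Vt = np.linalg.svd(Qx.T @ Qy, full_matrices=False)
    return Qx @ U[:, 0], Qy @ Vt[0], s[0]

u, v, rho = first_canonical_pair(cognitive, personality)

# Correlate each personality score with the cognitive variate u.
# The sign of a canonical variate is arbitrary, so compare absolute values.
names = ["conscientiousness", "extraversion"]
cross_loadings = {
    name: np.corrcoef(personality[:, j], u)[0, 1]
    for j, name in enumerate(names)
}

print("first canonical correlation:", rho)
for name, value in cross_loadings.items():
    print(f"{name}: {value:+.3f}")
```

In this simulation the conscientiousness score should show a much larger absolute correlation with the cognitive variate than extraversion does, which mirrors the interpretation in the example above.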

  • Often used in psychology and other social sciences to study relationships between constructs such as personality traits and cognitive abilities, or between behavior and environmental factors.
  • CCA assumes linear relationships between the two sets of variables.
  • CCA assumes variables within each set are normally distributed.
  • CCA is limited to examining relationships between exactly two sets of variables.
  • Canonical variates
  • Linear combination
  • Correlation
  • Regression