
CART

  • Builds a binary decision tree where internal nodes split on feature thresholds and leaves give final predictions.
  • Used for both classification and regression tasks in predictive modeling and data mining.
  • Helps identify key feature thresholds and splits in complex, non-linear data.

CART, or Classification and Regression Trees, is a decision tree–based machine learning algorithm that creates a binary tree structure: each internal node represents a decision based on a specific feature or attribute, and each leaf node represents a final prediction or outcome.

CART constructs a binary tree by repeatedly splitting the dataset on feature-based decisions (thresholds or conditions) so that each resulting subset is more homogeneous with respect to the target. At each node, the algorithm selects the split that most reduces impurity, typically measured by Gini impurity for classification or by mean squared error for regression. Internal nodes correspond to those decisions, and the terminal (leaf) nodes hold the model’s predictions.
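As a minimal sketch of this process, the snippet below fits a small tree with scikit-learn’s DecisionTreeClassifier (scikit-learn implements an optimized version of CART). The toy dataset and feature names are purely illustrative.

```python
# Minimal CART sketch using scikit-learn's DecisionTreeClassifier.
# The tiny dataset below is illustrative, not real data.
from sklearn.tree import DecisionTreeClassifier, export_text

# Two features per sample: [feature_a, feature_b]
X = [[25, 1.0], [30, 0.5], [45, 2.0], [35, 1.5], [50, 0.2], [23, 1.8]]
y = [0, 0, 1, 1, 1, 0]  # binary target

tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(X, y)

# Each internal node is a threshold test on one feature;
# each leaf holds the predicted class.
print(export_text(tree, feature_names=["feature_a", "feature_b"]))
print(tree.predict([[40, 1.0]]))  # prediction for a new sample
```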

Credit risk assessment (banking)

A bank may use CART to predict whether a potential borrower will default on a loan. Input features such as credit score, income, debt-to-income ratio, and loan amount are used to create a decision tree that identifies key thresholds and splits. For instance, the algorithm may determine that borrowers with a credit score below 600 are more likely to default, or that borrowers with a debt-to-income ratio above 40% are also at higher risk. The final leaf nodes represent the predicted default outcome for each borrower.
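A hedged sketch of how such a tree could be trained and inspected follows; the borrower features, synthetic data, and labeling rule are hypothetical stand-ins rather than real lending data.

```python
# Hypothetical sketch of the loan-default example: feature names,
# synthetic values, and the labeling rule are illustrative only.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
n = 500
credit_score = rng.integers(450, 850, n)
dti = rng.uniform(5, 60, n)                 # debt-to-income ratio (%)
income = rng.uniform(20_000, 150_000, n)
loan_amount = rng.uniform(5_000, 50_000, n)

# Toy rule for generating labels: low score or high DTI -> more defaults
default = ((credit_score < 600) | (dti > 40)).astype(int)

X = np.column_stack([credit_score, income, dti, loan_amount])
model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, default)

# The printed tree exposes the learned thresholds, e.g. splits near
# credit_score <= 600 and dti > 40 if they separate the classes well.
print(export_text(
    model,
    feature_names=["credit_score", "income", "dti", "loan_amount"],
))
```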

Customer churn prediction (telecommunications)

CART can predict customer churn using features like customer tenure, average monthly bill, and call volume to build a decision tree that highlights factors contributing to churn. For example, the algorithm may determine that customers with a tenure of less than one year are more likely to churn, or that customers with a high average monthly bill are less likely to churn. The leaf nodes represent the predicted churn outcome for each customer.
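The sketch below illustrates this on synthetic data; the feature names, the simple labeling rule, and the example customer are hypothetical.

```python
# Hypothetical churn sketch: feature names and data are illustrative.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
n = 300
tenure_years = rng.uniform(0, 10, n)
monthly_bill = rng.uniform(20, 200, n)
call_volume = rng.integers(0, 500, n)

# Toy labeling rule: short-tenure customers churn more often
churn = (tenure_years < 1).astype(int)

X = np.column_stack([tenure_years, monthly_bill, call_volume])
model = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, churn)

# Predicted churn (leaf outcome) for a new customer:
# 0.5 years tenure, $80/month bill, 120 calls
print(model.predict([[0.5, 80, 120]]))
```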

  • Predictive modeling
  • Data mining
  • Credit risk assessment (example above)
  • Customer churn prediction (example above)

  • Particularly useful when data is complex and non-linear, and when relationships between features and outcomes are not well-understood.
  • CART produces a binary tree and identifies explicit feature thresholds and splits.

  • Decision tree
  • Classification
  • Regression
  • Classification and Regression Trees (CART)