Mirror Match Bootstrapping
- Create additional training examples by slightly modifying existing data (e.g., rotate, crop, blur, or substitute words).
- Helps models learn from more varied examples, especially for underrepresented classes or linguistic variants.
- Can improve accuracy and generalization while reducing the risk of overfitting.
Definition
Mirror-match bootstrapping is a method used in artificial intelligence (AI) to improve the performance of a machine learning model by creating new training data through slight modifications of existing data.
Explanation
The technique generates additional training examples by taking existing samples and applying small alterations. These modified samples are added to the training dataset so the model encounters more varied instances during learning. By expanding the variety of examples, the model can better generalize to new data and improve accuracy, while reducing the likelihood of overfitting.
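The core loop is simple: for each original sample, produce one or more lightly altered copies that keep the same label, then train on the combined set. The sketch below is illustrative rather than tied to any particular library; the names `augment_dataset`, `transforms`, and `copies_per_sample` are assumptions, and the caller supplies the modification functions.

```python
import random

def augment_dataset(samples, transforms, copies_per_sample=2):
    """Expand a labeled dataset by adding slightly modified copies of existing samples.

    samples           -- list of (example, label) pairs
    transforms        -- list of functions, each returning a lightly altered example
    copies_per_sample -- number of altered copies to create per original sample
    """
    augmented = list(samples)  # keep every original sample
    for example, label in samples:
        for _ in range(copies_per_sample):
            transform = random.choice(transforms)
            # The label is carried over unchanged: the alteration is small
            # enough that the sample still belongs to the same class.
            augmented.append((transform(example), label))
    return augmented
```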
Examples
Image classification (fruits)
A model trained to recognize fruits (apples, bananas, oranges) may struggle to recognize strawberries due to insufficient strawberry images. A data scientist can take a few strawberry images and make slight modifications—such as rotating, cropping, or adding a blur effect—and add those modified images to the training set so the model learns from more strawberry examples.
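As a concrete sketch of those image modifications, the snippet below uses the Pillow library (not named in the article) to write a rotated, a cropped, and a blurred copy of each strawberry image. The folder paths, the 15-degree rotation, the 80% crop, and the blur radius are all illustrative choices, not prescribed values.

```python
from pathlib import Path
from PIL import Image, ImageFilter

def augment_image(path, out_dir):
    """Save three lightly modified copies of one image: rotated, cropped, and blurred."""
    img = Image.open(path)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(path).stem

    # Small rotation; expand=True keeps the corners from being clipped.
    img.rotate(15, expand=True).save(out_dir / f"{stem}_rot15.png")

    # Crop to the central 80% of the image.
    w, h = img.size
    dx, dy = int(w * 0.1), int(h * 0.1)
    img.crop((dx, dy, w - dx, h - dy)).save(out_dir / f"{stem}_crop.png")

    # Mild Gaussian blur.
    img.filter(ImageFilter.GaussianBlur(radius=2)).save(out_dir / f"{stem}_blur.png")

# Hypothetical usage: augment every strawberry image in a folder.
for image_path in Path("data/strawberries").glob("*.jpg"):
    augment_image(image_path, "data/strawberries_augmented")
```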
Natural language processing (sentiment classification)
A text classification model that labels text as positive or negative may have difficulty with slang or colloquial language. To address this, existing text samples can be modified by replacing certain words with slang or colloquial equivalents. The resulting examples are added to the training data so the model learns to recognize and classify such language correctly.
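A minimal sketch of that word substitution follows. The slang dictionary is a tiny invented sample (a real project would use a larger, curated mapping), and the function name `slangify` and the 50% replacement probability are assumptions made for illustration.

```python
import random

# Tiny illustrative mapping from standard words to slang equivalents.
SLANG = {
    "good": ["dope", "lit"],
    "bad": ["trash", "wack"],
    "very": ["hella", "super"],
}

def slangify(text, prob=0.5):
    """Return a copy of `text` with some words swapped for slang equivalents."""
    words = []
    for word in text.split():
        key = word.lower().strip(".,!?")
        if key in SLANG and random.random() < prob:
            words.append(random.choice(SLANG[key]))
        else:
            words.append(word)
    return " ".join(words)

# The augmented sentence keeps the original sentiment label.
original_text, label = "The movie was very good.", "positive"
augmented_example = (slangify(original_text), label)
```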