Mendelian Randomization
- The provided source text does not define Mendelian randomization; it discusses M-estimators instead.
- M-estimators are robust estimators designed to reduce the influence of outliers.
- The two robust statistics covered are the median absolute deviation (MAD) and the Tukey biweight, illustrated with a dataset containing 99 similar values and one outlier.
Definition
The source content defines M-estimators and two specific robust statistics:
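- M-estimators: Presented in the source as estimators that are robust to outliers, designed to produce accurate estimates even when the data contain a few unusually large or small values. More generally, an M-estimator is obtained by minimizing a sum of a loss function over the data, and its robustness depends on the choice of that loss.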
- Median absolute deviation (MAD): A measure of dispersion computed by taking the median of the absolute differences between each data value and the median of the dataset.
- Tukey biweight: Described in the source as a measure of location computed from the squared differences between each value and the median, which are summed and divided by a constant that depends on the number of values in the dataset.
Explanation
- M-estimators aim to limit the effect of a small number of extreme observations so that the resulting estimate reflects the central tendency or spread of the bulk of the data.
- To compute the MAD:
  - Find the median of the dataset.
  - For each value, calculate the absolute difference between that value and the median.
  - Take the median of those absolute differences; this is the MAD.
- To compute the Tukey biweight (as described in the source):
  - Find the median of the dataset.
  - For each value, compute the difference between the value and the median and square that difference.
  - Sum the squared differences and divide by a constant that depends on the sample size.
- The source contrasts these robust measures with traditional measures (mean and standard deviation), noting that the latter can be strongly influenced by outliers.
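The three MAD steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `median_absolute_deviation` and the sample data are hypothetical and do not come from the source. It returns the raw MAD, without the scaling factor that is sometimes applied to make it comparable to a standard deviation.

```python
from statistics import median


def median_absolute_deviation(values):
    """Return the raw (unscaled) MAD of a non-empty sequence of numbers."""
    center = median(values)                        # step 1: median of the data
    abs_devs = [abs(v - center) for v in values]   # step 2: absolute differences
    return median(abs_devs)                        # step 3: median of those differences


# Hypothetical data: six moderate values and one extreme value.
data = [2, 4, 4, 5, 7, 9, 100]
print(median_absolute_deviation(data))  # prints 2
```

Here the median is 5, the absolute differences are 3, 1, 1, 0, 2, 4 and 95, and their median is 2, so the extreme value 100 has no effect on the result.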
Examples
Outlier example (from source)
- Consider a dataset with 99 values that are all very close to each other, and one value that is much larger or smaller than the others.
- Standard deviation: The single outlier will have a large influence, increasing the standard deviation substantially.
- MAD: The MAD will be relatively unaffected because the median of the absolute differences is not greatly influenced by a single extreme value.
- Mean: The outlier will shift the mean away from the median.
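- Tukey biweight: The source states that the Tukey biweight will be relatively unaffected by the outlier. Note that a plain sum of squared differences would itself be dominated by a single extreme value; in the standard formulation of the Tukey biweight, the robustness comes from weights that shrink to zero for values far from the median.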
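The comparison above can be checked numerically. The sketch below builds a made-up dataset of 99 values equal to 99, 100 or 101 and then appends a single outlier of 1000; the specific numbers and the helper names `mad` and `summarize` are assumptions chosen for illustration, not values from the source. The Tukey biweight is left out because the source does not give its formula.

```python
from statistics import mean, median, pstdev


def mad(values):
    """Raw median absolute deviation around the median."""
    center = median(values)
    return median([abs(v - center) for v in values])


def summarize(values):
    """Collect the traditional and robust summaries side by side."""
    return {
        "mean": mean(values),
        "stdev": pstdev(values),   # population standard deviation
        "median": median(values),
        "mad": mad(values),
    }


# Hypothetical data: 99 values close together (99, 100, 101 repeated 33 times).
clean = [100 + (i % 3 - 1) for i in range(99)]
with_outlier = clean + [1000]      # add one value far from the rest

print("without outlier:", summarize(clean))
print("with outlier:   ", summarize(with_outlier))
```

With these numbers the mean moves from 100 to 109 and the standard deviation grows from roughly 0.8 to roughly 90, while the median stays at 100 and the MAD stays at 1.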
Use cases
- Use robust estimators such as the MAD and the Tukey biweight when robustness to outliers is required, for example when a dataset may contain a small number of extreme values that should not dominate the estimate.
Notes or pitfalls
- Traditional measures like the mean and standard deviation can be heavily influenced by single outliers; robust measures mitigate this effect.
- The source describes the Tukey biweight calculation in broad terms (summing squared differences and dividing by a constant dependent on sample size) without providing the specific constant or formula.
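Because the source leaves the formula open, the following is only a sketch of one commonly used one-step form of the Tukey biweight location estimate, not the source's method. It scales each deviation from the median by a multiple of the MAD, gives each value the weight (1 - u^2)^2 when |u| < 1 and zero otherwise, and returns the weighted mean. The function name, the tuning constant `c=5.0` and the small `eps` that guards against a zero MAD are all assumptions; other references use different constants.

```python
from statistics import median


def tukey_biweight_location(values, c=5.0, eps=1e-4):
    """One-step Tukey biweight location estimate (assumed form, see lead-in).

    c and eps are illustrative defaults, not values given by the source.
    """
    m = median(values)
    s = median([abs(v - m) for v in values])   # raw MAD as the scale
    weights = []
    for v in values:
        u = (v - m) / (c * s + eps)            # scaled distance from the median
        # Bisquare weight: smooth down-weighting, exactly zero beyond |u| >= 1.
        weights.append((1 - u * u) ** 2 if abs(u) < 1 else 0.0)
    # At least half the values lie within one MAD of the median,
    # so the weights always sum to a positive number.
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Hypothetical data: 99 values near 100 plus one outlier at 1000.
data = [100 + (i % 3 - 1) for i in range(99)] + [1000]
print(tukey_biweight_location(data))
```

In this example the outlier lies far beyond `c` times the MAD from the median, so it receives zero weight and the estimate stays very close to 100.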
Related terms
- M-estimators
- Median absolute deviation (MAD)
- Tukey biweight
- Median
- Mean
- Standard deviation