Box And Whisker Plot

  • Visualizes a dataset’s central tendency and spread using a box (Q1 to Q3) and whiskers (to min/max).
  • Shows the median and interquartile range (IQR) and helps identify values outside the whiskers as potential outliers.
  • Useful for comparing distributions across different datasets.

A box and whisker plot is a graphical representation of a dataset that displays the distribution of the data and its range. It consists of a box, which represents the middle 50% of the data, and two whiskers, which extend from the box to show the range of the data. The plot is used to compare distributions of different datasets and to identify potential outliers.

  • Calculate the median and the first (Q1) and third (Q3) quartiles of the data. The median is the middle value; the quartiles divide the data into four equal parts.
  • The box is a rectangle extending from Q1 to Q3; its height is Q3 − Q1 (the interquartile range, IQR).
  • Whiskers extend from the box to show range: the upper whisker from Q3 to the maximum value, and the lower whisker from Q1 to the minimum value. A common convention caps each whisker at 1.5 × IQR beyond the box, so that it ends at the most extreme data point within that distance.
  • Outliers are values that lie outside the upper and lower whiskers and are typically plotted as individual points.
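  • For a sorted dataset with 10 values:
    • The median is the mean of the 5th and 6th values.
    • Using the median-of-halves method, the first and third quartiles are the 3rd and 8th values, respectively (the medians of the lower five and upper five values).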
  • If Q1 is 10 and Q3 is 20:
    • The box extends from 10 to 20.
    • The box height (IQR) is 10.
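  • With Q1 = 10 and Q3 = 20, if the upper whisker extends from 20 to 30 and the lower whisker extends from 10 down to the smallest non-outlying value, and there is a value of 40 in the dataset, that value (40) would be considered an outlier. Under the 1.5 × IQR rule the upper fence is 20 + 1.5 × 10 = 35, and 40 lies beyond it.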
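The construction steps above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function name `box_plot_stats`, the multiplier parameter `k`, and the sample data are all hypothetical, and it assumes the median-of-halves quartile method together with the 1.5 × IQR whisker convention. Other tools may use different quartile definitions and give slightly different values.

```python
from statistics import median

def box_plot_stats(values, k=1.5):
    """Return the quantities a box plot draws (median-of-halves quartiles)."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # values below the median position
    upper = data[half + n % 2:]    # values above it (skips the middle value when n is odd)
    q1, q3 = median(lower), median(upper)
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    # Whiskers end at the most extreme data points that are still inside the fences.
    inside = [x for x in data if low_fence <= x <= high_fence]
    return {
        "median": median(data),
        "q1": q1,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": [x for x in data if x < low_fence or x > high_fence],
    }

# Hypothetical 10-value dataset, chosen so that one value falls outside the fences.
stats = box_plot_stats([2, 4, 5, 7, 8, 9, 11, 12, 14, 40])
print(stats)
```

For this sample, the median is the mean of the 5th and 6th sorted values, the quartiles are the 3rd and 8th values, and 40 is reported as the only outlier.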

Comparing two datasets (Dataset A and Dataset B)

  • Dataset A: 10, 20, 30, 40, 50, 60, 70
    • Median: 40
    • Q1 and Q3: 20 and 60
    • Box: extends from 20 to 60; box height: 40
    • Upper whisker: extends from 60 to 70
    • Lower whisker: extends from 10 to 20
    • No values lie outside the whiskers (no outliers)
  • Dataset B: 20, 40, 60, 80, 100, 120, 140
    • Median: 80
    • Q1 and Q3: 40 and 120
    • Box: extends from 40 to 120; box height: 80
    • Upper whisker: extends from 120 to 140
    • Lower whisker: extends from 20 to 40
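    • No values lie outside the whiskers (no outliers): 140 is the maximum and marks the end of the upper whisker, and under the 1.5 × IQR rule it is well below the upper fence of 120 + 1.5 × 80 = 240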

Use cases

  • Comparing the distribution (median, spread, and range) of different datasets.
  • Identifying potential outliers within a dataset.

Notes

  • Outliers are values that are significantly higher or lower than the rest of the data and appear as points outside the whiskers.
  • Whisker length is often tied to the interquartile range (IQR = Q3 − Q1): a common convention ends each whisker at the most extreme value within 1.5 × IQR of the box.

Related terms

  • Median
  • Quartiles (Q1, Q3)
  • Interquartile Range (IQR)
  • Outlier
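The Dataset A and Dataset B comparison can be reproduced with Python's standard library. The sketch below is illustrative only: `summarize` and its parameter `k` are hypothetical names, and the 1.5 × IQR outlier rule is an assumption, since the text does not name a specific rule. It uses `statistics.quantiles` with `method="exclusive"`, which for these seven-value datasets gives the same quartiles as listed in the comparison.

```python
from statistics import quantiles

def summarize(values, k=1.5):
    """Quartiles, IQR, and 1.5 x IQR outliers for one dataset."""
    data = sorted(values)
    q1, med, q3 = quantiles(data, n=4, method="exclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    outliers = [x for x in data if x < low_fence or x > high_fence]
    return {"q1": q1, "median": med, "q3": q3, "iqr": iqr, "outliers": outliers}

dataset_a = [10, 20, 30, 40, 50, 60, 70]
dataset_b = [20, 40, 60, 80, 100, 120, 140]

summary_a = summarize(dataset_a)
summary_b = summarize(dataset_b)

for name, summary in (("A", summary_a), ("B", summary_b)):
    print(name, summary)
```

Dataset B has twice the median and twice the IQR of Dataset A, so its box would sit higher and be twice as tall. Under the assumed rule, neither dataset has values beyond the fences.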