Dip Test
- A statistical procedure for detecting multimodality in a distribution.
- Combines a histogram-based visual check with Hartigans’ dip test statistic to assess whether a distribution has more than one peak.
- The dip statistic is compared to a critical value (or converted to a p-value): a large dip indicates multimodality, while a small dip is consistent with unimodality.
Definition
The dip test is a statistical method used to determine whether a dataset is multimodal. Multimodality refers to the presence of multiple modes, or peaks, in the distribution of the data.
Explanation
- The procedure begins with a histogram of the data, used to visually inspect the frequency of data points across value ranges and to identify potential multiple peaks.
- It then applies Hartigans’ dip test, which yields a dip statistic: the largest gap between the empirical distribution function of the data and the closest unimodal distribution function.
- If the dip statistic is greater than a critical value (which depends on the sample size and significance level), it suggests multimodality; if it is smaller, the data are consistent with unimodality.
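For reference, the dip statistic is usually written as follows. The notation is an assumption introduced here, not taken from the text above: $F_n$ is the empirical distribution function of the sample and $\mathcal{U}$ is the set of all unimodal distribution functions.

```latex
D(F_n) = \inf_{G \in \mathcal{U}} \, \sup_{x} \, \lvert F_n(x) - G(x) \rvert
```

A sample drawn from a unimodal distribution tends to give a value of $D(F_n)$ close to zero, while a clearly bimodal sample gives a larger value.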
Examples
Example 1
Dataset of student grades with the following distribution:
- 60-70: 10 students
- 70-80: 15 students
- 80-90: 20 students
- 90-100: 5 students
- A histogram shows a unimodal distribution with a single peak in the 80-90 range, making it the most common score range.
- Hartigans’ dip statistic is computed from the individual scores rather than from the bin counts; it measures the largest gap between the empirical distribution function and the closest unimodal distribution function.
- A dip statistic below the critical value suggests that the distribution is unimodal, in agreement with the histogram.
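Because the dip statistic works on the empirical distribution function rather than on the histogram bars, it can help to see what that function looks like for this example. The following is a minimal sketch, assuming only the binned counts listed above; `binned_ecdf` is a hypothetical helper name, and the result is a coarse, bin-level approximation, since the individual scores are not given.

```python
from itertools import accumulate


def binned_ecdf(counts):
    """Cumulative proportion of observations at the upper edge of each bin."""
    total = sum(counts)
    return [running / total for running in accumulate(counts)]


# Bin upper edges and counts from Example 1.
upper_edges = [70, 80, 90, 100]
counts = [10, 15, 20, 5]

ecdf = binned_ecdf(counts)
for edge, value in zip(upper_edges, ecdf):
    print(f"F({edge}) = {value:.2f}")
# F(70) = 0.20
# F(80) = 0.50
# F(90) = 0.90
# F(100) = 1.00
```

A real dip test would use the raw scores, so these four points only illustrate the shape of the curve the statistic is measured against.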
Example 2
Dataset of student grades with the following distribution:
- 60-70: 10 students
- 70-80: 5 students
- 80-90: 10 students
- 90-100: 15 students
- A histogram shows a multimodal distribution with two peaks, in the 60-70 and 90-100 ranges, separated by a trough at 70-80, suggesting two subgroups of students.
- Hartigans’ dip statistic measures how far the empirical distribution function of the scores departs from the closest unimodal distribution function.
- A dip statistic above the critical value suggests that the distribution is multimodal.
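The histogram readings in both examples can be checked mechanically by looking for bins that are taller than their neighbours. The sketch below is an illustration of the visual step only, not of Hartigans’ dip test; `histogram_peaks` is a hypothetical helper, and it treats a bin at either end as a peak when it is taller than its single neighbour.

```python
def histogram_peaks(counts):
    """Return indices of bins strictly taller than both neighbouring bins."""
    peaks = []
    for i, count in enumerate(counts):
        # Bins at the edges have no neighbour on one side.
        left = counts[i - 1] if i > 0 else float("-inf")
        right = counts[i + 1] if i < len(counts) - 1 else float("-inf")
        if count > left and count > right:
            peaks.append(i)
    return peaks


bins = ["60-70", "70-80", "80-90", "90-100"]
example_1 = [10, 15, 20, 5]
example_2 = [10, 5, 10, 15]

peaks_1 = [bins[i] for i in histogram_peaks(example_1)]
peaks_2 = [bins[i] for i in histogram_peaks(example_2)]
print(peaks_1)  # ['80-90']
print(peaks_2)  # ['60-70', '90-100']
```

Counting peaks this way is sensitive to bin width and to noise, which is why the procedure pairs the histogram with a formal test.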
Use cases
- Identifying whether a distribution has multiple peaks.
- Detecting the presence of multiple subgroups within a population based on observed data.
Related terms
- Hartigans’ dip test
- Multimodality
- Histogram