Multivariate Hypergeometric Distribution
- Computes the probability of obtaining specified counts for multiple categories when sampling without replacement from a finite population.
- Multiply one binomial coefficient per category (the ways to choose the required count from that category), then divide by the binomial coefficient for choosing the whole sample from the population.
- Generalizes the two-group hypergeometric case to any number of distinct groups.
Definition
The multivariate hypergeometric distribution is a probability distribution over the counts of items from each group in a fixed-size sample drawn without replacement from a finite population. It generalizes the standard hypergeometric distribution (which covers two groups) to cases with more than two distinct groups.
Explanation
Given a finite population and sampling without replacement, the probability of a particular combination of counts across categories is obtained by counting the ways to choose the required number from each category, multiplying those counts together, and dividing by the number of ways to choose the total sample from the whole population. For the two-group case with population size N, K items of type A, N-K items of type B, and a draw of n items yielding X of type A and Y = n-X of type B, the probability is the binomial coefficient "K choose X" times "N-K choose Y", divided by "N choose n".
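The counting argument above can be sketched in Python. This is a minimal illustration rather than a reference implementation: the function name `multivariate_hypergeom_pmf` and its parameters `group_sizes` and `counts` are assumptions introduced here, and the code relies only on `math.comb` and `math.prod` from the standard library (Python 3.8 or later).

```python
from math import comb, prod


def multivariate_hypergeom_pmf(group_sizes, counts):
    """Probability of drawing exactly counts[i] items from group i
    when sum(counts) items are sampled without replacement.

    group_sizes and counts are hypothetical parameter names:
    group_sizes[i] is the number of items of category i in the population.
    """
    if len(group_sizes) != len(counts):
        raise ValueError("group_sizes and counts must have the same length")
    if any(k < 0 for k in counts):
        raise ValueError("counts must be non-negative")
    # An outcome that asks for more items than a group holds is impossible.
    if any(k > size for size, k in zip(group_sizes, counts)):
        return 0.0

    population = sum(group_sizes)  # N
    sample = sum(counts)           # n

    # Ways to pick the required number from each group, multiplied together.
    favourable = prod(comb(size, k) for size, k in zip(group_sizes, counts))
    # Ways to pick any n items from the whole population.
    total = comb(population, sample)
    return favourable / total
```

For the two-group case, calling `multivariate_hypergeom_pmf([K, N - K], [X, n - X])` reduces to the standard hypergeometric probability.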
Examples
Two-category example (red and blue)
Suppose a population of 10 items contains 5 red and 5 blue items. When 3 items are drawn without replacement, the probability of obtaining 2 red and 1 blue is:

$$P(\text{2 red}, \text{1 blue}) = \frac{\binom{5}{2}\binom{5}{1}}{\binom{10}{3}} = \frac{10 \cdot 5}{120} = \frac{5}{12} \approx 0.417$$
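The two-category probability can also be checked by brute force, since there are only 120 possible draws. The sketch below enumerates every unordered draw of 3 items and counts those with exactly 2 red; the variable names (`population`, `draws`, `hits`) are illustrative choices, not part of any library.

```python
from fractions import Fraction
from itertools import combinations

# Label the 10 items: 5 red ("R") and 5 blue ("B").
population = ["R"] * 5 + ["B"] * 5

# Every unordered draw of 3 distinct items, identified by position.
draws = list(combinations(range(len(population)), 3))

# A draw of 3 with exactly 2 red necessarily contains 1 blue.
hits = sum(
    1 for draw in draws
    if sum(population[i] == "R" for i in draw) == 2
)

probability = Fraction(hits, len(draws))
print(len(draws), hits, probability)  # 120 50 5/12
```

Enumeration scales poorly with population size, so it serves only as a sanity check on the closed-form count.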
Three-category example (red, blue, green)
Suppose a population of 20 items contains 8 red, 6 blue, and 6 green items. When 6 items are drawn without replacement, the probability of obtaining 2 red, 2 blue, and 2 green is:

$$P(\text{2 red}, \text{2 blue}, \text{2 green}) = \frac{\binom{8}{2}\binom{6}{2}\binom{6}{2}}{\binom{20}{6}} = \frac{28 \cdot 15 \cdot 15}{38760} = \frac{105}{646} \approx 0.163$$
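The three-category calculation can be reproduced with exact rational arithmetic. The following sketch, which assumes nothing beyond the standard library, computes the probability for the (2, 2, 2) outcome and also confirms that the probabilities of all possible outcomes sum to 1. The helper name `probability` and the dictionary `group_sizes` are hypothetical names used only for this illustration.

```python
from fractions import Fraction
from math import comb

group_sizes = {"red": 8, "blue": 6, "green": 6}
sample_size = 6

# Number of ways to choose 6 items from all 20.
total_draws = comb(sum(group_sizes.values()), sample_size)


def probability(red, blue, green):
    """Exact probability of drawing the given count of each colour."""
    ways = (
        comb(group_sizes["red"], red)
        * comb(group_sizes["blue"], blue)
        * comb(group_sizes["green"], green)
    )
    return Fraction(ways, total_draws)


target = probability(2, 2, 2)
print(target)  # 105/646

# Sanity check: summing over every split of the 6 draws gives exactly 1.
total = sum(
    probability(r, b, sample_size - r - b)
    for r in range(sample_size + 1)
    for b in range(sample_size + 1 - r)
)
```

Using `Fraction` avoids floating-point rounding, so the sum over all outcomes comes out as exactly 1 rather than approximately 1.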
Use cases
- Calculating the probability of drawing a specific combination of items from a finite population when sampling without replacement.
Related terms
- Hypergeometric distribution