Hypergeometric Distribution

A hypergeometric distribution models situations where you have a fixed number of items, and you are selecting a certain number of them without replacement. Each item has two possible categories, often referred to as successes and failures.

The key feature of a hypergeometric distribution is that the probability of success changes from one selection to the next. Because items are not replaced, the pool shrinks with every trial, and the number of successes left in it drops each time a success is picked.

  • Fixed Group: You always start with a fixed number of items
  • Two types: The items always fall into two categories, successes and failures
  • No replacement: After you pick an item, it’s not put back, so the group keeps getting smaller
  • Changing Odds: The chance of picking a certain type of item changes as you pick more items because you're not replacing them
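The changing odds can be illustrated with a small sketch. The pool of 20 items with 6 successes and the helper name `chance_of_success` are assumptions chosen for illustration only.

```python
from fractions import Fraction


def chance_of_success(successes_left: int, items_left: int) -> Fraction:
    """Probability that the next item drawn is a success."""
    return Fraction(successes_left, items_left)


# Hypothetical pool: 20 items, 6 of which count as successes.
first_draw = chance_of_success(6, 20)      # nothing removed yet
after_success = chance_of_success(5, 19)   # a success was removed first
after_failure = chance_of_success(6, 19)   # a failure was removed first

print(first_draw, after_success, after_failure)  # 3/10 5/19 6/19
```

Removing a success lowers the chance of the next success, while removing a failure raises it. With replacement, all three values would stay at 3/10.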

Probability in a Hypergeometric Distribution
\(P(x) = \cfrac{\dbinom{a}{x} \times \dbinom{n-a}{r-x}}{\dbinom{n}{r}}\)

Where:

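  • \(a\) represents the number of successes in the population
  • \(n\) represents the total number of items in the population
  • \(r\) represents the number of items selected (the sample size)
  • \(x\) represents the number of successes in the sample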
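As a minimal sketch, the formula can be translated directly into Python with `math.comb`. The function name `hypergeom_pmf` and the small population used to try it out (10 items, 4 successes, 3 draws) are assumptions for illustration; here \(r\) is the number of items selected and \(x\) is the number of successes among them.

```python
from math import comb


def hypergeom_pmf(x: int, a: int, n: int, r: int) -> float:
    """P(X = x): x successes in r draws from n items, a of which are successes."""
    # Outcomes that cannot happen have probability zero.
    if x < 0 or x > r or x > a or r - x > n - a:
        return 0.0
    return comb(a, x) * comb(n - a, r - x) / comb(n, r)


# Hypothetical population: 10 items, 4 successes, 3 items drawn.
probs = [hypergeom_pmf(x, a=4, n=10, r=3) for x in range(4)]
print(probs)       # probabilities of 0, 1, 2 and 3 successes
print(sum(probs))  # should be 1, up to floating-point rounding
```

Summing over every possible value of \(x\) is a quick sanity check that the function describes a valid probability distribution.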

Even though the trials depend on each other, the chance of success on any single draw, averaged over everything that could have been drawn before it, still equals the proportion of successes in the whole population, \( \cfrac{a}{n} \).


Expectation for a Hypergeometric Distribution

The expectation of a hypergeometric distribution is a finite sum: the number of successes in the sample can never exceed the sample size \(r\) or the number of successes \(a\) in the population, so \(E(X) = \sum x \cdot P(x)\) has only finitely many terms. This contrasts with a geometric distribution, where the number of trials until the first success can theoretically extend infinitely, so its expectation is the sum of an infinite series.

The formula below gives the average number of successes in a sample drawn from a finite population containing a certain number of successes.

\(E(X) = \cfrac{r \cdot a}{n}\)

Where:

  • \(r\) represents the number of items selected (the sample size)
  • \(a\) represents the number of successes in the population
  • \(n\) represents the total number of items in the population

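This formula holds even though sampling without replacement makes each selection affect the probability of future selections. The result matches the mean of a Binomial Distribution with \(r\) trials and success probability \( \cfrac{a}{n} \); the dependence between draws changes the spread of the distribution, not its average.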
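One way to check the formula is to compute the expectation from its definition, \(\sum x \cdot P(x)\), and compare it with \(\cfrac{r \cdot a}{n}\). The sketch below uses exact fractions to avoid rounding; the function name `expected_successes` and the sample values (10 items, 4 successes, 3 draws) are assumptions for illustration.

```python
from fractions import Fraction
from math import comb


def expected_successes(a: int, n: int, r: int) -> Fraction:
    """Sum x * P(x) over every feasible x, using exact fractions."""
    total = comb(n, r)
    lowest = max(0, r - (n - a))   # fewest successes the sample can contain
    highest = min(r, a)            # most successes the sample can contain
    terms = (
        Fraction(x * comb(a, x) * comb(n - a, r - x), total)
        for x in range(lowest, highest + 1)
    )
    return sum(terms, Fraction(0))


a, n, r = 4, 10, 3
print(expected_successes(a, n, r), Fraction(r * a, n))  # 6/5 6/5
```

Both values agree, and the sum has only a handful of terms because \(x\) is bounded by the sample size and by the number of successes available.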


A quality control inspector randomly selects 5 light bulbs from a batch of 20, where 6 bulbs are defective. What is the probability that exactly 2 of the selected bulbs are defective?

From the description above, we can identify the following values:

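  • \(a = 6\) (defective bulbs in the batch, counted here as the "successes")
  • \(n = 20\) (bulbs in the batch)
  • \(r = 5\) (bulbs selected)
  • \(k = 2\) (defective bulbs wanted in the selection)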

We can use the Hypergeometric Formula to determine the probability:

\(P(X = k) = \cfrac{\dbinom{a}{k} \dbinom{n-a}{r-k}}{\dbinom{n}{r}}\)

\(P(X = 2) = \cfrac{\dbinom{6}{2} \dbinom{14}{3}}{\dbinom{20}{5}} \)

First, we can determine the respective binomial coefficients:

\(\dbinom{6}{2} = \cfrac{6!}{2!(6-2)!} = \cfrac{6 \times 5}{2 \times 1} = 15\)

\(\dbinom{14}{3} = \cfrac{14!}{3!(14-3)!} = \cfrac{14 \times 13 \times 12}{3 \times 2 \times 1} = 364\)

\(\dbinom{20}{5} = \cfrac{20!}{5!(20-5)!} = \cfrac{20 \times 19 \times 18 \times 17 \times 16}{5 \times 4 \times 3 \times 2 \times 1} = 15504\)

Next, we can plug the values into the Hypergeometric formula and solve for the probability:

\(P(X = 2) = \cfrac{\dbinom{6}{2} \dbinom{14}{3}}{\dbinom{20}{5}}\)

\(P(X = 2) = \cfrac{15 \times 364}{15504}\)

\(P(X = 2) = \cfrac{5460}{15504} \approx 0.352\)

Therefore, the probability of selecting exactly 2 defective bulbs is approximately 0.352 (or 35.2%).
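The arithmetic in this example can be double-checked with a few lines of Python. This is only a sketch of the same calculation; the variable names are chosen for illustration.

```python
from math import comb

a, n, r, k = 6, 20, 5, 2

ways_defective = comb(a, k)         # ways to choose 2 of the 6 defective bulbs
ways_good = comb(n - a, r - k)      # ways to choose 3 of the 14 good bulbs
ways_total = comb(n, r)             # ways to choose any 5 of the 20 bulbs

probability = ways_defective * ways_good / ways_total

print(ways_defective, ways_good, ways_total)  # 15 364 15504
print(round(probability, 3))                  # 0.352
```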


A deck of 52 playing cards contains 13 hearts. If you randomly draw 10 cards without replacement, what is the expected number of hearts in your selection?

We can use the formula for the Expectation of a Hypergeometric Distribution to solve this problem:

\(E(X) = \cfrac{r \cdot a}{n}\)

From the description above, we can identify the following values:

  • \(r = 10\)
  • \(a = 13\)
  • \(n = 52\)

Next, we can plug the corresponding values into the formula to solve:

\(E(X) = \cfrac{10 \times 13}{52} \)

\(E(X) = \cfrac{130}{52}\)

\(E(X) = 2.5\)

Therefore, the expected number of hearts in the sample is 2.5. No single draw can contain 2.5 hearts; the value is the long-run average over many repeated draws of 10 cards.
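As a rough check, the same situation can be simulated by repeatedly drawing 10 cards and averaging the number of hearts. The function name `simulate_average_hearts`, the number of trials, and the seed are all assumptions for illustration; the simulated average should land close to 2.5 but will not match it exactly.

```python
import random


def simulate_average_hearts(trials: int = 20_000, seed: int = 0) -> float:
    """Average number of hearts in 10 cards drawn without replacement."""
    rng = random.Random(seed)
    deck = [1] * 13 + [0] * 39   # 1 marks a heart, 0 marks any other card
    total_hearts = 0
    for _ in range(trials):
        # random.sample draws without replacement, matching the problem.
        total_hearts += sum(rng.sample(deck, 10))
    return total_hearts / trials


average = simulate_average_hearts()
print(average)  # expected to be near 10 * 13 / 52 = 2.5
```

Increasing the number of trials should bring the simulated average closer to the value given by the formula.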