Chain rule of probability

Sep 19, 2020

In the last article, we discussed conditional probability. The formula for computing it is:

P(A | B) = P(A ∩ B) / P(B)

We can rearrange the terms in this formula:

P(A ∩ B) = P(A | B) × P(B)

Similarly, starting from the formula for P(B | A), we get:

P(B | A) = P(A ∩ B) / P(A), so P(A ∩ B) = P(B | A) × P(A)

Let’s take an example to understand the above formula. Say the sample space Ω is the population of a country, set A is the set of people who are infected with COVID-19, and set B is the set of people whose test results have come out positive.

In such scenarios, 4 events are of interest to us:

The first one is A ∩ B: the people who are infected with COVID-19 and whose test results have come out positive. This is the region where A and B overlap, as highlighted in the image below:

Then we have the people who are not infected but whose test results have come out positive, say because of some error in the test. This is the part of B that overlaps with A complement (Aᶜ ∩ B), highlighted in the image below:

The next one is the set of people who are actually infected with COVID-19 but whose test results have not come out positive because of test error (A ∩ Bᶜ). The highlighted region in the image below represents this population:

Lastly, we have the set of people who are neither infected nor tested positive (Aᶜ ∩ Bᶜ), represented by the highlighted part of the image below:
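To make the four regions concrete, here is a minimal sketch using Python sets. The population of 10 people and the members of A and B are made up purely for illustration; they are not data from this article.

```python
# Hypothetical toy population of 10 people, identified by number.
omega = set(range(1, 11))
A = {1, 2, 3}   # infected (made-up members)
B = {2, 3, 4}   # tested positive (made-up members)

A_c = omega - A  # complement of A: not infected
B_c = omega - B  # complement of B: not tested positive

regions = {
    "A and B (infected, positive)": A & B,
    "not A and B (not infected, positive)": A_c & B,
    "A and not B (infected, not positive)": A & B_c,
    "not A and not B (not infected, not positive)": A_c & B_c,
}

for name, members in regions.items():
    print(name, sorted(members))

# The four regions do not overlap and together cover the whole sample space.
assert sum(len(r) for r in regions.values()) == len(omega)
assert set().union(*regions.values()) == omega
```

Every person falls in exactly one of the four regions, which is why their probabilities must add up to 1.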

Say the following data is given to us:

P(A) = 0.1

This means that the probability of event A, that a given person is infected, is 0.1.

The next given statement is:

P(Bᶜ | A) = 0.01

This means that if the person is infected (event A has occurred), the probability of event B not happening is 0.01. By the axioms of probability, P(B | A) and P(Bᶜ | A) must add up to 1, so we can compute P(B | A) = 1 − 0.01 = 0.99.

The next statement is:

A healthy person belongs to the set A complement (Aᶜ). Again using the axioms of probability, we can compute the other conditional probability, since P(B | Aᶜ) + P(Bᶜ | Aᶜ) = 1.

So, given a person, there are two possibilities: either the person is infected or the person is not infected.

Given that a person is infected, there are again two possibilities: the test result comes out positive or the test result comes out negative.

Similarly, there are two possibilities if the person is not infected

We have P(A) and P(B | A), and as per the formula discussed at the beginning of this article (and mentioned in the image below as well), we can compute the probability of A intersection B:

P(A ∩ B) = P(A) × P(B | A) = 0.1 × 0.99 = 0.099

Similarly, the image below shows the labels for each of the four paths. Each label denotes the probability that, if known, can be used to compute the probability of the event at the end of that path:

So, if we take the product of the probabilities along each path, we get the probabilities we are interested in.

So, we can compute the desired probability (the probability of the event at a leaf in the above image) using the multiplication rule, also called the chain rule of probability.

We can think of this as a chain of decisions: if we know the probability of every element along a path, we can compute the probability of the leaf node (the node at the end of that path).
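The path products can be sketched in a few lines of Python. P(A) = 0.1 and P(Bᶜ | A) = 0.01 are the values from this example. The value used for P(B | Aᶜ) is a hypothetical placeholder chosen only so the code runs; it is not taken from the article.

```python
import math

p_A = 0.1               # P(A): the person is infected (from the example)
p_notB_given_A = 0.01   # P(B^c | A): infected, test not positive (from the example)
p_B_given_notA = 0.05   # P(B | A^c): HYPOTHETICAL value, for illustration only

# Complement rule: probabilities of the two branches at each node add up to 1.
p_notA = 1 - p_A
p_B_given_A = 1 - p_notB_given_A
p_notB_given_notA = 1 - p_B_given_notA

# Multiply the probabilities along each of the four paths of the tree.
leaves = {
    "A and B": p_A * p_B_given_A,
    "A and not B": p_A * p_notB_given_A,
    "not A and B": p_notA * p_B_given_notA,
    "not A and not B": p_notA * p_notB_given_notA,
}

for event, p in leaves.items():
    print(f"P({event}) = {p:.4f}")

# The four leaves cover every possible outcome, so they must sum to 1.
assert math.isclose(sum(leaves.values()), 1.0)
```

Whatever value is used for P(B | Aᶜ), the four leaf probabilities always sum to 1, which is a handy sanity check on the tree.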

Let’s see this formula for the intersection of 3 events. Writing X = A ∩ B:

P(A ∩ B ∩ C) = P(X ∩ C) = P(X) × P(C | X) = P(A ∩ B) × P(C | A ∩ B) = P(A) × P(B | A) × P(C | A ∩ B)

Once we replace A intersection B by X, the probability of A intersection B intersection C has two terms, P(X) and P(C | X). Expanding P(X) = P(A ∩ B) in turn adds one more term to the final formula, so the chain keeps growing as we add more events.

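If we extend this to four events A, B, C, D and call them A₁, A₂, A₃, and A₄, we can write the formula as:

P(A₁ ∩ A₂ ∩ A₃ ∩ A₄) = P(A₁) × P(A₂ | A₁) × P(A₃ | A₁ ∩ A₂) × P(A₄ | A₁ ∩ A₂ ∩ A₃)

And we can write it in a more generic manner for ‘n’ events:

P(A₁ ∩ A₂ ∩ … ∩ Aₙ) = P(A₁) × P(A₂ | A₁) × P(A₃ | A₁ ∩ A₂) × … × P(Aₙ | A₁ ∩ A₂ ∩ … ∩ Aₙ₋₁)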
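In code, the general chain rule is just a running product. The sketch below assumes the caller already knows each conditional probability in order; the function name `chain_rule` and the sample numbers are made up for illustration.

```python
from fractions import Fraction
from math import prod


def chain_rule(conditionals):
    """Return P(A1 ∩ ... ∩ An) given the ordered list
    [P(A1), P(A2 | A1), ..., P(An | A1 ∩ ... ∩ An-1)]."""
    for p in conditionals:
        if not 0 <= p <= 1:
            raise ValueError(f"not a valid probability: {p}")
    return prod(conditionals)


# Hypothetical conditional probabilities for three events.
terms = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)]
print(chain_rule(terms))  # 1/24
```

Using `Fraction` keeps the result exact, which makes it easier to compare against hand calculations than floating-point numbers would.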

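Let’s take an example involving the chain rule: 3 cards are drawn from a deck of 52 cards without replacement, and we want the probability that all 3 cards are aces.

We know how to do this using the counting principle. 3 cards are being chosen from a deck of 52 cards, so there are a total of ⁵²C₃ ways of doing that. We are interested only in those outcomes where all 3 cards are aces, so there are ⁴C₃ favorable outcomes, and the probability is ⁴C₃ / ⁵²C₃ = 4 / 22100 = 1/5525.

We can solve this task using the chain rule as well:

P(A₁ ∩ A₂ ∩ A₃) = P(A₁) × P(A₂ | A₁) × P(A₃ | A₁ ∩ A₂)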

A₁ is the event that the first card is an ace. Out of 52 cards only 4 are favorable to us, so P(A₁) = 4/52.

Once the first card has been drawn and we know that it is an ace, 51 cards are left in the deck, of which 3 are aces, so P(A₂ | A₁) = 3/51.

Once the second card has also been drawn, we are left with 50 cards in the deck, of which 2 are aces, so the probability that the third card is an ace given that the first two were aces is P(A₃ | A₁ ∩ A₂) = 2/50.

Multiplying the three terms gives (4/52) × (3/51) × (2/50) = 1/5525, which matches the answer from the counting principle.
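As a quick check, the two approaches to the three-aces problem can be compared in Python. This is only a sketch; the variable names are chosen for illustration, and exact fractions are used to avoid rounding error.

```python
from fractions import Fraction
from math import comb

# Counting principle: favorable outcomes / total outcomes.
counting = Fraction(comb(4, 3), comb(52, 3))

# Chain rule: P(A1) * P(A2 | A1) * P(A3 | A1 ∩ A2).
chain = Fraction(4, 52) * Fraction(3, 51) * Fraction(2, 50)

print(counting)  # 1/5525
print(chain)     # 1/5525
assert counting == chain
```

Both routes give the same exact value, roughly 0.00018.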

References: PadhAI

Written by Parveen Khurana