In the last article, we discussed the binomial distribution, where we are interested in the probability of ‘k’ successes in ‘n’ trials.

With the binomial distribution, we talked about tossing a coin ‘n’ times. With the geometric distribution, we do not fix the number of tosses in advance: we simply keep tossing the coin, and we are interested in the number of tosses until we see the first heads.

In the binomial case, the random variable could take on values from 0 to ‘n’. In the geometric case, the random variable can take on any value from 1 to infinity, since we may get heads on the very first toss, or we may not get heads even after tossing the coin 1000 times or, for that matter, any larger number of times.

We are interested in assigning probabilities to all the values that the random variable can take, which gives us the probability distribution of this random variable, and we want this distribution function to be expressed in terms of a few parameters.

We can think of this as repeating a Bernoulli trial indefinitely, where we are interested in the number of trials it takes to get the first success.
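
The idea of repeating a Bernoulli trial until the first success can be sketched as a small simulation. This is only an illustration: the function name, the seed, and the value p = 0.2 are assumptions made for the sketch, and ‘p’ must be greater than 0 for the loop to end.

```python
import random

def trials_until_first_success(p, rng):
    """Repeat independent Bernoulli(p) trials and return the
    trial number on which the first success occurs (1, 2, 3, ...)."""
    k = 1
    while rng.random() >= p:  # this branch is a failure, probability (1 - p)
        k += 1
    return k

rng = random.Random(0)  # fixed seed so the run is repeatable
p = 0.2                 # assumed success probability, for illustration only
samples = [trials_until_first_success(p, rng) for _ in range(100_000)]

# Share of runs in which the very first trial was already a success;
# this should come out close to p.
share_first = sum(1 for k in samples if k == 1) / len(samples)
print(min(samples), max(samples), round(share_first, 3))
```

The smallest value seen is never below 1, while the largest value depends on the run, which matches the idea that the random variable has no fixed upper limit.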

Before deriving the distribution, let’s look at some examples where it is useful:

Consider a hawker selling belts outside a subway station. Over the days and months that he sits there, an effectively unlimited stream of people walks past him, and he would like to know when he is going to encounter the first person who buys a belt.

And he would want the probability to be very high for a small value of ‘k’, so that even among the first 3–4 people who pass by his shop, he is likely to sell something.

Another example could be: say a salesman is handing out pamphlets to passersby, and he is interested in knowing the probability that the kᵗʰ person will be the first one to actually read the pamphlet. This gives him an idea of how many pamphlets he should expect to hand out before one gets read.

So, this distribution is useful in any situation that involves waiting times: we keep repeating a trial and we want to know how long it will take to get the first success (assuming that success will come eventually). This happens in many situations, especially in sales.

These are all independent trials: every customer who passes by acts independently and does not care what the earlier customers did (purchased or not).

The trials are also identically distributed, which means every person passing by the shop has the same probability of buying the product.

Let’s take some examples and then derive the general formula for geometric distribution:

Say k = 5, which means the first four trials resulted in failure and the fifth trial is the one in which we got the first success. What happens after the fifth trial does not matter, as the geometric distribution is only concerned with the number of trials up to and including the first success.

We rely on the property that all these trials are independent. The first failure occurs with a probability of (1-p), the second failure with a probability of (1-p), the same for the 3rd and 4th failures, and the success on the fifth trial occurs with a probability of ‘p’. Multiplying these together, the probability of the first success on the fifth trial is (1-p)(1-p)(1-p)(1-p)(p) = (1-p)⁴(p).

In general, writing X for the random variable, we would have the following formula:

P(X = k) = (1-p)ᵏ⁻¹(p), for k = 1, 2, 3, …

This distribution can be fully specified with just one parameter, ‘p’. Once we have the value of ‘p’, we can compute the probability for any value of ‘k’.
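
Since ‘p’ is the only parameter, the whole distribution fits in a one-line function. The sketch below is a plain-Python illustration; the function name `geometric_pmf` is an assumption of this sketch, not something from the original article.

```python
def geometric_pmf(k, p):
    """Probability that the first success occurs on trial k,
    i.e. (k - 1) failures followed by one success."""
    if k < 1:
        return 0.0  # the random variable starts at 1
    return (1 - p) ** (k - 1) * p

p = 0.2  # assumed value, for illustration only
first_success_on_fifth = geometric_pmf(5, p)  # four failures, then a success
print(first_success_on_fifth)
```

For p = 0.2 this is (0.8)⁴(0.2), which is roughly 0.082.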

Let’s take the case where ‘p’ equals 0.2. In the plot, we show the probabilities for the first 25 values of ‘k’, although the random variable in a geometric distribution can take on infinitely many values.

We leverage the ‘geom’ distribution object from the ‘scipy.stats’ module to compute these probabilities.
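
A minimal sketch of how these probabilities could be computed and plotted is shown here. It assumes NumPy, SciPy and (for the plot only) matplotlib are installed; the helper `plot_pmf` and the axis labels are choices made for this sketch.

```python
import numpy as np
from scipy.stats import geom

p = 0.2
k = np.arange(1, 26)   # the first 25 values of the random variable
pmf = geom.pmf(k, p)   # scipy's geom is defined on k = 1, 2, 3, ...

def plot_pmf(k, pmf, p):
    """Draw the probabilities as a bar chart (requires matplotlib)."""
    import matplotlib.pyplot as plt  # imported here so the rest runs without it
    fig, ax = plt.subplots()
    ax.bar(k, pmf)
    ax.set_xlabel("k (trial of the first success)")
    ax.set_ylabel("P(X = k)")
    ax.set_title(f"Geometric distribution, p = {p}")
    return fig

# Print the first few probabilities as a quick check
for ki, prob in zip(k[:3], pmf[:3]):
    print(ki, round(float(prob), 4))
```

Calling `plot_pmf(k, pmf, p)` returns a matplotlib figure, which can then be shown or saved.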

And we have the function’s equation as the following:

P(X = k) = (1-p)ᵏ⁻¹(p)

If ‘k’ equals 1, the right-hand side is just ‘p’, and we see that the bar corresponding to ‘k’ = 1 points to a probability value of 0.2.

Similarly, for ‘k’ = 2, the right-hand side of the equation is (1-p)(p), and putting in the value of ‘p’ we get (0.8)(0.2) = 0.16.

The probability then keeps decreasing as ‘k’ increases, and tends to 0 as ‘k’ tends towards infinity.

With a non-zero probability of success, the case where the first success occurs only after a very large number of trials has a very low probability. In other words, we are practically certain to encounter a success at some point, and we will not have to go all the way up to infinity.

For example, we see that the probability value is very low for ‘k’ = 12. This case means that the first 11 trials all resulted in failure and the 12th trial was the first success, which is unlikely even with a fairly low probability of success. That is why the probability value keeps decreasing as the value of ‘k’ increases.

In the binomial distribution, if the probability of success is very low, the tall bars (high probability values) are towards the left, and the same holds for the geometric distribution.

For a high probability of success, the binomial distribution had all the tall bars towards the right. Let’s see the plot for a high probability of success for the geometric distribution:

For the geometric distribution, the tallest bars are still towards the left, and we can reason this out using the equation for the geometric distribution:

P(X = k) = (1-p)ᵏ⁻¹(p)

For ‘k’ = 1, the value is (1-p)⁰(p), which is the same as ‘p’.

For every other value of ‘k’, the formula multiplies ‘p’ by (1-p) raised to the power (k-1), and (1-p) is less than 1. As we increase the value of ‘k’, the power (k-1) increases and the resulting value shrinks (a quantity less than 1 raised to higher and higher powers only gets smaller).

Geometric distribution always has this type of shape where the tallest bars are at the left irrespective of what the value of ‘p’ is.
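
The claim that the tallest bar is always the first one can be checked numerically. The sketch below uses three illustrative values of ‘p’ (chosen for this example, not taken from the plots) and verifies that the probabilities strictly decrease over the first 10 values of ‘k’.

```python
def geometric_pmf(k, p):
    """P(X = k) = (1 - p)^(k - 1) * p for k = 1, 2, 3, ..."""
    return (1 - p) ** (k - 1) * p

results = {}
for p in (0.2, 0.5, 0.9):  # low, medium and high success probability
    probs = [geometric_pmf(k, p) for k in range(1, 11)]
    tallest_k = probs.index(max(probs)) + 1            # position of the tallest bar
    decreasing = all(a > b for a, b in zip(probs, probs[1:]))
    results[p] = (tallest_k, decreasing)
    print(p, tallest_k, decreasing)
```

Each consecutive probability is the previous one multiplied by (1-p), so a larger ‘p’ only makes the bars fall off faster; it never moves the peak away from ‘k’ = 1.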

And here is the plot for ‘p’ = 0.5; once again the trend remains the same, with the tallest bars towards the left.

Is geometric distribution a valid distribution?

To show that the geometric distribution is a valid distribution, the first thing we need to show is that the probability of any value that the random variable can take is greater than or equal to 0:

P(X = k) ≥ 0, for every k = 1, 2, 3, …

And we can prove this by considering the function's equation:

P(X = k) = (1-p)ᵏ⁻¹(p)

The terms ‘p’ and (1-p) in the formula are both probabilities, so each of them is always greater than or equal to 0. That means (1-p) raised to any power is also greater than or equal to 0, and the product of two non-negative numbers is non-negative. So we can be sure that the probability of any value that the random variable can take is never negative.

The second property we need to show is that the sum of the probabilities of all the values that the random variable can take is always 1:

P(X = 1) + P(X = 2) + P(X = 3) + … = 1

Let’s just expand this sum and write down all the terms:

p + (1-p)(p) + (1-p)²(p) + (1-p)³(p) + …

Now this is the same as the sum of an infinite geometric progression a + ar + ar² + ar³ + …, where the first term is a = p and the common ratio is r = (1-p).

The sum of such a series is given by ‘a / (1-r)’, where ‘a’ is the first term and ‘r’ is the common ratio, provided the absolute value of ‘r’ is less than 1.

Using the same formula with a = p and r = (1-p), we have the sum as 1:

p / (1 - (1-p)) = p / p = 1
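
As a numerical sanity check, the partial sums of the probabilities can be seen to approach 1. The sketch below compares the running total of the first ‘n’ terms with the closed form for a finite geometric progression, 1 - (1-p)ⁿ; the value p = 0.2 and the function name are assumptions made for the illustration.

```python
def partial_sum(n, p):
    """Sum of P(X = k) for k = 1, ..., n."""
    return sum((1 - p) ** (k - 1) * p for k in range(1, n + 1))

p = 0.2  # assumed value, for illustration only
for n in (5, 25, 100):
    # The second number is the closed form of the same finite sum
    print(n, partial_sum(n, p), 1 - (1 - p) ** n)
```

The two columns agree up to floating-point error, and both get closer to 1 as ‘n’ grows.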

Let’s take an example:

This is a rare blood group we are talking about. Say the doctor/hospital has been given a list of volunteers for donating blood, and the administrator is going through the list to see whether each volunteer has the matching blood group.

Let’s try to solve it through the plot:

Now that we have the plot, we can read off the probability for ‘k’ = 7, which turns out to be ~0.05.

The second part is interesting: here we are looking for the chance that at least one of the first 10 volunteers in the list has the matching blood group. We can tackle this using the subtraction principle and first look for the probability that none of the first 10 volunteers is a success, meaning all of the first 10 are failures (they do not have the matching blood type). We know the probability of a single failure, so this probability is:

(1-p)¹⁰

And since this is the probability of the complement of the event, we get the probability of the asked case as (1 - probability of the complement), that is, 1 - (1-p)¹⁰.
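
The complement calculation can be sketched in a few lines. Note that p = 0.1 below is an assumed value used only for illustration (it is consistent with the ~0.05 read off the plot for ‘k’ = 7, but it is not stated in the text), and the function name is likewise a choice made for this sketch.

```python
def prob_success_within(n, p):
    """Probability of at least one success in the first n trials,
    computed as 1 minus the probability of n failures in a row."""
    return 1 - (1 - p) ** n

p = 0.1  # ASSUMED probability of a matching blood group, for illustration
at_least_one_in_10 = prob_success_within(10, p)

# Cross-check: add up P(X = k) for k = 1..10 directly
direct = sum((1 - p) ** (k - 1) * p for k in range(1, 11))

print(round(at_least_one_in_10, 4), round(direct, 4))
```

Both routes give the same number, but the complement needs a single power instead of ten separate terms.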

References: PadhAI
