# Continuous Random Variable

In the last few articles, we discussed the probability mass function (PMF) and its properties. Below is the PMF plot for a **uniform distribution**, that is, a distribution whose outcomes are all equally likely. Rolling a fair die is one example of a **uniform distribution**.

Then we have the **cumulative distribution function** (CDF), **which gives the probability that the random variable takes a value less than or equal to a given value**. Take a single die with equally likely outcomes. The probability of a value less than or equal to 1 (the outcome is 1) is **1/6**, about 0.167. The probability of a value less than or equal to 2 covers the outcomes 1 and 2; these are disjoint events, so we can add their probabilities: **0.167 + 0.167**, about 0.333. The same reasoning applies to the other values.

Similarly, the probability of the random variable taking a value less than or equal to 6 is 1, because a value less than or equal to 6 covers the entire sample space in this case.
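The running sum described above can be sketched in a few lines of Python. This is only an illustration; the names `pmf` and `cdf` are made up for this sketch, and exact fractions are used so that 1/6 is not rounded.

```python
from fractions import Fraction

# PMF of a fair six-sided die: every face has probability 1/6.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def cdf(x):
    """P(X <= x): add up the PMF over every face that does not exceed x."""
    return sum((p for face, p in pmf.items() if face <= x), Fraction(0))

# Print each face with its cumulative probability, exact and rounded.
for face in range(1, 7):
    print(face, cdf(face), round(float(cdf(face)), 3))
```

The last line printed corresponds to the value 6, where the cumulative probability covers the whole sample space.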

This is what a cumulative distribution function always looks like: it has the **values of the random variable on the x-axis and the corresponding cumulative probabilities on the y-axis**. As we move from left to right along the x-axis, we cover more and more of the values the random variable can take, so the curve never decreases, and at the right end it reaches the probability of the entire sample space, which is 1.

Now suppose we are asked for the probability that the random variable takes a value between 2 and 4 (both inclusive). This event consists of the outcomes {2, 3, 4}, and since these are disjoint events we can sum their probabilities.

We can also read this value off the cumulative distribution function. The probability of a value less than or equal to 4 is 4 × 0.167 ≈ 0.667, and the probability of a value less than or equal to 1 is 0.167, so the probability of a value between 2 and 4 is the difference of the two, which is 0.5.

This is how we can compute the probability of the random variable taking a value in a range using the cumulative distribution function.
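As a small check of the two routes above (summing the PMF directly versus subtracting CDF values), here is a sketch for a fair die. The helper names `die_cdf` and `prob_between` are hypothetical, chosen only for this example.

```python
import math
from fractions import Fraction

def die_cdf(x, sides=6):
    """P(X <= x) for a fair die with faces 1..sides."""
    faces_at_or_below = max(0, min(math.floor(x), sides))
    return Fraction(faces_at_or_below, sides)

def prob_between(low, high):
    """P(low <= X <= high) for integer low and high: F(high) - F(low - 1)."""
    return die_cdf(high) - die_cdf(low - 1)

# Route 1: add the probabilities of the disjoint outcomes 2, 3 and 4.
direct_sum = sum(Fraction(1, 6) for face in (2, 3, 4))

# Route 2: subtract the CDF at 1 from the CDF at 4.
print(direct_sum, prob_between(2, 4))
```

Note that the lower end uses the CDF at 1, not at 2, because the value 2 itself belongs to the range.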

Let’s take the example of two dice:

Below is the probability mass function where the random variable maps each outcome to the sum of the numbers on the two dice.

From this plot alone, we can draw the plot for the cumulative distribution function.

The probability that the random variable takes a value less than or equal to 4 (the sum of the numbers on the two dice is at most 4) corresponds to the values {2, 3, 4}. Summing their probabilities from the probability mass function, 1/36 + 2/36 + 3/36 = 6/36, gives the required value for the cumulative distribution plot.

As we move to the right, the cumulative value keeps increasing, and the probability of the random variable taking a value less than or equal to 12 is 1.
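The two-dice PMF and its CDF can be built by enumerating all 36 equally likely outcomes. The sketch below is one way to do that; the variable names (`counts`, `pmf_two`, `cdf_two`) are just illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import accumulate, product

# Count how many of the 36 equally likely (die1, die2) pairs give each sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

sums = sorted(counts)                               # 2, 3, ..., 12
pmf_two = [Fraction(counts[s], 36) for s in sums]   # mass on each sum
cdf_two = list(accumulate(pmf_two))                 # running total of the PMF

# Print each sum with its PMF value and cumulative probability.
for s, p, c in zip(sums, pmf_two, cdf_two):
    print(s, p, c)
```

Each CDF entry is the previous entry plus the PMF value at the current sum, which is why the CDF can be drawn from the PMF plot alone.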

When we talk of the **probability mass function**, we have this **idea that the total probability has unit mass**, and the probability mass function tells us how this unit mass is divided among all the values that the random variable can take.

For example, in the above image, the value 7 takes 6/36, i.e. 1/6, of the total mass. Similarly, we could compute the mass for all other values.

What if the random variable can take infinitely many values? How is the total probability distributed among all the values in such cases?

Let’s take an example to understand it better:

Say we are working with rainfall data and we are interested in the probability that the rainfall is exactly 2 cm. Setting the probability aside for a moment, receiving exactly 2 cm of rain (not a single molecule of water more or less) is practically impossible.

Questions about an exact value (such as exactly 2 cm of rainfall) make sense for a discrete random variable; for example, we can ask for the probability that the sum of two dice is exactly 7. For a continuous random variable such questions are not useful, because infinitely many values are possible (even a small range such as 2 to 2.1 contains infinitely many values), so the share of the probability that goes to any particular value is 0. This matches the intuition that exactly 2 cm of rainfall is practically impossible.

Another way to understand this: when there are **n** equally likely outcomes, the probability of each outcome is **1/n**. **For a continuous random variable, n tends to infinity**, and therefore the probability of any particular value shrinks to 0.

For a continuous random variable, the probability of the random variable taking any single value is 0.

The key point here is that we cannot carry over the idea of dividing the unit probability mass among individual values, because the share of any single value is 0.
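To get a feel for the 1/n argument, the following sketch prints the probability of one outcome as the number of equally likely outcomes grows. It only illustrates the intuition and is not a formal proof; the function name `single_outcome_probability` is made up for this example.

```python
from fractions import Fraction

def single_outcome_probability(n):
    """Probability of one particular outcome among n equally likely outcomes."""
    return Fraction(1, n)

# As n grows, the share of probability for any one outcome shrinks toward 0.
for n in (6, 100, 10_000, 1_000_000):
    print(n, float(single_outcome_probability(n)))
```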

Let’s say we have water intake data for the last 512 days.

Now the following question does not make sense in this case

Instead, it makes sense to ask a question of the following form:

**It makes sense to ask for the probability of an interval**: what is the probability that the random variable takes a value in a particular interval? Even though every individual point in the interval has probability 0, the interval as a whole can still have a nonzero probability.

We can ask the same question for all other intervals.

Here we have drawn intervals of a fixed length, and we want to know how many dots (each dot represents the water consumed on a given day) fall in each of these intervals: how many dots are between 1.9 and 2.0 liters, how many are between 2.0 and 2.1 liters, and so on.

This is what the data plot looks like, with the 512 data points placed in their respective intervals based on their values:

We can see that some intervals are very sparse (for example, 1 to 1.1 liters), whereas some regions are very dense (for example, there are a lot of values in all the intervals between 2 and 3 liters).
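Counting the points that fall in each fixed-length interval can be sketched as follows. The original 512 measurements are not available here, so the list `intake` is a randomly generated stand-in (the mean of 2.25 liters and spread of 0.6 liters are assumptions), and the 0.1-liter width is taken from the example intervals above.

```python
import math
import random
from collections import Counter

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical stand-in for the 512 daily water-intake values, in liters.
intake = [max(0.0, rng.gauss(2.25, 0.6)) for _ in range(512)]

WIDTH = 0.1  # interval length in liters

def count_per_interval(values, width):
    """Map interval index k, covering [k*width, (k+1)*width), to its point count."""
    return Counter(math.floor(v / width) for v in values)

interval_counts = count_per_interval(intake, WIDTH)

# Print the number of days that fall in each non-empty interval.
for k in sorted(interval_counts):
    print(f"{k * WIDTH:.1f} to {(k + 1) * WIDTH:.1f} L: {interval_counts[k]} days")
```

Intervals with small counts are the sparse regions and intervals with large counts are the dense ones.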

For any given point on the **x-axis** of this plot, we can talk about the density around that point. For example, around the value 1 liter the density is fairly low (for now, think of density as how crowded the corresponding bar is).

Checking and plotting the density at all the other points in the same way, we get the plot below:

The shape of the plot in the above image gives an idea of the density in the interval around each value.

We can do the same thing with very narrow intervals (think of the interval size shrinking toward 0) and plot the density for each of these intervals.

The **x-axis** represents the values that the random variable can take, and the **y-axis** gives the density at each point. For example, the red circled point in the image below tells us that the density is highest around 2.25 liters (if we look around 2.25 liters, that is where we find the most values).

**The y-axis value given by the first plot in the above image is** not the probability of the random variable taking the corresponding **x** value; **it is the density around that point on the x-axis**. For example, at x = 2.25 the y-axis value is 0.4, which means the density around the point 2.25 is 0.4.

So, for a continuous random variable, it makes sense to ask for the probability of an interval, and from the given continuous data we can estimate a **probability density function**, which tells us the **density around each value that the random variable can take**.
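One common way to turn interval counts into a density is to divide each interval's share of the points by the interval width, so that the areas of the bars add up to 1. The sketch below assumes that histogram-style estimate (the article's plots may have been produced differently) and uses a tiny made-up sample so the arithmetic is easy to follow.

```python
import math
from collections import Counter

def density_per_interval(values, width):
    """Histogram density: (share of points in the interval) / (interval width)."""
    counts = Counter(math.floor(v / width) for v in values)
    n = len(values)
    return {k: c / (n * width) for k, c in counts.items()}

# Made-up sample in liters, not the article's data.
sample = [1.05, 2.05, 2.15, 2.25, 2.25, 2.35, 2.45, 2.95]
width = 0.5
density = density_per_interval(sample, width)

# Density is the bar height; probability of the interval is height times width.
for k in sorted(density):
    low, high = k * width, (k + 1) * width
    print(f"[{low:.1f}, {high:.1f}): density={density[k]:.2f}, "
          f"probability={density[k] * width:.3f}")

# The bar areas (density times width) together cover all the probability.
total_area = sum(d * width for d in density.values())
print(total_area)
```

In this sample the density in the interval from 2.0 to 2.5 liters is larger than 1, which is fine for a density: it is the area over the interval, not the height, that is a probability.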

References: PadhAI