Definition 1.
The (probability) mass function (p.m.f.) of a discrete random variable $X$ is the function $f : \mathbb{R} \to [0, 1]$ given by $f(x) = \mathbb{P}(X = x)$.
Notice that $F$ is not continuous: it has a jump of size $f(x)$ at each point $x$ where $f(x) > 0$. The distribution and mass functions are related by
$$F(x) = \sum_{i : x_i \le x} f(x_i), \qquad f(x) = F(x) - \lim_{y \uparrow x} F(y).$$
Lemma.
The probability mass function $f : \mathbb{R} \to [0, 1]$ satisfies:
the set of $x$ such that $f(x) \neq 0$ is countable,
$\sum_i f(x_i) = 1$, where $x_1, x_2, \dots$ are the values of $x$ such that $f(x) \neq 0$.
Example. Binomial distribution. A coin is tossed $n$ times, and a head turns up each time with probability $p$. Then $\Omega = \{H, T\}^n$. The total number $X$ of heads takes values in the set $\{0, 1, \dots, n\}$ and is a discrete random variable. Its probability mass function $f(x) = \mathbb{P}(X = x)$ satisfies
$$f(x) = \binom{n}{x} p^x (1 - p)^{n - x}, \qquad x = 0, 1, \dots, n.$$
The random variable $X$ is said to have the binomial distribution with parameters $n$ and $p$, written $\mathrm{bin}(n, p)$. It is the sum $X = Y_1 + Y_2 + \cdots + Y_n$ of $n$ independent Bernoulli variables.
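As a quick numerical sanity check (a minimal Python sketch; the values $n = 10$, $p = 0.3$ are made up for illustration), the p.m.f. above sums to 1 over $\{0, 1, \dots, n\}$:

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ bin(n, p): choose which x of the n tosses are heads."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.3
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]
print(sum(pmf))  # ~1.0, as required of a mass function
```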
Example. Poisson distribution. If a random variable $X$ takes values in the set $\{0, 1, 2, \dots\}$ with mass function
$$f(k) = \frac{\lambda^k}{k!} e^{-\lambda}, \qquad k = 0, 1, 2, \dots,$$
where $\lambda > 0$, then $X$ is said to have the Poisson distribution with parameter $\lambda$. Figure 1 shows how the p.m.f. varies with $\lambda$.
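The Poisson distribution arises as the limit of $\mathrm{bin}(n, \lambda/n)$ as $n \to \infty$ with $\lambda$ fixed, a standard fact worth seeing numerically. The sketch below (with arbitrary illustrative values) compares the two mass functions:

```python
from math import comb, exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return lam**k / factorial(k) * exp(-lam)

# bin(n, lam/n) approaches Poisson(lam) as n grows
lam, n = 4.0, 10_000
for k in range(5):
    binom = comb(n, k) * (lam / n)**k * (1 - lam / n)**(n - k)
    print(k, round(binom, 6), round(poisson_pmf(k, lam), 6))
```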
Recall that events $A$ and $B$ are called independent if and only if $\mathbb{P}(A \cap B) = \mathbb{P}(A)\mathbb{P}(B)$.
Definition 2.
Discrete variables $X$ and $Y$ are independent if the events $\{X = x\}$ and $\{Y = y\}$ are independent for all $x$ and $y$.
Let $A_x = \{X = x\}$ and $B_y = \{Y = y\}$. Then we can write $\mathbb{P}(A_x) = f_X(x)$ and $\mathbb{P}(B_y) = f_Y(y)$. Therefore, $X$ and $Y$ are independent if and only if $f_{X,Y}(x, y) = f_X(x)\, f_Y(y)$ for all $x$ and $y$.
Remark.
The equality $f_{X,Y}(x, y) = f_X(x)\, f_Y(y)$ can be used as the criterion to determine whether two discrete random variables $X$ and $Y$ are independent or not. But we need to be careful with it when dealing with continuous random variables, as additional assumptions are required before the corresponding equality of densities characterizes independence (see Eq. (2) below).
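To make the criterion concrete, here is a minimal Python sketch (the joint table below is made up for illustration) that checks the factorization $f_{X,Y}(x, y) = f_X(x)\, f_Y(y)$ entry by entry:

```python
# Joint p.m.f. of (X, Y) as a dict {(x, y): probability}; values are illustrative.
joint = {(0, 0): 0.12, (0, 1): 0.28, (1, 0): 0.18, (1, 1): 0.42}

xs = {x for x, _ in joint}
ys = {y for _, y in joint}
f_X = {x: sum(p for (a, _), p in joint.items() if a == x) for x in xs}  # marginal of X
f_Y = {y: sum(p for (_, b), p in joint.items() if b == y) for y in ys}  # marginal of Y

independent = all(
    abs(joint[(x, y)] - f_X[x] * f_Y[y]) < 1e-12 for x in xs for y in ys
)
print(independent)  # True: this table factorizes (f_X = {0: 0.4, 1: 0.6}, f_Y = {0: 0.3, 1: 0.7})
```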
Theorem 1.
If $X$ and $Y$ are independent and $g, h : \mathbb{R} \to \mathbb{R}$, then $g(X)$ and $h(Y)$ are independent also.
More generally, we say that a family $\{X_i : i \in I\}$ of (discrete) random variables is independent if the events $\{X_i = x_i\}$, $i \in I$, are independent for all possible choices of the set $\{x_i : i \in I\}$ of the values of the $X_i$. That is to say, $\{X_i : i \in I\}$ is an independent family if and only if
$$\mathbb{P}(X_i = x_i \text{ for all } i \in J) = \prod_{i \in J} \mathbb{P}(X_i = x_i)$$
for all sets $\{x_i : i \in I\}$ and for all finite subsets $J$ of $I$.
Probability density functions
Recall that a random variable $X$ is continuous if its distribution function can be written as\footnote{This is just a general integral; $f$ may or may not be continuous.}
$$F(x) = \int_{-\infty}^{x} f(u)\,du$$
for some integrable $f : \mathbb{R} \to [0, \infty)$.
Definition 3.
The function $f$ is called the (probability) density function (p.d.f.) of the continuous random variable $X$.
Remark.
The function $f$ is NOT unique. We can alter $f$ at isolated points, or on any countable set of points (which has measure zero), without changing the value of the integral. However, if $F$ is differentiable at $u$ then we shall normally set $f(u) = F'(u)$.
Next, assume $f$ is continuous; then, by the fundamental theorem of calculus, $F$ must be differentiable, with
$$F'(x) = f(x).$$
Recall that $\mathbb{P}(X = x) = F(x) - \lim_{y \uparrow x} F(y)$. Since $F$ is (absolutely) continuous for continuous random variables, the limit equals $F(x)$, and thus we have
$$\mathbb{P}(X = x) = 0 \qquad \text{for all } x.$$
This means the probability of a continuous random variable taking a value at any particular point is 0. Very roughly speaking, this lies in the observation that there are uncountably many possible values for $X$; this number is so large that the probability of $X$ taking any particular value cannot exceed zero.
The numerical value $f(x)$ is NOT a probability. Consider instead the probability of a very small interval $[x, x + \delta x]$:
$$\mathbb{P}(x \le X \le x + \delta x) = \int_x^{x + \delta x} f(u)\,du \approx f(x)\,\delta x.$$
Since $[x, x + \delta x]$ is a very small interval rather than a single number, we cannot say that $f(x)$ itself is the probability of anything; it is the product $f(x)\,\delta x$ that approximates a probability.
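Both points can be seen numerically in a short sketch (the exponential density with $\lambda = 2$ is an arbitrary choice; note $f(0) = 2 > 1$, so $f$ is clearly not a probability):

```python
from math import exp

lam = 2.0                        # f(x) = lam * exp(-lam * x) for x >= 0; f(0) = 2 > 1
F = lambda x: 1 - exp(-lam * x)  # distribution function for x >= 0
f = lambda x: lam * exp(-lam * x)

x = 0.5
for delta in (0.1, 0.01, 0.001):
    prob = F(x + delta) - F(x)         # exact P(x <= X <= x + delta)
    print(delta, prob, f(x) * delta)   # the two columns agree as delta shrinks
```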
Lemma.
If $X$ has density function $f$, then
$\int_{-\infty}^{\infty} f(x)\,dx = 1$,
$\mathbb{P}(X = x) = 0$ for all $x \in \mathbb{R}$,
$\mathbb{P}(a \le X \le b) = \int_a^b f(x)\,dx$.
Independence of continuous random variables
Independence of general random variables
Definition 4.
Random variables $X$ and $Y$ are called independent if $\{X \le x\}$ and $\{Y \le y\}$ are independent events for all $x, y \in \mathbb{R}$.
Note that this is the general definition of independence for any two random variables $X$ and $Y$, regardless of their types. The independence of discrete random variables is a special case of this definition.
Recall marginalization: $F_X(x) = \lim_{y \to \infty} F_{X,Y}(x, y)$. If two random variables $X$ and $Y$ are independent, we have
$$F_{X,Y}(x, y) = \mathbb{P}(X \le x,\, Y \le y) = \mathbb{P}(X \le x)\,\mathbb{P}(Y \le y).$$
Therefore,
$$F_{X,Y}(x, y) = F_X(x)\, F_Y(y). \tag{1}$$
Note that we are dealing with distribution functions in Eq. (1). Eq. (1) can be used as the general criterion to determine whether two random variables are independent or not.
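As a rough empirical illustration of Eq. (1) (a simulation sketch, not part of the original notes): for independently drawn uniforms, the joint empirical distribution function approximately factors into the product of the marginals:

```python
import random

random.seed(0)
n = 100_000
xs = [random.random() for _ in range(n)]
ys = [random.random() for _ in range(n)]  # drawn independently of xs

x0, y0 = 0.3, 0.6
F_joint = sum(1 for x, y in zip(xs, ys) if x <= x0 and y <= y0) / n
F_X = sum(1 for x in xs if x <= x0) / n
F_Y = sum(1 for y in ys if y <= y0) / n
print(F_joint, F_X * F_Y)  # both close to 0.3 * 0.6 = 0.18
```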
Independence of continuous variables
If $X, Y$ are continuous, then we have
$$F_X(x) = \lim_{y \to \infty} F_{X,Y}(x, y) = \int_{-\infty}^{x} \left( \int_{-\infty}^{\infty} f_{X,Y}(u, v)\,dv \right) du,$$
where $f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, v)\,dv$ is called the marginal probability density function of $X$. Similarly, we can define $F_Y(y)$ and $f_Y(y)$.
Assume $f_{X,Y}$ is continuous; then $F_{X,Y}$ must be twice differentiable w.r.t. $x$ and $y$, with $\frac{\partial^2 F_{X,Y}}{\partial x\, \partial y} = f_{X,Y}(x, y)$. Therefore, differentiating Eq. (1) with respect to $x$ and $y$, we can derive a very practical criterion to determine the independence of two continuous random variables:
$$f_{X,Y}(x, y) = f_X(x)\, f_Y(y). \tag{2}$$
Remark.
Note that the prerequisite of Eq. (2) is that $f_{X,Y}$ is continuous.
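For instance (a standard worked example, not from these notes): if
$$f_{X,Y}(x, y) = \lambda\mu\, e^{-\lambda x - \mu y}, \qquad x, y \ge 0,$$
then $f_X(x) = \int_0^{\infty} f_{X,Y}(x, y)\,dy = \lambda e^{-\lambda x}$ and, similarly, $f_Y(y) = \mu e^{-\mu y}$. Since $f_{X,Y}(x, y) = f_X(x)\, f_Y(y)$ everywhere, Eq. (2) shows that $X$ and $Y$ are independent exponential variables with parameters $\lambda$ and $\mu$.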
Example. Uniform distribution. The random variable $X$ is uniform on $[a, b]$ if it has distribution function
$$F(x) = \begin{cases} 0 & x \le a, \\ \dfrac{x - a}{b - a} & a < x \le b, \\ 1 & x > b. \end{cases}$$
The density function is
$$f(x) = \begin{cases} \dfrac{1}{b - a} & a < x \le b, \\ 0 & \text{otherwise.} \end{cases}$$
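As a quick use of this density (a one-line worked check): for $a \le c \le d \le b$,
$$\mathbb{P}(c \le X \le d) = \int_c^d \frac{1}{b - a}\,dx = \frac{d - c}{b - a},$$
so the probability of an interval depends only on its length, which is exactly what "uniform" means.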
Example. Exponential distribution. The random variable $X$ is exponential with parameter $\lambda > 0$ if it has distribution function
$$F(x) = 1 - e^{-\lambda x}, \qquad x \ge 0.$$
The density function is
$$f(x) = \lambda e^{-\lambda x}, \qquad x \ge 0.$$
Note that $F$ is not differentiable at $x = 0$. This means $f$ has a discontinuity at $0$, so we need to choose some value for $f(0)$. It doesn't matter what value we choose, as it doesn't affect the integral. Figure 2 shows the p.d.f. of the exponential distribution.
Fig. 2: p.d.f. of the exponential distribution.
Remark.
The interpretation of the exponential distribution can be found via Exponential distribution 1 and Exponential distribution 2. Pay attention to the connection between the exponential distribution and the Poisson distribution.
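That connection can be seen in a short simulation (a sketch; $\lambda = 3$ and the unit time window are arbitrary choices): if inter-arrival times are independent exponentials with parameter $\lambda$, the number of arrivals in one unit of time is Poisson with parameter $\lambda$, so its mean should be close to $\lambda$:

```python
import random

random.seed(0)
lam, trials = 3.0, 100_000
counts = []
for _ in range(trials):
    t, k = 0.0, 0
    while True:
        t += random.expovariate(lam)  # exponential inter-arrival time
        if t > 1.0:
            break
        k += 1
    counts.append(k)  # arrivals within one unit of time

# Empirical mean of the counts should be close to lam, the Poisson parameter
print(sum(counts) / trials)  # ~3.0
```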
Example. Normal (or Gaussian) distribution. The most important continuous distribution, which has two parameters $\mu$ and $\sigma^2$ and density function
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left( -\frac{(x - \mu)^2}{2\sigma^2} \right), \qquad -\infty < x < \infty.$$
It is denoted by $N(\mu, \sigma^2)$. If $\mu = 0$ and $\sigma^2 = 1$, then
$$f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$$
is the density of the standard normal distribution. It is easy to generalize the one-dimensional case to the multivariate case. Suppose $\mathbf{X} = (X_1, \dots, X_n)^\top$, $\boldsymbol{\mu}$ is the mean vector and $\Sigma$ is the covariance matrix. Then
$$f(\mathbf{x}) = \frac{1}{\sqrt{(2\pi)^n |\Sigma|}} \exp\!\left( -\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^\top \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu}) \right).$$
Remark.
For the 2D case, if $\Sigma$ is diagonal, then $f(x_1, x_2) = f_{X_1}(x_1)\, f_{X_2}(x_2)$, so $X_1$ and $X_2$ are independent. Therefore, the off-diagonal entry of $\Sigma$, the covariance, is a measure of dependence: for Gaussian variables, zero covariance implies independence. For the multivariate case, if $\Sigma$ is diagonal, then all the random variables are mutually independent.
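A numerical confirmation of the 2D claim (a sketch with made-up numbers, assuming numpy is available): with a diagonal $\Sigma$, the joint Gaussian density equals the product of its one-dimensional marginals.

```python
import numpy as np

def normal_pdf(x, mu, sigma2):
    """One-dimensional N(mu, sigma2) density."""
    return np.exp(-(x - mu)**2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)

def mvn_pdf(x, mu, Sigma):
    """Multivariate normal density evaluated at the vector x."""
    n = len(mu)
    diff = x - mu
    norm = np.sqrt((2 * np.pi)**n * np.linalg.det(Sigma))
    return float(np.exp(-0.5 * diff @ np.linalg.inv(Sigma) @ diff) / norm)

mu = np.array([1.0, -2.0])
Sigma = np.diag([0.5, 2.0])   # diagonal covariance: zero correlation
x = np.array([1.3, -1.1])

joint = mvn_pdf(x, mu, Sigma)
product = normal_pdf(x[0], mu[0], 0.5) * normal_pdf(x[1], mu[1], 2.0)
print(joint, product)         # equal: the joint density factorizes
```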