In this lecture we study the distribution of the average of $n$ i.i.d. random variables $X_i$, $i = 1, \dots, n$, i.e., the random variable $Q_n = \frac{1}{n}(X_1 + \cdots + X_n)$. We will introduce the central limit theorem and show that, as $n \to \infty$, $Q_n$ (suitably standardized) converges to a normal distribution, provided $\operatorname{Var}(X_i)$ exists.

Sums of Discrete Random Variables

Suppose $X$ and $Y$ are two independent discrete random variables with distribution functions $p_X(x)$ and $p_Y(y)$. Let $Z = X + Y$; we want to find the distribution function of $Z$.

Suppose that $X = k$, where $k$ is some integer. Then $Z = z$ if and only if $Y = z - k$. So the event $Z = z$ is the union of the pairwise disjoint events $(X = k)$ and $(Y = z - k)$, where $k$ runs over the integers. Since these events are pairwise disjoint and $X$ and $Y$ are independent, we have
$$P(Z = z) = \sum_{k=-\infty}^{\infty} P(X = k)\, P(Y = z - k), \tag{6-1}$$
which is the distribution function of the random variable $Z$.

Definition. Let $X$ and $Y$ be two independent integer-valued random variables with distribution functions $p_X(x)$ and $p_Y(y)$ respectively. Then the **convolution** of $p_X(x)$ and $p_Y(y)$ is the distribution function $p_Z = p_X * p_Y$ given by
$$p_Z(z) = \sum_x p_X(x)\, p_Y(z - x)$$
for $z \in \mathbb{Z}$. The function $p_Z(z)$ is the distribution function of the random variable $Z = X + Y$.
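To see the discrete convolution formula in action, here is a minimal Python sketch (the function name `convolve_pmf` and the fair-die distribution are illustrative choices, not part of the lecture) that computes the distribution of the sum of two fair dice.

```python
# Minimal sketch: convolving two discrete distribution functions.
# convolve_pmf and the dice example are illustrative, not from the lecture.

def convolve_pmf(pX, pY):
    """Return p_Z(z) = sum_x p_X(x) * p_Y(z - x) as a dict."""
    pZ = {}
    for x, px in pX.items():
        for y, py in pY.items():
            pZ[x + y] = pZ.get(x + y, 0.0) + px * py
    return pZ

# Example: distribution of the sum of two fair dice.
die = {k: 1 / 6 for k in range(1, 7)}
pZ = convolve_pmf(die, die)
for z in sorted(pZ):
    print(z, round(pZ[z], 4))   # e.g. P(Z = 7) = 6/36 ≈ 0.1667
```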

Sums of Continuous Random Variables

Definition. Let $X$ and $Y$ be two continuous random variables with density functions $f_X(x)$ and $f_Y(y)$ respectively. Assume that both $f_X(x)$ and $f_Y(y)$ are defined for all real numbers. Then the **convolution** $f_X * f_Y$ of $f_X$ and $f_Y$ is the function given by
$$(f_X * f_Y)(z) = \int_{-\infty}^{+\infty} f_X(z - y)\, f_Y(y)\, dy = \int_{-\infty}^{+\infty} f_Y(z - x)\, f_X(x)\, dx. \tag{6-2}$$

Theorem. Let $X$ and $Y$ be two independent random variables with density functions $f_X(x)$ and $f_Y(y)$ defined for all $x$ and $y$. Then the sum $Z = X + Y$ is a random variable with density function $f_Z(z)$, where $f_Z$ is the convolution of $f_X$ and $f_Y$.
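As a quick numerical check of this theorem (a sketch, not part of the lecture; the choice of two standard normal densities, the grid, and the evaluation point are arbitrary), one can approximate the convolution integral (6-2) by a Riemann sum and compare it with the known density of the sum, which for two independent $N(0,1)$ variables is $N(0,2)$.

```python
# Sketch: numerically approximate f_Z = f_X * f_Y and compare with the exact density.
# The standard normal densities and the grid are illustrative assumptions.
import numpy as np

def normal_pdf(x, mu=0.0, sigma=1.0):
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

# Approximate (f_X * f_Y)(z) = ∫ f_X(z - y) f_Y(y) dy with a Riemann sum.
y = np.linspace(-10, 10, 4001)
dy = y[1] - y[0]
z = 1.5
fZ_numeric = np.sum(normal_pdf(z - y) * normal_pdf(y)) * dy

# The sum of two independent N(0,1) variables is N(0, 2).
fZ_exact = normal_pdf(z, mu=0.0, sigma=np.sqrt(2))
print(fZ_numeric, fZ_exact)   # the two values should agree closely
```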

Theorem. Let $\{X_i\}$, $i = 1, \dots, n$, be a sequence of independent random variables with density functions $f_{X_1}(x), \dots, f_{X_n}(x)$ respectively. Then
$$f_{X_1 + \cdots + X_n}(x) = \bigl(f_{X_1} * (f_{X_2} * (\cdots * f_{X_n}))\bigr)(x).$$
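The iterated convolution can also be checked numerically. The sketch below (the uniform densities, grid step, and use of `numpy.convolve` are illustrative assumptions) approximates the density of $X_1 + X_2 + X_3$ for three independent uniform $[0,1]$ variables by repeated grid convolution.

```python
# Sketch: density of X1 + X2 + X3 for uniform(0,1) variables via repeated
# grid convolution. Grid step and sample points are illustrative choices.
import numpy as np

dx = 0.001
x = np.linspace(0, 1, 1001)       # grid on [0, 1] with step dx
f = np.ones_like(x)               # uniform(0,1) density on the grid

# Repeated convolution: f_{X1+X2+X3} = f * f * f.  Each discrete convolution
# is scaled by dx so it approximates the continuous convolution integral.
f_sum = f
for _ in range(2):
    f_sum = np.convolve(f_sum, f) * dx

print(f_sum.sum() * dx)           # ≈ 1, so f_sum is (approximately) a density
print(f_sum[1500])                # density at x = 1.5; the exact value is 3/4
```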

Example. [Sum of two independent uniform random variables] Let $X$ and $Y$ be independent random variables, each uniform on $[0, 1]$, and let $Z = X + Y$ be their sum. Then we have
$$f_X(x) = f_Y(x) = \begin{cases} 1 & \text{if } 0 \le x \le 1, \\ 0 & \text{otherwise,} \end{cases}$$
and the density function for the sum is given by
$$f_Z(z) = \int_{-\infty}^{+\infty} f_X(z - y)\, f_Y(y)\, dy = \int_0^1 f_X(z - y)\, dy.$$
Now the integrand is $0$ unless $0 \le z - y \le 1$, i.e., unless $z - 1 \le y \le z$, and then it is $1$. So if $0 \le z \le 1$, we have
$$f_Z(z) = \int_0^z dy = z,$$
while if $1 < z \le 2$, we have
$$f_Z(z) = \int_{z-1}^1 dy = 2 - z,$$
and if $z < 0$ or $z > 2$ we have $f_Z(z) = 0$. Hence
$$f_Z(z) = \begin{cases} z, & \text{if } 0 \le z \le 1, \\ 2 - z, & \text{if } 1 < z \le 2, \\ 0, & \text{otherwise.} \end{cases}$$
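A small simulation can be compared against the triangular density just derived; the sample size, random seed, and evaluation point below are arbitrary illustrative choices.

```python
# Sketch: simulate Z = X + Y for X, Y independent uniform(0,1) and compare the
# empirical density near a point with the triangular formula derived above.
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
Z = rng.uniform(0, 1, n) + rng.uniform(0, 1, n)

def f_Z(z):
    """Triangular density of the sum of two uniform(0,1) variables."""
    if 0 <= z <= 1:
        return z
    if 1 < z <= 2:
        return 2 - z
    return 0.0

z0, h = 0.75, 0.01
empirical = np.mean((Z > z0 - h) & (Z < z0 + h)) / (2 * h)
print(empirical, f_Z(z0))   # both should be close to 0.75
```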

Example. [Sum of two independent exponential random variables] Let $X$ and $Y$ be independent exponential random variables with parameter $\lambda$, let $Z = X + Y$, and let $f_X$, $f_Y$, and $f_Z$ denote their densities. Then
$$f_X(x) = f_Y(x) = \begin{cases} \lambda e^{-\lambda x}, & \text{if } x \ge 0, \\ 0, & \text{otherwise.} \end{cases}$$
If $z > 0$,
$$f_Z(z) = \int_{-\infty}^{+\infty} f_X(z - y)\, f_Y(y)\, dy = \int_0^z \lambda e^{-\lambda(z - y)}\, \lambda e^{-\lambda y}\, dy = \int_0^z \lambda^2 e^{-\lambda z}\, dy = \lambda^2 z\, e^{-\lambda z},$$
while if $z < 0$, $f_Z(z) = 0$. Hence
$$f_Z(z) = \begin{cases} \lambda^2 z\, e^{-\lambda z}, & \text{if } z \ge 0, \\ 0, & \text{otherwise.} \end{cases}$$
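The same kind of simulation check works here; the value $\lambda = 2$, the seed, and the evaluation point are arbitrary illustrative choices.

```python
# Sketch: simulate Z = X + Y for X, Y independent exponential(λ) and compare
# the empirical density at a point with λ² z e^{-λz}.
import numpy as np

rng = np.random.default_rng(1)
lam, n = 2.0, 1_000_000
Z = rng.exponential(1 / lam, n) + rng.exponential(1 / lam, n)

def f_Z(z, lam):
    """Density of the sum of two independent exponential(λ) variables."""
    return lam ** 2 * z * np.exp(-lam * z) if z >= 0 else 0.0

z0, h = 1.0, 0.01
empirical = np.mean((Z > z0 - h) & (Z < z0 + h)) / (2 * h)
print(empirical, f_Z(z0, lam))   # both ≈ 4 e^{-2} ≈ 0.5413
```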