Random variables

Definition 1. A random variable is a function $X:\Omega\to\mathbb{R}$ with the property that $\{\omega\in\Omega : X(\omega)\le x\}\in\mathcal{F}$ for each $x\in\mathbb{R}$. Such a function is said to be $\mathcal{F}$-measurable.

Example 1. Tossing two dice. We define $X(\omega)=\begin{cases}1 & \text{double } (i=j),\\ 0 & \text{not double } (i\ne j),\end{cases}$ where $\omega=(i,j)\in\Omega$.

Remark. We are interested in $g(x)=P\{\omega : X(\omega)=x\}$, but sometimes this does not work well. The probability triple is $\{\Omega,\mathcal{F},P\}$. If $\Omega$ is countable, then $g(x)$ is adequate. But if $\Omega$ is uncountable, e.g. an interval of $\mathbb{R}$, then $P(X=x)$ is not very informative: single points carry no mass in such a space, so $P(X=x)$ is typically zero for every $x$.

Definition 2. The distribution function of a random variable $X$ is the function $F:\mathbb{R}\to[0,1]$ given by $F(x)=P(X\le x)$.

Example 2. The distribution function of the preceding example is $F(x)=\begin{cases}0 & x<0,\\ 30/36 & 0\le x<1,\\ 1 & x\ge 1.\end{cases}$
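As a sanity check, the distribution function of Examples 1–2 can be computed by brute-force enumeration of the 36 equally likely outcomes. A minimal Python sketch (the names `omega`, `X`, `F` are our own):

```python
from fractions import Fraction

# Sample space for two dice: all ordered pairs (i, j).
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]

def X(w):
    """X = 1 on a double (i == j), 0 otherwise."""
    i, j = w
    return 1 if i == j else 0

def F(x):
    """Distribution function F(x) = P(X <= x), uniform probability on omega."""
    favourable = sum(1 for w in omega if X(w) <= x)
    return Fraction(favourable, len(omega))

print(F(-1))   # 0: X never takes a value <= -1
print(F(0))    # 5/6: the 30 non-doubles
print(F(0.5))  # still 5/6: F is constant between the atoms 0 and 1
print(F(1))    # 1
```

Note that `F(0.5) == F(0)`: the distribution function only changes at the values $0$ and $1$ that $X$ actually takes.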

Notice that $\{\omega : X(\omega)\le x\}$ defines an event; it is an element of the corresponding $\sigma$-field. Denote $A(x)=\{\omega : X(\omega)\le x\}$. Along with $A(x)$, we can define $A^c(x)=\{\omega : X(\omega)>x\}$ and $A(x,y)=A^c(x)\cap A(y)=\{\omega : x<X(\omega)\le y\}$. Two points are worth noting:

  1. $F$ must be defined for all $x\in\mathbb{R}$.
  2. $A(x)$ must belong to $\mathcal{F}$. Otherwise we cannot speak of the probability $P(A(x))$, and the definition of the distribution function is meaningless.

Lemma 1. A distribution function F has the following properties:

  1. $\lim_{x\to-\infty}F(x)=0$ and $\lim_{x\to\infty}F(x)=1$,
  2. if $x<y$, then $F(x)\le F(y)$,
  3. $F$ is right-continuous, that is, $F(x+h)\to F(x)$ as $h\downarrow 0$ (left-continuity is not required).

Example 3. Indicator functions. A particular class of Bernoulli variables is very useful in probability theory. Let $A$ be an event and let $I_A:\Omega\to\mathbb{R}$ be the indicator function of $A$; that is, $I_A(\omega)=\begin{cases}1 & \text{if }\omega\in A,\\ 0 & \text{if }\omega\in A^c.\end{cases}$ Then $I_A$ is a Bernoulli random variable taking the values $1$ and $0$ with probabilities $P(A)$ and $P(A^c)$ respectively. Suppose $\{B_i : i\in I\}$ is a family of disjoint events with $A\subseteq\bigcup_{i\in I}B_i$. Then $I_A=\sum_{i\in I}I_{A\cap B_i}$, an identity which is often useful.
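The decomposition identity can be checked mechanically outcome by outcome. A toy Python sketch (the sample space, the event `A`, and the disjoint family `B` are hypothetical choices for illustration):

```python
# Verify I_A(w) == sum_i I_{A ∩ B_i}(w) for a disjoint family {B_i} covering A.
omega = set(range(12))
A = {1, 3, 5, 7, 9}
B = [{0, 1, 2, 3}, {4, 5, 6, 7}, {8, 9, 10, 11}]  # disjoint, union covers A

def indicator(E):
    """Return the indicator function I_E as a callable on outcomes."""
    return lambda w: 1 if w in E else 0

I_A = indicator(A)
for w in omega:
    # Each w lies in at most one B_i, so the sum is 0 or 1.
    assert I_A(w) == sum(indicator(A & Bi)(w) for Bi in B)
print("identity holds on every outcome")
```

The key point is disjointness: each outcome contributes to at most one term of the sum, so the sum never exceeds $1$.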

Lemma 2. Let F be the distribution function of X. Then

  1. $P(X>x)=1-F(x)$,
  2. $P(x<X\le y)=F(y)-F(x)$,
  3. $P(X=x)=F(x)-\lim_{y\uparrow x}F(y)$.
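Each part of Lemma 2 can be verified on the two-dice example. A Python sketch (`F_left` plays the role of the left limit, which for this discrete $X$ equals $P(X<x)$):

```python
from fractions import Fraction

# Two-dice example again: X is the indicator of a double, uniform over 36 outcomes.
outcomes = [(i, j) for i in range(1, 7) for j in range(1, 7)]
X = lambda w: 1 if w[0] == w[1] else 0

def P(pred):
    """Probability of {w : pred(w)} under the uniform measure on outcomes."""
    return Fraction(sum(1 for w in outcomes if pred(w)), len(outcomes))

F = lambda x: P(lambda w: X(w) <= x)
F_left = lambda x: P(lambda w: X(w) < x)   # left limit of F at x (discrete case)

# 1. P(X > x) = 1 - F(x)
assert P(lambda w: X(w) > 0) == 1 - F(0)
# 2. P(x < X <= y) = F(y) - F(x)
assert P(lambda w: 0 < X(w) <= 1) == F(1) - F(0)
# 3. P(X = x) = F(x) - lim_{y↑x} F(y): the size of the jump of F at x
assert P(lambda w: X(w) == 1) == F(1) - F_left(1)   # jump of 1/6 at x = 1
```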

A random variable $X$ with distribution function $F$ is said to have two “tails” given by $T_1(x)=P(X>x)=1-F(x)$ and $T_2(x)=P(X\le -x)=F(-x)$, where $x$ is large and positive. The rates at which the $T_i$ decay to zero as $x\to\infty$ have a substantial effect on the existence or non-existence of certain associated quantities called the “moments” of the distribution.
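To see why the decay rate matters, consider a hypothetical Pareto-type right tail $T_1(x)=x^{-\alpha}$ for $x\ge 1$. For such a random variable (so $X\ge 1$), $E[X]=1+\int_1^\infty T_1(x)\,dx$, which is finite precisely when $\alpha>1$. A numerical Python sketch (the tail and the cutoff values are our own choices):

```python
# Hypothetical Pareto-type right tail: T1(x) = P(X > x) = x**(-alpha) for x >= 1.
# Then E[X] = 1 + ∫_1^∞ T1(x) dx, finite iff alpha > 1.
def truncated_mean(alpha, cutoff, n=200_000):
    """1 + midpoint-rule approximation of the integral of x^(-alpha) over [1, cutoff]."""
    h = (cutoff - 1.0) / n
    return 1.0 + sum((1.0 + (k + 0.5) * h) ** (-alpha) for k in range(n)) * h

# alpha = 2: tail decays fast enough; the truncated mean approaches E[X] = 2.
assert abs(truncated_mean(2.0, 10_000) - 2.0) < 1e-3
# alpha = 1: tail too heavy; the truncated mean grows like 1 + ln(cutoff) without bound.
assert truncated_mean(1.0, 10_000) > 9.0
```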

Different random variables

Discrete random variables

Definition 3. The random variable $X$ is called discrete if it takes values only in some countable subset $\{x_1,x_2,\dots\}$ of $\mathbb{R}$. The discrete random variable $X$ has (probability) mass function $f:\mathbb{R}\to[0,1]$ given by $f(x)=P(X=x)$.

We shall see that the distribution function of a discrete variable has jump discontinuities at the values $x_1,x_2,\dots$ and is constant in between; such a distribution is called atomic.
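For instance, for the score of a fair die (a hypothetical example), $F$ jumps by $f(x_k)=1/6$ at each atom $x_k\in\{1,\dots,6\}$ and is flat in between. A short Python sketch:

```python
from fractions import Fraction

# Mass function of a fair die score (hypothetical example): atoms 1..6, mass 1/6 each.
atoms = {k: Fraction(1, 6) for k in range(1, 7)}

def F(x):
    """Distribution function: a step function, constant between atoms."""
    return sum(p for xk, p in atoms.items() if xk <= x)

assert F(3) - F(2) == atoms[3]   # the jump of F at an atom equals the mass there
assert F(2.5) == F(2)            # F is constant between consecutive atoms
assert F(0) == 0 and F(6) == 1
```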

Continuous random variables

Definition 4. The random variable $X$ is called continuous if its distribution function can be expressed as $F(x)=\int_{-\infty}^{x}f(u)\,du$ for all $x\in\mathbb{R}$, for some integrable function $f:\mathbb{R}\to[0,\infty)$ called the **(probability) density function** of $X$.

Some points worth noting:

  1. The density function $f$ is not unique: changing $f$ at isolated points does not affect the integral, so any such modification yields another density for $X$.
  2. $F$ must be **absolutely continuous**, which implies $F$ is continuous. It follows that the probability at any single point is zero, \emph{i.e.}, $P(X=x)=0$.
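Both points can be illustrated numerically. A Python sketch with a hypothetical choice of density, the exponential $f(u)=e^{-u}$ for $u\ge 0$, whose distribution function is $F(x)=1-e^{-x}$:

```python
import math

# Hypothetical continuous example: exponential density f(u) = e^{-u} for u >= 0,
# with distribution function F(x) = 1 - e^{-x} for x >= 0.
f = lambda u: math.exp(-u) if u >= 0 else 0.0
F = lambda x: 1 - math.exp(-x) if x >= 0 else 0.0

def integral_of_f(x, n=100_000):
    """Midpoint-rule approximation of the integral of f from -inf to x (f is 0 below 0)."""
    h = x / n
    return sum(f((k + 0.5) * h) for k in range(n)) * h

x = 2.0
assert abs(integral_of_f(x) - F(x)) < 1e-6   # F(x) = ∫ f, as in Definition 4
assert F(x) - F(x - 1e-9) < 1e-8             # F is continuous, so P(X = x) = 0
```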

There is another sort of random variable, called “singular”.

Random vectors

Definition

A random vector is a function $X:\Omega\to\mathbb{R}^n$; for example, $X=(X,Y)$ for $n=2$. We can also define a distribution function for such an $X$, but we must first introduce an ordering on $\mathbb{R}^n$.

Definition 5. By definition, $(x_1,y_1)<(x_2,y_2)$ if and only if $x_1<x_2$ **and** $y_1<y_2$.

Definition 6. The joint distribution function of a random vector $X=(X_1,X_2,\dots,X_n)$ on the probability space $\{\Omega,\mathcal{F},P\}$ is the function $F_X:\mathbb{R}^n\to[0,1]$ given by $F_X(x)=P(X\le x)$ for $x\in\mathbb{R}^n$.

Remark. The joint probability $P(X\le x)=P(X_1\le x_1,\dots,X_n\le x_n)$; here $\{X\le x\}$ is an abbreviation for the event $\{\omega\in\Omega : X(\omega)\le x\}$.

Lemma 3. The joint distribution function $F_{X,Y}$ of the random vector $(X,Y)$ has the following properties:

  1. $\lim_{x,y\to-\infty}F_{X,Y}(x,y)=0$, $\lim_{x,y\to\infty}F_{X,Y}(x,y)=1$,
  2. if $(x_1,y_1)\le(x_2,y_2)$, then $F_{X,Y}(x_1,y_1)\le F_{X,Y}(x_2,y_2)$,
  3. $F_{X,Y}$ is continuous from above, in that $F_{X,Y}(x+u,y+v)\to F_{X,Y}(x,y)$ as $u,v\downarrow 0$.

Marginalization

$$\lim_{y\to\infty}F_{X,Y}(x,y)=F_X(x)=P(X\le x),\qquad \lim_{x\to\infty}F_{X,Y}(x,y)=F_Y(y)=P(Y\le y).$$

The functions $F_X$ and $F_Y$ are called the “marginal” distribution functions of $F_{X,Y}$. The joint distribution $F_{X,Y}$ determines the two marginals $F_X$ and $F_Y$, but the converse is NOT true.
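A minimal counterexample for the converse can be written down on $\{0,1\}^2$: independent fair coins and perfectly correlated fair coins share the same marginals but have different joints. A Python sketch (the two joint mass functions are hypothetical toy choices):

```python
from fractions import Fraction

half, quarter = Fraction(1, 2), Fraction(1, 4)

# Two joint mass functions on {0,1}^2 with identical marginals:
independent = {(x, y): quarter for x in (0, 1) for y in (0, 1)}
correlated  = {(0, 0): half, (1, 1): half, (0, 1): Fraction(0), (1, 0): Fraction(0)}

def marginal_X(f):
    """Marginal mass function of X: sum the joint mass over all values of Y."""
    return {x: sum(p for (u, _), p in f.items() if u == x) for x in (0, 1)}

def marginal_Y(f):
    """Marginal mass function of Y: sum the joint mass over all values of X."""
    return {y: sum(p for (_, v), p in f.items() if v == y) for y in (0, 1)}

# Same marginals...
assert marginal_X(independent) == marginal_X(correlated)
assert marginal_Y(independent) == marginal_Y(correlated)
# ...but different joints: the marginals do not determine F_{X,Y}.
assert independent != correlated
```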

Discrete and continuous distribution

Definition 7. The random variables $X$ and $Y$ on the probability space $\{\Omega,\mathcal{F},P\}$ are called **(jointly) discrete** if the vector $(X,Y)$ takes values in some **countable** subset of $\mathbb{R}^2$ only. The jointly discrete random variables $X,Y$ have **joint (probability) mass function** $f:\mathbb{R}^2\to[0,1]$ given by $f(x,y)=P(X=x,\,Y=y)$.

Definition 8. The random variables $X$ and $Y$ on the probability space $\{\Omega,\mathcal{F},P\}$ are called **(jointly) continuous** if their joint distribution function can be expressed as $F_{X,Y}(x,y)=\int_{v=-\infty}^{y}\int_{u=-\infty}^{x}f(u,v)\,du\,dv$ for all $x,y\in\mathbb{R}$, for some **integrable** function $f:\mathbb{R}^2\to[0,\infty)$ called the **joint (probability) density function** of the pair $(X,Y)$.