This article is supplemental for “Convergence of random variables” and provides proofs for selected results.
Several results will be established using the portmanteau lemma: A sequence {X_n} converges in distribution to X if and only if any of the following conditions are met:

A. \operatorname{E}[f(X_n)] \to \operatorname{E}[f(X)] for all bounded, continuous functions f;

B. \operatorname{E}[f(X_n)] \to \operatorname{E}[f(X)] for all bounded, Lipschitz functions f;

C. \limsup_{n} \operatorname{Pr}(X_n \in C) \leq \operatorname{Pr}(X \in C) for all closed sets C.
X_n \xrightarrow{\text{a.s.}} X \quad\Rightarrow\quad X_n \xrightarrow{p} X
Proof: If {X_n} converges to X almost surely, then the set

O = \{\omega \mid \lim X_n(\omega) \neq X(\omega)\}

has probability zero. Fix \varepsilon > 0 and consider the sequence of sets

A_n = \bigcup_{m \geq n} \left\{ \left| X_m - X \right| > \varepsilon \right\}.

This sequence of sets is decreasing (A_n \supseteq A_{n+1} \supseteq \ldots) towards the set

A_\infty = \bigcap_n A_n.

The probabilities of this sequence are also decreasing, and by continuity from above

\lim \operatorname{Pr}(A_n) = \operatorname{Pr}(A_\infty).

We now show that \operatorname{Pr}(A_\infty) = 0. For any \omega outside O we have \lim X_n(\omega) = X(\omega), so \left| X_n(\omega) - X(\omega) \right| < \varepsilon for all n \geq N, where N depends on \omega and \varepsilon. For such n the point \omega does not belong to A_n, and therefore not to A_\infty. Hence A_\infty \subseteq O and \operatorname{Pr}(A_\infty) = 0.

Finally,

\operatorname{Pr}\left(|X_n - X| > \varepsilon\right) \leq \operatorname{Pr}(A_n) \xrightarrow[n \to \infty]{} 0,

which by definition means that X_n converges to X in probability.
If Xn are independent random variables assuming value one with probability 1/n and zero otherwise, then Xn converges to zero in probability but not almost surely. This can be verified using the Borel–Cantelli lemmas.
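A short simulation makes the counterexample concrete. The sketch below is a minimal illustration assuming NumPy; the number of paths, the horizon, and the cut-off index are arbitrary choices. It shows that Pr(X_n = 1) = 1/n becomes small, while most sample paths still contain ones far beyond any fixed index, as the second Borel–Cantelli lemma predicts.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

n_paths, n_max = 4_000, 2_000             # number of sample paths, time horizon
probs = 1.0 / np.arange(1, n_max + 1)     # Pr(X_n = 1) = 1/n
X = rng.random((n_paths, n_max)) < probs  # each row is one independent path

# Convergence in probability: Pr(|X_n - 0| > eps) = 1/n -> 0 for any 0 < eps < 1.
print("empirical Pr(X_200 = 1):", X[:, 199].mean())   # close to 1/200 = 0.005

# No almost-sure convergence: sum(1/n) diverges and the X_n are independent,
# so by the second Borel-Cantelli lemma X_n = 1 infinitely often, almost surely.
hits_after_200 = X[:, 200:].any(axis=1)
print("fraction of paths with X_n = 1 for some n > 200:", hits_after_200.mean())  # about 0.9
```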
X_n \xrightarrow{p} X \quad\Rightarrow\quad X_n \xrightarrow{d} X
Lemma. Let X, Y be random variables, let a be a real number and ε > 0. Then
\operatorname{Pr}(Y \leq a) \leq \operatorname{Pr}(X \leq a + \varepsilon) + \operatorname{Pr}(|Y - X| > \varepsilon).
Proof of lemma:
\begin{align}
\operatorname{Pr}(Y \leq a) &= \operatorname{Pr}(Y \leq a,\ X \leq a + \varepsilon) + \operatorname{Pr}(Y \leq a,\ X > a + \varepsilon) \\
&\leq \operatorname{Pr}(X \leq a + \varepsilon) + \operatorname{Pr}(Y - X \leq a - X,\ a - X < -\varepsilon) \\
&\leq \operatorname{Pr}(X \leq a + \varepsilon) + \operatorname{Pr}(Y - X < -\varepsilon) \\
&\leq \operatorname{Pr}(X \leq a + \varepsilon) + \operatorname{Pr}(Y - X < -\varepsilon) + \operatorname{Pr}(Y - X > \varepsilon) \\
&= \operatorname{Pr}(X \leq a + \varepsilon) + \operatorname{Pr}(|Y - X| > \varepsilon)
\end{align}
Shorter proof of the lemma:
We have
\begin{align}
\{Y \leq a\} \subset \{X \leq a + \varepsilon\} \cup \{|Y - X| > \varepsilon\},
\end{align}

for if Y \leq a and |Y - X| \leq \varepsilon, then X \leq a + \varepsilon. Hence, by the union bound,

\begin{align}
\operatorname{Pr}(Y \leq a) \leq \operatorname{Pr}(X \leq a + \varepsilon) + \operatorname{Pr}(|Y - X| > \varepsilon).
\end{align}
Proof of the theorem: Recall that in order to prove convergence in distribution, one must show that the sequence of cumulative distribution functions converges to F_X at every point where F_X is continuous. Let a be such a point. For every ε > 0, the preceding lemma (applied once with Y = X_n and once with the roles of X_n and X exchanged) gives:
\begin{align}
\operatorname{Pr}(X_n \leq a) &\leq \operatorname{Pr}(X \leq a + \varepsilon) + \operatorname{Pr}(|X_n - X| > \varepsilon) \\
\operatorname{Pr}(X \leq a - \varepsilon) &\leq \operatorname{Pr}(X_n \leq a) + \operatorname{Pr}(|X_n - X| > \varepsilon)
\end{align}
So, we have
\operatorname{Pr}(X \leq a - \varepsilon) - \operatorname{Pr}\left(\left|X_n - X\right| > \varepsilon\right) \leq \operatorname{Pr}(X_n \leq a) \leq \operatorname{Pr}(X \leq a + \varepsilon) + \operatorname{Pr}\left(\left|X_n - X\right| > \varepsilon\right).
Taking the limit as n → ∞ and using that \operatorname{Pr}(|X_n - X| > \varepsilon) \to 0 because X_n converges to X in probability, we obtain:
F_X(a - \varepsilon) \leq \liminf_{n \to \infty} \operatorname{Pr}(X_n \leq a) \leq \limsup_{n \to \infty} \operatorname{Pr}(X_n \leq a) \leq F_X(a + \varepsilon),

where F_X(a) = \operatorname{Pr}(X \leq a) is the cumulative distribution function of X. Since F_X is continuous at a, both F_X(a - \varepsilon) and F_X(a + \varepsilon) converge to F_X(a) as ε → 0⁺, so

\lim_{n \to \infty} \operatorname{Pr}(X_n \leq a) = \operatorname{Pr}(X \leq a),

which means that X_n converges to X in distribution.
The implication also holds when X_n is a random vector: apply the property |Y_n - X_n| \xrightarrow{p} 0,\ X_n \xrightarrow{d} X \Rightarrow Y_n \xrightarrow{d} X proved later on this page, taking the constant sequence X in place of X_n and the given sequence in place of Y_n in the statement of that property.
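As a numerical illustration of this implication, the sketch below assumes a toy setup (X ~ N(0, 1) and X_n = X + Z_n/n with independent standard normal noise Z_n, so that X_n converges to X in probability) and checks that Pr(X_n ≤ a) approaches F_X(a) at a continuity point a.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

n_samples = 100_000
X = rng.standard_normal(n_samples)        # X ~ N(0, 1)
a = 0.5                                   # a continuity point of F_X

for n in (1, 10, 100):
    X_n = X + rng.standard_normal(n_samples) / n   # |X_n - X| -> 0 in probability
    print(f"n={n:>3}:  Pr(X_n <= a) ~ {np.mean(X_n <= a):.4f}")

print(f"empirical F_X(a) = Pr(X <= a) ~ {np.mean(X <= a):.4f}")   # limit value, about 0.6915
```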
X_n \xrightarrow{d} c \quad\Rightarrow\quad X_n \xrightarrow{p} c, provided c is a constant.
Proof: Fix ε > 0. Let B_\varepsilon(c) be the open ball of radius ε around the point c, and B_\varepsilon(c)^c its complement. Then

\operatorname{Pr}\left(|X_n - c| \geq \varepsilon\right) = \operatorname{Pr}\left(X_n \in B_\varepsilon(c)^c\right).

Since B_\varepsilon(c)^c is a closed set, the portmanteau lemma (part C) applies:

\begin{align}
\limsup_{n \to \infty} \operatorname{Pr}\left(\left|X_n - c\right| \geq \varepsilon\right) &= \limsup_{n \to \infty} \operatorname{Pr}\left(X_n \in B_\varepsilon(c)^c\right) \\
&\leq \operatorname{Pr}\left(c \in B_\varepsilon(c)^c\right) = 0,
\end{align}

and since the probabilities in question are nonnegative, \lim_{n \to \infty} \operatorname{Pr}(|X_n - c| \geq \varepsilon) = 0,
which by definition means that Xn converges to c in probability.
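For a concrete check, the sketch below uses an assumed toy sequence X_n ~ N(c, 1/n²), which converges in distribution to the constant c, and shows the probability of falling outside an ε-ball around c shrinking to zero.

```python
import numpy as np

rng = np.random.default_rng(seed=2)

c, eps, n_samples = 2.0, 0.1, 100_000
for n in (1, 10, 100):
    # X_n ~ Normal(c, 1/n^2), which converges in distribution to the constant c
    X_n = c + rng.standard_normal(n_samples) / n
    out = np.mean(np.abs(X_n - c) >= eps)
    print(f"n={n:>3}:  Pr(|X_n - c| >= {eps}) ~ {out:.4f}")
```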
|Y_n - X_n| \xrightarrow{p} 0, \quad X_n \xrightarrow{d} X \quad\Rightarrow\quad Y_n \xrightarrow{d} X
Proof: We will prove this theorem using the portmanteau lemma, part B. As required in that lemma, consider any bounded function f (i.e. |f(x)| ≤ M) which is also Lipschitz:
\exists K > 0, \forall x, y: \quad |f(x) - f(y)| \leq K\,|x - y|.
Take some ε > 0 and majorize the expression |E[f(Y_n)] − E[f(X_n)]| as
\begin{align}
\left|\operatorname{E}\left[f(Y_n)\right] - \operatorname{E}\left[f(X_n)\right]\right| &\leq \operatorname{E}\left[\left|f(Y_n) - f(X_n)\right|\right] \\
&= \operatorname{E}\left[\left|f(Y_n) - f(X_n)\right| \mathbf{1}_{\{|Y_n - X_n| < \varepsilon\}}\right] + \operatorname{E}\left[\left|f(Y_n) - f(X_n)\right| \mathbf{1}_{\{|Y_n - X_n| \geq \varepsilon\}}\right] \\
&\leq \operatorname{E}\left[K\left|Y_n - X_n\right| \mathbf{1}_{\{|Y_n - X_n| < \varepsilon\}}\right] + \operatorname{E}\left[2M\,\mathbf{1}_{\{|Y_n - X_n| \geq \varepsilon\}}\right] \\
&\leq K\varepsilon + 2M \operatorname{Pr}\left(|Y_n - X_n| \geq \varepsilon\right)
\end{align}
(here \mathbf{1} denotes the indicator function; the expectation of an indicator function equals the probability of the corresponding event). Therefore,
\begin{align}
\left|\operatorname{E}\left[f(Y_n)\right] - \operatorname{E}\left[f(X)\right]\right| &\leq \left|\operatorname{E}\left[f(Y_n)\right] - \operatorname{E}\left[f(X_n)\right]\right| + \left|\operatorname{E}\left[f(X_n)\right] - \operatorname{E}\left[f(X)\right]\right| \\
&\leq K\varepsilon + 2M \operatorname{Pr}\left(|Y_n - X_n| \geq \varepsilon\right) + \left|\operatorname{E}\left[f(X_n)\right] - \operatorname{E}\left[f(X)\right]\right|.
\end{align}
Passing to the limit as n → ∞, the second term vanishes because |Y_n − X_n| converges to zero in probability, and the third term vanishes by the portmanteau lemma (part B) because X_n converges to X in distribution. Thus

\limsup_{n \to \infty} \left|\operatorname{E}\left[f(Y_n)\right] - \operatorname{E}\left[f(X)\right]\right| \leq K\varepsilon.

Since ε was arbitrary, \operatorname{E}[f(Y_n)] \to \operatorname{E}[f(X)] for every bounded Lipschitz function f, and the portmanteau lemma (part B) implies that Y_n converges to X in distribution.
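The conclusion can also be observed numerically. The sketch below assumes toy sequences (X_n ~ N(0, 1), so X_n converges in distribution to N(0, 1), and Y_n = X_n + U_n/n with bounded uniform noise, so |Y_n − X_n| converges to zero in probability) and compares the empirical CDFs of Y_n and X_n, which become indistinguishable as n grows.

```python
import numpy as np

rng = np.random.default_rng(seed=3)

n_samples = 100_000
grid = np.linspace(-3.0, 3.0, 13)          # evaluation points for the two CDFs

for n in (1, 10, 100):
    X_n = rng.standard_normal(n_samples)                # X_n ->d X ~ N(0, 1)
    Y_n = X_n + rng.uniform(-1.0, 1.0, n_samples) / n   # |Y_n - X_n| <= 1/n ->p 0
    # largest gap between the empirical CDFs of Y_n and X_n over the grid
    gap = max(abs(np.mean(Y_n <= t) - np.mean(X_n <= t)) for t in grid)
    print(f"n={n:>3}:  max CDF gap ~ {gap:.4f}")
```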
See main article: Slutsky's theorem.
X_n \xrightarrow{d} X, \quad Y_n \xrightarrow{p} c \quad\Rightarrow\quad (X_n, Y_n) \xrightarrow{d} (X, c)
Proof: We will prove this statement using the portmanteau lemma, part A.
First we show that (X_n, c) converges in distribution to (X, c). By the portmanteau lemma this will be true if we can show that E[f(X_n, c)] → E[f(X, c)] for any bounded continuous function f(x, y). So let f be an arbitrary bounded continuous function, and consider the function of a single variable g(x) := f(x, c). This is also bounded and continuous, and therefore, since X_n converges in distribution to X, the portmanteau lemma gives E[g(X_n)] → E[g(X)]. But the latter statement is exactly E[f(X_n, c)] → E[f(X, c)], and therefore (X_n, c) converges in distribution to (X, c).
Secondly, consider |(Xn, Yn) − (Xn, c)| = |Yn − c|. This expression converges in probability to zero because Yn converges in probability to c. Thus we have demonstrated two facts:
\begin{cases}
\left|(X_n, Y_n) - (X_n, c)\right| \xrightarrow{p} 0, \\
(X_n, c) \xrightarrow{d} (X, c).
\end{cases}

By the property proved earlier on this page, these two facts together imply that (X_n, Y_n) converges in distribution to (X, c).
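A simulation of this statement is sketched below with assumed toy sequences (X_n ~ N(0, 1), so X_n converges in distribution to X, and Y_n = c + Z_n/n, so Y_n converges in probability to c). The joint limit is observed through a continuous function of the pair, here the sum, whose distribution approaches that of X + c, exactly the kind of conclusion Slutsky's theorem draws.

```python
import numpy as np

rng = np.random.default_rng(seed=4)

n_samples, c = 100_000, 2.0
for n in (1, 10, 100):
    X_n = rng.standard_normal(n_samples)             # X_n ->d X ~ N(0, 1)
    Y_n = c + rng.standard_normal(n_samples) / n     # Y_n ->p c
    S = X_n + Y_n                                    # a continuous function of the pair
    # (X_n, Y_n) ->d (X, c) implies S ->d X + c ~ N(c, 1); check mean and variance
    print(f"n={n:>3}:  mean(S) ~ {S.mean():.3f},  var(S) ~ {S.var():.3f}")
```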
X_n \xrightarrow{p} X, \quad Y_n \xrightarrow{p} Y \quad\Rightarrow\quad (X_n, Y_n) \xrightarrow{p} (X, Y)
Proof: Fix ε > 0. Then

\begin{align}
\operatorname{Pr}\left(\left|(X_n, Y_n) - (X, Y)\right| \geq \varepsilon\right) &\leq \operatorname{Pr}\left(|X_n - X| + |Y_n - Y| \geq \varepsilon\right) \\
&\leq \operatorname{Pr}\left(|X_n - X| \geq \varepsilon/2\right) + \operatorname{Pr}\left(|Y_n - Y| \geq \varepsilon/2\right),
\end{align}

where the first inequality uses \|(u, v)\| \leq |u| + |v| for the Euclidean norm, and the second uses the fact that the event on its left forces at least one of the two summands to be at least ε/2. Both probabilities on the right-hand side converge to zero as n → ∞ because X_n and Y_n converge in probability to X and Y respectively, so (X_n, Y_n) converges in probability to (X, Y).