auto-update(nvim): 2025-02-19 16:07:50

Youwen Wu 2025-02-19 16:07:50 -08:00
parent 69878532bb
commit 8ade952fdf
Signed by: youwen5
GPG key ID: 865658ED1FE61EC3

@@ -1,5 +1,4 @@
#import "@youwen/zen:0.1.0": *
#import "@preview/ctheorems:1.1.3": *
#show: zen.with(
title: "PSTAT120A Course Notes",
@@ -1993,8 +1992,8 @@ Previously we discussed "raw moments." Be careful not to confuse them with
_central moments_.
#fact[
-The $n^"th"$ central moment of a discrete random variable $X$ with p.m.f. p_X
-(x) is the expected value of the difference about the mean raised to the
+The $n^"th"$ central moment of a discrete random variable $X$ with p.m.f. $p_X
+(x)$ is the expected value of the difference about the mean raised to the
$n^"th"$ power
$
@@ -2016,6 +2015,8 @@ $
mu'_2 = E[(X-mu)^2] = sigma^2_X = "Var"(X)
$
Effectively we're centering our distribution first.
#example[
Let $Y$ be a uniformly chosen integer from ${0,1,2,...,m}$. Find the first and
second moment of $Y$.
@@ -2103,3 +2104,282 @@ indicator of where the center of the distribution lies.
The median reflects the fact that 90% of the values and of the probability mass
lie in the range $1,2,...,9$, while the mean is heavily influenced by the
$-100$ value.
]
= President's Day lecture
...
= Lecture #datetime(day: 19, month: 2, year: 2025).display()
== Moment generating functions
Like the CDF, the moment generating function also completely characterizes the
distribution. That is, if you can find the MGF, it tells you all of the
information about the distribution. So it is an alternative way to characterize
a random variable.
They are "easy" to use for finding the distribution of:
- sums of independent random variables
- the limit of a sequence of random variables
#definition[
Let $X$ be a random variable with all finite moments
$
E[X^k] = mu_k, k = 1,2,...
$
Then the *moment generating function* of a random variable $X$ is defined by
$M_X (t) = E[e^(t X)]$, for the real variable $t$.
]
All of the moments must be defined for the MGF to exist. The MGF looks like
$
sum_("all" x) e^(t x) p(x)
$
in the discrete case, and
$
integral^infinity_(-infinity) e^(t x) f(x) dif x
$
in the continuous case.
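For instance (a quick added check): if $X ~ "Bernoulli"(p)$ with $q = 1 - p$, the sum has only two terms:
$
M_X (t) = e^(t dot 0) q + e^(t dot 1) p = q + p e^t
$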
#proposition[
It holds that the $n^"th"$ derivative of $M$ evaluated at 0 gives the
$n^"th"$ moment.
$
M_X^((n)) (0) = E[X^n]
$
]
#proof[
$
M_X (t) &equiv E[e^(t X)] = E[1 + t X + (t X)^2 / 2! + dots.c] \
&= E[1] + E[t X] + E[(t^2 X^2) / 2!] + dots.c \
&= E[1] + t E[X] + t^2 / 2! E[X^2] + dots.c \
&= 1 + t / 1! mu_1 + t^2 / 2! mu_2 + dots.c
$
The coefficient of $t^k/k!$ in the Taylor series expansion of $M_X (t)$ is the
$k^"th"$ moment. So an alternative way to get $mu_k$ is
$
mu_k = lr(((dif^k M(t))/(dif t^k)) |)_(t=0) = "coefficient of" t^k / k!
$
]
#example[Binomial][
Let $X ~ "Bin"(n,p)$. Then the MGF of $X$ is given by
$
sum_(k=0)^n e^(t k) vec(n,k) p^k q^(n-k) = sum_(k=0)^n vec(n,k) underbrace((p e^t)^k, a^k) underbrace(q^(n-k), b^(n-k))
$
Applying the binomial theorem
$
(a + b)^n = sum_(k=0)^n vec(n,k) a^k b^(n-k)
$
with $a = p e^t$ and $b = q$, we have
$
M_X (t) = (q + p e^t)^n
$
Let's find the first moment (using $q + p = 1$):
$
mu_1 = lr((dif M(t))/(dif t) |)_(t=0) = lr(n(q + p e^t)^(n-1) p e^t |)_(t=0) \
= n p
$
The second moment:
$
mu_2 = lr((dif^2 M(t))/(dif t^2) |)_(t=0) \
= n(n-1) p^2 + n p
$
For example, if $X$ has MGF $(1/3 + 2/3 e^t)^(10)$, then $X ~ "Bin"(10, 2/3)$.
]
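As a quick added check, these two moments recover the familiar binomial variance:
$
"Var"(X) = mu_2 - mu_1^2 = n(n-1)p^2 + n p - n^2 p^2 = n p(1 - p) = n p q
$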
#example[Poisson][
Let $X ~ "Pois"(lambda)$. Then the MGF of $X$ is given by
$
M_X (t) &= E[e^(t X)] \
&= sum^infinity_(x=0) e^(t x) e^(-lambda) lambda^x / x! \
&= e^(-lambda) sum^infinity_(x=0) (lambda e^t)^x / x!
$
Note: $e^a = sum_(x=0)^infinity a^x / x!$, so with $a = lambda e^t$,
$
= e^(-lambda) e^(lambda e^t) \
= e^(-lambda (1 - e^t))
$
Then the first moment is
$
mu_1 = lr(e^(-lambda (1 - e^t)) (-lambda) (-e^t) |)_(t=0) = lambda
$
]
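Differentiating once more gives the second moment and, as a quick added check, the familiar Poisson variance:
$
mu_2 = lr((dif)/(dif t) [lambda e^t e^(-lambda (1 - e^t))] |)_(t=0) = lambda + lambda^2, quad "Var"(X) = mu_2 - mu_1^2 = lambda
$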
#example[Exponential][
Let $X ~"Exp"(lambda)$ with PDF
$
f(x) = cases(lambda e^(-lambda x) &"for" x > 0, 0 &"otherwise")
$
Find the MGF of $X$
$
M_X (t) &= integral^infinity_(-infinity) e^(t x) dot lambda e^(-lambda x) dif x \
&= lambda integral_0^infinity e^((t-lambda) x) dif x \
&= lambda lim_(b->infinity) integral_0^b e^((t - lambda) x) dif x \
$
This integral depends on $t$, so we should consider three cases. If $t =
lambda$, the integrand is $e^0 = 1$ and the integral diverges.
If $t != lambda$,
$
E[e^(t X)] = lambda lim_(b->infinity) integral_0^b e^((t - lambda) x) dif x = lambda lim_(b -> infinity) lr([e^((t - lambda) x) / (t - lambda)])_(x=0)^(x=b) \
= lambda lim_(b -> infinity) (e^((t - lambda) b) - 1) / (t - lambda) = cases(infinity &"if" t > lambda, lambda/(lambda - t) &"if" t < lambda)
$
Combining with the $t = lambda$ case,
$
lambda lim_(b -> infinity) (e^((t - lambda) b) - 1) / (t - lambda) = cases(infinity &"if" t >= lambda, lambda/(lambda - t) &"if" t < lambda)
$
]
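For $t < lambda$ the MGF is finite, and differentiating it recovers the familiar mean (a quick added check):
$
mu_1 = lr((dif)/(dif t) lambda / (lambda - t) |)_(t=0) = lr(lambda / (lambda - t)^2 |)_(t=0) = 1 / lambda
$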
#example[Alternative parameterization of the exponential][
Consider $X ~ "Exp"(beta)$ with PDF
$
f(x) = cases(1/beta e^(-x/beta) &"for" x > 0, 0 &"otherwise")
$
and proceed as usual
$
M_X (t) = integral_(-infinity)^infinity e^(t x) dot 1 / beta e^(-x / beta) dif x = 1 / beta lim_(b->infinity) lr([e^((t - 1 / beta) x) / (t - 1 / beta)])_(x=0)^(x=b) = 1 / (1 - beta t)
$
So it's a geometric series (for $|beta t| < 1$)
$
1 + beta t + (beta t)^2 + dots.c \
$
Rewrite each term, multiplying the $n^"th"$ term by $n! \/ n!$:
$
= 1 + beta t + 2 beta^2 (t^2 / 2!) + 6 beta^3 (t^3 / 3!) + dots.c
$
Recall that the coefficient of each $t^k / k!$ is $mu_k$. So
- $E[X] = beta$
- $E[X^2] = 2 beta^2$
- $E[X^3] = 6 beta^3$
$
"Var"(X) = E[X^2] - (E[X])^2 = beta^2
$
]
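Note that setting $beta = 1 / lambda$ recovers the earlier parameterization, and these moments agree with it: $E[X] = 1 / lambda$ and $"Var"(X) = 1 / lambda^2$.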
#example[Uniform on $[0,1]$][
Let $X ~ U(0,1)$, then
$
M_X (t) &= integral_0^1 e^(t x) dot 1 dif x \
&= lr(e^(t x)/t |)_(x=0)^(x=1) = (e^t - 1) / t \
&= (cancel(1) + t + t^2 / 2! + t^3 / 3! + dots.c - cancel(1)) / t \
&= 1 + t / 2! + t^2 / 3! + t^3 / 4! + dots.c \
&= 1 + 1 / 2 t + 1 / 3 (t^2 / 2!) + 1 / 4(t^3 / 3!) + dots.c
$
So
- $E[X] = 1 / 2$
- $E[X^2] = 1 / 3$
- $E[X^n] = 1 / (n + 1)$
]
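Combining the first two moments gives the well-known variance of the standard uniform (a quick added check):
$
"Var"(X) = E[X^2] - (E[X])^2 = 1 / 3 - 1 / 4 = 1 / 12
$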
== Properties of the MGF
#definition[
Random variables $X$ and $Y$ are *equal in distribution* if $P(X in B) = P(Y in
B)$ for all subsets $B$ of $RR$.
]
#abuse[
Abbreviate this by $X eq.delta Y$
]
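This is the sense in which the MGF characterizes the distribution: if $M_X (t) = M_Y (t)$ for all $t$ in an open interval around $0$, then $X eq.delta Y$.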
#example[Normal distribution][
Let $Z ~ N(0,1)$. Then
$
E[e^(t Z)] &= 1 / sqrt(2 pi) integral^infinity_(-infinity) e^(-1 / 2 z^2 + t z - 1 / 2 t^2 + 1 / 2 t^2) dif z \
&= e^(t^2 / 2) dot 1 / sqrt(2 pi) integral_(-infinity)^infinity e^(-1 / 2 (z - t)^2) dif z = e^(t^2 / 2)
$
where we completed the square; the remaining factor equals $1$ since it
integrates a $N(t, 1)$ density.
To get the MGF of a general normal RV $X ~ N(mu, sigma^2)$, write
$
X = sigma Z + mu
$
so that
$
E[e^(t (sigma Z + mu))] = e^(t mu) E[e^(t sigma Z)] = e^(t mu) dot e^((t^2 sigma^2) / 2) = exp(mu t + (sigma^2 t^2) / 2)
$
]
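As a quick added consistency check: since $M'_X (t) = (mu + sigma^2 t) M_X (t)$,
$
mu_1 = M'_X (0) = mu, quad mu_2 = M''_X (0) = sigma^2 + mu^2, quad "Var"(X) = sigma^2
$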
== Joint distributions of RV
Now we look at multiple random variables jointly. If $X$ and $Y$ are both random
variables defined on $Omega$, treat them as coordinates of a 2-dimensional
random vector. It's a vector-valued function on $Omega$,
$
Omega -> RR^2
$
This is valid in both the discrete and continuous cases.
#example[
$
(X,Y)
$
1. Poker hand: $X$ is the number of face cards, $Y$ is the number of red cards.
2. Demographic info: $X$ = height, $Y$ = weight
]
In general, we consider $n$ random variables jointly, where
$
X_1, X_2, ..., X_n
$
defined on $Omega$ are the coordinates of an $n$-dimensional random vector that
maps outcomes to $RR^n$.
The probability distribution of $(X_1, dots.c, X_n)$ is now $P((X_1, dots.c,
X_n) in B)$ where $B$ ranges over subsets of $RR^n$ (the power set of $RR^n$).
The probability distribution of the random vector is called the _joint
distribution_.
#fact[
Let $X$ and $Y$ both be discrete random variables defined on the same $Omega$.
Then, the joint PMF is
$
P(X = x, Y = y) = p_(X,Y) (x,y)
$
where $p_(X,Y) (x,y) >= 0$ for all possible values $x,y$ of $X$ and $Y$
respectively.
] And,
$
sum_(x in X) sum_(y in Y) p_(X,Y) (x,y) = 1
$
#definition[
Let $X_1, X_2, ..., X_n$ be discrete random variables defined on $Omega$;
their joint PMF is given by
$
p(k_1, k_2, ..., k_n) = P(X_1 = k_1, X_2 = k_2, ..., X_n = k_n)
$
for all possible values $k_1, ..., k_n$ of $X_1, ..., X_n$.
]
#fact[
The joint probability in set notation:
$
P(X_1 = k_1, X_2 = k_2, ..., X_n = k_n) = P({X_1 = k_1} sect {X_2 = k_2} sect dots.c sect {X_n = k_n})
$
The joint PMF has the same properties as the single-variable PMF:
$
p_(X_1, X_2, ..., X_n) (k_1, k_2, ..., k_n) >= 0
$
]
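To make the definitions concrete, here is a small added illustration.
#example[
Flip two fair coins. Let $X$ be the number of heads on the first flip and $Y$
the total number of heads. Then
$
p_(X,Y) (0,0) = p_(X,Y) (0,1) = p_(X,Y) (1,1) = p_(X,Y) (1,2) = 1 / 4
$
and $p_(X,Y) (x,y) = 0$ otherwise; the four probabilities sum to $1$ as
required.
]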