auto-update(nvim): 2025-02-19 16:07:50

Youwen Wu 2025-02-19 16:07:50 -08:00
parent 69878532bb
commit 8ade952fdf
Signed by: youwen5
GPG key ID: 865658ED1FE61EC3

@@ -1,5 +1,4 @@
#import "@youwen/zen:0.1.0": *
#import "@preview/ctheorems:1.1.3": *
#show: zen.with(
title: "PSTAT120A Course Notes",
@@ -1993,8 +1992,8 @@ Previously we discussed "raw moments." Be careful not to confuse them with
_central moments_.
#fact[
-The $n^"th"$ central moment of a discrete random variable $X$ with p.m.f. p_X
-(x) is the expected value of the difference about the mean raised to the
+The $n^"th"$ central moment of a discrete random variable $X$ with p.m.f. $p_X
+(x)$ is the expected value of the difference about the mean raised to the
$n^"th"$ power
$
@@ -2016,6 +2015,8 @@ $
mu'_2 = E[(X-mu)^2] = sigma^2_X = "Var"(X)
$
Effectively we're centering our distribution first.
#example[
Let $Y$ be a uniformly chosen integer from ${0,1,2,...,m}$. Find the first and
second moment of $Y$.
@@ -2103,3 +2104,282 @@ indicator of where the center of the distribution lies.
The median reflects the fact that 90% of the values and of the probability mass
lie in the range $1,2,...,9$, while the mean is heavily influenced by the
$-100$ value.
]
= President's Day lecture
...
= Lecture #datetime(day: 19, month: 2, year: 2025).display()
== Moment generating functions
Like the CDF, the moment generating function also completely characterizes the
distribution. That is, if you can find the MGF, it tells you all of the
information about the distribution. So it is an alternative way to characterize
a random variable.
They are "easy" to use for finding the distribution of:
- sums of independent random variables
- the limit of a sequence of random variables
#definition[
Let $X$ be a random variable with all finite moments
$
E[X^k] = mu_k, k = 1,2,...
$
Then the *moment generating function* of a random variable $X$ is defined by
$M_X (t) = E[e^(t X)]$, for the real variable $t$.
]
All of the moments must be defined for the MGF to exist. The MGF looks like
$
sum_("all" x) e^(t x) p(x)
$
in the discrete case, and
$
integral^infinity_(-infinity) e^(t x) f(x) dif x
$
in the continuous case.
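For instance (a quick added check): if $X ~ "Bernoulli"(p)$ with $q = 1 - p$, the sum has only two terms:
$
M_X (t) = e^(t dot 0) q + e^(t dot 1) p = q + p e^t
$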
#proposition[
It holds that the $n^"th"$ derivative of $M$ evaluated at 0 gives the
$n^"th"$ moment.
$
M_X^((n)) (0) = E[X^n]
$
]
#proof[
$
M_X (t) &equiv E[e^(t X)] = E[1 + t X + (t X)^2 / 2! + dots.c] \
&= E[1] + E[t X] + E[(t^2 X^2) / 2!] + dots.c \
&= E[1] + t E[X] + t^2 / 2! E[X^2] + dots.c \
&= 1 + t / 1! mu_1 + t^2 / 2! mu_2 + dots.c
$
The coefficient of $t^k/k!$ in the Taylor series expansion of $M_X (t)$ is the
$k^"th"$ moment. So an alternative way to get $mu_k$ is
$
mu_k = lr(((dif^k M(t))/(dif t^k)) |)_(t=0) = "coefficient of" t^k / k!
$
]
#example[Binomial][
Let $X ~ "Bin"(n,p)$. Then the MGF of $X$ is given by
$
sum_(k=0)^n e^(t k) vec(n,k) p^k q^(n-k) = sum_(k=0)^n vec(n,k) underbrace((p e^t)^k, a^k) underbrace(q^(n-k), b^(n-k))
$
Applying the binomial theorem
$
(a + b)^n = sum_(k=0)^n vec(n,k) a^k b^(n-k)
$
with $a = p e^t$ and $b = q$, we have
$
M_X (t) = (q + p e^t)^n
$
Let's find the first moment (using $q + p = 1$):
$
mu_1 = lr((dif M(t))/(dif t) |)_(t=0) = lr(n(q + p e^t)^(n-1) p e^t |)_(t=0) \
= n p
$
The second moment:
$
mu_2 = lr((dif^2 M(t))/(dif t^2) |)_(t=0) \
= n(n-1) p^2 + n p
$
For example, if $X$ has MGF $(1/3 + 2/3 e^t)^(10)$, then $X ~ "Bin"(10, 2/3)$.
]
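As a quick added check, these two moments recover the familiar binomial variance:
$
"Var"(X) = mu_2 - mu_1^2 = n(n-1)p^2 + n p - n^2 p^2 = n p(1 - p) = n p q
$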
#example[Poisson][
Let $X ~ "Pois"(lambda)$. Then the MGF of $X$ is given by
$
M_X (t) &= E[e^(t X)] \
&= sum^infinity_(x=0) e^(t x) e^(-lambda) lambda^x / x! \
&= e^(-lambda) sum^infinity_(x=0) (lambda e^t)^x / x!
$
Note: $e^a = sum_(x=0)^infinity a^x / x!$, so with $a = lambda e^t$,
$
= e^(-lambda) e^(lambda e^t) \
= e^(-lambda (1 - e^t))
$
Then the first moment is
$
mu_1 = lr(e^(-lambda (1 - e^t)) (-lambda) (-e^t) |)_(t=0) = lambda
$
]
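Differentiating once more gives the second moment and, as a quick added check, the familiar Poisson variance:
$
mu_2 = lr((dif)/(dif t) [lambda e^t e^(-lambda (1 - e^t))] |)_(t=0) = lambda + lambda^2, quad "Var"(X) = mu_2 - mu_1^2 = lambda
$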
#example[Exponential][
Let $X ~"Exp"(lambda)$ with PDF
$
f(x) = cases(lambda e^(-lambda x) &"for" x > 0, 0 &"otherwise")
$
Find the MGF of $X$
$
M_X (t) &= integral^infinity_(-infinity) e^(t x) dot lambda e^(-lambda x) dif x \
&= lambda integral_0^infinity e^((t-lambda) x) dif x \
&= lambda lim_(b->infinity) integral_0^b e^((t - lambda) x) dif x \
$
This integral depends on $t$, so we should consider three cases. If $t =
lambda$, the integrand is $e^0 = 1$ and the integral diverges.
If $t != lambda$,
$
E[e^(t X)] = lambda lim_(b->infinity) integral_0^b e^((t - lambda) x) dif x = lambda lim_(b -> infinity) lr([e^((t - lambda) x) / (t - lambda)])_(x=0)^(x=b) \
= lambda lim_(b -> infinity) (e^((t - lambda) b) - 1) / (t - lambda) = cases(infinity &"if" t > lambda, lambda/(lambda - t) &"if" t < lambda)
$
Combining with the $t = lambda$ case,
$
lambda lim_(b -> infinity) (e^((t - lambda) b) - 1) / (t - lambda) = cases(infinity &"if" t >= lambda, lambda/(lambda - t) &"if" t < lambda)
$
]
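For $t < lambda$ the MGF is finite, and differentiating it recovers the familiar mean (a quick added check):
$
mu_1 = lr((dif)/(dif t) lambda / (lambda - t) |)_(t=0) = lr(lambda / (lambda - t)^2 |)_(t=0) = 1 / lambda
$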
#example[Alternative parameterization of the exponential][
Consider $X ~ "Exp"(beta)$ with PDF
$
f(x) = cases(1/beta e^(-x/beta) &"for" x > 0, 0 &"otherwise")
$
and proceed as usual
$
M_X (t) = integral_(-infinity)^infinity e^(t x) dot 1 / beta e^(-x / beta) dif x = 1 / beta lim_(b->infinity) lr([e^((t - 1 / beta) x) / (t - 1 / beta)])_(x=0)^(x=b) = 1 / (1 - beta t)
$
So it's a geometric series (for $|beta t| < 1$)
$
1 + beta t + (beta t)^2 + dots.c \
$
Rewrite each term, multiplying the $n^"th"$ term by $n! \/ n!$:
$
= 1 + beta t + 2 beta^2 (t^2 / 2!) + 6 beta^3 (t^3 / 3!) + dots.c
$
Recall that the coefficient of each $t^k / k!$ is $mu_k$. So
- $E[X] = beta$
- $E[X^2] = 2 beta^2$
- $E[X^3] = 6 beta^3$
$
"Var"(X) = E[X^2] - (E[X])^2 = beta^2
$
]
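Note that setting $beta = 1 / lambda$ recovers the earlier parameterization, and these moments agree with it: $E[X] = 1 / lambda$ and $"Var"(X) = 1 / lambda^2$.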
#example[Uniform on $[0,1]$][
Let $X ~ U(0,1)$, then
$
M_X (t) &= integral_0^1 e^(t x) dot 1 dif x \
&= lr(e^(t x)/t |)_(x=0)^(x=1) = (e^t - 1) / t \
&= (cancel(1) + t + t^2 / 2! + t^3 / 3! + dots.c - cancel(1)) / t \
&= 1 + t / 2! + t^2 / 3! + t^3 / 4! + dots.c \
&= 1 + 1 / 2 t + 1 / 3 (t^2 / 2!) + 1 / 4(t^3 / 3!) + dots.c
$
So
- $E[X] = 1 / 2$
- $E[X^2] = 1 / 3$
- $E[X^n] = 1 / (n + 1)$
]
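Combining the first two moments gives the well-known variance of the standard uniform (a quick added check):
$
"Var"(X) = E[X^2] - (E[X])^2 = 1 / 3 - 1 / 4 = 1 / 12
$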
== Properties of the MGF
#definition[
Random variables $X$ and $Y$ are *equal in distribution* if $P(X in B) = P(Y in
B)$ for all subsets $B$ of $RR$.
]
#abuse[
Abbreviate this by $X eq.delta Y$
]
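This is the sense in which the MGF characterizes the distribution: if $M_X (t) = M_Y (t)$ for all $t$ in an open interval around $0$, then $X eq.delta Y$.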
#example[Normal distribution][
Let $Z ~ N(0,1)$. Then
$
E[e^(t Z)] &= 1 / sqrt(2 pi) integral^infinity_(-infinity) e^(-1 / 2 z^2 + t z - 1 / 2 t^2 + 1 / 2 t^2) dif z \
&= e^(t^2 / 2) dot 1 / sqrt(2 pi) integral_(-infinity)^infinity e^(-1 / 2 (z - t)^2) dif z = e^(t^2 / 2)
$
where we completed the square; the remaining factor equals $1$ since it
integrates a $N(t, 1)$ density.
To get the MGF of a general normal RV $X ~ N(mu, sigma^2)$, write
$
X = sigma Z + mu
$
so that
$
E[e^(t (sigma Z + mu))] = e^(t mu) E[e^(t sigma Z)] = e^(t mu) dot e^((t^2 sigma^2) / 2) = exp(mu t + (sigma^2 t^2) / 2)
$
]
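As a quick added consistency check: since $M'_X (t) = (mu + sigma^2 t) M_X (t)$,
$
mu_1 = M'_X (0) = mu, quad mu_2 = M''_X (0) = sigma^2 + mu^2, quad "Var"(X) = sigma^2
$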
== Joint distributions of RV
Now we look at multiple random variables jointly. If $X$ and $Y$ are both random
variables defined on $Omega$, treat them as coordinates of a 2-dimensional
random vector. It's a vector-valued function on $Omega$,
$
Omega -> RR^2
$
This is valid in both the discrete and continuous cases.
#example[
$
(X,Y)
$
1. Poker hand: $X$ is the number of face cards, $Y$ is the number of red cards.
2. Demographic info: $X$ = height, $Y$ = weight
]
In general, we consider $n$ random variables jointly, where
$
X_1, X_2, ..., X_n
$
defined on $Omega$ are the coordinates of an $n$-dimensional random vector that
maps outcomes to $RR^n$.
The probability distribution of $(X_1, dots.c, X_n)$ is now $P((X_1, dots.c,
X_n) in B)$ where $B$ ranges over subsets of $RR^n$ (the power set of $RR^n$).
The probability distribution of the random vector is called the _joint
distribution_.
#fact[
Let $X$ and $Y$ both be discrete random variables defined on the same $Omega$.
Then, the joint PMF is
$
P(X = x, Y = y) = p_(X,Y) (x,y)
$
where $p_(X,Y) (x,y) >= 0$ for all possible values $x,y$ of $X$ and $Y$
respectively.
] And,
$
sum_(x in X) sum_(y in Y) p_(X,Y) (x,y) = 1
$
#definition[
Let $X_1, X_2, ..., X_n$ be discrete random variables defined on $Omega$;
their joint PMF is given by
$
p(k_1, k_2, ..., k_n) = P(X_1 = k_1, X_2 = k_2, ..., X_n = k_n)
$
for all possible values $k_1, ..., k_n$ of $X_1, ..., X_n$.
]
#fact[
The joint probability in set notation:
$
P(X_1 = k_1, X_2 = k_2, ..., X_n = k_n) = P({X_1 = k_1} sect {X_2 = k_2} sect dots.c sect {X_n = k_n})
$
The joint PMF has the same properties as the single-variable PMF:
$
p_(X_1, X_2, ..., X_n) (k_1, k_2, ..., k_n) >= 0
$
]
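To make the definitions concrete, here is a small added illustration.
#example[
Flip two fair coins. Let $X$ be the number of heads on the first flip and $Y$
the total number of heads. Then
$
p_(X,Y) (0,0) = p_(X,Y) (0,1) = p_(X,Y) (1,1) = p_(X,Y) (1,2) = 1 / 4
$
and $p_(X,Y) (x,y) = 0$ otherwise; the four probabilities sum to $1$ as
required.
]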