diff --git a/documents/by-course/pstat-120a/course-notes/main.typ b/documents/by-course/pstat-120a/course-notes/main.typ
index 126e0d1..b4c361e 100644
--- a/documents/by-course/pstat-120a/course-notes/main.typ
+++ b/documents/by-course/pstat-120a/course-notes/main.typ
@@ -1,5 +1,4 @@
 #import "@youwen/zen:0.1.0": *
-#import "@preview/ctheorems:1.1.3": *

 #show: zen.with(
   title: "PSTAT120A Course Notes",
@@ -1993,8 +1992,8 @@
 Previously we discussed "raw moments." Be careful not to confuse them with
 _central moments_.

 #fact[
-  The $n^"th"$ central moment of a discrete random variable $X$ with p.m.f. p_X
-  (x) is the expected value of the difference about the mean raised to the
+  The $n^"th"$ central moment of a discrete random variable $X$ with p.m.f. $p_X
+  (x)$ is the expected value of the difference about the mean raised to the
   $n^"th"$ power

   $
@@ -2016,6 +2015,8 @@
 $
   mu'_2 = E[(X-mu)^2] = sigma^2_X = "Var"(X)
 $

+Effectively, we center the distribution first.
+
 #example[
   Let $Y$ be a uniformly chosen integer from ${0,1,2,...,m}$. Find the first
   and second moment of $Y$.
@@ -2103,3 +2104,282 @@
 indicator of where the center of the distribution lies. The median reflects
 the fact that 90% of the values and probability is in the range $1,2,...,9$
 while the mean is heavily influenced by the $-100$ value.
 ]

= President's Day lecture

...

= Lecture #datetime(day: 19, month: 2, year: 2025).display()

== Moment generating functions

Like the CDF, the moment generating function completely characterizes the
distribution: if you can find the MGF, it gives you all of the information
about the distribution. So it is an alternative way to characterize a random
variable.

MGFs are "easy" to use for finding:

- the distributions of sums of independent random variables
- the distribution of the limit of a sequence of random variables

#definition[
  Let $X$ be a random variable with all moments finite,
  $
    E[X^k] = mu_k, quad k = 1,2,...
  $
  Then the *moment generating function* of $X$ is defined by
  $M_X (t) = E[e^(t X)]$, for the real variable $t$.
]

All of the moments must be defined for the MGF to exist. The MGF looks like

$
  sum_("all" x) e^(t x) p(x)
$
in the discrete case, and
$
  integral^infinity_(-infinity) e^(t x) f(x) dif x
$
in the continuous case.

#proposition[
  It holds that the $n^"th"$ derivative of $M$ evaluated at 0 gives the
  $n^"th"$ moment.
  $
    M_X^((n)) (0) = E[X^n]
  $
]

#proof[
  $
    M_X (t) &equiv E[e^(t X)] = E[1 + (t X) + (t X)^2 / 2! + dots.c] \
    &= E[1] + E[t X] + E[(t^2 X^2) / 2!] + dots.c \
    &= E[1] + t E[X] + t^2 / 2! E[X^2] + dots.c \
    &= 1 + t / 1! mu_1 + t^2 / 2! mu_2 + dots.c
  $

  The coefficient of $t^k / k!$ in the Taylor series expansion of $M_X (t)$ is
  the $k^"th"$ moment. So an alternative way to get $mu_k$ is

  $
    mu_k = lr(((dif^k M(t)) / (dif t^k)) |)_(t=0) = "coefficient of" t^k / k!
  $
]
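As a quick sanity check of this proposition, here is a minimal worked case
(the Bernoulli setup is our own, chosen for illustration):

#example[Bernoulli][
  Let $X ~ "Bernoulli"(p)$ and $q = 1 - p$. Then
  $
    M_X (t) = E[e^(t X)] = q e^(t dot 0) + p e^(t dot 1) = q + p e^t
  $
  so
  $
    M_X^((1)) (0) = lr(p e^t |)_(t=0) = p, quad M_X^((2)) (0) = lr(p e^t |)_(t=0) = p
  $
  which agrees with the direct computation $E[X^n] = 0^n dot q + 1^n dot p = p$
  for every $n >= 1$.
]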
#example[Binomial][
  Let $X ~ "Bin"(n,p)$. Then the MGF of $X$ is given by

  $
    M_X (t) = sum_(k=0)^n e^(t k) vec(n,k) p^k q^(n-k) = sum_(k=0)^n vec(n,k) underbrace((p e^t)^k, a^k) underbrace(q^(n-k), b^(n-k))
  $

  Applying the binomial theorem

  $
    (a + b)^n = sum_(k=0)^n vec(n,k) a^k b^(n-k)
  $

  with $a = p e^t$ and $b = q$, we have

  $
    M_X (t) = (q + p e^t)^n
  $

  Let's find the first moment:

  $
    mu_1 = lr((dif M(t)) / (dif t) |)_(t=0) = lr(n (q + p e^t)^(n-1) p e^t |)_(t=0) = n p
  $

  The second moment:

  $
    mu_2 = lr((dif^2 M(t)) / (dif t^2) |)_(t=0) = n(n-1) p^2 + n p
  $

  For example, if $X$ has MGF $(1/3 + 2/3 e^t)^(10)$, then $X ~ "Bin"(10, 2/3)$.
]

#example[Poisson][
  Let $X ~ "Pois"(lambda)$. Then the MGF of $X$ is given by
  $
    M_X (t) = E[e^(t X)] &= sum^infinity_(x=0) e^(t x) e^(-lambda) lambda^x / x! \
    &= e^(-lambda) sum^infinity_(x=0) (lambda e^t)^x / x!
  $
  Note: $e^a = sum_(x=0)^infinity a^x / x!$, here with $a = lambda e^t$. So
  $
    M_X (t) = e^(-lambda) e^(lambda e^t) = e^(-lambda (1 - e^t))
  $
  Then, the first moment can be found by
  $
    mu_1 = lr(e^(-lambda (1 - e^t)) lambda e^t |)_(t=0) = lambda
  $
]

#example[Exponential][
  Let $X ~ "Exp"(lambda)$ with PDF
  $
    f(x) = cases(lambda e^(-lambda x) &"for" x > 0, 0 &"otherwise")
  $
  Find the MGF of $X$:

  $
    M_X (t) &= integral^infinity_(-infinity) e^(t x) dot lambda e^(-lambda x) dif x \
    &= lambda integral_0^infinity e^((t - lambda) x) dif x \
    &= lambda lim_(b->infinity) integral_0^b e^((t - lambda) x) dif x
  $
  This integral depends on $t$, so we should consider the cases $t = lambda$,
  $t > lambda$, and $t < lambda$. If $t = lambda$, the integrand is 1 and the
  integral diverges.

  If $t != lambda$,
  $
    E[e^(t X)] = lambda lim_(b->infinity) integral_0^b e^((t - lambda) x) dif x = lambda lim_(b -> infinity) [e^((t - lambda) x) / (t - lambda)]^(x=b)_(x=0) \
    = lambda lim_(b -> infinity) (e^((t - lambda) b) - 1) / (t - lambda) = cases(infinity &"if" t > lambda, lambda/(lambda - t) &"if" t < lambda)
  $
  Combining with the $t = lambda$ case,

  $
    M_X (t) = cases(infinity &"if" t >= lambda, lambda/(lambda - t) &"if" t < lambda)
  $
]

#example[Alternative parameterization of the exponential][
  Consider $X ~ "Exp"(beta)$ with PDF
  $
    f(x) = cases(1/beta e^(-x/beta) &"for" x > 0, 0 &"otherwise")
  $
  and proceed as usual. For $t < 1/beta$,
  $
    M_X (t) = integral_(-infinity)^infinity e^(t x) dot 1 / beta e^(-x / beta) dif x = 1 / beta lim_(b->infinity) [e^((t - 1 / beta) x) / (t - 1 / beta)]_(x=0)^(x=b) = 1 / (1 - beta t)
  $
  So it's a geometric series:
  $
    1 / (1 - beta t) = 1 + beta t + (beta t)^2 + dots.c
  $
  Multiplying and dividing the $n^"th"$ term by $n!$,
  $
    = 1 + beta t + 2 beta^2 (t^2 / 2!) + 6 beta^3 (t^3 / 3!) + dots.c
  $
  Recall that the coefficient of each $t^k / k!$ is $mu_k$. So
  - $E[X] = beta$
  - $E[X^2] = 2 beta^2$
  - $E[X^3] = 6 beta^3$
  $
    "Var"(X) = E[X^2] - (E[X])^2 = beta^2
  $
]

#example[Uniform on $[0,1]$][
  Let $X ~ U(0,1)$, then
  $
    M_X (t) &= integral_0^1 e^(t x) dot 1 dif x \
    &= lr(e^(t x)/t |)_(x=0)^(x=1) = (e^t - 1) / t \
    &= (cancel(1) + t + t^2 / 2! + t^3 / 3! + dots.c - cancel(1)) / t \
    &= 1 + t / 2! + t^2 / 3! + t^3 / 4! + dots.c \
    &= 1 + 1 / 2 t + 1 / 3 (t^2 / 2!) + 1 / 4 (t^3 / 3!) + dots.c
  $
  So
  - $E[X] = 1 / 2$
  - $E[X^2] = 1 / 3$
  - $E[X^n] = 1 / (n + 1)$
]

== Properties of the MGF

#definition[
  Random variables $X$ and $Y$ are equal in distribution if $P(X in B) = P(Y in B)$
  for all subsets $B$ of $RR$.
]

#abuse[
  Abbreviate this by $X eq.delta Y$.
]
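This definition gives the precise sense in which the MGF "completely
characterizes the distribution," as claimed at the start of the lecture. The
following uniqueness fact records it (stated here without proof):

#fact[
  If $M_X (t) = M_Y (t) < infinity$ for all $t$ in some interval
  $(-delta, delta)$ with $delta > 0$, then $X eq.delta Y$.
]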
#example[Normal distribution][
  Let $Z ~ N(0,1)$. Then, adding and subtracting $1/2 t^2$ in the exponent to
  complete the square,
  $
    E[e^(t Z)] = 1 / sqrt(2 pi) integral^infinity_(-infinity) e^(-1 / 2 z^2 + t z - 1 / 2 t^2 + 1 / 2 t^2) dif z
    = e^(t^2 / 2) 1 / sqrt(2 pi) integral_(-infinity)^infinity e^(-1 / 2 (z-t)^2) dif z = e^(t^2 / 2)
  $
  where the last equality holds because the integrand is the density of a
  $N(t,1)$ random variable, so the integral equals $sqrt(2 pi)$.

  To get the MGF for a general normal RV $X ~ N(mu, sigma^2)$, write
  $
    X = sigma Z + mu
  $
  Then we get
  $
    E[e^(t (sigma Z + mu))] = e^(t mu) E[e^(t sigma Z)] = e^(t mu) dot e^((t^2 sigma^2) / 2) = exp(mu t + (sigma^2 t^2) / 2)
  $
]

== Joint distributions of RVs

Now we consider multiple random variables jointly. If $X$ and $Y$ are both
random variables defined on $Omega$, treat them as the coordinates of a
2-dimensional random vector. It's a vector-valued function on $Omega$,

$
  Omega -> RR^2
$

This is valid in both the discrete and continuous cases.

#example[
  Some random vectors $(X,Y)$:
  1. Poker hand: $X$ is the number of face cards, $Y$ is the number of red cards.
  2. Demographic info: $X$ = height, $Y$ = weight.
]

In general, $n$ random variables
$
  X_1, X_2, ..., X_n
$
defined on $Omega$ are the coordinates of an $n$-dimensional random vector
that maps outcomes to $RR^n$.

The probability distribution of $(X_1, dots.c, X_n)$ is now $P((X_1, dots.c,
X_n) in B)$, where $B$ ranges over subsets of $RR^n$ (the power set of
$RR^n$).

The probability distribution of the random vector is called the _joint
distribution_.

#fact[
  Let $X$ and $Y$ both be discrete random variables defined on the same
  $Omega$. Then the joint PMF is
  $
    p_(X,Y) (x,y) = P(X = x, Y = y)
  $
  where $p_(X,Y) (x,y) >= 0$ for all possible values $x,y$ of $X$ and $Y$
  respectively, and
  $
    sum_(x in X) sum_(y in Y) p_(X,Y) (x,y) = 1
  $
]

#definition[
  Let $X_1, X_2, ..., X_n$ be discrete random variables defined on $Omega$.
  Their joint PMF is given by
  $
    p(k_1, k_2, ..., k_n) = P(X_1 = k_1, X_2 = k_2, ..., X_n = k_n)
  $
  for all possible values $k_1, ..., k_n$ of $X_1, ..., X_n$.
]

#fact[
  The joint probability in set notation:
  $
    P(X_1 = k_1, X_2 = k_2, ..., X_n = k_n) = P({X_1 = k_1} sect {X_2 = k_2} sect dots.c sect {X_n = k_n})
  $
  The joint PMF has the same properties as the single-variable PMF:
  $
    p_(X_1, X_2, ..., X_n) (k_1, k_2, ..., k_n) >= 0
  $
]
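To see these properties in a concrete case, here is a small worked joint PMF
(the coin-flip setup is our own illustration):

#example[
  Flip a fair coin twice; let $X$ be the number of heads on the first flip and
  $Y$ the total number of heads. Each of the four outcomes $T T$, $T H$,
  $H T$, $H H$ has probability $1 / 4$, so
  $
    p_(X,Y) (0,0) = p_(X,Y) (0,1) = p_(X,Y) (1,1) = p_(X,Y) (1,2) = 1 / 4
  $
  and $p_(X,Y) (x,y) = 0$ for every other pair. All values are nonnegative and
  $
    sum_x sum_y p_(X,Y) (x,y) = 4 dot 1 / 4 = 1
  $
  as required.
]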