From b329bd3a23e9c6e86134652b201fb88ab3cfbd71 Mon Sep 17 00:00:00 2001
From: Youwen Wu
Date: Mon, 3 Feb 2025 15:15:35 -0800
Subject: [PATCH] auto-update(nvim): 2025-02-03 15:15:35

---
 .../pstat-120a/course-notes/main.typ | 270 +++++++++++++++++-
 1 file changed, 268 insertions(+), 2 deletions(-)

diff --git a/documents/by-course/pstat-120a/course-notes/main.typ b/documents/by-course/pstat-120a/course-notes/main.typ
index fa65682..006a512 100644
--- a/documents/by-course/pstat-120a/course-notes/main.typ
+++ b/documents/by-course/pstat-120a/course-notes/main.typ
@@ -365,9 +365,9 @@ This is mostly a formal manipulation to derive the obviously true proposition fr
 $
 ]
 
-#proposition[
+#proposition("Inclusion-exclusion principle")[
 $ P(A union B) = P(A) + P(B) - P(A sect B) $
-]
+]
 
 #proof[
 $
@@ -1143,3 +1143,269 @@ exactly one sequence that gives us success.
 (5 / 6)^7
 $
 ]
+
+= Lecture #datetime(day: 3, month: 2, year: 2025).display()
+
+== CDFs, PMFs, PDFs
+
+Properties of a CDF:
+
+Any CDF $F(x) = P(X <= x)$ satisfies
+
+1. $F(-infinity) = 0$, $F(infinity) = 1$
+2. $F(x)$ is non-decreasing in $x$:
+$ s < t => F(s) <= F(t) $
+3. $P(a < X <= b) = P(X <= b) - P(X <= a) = F(b) - F(a)$
+
+#example[
+  Let $X$ be a continuous random variable with density (pdf)
+
+  $
+  f(x) = cases(
+c x^2 &"for" 0 < x < 2,
+0 &"otherwise"
+)
+  $
+
+  1. What is $c$?
+
+  $c$ is such that
+  $
+  1 = integral^infinity_(-infinity) f(x) dif x = integral_0^2 c x^2 dif x = (c x^3) / 3 |_0^2 = (8 c) / 3
+  $
+  so $c = 3/8$.
+
+  2. Find the probability that $X$ is between 1 and 1.4.
+
+  Integrate the curve between 1 and 1.4.
+
+  $
+  integral_1^1.4 3 / 8 x^2 dif x = (x^3 / 8) |_1^1.4 \
+  = 0.218
+  $
+
+  This is the probability that $X$ lies between 1 and 1.4.
+
+  3. Find the probability that $X$ is between 1 and 3.
+
+  Idea: integrate between 1 and 3, being careful that the density is 0 past 2.
+
+  $ integral^2_1 3 / 8 x^2 dif x + integral_2^3 0 dif x = (x^3 / 8) |_1^2 = 7 / 8 $
+
+  4. What is the CDF $F(x) = P(X <= x)$? Integrate the curve up to $x$.
+
+  $
+  F(x) = P(X <= x) = integral_(-infinity)^x f(t) dif t \
+  = integral_0^x 3 / 8 t^2 dif t \
+  = x^3 / 8
+  $
+
+  Important: include the range!
+
+  $
+  F(x) = cases(
+  0 &"for" x <= 0,
+  x^3/8 &"for" 0 < x < 2,
+  1 &"for" x >= 2
+  )
+  $
+
+  5. Find a point $a$ such that integrating up to $a$ gives exactly $1/2$
+  the area.
+
+  We want $a$ such that $1/2 = P(X <= a)$.
+
+  $ 1 / 2 = P(X <= a) = F(a) = a^3 / 8 => a = root(3, 4) $
+]
+
+== The (continuous) uniform distribution
+
+The simplest, and arguably the best, of the named distributions!
+
+#definition[
+  Let $[a,b]$ be a bounded interval on the real line. A random variable $X$ has the uniform distribution on the interval $[a,b]$ if $X$ has the density function
+
+  $
+  f(x) = cases(
+1/(b-a) &"for" x in [a,b],
+0 &"for" x in.not [a,b]
+)
+  $
+
+  Abbreviate this by $X ~ "Unif" [a,b]$.
+]
+
+The graph of $"Unif" [a,b]$ is a constant line at height $1/(b-a)$ defined
+across $[a,b]$. The integral is just the area of a rectangle, and we can check
+it is 1: $(b - a) dot 1/(b-a) = 1$.
+
+#fact[
+  For $X ~ "Unif" [a,b]$, its cumulative distribution function (CDF) is given by:
+
+  $
+  F_X (x) = cases(
+0 &"for" x < a,
+(x-a)/(b-a) &"for" x in [a,b],
+1 &"for" x > b
+)
+  $
+]
+
+#fact[
+  If $X ~ "Unif" [a,b]$, and $[c,d] subset [a,b]$, then
+  $
+  P(c <= X <= d) = integral_c^d 1 / (b-a) dif x = (d-c) / (b-a)
+  $
+]
+
+#example[
+  Let $Y$ be a uniform random variable on $[-2,5]$. Find the probability that its
+  absolute value is at least 1.
+
+  $Y$ takes values in the interval $[-2,5]$, so its absolute value is at least 1 if and only if $Y in [-2,-1] union [1,5]$.
+
+  The density function of $Y$ is $f(x) = 1/(5 - (-2)) = 1/7$ on $[-2,5]$ and 0 everywhere else.
+
+  So,
+
+  $
+  P(|Y| >= 1) &= P(Y in [-2,-1] union [1,5]) \
+  &= P(-2 <= Y <= -1) + P(1 <= Y <= 5) \
+  &= 1 / 7 + 4 / 7 = 5 / 7
+  $
+]
+
+== The exponential distribution
+
+The geometric distribution can be viewed as modeling waiting times in a discrete setting, i.e.
we wait for $n - 1$ failures to arrive at the $n^"th"$ success.
+
+The exponential distribution is the continuous analogue of the geometric
+distribution, in that we often use it to model waiting times in the continuous
+sense. For example, the time until the first customer enters the barber shop.
+
+#definition[
+  Let $0 < lambda < infinity$. A random variable $X$ has the exponential distribution with parameter $lambda$ if $X$ has PDF
+
+  $
+  f(x) = cases(
+  lambda e^(-lambda x) &"for" x >= 0,
+  0 &"for" x < 0
+  )
+  $
+
+  Abbreviate this by $X ~ "Exp"(lambda)$, the exponential distribution with rate $lambda$.
+
+  The CDF of the $"Exp"(lambda)$ distribution is given by:
+
+  $
+  F(t) = cases(
+  0 &"if" t < 0,
+  1 - e^(-lambda t) &"if" t >= 0
+  )
+  $
+]
+
+#example[
+  Suppose the length of a phone call, in minutes, is well modeled by an exponential random variable with rate $lambda = 1/10$.
+
+  1. What is the probability that a call takes more than 8 minutes?
+  2. What is the probability that a call takes between 8 and 22 minutes?
+
+  Let $X$ be the length of the phone call, so that $X ~ "Exp"(1/10)$. Then we can find the desired probability by:
+
+  $
+  P(X > 8) &= 1 - P(X <= 8) \
+  &= 1 - F_X (8) \
+  &= 1 - (1 - e^(-(1 / 10) dot 8)) \
+  &= e^(-8 / 10) approx 0.4493
+  $
+
+  Now to find $P(8 < X < 22)$, we can take the difference of the two survival probabilities:
+
+  $
+  &P(X > 8) - P(X > 22) \
+  &= e^(-8 / 10) - e^(-22 / 10) \
+  &approx 0.3385
+  $
+]
+
+#fact("Memoryless property of the exponential distribution")[
+  Suppose that $X ~ "Exp"(lambda)$. Then for any $s,t > 0$, we have
+  $
+  P(X > t + s | X > t) = P(X > s)
+  $
+]
+
+This says: if I have already waited 5 minutes for the bus, the probability that
+I end up waiting more than $5 + 3$ minutes in total is exactly the probability
+that I wait more than 3 minutes from now.
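As a quick numerical sanity check (this snippet is our own illustration, not part of the lecture; the names `survival` and `lam` are ours), the survival function $P(X > x) = e^(-lambda x)$ reproduces the phone-call answers above and exhibits the memoryless property:

```python
import math

def survival(x, lam):
    """P(X > x) for X ~ Exp(lam)."""
    return math.exp(-lam * x) if x >= 0 else 1.0

lam = 1 / 10  # rate from the phone-call example

# P(X > 8) = e^(-0.8)
p_over_8 = survival(8, lam)
print(round(p_over_8, 4))  # 0.4493

# P(8 < X < 22) = P(X > 8) - P(X > 22)
p_8_to_22 = survival(8, lam) - survival(22, lam)
print(round(p_8_to_22, 4))  # 0.3385

# Memoryless: P(X > t + s | X > t) = P(X > s), here with t = 5, s = 3
t, s = 5.0, 3.0
conditional = survival(t + s, lam) / survival(t, lam)
print(abs(conditional - survival(s, lam)) < 1e-12)  # True
```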
+
+#proof[
+  Since $t + s > t$, the event ${X > t + s}$ is contained in ${X > t}$, so their intersection is just ${X > t + s}$:
+  $
+  P(X > t + s | X > t) = (P(X > t + s sect X > t)) / (P(X > t)) \
+  = P(X > t + s) / P(X > t)
+  = e^(-lambda (t + s)) / (e^(-lambda t)) = e^(-lambda s) \
+  equiv P(X > s)
+  $
+]
+
+== Gamma distribution
+
+#definition[
+  Let $r, lambda > 0$. A random variable $X$ has the *gamma distribution* with parameters $(r, lambda)$ if $X$ is nonnegative and has probability density function
+
+  $
+  f(x) = cases(
+(lambda^r x^(r-1))/(Gamma(r)) e^(-lambda x) &"for" x >= 0,
+0 &"for" x < 0
+)
+  $
+
+  Abbreviate this by $X ~ "Gamma"(r, lambda)$.
+]
+
+The gamma function $Gamma(r)$ generalizes the factorial function and is defined as
+
+$
+  Gamma(r) = integral_0^infinity x^(r-1) e^(-x) dif x, "for" r > 0
+$
+
+Special case: $Gamma(n) = (n - 1)!$ if $n in ZZ^+$.
+
+#remark[
+  The $"Exp"(lambda)$ distribution is a special case of the gamma distribution,
+  with parameter $r = 1$.
+]
+
+== The normal (Gaussian) distribution
+
+#definition[
+  A random variable $Z$ has the *standard normal distribution* if $Z$ has
+  density function
+
+  $
+  phi(x) = 1 / sqrt(2 pi) e^(-x^2 / 2)
+  $
+  on the real line. Abbreviate this by $Z ~ N(0,1)$.
+]
+
+#fact("CDF of a standard normal random variable")[
+  Let $Z ~ N(0,1)$ be normally distributed. Then its CDF is given by
+  $
+  Phi(x) = integral_(-infinity)^x phi(s) dif s = integral_(-infinity)^x 1 / sqrt(2 pi) e^(-s^2 / 2) dif s
+  $
+]
+
+The normal distribution is so important that, instead of the generic $f_Z (x)$ and
+$F_Z (x)$, we use the special notation $phi(x)$ and $Phi(x)$.
+
+#fact[
+  $
+  integral_(-infinity)^infinity e^(-s^2 / 2) dif s = sqrt(2 pi)
+  $
+
+  No closed form of the standard normal CDF $Phi$ exists, so we are left to either:
+  - approximate
+  - use technology (calculator)
+  - use the standard normal probability table in the textbook
+]
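The "use technology" option is easy to illustrate. As one sketch (ours, not the textbook's), Python's `math.erf` yields the standard identity $Phi(x) = (1 + "erf"(x \/ sqrt(2))) \/ 2$, which we can check against familiar table values:

```python
import math

def Phi(x):
    """Standard normal CDF via the error function:
    Phi(x) = (1 + erf(x / sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Check against well-known values from the standard normal table.
print(Phi(0.0))                        # 0.5 (half the mass lies below 0)
print(round(Phi(1.0), 4))              # 0.8413
print(round(Phi(1.96), 4))             # 0.975
print(round(Phi(-1.0) + Phi(1.0), 4))  # 1.0, since Phi(-x) = 1 - Phi(x)
```

This is the same approximation route a calculator or the textbook's table takes; only the delivery differs.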