auto-update(nvim): 2025-02-03 15:15:35
Youwen Wu 2025-02-03 15:15:35 -08:00
= Lecture #datetime(day: 3, month: 2, year: 2025).display()
== CDFs, PMFs, PDFs
Properties of a CDF:
Any CDF $F(x) = P(X <= x)$ satisfies
1. $F(-infinity) = 0$, $F(infinity) = 1$
2. $F(x)$ is non-decreasing (weakly increasing) in $x$
$ s < t => F(s) <= F(t) $
3. $P(a < X <= b) = P(X <= b) - P(X <= a) = F(b) - F(a)$
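As a quick numeric illustration of these three properties (a sketch in Python; the logistic CDF $F(x) = 1\/(1 + e^(-x))$ is chosen here only as a convenient concrete example, not one from the notes):

```python
import math

# Checking the three CDF properties for one concrete CDF,
# F(x) = 1 / (1 + e^(-x)) (the logistic CDF, used only as an example).
def F(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

# 1. Limits: F(-inf) = 0 and F(inf) = 1 (approximated at +/- 50).
assert F(-50) < 1e-12 and F(50) > 1 - 1e-12

# 2. Non-decreasing: s < t implies F(s) <= F(t).
xs = [i / 10 for i in range(-100, 101)]
assert all(F(s) <= F(t) for s, t in zip(xs, xs[1:]))

# 3. P(a < X <= b) = F(b) - F(a); e.g. P(-1 < X <= 2):
print(F(2.0) - F(-1.0))  # approximately 0.6119
```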
#example[
Let $X$ be a continuous random variable with density (pdf)
$
f(x) = cases(
c x^2 &"for" 0 < x < 2,
0 &"otherwise"
)
$
1. What is $c$?
$c$ must normalize the density, so
$
1 = integral^infinity_(-infinity) f(x) dif x = integral_0^2 c x^2 dif x = (8 c) / 3
$
which gives $c = 3/8$.
2. Find the probability that $X$ is between 1 and 1.4.
Integrate the curve between 1 and 1.4.
$
integral_1^1.4 3 / 8 x^2 dif x = (x^3 / 8) |_1^1.4 \
= 0.218
$
This is the probability that $X$ lies between 1 and 1.4.
3. Find the probability that $X$ is between 1 and 3.
Idea: integrate between 1 and 3, being careful past 2, where the density is 0.
$ integral^2_1 3 / 8 x^2 dif x + integral_2^3 0 dif x = 7 / 8 $
4. What is the CDF for $P(X <= x)$? Integrate the curve to $x$.
$
F(x) = P(X <= x) = integral_(-infinity)^x f(t) dif t \
= integral_0^x 3 / 8 t^2 dif t \
= x^3 / 8
$
Important: include the range!
$
F(x) = cases(
0 &"for" x <= 0,
x^3/8 &"for" 0 < x < 2,
1 &"for" x >= 2
)
$
5. Find the point $a$ such that integrating up to $a$ gives exactly $1/2$ of
the total area (i.e. find the median).
We want $a$ with $1/2 = P(X <= a)$.
$ 1 / 2 = P(X <= a) = F(a) = a^3 / 8 => a = root(3, 4) $
]
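The numbers in this example are easy to sanity-check numerically. A sketch in Python, using a plain midpoint Riemann sum rather than any library:

```python
import math

# Numeric check of the worked example: f(x) = c x^2 on (0, 2), 0 elsewhere.
def integrate(f, lo, hi, n=200_000):
    """Midpoint Riemann sum of f over [lo, hi]."""
    h = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * h) for i in range(n)) * h

# 1. Normalization forces c = 3/8, since the integral of x^2 on [0, 2] is 8/3.
c = 1.0 / integrate(lambda x: x**2, 0, 2)
assert abs(c - 3 / 8) < 1e-4

f = lambda x: c * x**2 if 0 < x < 2 else 0.0

# 2. P(1 < X < 1.4) = 1.4^3 / 8 - 1 / 8 = 0.218
print(round(integrate(f, 1, 1.4), 3))  # 0.218

# 3. P(1 < X < 3): the density vanishes past 2, so this is 7/8.
assert abs(integrate(f, 1, 3) - 7 / 8) < 1e-4

# 5. The median a solves a^3 / 8 = 1/2, i.e. a = 4^(1/3).
a = 4 ** (1 / 3)
assert abs(integrate(f, 0, a) - 0.5) < 1e-4
```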
== The (continuous) uniform distribution
The simplest, and arguably the best, of the named distributions!
#definition[
Let $[a,b]$ be a bounded interval on the real line. A random variable $X$ has the uniform distribution on the interval $[a,b]$ if $X$ has the density function
$
f(x) = cases(
1/(b-a) &"for" x in [a,b],
0 &"for" x in.not [a,b]
)
$
Abbreviate this by $X ~ "Unif" [a,b]$.
]<continuous-uniform>
The graph of $"Unif" [a,b]$ is a constant line at height $1/(b-a)$ defined
across $[a,b]$. The integral is just the area of a rectangle, and we can check
it is 1.
#fact[
For $X ~ "Unif" [a,b]$, its cumulative distribution function (CDF) is given by:
$
F_X (x) = cases(
0 &"for" x < a,
(x-a)/(b-a) &"for" x in [a,b],
1 &"for" x > b
)
$
]
#fact[
If $X ~ "Unif" [a,b]$, and $[c,d] subset [a,b]$, then
$
P(c <= X <= d) = integral_c^d 1 / (b-a) dif x = (d-c) / (b-a)
$
]
#example[
Let $Y$ be a uniform random variable on $[-2,5]$. Find the probability that its
absolute value is at least 1.
$Y$ takes values in the interval $[-2,5]$, so its absolute value is at least 1 if and only if $Y in [-2,-1] union [1,5]$.
The density function of $Y$ is $f(x) = 1/(5- (-2)) = 1/7$ on $[-2,5]$ and 0 everywhere else.
So,
$
P(|Y| >= 1) &= P(Y in [-2,-1] union [1,5]) \
&= P(-2 <= Y <= -1) + P(1 <= Y <= 5) \
&= 5 / 7
$
]
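The fact above reduces uniform probabilities to ratios of lengths, so this example can be checked in a couple of lines (a Python sketch):

```python
# Y ~ Unif[-2, 5]: the density is the constant 1/7 on [-2, 5].
# P(|Y| >= 1) splits into P(-2 <= Y <= -1) + P(1 <= Y <= 5).
def unif_prob(c, d, a=-2.0, b=5.0):
    """P(c <= Y <= d) for Y ~ Unif[a, b], assuming [c, d] lies inside [a, b]."""
    return (d - c) / (b - a)

p = unif_prob(-2, -1) + unif_prob(1, 5)
print(p)  # approximately 0.7143, i.e. 5/7
assert abs(p - 5 / 7) < 1e-12
```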
== The exponential distribution
The geometric distribution can be viewed as modeling waiting times, in a discrete setting, i.e. we wait for $n - 1$ failures to arrive at the $n^"th"$ success.
The exponential distribution is the continuous analogue of the geometric
distribution, in that we often use it to model waiting times in the continuous
sense. For example, the time until the first customer enters the barber shop.
#definition[
Let $0 < lambda < infinity$. A random variable $X$ has the exponential distribution with parameter $lambda$ if $X$ has PDF
$
f(x) = cases(
lambda e^(-lambda x) &"for" x >= 0,
0 &"for" x < 0
)
$
Abbreviate this by $X ~ "Exp"(lambda)$, the exponential distribution with rate $lambda$.
The CDF of the $"Exp"(lambda)$ distribution is given by:
$
F(t) = cases(
0 &"if" t < 0,
1 - e^(-lambda t) &"if" t >= 0
)
$
]
#example[
Suppose the length of a phone call, in minutes, is well modeled by an exponential random variable with a rate $lambda = 1/10$.
1. What is the probability that a call takes more than 8 minutes?
2. What is the probability that a call takes between 8 and 22 minutes?
Let $X$ be the length of the phone call, so that $X ~ "Exp"(1/10)$. Then we can find the desired probability by:
$
P(X > 8) &= 1 - P(X <= 8) \
&= 1 - F_X (8) \
&= 1 - (1 - e^(-(1 / 10) dot 8)) \
&= e^(-8 / 10) approx 0.4493
$
Now to find $P(8 < X < 22)$, we can take the difference of the two tail probabilities:
$
P(8 < X < 22) &= P(X > 8) - P(X >= 22) \
&= e^(-8 / 10) - e^(-22 / 10) \
&approx 0.3385
$
]
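Both answers follow directly from the $"Exp"(lambda)$ CDF, which makes them easy to reproduce numerically (a Python sketch):

```python
import math

# Phone-call example: X ~ Exp(1/10), so F(t) = 1 - e^(-t/10) for t >= 0.
lam = 1 / 10
F = lambda t: 1 - math.exp(-lam * t) if t >= 0 else 0.0

p_more_than_8 = 1 - F(8)   # e^(-0.8)
p_between = F(22) - F(8)   # e^(-0.8) - e^(-2.2)

print(round(p_more_than_8, 4))  # 0.4493
print(round(p_between, 4))      # 0.3385
```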
#fact("Memoryless property of the exponential distribution")[
Suppose that $X ~ "Exp"(lambda)$. Then for any $s,t > 0$, we have
$
P(X > t + s | X > t) = P(X > s)
$
]<memoryless>
This is like saying: if I've already waited 5 minutes for the bus, the
probability that I end up waiting more than 5 + 3 minutes in total is precisely
the probability that I wait more than 3 additional minutes.
#proof[
$
P(X > t + s | X > t) = (P(X > t + s sect X > t)) / (P(X > t)) \
= P(X > t + s) / P(X > t)
= e^(-lambda (t+ s)) / (e^(-lambda t)) = e^(-lambda s) \
equiv P(X > s)
$
]
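The proof shows the conditional tail probability equals $e^(-lambda s)$ exactly, which we can confirm numerically (a Python sketch; the rate $lambda = 0.7$ and the $(t, s)$ pairs are arbitrary choices for the check):

```python
import math

# Memoryless property for X ~ Exp(lambda):
# P(X > t + s | X > t) = P(X > t + s) / P(X > t) should equal P(X > s).
lam = 0.7                            # arbitrary rate for the check
surv = lambda x: math.exp(-lam * x)  # survival function P(X > x)

for t, s in [(5, 3), (0.5, 2.0), (10, 0.1)]:
    cond = surv(t + s) / surv(t)
    assert abs(cond - surv(s)) < 1e-12
print("memoryless check passed")
```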
== Gamma distribution
#definition[
Let $r, lambda > 0$. A random variable $X$ has the *gamma distribution* with parameters $(r, lambda)$ if $X$ is nonnegative and has probability density function
$
f(x) = cases(
(lambda^r x^(r-1))/(Gamma(r)) e^(-lambda x) &"for" x >= 0,
0 &"for" x < 0
)
$
Abbreviate this by $X ~ "Gamma"(r, lambda)$.
]
The gamma function $Gamma(r)$ generalizes the factorial function and is defined as
$
Gamma(r) = integral_0^infinity x^(r-1) e^(-x) dif x, "for" r > 0
$
Special case: $Gamma(n) = (n - 1)!$ if $n in ZZ^+$.
#remark[
The $"Exp"(lambda)$ distribution is a special case of the gamma distribution,
with parameter $r = 1$.
]
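Both the factorial special case and the remark can be verified with the standard library's `math.gamma` (a Python sketch):

```python
import math

# Special case: Gamma(n) = (n - 1)! for positive integers n.
for n in range(1, 8):
    assert abs(math.gamma(n) - math.factorial(n - 1)) < 1e-6

# With r = 1, the Gamma(r, lambda) density
# lambda^r x^(r-1) e^(-lambda x) / Gamma(r)
# collapses to lambda e^(-lambda x), the Exp(lambda) density.
lam = 2.5  # arbitrary rate for the check
gamma_pdf = lambda x, r: lam**r * x ** (r - 1) * math.exp(-lam * x) / math.gamma(r)
exp_pdf = lambda x: lam * math.exp(-lam * x)
for x in [0.1, 1.0, 3.0]:
    assert abs(gamma_pdf(x, r=1) - exp_pdf(x)) < 1e-12
print("gamma checks passed")
```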
== The normal (Gaussian) distribution
#definition[
A random variable $Z$ has the *standard normal distribution* if $Z$ has
density function
$
phi(x) = 1 / sqrt(2 pi) e^(-x^2 / 2)
$
on the real line. Abbreviate this by $Z ~ N(0,1)$.
]<normal-dist>
#fact("CDF of a standard normal random variable")[
Let $Z~N(0,1)$ be normally distributed. Then its CDF is given by
$
Phi(x) = integral_(-infinity)^x phi(s) dif s = integral_(-infinity)^x 1 / sqrt(2 pi) e^(-s^2 / 2) dif s
$
]
The normal distribution is so important that, instead of the generic $f_Z (x)$ and
$F_Z (x)$, we reserve the special symbols $phi(x)$ and $Phi(x)$ for its density and CDF.
#fact[
$
integral_(-infinity)^infinity e^(-s^2 / 2) dif s = sqrt(2 pi)
$
No closed form of the standard normal CDF $Phi$ exists, so we are left to either:
- approximate
- use technology (calculator)
- use the standard normal probability table in the textbook
]
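Although $Phi$ has no closed form in elementary functions, it can be evaluated via the error function, and the Gaussian integral above can be checked with a plain Riemann sum (a Python sketch):

```python
import math

# Phi via the standard library's erf: Phi(x) = (1 + erf(x / sqrt(2))) / 2.
Phi = lambda x: 0.5 * (1 + math.erf(x / math.sqrt(2)))

# Midpoint-rule check that the integral of e^(-s^2/2) over the real
# line (truncated to [-40, 40], where the tails are negligible) is sqrt(2*pi).
n, lo, hi = 200_000, -40.0, 40.0
h = (hi - lo) / n
total = sum(math.exp(-((lo + (i + 0.5) * h) ** 2) / 2) for i in range(n)) * h
assert abs(total - math.sqrt(2 * math.pi)) < 1e-6

print(round(Phi(0), 4))     # 0.5
print(round(Phi(1.96), 4))  # 0.975
```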