auto-update(nvim): 2025-02-10 03:00:55

This commit is contained in:
Youwen Wu 2025-02-10 03:00:55 -08:00
parent dfbe35333b
commit cb17974289
Signed by: youwen5
GPG key ID: 865658ED1FE61EC3


@ -12,10 +12,7 @@
= Introduction
PSTAT 120A is an introductory course on probability and statistics. However, it
is a theoretical course rather than an applied statistics course. You will not
learn how to read or conduct real-world statistical studies. Leave your
$p$-values at home, this ain't your momma's AP Stats.
PSTAT 120A is an introductory course on probability and statistics with an emphasis on theory.
= Lecture #datetime(day: 6, month: 1, year: 2025).display()
@ -772,8 +769,6 @@ us generalize to more than two colors.
Both approaches give the same answer.
]
= Discussion section #datetime(day: 22, month: 1, year: 2025).display()
= Lecture #datetime(day: 23, month: 1, year: 2025).display()
== Independence
@ -1144,6 +1139,244 @@ exactly one sequence that gives us success.
$
]
= Notes on textbook chapter 3
Recall that a random variable $X$ is a function $X : Omega -> RR$ that assigns
a real number to each outcome $omega in Omega$. The _probability distribution_
of $X$ gives its important probabilistic information. The probability
distribution is a description of the probabilities $P(X in B)$ for subsets $B
subset RR$. We describe the probability density function and the cumulative
distribution function.
A random variable $X$ is discrete if there is a countable set $A subset RR$
such that $P(X in A) = 1$. We call $k$ a possible value if $P(X = k) > 0$.
The probability distribution of a discrete random variable is entirely
determined by its p.m.f. $p(k) = P(X = k)$. The p.m.f. is a function from the
set of possible values of $X$ into $[0,1]$. When we want to indicate the random
variable, we write $p_X (k)$.
By the axioms of probability,
$
sum_k p_X (k) = sum_k P(X=k) = 1
$
For a subset $B subset RR$,
$
P(X in B) = sum_(k in B) p_X (k)
$
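As a quick numerical sanity check (a hypothetical illustration, not from the notes), take $X$ to be a fair six-sided die roll, so $p_X (k) = 1 \/ 6$ for each of the six possible values:

```python
# Hypothetical illustration: p.m.f. of a fair six-sided die,
# where p_X(k) = 1/6 for k in {1, ..., 6}.
from fractions import Fraction

p_X = {k: Fraction(1, 6) for k in range(1, 7)}

# Axiom check: the p.m.f. sums to 1 over all possible values.
assert sum(p_X.values()) == 1

# P(X in B) is the sum of p_X(k) over k in B, e.g. B = {2, 4, 6} (even rolls).
B = {2, 4, 6}
prob_B = sum(p_X[k] for k in B)
print(prob_B)  # 1/2
```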
Now we introduce another major class of random variables.
#definition[
Let $X$ be a random variable. If $f$ satisfies
$
P(X <= b) = integral^b_(-infinity) f(x) dif x
$
for all $b in RR$, then $f$ is the *probability density function* of $X$.
]
The probability that $X in (-infinity, b]$ is equal to the area under the graph
of $f$ from $-infinity$ to $b$.
A corollary is the following.
#fact[
  $ P(X in B) = integral_B f(x) dif x $
  for any $B subset RR$ for which the integral makes sense. The set $B$ can be
  a bounded or unbounded interval, or any countable union of intervals.
]
#fact[
$ P(a <= X <= b) = integral_a^b f(x) dif x $
$ P(X > a) = integral_a^infinity f(x) dif x $
]
#fact[
If a random variable $X$ has density function $f$ then individual point
values have probability zero:
$ P(X = c) = integral_c^c f(x) dif x = 0, forall c in RR $
]
#remark[
  It follows that a random variable with a density function is not discrete.
  Also, the probabilities of intervals are unchanged by including or excluding
  endpoints.
]
How do we determine which functions are p.d.f.s? Since $P(-infinity < X <
infinity) = 1$, a p.d.f. $f$ must satisfy
$
f(x) >= 0 forall x in RR \
integral^infinity_(-infinity) f(x) dif x = 1
$
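These two conditions can be checked numerically for a candidate density. A minimal sketch (the density $f(x) = 2x$ on $[0,1]$ is an illustrative choice, not from the notes), using a midpoint Riemann sum for the integral:

```python
# Sketch: numerically verify the two p.d.f. conditions for the candidate
# density f(x) = 2x on [0, 1] (and 0 elsewhere).

def f(x):
    return 2 * x if 0 <= x <= 1 else 0.0

# Midpoint Riemann sum over [0, 1]; f vanishes outside, so this
# approximates the integral over all of RR.
n = 100_000
dx = 1.0 / n
total = sum(f((i + 0.5) * dx) * dx for i in range(n))

assert all(f(x) >= 0 for x in [-1.0, 0.0, 0.3, 0.9, 2.0])  # f >= 0
assert abs(total - 1.0) < 1e-6                             # integral is 1
```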
#fact[
Random variables with density functions are called _continuous_ random
variables. This does not imply that the random variable is a continuous
function on $Omega$ but it is standard terminology.
]
#definition[
Let $[a,b]$ be a bounded interval on the real line. A random variable $X$ has
the *uniform distribution* on $[a,b]$ if $X$ has density function
$
f(x) = cases(
1/(b-a)", if" x in [a,b],
0", if" x in.not [a,b]
)
$
Abbreviate this by $X ~ "Unif"[a,b]$.
]
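A small sketch of computing interval probabilities for a uniform random variable (the helper name and parameters are ours, for illustration): since the density is the constant $1\/(b-a)$ on $[a,b]$, the probability of an interval is just its overlap with the support divided by $b - a$.

```python
# Sketch: P(c <= X <= d) for X ~ Unif[a, b], computed by integrating the
# constant density 1/(b-a) over the intersection [c, d] ∩ [a, b].

def unif_prob(a, b, c, d):
    lo, hi = max(a, c), min(b, d)   # intersect [c, d] with the support [a, b]
    return max(hi - lo, 0.0) / (b - a)

# X ~ Unif[0, 10]
print(unif_prob(0, 10, 3, 7))    # 0.4
print(unif_prob(0, 10, -5, 5))   # 0.5 (mass below 0 contributes nothing)
print(unif_prob(0, 10, 4, 4))    # 0.0 (single points have probability zero)
```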
= Notes on week 3 lecture slides
== Negative binomial
Consider a sequence of Bernoulli trials with the following characteristics:
- Each trial is a success or a failure
- The probability of success $p$ is the same on each trial
- Trials are independent (and their number is not fixed in advance)
- The experiment continues until $k$ successes are observed, where $k$ is a given parameter
Then if $X$ is the number of trials needed to observe $k$ successes, we say $X$
is a *negative binomial* random variable.
#definition[
  Let $k in ZZ^+$ and $0 < p <= 1$. A random variable $X$ has the negative
  binomial distribution with parameters $(k,p)$ if the possible values of $X$
  are the integers ${k, k+1, k+2, ...}$ and the p.m.f. is
  $
    P(X = n) = binom(n-1, k-1) p^k (1-p)^(n-k) "for" n >= k
  $
$
Abbreviate this by $X ~ "Negbin"(k,p)$.
]
#example[
Steph Curry has a three point percentage of approx. $43%$. What is the
probability that Steph makes his third three-point basket on his $5^"th"$
attempt?
Let $X$ be number of attempts required to observe the 3rd success. Then,
$
X ~ "Negbin"(k = 3, p = 0.43)
$
So,
$
    P(X = 5) &= binom(5-1, 3-1) (0.43)^3 (1 - 0.43)^(5-3) \
    &= binom(4, 2) (0.43)^3 (0.57)^2 \
&approx 0.155
$
]
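The arithmetic in this example can be verified directly (a short sketch; the function name is ours):

```python
# Recompute the example: X ~ Negbin(k=3, p=0.43),
# P(X = n) = C(n-1, k-1) * p^k * (1-p)^(n-k).
from math import comb

def negbin_pmf(n, k, p):
    return comb(n - 1, k - 1) * p**k * (1 - p)**(n - k)

print(round(negbin_pmf(5, 3, 0.43), 3))  # 0.155
```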
== Poisson distribution
That the Poisson p.m.f. (defined below) sums to 1 follows from the Taylor expansion
$
e^lambda = sum_(k=0)^infinity lambda^k / k!
$
which implies that
$
sum_(k=0)^infinity e^(-lambda) lambda^k / k! = e^(-lambda) e^lambda = 1
$
#definition[
  For an integer valued random variable $X$, we say $X ~ "Poisson"(lambda)$,
  where $lambda > 0$, if it has p.m.f.
  $ P(X = k) = e^(-lambda) lambda^k / k! $
  for $k in {0,1,2,...}$, so that
$
sum_(k = 0)^infinity P(X=k) = 1
$
]
The Poisson arises from the binomial. It applies in the binomial context when
$n$ is very large ($n >= 100$) and $p$ is very small ($p <= 0.05$), such that
$n p$ is a moderate number ($n p < 10$).
Then $X$ follows a Poisson distribution with $lambda = n p$.
$
P("Bin"(n,p) = k) approx P("Poisson"(lambda = n p) = k)
$
for $k = 0,1,...,n$.
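A quick numerical comparison illustrating this approximation (the parameters $n = 1000$, $p = 0.002$ are illustrative, chosen so $n p = 2$ is moderate):

```python
# Sketch: compare the Bin(n, p) p.m.f. with its Poisson(lambda = n p)
# approximation for large n and small p.
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    return exp(-lam) * lam**k / factorial(k)

n, p = 1000, 0.002   # n large, p small, n p = 2 moderate
lam = n * p
for k in range(5):
    b, q = binom_pmf(k, n, p), poisson_pmf(k, lam)
    assert abs(b - q) < 1e-3   # the two p.m.f.s agree to within 0.001
```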
#example[
  Consider the number of typing errors on a page of a textbook.
  Let
  - $n$ be the number of letters or symbols per page (large)
  - $p$ be the probability of an error in any given symbol (small), with $n p = lambda = 0.1$ in the limit as $n -> infinity$ and $p -> 0$
What is the probability of exactly 1 error?
We can approximate the distribution of $X$ with a $"Poisson"(lambda = 0.1)$
distribution
$
    P(X = 1) = (e^(-0.1) (0.1)^1) / 1! approx 0.09048
$
]
#example[
  Consider the number of reported auto accidents in a big city on any given day.
  Let
  - $n$ be the number of autos on the road (large)
  - $p$ be the probability that any individual auto has an accident (small), with $n p = lambda = 2$ in the limit as $n -> infinity$ and $p -> 0$
What is the probability of no accidents today?
We can approximate $X$ by $"Poisson"(lambda = 2)$
$
    P(X = 0) = (e^(-2) (2)^0) / 0! approx 0.1353
$
]
A discrete example:
#example[
Suppose we have an election with candidates $B$ and $W$. A total of 10,000
ballots were cast such that
$
10,000 "votes" cases(5005 space B, 4995 space W)
$
But 15 ballots had irregularities and were disqualified. What is the
probability that the election results will change?
  There are three combinations of disqualified ballots that would change the
  outcome. $B$ leads by 10 votes, and disqualifying $b$ ballots for $B$ and
  $15 - b$ for $W$ changes the margin to $(5005 - b) - (4995 - (15 - b)) = 25 -
  2 b$, which is negative only when $b >= 13$. So the outcome flips exactly
  when the disqualified ballots are 13 $B$ and 2 $W$, 14 $B$ and 1 $W$, or 15
  $B$ and 0 $W$. What is the probability of these?
]
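One way to finish the computation in the example above (the modeling assumption is ours: treat the 15 disqualified ballots as a uniformly random subset of the 10,000, so the number of disqualified $B$ ballots is hypergeometric):

```python
# Sketch: the number of B-ballots among the 15 disqualified is hypergeometric
# (15 ballots drawn without replacement from 5005 B and 4995 W).
from math import comb

def hyper_pmf(b, n_B=5005, n_W=4995, m=15):
    return comb(n_B, b) * comb(n_W, m - b) / comb(n_B + n_W, m)

# The outcome changes only if 13, 14, or 15 of the disqualified are B ballots.
p_change = sum(hyper_pmf(b) for b in (13, 14, 15))
print(p_change)  # roughly 0.0037
```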
= Lecture #datetime(day: 3, month: 2, year: 2025).display()
== CDFs, PMFs, PDFs