From 19a41f3870794a0be654600432e54472f03a5e03 Mon Sep 17 00:00:00 2001
From: Youwen Wu
Date: Wed, 19 Feb 2025 16:36:00 -0800
Subject: [PATCH] auto-update(nvim): 2025-02-19 16:36:00

---
 .../pstat-120a/course-notes/main.typ | 270 +++++++++++-------
 1 file changed, 164 insertions(+), 106 deletions(-)

diff --git a/documents/by-course/pstat-120a/course-notes/main.typ b/documents/by-course/pstat-120a/course-notes/main.typ
index b4c361e..c5ff443 100644
--- a/documents/by-course/pstat-120a/course-notes/main.typ
+++ b/documents/by-course/pstat-120a/course-notes/main.typ
@@ -242,10 +242,7 @@ Requires equally likely outcomes and finite sample spaces.
 An approach done commonly by applied statisticians who work in the disgusting
 real world. This is where we are generally concerned with irrelevant concerns
-like accurate sampling and $p$-values and such. I am told this is covered in
-PSTAT 120B, so hopefully I can avoid ever taking that class (as a pure math
-major).
-
+like accurate sampling and $p$-values and such.
 
 $
   P(A) = (hash "of times" A "occurs in large number of trials") / (hash "of trials")
 $
@@ -768,6 +765,72 @@ us generalize to more than two colors.
   Both approaches given the same answer.
 ]
 
+= Bayes' theorem and conditional probability
+
+== Conditional probability, partitions, law of total probability
+
+Sometimes we want to analyze the probability of events in a sample space given
+that we already know another event has occurred. That is, we want the
+probability of an event $A$ conditional on another event $B$.
+
+#definition[
+  For two events $A, B in Omega$, the probability of $A$ given $B$ is written
+  $
+    P(A | B)
+  $
+]
+
+#fact[
+  To calculate the conditional probability, use the following formula:
+  $
+    P(A | B) = (P(A B)) / (P(B))
+  $
+]
+
+Oftentimes we don't know $P(B)$ directly, but we do know the probability of
+$B$ conditional on each member of a collection of events that covers $Omega$.
+For example, if we have a 50% chance of choosing a rigged (6-sided) die and a
+50% chance of choosing a fair die, we know the probability of getting side $n$
+given that we have the rigged die, and the probability of side $n$ given that
+we have the fair die. Also note that we know the probability of both events
+we're conditioning on (50% each), and they're disjoint events.
+
+In these situations, the following law is useful:
+
+#theorem[Law of total probability][
+  Let $A_1, A_2, ..., A_n$ be a _partition_ of $Omega$, that is, pairwise
+  disjoint subsets of $Omega$ such that
+  $
+    union.big_(i=1)^n A_i = Omega \
+    A_i sect A_j = emptyset "for all" i != j
+  $
+  Then the probability of an event $B in Omega$ is given by
+  $
+    P(B) = P(B | A_1) P(A_1) + P(B | A_2) P(A_2) + dots.c + P(B | A_n) P(A_n)
+  $
+] <law-total-prob>
+
+#proof[
+  Since the $A_i$ are disjoint and cover $Omega$, the event $B$ is the disjoint
+  union of the events $B A_1, ..., B A_n$. By additivity and the definition of
+  conditional probability,
+  $
+    P(B) = sum_(i=1)^n P(B A_i) = sum_(i=1)^n P(B | A_i) P(A_i)
+  $
+]
+
+== Bayes' theorem
+
+Finally let's discuss a rule for inverting conditional probabilities, that is,
+getting $P(B | A)$ from $P(A | B)$.
+
+#theorem[Bayes' theorem][
+  Given two events $A, B in Omega$,
+  $
+    P(A | B) = (P(B | A)P(A)) / (P(B | A)P(A) + P(B | A^c)P(A^c))
+  $
+]
+
+#proof[
+  Apply the definition of conditional probability, then apply @law-total-prob,
+  noting that $A$ and $A^c$ form a partition of $Omega$.
+]
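+
+To make the die example above concrete, here is a small worked computation.
+(The specific bias of the rigged die, namely that it shows a 6 with
+probability $1/2$, and the event names $R$ and $B$ are assumptions introduced
+here purely for illustration.) Let $R$ be the event that we picked the rigged
+die, so $P(R) = P(R^c) = 1/2$, and let $B$ be the event that we roll a 6. By
+the law of total probability,
+$
+  P(B) = P(B | R) P(R) + P(B | R^c) P(R^c) = 1/2 dot 1/2 + 1/6 dot 1/2 = 1/3
+$
+and by Bayes' theorem, the probability that we were holding the rigged die
+given that we saw a 6 is
+$
+  P(R | B) = (P(B | R) P(R)) / (P(B)) = (1/4) / (1/3) = 3/4
+$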
+
 = Lecture #datetime(day: 23, month: 1, year: 2025).display()
 
 == Independence
@@ -777,7 +840,9 @@ us generalize to more than two colors.
   "Joint probability is equal to product of their marginal probabilities."
 ]
 
-#fact[This definition must be used to show the independence of two events.]
+#fact[
+  This definition must be used to show the independence of two events.
+]
 
 #fact[
   If $A$ and $B$ are independent, then,
@@ -836,6 +901,99 @@ us generalize to more than two colors.
   different events $A_i$ and $A_j$ are independent for any $i != j$.
 ]
 
+= A bit of review on random variables
+
+== Random variables, discrete random variables
+
+Recall that a random variable $X$ is a function $X : Omega -> RR$ that assigns
+a real number to each outcome $omega in Omega$. The _probability distribution_
+of $X$ gives its important probabilistic information. The probability
+distribution is a description of the probabilities $P(X in B)$ for subsets
+$B subset RR$. We describe the probability density function and the cumulative
+distribution function.
+
+A random variable $X$ is discrete if there is a countable set $A$ such that
+$P(X in A) = 1$. $k$ is a possible value if $P(X = k) > 0$.
+
+A discrete random variable has a probability distribution entirely determined
+by its p.m.f. $p(k) = P(X = k)$. The p.m.f. is a function from the set of
+possible values of $X$ into $[0,1]$. Labeling the p.m.f. with the random
+variable is done by $p_X (k)$.
+
+By the axioms of probability,
+
+$
+  sum_k p_X (k) = sum_k P(X=k) = 1
+$
+
+For a subset $B subset RR$,
+
+$
+  P(X in B) = sum_(k in B) p_X (k)
+$
+
+== Continuous random variables
+
+Now we introduce another major class of random variables.
+
+#definition[
+  Let $X$ be a random variable. If $f$ satisfies
+
+  $
+    P(X <= b) = integral^b_(-infinity) f(x) dif x
+  $
+
+  for all $b in RR$, then $f$ is the *probability density function* of $X$.
+]
+
+The probability that $X in (-infinity, b]$ is equal to the area under the graph
+of $f$ from $-infinity$ to $b$.
+
+A corollary is the following.
+
+#fact[
+  $ P(X in B) = integral_B f(x) dif x $
+  for any $B subset RR$ where integration makes sense.
+]
+
+The set $B$ can be bounded or unbounded, or any collection of intervals.
+
+#fact[
+  $ P(a <= X <= b) = integral_a^b f(x) dif x $
+  $ P(X > a) = integral_a^infinity f(x) dif x $
+]
+
+#fact[
+  If a random variable $X$ has density function $f$ then individual point
+  values have probability zero:
+
+  $ P(X = c) = integral_c^c f(x) dif x = 0, forall c in RR $
+]
+
+#remark[
+  It follows that a random variable with a density function is not discrete.
+  Also, the probabilities of intervals are not changed by including or
+  excluding endpoints.
+]
+
+How do we determine which functions are p.d.f.s? Since $P(-infinity < X <
+infinity) = 1$, a p.d.f. $f$ must satisfy
+
+$
+  f(x) >= 0, forall x in RR \
+  integral^infinity_(-infinity) f(x) dif x = 1
+$
+
+#fact[
+  Random variables with density functions are called _continuous_ random
+  variables. This does not imply that the random variable is a continuous
+  function on $Omega$ but it is standard terminology.
+]
+
+Named distributions of continuous random variables are introduced in the
+following chapters.
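+
+As a quick worked illustration of these two conditions (the particular density
+here is an assumption chosen only as an example, not taken from the course),
+consider $f(x) = 2x$ for $x in [0,1]$ and $f(x) = 0$ otherwise. Then $f >= 0$
+and
+$
+  integral^infinity_(-infinity) f(x) dif x = integral_0^1 2 x dif x = 1
+$
+so $f$ is a valid p.d.f., and probabilities come from integrating it, e.g.
+$
+  P(1/2 <= X <= 1) = integral_(1/2)^1 2 x dif x = 1 - 1/4 = 3/4,
+  quad P(X = 1/2) = 0
+$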
+
 = Lecture #datetime(day: 27, year: 2025, month: 1).display()
 
 == Bernoulli trials
@@ -1138,107 +1296,7 @@ exactly one sequence that gives us success.
   $
 ]
 
-= Notes on textbook chapter 3
-
-Recall that a random variable $X$ is a function $X : Omega -> RR$ that gives
-the probability of an event $omega in Omega$. The _probability distribution_ of
-$X$ gives its important probabilistic information. The probability distribution
-is a description of the probabilities $P(X in B)$ for subsets $B in RR$. We
-describe the probability density function and the cumulative distribution
-function.
-
-A random variable $X$ is discrete if there is countable $A$ such that $P(X in
-A) = 1$. $k$ is a possible value if $P(X = k) > 0$.
-
-A discrete random variable has probability distribution entirely determined by
-p.m.f $p(k) = P(X = k)$. The p.m.f. is a function from the set of possible
-values of $X$ into $[0,1]$. Labeling the p.m.f. with the random variable is
-done by $p_X (k)$.
-
-By the axioms of probability,
-
-$
-  sum_k p_X (k) = sum_k P(X=k) = 1
-$
-
-For a subset $B subset RR$,
-
-$
-  P(X in B) = sum_(k in B) p_X (k)
-$
-
-Now we introduce another major class of random variables.
-
-#definition[
-  Let $X$ be a random variable. If $f$ satisfies
-
-  $
-    P(X <= b) = integral^b_(-infinity) f(x) dif x
-  $
-
-  for all $b in RR$, then $f$ is the *probability density function* of $X$.
-]
-
-The probability that $X in (-infinity, b]$ is equal to the area under the graph
-of $f$ from $-infinity$ to $b$.
-
-A corollary is the following.
-
-#fact[
-  $ P(X in B) = integral_B f(x) dif x $
-]
-
-for any $B subset RR$ where integration makes sense.
-
-The set can be bounded or unbounded, or any collection of intervals.
-
-#fact[
-  $ P(a <= X <= b) = integral_a^b f(x) dif x $
-  $ P(X > a) = integral_a^infinity f(x) dif x $
-]
-
-#fact[
-  If a random variable $X$ has density function $f$ then individual point
-  values have probability zero:
-
-  $ P(X = c) = integral_c^c f(x) dif x = 0, forall c in RR $
-]
-
-#remark[
-  It follows a random variable with a density function is not discrete. Also
-  the probabilities of intervals are not changed by including or excluding
-  endpoints.
-]
-
-How to determine which functions are p.d.f.s? Since $P(-infinity < X <
-infinity) = 1$, a p.d.f. $f$ must satisfy
-
-$
-  f(x) >= 0 forall x in RR \
-  integral^infinity_(-infinity) f(x) dif x = 1
-$
-
-#fact[
-  Random variables with density functions are called _continuous_ random
-  variables. This does not imply that the random variable is a continuous
-  function on $Omega$ but it is standard terminology.
-]
-
-#definition[
-  Let $[a,b]$ be a bounded interval on the real line. A random variable $X$ has
-  the *uniform distribution* on $[a,b]$ if $X$ has density function
-
-  $
-    f(x) = cases(
-      1/(b-a)", if" x in [a,b],
-      0", if" x in.not [a,b]
-    )
-  $
-
-  Abbreviate this by $X ~ "Unif"[a,b]$.
-]
-
-= Notes on week 3 lecture slides
+= Some more discrete distributions
 
 == Negative binomial