We say that the events $A_1, ..., A_n$ are *pairwise independent* if any two
different events $A_i$ and $A_j$ are independent for any $i != j$.
]

= Lecture #datetime(day: 27, year: 2025, month: 1).display()

== Bernoulli trials

The setup: the experiment has exactly two outcomes:
- Success -- $S$ or 1
- Failure -- $F$ or 0

Additionally:
$
P(S) = p, (0 < p < 1) \
P(F) = 1 - p = q
$

Construct the probability mass function:

$
P(X = 1) = p \
P(X = 0) = 1 - p
$

Write it as:

$ p_X (k) = p^k (1-p)^(1-k) $

for $k = 0$ and $k = 1$.
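
As a quick sanity check, here is a minimal Python sketch of this p.m.f. (the function name `bernoulli_pmf` is ours, not a library API):

```python
def bernoulli_pmf(k: int, p: float) -> float:
    """P(X = k) for X ~ Bernoulli(p), with k in {0, 1}."""
    assert k in (0, 1) and 0 < p < 1
    return p**k * (1 - p) ** (1 - k)

p = 0.3  # hypothetical success probability
print(bernoulli_pmf(1, p))                        # 0.3, i.e. p
print(bernoulli_pmf(0, p))                        # 0.7, i.e. q = 1 - p
print(bernoulli_pmf(0, p) + bernoulli_pmf(1, p))  # 1.0
```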

== Binomial distribution

The setup is very similar to Bernoulli: each trial has exactly two outcomes,
and we run a sequence of Bernoulli trials in a row.

Importantly: $p$ and $q$ are defined exactly the same in all trials.

This ties the binomial distribution to the sampling with replacement model,
since each trial does not affect the next.

We conduct $n$ *independent* trials of this experiment. Example with coins:
each flip independently has a $1/2$ chance of heads or tails (the same holds
for a die, a rigged coin, etc.).

$n$ is fixed, i.e. known ahead of time.

== Binomial random variable

Let $X = hash$ of successes in $n$ independent trials. The sample space is
$Omega = {omega}$, where each outcome $omega = S F F dots.c F$ is a sequence
of length $n$.

Then $X(omega) in {0,1,2,...,n}$ can take $n + 1$ possible values. The
probability of any particular sequence is the product of the individual trial
probabilities.

#example[
$ omega = S F F S F dots.c S, quad P(omega) = p q q p q dots.c p $
]

So $P(X = 0) = P(F F F dots.c F) = q dot q dot dots.c dot q = q^n$.

And
$
P(X = 1) = P(S F F dots.c F) + P(F S F F dots.c F) + dots.c + P(F F F dots.c F S) \
= underbrace(n, "possible outcomes") dot p^1 dot q^(n-1) \
= vec(n, 1) dot p^1 dot q^(n-1) \
= n dot p^1 dot q^(n-1)
$

Now we can generalize:

$
P(X = 2) = vec(n,2) p^2 q^(n-2)
$

How about all successes?

$
P(X = n) = P(S S dots.c S) = p^n
$

We see that for all failures we have $q^n$ and for all successes we have
$p^n$. Otherwise we use our method above.

In general, here is the probability mass function for the binomial random
variable:

$
P(X = k) = vec(n, k) p^k q^(n-k), "for" k = 0,1,2,...,n
$
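
As a sanity check on this formula, here is a minimal Python sketch (the helper name `binomial_pmf` is ours; `math.comb` from the standard library computes the binomial coefficient) that also confirms the probabilities sum to 1:

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.25  # hypothetical parameters
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
print(sum(pmf))  # 1.0 up to floating-point error
```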

The binomial distribution is very powerful: whenever we repeatedly choose
between two outcomes, it gives the probability of each possible number of
successes.

To summarize the characterization of the binomial random variable:

- $n$ independent trials
- each trial results in binary success or failure
- with probability of success $p$, identically across trials

with $X = hash$ successes in *fixed* $n$ trials.

$ X ~ "Bin"(n,p) $

with probability mass function

$
P(X = x) = vec(n,x) p^x (1 - p)^(n-x) = p(x) "for" x = 0,1,2,...,n
$

We see this is in fact the binomial theorem!

$
p(x) >= 0, sum^n_(x=0) p(x) = sum^n_(x=0) vec(n,x) p^x q^(n-x) = (p + q)^n
$

In fact,
$
(p + q)^n = (p + (1 - p))^n = 1^n = 1
$

#example[
A family has 5 children. What is the probability that the number of males is
exactly 2, if we assume births are independent and the probability of a male
is 0.5?

First we check the binomial criteria: $n$ independent trials, well defined
$S$/$F$, probability the same across trials. Let's say male is $S$ and
otherwise $F$.

We have $n=5$ and $p = 0.5$. We just need $P(X = 2)$.

$
P(X = 2) = vec(5,2) (0.5)^2 (0.5)^3 \
= (5 dot 4) / (2 dot 1) (1 / 2)^5 = 10 / 32
$
]
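
A quick numeric check of this answer, using nothing beyond the Python standard library:

```python
from math import comb

# P(X = 2) for X ~ Bin(5, 0.5)
print(comb(5, 2) * 0.5**2 * 0.5**3)  # 0.3125 = 10/32
```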

#example[
What is the probability of getting exactly three aces (1's) out of 10 throws
of a fair die?

This seems a little trickier, but we can still write it with a well defined
$S$/$F$: let $S$ be getting an ace and $F$ be anything else.

Then $p = 1/6$ and $n = 10$. We want $P(X=3)$. So

$
P(X=3) = vec(10,3) p^3 q^7 = vec(10,3) (1 / 6)^3 (5 / 6)^7 \
approx 0.15505
$
]
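
The same kind of numeric check for this example:

```python
from math import comb

# P(X = 3) for X ~ Bin(10, 1/6)
print(comb(10, 3) * (1 / 6) ** 3 * (5 / 6) ** 7)  # approx 0.15505
```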

#example[
Suppose we have two types of candy, red and black: $a$ red candies and $b$
black candies. Select $n$ candies. Let $X$ be the number of red candies among
the $n$ selected.

There are 2 cases.

- case 1: with replacement: Binomial distribution with $n$ and
  $p = a/(a + b)$, e.g.

  $ P(X = 2) = vec(n,2) (a / (a+b))^2 (b / (a+b))^(n-2) $

- case 2: without replacement: then use counting,

  $ P(X = x) = (vec(a,x) vec(b,n-x)) / vec(a+b,n) = p(x) $
]

We've done case 2 before, but now we introduce a random variable to represent
it. The p.m.f.

$ P(X = x) = (vec(a,x) vec(b,n-x)) / vec(a+b,n) = p(x) $

is known as the *Hypergeometric distribution*.

== Hypergeometric distribution

There are different characterizations of the parameters, but

$ X ~ "Hypergeom"(hash "total", hash "successes", "sample size") $

For example,
$ X ~ "Hypergeom"(N, a, n) "where" N = a+b $

In the textbook, it's
$ X ~ "Hypergeom"(N, N_A, n) $

#remark[
If $n$ is very small relative to $a + b$, then both cases give similar
(approximately the same) answers.
]

For instance, if we're sampling for blood types from UCSB, and we take a
student out without replacement, we don't really change the population
substantially. So both answers give a similar result.

Suppose we have two types of items, type $A$ and type $B$. Let $N_A$ be the
$hash$ of type $A$ items and $N_B$ the $hash$ of type $B$ items.
$N = N_A + N_B$ is the total number of objects.

We sample $n$ items *without replacement* ($n <= N$), with order not
mattering. Denote by $X$ the number of type $A$ objects in our sample.

#definition[
Let $0 <= N_A <= N$ and $1 <= n <= N$ be integers. A random variable $X$ has
the *hypergeometric distribution* with parameters $(N, N_A, n)$ if $X$ takes
values in the set ${0,1,...,n}$ and has p.m.f.

$ P(X = k) = (vec(N_A,k) vec(N-N_A,n-k)) / vec(N,n) = p(k) $
]

#example[
Let $N_A = 10$ be the number of defectives and $N_B = 90$ the number of
non-defectives. We select $n=5$ without replacement. What is the probability
that 2 of the 5 selected are defective?

$
X ~ "Hypergeom" (N = 100, N_A = 10, n = 5)
$

We want $P(X=2)$.

$
P(X=2) = (vec(10,2) vec(90,3)) / vec(100,5) approx 0.0702
$
]
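
Here is a minimal Python sketch of the hypergeometric p.m.f. (the function name `hypergeom_pmf` is ours, not a library API), reproducing the number above:

```python
from math import comb

def hypergeom_pmf(k: int, N: int, N_A: int, n: int) -> float:
    """P(X = k) for X ~ Hypergeom(N, N_A, n)."""
    return comb(N_A, k) * comb(N - N_A, n - k) / comb(N, n)

print(hypergeom_pmf(2, N=100, N_A=10, n=5))  # approx 0.0702
```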

#remark[
Make sure you can distinguish when a problem is binomial and when it is
hypergeometric. This is very important on exams.

Recall that both ask about the number of successes in a fixed number of
trials, but binomial is sampling *with* replacement (each trial is
independent) while hypergeometric is sampling *without* replacement.
]
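
To make the distinction concrete, and to illustrate the earlier remark that the two agree when the sample is small relative to the population, here is a small self-contained comparison (helper names are ours; the parameters are hypothetical):

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def hypergeom_pmf(k: int, N: int, N_A: int, n: int) -> float:
    return comb(N_A, k) * comb(N - N_A, n - k) / comb(N, n)

# Sample n = 5 from a large population (N = 10,000, 10% "successes"):
# with replacement (binomial) vs. without replacement (hypergeometric).
N, N_A, n = 10_000, 1_000, 5
for k in range(n + 1):
    b = binomial_pmf(k, n, N_A / N)
    h = hypergeom_pmf(k, N, N_A, n)
    print(k, round(b, 5), round(h, 5))  # columns are nearly identical
```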

#example[
A cat gives birth to 6 kittens: 2 are male and 4 are female. Your neighbor
comes and picks up 3 kittens randomly to take home with them.

How do we define the random variable? What is the p.m.f.?

Let $X$ be the number of male kittens in the neighbor's selection.

$ X ~ "Hypergeom"(N = 6, N_A = 2, n = 3) $

and $X$ takes values in ${0,1,2}$. Find the p.m.f. by finding the
probabilities for these values.

$
&P(X = 0) = (vec(2,0) vec(4,3)) / vec(6,3) = 4 / 20 \
&P(X = 1) = (vec(2,1) vec(4,2)) / vec(6,3) = 12 / 20 \
&P(X = 2) = (vec(2,2) vec(4,1)) / vec(6,3) = 4 / 20 \
&P(X = 3) = (vec(2,3) vec(4,0)) / vec(6,3) = 0
$

Note that for $P(X=3)$ we are asking for 3 successes (drawing males) when
there are only 2 males, so it must be 0.
]
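
A quick check that this p.m.f. sums to 1:

```python
from math import comb

pmf = [comb(2, k) * comb(4, 3 - k) / comb(6, 3) for k in range(3)]
print(pmf, sum(pmf))  # [0.2, 0.6, 0.2] 1.0
```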

== Geometric distribution

Consider an infinite sequence of independent trials, e.g. the number of
attempts until I make a basket.

Let $X_i$ denote the outcome of the $i^"th"$ trial, where success is 1 and
failure is 0. Let $N$ be the number of trials needed to observe the first
success in a sequence of independent trials with probability of success $p$.

Suppose we fail $k-1$ times and succeed on the $k^"th"$ try. Then:

$
P(N = k) = P(X_1 = 0, X_2 = 0, ..., X_(k-1) = 0, X_k = 1) = (1 - p)^(k-1) p
$

This is the probability of failure raised to the number of failures, times
the probability of success.

The key characteristic of these trials is that we keep going until we
succeed. There's no $n$ choose $k$ in front like in the binomial distribution
because there's exactly one sequence that gives the first success on trial
$k$.

#definition[
Let $0 < p <= 1$. A random variable $X$ has the *geometric distribution* with
success parameter $p$ if the possible values of $X$ are ${1,2,3,...}$ and $X$
satisfies

$
P(X=k) = (1-p)^(k-1) p
$

for positive integers $k$. Abbreviate this by $X ~ "Geom"(p)$.
]
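
A minimal Python sketch of this p.m.f. (the name `geometric_pmf` is ours), with a numeric check that the probabilities sum to 1:

```python
def geometric_pmf(k: int, p: float) -> float:
    """P(X = k) for X ~ Geom(p), k = 1, 2, 3, ..."""
    return (1 - p) ** (k - 1) * p

p = 1 / 6  # hypothetical success probability
print(sum(geometric_pmf(k, p) for k in range(1, 1000)))  # approx 1.0
```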

#example[
What is the probability that it takes more than seven rolls of a fair die to
roll a six?

Let $X$ be the number of rolls of a fair die until the first six. Then
$X ~ "Geom"(1/6)$. Now we just want $P(X > 7)$.

$
P(X > 7) = sum^infinity_(k=8) P(X=k) = sum^infinity_(k=8) (5 / 6)^(k-1) 1 / 6
$

Re-indexing with $j = k - 8$,

$
sum^infinity_(k=8) (5 / 6)^(k-1) 1 / 6 = 1 / 6 (5 / 6)^7 sum^infinity_(j=0) (5 / 6)^j
$

Now we calculate using the geometric series formula:

$
1 / 6 (5 / 6)^7 sum^infinity_(j=0) (5 / 6)^j = 1 / 6 (5 / 6)^7 dot 1 / (1-5 / 6) =
(5 / 6)^7
$
]
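
A quick check that the (truncated) tail sum and the closed form agree:

```python
p = 1 / 6
tail = sum((5 / 6) ** (k - 1) * p for k in range(8, 10_000))
print(tail, (5 / 6) ** 7)  # both approx 0.2791
```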