diff --git a/documents/by-course/pstat-120a/course-notes/main.typ b/documents/by-course/pstat-120a/course-notes/main.typ
index 71fe880..fa65682 100644
--- a/documents/by-course/pstat-120a/course-notes/main.typ
+++ b/documents/by-course/pstat-120a/course-notes/main.typ
@@ -841,3 +841,305 @@ us generalize to more than two colors. We say that the events
 $A_1, ..., A_n$ are *pairwise independent* if any two different events $A_i$
 and $A_j$ are independent for any $i != j$.
 ]
+
+= Lecture #datetime(day: 27, year: 2025, month: 1).display()
+
+== Bernoulli trials
+
+The setup: the experiment has exactly two outcomes:
+- Success -- $S$ or 1
+- Failure -- $F$ or 0
+
+Additionally:
+$
+  P(S) = p, (0 < p < 1) \
+  P(F) = 1 - p = q
+$
+
+Construct the probability mass function:
+
+$
+  P(X = 1) = p \
+  P(X = 0) = 1 - p
+$
+
+Write it as:
+
+$ p_X (k) = p^k (1-p)^(1-k) $
+
+for $k = 0$ and $k = 1$.
+
+== Binomial distribution
+
+The setup: very similar to Bernoulli; each trial has exactly two outcomes,
+and we run a bunch of Bernoulli trials in a row.
+
+Importantly: $p$ and $q$ are exactly the same across all trials.
+
+This ties the binomial distribution to the sampling with replacement model,
+since each trial does not affect the next.
+
+We conduct $n$ *independent* trials of this experiment. Example with coins:
+each flip independently has a $1/2$ chance of heads or tails (the same holds
+for a die, a rigged coin, etc.).
+
+$n$ is fixed, i.e. known ahead of time.
+
+== Binomial random variable
+
+Let $X = hash$ of successes in $n$ independent trials. Each outcome $omega$
+is a sequence of $n$ trials, e.g. $omega = S F F dots.c F$, and $Omega$ is
+the set of all such sequences of length $n$.
+
+Then $X(omega) in {0, 1, 2, ..., n}$, so $X$ can take $n + 1$ possible
+values. The probability of any particular sequence is given by the product
+of the individual trial probabilities.
+
+#example[
+  For $omega = S F F S F dots.c S$, we get $P({omega}) = p q q p q dots.c p$.
+]
+
+So $P(X = 0) = P(F F F dots.c F) = q dot q dot dots.c dot q = q^n$.
+
+And
+$
+  P(X = 1) = P(S F F dots.c F) + P(F S F F dots.c F) + dots.c + P(F F F dots.c F S) \
+  = underbrace(n, "positions for the success") dot p^1 dot q^(n-1) \
+  = vec(n, 1) dot p^1 dot q^(n-1)
+$
+
+Now we can generalize:
+
+$
+  P(X = 2) = vec(n,2) p^2 q^(n-2)
+$
+
+How about all successes?
+
+$
+  P(X = n) = P(S S dots.c S) = p^n
+$
+
+We see that for all failures we have $q^n$ and for all successes we have
+$p^n$. Otherwise we use our method above.
+
+In general, here is the probability mass function for the binomial random
+variable:
+
+$
+  P(X = k) = vec(n, k) p^k q^(n-k), "for" k = 0,1,2,...,n
+$
+
+The binomial distribution is very powerful: whenever we count successes in a
+fixed number of independent two-outcome trials, it gives the probabilities.
+
+To summarize the characterization of the binomial random variable:
+
+- $n$ independent trials
+- each trial results in binary success or failure
+- with probability of success $p$, identically across trials
+
+with $X = hash$ successes in *fixed* $n$ trials. We write
+
+$ X ~ "Bin"(n,p) $
+
+with probability mass function
+
+$
+  P(X = x) = vec(n,x) p^x (1 - p)^(n-x) = p(x) "for" x = 0,1,2,...,n
+$
+
+The fact that these probabilities are nonnegative and sum to 1 is the
+binomial theorem!
+
+$
+  p(x) >= 0, quad sum^n_(x=0) p(x) = sum^n_(x=0) vec(n,x) p^x q^(n-x) = (p + q)^n
+$
+
+In fact,
+$
+  (p + q)^n = (p + (1 - p))^n = 1^n = 1
+$
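+
+Before the worked examples, here is a quick numerical sanity check. This is
+only a sketch in Python, using the standard library's `math.comb`; the
+helper name `binom_pmf` is ours, not something from the text.
+
+```python
+from math import comb
+
+def binom_pmf(k: int, n: int, p: float) -> float:
+    """P(X = k) for X ~ Bin(n, p)."""
+    return comb(n, k) * p**k * (1 - p) ** (n - k)
+
+# The p.m.f. values are nonnegative and sum to 1 (binomial theorem).
+n, p = 10, 0.3
+pmf = [binom_pmf(k, n, p) for k in range(n + 1)]
+print(all(v >= 0 for v in pmf))  # True
+print(sum(pmf))                  # 1.0, up to floating-point rounding
+```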
+
+#example[
+  A family has 5 children. What is the probability that exactly 2 of them
+  are male, if we assume births are independent and the probability of a
+  male birth is 0.5?
+
+  First we check the binomial criteria: $n$ independent trials, well-defined
+  $S$/$F$, probability the same across trials. Let's say male is $S$ and
+  otherwise $F$.
+
+  We have $n = 5$ and $p = 0.5$. We just need $P(X = 2)$.
+
+  $
+    P(X = 2) = vec(5,2) (0.5)^2 (0.5)^3 \
+    = (5 dot 4) / (2 dot 1) (1 / 2)^5 = 10 / 32
+  $
+]
+
+#example[
+  What is the probability of getting exactly three aces (1's) out of 10
+  throws of a fair die?
+
+  This seems a little trickier, but we can still write it with well-defined
+  $S$/$F$. Let $S$ be getting an ace and $F$ be anything else.
+
+  Then $p = 1/6$ and $n = 10$. We want $P(X=3)$. So
+
+  $
+    P(X=3) = vec(10,3) p^3 q^7 = vec(10,3) (1 / 6)^3 (5 / 6)^7 \
+    approx 0.15505
+  $
+]
+
+#example[
+  Suppose we have two types of candy, red and black: say $a$ red and $b$
+  black. Select $n$ candies. Let $X$ be the number of red candies among the
+  $n$ selected.
+
+  There are 2 cases.
+
+  - case 1: with replacement: binomial distribution with parameters $n$ and
+    $p = a/(a + b)$, for example
+    $ P(X = 2) = vec(n,2) (a / (a+b))^2 (b / (a+b))^(n-2) $
+  - case 2: without replacement: then use counting,
+    $ P(X = x) = (vec(a,x) vec(b,n-x)) / vec(a+b,n) = p(x) $
+]
+
+We've done case 2 before, but now we introduce a random variable to
+represent it. The distribution with p.m.f.
+
+$ P(X = x) = (vec(a,x) vec(b,n-x)) / vec(a+b,n) = p(x) $
+
+is known as the *hypergeometric distribution*.
+
+== Hypergeometric distribution
+
+There are different characterizations of the parameters, but
+
+$ X ~ "Hypergeom"(hash "total", hash "successes", "sample size") $
+
+For example,
+$ X ~ "Hypergeom"(N, a, n) "where" N = a+b $
+
+In the textbook, it's
+$ X ~ "Hypergeom"(N, N_A, n) $
+
+#remark[
+  If $n$ is very small relative to $a + b$, then both cases give similar
+  (approximately the same) answers.
+]
+
+For instance, if we're sampling for blood types from UCSB, taking one
+student out without replacement doesn't really change the population
+substantially. So both models give a similar result.
+
+Suppose we have two types of items, type $A$ and type $B$. Let $N_A$ be the
+number of type $A$ objects and $N_B$ the number of type $B$ objects, so
+$N = N_A + N_B$ is the total number of objects.
+
+We sample $n$ items *without replacement* ($n <= N$) with order not
+mattering. Denote by $X$ the number of type $A$ objects in our sample.
+
+#definition[
+  Let $0 <= N_A <= N$ and $1 <= n <= N$ be integers. A random variable $X$
+  has the *hypergeometric distribution* with parameters $(N, N_A, n)$ if $X$
+  takes values in the set ${0,1,...,n}$ and has p.m.f.
+
+  $ P(X = k) = (vec(N_A,k) vec(N-N_A,n-k)) / vec(N,n) = p(k) $
+]
+
+#example[
+  Suppose there are $N_A = 10$ defectives and $N_B = 90$ non-defectives. We
+  select $n=5$ without replacement. What is the probability that 2 of the 5
+  selected are defective?
+
+  $
+    X ~ "Hypergeom" (N = 100, N_A = 10, n = 5)
+  $
+
+  We want $P(X=2)$.
+
+  $
+    P(X=2) = (vec(10,2) vec(90,3)) / vec(100,5) approx 0.0702
+  $
+]
+
+#remark[
+  Make sure you can distinguish when a problem is binomial and when it is
+  hypergeometric. This is very important on exams.
+
+  Recall that both count the number of successes in a fixed number of draws,
+  but the binomial corresponds to sampling with replacement (each trial is
+  independent), while sampling without replacement gives the hypergeometric.
+]
+
+#example[
+  A cat gives birth to 6 kittens: 2 are male and 4 are female. Your neighbor
+  comes and picks up 3 kittens randomly to take home with them.
+
+  How do we define the random variable? What is its p.m.f.?
+
+  Let $X$ be the number of male kittens in the neighbor's selection. Then
+
+  $ X ~ "Hypergeom"(N = 6, N_A = 2, n = 3) $
+
+  and $X$ effectively takes values in ${0,1,2}$. Find the p.m.f. by finding
+  probabilities for these values.
+
+  $
+    &P(X = 0) = (vec(2,0) vec(4,3)) / vec(6,3) = 4 / 20 \
+    &P(X = 1) = (vec(2,1) vec(4,2)) / vec(6,3) = 12 / 20 \
+    &P(X = 2) = (vec(2,2) vec(4,1)) / vec(6,3) = 4 / 20 \
+    &P(X = 3) = (vec(2,3) vec(4,0)) / vec(6,3) = 0
+  $
+
+  Note that for $P(X=3)$, we are asking for 3 successes (drawing 3 males)
+  when there are only 2 males, so the probability must be 0.
+]
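+
+The last two examples are easy to check numerically. This is only a sketch
+in Python, again using the standard library's `math.comb`; the helper name
+`hypergeom_pmf` is ours.
+
+```python
+from math import comb
+
+def hypergeom_pmf(k: int, N: int, N_A: int, n: int) -> float:
+    """P(X = k) for X ~ Hypergeom(N, N_A, n)."""
+    return comb(N_A, k) * comb(N - N_A, n - k) / comb(N, n)
+
+# Defectives example: P(X = 2) for Hypergeom(100, 10, 5).
+print(round(hypergeom_pmf(2, 100, 10, 5), 4))         # 0.0702
+
+# Kitten example: the whole p.m.f. of Hypergeom(6, 2, 3).
+print([hypergeom_pmf(k, 6, 2, 3) for k in range(4)])  # [0.2, 0.6, 0.2, 0.0]
+```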
+
+== Geometric distribution
+
+Consider an infinite sequence of independent trials; for example, counting
+the number of attempts until I make a basket.
+
+Let $X_i$ denote the outcome of the $i^"th"$ trial, where success is 1 and
+failure is 0. Let $N$ be the number of trials needed to observe the first
+success in a sequence of independent trials with probability of success $p$.
+Suppose we fail $k-1$ times and succeed on the $k^"th"$ try. Then:
+
+$
+  P(N = k) = P(X_1 = 0, X_2 = 0, ..., X_(k-1) = 0, X_k = 1) = (1 - p)^(k-1) p
+$
+
+This is the probability of failure raised to the number of failures, times
+the probability of success.
+
+The key characteristic of these trials is that we keep going until we
+succeed. There's no $n$ choose $k$ in front, unlike the binomial
+distribution, because exactly one sequence gives the first success on the
+$k^"th"$ trial.
+
+#definition[
+  Let $0 < p <= 1$. A random variable $X$ has the *geometric distribution*
+  with success parameter $p$ if the possible values of $X$ are ${1,2,3,...}$
+  and $X$ satisfies
+
+  $
+    P(X=k) = (1-p)^(k-1) p
+  $
+
+  for positive integers $k$. Abbreviate this by $X ~ "Geom"(p)$.
+]
+
+#example[
+  What is the probability it takes more than seven rolls of a fair die to
+  roll a six?
+
+  Let $X$ be the number of rolls of a fair die until the first six. Then
+  $X ~ "Geom"(1/6)$. Now we just want $P(X > 7)$.
+
+  $
+    P(X > 7) = sum^infinity_(k=8) P(X=k) = sum^infinity_(k=8) (5 / 6)^(k-1) 1 / 6
+  $
+
+  Re-indexing with $j = k - 8$,
+
+  $
+    sum^infinity_(k=8) (5 / 6)^(k-1) 1 / 6 = 1 / 6 (5 / 6)^7 sum^infinity_(j=0) (5 / 6)^j
+  $
+
+  Now we sum the geometric series:
+
+  $
+    1 / 6 (5 / 6)^7 sum^infinity_(j=0) (5 / 6)^j = 1 / 6 (5 / 6)^7 dot 1 / (1-5 / 6) =
+    (5 / 6)^7
+  $
+]
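+
+As a final numerical check, a small sketch in Python (the variable names are
+ours): truncating the infinite sum for $P(X > 7)$ reproduces the closed form
+$(5/6)^7$.
+
+```python
+# X ~ Geom(p): P(X = k) = (1 - p)**(k - 1) * p for k = 1, 2, 3, ...
+p, m = 1 / 6, 7
+
+# Truncate the tail sum P(X > m) = sum of P(X = k) over k > m.
+tail = sum((1 - p) ** (k - 1) * p for k in range(m + 1, 500))
+
+print(tail)          # approximately 0.279081...
+print((1 - p) ** m)  # (5/6)**7 = 0.279081..., matching the closed form
+```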