#import "@youwen/zen:0.1.0": *
#import "@preview/ctheorems:1.1.3": *
#show: zen.with(
title: "PSTAT120A Course Notes",
author: "Youwen Wu",
date: "Winter 2025",
subtitle: "Taught by Brian Wainwright",
)
#outline()
= Introduction
PSTAT 120A is an introductory course on probability and statistics. However, it
is a theoretical course rather than an applied statistics course. You will not learn
how to read or conduct real-world statistical studies. Leave your $p$-values at
home, this ain't your momma's AP Stats.
= Lecture #datetime(day: 6, month: 1, year: 2025).display()
== Preliminaries
#definition[
Statistics is the science dealing with the collection, summarization,
analysis, and interpretation of data.
]
== Set theory for dummies
A terse introduction to elementary naive set theory and the basic operations on sets.
#remark[
Keep in mind that without $cal(Z F C)$ or another model of set theory that
resolves fundamental issues, our set theory is subject to paradoxes like
Russell's. Whoops, the universe doesn't exist.
]
#definition[
A *set* is a collection of elements.
]
#example[Examples of sets][
+ Trivial set: ${1}$
+ Empty set: $emptyset$
+ $A = {a,b,c}$
]
We can construct sets using set-builder notation (also sometimes called set
comprehension).
$ {"expression with" x | "conditions on" x} $
#example("Set builder notation")[
+ The set of all even integers: ${2n | n in ZZ}$
+ The set of all perfect squares (as a subset of $RR$): ${x^2 | x in NN}$
]
We also have notation for working with sets:
With arbitrary sets $A$, $B$:
+ $a in A$ ($a$ is a member of the set $A$)
+ $a in.not A$ ($a$ is not a member of the set $A$)
+ $A subset.eq B$ (Set theory: $A$ is a subset of $B$) (Stats: $A$ is an event in the sample space $B$)
+ $A subset B$ (Proper subset: $A != B$)
+ $A^c$ or $A'$ (read "complement of $A$," and introduced later)
+ $A union B$ (Union of $A$ and $B$. Gives a set with both the elements of $A$ and $B$)
+ $A sect B$ (Intersection of $A$ and $B$. Gives a set consisting of the elements in *both* $A$ and $B$)
+ $A \\ B$ (Set difference. The set of all elements of $A$ that are not also in $B$)
+ $A times B$ (Cartesian product. Ordered pairs of $(a,b)$ $forall a in A$, $forall b in B$)
We can also write a few of these operations precisely as set comprehensions.
+ $A subset.eq B <==> (forall a in A, a in B)$
+ $A union B = {x | x in A or x in B}$ (here $or$ is the logical OR)
+ $A sect B = {x | x in A and x in B}$ (here $and$ is the logical AND)
+ $A \\ B = {a | a in A and a in.not B}$
+ $A times B = {(a,b) | a in A, b in B}$
Take a moment and convince yourself that these definitions are equivalent to
the previous ones.
#definition[
The universal set $Omega$ is the set of all objects in a given set
theoretical universe.
]
With the above definition, we can now introduce the set complement.
#definition[
The set complement $A'$ is given by
$
A' = Omega \\ A
$
where $Omega$ is the _universal set_.
]
#example[The real plane][
The real plane $RR^2$ can be defined as a Cartesian product of $RR$ with
itself.
$ RR^2 = RR times RR $
]
Check your intuition that this makes sense. Why do you think $RR^n$ was chosen
as the notation for $n$-dimensional real space?
#definition[Disjoint sets][
If $A sect B$ = $emptyset$, then we say that $A$ and $B$ are *disjoint*.
]
#fact[
For any sets $A$ and $B$, we have DeMorgan's Laws:
+ $(A union B)' = A' sect B'$
+ $(A sect B)' = A' union B'$
]
#fact[Generalized DeMorgan's][
+ $(union.big_i A_i)' = sect.big_i A_i '$
+ $(sect.big_i A_i)' = union.big_i A_i '$
]
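A quick sanity check of DeMorgan's laws on small finite sets:
#example[Verifying DeMorgan's laws][
Let $Omega = {1,2,3,4}$, $A = {1,2}$, and $B = {2,3}$. Then $A' = {3,4}$ and $B' = {1,4}$, so
$
(A union B)' = {1,2,3}' = {4} = {3,4} sect {1,4} = A' sect B' \
(A sect B)' = {2}' = {1,3,4} = {3,4} union {1,4} = A' union B'
$
]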
== Sizes of infinity
#definition[
Let $N(A)$ be the number of elements in $A$. $N(A)$ is called the _cardinality_ of $A$.
]
We say a set is finite if it has finite cardinality, or infinite if it has an
infinite cardinality.
Infinite sets can be either _countably infinite_ or _uncountably infinite_.
When a set is countably infinite, its cardinality is $aleph_0$ (here $aleph$ is
the Hebrew letter aleph and read "aleph null").
When a set is uncountably infinite, its cardinality is greater than $aleph_0$.
#example("Countable sets")[
+ The natural numbers $NN$.
+ The rationals $QQ$.
+ The integers $ZZ$.
+ The set of all logical tautologies.
]
#example("Uncountable sets")[
+ The real numbers $RR$.
+ The real numbers in the interval $[0,1]$.
+ The _power set_ of $ZZ$, which is the set of all subsets of $ZZ$.
]
#remark[
All the uncountable sets above have cardinality $2^(aleph_0)$, also written
$frak(c)$ or $beth_1$. This is the _cardinality of the continuum_ (it equals
$aleph_1$ precisely when the continuum hypothesis holds).
However, in general uncountably infinite sets do not have the same
cardinality.
]
#fact[
If a set is countably infinite, then it has a bijection with $ZZ$. This means
every set with cardinality $aleph_0$ has a bijection to $ZZ$. More generally,
any sets with the same cardinality have a bijection between them.
]
This gives us the following equivalent statement:
#fact[
Two sets have the same cardinality if and only if there exists a bijective
function between them. In symbols,
$ N(A) = N(B) <==> exists F : A <-> B $
]
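Here is an explicit bijection witnessing that $NN$ and $ZZ$ have the same
cardinality (taking $NN = {0, 1, 2, ...}$):
#example[$NN$ and $ZZ$][
Define $f : NN -> ZZ$ by $f(2k) = k$ and $f(2k+1) = -(k+1)$, so
$ 0, 1, 2, 3, 4, ... |-> 0, -1, 1, -2, 2, ... $
Every integer is hit exactly once, so $f$ is a bijection and $N(NN) = N(ZZ) = aleph_0$.
]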
= Lecture #datetime(day: 8, month: 1, year: 2025).display()
== Probability
#definition[
A *random experiment* is one in which the set of all possible outcomes is known in advance, but one can't predict which outcome will occur on a given trial of the experiment.
]
#example("Finite sample spaces")[
Toss a coin:
$ Omega = {H,T} $
Roll a pair of dice:
$ Omega = {1,2,3,4,5,6} times {1,2,3,4,5,6} $
]
#example("Countably infinite sample spaces")[
Shoot a basket until you make one:
$ Omega = {M, F M, F F M, F F F M, dots} $
]
#example("Uncountably infinite sample space")[
Waiting time for a bus:
$ Omega = {t : t >= 0} $
]
#fact[
Elements of $Omega$ are called sample points.
]
#definition[
Any properly defined subset of $Omega$ is called an *event*.
]
#example[Dice][
Rolling a fair die twice, let $A$ be the event that the combined score of both dice is 10.
$ A = {(4,6), (5,5), (6,4)} $
]
Probabilistic concepts in the parlance of set theory:
- Universal set ($Omega$) $<->$ sample space
- Element $<->$ outcome / sample point ($omega$)
- Disjoint sets $<->$ mutually exclusive events
== Classical approach
Classical approach:
$ P(A) = (hash A) / (hash Omega) $
Requires equally likely outcomes and finite sample spaces.
#remark[
With an infinite sample space, this ratio assigns probability 0 to any finite event, which is often not what we want.
]
#example("Dice again")[
Rolling a fair die twice, let $A$ be the event that the combined score of both dice is 10.
$
A &= {(4,6), (5,5), (6,4)} \
P(A) &= 3 / 36 = 1 / 12
$
]
== Relative frequency approach
An approach done commonly by applied statisticians who work in the disgusting
real world. This is where we are generally concerned with irrelevant concerns
like accurate sampling and $p$-values and such. I am told this is covered in
PSTAT 120B, so hopefully I can avoid ever taking that class (as a pure math
major).
$
P(A) = (hash "of times" A "occurs in large number of trials") / (hash "of trials")
$
#example[
Flipping a coin to determine the probability of it landing heads.
]
== Subjective approach
Personal definition of probability. Not "real" probability, merely co-opting
its parlance to lend credibility to subjective judgements of confidence.
== Axiomatic approach
Consider a random experiment. Then:
#definition[
The *sample space* $Omega$ is the set of all possible outcomes of the
experiment.
]
#definition[
Elements of $Omega$ are called *sample points*.
]
#definition[
Subsets of $Omega$ are called *events*. The collection of events in $Omega$
(in other words, the power set of $Omega$) is denoted by $cal(F)$.
]
#definition[
The *probability measure* (also called the probability distribution, or simply the probability) is a function $P : cal(F) -> RR$ satisfying the following axioms (properties):
+ $P(A) >= 0, forall A$
+ $P(Omega) = 1$
+ If $A_i sect A_j = emptyset, forall i != j$, then
$ P(union.big_(i=1)^infinity A_i) = sum_(i=1)^infinity P(A_i) $
]
The 3-tuple $(Omega, cal(F), P)$ is called a *probability space*.
#remark[
In more advanced texts you will see $cal(F)$ introduced as a so-called
$sigma$-algebra. A $sigma$-algebra on a set $Omega$ is a nonempty collection
$Sigma$ of subsets of $Omega$ that is closed under set complement, countable
unions, and as a corollary, countable intersections.
]
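To make these definitions concrete, here is the smallest interesting probability space.
#example[A fair coin as a probability space][
Let $Omega = {H, T}$, $cal(F) = {emptyset, {H}, {T}, Omega}$, and define
$ P(emptyset) = 0, quad P({H}) = P({T}) = 1 / 2, quad P(Omega) = 1 $
All three axioms hold, so $(Omega, cal(F), P)$ is a probability space.
]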
Now let us show various results with $P$.
#proposition[
$ P(emptyset) = 0 $
]
#proof[
By axiom 3, take $A_1 = A_2 = A_3 = dots.c = emptyset$. These sets are pairwise disjoint, so
$
P(emptyset) = P(union.big_(i=1)^infinity A_i) = sum^infinity_(i=1) P(A_i) = sum^infinity_(i=1) P(emptyset)
$
Suppose $P(emptyset) != 0$. Then $P(emptyset) > 0$ by axiom 1, and the right-hand side diverges to $infinity$ while the left-hand side is a finite real number, a contradiction. So $P(emptyset) = 0$.
]
#proposition[
If $A_1, A_2, ..., A_n$ are disjoint, then
$ P(union.big^n_(i=1) A_i) = sum^n_(i= 1) P(A_i) $
]
This is mostly a formal manipulation to derive the obviously true proposition from our axioms.
#proof[
Write any finite collection $(A_1, A_2, ..., A_n)$ as an infinite sequence $(A_1, A_2, ..., A_n, emptyset, emptyset, ...)$. Then
$
P(union.big_(i=1)^infinity A_i) = sum^n_(i=1) P(A_i) + sum^infinity_(i=n+1) P(emptyset) = sum^n_(i=1) P(A_i)
$
And because all of the sets after $A_n$ are $emptyset$, they add no additional elements to the union of all $A_i$, so
$
P(union.big_(i=1)^infinity A_i) = P(union.big_(i=1)^n A_i) = sum_(i=1)^n P(A_i)
$
]
#proposition[Complement][
$ P(A') = 1 - P(A) $
]
#proof[
$
A' union A &= Omega \
A' sect A &= emptyset \
P(A') + P(A) &= P(A' union A) &"(by axiom 3)" \
&= P(Omega) = 1 &"(by axiom 2)" \
therefore P(A') &= 1 - P(A)
$
]
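The complement rule is handy when the complement of an event is easier to describe than the event itself.
#example[At least one head][
Toss a fair coin 3 times and let $A$ be the event that at least one head appears. The complement consists of the single outcome $A' = {T T T}$, so
$ P(A) = 1 - P(A') = 1 - 1 / 8 = 7 / 8 $
]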
#proposition[
$ A subset.eq B => P(A) <= P(B) $
]
#proof[
$ B = A union (A' sect B) $
but $A$ and ($A' sect B$) are disjoint, so
$
P(B) &= P(A union (A' sect B)) \
&= P(A) + P(A' sect B) \
&>= P(A) quad "(since " P(A' sect B) >= 0 " by axiom 1)"
$
]
#proposition[
$ P(A union B) = P(A) + P(B) - P(A sect B) $
]
#proof[
$
A = (A sect B) union (A sect B') => P(A) = P(A sect B) + P(A sect B') \
B = (A sect B) union (A' sect B) => P(B) = P(A sect B) + P(A' sect B)
$
Adding these two equations,
$
P(A) + P(B) = 2 P(A sect B) + P(A sect B') + P(A' sect B)
$
Since $A union B = (A sect B') union (A sect B) union (A' sect B)$ is a union of disjoint sets, axiom 3 gives
$
P(A union B) = P(A sect B) + P(A sect B') + P(A' sect B) = P(A) + P(B) - P(A sect B)
$
]
#remark[
This is a strengthening of axiom 3 for two events: it holds for all events $A$ and $B$, regardless of whether they're disjoint.
]
#remark[
These are mostly intuitively true statements (think about the probabilistic
concepts represented by the sets) in classical probability that we derive
rigorously from our axiomatic probability function $P$.
]
#example[
Now let us consider some trivial concepts in classical probability written in
the parlance of combinatorial probability.
Select one card from a deck of 52 cards.
Then the following is true:
$
Omega = {1,2,...,52} \
A = "card is a heart" = {H 2, H 3, H 4, ..., H"Ace"} \
B = "card is an Ace" = {H"Ace", C"Ace", D"Ace", S"Ace"} \
C = "card is black" = {C 2, C 3, ..., C"Ace", S 2, S 3, ..., S"Ace"} \
P(A) = 13 / 52,
P(B) = 4 / 52,
P(C) = 26 / 52 \
P(A sect B) = 1 / 52 \
P(A sect C) = 0 \
P(B sect C) = 2 / 52 \
P(A union B) = P(A) + P(B) - P(A sect B) = 16 / 52 \
P(B') = 1 - P(B) = 48 / 52 \
P(A sect B') = P(A) - P(A sect B) = 13 / 52 - 1 / 52 = 12 / 52 \
P((A sect B') union (A' sect B)) = P(A sect B') + P(A' sect B) = 15 / 52 \
P(A' sect B') = P((A union B)') = 1 - P(A union B) = 36 / 52
$
]
== Countable sample spaces
#definition[
A sample space $Omega$ is said to be *countable* if it's finite or countably infinite.
]
In such a case, one can list the elements of $Omega$.
$ Omega = {omega_1, omega_2, omega_3, ...} $
with associated probabilities, $p_1, p_2, p_3,...$, where
$
p_i = P(omega_i) >= 0 \
1 = P(Omega) = sum P(omega_i)
$
#example[Fair die, again][
All outcomes are equally likely,
$ p_1 = p_2 = ... = p_6 = 1 / 6 $
Let $A = {1,3,5}$ be the event that the score is odd. Then
$ P(A) = 3 / 6 $
]
#example[Loaded die][
Consider a die where the probability of rolling each odd side is double the probability of rolling each even side.
$
p_2 = p_4 = p_6, p_1 = p_3 = p_5 = 2p_2 \
6p_2 + 3p_2 = 9p_2 = 1 \
p_2 = 1 / 9, p_1 = 2 / 9
$
]
#example[Coins][
Toss a fair coin until you get the first head.
$
Omega = {H, T H, T T H, ...} "(countably infinite)" \
P(H) = 1 / 2 \
P(T T H) = (1 / 2)^3 \
P(Omega) = sum_(n=1)^infinity (1 / 2)^n = 1 / (1 - 1 / 2) - 1 = 1
$
]
#example[Birthdays][
What is the probability two people share the same birthday?
$
Omega = {1, ..., 365} times {1, ..., 365} \
P(A) = 365 / 365^2 = 1 / 365
$
]
== Continuous sample spaces
#definition[
A *continuous sample space* contains an interval in $RR$ and is uncountably infinite.
]
#definition[
A probability density function (#smallcaps[pdf]) $f$ gives the density of
probability at the point $s$; probabilities of events are obtained by
integrating $f$ over them.
]
Properties of the #smallcaps[pdf]:
- $f(s) >= 0$ for all $s in Omega$
- $integral_Omega f(s) dif s = 1$
#example[
Waiting time for bus: $Omega = {s : s >= 0}$.
]
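As a concrete check of these properties, here is one valid density on the bus-waiting sample space (chosen purely for illustration).
#example[A valid #smallcaps[pdf]][
On $Omega = {s : s >= 0}$, let $f(s) = e^(-s)$. Then $f(s) >= 0$ for all $s >= 0$, and since an antiderivative of $e^(-s)$ is $-e^(-s)$,
$ integral_0^infinity e^(-s) dif s = 0 - (-1) = 1 $
so $f$ satisfies both properties.
]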
= Notes on counting
The cardinality of $A$ is given by $hash A$. Let us develop methods for finding
$hash A$ from a description of the set $A$ (in other words, methods for
counting).
== General multiplication principle
#fact[
Let $A$ and $B$ be finite sets, $k in ZZ^+$. Then let $f : A -> B$ be a
function such that each element in $B$ is the image of exactly $k$ elements
in $A$ (such a function is called _$k$-to-one_). Then $hash A = k dot hash
B$.
]<ktoone>
#example[
Four fully loaded 10-seater vans transported people to the picnic. How many
people were transported?
By @ktoone: let $A$ be the set of people, $B$ the set of vans, and $f : A -> B$ the function mapping each person to the van they rode in. Then $f$ is a 10-to-one function and $hash B = 4$, so $hash A = 10 dot 4 = 40$.
]
#definition[
An $n$-tuple is an ordered sequence of $n$ elements.
]
Many of our methods in probability rely on multiplying together the number of
alternatives at each step to obtain a total count of outcomes. We make this explicit below in @tuplemultiplication.
#fact[
Suppose a set of $n$-tuples $(a_1, ..., a_n)$ obeys these rules:
+ There are $r_1$ choices for the first entry $a_1$.
+ Once the first $k$ entries $a_1, ..., a_k$ have been chosen, the number of alternatives for the next entry $a_(k+1)$ is $r_(k+1)$, regardless of the previous choices.
Then the total number of $n$-tuples is the product $r_1 dot r_2 dot dots.c dot r_n$.
]<tuplemultiplication>
#proof[
It is trivially true for $n = 1$ since you have $r_1$ choices of $a_1$ for a
1-tuple $(a_1)$.
Let $A$ be the set of all possible $n$-tuples and $B$ be the set of all
possible $(n+1)$-tuples. Now let us assume the statement is true for $A$.
Proceed by induction on $B$, noting that for each $n$-tuple $(a_1, ..., a_n)$ in
$A$, there are $r_(n+1)$ corresponding $(n+1)$-tuples in $B$.
Let $f : B -> A$ be a function which takes each $(n+1)$-tuple and truncates the $a_(n+1)$ term, leaving us with just an $n$-tuple of the form $(a_1, a_2, ..., a_n)$.
$ f((a_1, ..., a_n, a_(n + 1))) = (a_1, ..., a_n) $
Now notice that $f$ is precisely a $r_(n+1)$-to-one function! Recall by
our assumption that @tuplemultiplication is true for $n$-tuples, so $A$ has $r_1 dot
r_2 dot ... dot r_n$ elements, or $hash A = r_1 dot ... dot r_n$. Then by
@ktoone, we have $hash B = hash A dot r_(n+1) = r_1 dot r_2 dot
... dot r_(n+1)$. Our induction is complete and we have proved @tuplemultiplication.
]
@tuplemultiplication is sometimes called the _general multiplication principle_.
We can use @tuplemultiplication to derive counting formulas for various
situations. Let $A_1, A_2, ..., A_n$ be finite sets. Then as a corollary of
@tuplemultiplication, we can count the number of $n$-tuples in the finite
Cartesian product of $A_1, A_2, ..., A_n$.
#fact[
Let $A_1, A_2, ..., A_n$ be finite sets. Then
$
hash (A_1 times A_2 times dots.c times A_n) = (hash A_1) dot (hash A_2) dot dots.c dot (hash A_n) = product^n_(i=1) (hash A_i)
$
]
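A quick application of this product formula:
#example[License plates][
How many license plates consist of 3 letters followed by 3 digits? Each plate is a 6-tuple with 26 alternatives for each of the first three entries and 10 alternatives for each of the last three, so there are
$ 26^3 dot 10^3 = 17576000 $
possible plates.
]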
#example[
How many distinct subsets does a set of size $n$ have?
The answer is $2^n$. Each subset can be encoded as an $n$-tuple with entries 0
or 1, where the $i$th entry is 1 if the $i$th element of the set is in the
subset and 0 if it is not.
Thus the number of subsets is the same as the cardinality of
$ {0,1} times ... times {0,1} = {0,1}^n $
which is $2^n$.
This is why given a set $X$ with cardinality $aleph$, we write the
cardinality of the power set of $X$ as $2^aleph$.
]
== Permutations
Now we can use the multiplication principle to count permutations.
#fact[
Consider all $k$-tuples $(a_1, ..., a_k)$ that can be constructed from a set $A$ of size $n, n>= k$ without repetition. The total number of these $k$-tuples is
$ (n)_k = n dot (n - 1) ... (n - k + 1) = n! / (n-k)! $
In particular, with $k=n$, each $n$-tuple is an ordering or _permutation_ of $A$. So the total number of permutations of a set of $n$ elements is $n!$.
]<permutation>
#proof[
We construct the $k$-tuples sequentially. For the first element, we choose
one element from $A$ with $n$ alternatives. The next element has $n - 1$
alternatives. In general, after $j$ elements are chosen, there are $n - j +
1$ alternatives.
Then clearly after choosing $k$ elements for our $k$-tuple we have by
@tuplemultiplication the number of $k$-tuples being $n dot (n - 1) dot ...
dot (n - k + 1) = (n)_k$.
]
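A quick numerical instance of @permutation:
#example[Podium finishes][
In a race with 10 runners, the number of ways to award gold, silver, and bronze is the number of ordered 3-tuples without repetition from a set of size 10:
$ (10)_3 = 10 dot 9 dot 8 = 720 $
]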
#example[
Consider a round table with 8 seats.
+ In how many ways can we seat 8 guests around the table?
+ In how many ways can we do this if we do not differentiate between seating arrangements that are rotations of each other?
For (1), we easily see that we're simply asking for permutations of an
8-tuple, so $8!$ is the answer.
For (2), we number each person and each seat from 1-8, then always place person 1 in seat 1, and count the permutations of the other 7 people in the other 7 seats. Then the answer is $7!$.
Alternatively, notice that each arrangement has 8 equivalent arrangements under rotation. So the answer is $8!/8 = 7!$.
]
== Counting from sets
We turn our attention to sets, which unlike tuples are unordered collections.
#fact[
Let $n,k in NN$ with $0 <= k <= n$. The number of distinct subsets of size $k$ that a set of size $n$ has is given by the *binomial coefficient*
$ vec(n,k) = n! / (k! (n-k)!) $
]
#proof[
Let $A$ be a set of size $n$. By @permutation, $n!/(n-k)!$ unique ordered
$k$-tuples can be constructed from elements of $A$. Each subset of $A$ of
size $k$ has exactly $k!$ different orderings, and hence appears exactly $k!$
times among the ordered $k$-tuples. Thus the number of subsets of size $k$ is
$n! / (k! (n-k)!)$.
]
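As a small sanity check of the formula:
#example[Subsets of size 2][
The set ${1,2,3,4}$ has $vec(4,2) = 4! / (2! dot 2!) = 6$ subsets of size 2, namely
$ {1,2}, {1,3}, {1,4}, {2,3}, {2,4}, {3,4} $
]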
#example[
In a class there are 12 boys and 14 girls. How many different teams of 7 pupils
with 3 boys and 4 girls can be created?
First let us compute how many subsets of size 3 we can choose from the 12 boys and how many subsets of size 4 we can choose from the 14 girls.
$
"boys" &= vec(12,3) \
"girls" &= vec(14,4)
$
Then let us consider the entire team as a 2-tuple of (boys, girls). Then
there are $vec(12,3)$ alternatives for the choice of boys, and $vec(14,4)$ alternatives for
the choice of girls, so by the multiplication principle, we have the total being
$ vec(12,3) vec(14,4) $
]
#example[
Let $A = {1, 2, ..., 6}$. Color the numbers 1, 2 red, the numbers 3, 4 green, and
the numbers 5, 6 yellow. How many different two-element subsets of $A$ are there that have two
different colors?
First choose 2 colors, $vec(3,2) = 3$. Then from each color, choose one. Altogether it's
$ vec(3,2) vec(2,1) vec(2,1) = 3 dot 2 dot 2 = 12 $
]
One way to view $vec(n,k)$ is as the number of ways of painting $n$ elements
with two colors, red and yellow, with $k$ red and $n - k$ yellow elements. Let
us generalize to more than two colors.
#fact[
Let $n$ and $r$ be positive integers and $k_1, ..., k_r$ nonnegative integers
such that $k_1 + dots.c + k_r = n$. The number of ways of assigning labels
$1,2, ..., r$ to $n$ items so that for each $i = 1, 2, ..., r$, exactly $k_i$
items receive label $i$, is the *multinomial coefficient*
$ vec(n, (k_1, k_2, ..., k_r)) = n! / (k_1 ! k_2 ! dots.c k_r !) $
]<multinomial-coefficient>
#proof[
Order the $n$ items in some manner, and assign labels like this: for the
first $k_1$ items, assign the label 1, then for the next $k_2$ items,
assign the label 2, and so on. The $i$th label will be assigned to all the
items between positions $k_1 + dots.c + k_(i-1) + 1$ and $k_1 + dots.c +
k_i$.
Then notice that all possible orderings (permutations) of the items give
every possible way to label the items. However, we overcount by some
amount. How much? The order of the items sharing a given label doesn't matter,
so we need to deduplicate those.
Each labeling is duplicated once for each way we can reorder the items that
share a label. For label $i$, there are $k_i$ items with that label, so
$k_i !$ ways to order those. By @tuplemultiplication, the combined number of
ways to reorder the items within all $r$ groups is
$k_1 ! k_2 ! k_3 ! dots.c k_r !$.
So by @ktoone, we can account for the duplicates and the answer is
$ n! / (k_1 ! k_2 ! k_3 ! dots.c k_r !) $
]
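A classic application of @multinomial-coefficient is counting rearrangements of a word with repeated letters.
#example[Anagrams][
The word "MISSISSIPPI" has 11 letters: one M, four I's, four S's, and two P's. Each distinct rearrangement is an assignment of the labels M, I, S, P to the 11 positions, so the number of distinct rearrangements is
$ vec(11, (1,4,4,2)) = 11! / (1! dot 4! dot 4! dot 2!) = 34650 $
]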
#remark[
@multinomial-coefficient gives us a way to count how many ways there are to
fit $n$ distinguishable objects into $r$ distinguishable containers of
varying capacity.
To find the number of ways to fit $n$ _indistinguishable_ objects into $k$
distinguishable containers of _any_ capacity, use the "stars and bars"
(ball-and-urn) technique.
]
#example[
How many different ways can six people be divided into three pairs?
First we use the multinomial coefficient to count the number of ways to assign specific labels to pairs of elements:
$ vec(6, (2,2,2)) $
But notice that the actual labels themselves are irrelevant. Our multinomial
coefficient counts how many ways there are to assign 3 distinguishable
labels, say Pair 1, Pair 2, Pair 3, to our 6 elements.
To make this more explicit, say we had a 3-tuple where the position encoded
the label, where position 1 corresponds to Pair 1, and so on. Then the values
are the actual pairs of people (numbered 1-6). For instance
$ ((1,2), (3,4), (5,6)) $
corresponds to assigning the label Pair 1 to (1,2), Pair 2 to (3,4) and Pair
3 to (5,6). What our multinomial coefficient is doing is counting this,
as well as any other orderings of this tuple. For instance
$ ((3,4), (1,2), (5,6)) $
is also counted. However since in our case the actual labels are irrelevant,
the two examples shown above should really be counted only once.
How many extra times is each case counted? It turns out that we can think of
our multinomial coefficient as permuting the labels across our pairs. So in
this case it's permuting all the ways we can order 3 labels, which is $3! =
6$. That means by @ktoone our answer is
$ vec(6, (2,2,2)) / 3! = 15 $
]
#example("Poker")[
How many poker hands are in the category _one pair_?
A one pair is a hand with two cards of the same rank and three cards with ranks
different from each other and the pair.
We can count in two ways: we count all the ordered hands, then divide by $5!$
to remove overcounting, or we can build the unordered hands directly.
When finding the ordered hands, the key is to figure out how we can encode
our information in a tuple of the form described in @tuplemultiplication, and
then use @tuplemultiplication to compute the solution.
In this case, the first element encodes the two slots in the 5-card hand the
pair occupies, the second element encodes the first card of the pair, the
third element encodes the second card of the pair, and the fourth, fifth, and
sixth elements represent the 3 cards that are not of the same rank.
Now it is clear that the number of alternatives in each position of the
6-tuple does not depend on any of the others, so @tuplemultiplication
applies. Then we can determine the amount of alternatives for each position
in the 6-tuple and multiply them to determine the total amount of ways the
6-tuple can be constructed, giving us the total amount of ways to construct
ordered poker hands with one pair.
First we choose 2 slots out of 5 positions (in the hand) so there are
$vec(5,2)$ alternatives. Then we choose any of the 52 cards for our first
pair card, so there are 52 alternatives. Then we choose any card with the
same rank for the second card in the pair, where there are 3 possible
alternatives. Then we choose the third card which must not be the same rank
as the first two, where there are 48 alternatives. The fourth card must not
be the same rank as the others, so there are 44 alternatives. Likewise, the
final card has 40 alternatives.
So the final answer is, remembering to divide by $5!$ because we don't care
about order,
$ (vec(5,2) dot 52 dot 3 dot 48 dot 44 dot 40) / 5! $
Alternatively, we can find a way to build an unordered hand with the
requirements. First we choose the rank of the pair, then we choose two suits
for that rank, then we choose the remaining 3 different ranks, and finally a
suit for each of the ranks. Then, noting that we will now omit constructing
the tuple and explicitly listing alternatives for brevity, we have
$ 13 dot vec(4,2) dot vec(12, 3) dot 4^3 $
Both approaches give the same answer.
]
= Discussion section #datetime(day: 22, month: 1, year: 2025).display()
= Lecture #datetime(day: 23, month: 1, year: 2025).display()
== Independence
#definition("Independence")[
Two events $A subset Omega$ and $B subset Omega$ are independent if and only if
$ P(B sect A) = P(B)P(A) $
"Joint probability is equal to product of their marginal probabilities."
]
#fact[This definition must be used to show the independence of two events.]
#fact[
If $A$ and $B$ are independent, then,
$
P(A | B) = underbrace((P(A sect B)) / P(B), "conditional probability") = (P(A) P(B)) / P(B) = P(A)
$
]
#example[
Flip a fair coin 3 times. Let the events:
- $A$ = we have exactly one tails among the first 2 flips
- $B$ = we have exactly one tails among the last 2 flips
- $D$ = we get exactly one tails among all 3 flips
Show that $A$ and $B$ are independent.
What about $B$ and $D$?
Compute all of the possible outcomes; then we see that
$
P(A sect B) = (hash (A sect B)) / (hash Omega) = 2 / 8 = 4 / 8 dot 4 / 8 = P(A) P(B)
$
So they are independent.
Repeating the same reasoning for $B$ and $D$, we see that they are not independent.
]
#example[
Suppose we have 4 red and 7 green balls in an urn. We choose two balls with replacement. Let
- $A$ = the first ball is red
- $B$ = the second ball is green
Are $A$ and $B$ independent?
$
hash Omega = 11 times 11 = 121 \
hash A = 4 dot 11 = 44 \
hash B = 11 dot 7 = 77 \
hash (A sect B) = 4 dot 7 = 28
$
Since $P(A sect B) = 28 / 121 = 44 / 121 dot 77 / 121 = P(A) P(B)$, the events $A$ and $B$ are indeed independent.
]
#definition[
Events $A_1, ..., A_n$ are independent (mutually independent) if for every collection $A_(i_1), ..., A_(i_k)$, where $2 <= k <= n$ and $1 <= i_1 < i_2 < dots.c < i_k <= n$,
$
P(A_(i_1) sect A_(i_2) sect dots.c sect A_(i_k)) = P(A_(i_1)) P(A_(i_2)) dots.c P(A_(i_k))
$
]
#definition[
We say that the events $A_1, ..., A_n$ are *pairwise independent* if any two
different events $A_i$ and $A_j$ are independent for any $i != j$.
]
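Pairwise independence does not imply mutual independence, as the following standard example shows.
#example[Pairwise but not mutually independent][
Toss a fair coin twice and let
- $A$ = the first toss is heads
- $B$ = the second toss is heads
- $C$ = the two tosses agree
Then $P(A) = P(B) = P(C) = 1 / 2$ and
$ P(A sect B) = P(A sect C) = P(B sect C) = 1 / 4 $
so each pair is independent. However,
$ P(A sect B sect C) = P({H H}) = 1 / 4 != 1 / 8 = P(A) P(B) P(C) $
so $A$, $B$, $C$ are not mutually independent.
]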