#import "./dvd.typ": *
#import "@preview/ctheorems:1.1.3": *

#show: dvdtyp.with(
title: "PSTAT120A Course Notes",
author: "Youwen Wu",
date: "Winter 2025",
subtitle: "Taught by Brian Wainwright",
)

#outline()

= Lecture #datetime(day: 6, month: 1, year: 2025).display()

== Preliminaries

#definition[
Statistics is the science dealing with the collection, summarization,
analysis, and interpretation of data.
]

== Set theory for dummies

A terse introduction to elementary naive set theory and the basic operations
on sets.

#remark[
Keep in mind that without $cal(Z F C)$ or another axiomatization of set theory
that resolves fundamental issues, our set theory is subject to paradoxes like
Russell's. Whoops, the universe doesn't exist.
]

#definition[
A *set* is a collection of elements.
]

#example[Examples of sets][
+ Singleton set: ${1}$
+ Empty set: $emptyset$
+ $A = {a,b,c}$
]

We can construct sets using set-builder notation (also sometimes called set
comprehension).

$ {"expression with" x | "conditions on" x} $

#example("Set builder notation")[
+ The set of all even integers: ${2n | n in ZZ}$
+ The set of all perfect squares: ${x^2 | x in NN}$
]

We also have notation for working with sets:

With arbitrary sets $A$, $B$:

+ $a in A$ ($a$ is a member of the set $A$)
+ $a in.not A$ ($a$ is not a member of the set $A$)
+ $A subset.eq B$ (Set theory: $A$ is a subset of $B$) (Stats: $A$ is an event in the sample space $B$)
+ $A subset B$ (Proper subset: $A subset.eq B$ and $A != B$)
+ $A^c$ or $A'$ (read "complement of $A$," and introduced later)
+ $A union B$ (Union of $A$ and $B$. Gives a set consisting of the elements in $A$ *or* $B$)
+ $A sect B$ (Intersection of $A$ and $B$. Gives a set consisting of the elements in *both* $A$ and $B$)
+ $A \\ B$ (Set difference. The set of all elements of $A$ that are not also in $B$)
+ $A times B$ (Cartesian product. The set of all ordered pairs $(a,b)$ with $a in A$ and $b in B$)

We can also state a few of these precisely, using logical notation and set comprehensions.

+ $A subset.eq B <==> (forall a in A, a in B)$
+ $A union B = {x | x in A or x in B}$ (here $or$ is the logical OR)
+ $A sect B = {x | x in A and x in B}$ (here $and$ is the logical AND)
+ $A \\ B = {a | a in A and a in.not B}$
+ $A times B = {(a,b) | a in A and b in B}$

Take a moment and convince yourself that these definitions are equivalent to
the previous ones.

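The set operations above map closely onto Python's built-in `set` type. The following is a minimal sketch: the sets `A` and `B` and the helper name `universe` are arbitrary choices made for illustration, and each comprehension mirrors one of the set-builder definitions.

```python
from itertools import product

# Example sets, chosen arbitrarily for illustration.
A = {1, 2, 3}
B = {3, 4}
universe = A | B  # candidate elements to test in the comprehensions below

# Set-builder versions of each operation.
union = {x for x in universe if x in A or x in B}
intersection = {x for x in universe if x in A and x in B}
difference = {a for a in A if a not in B}
cartesian = {(a, b) for a in A for b in B}

# Each comprehension agrees with the corresponding built-in operation.
assert union == A | B
assert intersection == A & B
assert difference == A - B
assert cartesian == set(product(A, B))

print(sorted(union))         # [1, 2, 3, 4]
print(sorted(intersection))  # [3]
print(sorted(difference))    # [1, 2]
print(len(cartesian))        # 6
```

Note that the Cartesian product of a 3-element set and a 2-element set has 3 times 2 = 6 ordered pairs.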
#definition[
The universal set $Omega$ is the set of all objects in a given set-theoretic
universe.
]

With the above definition, we can now introduce the set complement.

#definition[
The set complement $A'$ is given by
$
A' = Omega \\ A
$
where $Omega$ is the _universal set_.
]

#example[The real plane][
The real plane $RR^2$ can be defined as a Cartesian product of $RR$ with
itself.

$ RR^2 = RR times RR $
]

Check your intuition that this makes sense. Why do you think $RR^n$ was chosen
as the notation for $n$-dimensional real space?

#definition[Disjoint sets][
If $A sect B = emptyset$, then we say that $A$ and $B$ are *disjoint*.
]

#fact[
For any sets $A$ and $B$, we have De Morgan's laws:
+ $(A union B)' = A' sect B'$
+ $(A sect B)' = A' union B'$
]

#fact[Generalized De Morgan's laws][
+ $(union.big_i A_i)' = sect.big_i A_i '$
+ $(sect.big_i A_i)' = union.big_i A_i '$
]

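De Morgan's laws are easy to spot-check on a small finite universe. This is only a sanity check on one example, not a proof; the universe `omega`, the sets `A` and `B`, and the helper `complement` are assumptions made for illustration.

```python
# A small universal set and two subsets, chosen arbitrarily for illustration.
omega = set(range(10))
A = {0, 1, 2, 3}
B = {2, 3, 4, 5}


def complement(s):
    """Complement of s relative to the universal set omega."""
    return omega - s


# First law: (A union B)' = A' intersect B'
lhs_union = complement(A | B)
rhs_union = complement(A) & complement(B)

# Second law: (A intersect B)' = A' union B'
lhs_inter = complement(A & B)
rhs_inter = complement(A) | complement(B)

print(lhs_union == rhs_union)  # True
print(lhs_inter == rhs_inter)  # True
```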
== Sizes of infinity

#definition[
Let $N(A)$ be the number of elements in $A$. $N(A)$ is called the _cardinality_ of $A$.
]

We say a set is finite if it has finite cardinality, or infinite if it has
infinite cardinality.

Infinite sets can be either _countably infinite_ or _uncountably infinite_.

When a set is countably infinite, its cardinality is $aleph_0$ (here $aleph$ is
the Hebrew letter aleph, and $aleph_0$ is read "aleph null").

When a set is uncountably infinite, its cardinality is greater than $aleph_0$.

#example("Countable sets")[
+ The natural numbers $NN$.
+ The rationals $QQ$.
+ The integers $ZZ$.
+ The set of all logical tautologies.
]

#example("Uncountable sets")[
+ The real numbers $RR$.
+ The real numbers in the interval $[0,1]$.
+ The _power set_ of $ZZ$, which is the set of all subsets of $ZZ$.
]

#remark[
All the uncountable sets above have cardinality $2^(aleph_0)$, also written
$frak(c)$ or $beth_1$ ("beth 1"). This is the _cardinality of the continuum_.
Whether it equals $aleph_1$ ("aleph 1") is the continuum hypothesis, which
cannot be settled from the $cal(Z F C)$ axioms.

However, in general uncountably infinite sets do not have the same
cardinality.
]

#fact[
If a set is countably infinite, then it has a bijection with $ZZ$. This means
every set with cardinality $aleph_0$ has a bijection to $ZZ$. More generally,
any two sets with the same cardinality have a bijection between them.
]

This gives us the following equivalent statement:

#fact[
Two sets have the same cardinality if and only if there exists a bijective
function between them. In symbols,

$ N(A) = N(B) <==> exists F : A <-> B $
]

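As a concrete instance of such a bijection, here is a sketch of one standard pairing between $NN$ and $ZZ$, which is why both have cardinality $aleph_0$. It assumes $NN$ starts at 0, and the function names `nat_to_int` and `int_to_nat` are made up for this illustration.

```python
def nat_to_int(n: int) -> int:
    """Map 0, 1, 2, 3, 4, ... to 0, -1, 1, -2, 2, ... (even n go up, odd n go down)."""
    if n % 2 == 0:
        return n // 2
    return -((n + 1) // 2)


def int_to_nat(z: int) -> int:
    """Inverse of nat_to_int."""
    if z >= 0:
        return 2 * z
    return -2 * z - 1


print([nat_to_int(n) for n in range(7)])  # [0, -1, 1, -2, 2, -3, 3]

# Round trips in both directions, checked on a finite range only.
assert all(int_to_nat(nat_to_int(n)) == n for n in range(1000))
assert all(nat_to_int(int_to_nat(z)) == z for z in range(-500, 500))
```

The assertions only check finitely many values, but the two formulas are inverse to each other for every input, which is what makes the map a bijection.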
= Lecture #datetime(day: 8, month: 1, year: 2025).display()

== Probability

#definition[
A *random experiment* is one in which the set of all possible outcomes is known in advance, but one can't predict which outcome will occur on a given trial of the experiment. The set of all possible outcomes is the *sample space* $Omega$.
]

#example("Finite sample spaces")[
Toss a coin:
$ Omega = {H,T} $

Roll a pair of dice:
$ Omega = {1,2,3,4,5,6} times {1,2,3,4,5,6} $
]

#example("Countably infinite sample spaces")[
Shoot a basket until you make one:
$ Omega = {M, F M, F F M, F F F M, dots} $
]

#example("Uncountably infinite sample space")[
Waiting time for a bus:
$ Omega = {t : t >= 0} $
]

#fact[
Elements of $Omega$ are called sample points.
]

#definition[
Any properly defined subset of $Omega$ is called an *event*.
]

#example[Dice][
Rolling a fair die twice, let $A$ be the event that the combined score of both dice is 10.

$ A = {(4,6), (5,5),(6,4)} $
]

Probabilistic concepts in the parlance of set theory:

- Universal set ($Omega$) $<->$ sample space
- Element $<->$ outcome / sample point ($omega$)
- Disjoint sets $<->$ mutually exclusive events

== Classical approach

The classical approach defines the probability of an event $A$ by counting outcomes:

$ P(A) = (hash A) / (hash Omega) $

Here $hash A$ is the number of outcomes in $A$. This requires equally likely outcomes and a finite sample space.

#remark[
With an infinite sample space, $hash Omega$ is infinite, so the formula assigns probability 0 to every finite event, which is often wrong.
]

#example("Dice again")[
Rolling a fair die twice, let $A$ be the event that the combined score of both dice is 10.

$
A &= {(4,6), (5,5),(6,4)} \
P(A) &= 3 / 36 = 1 / 12
$
]

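The classical formula can be checked by brute-force enumeration. The sketch below builds the 36-outcome sample space for two dice and counts the outcomes summing to 10; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
omega = list(product(faces, faces))  # all 36 ordered pairs (first roll, second roll)

# Event A: the combined score is 10.
A = [(a, b) for (a, b) in omega if a + b == 10]

# Classical probability: (number of outcomes in A) / (number of outcomes in omega).
p_A = Fraction(len(A), len(omega))

print(A)    # [(4, 6), (5, 5), (6, 4)]
print(p_A)  # 1/12
```

Using `Fraction` keeps the arithmetic exact, so the result matches the hand computation without rounding.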
== Relative frequency approach

$
P(A) = (hash "of times" A "occurs in a large number of trials") / (hash "of trials")
$

#example[
Flipping a coin many times and using the fraction of heads to estimate the probability of it landing heads.
]

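A simulation makes the relative frequency idea concrete. This is a rough sketch that assumes a fair coin and uses a seeded pseudorandom generator so that runs are repeatable; the function name `relative_frequency` and the trial counts are arbitrary choices.

```python
import random


def relative_frequency(trials: int, seed: int = 0) -> float:
    """Fraction of simulated fair-coin flips that land heads."""
    rng = random.Random(seed)  # seeded so the experiment is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(trials))
    return heads / trials


# The estimate should settle near 0.5 as the number of trials grows.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency(n))
```

With few trials the estimate can be far from 0.5; only the long-run fraction is expected to be close.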
== Subjective approach

Personal definition of probability. Not "real" probability, merely co-opting
its parlance to lend credibility to subjective judgements of confidence.

== Axiomatic approach

Our focus in PSTAT 120A. It seems rather silly to call this approach axiomatic
given we are essentially just defining a function with a few given properties
and deriving theorems from it while working atop our pre-existing (shaky,
non-rigorous) "axioms" of set theory, but this is the terminology that the
course uses.

#definition[
A *probability function* on a sample space $Omega$ is a function $P : X -> RR$, where $X$ is the collection of events of $Omega$, satisfying the following axioms (properties).

+ $P(A) >= 0, forall A in X$
+ $P(Omega) = 1$
+ If $A_i sect A_j = emptyset, forall i != j$, then
$ P(union.big_(i=1)^infinity A_i) = sum_(i=1)^infinity P(A_i) $
]

Now let us show various results with $P$.

#proposition[
$ P(emptyset) = 0 $
]

#proof[
Take $A_i = emptyset$ for every $i$. These sets are pairwise disjoint and their union is $emptyset$, so by axiom 3,

$
P(emptyset) = P(union.big_(i=1)^infinity A_i) = sum^infinity_(i=1) P(A_i) = sum^infinity_(i=1) P(emptyset)
$

Suppose $P(emptyset) != 0$. Then $P(emptyset) > 0$ by axiom 1, so the sum on the right diverges to $infinity$ and cannot equal the real number $P(emptyset)$ on the left. So $P(emptyset) = 0$.
]

#proposition[
If $A_1, A_2, ..., A_n$ are pairwise disjoint, then
$ P(union.big^n_(i=1) A_i) = sum^n_(i= 1) P(A_i) $
]

This is mostly a formal manipulation to derive the obviously true proposition from our axioms.

#proof[
Extend the finite sequence $(A_1, A_2, ..., A_n)$ to an infinite sequence $(A_1, A_2, ..., A_n, emptyset, emptyset, ...)$ by setting $A_i = emptyset$ for $i > n$. The sets are still pairwise disjoint, so by axiom 3 and $P(emptyset) = 0$,
$
P(union.big_(i=1)^infinity A_i) = sum^n_(i=1) P(A_i) + sum^infinity_(i=n+1) P(emptyset) = sum^n_(i=1) P(A_i)
$
And because all of the sets after $A_n$ are $emptyset$, they add no elements to the union of all $A_i$, so
$
P(union.big_(i=1)^n A_i) = P(union.big_(i=1)^infinity A_i) = sum_(i=1)^n P(A_i)
$
]

#proposition[Complement][
$ P(A') = 1 - P(A) $
]

#proof[
$
A' union A &= Omega \
A' sect A &= emptyset \
P(A') + P(A) &= P(A' union A) &"(by axiom 3)" \
&= P(Omega) = 1 &"(by axiom 2)" \
therefore P(A') &= 1 - P(A)
$
]

#proposition[
$ A subset.eq B => P(A) <= P(B) $
]

#proof[
Since $A subset.eq B$,
$ B = A union (A' sect B) $

but $A$ and ($A' sect B$) are disjoint, so

$
P(B) &= P(A union (A' sect B)) \
&= P(A) + P(A' sect B) \
&>= P(A)
$

where the last step uses $P(A' sect B) >= 0$ (axiom 1).
]

#proposition[
$ P(A union B) = P(A) + P(B) - P(A sect B) $
]

#proof[
Split $A$ and $B$ into disjoint pieces:
$
A = (A sect B) union (A sect B') => P(A) = P(A sect B) + P(A sect B') \
B = (B sect A) union (B sect A') => P(B) = P(A sect B) + P(A' sect B) \
P(A) + P(B) = P(A sect B) + P(A sect B) + P(A sect B') + P(A' sect B) \
=> P(A) + P(B) - P(A sect B) = P(A sect B) + P(A sect B') + P(A' sect B)
$
The three sets on the right are pairwise disjoint and their union is $A union B$, so the right-hand side equals $P(A union B)$.
]

#remark[
This strengthens axiom 3 for two sets: it holds for all events $A$ and $B$, regardless of whether they're disjoint.
]

#remark[
These are mostly intuitively true statements (think about the probabilistic
concepts represented by the sets) in classical probability that we derive
rigorously from our axiomatic probability function $P$.
]

#example[
Now let us consider some trivial concepts in classical probability written in
the parlance of combinatorial probability.

Select one card from a deck of 52 cards.
Then the following hold:

$
Omega = {1,2,...,52} \
A = "card is a heart" = {H 2, H 3, H 4, ..., H"Ace"} \
B = "card is an Ace" = {H"Ace", C"Ace", D"Ace", S"Ace"} \
C = "card is black" = {C 2, C 3, ..., C"Ace", S 2, S 3, ..., S"Ace"} \
P(A) = 13 / 52,
P(B) = 4 / 52,
P(C) = 26 / 52 \
P(A sect B) = 1 / 52 \
P(A sect C) = 0 \
P(B sect C) = 2 / 52 \
P(A union B) = P(A) + P(B) - P(A sect B) = 16 / 52 \
P(B') = 1 - P(B) = 48 / 52 \
P(A sect B') = P(A) - P(A sect B) = 13 / 52 - 1 / 52 = 12 / 52 \
P((A sect B') union (A' sect B)) = P(A sect B') + P(A' sect B) = 12 / 52 + 3 / 52 = 15 / 52 \
P(A' sect B') = P((A union B)') = 1 - P(A union B) = 36 / 52
$
]

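The card computations can be reproduced by counting. The sketch below represents each card as a hypothetical `(suit, rank)` pair, which is one convenient encoding and not the numbering $1, ..., 52$ used above, and defines `P` by the classical formula.

```python
from fractions import Fraction
from itertools import product

suits = ["H", "D", "C", "S"]
ranks = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "Ace"]
omega = set(product(suits, ranks))  # 52 cards as (suit, rank) pairs

A = {card for card in omega if card[0] == "H"}         # card is a heart
B = {card for card in omega if card[1] == "Ace"}       # card is an Ace
C = {card for card in omega if card[0] in ("C", "S")}  # card is black


def P(event):
    """Classical probability: equally likely cards."""
    return Fraction(len(event), len(omega))


# Inclusion-exclusion, complement, and De Morgan on this example.
assert P(A | B) == P(A) + P(B) - P(A & B) == Fraction(16, 52)
assert P(omega - B) == 1 - P(B) == Fraction(48, 52)
assert P(A - B) == Fraction(12, 52)
assert P((A - B) | (B - A)) == Fraction(15, 52)
assert P(omega - (A | B)) == Fraction(36, 52)

print(P(A | B))            # 4/13
print(P(omega - (A | B)))  # 9/13
```

`Fraction` reduces to lowest terms, so 16/52 prints as 4/13 and 36/52 as 9/13.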
== Countable sample spaces

#definition[
A sample space $Omega$ is said to be *countable* if it's finite or countably infinite.
]

In such a case, one can list the elements of $Omega$.

$ Omega = {omega_1, omega_2, omega_3, ...} $
with associated probabilities, $p_1, p_2, p_3,...$, where
$
p_i = P(omega_i) >= 0 \
1 = P(Omega) = sum_i P(omega_i)
$

#example[Fair die, again][
All outcomes are equally likely,
$ p_1 = p_2 = ... = p_6 = 1 / 6 $
Let $A$ be the event that the score is odd, $A = {1,3,5}$.
$ P(A) = 3 / 6 = 1 / 2 $
]

#example[Loaded die][
Consider a die where the probability of rolling each odd side is double the probability of rolling each even side.
$
p_2 = p_4 = p_6, p_1 = p_3 = p_5 = 2p_2 \
6p_2 + 3p_2 = 9p_2 = 1 \
p_2 = 1 / 9, p_1 = 2 / 9
$
]

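The same normalization can be done mechanically: assign relative weights and divide by their total. In this sketch the dictionary names `weights` and `p` are illustrative.

```python
from fractions import Fraction

# Relative weights: each odd face is twice as likely as each even face.
weights = {face: (2 if face % 2 == 1 else 1) for face in range(1, 7)}
total = sum(weights.values())  # 3 * 2 + 3 * 1 = 9

# Dividing by the total forces the probabilities to sum to 1.
p = {face: Fraction(w, total) for face, w in weights.items()}

print(p[2], p[1])       # 1/9 2/9
print(sum(p.values()))  # 1

# Probability that the score is odd.
p_odd = p[1] + p[3] + p[5]
print(p_odd)            # 2/3
```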
#example[Coins][
Toss a fair coin until you get the first head.
$
Omega = {H, T H, T T H, ...} "(countably infinite)" \
P(H) = 1 / 2 \
P(T T H) = (1 / 2)^3 \
P(Omega) = sum_(n=1)^infinity (1 / 2)^n = 1 / (1 - 1 / 2) - 1 = 1
$
]

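The infinite sum can be approached numerically through its partial sums. This sketch computes them exactly; the function name `partial_sum` is made up for the illustration.

```python
from fractions import Fraction


def partial_sum(n_terms: int) -> Fraction:
    """P(the first head occurs within the first n_terms tosses)."""
    return sum((Fraction(1, 2) ** n for n in range(1, n_terms + 1)), Fraction(0))


print(partial_sum(1))   # 1/2
print(partial_sum(3))   # 7/8
print(partial_sum(10))  # 1023/1024

# The shortfall from 1 is exactly the probability of n_terms tails in a row.
assert 1 - partial_sum(10) == Fraction(1, 2) ** 10
```

The shortfall halves with each extra toss, which is why the full sum is 1.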
#example[Birthdays][
What is the probability two people share the same birthday? Assume each of the 365 days is equally likely, and let $A$ be the event that both birthdays fall on the same day.

$
Omega = {1, 2, ..., 365} times {1, 2, ..., 365} \
P(A) = 365 / 365^2 = 1 / 365
$
]

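The birthday count can also be verified by enumerating every pair of days. This sketch assumes 365 equally likely days, as in the example; the counter names are illustrative.

```python
from fractions import Fraction
from itertools import product

days = range(1, 366)  # days 1 through 365

omega_size = 0  # number of ordered pairs (birthday 1, birthday 2)
shared = 0      # number of pairs where both birthdays coincide
for d1, d2 in product(days, days):
    omega_size += 1
    if d1 == d2:
        shared += 1

p_shared = Fraction(shared, omega_size)
print(omega_size)  # 133225
print(p_shared)    # 1/365
```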
== Continuous sample spaces

#definition[
A *continuous sample space* contains an interval in $RR$ and is uncountably infinite.
]

#definition[
A probability density function (#smallcaps[pdf]) $f$ gives the probability density at the point
$s$. The probability of an event is the integral of $f$ over that event.
]

Properties of the #smallcaps[pdf]:

- $f(s) >= 0, forall s in Omega$
- $integral_Omega f(s) dif s = 1$

#example[
Waiting time for bus: $Omega = {s : s >= 0}$.
]

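To see the two properties in action, here is a rough numerical sketch for the bus example. The notes do not specify a density, so the exponential form of `f` and the value of `RATE` are assumptions made purely for illustration, and `integrate` is a simple midpoint-rule helper.

```python
import math

RATE = 0.1  # hypothetical: an average wait of 10 minutes


def f(s: float) -> float:
    """Assumed exponential density on s >= 0 (not taken from the notes)."""
    return RATE * math.exp(-RATE * s)


def integrate(g, a: float, b: float, steps: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


# Total probability: the tail beyond 500 is negligible for this density.
total = integrate(f, 0.0, 500.0)

# Probability of waiting between 5 and 15 minutes, with the exact value for comparison.
p_5_to_15 = integrate(f, 5.0, 15.0)
exact = math.exp(-0.5) - math.exp(-1.5)

print(round(total, 6))
print(round(p_5_to_15, 6), round(exact, 6))
```

The density at a single point is not a probability; only integrals of `f` over an interval are.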