alexandria/documents/by-course/pstat-120a/course-notes/main.typ
Youwen Wu 403f95fd4a
auto-update(nvim): 2025-01-08 20:40:52
2025-01-08 20:40:52 -08:00


#import "./dvd.typ": *
#import "@preview/ctheorems:1.1.3": *
#show: dvdtyp.with(
title: "PSTAT120A Course Notes",
author: "Youwen Wu",
date: "Winter 2025",
subtitle: "Taught by Brian Wainwright",
)
#outline()
= Lecture 1
== Preliminaries
#definition[
Statistics is the science dealing with the collection, summarization,
analysis, and interpretation of data.
]
== Set theory for dummies
A terse introduction to elementary naive set theory and the basic operations
on sets.
#remark[
Keep in mind that without $cal(Z F C)$ or another axiomatization of set theory
that resolves fundamental issues, our naive set theory is subject to paradoxes
like Russell's. Whoops, the universe doesn't exist.
]
#definition[
A *Set* is a collection of elements.
]
#example[Examples of sets][
+ Trivial set: ${1}$
+ Empty set: $emptyset$
+ $A = {a,b,c}$
]
We can construct sets using set-builder notation (also sometimes called set
comprehension).
$ {"expression with" x | "conditions on" x} $
#example("Set builder notation")[
+ The set of all even integers: ${2n | n in ZZ}$
+ The set of all perfect squares in $RR$: ${x^2 | x in NN}$
]
We also have notation for working with sets:
With arbitrary sets $A$, $B$:
+ $a in A$ ($a$ is a member of the set $A$)
+ $a in.not A$ ($a$ is not a member of the set $A$)
+ $A subset.eq B$ (Set theory: $A$ is a subset of $B$) (Stats: $A$ is an event in the sample space $B$)
+ $A subset B$ (Proper subset: $A != B$)
+ $A^c$ or $A'$ (read "complement of $A$," and introduced later)
+ $A union B$ (Union of $A$ and $B$. Gives the set of all elements that are in $A$, in $B$, or in both)
+ $A sect B$ (Intersection of $A$ and $B$. Gives a set consisting of the elements in *both* $A$ and $B$)
+ $A \\ B$ (Set difference. The set of all elements of $A$ that are not also in $B$)
+ $A times B$ (Cartesian product. The set of all ordered pairs $(a,b)$ with $a in A$ and $b in B$)
We can also write a few of these operations precisely as set comprehensions.
+ $A subset.eq B <==> (forall a in A, a in B)$ (a condition on elements rather than a comprehension)
+ $A union B = {x | x in A or x in B}$ (here $or$ is the logical OR)
+ $A sect B = {x | x in A and x in B}$ (here $and$ is the logical AND)
+ $A \\ B = {a | a in A and a in.not B}$
+ $A times B = {(a,b) | a in A and b in B}$
Take a moment and convince yourself that these definitions are equivalent to
the previous ones.
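One hedged way to check these definitions is to try them on small finite sets. The sketch below uses Python's built-in `set` type; the particular sets `A` and `B` are made up for illustration and are not from lecture.

```python
# Small illustrative sets (hypothetical values, chosen only for the demo).
A = {1, 2, 3}
B = {3, 4}

union = A | B                              # {x | x in A or x in B}
intersection = A & B                       # {x | x in A and x in B}
difference = A - B                         # {a | a in A and a not in B}
product = {(a, b) for a in A for b in B}   # all ordered pairs (a, b)

print(sorted(union))          # [1, 2, 3, 4]
print(sorted(intersection))   # [3]
print(sorted(difference))     # [1, 2]
print(len(product))           # 6 = N(A) * N(B)
print({3} <= A, A <= A, A < A)  # subset, subset, proper subset
```

Note that the comprehension for `product` mirrors the set-builder form directly: one pair for each choice of `a` and each choice of `b`.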
#definition[
The universal set $Omega$ is the set of all objects in a given set
theoretical universe.
]
With the above definition, we can now introduce the set complement.
#definition[
The set complement $A'$ is given by
$
A' = Omega \\ A
$
where $Omega$ is the _universal set_.
]
#example[The real plane][
The real plane $RR^2$ can be defined as a Cartesian product of $RR$ with
itself.
$ RR^2 = RR times RR $
]
Check your intuition that this makes sense. Why do you think $RR^n$ was chosen
as the notation for $n$-dimensional real space?
#definition[Disjoint sets][
If $A sect B = emptyset$, then we say that $A$ and $B$ are *disjoint*.
]
#fact[
For any sets $A$ and $B$, we have De Morgan's laws:
+ $(A union B)' = A' sect B'$
+ $(A sect B)' = A' union B'$
]
#fact[Generalized De Morgan's][
+ $(union.big_i A_i)' = sect.big_i A_i '$
+ $(sect.big_i A_i)' = union.big_i A_i '$
]
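As a sanity check (not a proof), the two-set laws can be tested on a small finite universe. The universal set `omega` and the sets `A` and `B` below are arbitrary choices made for this sketch.

```python
omega = set(range(10))   # a small universal set, assumed for illustration
A = {0, 1, 2, 3}
B = {2, 3, 4, 5}

def complement(s):
    """Complement of s relative to omega."""
    return omega - s

# (A union B)' = A' intersect B'
law_1 = complement(A | B) == (complement(A) & complement(B))
# (A intersect B)' = A' union B'
law_2 = complement(A & B) == (complement(A) | complement(B))

print(law_1, law_2)
```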
== Sizes of infinity
#definition[
Let $N(A)$ be the number of elements in $A$. $N(A)$ is called the _cardinality_ of $A$.
]
We say a set is finite if it has finite cardinality, or infinite if it has an
infinite cardinality.
Infinite sets can be either _countably infinite_ or _uncountably infinite_.
When a set is countably infinite, its cardinality is $aleph_0$ (here $aleph$ is
the Hebrew letter aleph and read "aleph null").
When a set is uncountably infinite, its cardinality is greater than $aleph_0$.
#example("Countable sets")[
+ The natural numbers $NN$.
+ The rationals $QQ$.
+ The integers $ZZ$.
+ The set of all logical tautologies.
]
#example("Uncountable sets")[
+ The real numbers $RR$.
+ The real numbers in the interval $[0,1]$.
+ The _power set_ of $ZZ$, which is the set of all subsets of $ZZ$.
]
#remark[
All the uncountable sets above have cardinality $2^(aleph_0)$, also written
$frak(c)$ or $beth_1$ ("beth 1"). This is the _cardinality of the continuum_.
Whether it equals $aleph_1$ ("aleph 1") is the continuum hypothesis, which
$cal(Z F C)$ can neither prove nor refute.
However, in general uncountably infinite sets do not have the same
cardinality.
]
#fact[
If a set is countably infinite, then it has a bijection with $ZZ$. This means
every set with cardinality $aleph_0$ has a bijection to $ZZ$. More generally,
any sets with the same cardinality have a bijection between them.
]
This gives us the following equivalent statement:
#fact[
Two sets have the same cardinality if and only if there exists a bijective
function between them. In symbols,
$ N(A) = N(B) <==> exists F : A <-> B $
]
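To make the bijection idea concrete, here is a sketch of one standard bijection between $NN$ and $ZZ$, assuming $NN$ starts at 0. The function names `nat_to_int` and `int_to_nat` are invented for this illustration.

```python
def nat_to_int(n: int) -> int:
    """Send 0, 1, 2, 3, 4, ... to 0, -1, 1, -2, 2, ..."""
    return n // 2 if n % 2 == 0 else -(n + 1) // 2

def int_to_nat(z: int) -> int:
    """Inverse map: non-negative integers go to evens, negatives to odds."""
    return 2 * z if z >= 0 else -2 * z - 1

print([nat_to_int(n) for n in range(7)])   # [0, -1, 1, -2, 2, -3, 3]
```

Since each function undoes the other, every natural number is paired with exactly one integer, which is what it means for the two sets to have the same cardinality.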
= Lecture #datetime(day: 8, month: 1, year: 2025).display()
== Probability
#definition[
A *random experiment* is one in which the set of all possible outcomes is known in advance, but one can't predict which outcome will occur on a given trial of the experiment.
]
#example("Finite sample spaces")[
Toss a coin:
$ Omega = {H,T} $
Roll a pair of dice:
$ Omega = {1,2,3,4,5,6} times {1,2,3,4,5,6} $
]
#example("Countably infinite sample spaces")[
Shoot a basket until you make one:
$ Omega = {M, F M, F F M, F F F M, dots} $
]
#example("Uncountably infinite sample space")[
Waiting time for a bus:
$ Omega = {t : t >= 0} $
]
#fact[
Elements of $Omega$ are called sample points.
]
#definition[
Any properly defined subset of $Omega$ is called an *event*.
]
#example[Dice][
Rolling a fair die twice, let $A$ be the event that the combined score of both dice is 10.
$ A = {(4,6), (5,5), (6,4)} $
]
Probabilistic concepts in the parlance of set theory:
- Universal set ($Omega$) $<->$ sample space
- Element $<->$ outcome / sample point ($omega$)
- Disjoint sets $<->$ mutually exclusive events
== Classical approach
Classical approach:
$ P(A) = (hash A) / (hash Omega) $
Requires equally likely outcomes and finite sample spaces.
#remark[
With an infinite sample space this formula breaks down: every finite event would get probability 0 (or the ratio is undefined), which is often wrong.
]
#example("Dice again")[
Rolling a fair die twice, let $A$ be the event that the combined score of both dice is 10.
$
A &= {(4,6), (5,5), (6,4)} \
P(A) &= 3 / 36 = 1 / 12
$
]
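The count in the example above can be reproduced by brute-force enumeration. This is a minimal sketch of the classical approach, assuming two fair six-sided dice; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered pairs of faces from two rolls.
omega = list(product(range(1, 7), repeat=2))

# Event A: the combined score is 10.
A = [outcome for outcome in omega if sum(outcome) == 10]

p_A = Fraction(len(A), len(omega))   # (number in A) / (number in Omega)
print(A)     # [(4, 6), (5, 5), (6, 4)]
print(p_A)   # 1/12
```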
== Relative frequency approach
$
P(A) = (hash "of times" A "occurs in large number of trials") / (hash "of trials")
$
#example[
Flipping a coin to determine the probability of it landing heads.
]
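The coin example can be imitated with a simulation. The sketch below assumes a fair coin modeled by Python's pseudo-random generator, with a fixed seed so that runs are repeatable; `relative_frequency` is a name made up for this illustration.

```python
import random

def relative_frequency(n_trials: int, seed: int = 0) -> float:
    """Fraction of simulated fair-coin flips that land heads."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(n_trials))
    return heads / n_trials

# The estimate should drift toward 1/2 as the number of trials grows.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency(n))
```

The exact values printed depend on the generator and the seed, so they are not listed here.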
== Subjective approach
Personal definition of probability. Not "real" probability, merely co-opting
its parlance to lend credibility to subjective judgements of confidence.
== Axiomatic approach
Our focus in PSTAT 120A. It seems rather silly to call this approach axiomatic
given we are essentially just defining a function with a few given properties
and deriving theorems from it while working atop our pre-existing (shaky,
non-rigorous) "axioms" of set theory, but this is the terminology that the
course uses.
#definition[
Let $P : X -> RR$ be a function on the collection $X$ of events (subsets of $Omega$) satisfying the following axioms (properties).
+ $P(A) >= 0, forall A$
+ $P(Omega) = 1$
+ If $A_i sect A_j = emptyset, forall i != j$, then
$ P(union.big_(i=1)^infinity A_i) = sum_(i=1)^infinity P(A_i) $
]
Now let us show various results with $P$.
#proposition[
$ P(emptyset) = 0 $
]
#proof[
Let $A_i = emptyset$ for every $i$. These sets are pairwise disjoint and their union is $emptyset$, so by axiom 3,
$
P(emptyset) = P(union.big_(i=1)^infinity A_i) = sum^infinity_(i=1) P(A_i) = sum^infinity_(i=1) P(emptyset)
$
Suppose $P(emptyset) != 0$. Then $P(emptyset) > 0$ by axiom 1, so the sum on the right diverges to $infinity$. But the left side is the real number $P(emptyset)$, a contradiction. So $P(emptyset) = 0$.
]
#proposition[
If $A_1, A_2, ..., A_n$ are disjoint, then
$ P(union.big^n_(i=1) A_i) = sum^n_(i= 1) P(A_i) $
]
This is mostly a formal manipulation to derive the obviously true proposition from our axioms.
#proof[
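Write any finite sequence of disjoint events $(A_1, A_2, ..., A_n)$ as an infinite sequence $(A_1, A_2, ..., A_n, emptyset, emptyset, ...)$ by setting $A_i = emptyset$ for $i > n$. The extended sequence is still pairwise disjoint, so by axiom 3 and $P(emptyset) = 0$,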
$
P(union.big_(i=1)^infinity A_i) = sum^n_(i=1) P(A_i) + sum^infinity_(i=n+1) P(emptyset) = sum^n_(i=1) P(A_i)
$
Because every set after $A_n$ is $emptyset$, the infinite union contains no elements beyond those of the finite union, so
$
P(union.big_(i=1)^infinity A_i) = P(union.big_(i=1)^n A_i) = sum_(i=1)^n P(A_i)
$
]
#proposition[Complement][
$ P(A') = 1 - P(A) $
]
#proof[
$
A' union A &= Omega \
A' sect A &= emptyset \
P(A' union A) &= P(A') + P(A) &"(by axiom 3)"\
P(A' union A) &= P(Omega) = 1 &"(by axiom 2)" \
therefore P(A') &= 1 - P(A)
$
]
#proposition[
$ A subset.eq B => P(A) <= P(B) $
]
#proof[
$ B = A union (A' sect B) $
but $A$ and ($A' sect B$) are disjoint, so
$
P(B) &= P(A union (A' sect B)) \
&= P(A) + P(A' sect B) \
&>= P(A) quad "since" P(A' sect B) >= 0 "by axiom 1"
$
]
#proposition[
$ P(A union B) = P(A) + P(B) - P(A sect B) $
]
#proof[
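$A$ splits into the disjoint pieces $A sect B$ and $A sect B'$, and likewise $B$ splits into $A sect B$ and $A' sect B$, so
$
P(A) &= P(A sect B) + P(A sect B') \
P(B) &= P(A sect B) + P(A' sect B) \
P(A) + P(B) - P(A sect B) &= P(A sect B) + P(A sect B') + P(A' sect B)
$
The three sets on the right are pairwise disjoint and their union is $A union B$, so the right side equals $P(A union B)$.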
]
#remark[
This is a stronger form of the additivity in axiom 3: it holds for all events $A$ and $B$ regardless of whether they're disjoint, and reduces to $P(A) + P(B)$ when $A sect B = emptyset$.
]
#example[
Select one card from a deck of 52 cards.
$
Omega = {1,2,...,52} \
A = "card is a heart" = {H 2, H 3, H 4, ..., H"Ace"} \
B = "card is an Ace" = {H"Ace", C"Ace", D"Ace", S"Ace"} \
C = "card is black" = {C 2, C 3, ..., C"Ace", S 2, S 3, ..., S"Ace"} \
P(A) = 13 / 52,
P(B) = 4 / 52,
P(C) = 26 / 52 \
P(A sect B) = 1 / 52 \
P(A sect C) = 0 \
P(B sect C) = 2 / 52 \
P(A union B) = P(A) + P(B) - P(A sect B) = 16 / 52 \
P(B') = 1 - P(B) = 48 / 52 \
P(A sect B') = P(A) - P(A sect B) = 13 / 52 - 1 / 52 = 12 / 52 \
P((A sect B') union (A' sect B)) = P(A sect B') + P(A' sect B) = 15 / 52 \
P(A' sect B') = P((A union B)') = 1 - P(A union B) = 36 / 52
$
]
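The figures in the card example can be double-checked by enumerating a deck. This is a sketch under the assumption of a standard 52-card deck with every card equally likely; the suit and rank labels and the helper `P` are naming choices made for this illustration. `Fraction` prints results in lowest terms, so 16/52 appears as 4/13.

```python
from fractions import Fraction
from itertools import product

suits = ["H", "D", "C", "S"]
ranks = ["2", "3", "4", "5", "6", "7", "8", "9", "10",
         "Jack", "Queen", "King", "Ace"]
omega = set(product(suits, ranks))   # 52 (suit, rank) pairs

def P(event):
    """Classical probability: equally likely cards."""
    return Fraction(len(event), len(omega))

A = {card for card in omega if card[0] == "H"}          # hearts
B = {card for card in omega if card[1] == "Ace"}        # aces
C = {card for card in omega if card[0] in ("C", "S")}   # black cards

print(P(A | B), P(A) + P(B) - P(A & B))   # both equal 16/52
print(P(A & C))                           # 0: hearts are never black
print(P(omega - (A | B)))                 # 36/52, i.e. P(A' sect B')
```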
== Countable sample spaces
#definition[
A sample space $Omega$ is said to be *countable* if it is finite or countably infinite.
]
In such a case, one can list the elements of $Omega$.
$ Omega = {omega_1, omega_2, omega_3, ...} $
with associated probabilities, $p_1, p_2, p_3,...$, where
$
p_i = P(omega_i) >= 0 \
1 = P(Omega) = sum P(omega_i)
$
#example[Fair die, again][
All outcomes are equally likely,
$ p_1 = p_2 = ... = p_6 = 1 / 6 $
Let $A = {1,3,5}$ be the event that the score is odd.
$ P(A) = 3 / 6 $
]
#example[Loaded die][
Consider a die where each odd side is twice as likely to be rolled as each even side.
$
p_2 = p_4 = p_6, p_1 = p_3 = p_5 = 2p_2 \
6p_2 + 3p_2 = 9p_2 = 1 \
p_2 = 1 / 9, p_1 = 2 / 9
$
]
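The same normalization can be carried out mechanically: give each face a relative weight and divide by the total. A minimal sketch, with the dictionary names `weights` and `p` chosen for illustration:

```python
from fractions import Fraction

# Relative weights: odd faces count double, even faces count once.
weights = {face: (2 if face % 2 == 1 else 1) for face in range(1, 7)}
total = sum(weights.values())   # 3 * 2 + 3 * 1 = 9

# Dividing by the total forces the probabilities to sum to 1.
p = {face: Fraction(w, total) for face, w in weights.items()}

print(p[1], p[2])                     # 2/9 1/9
print(sum(p.values()))                # 1
print(sum(p[f] for f in (1, 3, 5)))   # P(odd) = 2/3
```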
#example[Coins][
Toss a fair coin until you get the first head.
$
Omega = {H, T H, T T H, ...} "(countably infinite)" \
P(H) = 1 / 2 \
P(T T H) = (1 / 2)^3 \
P(Omega) = sum_(n=1)^infinity (1 / 2)^n = 1 / (1 - 1 / 2) - 1 = 1
$
]
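The infinite sum can be explored numerically through its partial sums. In the sketch below, `partial_sum` is a hypothetical helper that adds the probabilities of the first few outcomes exactly; the leftover mass after $N$ terms is $(1 / 2)^N$, which shrinks to 0.

```python
from fractions import Fraction

def partial_sum(n_terms: int) -> Fraction:
    """P(first head occurs within the first n_terms tosses)."""
    return sum(Fraction(1, 2) ** n for n in range(1, n_terms + 1))

for n_terms in (1, 5, 10, 20):
    remaining = 1 - partial_sum(n_terms)   # equals (1/2) ** n_terms
    print(n_terms, partial_sum(n_terms), float(remaining))
```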
#example[Birthdays][
What is the probability that two people share the same birthday, assuming 365 equally likely birthdays?
$
Omega = {1, 2, ..., 365} times {1, 2, ..., 365} \
P(A) = 365 / 365^2 = 1 / 365
$
]
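The birthday count is small enough to enumerate directly. This sketch assumes 365 equally likely days and ignores leap years; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

days = range(1, 366)   # day numbers 1 through 365

omega_size = 0   # number of ordered pairs of birthdays
shared = 0       # pairs where both people have the same birthday
for first, second in product(days, repeat=2):
    omega_size += 1
    if first == second:
        shared += 1

p_shared = Fraction(shared, omega_size)
print(omega_size, shared, p_shared)   # 133225 365 1/365
```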
== Continuous sample spaces
#definition[
A *continuous sample space* contains an interval in $RR$ and is uncountably infinite.
]
#definition[
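A probability density function (#smallcaps[pdf]) $f$ gives the probability _density_ at the point
$s$, not a probability. The probability of an event is the integral of $f$ over that event.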
]
Properties of the #smallcaps[pdf]:
- $f(s) >= 0, forall s in S$
- $integral_S f(s) dif s = 1$
#example[
Waiting time for bus: $Omega = {s : s >= 0}$.
]