A *random experiment* is one in which the set of all possible outcomes is known in advance, but one can't predict which outcome will occur on a given trial of the experiment.
]
#example("Finite sample spaces")[
Toss a coin:
$ Omega = {H,T} $
Roll a pair of dice:
$ Omega = {1,2,3,4,5,6} times {1,2,3,4,5,6} $
]
#example("Countably infinite sample spaces")[
Shoot a basket until you make one:
$ Omega = {M, F M, F F M, F F F M, dots} $
]
#example("Uncountably infinite sample space")[
Waiting time for a bus:
$ Omega = {t : t >= 0} $
]
#fact[
Elements of $Omega$ are called sample points.
]
#definition[
Any properly defined subset of $Omega$ is called an *event*.
]
#remark[
Why is $P(emptyset) = 0$? Suppose $P(emptyset) != 0$. Then $P(emptyset) > 0$ by axiom 1, but countable additivity gives $P(Omega) = P(Omega union emptyset union emptyset union dots) = P(Omega) + sum P(emptyset) = infinity$, which contradicts axiom 2 ($P(Omega) = 1$). So $P(emptyset) = 0$.
]
#example("Dice")[
Rolling a fair die twice, let $A$ be the event that the combined score of both dice is 10, i.e. $A = {(4,6), (5,5), (6,4)}$.
]
The cardinality of $A$ is given by $hash A$. Let us develop methods for finding
$hash A$ from a description of the set $A$ (in other words, methods for
counting).
== General multiplication principle
#fact[
Let $A$ and $B$ be finite sets and $k in ZZ^+$, and let $f : A -> B$ be a
function such that each element of $B$ is the image of exactly $k$ elements
of $A$ (such a function is called _$k$-to-one_). Then $hash A = k dot hash
B$.
]<ktoone>
#example[
Four fully loaded 10-seater vans transported people to the picnic. How many
people were transported?
Let $A$ be the set of people and $B$ the set of vans, and let $f : A -> B$ map each person to the van they rode in. Since each van was fully loaded, $f$ is a 10-to-one function with $hash B = 4$, so by @ktoone the number of people is $hash A = 10 dot 4 = 40$.
]
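Not part of the original notes, but a quick way to sanity-check the $k$-to-one principle is to build an explicit 10-to-one map and count preimages. A minimal Python sketch (the names `people` and `van_of` are made up for illustration):

```python
# Sanity check of the k-to-one principle with a made-up 10-to-one map.
people = range(40)                            # hypothetical set A of picnic-goers
van_of = {p: p // 10 for p in people}         # f : A -> B, person -> van number
vans = set(van_of.values())                   # set B, here {0, 1, 2, 3}

k = 10
assert all(sum(1 for p in people if van_of[p] == v) == k for v in vans)  # f is 10-to-one
assert len(people) == k * len(vans)           # #A = k * #B
print(len(people), k * len(vans))             # 40 40
```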
#definition[
An $n$-tuple is an ordered sequence of $n$ elements.
]
Many of our methods in probability rely on multiplying together the number of
choices at each stage to obtain the total number of combined outcomes. We make this explicit below in @tuplemultiplication.
#fact[
Suppose a set of $n$-tuples $(a_1, ..., a_n)$ obeys these rules:
+ There are $r_1$ choices for the first entry $a_1$.
+ Once the first $k$ entries $a_1, ..., a_k$ have been chosen, the number of alternatives for the next entry $a_(k+1)$ is $r_(k+1)$, regardless of the previous choices.
Then the total number of $n$-tuples is the product $r_1 dot r_2 dot r_3 dot dots dot r_n$.
]<tuplemultiplication>
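Before the proof, here is a brute-force check of @tuplemultiplication (a sketch I added, not from the notes) for the simplest case where every entry is chosen from a fixed menu, so the choice counts do not depend on earlier entries:

```python
# Brute-force check of the multiplication principle with fixed choice counts.
from itertools import product
from math import prod

menus = [range(3), range(4), range(2)]     # r_1 = 3, r_2 = 4, r_3 = 2 (arbitrary)
tuples = list(product(*menus))             # every 3-tuple (a_1, a_2, a_3)

assert len(tuples) == prod(len(m) for m in menus)   # 3 * 4 * 2 = 24
print(len(tuples))   # 24
```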
#proof[
It is trivially true for $n = 1$ since you have $r_1$ choices of $a_1$ for a
1-tuple $(a_1)$.
Now suppose the statement is true for $n$-tuples. Let $A$ be the set of all
possible $n$-tuples and $B$ be the set of all possible $(n+1)$-tuples, so by
the induction hypothesis $hash A = r_1 dot r_2 dot dots dot r_n$.
Let $f : B -> A$ be the function which takes each $(n+1)$-tuple and truncates its last entry $a_(n+1)$, leaving just the $n$-tuple $(a_1, a_2, ..., a_n)$. Each $n$-tuple in $A$ is the image of exactly $r_(n+1)$ tuples in $B$ (one for each choice of $a_(n+1)$), so $f$ is $r_(n+1)$-to-one. By @ktoone, $hash B = r_(n+1) dot hash A = r_1 dot r_2 dot dots dot r_n dot r_(n+1)$, completing the induction.
]
#example[
How many distinct subsets does a set of size $n$ have?
The answer is $2^n$. Each subset can be encoded as an $n$-tuple with entries 0
or 1, where the $i$th entry is 1 if the $i$th element of the set is in the
subset and 0 if it is not.
Thus the number of subsets is the same as the cardinality of
$ {0,1} times ... times {0,1} = {0,1}^n $
which is $2^n$.
This is why given a set $X$ with cardinality $aleph$, we write the
cardinality of the power set of $X$ as $2^aleph$.
]
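A short sketch (mine, not from the notes) of the 0/1-encoding argument: each subset of an $n$-element set corresponds to exactly one binary string of length $n$, so enumerating the strings enumerates the subsets.

```python
# Encode each subset of {0, ..., n-1} as a 0/1 tuple and count them.
from itertools import product

n = 5
subsets = [
    frozenset(i for i, bit in enumerate(bits) if bit == 1)   # decode the 0/1 tuple
    for bits in product([0, 1], repeat=n)
]
# Each subset appears exactly once, so there are 2^n of them.
assert len(set(subsets)) == len(subsets) == 2 ** n
print(len(subsets))   # 32
```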
== Permutations
Now we can use the multiplication principle to count permutations.
#fact[
Consider all $k$-tuples $(a_1, ..., a_k)$ that can be constructed, without repetition, from a set $A$ of size $n >= k$. The total number of these $k$-tuples is
$ (n)_k = n dot (n - 1) ... (n - k + 1) = n! / (n-k)! $
In particular, with $k=n$, each $n$-tuple is an ordering or _permutation_ of $A$. So the total number of permutations of a set of $n$ elements is $n!$.
]<permutation>
#example[
+ In how many ways can we seat 8 guests around the table?
+ In how many ways can we do this if we do not differentiate between seating arrangements that are rotations of each other?
For (1), we are simply counting the permutations of the 8 guests among the 8
seats, so $8!$ is the answer.
For (2), we number each person and each seat from 1-8, then always place person 1 in seat 1, and count the permutations of the other 7 people in the other 7 seats. Then the answer is $7!$.
Alternatively, notice that each arrangement has 8 equivalent arrangements under rotation. So the answer is $8!/8 = 7!$.
]
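Here is an enumeration check (an illustrative sketch, not part of the notes) of both answers, using a smaller table so the loop stays fast; the canonical-rotation trick mirrors the "fix person 1 in seat 1" argument.

```python
# Count seatings of n guests, with and without identifying rotations.
from itertools import permutations
from math import factorial

n = 6   # smaller than 8 to keep the enumeration quick
seatings = list(permutations(range(n)))
assert len(seatings) == factorial(n)            # n! linear arrangements

def canonical(seating):
    """Rotate the circular seating so guest 0 sits first."""
    i = seating.index(0)
    return seating[i:] + seating[:i]

distinct_up_to_rotation = {canonical(s) for s in seatings}
assert len(distinct_up_to_rotation) == factorial(n - 1)   # (n-1)! circular arrangements
print(factorial(n), factorial(n - 1))                      # 720 120
```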
== Counting from sets
We turn our attention to sets, which unlike tuples are unordered collections.
#fact[
Let $n,k in NN$ with $0 <= k <= n$. The number of distinct subsets of size $k$ of a set of size $n$ is given by the *binomial coefficient*
$ vec(n,k) = n! / (k! (n-k)!) $
]
#proof[
Let $A$ be a set of size $n$. By @permutation, $n!/(n-k)!$ unique ordered
$k$-tuples can be constructed from elements of $A$. Each subset of $A$ of
size $k$ has exactly $k!$ different orderings, and hence appears exactly $k!$
times among the ordered $k$-tuples. Thus the number of subsets of size $k$ is
$n! / (k! (n-k)!)$.
]
#example[
In a class there are 12 boys and 14 girls. How many different teams of 7 pupils
with 3 boys and 4 girls can be created?
First let us compute how many subsets of size 3 we can choose from the 12 boys and how many subsets of size 4 we can choose from the 14 girls.
$
"boys" &= vec(12,3) \
"girls" &= vec(14,4)
$
Then let us consider the entire team as a 2-tuple of (boys, girls). Then
there are $vec(12,3)$ alternatives for the choice of boys, and $vec(14,4)$ alternatives for
the choice of girls, so by the multiplication principle, we have the total being
$ vec(12,3) vec(14,4) $
]
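A brute-force cross-check (a sketch I added, not in the notes): enumerate all 7-pupil teams and keep those with exactly 3 of the 12 boys.

```python
# Verify the team count by brute force over all 7-pupil subsets of the class.
from itertools import combinations
from math import comb

# Label pupils 0-11 as boys and 12-25 as girls (hypothetical labels).
teams = sum(
    1
    for team in combinations(range(26), 7)
    if sum(1 for p in team if p < 12) == 3   # exactly 3 boys, hence 4 girls
)
assert teams == comb(12, 3) * comb(14, 4)    # 220 * 1001 = 220220
print(teams)
```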
#example[
Let $A = {1, 2, 3, 4, 5, 6}$. Color the numbers 1, 2 red, the numbers 3, 4 green, and the numbers 5, 6
yellow. How many different two-element subsets of $A$ are there whose two elements have
different colors?
First choose the 2 colors that appear: $vec(3,2) = 3$ ways. Then from each chosen color, choose one of its 2 numbers. Altogether, by the multiplication principle, there are $vec(3,2) dot 2 dot 2 = 12$ such subsets.
]
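The same count can be checked by enumeration (a sketch, not from the notes):

```python
# Count two-element subsets of {1, ..., 6} whose elements have different colors.
from itertools import combinations

color = {1: "red", 2: "red", 3: "green", 4: "green", 5: "yellow", 6: "yellow"}
two_colored = [
    pair for pair in combinations(range(1, 7), 2)
    if color[pair[0]] != color[pair[1]]
]
assert len(two_colored) == 12   # matches the count vec(3,2) * 2 * 2
print(len(two_colored))
```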
#example[
Suppose we have 4 red and 7 green balls in an urn. We choose two balls with replacement. Let
- $A$ = the first ball is red
- $B$ = the second ball is green
Are $A$ and $B$ independent?
$
hash Omega = 11 times 11 = 121 \
hash A = 4 dot 11 = 44 \
hash B = 11 dot 7 = 77 \
hash (A sect B) = 4 dot 7 = 28
$
So $P(A) P(B) = 44 / 121 dot 77 / 121 = 28 / 121 = P(A sect B)$, hence $A$ and $B$ are independent.
]
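As a sketch (not in the notes), the same check with exact fractions:

```python
# Check independence of A and B by comparing P(A ∩ B) with P(A) P(B).
from fractions import Fraction

omega = 11 * 11                      # ordered draws with replacement
p_a = Fraction(4 * 11, omega)        # first ball red
p_b = Fraction(11 * 7, omega)        # second ball green
p_ab = Fraction(4 * 7, omega)        # red then green

assert p_ab == p_a * p_b             # 28/121 == (4/11) * (7/11)
print(p_a, p_b, p_ab)                # 4/11 7/11 28/121
```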
#definition[
Events $A_1, ..., A_n$ are independent (mutually independent) if for every collection $A_(i_1), ..., A_(i_k)$, where $2 <= k <= n$ and $1 <= i_1 < i_2 < dots.c < i_k <= n$, we have
$ P(A_(i_1) sect A_(i_2) sect dots.c sect A_(i_k)) = P(A_(i_1)) P(A_(i_2)) dots.c P(A_(i_k)) $
]
== The hypergeometric distribution
There are different characterizations of the parameters, but in general
$ X ~ "Hypergeom"(hash "total", hash "successes", "sample size") $
For example, with $a$ successes and $b$ failures,
$ X ~ "Hypergeom"(N, a, n) "where" N = a+b $
In the textbook, it's
$ X ~ "Hypergeom"(N, N_A, n) $
#remark[
If the sample size $n$ is very small relative to the population size $a + b$, then sampling with and without replacement give similar (approximately
the same) answers.
]
For instance, if we're sampling for blood types from UCSB and we take a
student out without replacement, we don't really change the population
substantially. So both approaches give similar results.
Suppose we have two types of items, type $A$ and type $B$. Let $N_A$ be the
number of type $A$ items and $N_B$ the number of type $B$ items, so $N = N_A + N_B$ is the total number of
objects.
We sample $n$ items *without replacement* ($n <= N$) with order not mattering.
Denote by $X$ the number of type $A$ objects in our sample.
#definition[
Let $0 <= N_A <= N$ and $1 <= n <= N$ be integers. A random variable $X$ has the *hypergeometric distribution* with parameters $(N, N_A, n)$ if $X$ takes values in the set ${0,1,...,n}$ and has p.m.f.
$ P(X = k) = (vec(N_A, k) vec(N - N_A, n - k)) / vec(N, n) $
Abbreviate this by $X ~ "Hypergeom"(N, N_A, n)$.
]
#example[
A batch of $N = 100$ items contains $N_A = 10$ defectives and $N_B = 90$ non-defectives. We select $n=5$ without replacement. What is the probability that 2 of the 5 selected are defective?
$ P(X = 2) = (vec(10, 2) vec(90, 3)) / vec(100, 5) $
Note that if we ask for more successes than exist (for example, $P(X=3)$ when there are only 2 success-type items, such as drawing 3 males from a group of 2 males), the probability must be 0, since $vec(N_A, k) = 0$ for $k > N_A$.
]
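A small sketch (mine, not from the textbook) of the hypergeometric p.m.f. built with `math.comb`, applied to the defectives example; the helper name `hypergeom_pmf` is made up:

```python
# Hypergeometric p.m.f. built from binomial coefficients.
from math import comb

def hypergeom_pmf(k, N, N_A, n):
    """P(X = k) when drawing n items without replacement from N_A successes and N - N_A failures."""
    return comb(N_A, k) * comb(N - N_A, n - k) / comb(N, n)

# P(exactly 2 defectives) when drawing 5 from 10 defectives + 90 good items.
print(hypergeom_pmf(2, N=100, N_A=10, n=5))   # ~0.0702
# Asking for more successes than exist gives probability 0, e.g. 3 out of only 2.
print(hypergeom_pmf(3, N=10, N_A=2, n=5))     # 0.0
```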
== Geometric distribution
Consider an infinite sequence of independent trials, e.g. the number of attempts until I make a basket.
Let $X_i$ denote the outcome of the $i^"th"$ trial, where success is 1 and failure is 0. Let $N$ be the number of trials needed to observe the first success in a sequence of independent trials with probability of success $p$. To have $N = k$, we fail $k-1$ times and succeed on the $k^"th"$ try, so
$ P(N = k) = (1-p)^(k-1) p quad "for" k = 1, 2, 3, dots $
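A simulation sketch (not part of the notes) comparing the empirical distribution of "trials until first success" with the formula $(1-p)^(k-1) p$:

```python
# Empirical check of the geometric p.m.f. via simulation.
import random

random.seed(0)
p, trials = 0.3, 100_000

def first_success(p):
    """Number of Bernoulli(p) trials needed to see the first success."""
    k = 1
    while random.random() >= p:
        k += 1
    return k

samples = [first_success(p) for _ in range(trials)]
for k in range(1, 5):
    empirical = samples.count(k) / trials
    exact = (1 - p) ** (k - 1) * p
    print(k, round(empirical, 4), round(exact, 4))
```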
#example[
Let $X$ have CDF $F(x) = x^3 / 8$ for $x in [0,2]$. Find a point $a$ such that integrating the density up to $a$ gives exactly $1/2$
the area, i.e. we want to find $1/2 = P(X <= a)$.
$ 1 / 2 = P(X <= a) = F(a) = a^3 / 8 => a = root(3, 4) $
]
== The (continuous) uniform distribution
The simplest and the best of the named distributions!
#definition[
Let $[a,b]$ be a bounded interval on the real line. A random variable $X$ has the uniform distribution on the interval $[a,b]$ if $X$ has the density function
$
f(x) = cases(
1/(b-a) &"for" x in [a,b],
0 &"for" x in.not [a,b]
)
$
Abbreviate this by $X ~ "Unif" [a,b]$.
]<continuous-uniform>
The graph of $"Unif" [a,b]$ is a constant line at height $1/(b-a)$ defined
across $[a,b]$. The integral is just the area of a rectangle, and we can check
it is 1.
#fact[
For $X ~ "Unif" [a,b]$, its cumulative distribution function (CDF) is given by:
$
F_X (x) = cases(
0 &"for" x < a,
(x-a)/(b-a) &"for" x in [a,b],
1 &"for" x > b
)
$
]
#fact[
If $X ~ "Unif" [a,b]$, and $[c,d] subset [a,b]$, then
$
P(c <= X <= d) = integral_c^d 1 / (b-a) dif x = (d-c) / (b-a)
$
]
#example[
Let $Y$ be a uniform random variable on $[-2,5]$. Find the probability that its
absolute value is at least 1.
$Y$ takes values in the interval $[-2,5]$, so the absolute value is at least 1 if and only if $Y in [-2,-1] union [1,5]$.
The density function of $Y$ is $f(x) = 1/(5- (-2)) = 1/7$ on $[-2,5]$ and 0 everywhere else.
So,
$
P(|Y| >= 1) &= P(Y in [-2,-1] union [1,5]) \
&= P(-2 <= Y <= -1) + P(1 <= Y <= 5) \
&= 1 / 7 + 4 / 7 = 5 / 7
$
]
== The exponential distribution
The geometric distribution can be viewed as modeling waiting times in a discrete setting: we wait through $n - 1$ failures to arrive at the first success on the $n^"th"$ trial.
The exponential distribution is the continuous analogue of the geometric
distribution, in that we often use it to model waiting times in the continuous
sense. For example, the waiting time until the first customer enters the barber shop.
#definition[
Let $0 < lambda < infinity$. A random variable $X$ has the exponential distribution with parameter $lambda$ if $X$ has PDF
$
f(x) = cases(
lambda e^(-lambda x) &"for" x >= 0,
0 &"for" x < 0
)
$
Abbreviate this by $X ~ "Exp"(lambda)$, the exponential distribution with rate $lambda$.
The CDF of the $"Exp"(lambda)$ distribution is given by:
$
F(t) = cases(
0 &"if" t <0,
1 - e^(-lambda t) &"if" t>= 0
)
$
]
#example[
Suppose the length of a phone call, in minutes, is well modeled by an exponential random variable with a rate $lambda = 1/10$.
1. What is the probability that a call takes more than 8 minutes?
2. What is the probability that a call takes between 8 and 22 minutes?
Let $X$ be the length of the phone call, so that $X ~ "Exp"(1/10)$. Then we can find the desired probability by:
$
P(X > 8) &= 1 - P(X <= 8) \
&= 1 - F_X (8) \
&= 1 - (1 - e^(-(1 / 10) dot 8)) \
&= e^(-8 / 10) approx 0.4493
$
Now to find $P(8 < X < 22)$, we can take the difference in CDFs:
$
&P(X > 8) - P(X >= 22) \
&= e^(-8 / 10) - e^(-22 / 10) \
&approx 0.3385
$
]
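The same two probabilities, computed directly from the CDF in a short sketch (not from the notes):

```python
# Exponential tail and interval probabilities for rate lambda = 1/10.
from math import exp

lam = 1 / 10

def tail(t, lam):
    """P(X > t) for X ~ Exp(lam)."""
    return exp(-lam * t)

print(tail(8, lam))                  # P(X > 8)       ~ 0.4493
print(tail(8, lam) - tail(22, lam))  # P(8 < X < 22)  ~ 0.3385
```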
#fact("Memoryless property of the exponential distribution")[
Suppose that $X ~ "Exp"(lambda)$. Then for any $s,t > 0$, we have
$
P(X > t + s | X > t) = P(X > s)
$
]<memoryless>
This is like saying: if I've already waited 5 minutes for the bus, the
probability that I end up waiting more than 5 + 3 minutes in total, given
that I've already waited 5 minutes, is precisely the probability that I
wait more than 3 minutes.
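Before the proof, a simulation sketch (not part of the notes) of the memoryless property: among simulated waiting times that exceed $t$, the fraction exceeding $t + s$ matches the unconditional $P(X > s)$.

```python
# Empirical check of P(X > t + s | X > t) = P(X > s) for an exponential.
import random
from math import exp

random.seed(2)
lam, t, s, n = 0.5, 5.0, 3.0, 500_000
samples = [random.expovariate(lam) for _ in range(n)]

exceed_t = [x for x in samples if x > t]
conditional = sum(1 for x in exceed_t if x > t + s) / len(exceed_t)
print(conditional, exp(-lam * s))   # both should be near P(X > s) = e^(-lam*s)
```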
#proof[
$
P(X > t + s | X > t) &= (P(X > t + s sect X > t)) / (P(X > t)) \
&= (P(X > t + s)) / (P(X > t)) quad "since" {X > t + s} subset {X > t} \
&= e^(-lambda (t + s)) / e^(-lambda t) = e^(-lambda s) = P(X > s)
$
]
== The gamma distribution
#definition[
Let $r, lambda > 0$. A random variable $X$ has the *gamma distribution* with parameters $(r, lambda)$ if $X$ is nonnegative and has probability density function
$
f(x) = cases(
(lambda^r x^(r-1))/(Gamma(r)) e^(-lambda x) &"for" x >= 0,
0 &"for" x < 0
)
$
Abbreviate this by $X ~ "Gamma"(r, lambda)$.
]
The gamma function $Gamma(r)$ generalizes the factorial function and is defined as
$ Gamma(r) = integral_0^infinity x^(r-1) e^(-x) dif x, quad r > 0 $
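As a quick sketch (not from the notes), Python's `math.gamma` implements $Gamma$, so one can check the factorial identity $Gamma(n) = (n-1)!$ and numerically confirm that the $"Gamma"(r, lambda)$ density integrates to 1; the helper name `gamma_pdf` is made up.

```python
# Check Gamma(n) = (n-1)! and numerically integrate the Gamma(r, lambda) density.
from math import gamma, factorial, exp

for n in range(1, 6):
    assert abs(gamma(n) - factorial(n - 1)) < 1e-9

def gamma_pdf(x, r, lam):
    """Density of the Gamma(r, lambda) distribution at x > 0."""
    return lam ** r * x ** (r - 1) * exp(-lam * x) / gamma(r)

# Crude Riemann sum of the density over (0, 50]; should be close to 1.
r, lam, dx = 2.5, 1.5, 0.001
total = sum(gamma_pdf(k * dx, r, lam) * dx for k in range(1, 50_001))
print(total)   # ~1.0
```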