blog/atom.xml

<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <title>The Involution</title>
    <link href="https://blog.youwen.dev/atom.xml" rel="self" />
    <link href="https://blog.youwen.dev" />
    <id>https://blog.youwen.dev/atom.xml</id>
    <author>
        <name>Youwen Wu</name>
        
        <email>youwenw@gmail.com</email>
        
    </author>
    <updated>2025-02-16T00:00:00Z</updated>
    <entry>
    <title>Random variables, distributions, and probability theory</title>
    <link href="https://blog.youwen.dev/random-variables-distributions-and-probability-theory.html" />
    <id>https://blog.youwen.dev/random-variables-distributions-and-probability-theory.html</id>
    <published>2025-02-16T00:00:00Z</published>
    <updated>2025-02-16T00:00:00Z</updated>
    <summary type="html"><![CDATA[<article>
  <header>
    <h1 class="text-4xl">
      <a href="./random-variables-distributions-and-probability-theory.html">Random variables, distributions, and probability theory</a>
    </h1>
    <p
      class="mb-1 mt-2 italic font-light text-lg text-accent-light dark:text-accent-dark"
    >
      An overview of discrete and continuous random variables and their distributions and moment generating functions
    </p>
    <div class="mt-2">2025-02-16</div>
    <div class="mt-1 text-sm">
      
    </div>
  </header>
  <main class="post mt-4"><p>These are some notes I’ve been collecting on random variables, their
distributions, expected values, and moment generating functions. I
thought I’d write them down somewhere useful.</p>
<p>These are extracted almost verbatim from my in-class notes, which I take
in real time using Typst. I simply wrote a tiny compatibility shim to
allow Pandoc to render them to the web.</p>
<hr />
<h2 id="random-variables">Random variables</h2>
<p>First, some brief exposition on random variables. Counterintuitively, a random
variable is actually a function.</p>
<p>Standard notation: <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>Ω</mi><annotation encoding="application/x-tex">\Omega</annotation></semantics></math> is a sample space, <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>ω</mi><mo>∈</mo><mi>Ω</mi></mrow><annotation encoding="application/x-tex">\omega \in \Omega</annotation></semantics></math> is an
outcome (a single sample point).</p>
<p><em>Definition. </em></p>
<p>A <strong>random variable</strong> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> is a function
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mo>:</mo><mi>Ω</mi><mo>→</mo><mi>ℝ</mi></mrow><annotation encoding="application/x-tex">X:\Omega \rightarrow {\mathbb{R}}</annotation></semantics></math> that maps each possible
outcome in a sample space to a point in a <a href="https://en.wikipedia.org/wiki/Measurable_space">measurable
space</a>, typically (as in
our case) a subset of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>ℝ</mi><annotation encoding="application/x-tex">\mathbb{R}</annotation></semantics></math>.</p>
<p><em>Definition. </em></p>
<p>The <strong>state space</strong> of a random variable <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> is all of the values <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math>
can take.</p>
<p><em>Example. </em></p>
<p>Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> be a random variable that takes on the values
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="true" form="prefix">{</mo><mn>0</mn><mo>,</mo><mn>1</mn><mo>,</mo><mn>2</mn><mo>,</mo><mn>3</mn><mo stretchy="true" form="postfix">}</mo></mrow><annotation encoding="application/x-tex">\left\{ 0,1,2,3 \right\}</annotation></semantics></math>. Then the state space of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> is the set
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="true" form="prefix">{</mo><mn>0</mn><mo>,</mo><mn>1</mn><mo>,</mo><mn>2</mn><mo>,</mo><mn>3</mn><mo stretchy="true" form="postfix">}</mo></mrow><annotation encoding="application/x-tex">\left\{ 0,1,2,3 \right\}</annotation></semantics></math>.</p>
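<p>To make the "random variable is a function" idea concrete, here is a minimal Python sketch. The experiment (three coin flips) and the names <code>omega</code>, <code>X</code>, and <code>state_space</code> are assumptions chosen for illustration; they are not from the notes.</p>

```python
from itertools import product

# Sample space: every sequence of three coin flips, e.g. ("H", "T", "H").
omega = list(product("HT", repeat=3))

def X(outcome):
    """Random variable: maps an outcome to its number of heads."""
    return outcome.count("H")

# State space: the set of all values X can take.
state_space = {X(w) for w in omega}
print(sorted(state_space))
```

<p>Under these assumptions the state space comes out as the same set of four values used in the example above.</p>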
<h3 id="discrete-random-variables">Discrete random variables</h3>
<p>A random variable <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> is discrete if there is a countable set <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>A</mi><annotation encoding="application/x-tex">A</annotation></semantics></math> such that
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>∈</mo><mi>A</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">P(X \in A) = 1</annotation></semantics></math>. <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>k</mi><annotation encoding="application/x-tex">k</annotation></semantics></math> is a possible value if <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mi>k</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>&gt;</mo><mn>0</mn></mrow><annotation encoding="application/x-tex">P(X = k) &gt; 0</annotation></semantics></math>. We discuss
continuous random variables later.</p>
<p>The <em>probability distribution</em> of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> gives its important probabilistic
information. The probability distribution is a description of the
probabilities <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>∈</mo><mi>B</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">P(X \in B)</annotation></semantics></math> for subsets <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>B</mi><mo>⊂</mo><mi>ℝ</mi></mrow><annotation encoding="application/x-tex">B \subset {\mathbb{R}}</annotation></semantics></math>. We describe
the probability density function and the cumulative distribution
function.</p>
<p>A discrete random variable has probability distribution entirely
determined by its probability mass function (hereafter abbreviated p.m.f.
or PMF) <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>p</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>k</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mi>k</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">p(k) = P(X = k)</annotation></semantics></math>. The p.m.f. is a function from the set of
possible values of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> into <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false" form="prefix">[</mo><mn>0</mn><mo>,</mo><mn>1</mn><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">\lbrack 0,1\rbrack</annotation></semantics></math>. To label the p.m.f.
with its random variable, we write <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>p</mi><mi>X</mi></msub><mrow><mo stretchy="true" form="prefix">(</mo><mi>k</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">p_{X}(k)</annotation></semantics></math>.</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>p</mi><mi>X</mi></msub><mo>:</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> State space of </mtext><mspace width="0.333em"></mspace></mrow><mi>X</mi><mo>→</mo><mo stretchy="false" form="prefix">[</mo><mn>0</mn><mo>,</mo><mn>1</mn><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">p_{X}:\text{ State space of }X \rightarrow \lbrack 0,1\rbrack</annotation></semantics></math></p>
<p>By the axioms of probability,</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><munder><mo>∑</mo><mi>k</mi></munder><msub><mi>p</mi><mi>X</mi></msub><mrow><mo stretchy="true" form="prefix">(</mo><mi>k</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><munder><mo>∑</mo><mi>k</mi></munder><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mi>k</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">\sum_{k}p_{X}(k) = \sum_{k}P(X = k) = 1</annotation></semantics></math></p>
<p>For a subset <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>B</mi><mo>⊂</mo><mi>ℝ</mi></mrow><annotation encoding="application/x-tex">B \subset {\mathbb{R}}</annotation></semantics></math>,</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>∈</mo><mi>B</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><munder><mo>∑</mo><mrow><mi>k</mi><mo>∈</mo><mi>B</mi></mrow></munder><msub><mi>p</mi><mi>X</mi></msub><mrow><mo stretchy="true" form="prefix">(</mo><mi>k</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">P(X \in B) = \sum_{k \in B}p_{X}(k)</annotation></semantics></math></p>
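<p>The two sums above can be checked on a small example. This is only a sketch: the p.m.f. below (number of heads in three fair coin flips) and the helper name <code>prob</code> are assumptions made for illustration.</p>

```python
from fractions import Fraction

# Assumed p.m.f. of X = number of heads in three fair coin flips.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

def prob(B):
    """P(X in B): sum the p.m.f. over the possible values that lie in B."""
    return sum((p for k, p in pmf.items() if k in B), Fraction(0))

total = sum(pmf.values())   # the p.m.f. sums to 1
p_even = prob({0, 2})       # P(X in {0, 2}) = 1/8 + 3/8
print(total, p_even)
```

<p>Exact fractions are used so that the total is exactly 1 rather than a floating-point approximation.</p>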
<h3 id="continuous-random-variables">Continuous random variables</h3>
<p>Now as promised we introduce another major class of random variables.</p>
<p><em>Definition. </em></p>
<p>Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> be a random variable. If <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>f</mi><annotation encoding="application/x-tex">f</annotation></semantics></math> satisfies</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>≤</mo><mi>b</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><msubsup><mo>∫</mo><mrow><mi>−</mi><mi>∞</mi></mrow><mi>b</mi></msubsup><mi>f</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>d</mi><mi>x</mi></mrow><annotation encoding="application/x-tex">P(X \leq b) = \int_{- \infty}^{b}f(x)dx</annotation></semantics></math></p>
<p>for all <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>b</mi><mo>∈</mo><mi>ℝ</mi></mrow><annotation encoding="application/x-tex">b \in {\mathbb{R}}</annotation></semantics></math>, then <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>f</mi><annotation encoding="application/x-tex">f</annotation></semantics></math> is the <strong>probability density
function</strong> (hereafter abbreviated p.d.f. or PDF) of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math>.</p>
<p>We immediately see that the p.d.f. is analogous to the p.m.f. of the
discrete case.</p>
<p>The probability that <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mo>∈</mo><mo stretchy="false" form="prefix">(</mo><mi>−</mi><mi>∞</mi><mo>,</mo><mi>b</mi><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">X \in ( - \infty,b\rbrack</annotation></semantics></math> is equal to the area
under the graph of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>f</mi><annotation encoding="application/x-tex">f</annotation></semantics></math> from <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>−</mi><mi>∞</mi></mrow><annotation encoding="application/x-tex">- \infty</annotation></semantics></math> to <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>b</mi><annotation encoding="application/x-tex">b</annotation></semantics></math>.</p>
<p>A corollary is the following.</p>
<p><em>Fact. </em></p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>∈</mo><mi>B</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><msub><mo>∫</mo><mi>B</mi></msub><mi>f</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>d</mi><mi>x</mi></mrow><annotation encoding="application/x-tex">P(X \in B) = \int_{B}f(x)dx</annotation></semantics></math></p>
<p>for any <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>B</mi><mo>⊂</mo><mi>ℝ</mi></mrow><annotation encoding="application/x-tex">B \subset {\mathbb{R}}</annotation></semantics></math> where integration makes sense.</p>
<p>The set can be bounded or unbounded, or any collection of intervals.</p>
<p><em>Fact. </em></p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>a</mi><mo>≤</mo><mi>X</mi><mo>≤</mo><mi>b</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><msubsup><mo>∫</mo><mi>a</mi><mi>b</mi></msubsup><mi>f</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>d</mi><mi>x</mi></mrow><annotation encoding="application/x-tex">P(a \leq X \leq b) = \int_{a}^{b}f(x)dx</annotation></semantics></math>
<math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>&gt;</mo><mi>a</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><msubsup><mo>∫</mo><mi>a</mi><mi>∞</mi></msubsup><mi>f</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>d</mi><mi>x</mi></mrow><annotation encoding="application/x-tex">P(X &gt; a) = \int_{a}^{\infty}f(x)dx</annotation></semantics></math></p>
<p><em>Fact. </em></p>
<p>If a random variable <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> has density function <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>f</mi><annotation encoding="application/x-tex">f</annotation></semantics></math> then individual point
values have probability zero:</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mi>c</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><msubsup><mo>∫</mo><mi>c</mi><mi>c</mi></msubsup><mi>f</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>d</mi><mi>x</mi><mo>=</mo><mn>0</mn><mo>,</mo><mo>∀</mo><mi>c</mi><mo>∈</mo><mi>ℝ</mi></mrow><annotation encoding="application/x-tex">P(X = c) = \int_{c}^{c}f(x)dx = 0,\forall c \in {\mathbb{R}}</annotation></semantics></math></p>
<p><em>Remark. </em></p>
<p>It follows that a random variable with a density function is not discrete. An
immediate corollary of this is that the probabilities of intervals are
not changed by including or excluding endpoints. So <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>≤</mo><mi>k</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">P(X \leq k)</annotation></semantics></math> and
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>&lt;</mo><mi>k</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">P(X &lt; k)</annotation></semantics></math> are equal.</p>
<p>How do we determine which functions are p.d.f.s? Since probabilities are nonnegative and
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>−</mi><mi>∞</mi><mo>&lt;</mo><mi>X</mi><mo>&lt;</mo><mi>∞</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">P( - \infty &lt; X &lt; \infty) = 1</annotation></semantics></math>, a p.d.f. <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>f</mi><annotation encoding="application/x-tex">f</annotation></semantics></math> must satisfy</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable><mtr><mtd columnalign="right" style="text-align: right"><mi>f</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>≥</mo><mn>0</mn><mo>∀</mo><mi>x</mi><mo>∈</mo><mi>ℝ</mi></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"><msubsup><mo>∫</mo><mrow><mi>−</mi><mi>∞</mi></mrow><mi>∞</mi></msubsup><mi>f</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>d</mi><mi>x</mi><mo>=</mo><mn>1</mn></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{array}{r}
f(x) \geq 0\forall x \in {\mathbb{R}} \\
\int_{- \infty}^{\infty}f(x)dx = 1
\end{array}</annotation></semantics></math></p>
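<p>As a rough numerical check of these conditions, the sketch below integrates one assumed density, f(x) = 2x on [0, 1] (taken to be 0 elsewhere), with a simple midpoint rule. The density, the helper <code>integrate</code>, and the step count are illustrative assumptions, not part of the notes.</p>

```python
def f(x):
    """Assumed example density on [0, 1]: f(x) = 2x (taken to be 0 elsewhere)."""
    return 2.0 * x

def integrate(g, a, b, steps=10_000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(g(a + (i + 0.5) * width) for i in range(steps)) * width

total = integrate(f, 0.0, 1.0)         # normalization: should be close to 1
p_interval = integrate(f, 0.25, 0.5)   # P(X in [0.25, 0.5]), close to 0.1875
p_point = integrate(f, 0.5, 0.5)       # P(X = 0.5): a zero-width interval gives 0
print(total, p_interval, p_point)
```

<p>The function is nonnegative on [0, 1] and its total area is 1, so it satisfies both conditions; the zero-width integral illustrates why single points carry probability zero.</p>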
<p><em>Fact. </em></p>
<p>Random variables with density functions are called <em>continuous</em> random
variables. This does not imply that the random variable is a continuous
function on <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>Ω</mi><annotation encoding="application/x-tex">\Omega</annotation></semantics></math> but it is standard terminology.</p>
<h2 id="discrete-distributions">Discrete distributions</h2>
<p>Recall that the <em>probability distribution</em> of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> gives its important
probabilistic information. Let us discuss some of these distributions.</p>
<p>In general we first consider the experiment’s properties and theorize
about the distribution that its random variable takes. We can then apply
the distribution to find out various pieces of probabilistic
information.</p>
<h3 id="bernoulli-trials">Bernoulli trials</h3>
<p>A Bernoulli trial is the original “experiment.” It’s simply a single
trial with a binary “success” or “failure” outcome. Encode this as T/F, 0
or 1, or however you’d like. It becomes immediately useful in defining
more complex distributions, so let’s analyze its properties.</p>
<p>The setup: the experiment has exactly two outcomes:</p>
<ul>
<li><p>Success – <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>S</mi><annotation encoding="application/x-tex">S</annotation></semantics></math> or 1</p></li>
<li><p>Failure – <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>F</mi><annotation encoding="application/x-tex">F</annotation></semantics></math> or 0</p></li>
</ul>
<p>Additionally: <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable><mtr><mtd columnalign="right" style="text-align: right"><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>S</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>p</mi><mo>,</mo><mrow><mo stretchy="true" form="prefix">(</mo><mn>0</mn><mo>&lt;</mo><mi>p</mi><mo>&lt;</mo><mn>1</mn><mo stretchy="true" form="postfix">)</mo></mrow></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>F</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mn>1</mn><mo>−</mo><mi>p</mi><mo>=</mo><mi>q</mi></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{array}{r}
P(S) = p,(0 &lt; p &lt; 1) \\
P(F) = 1 - p = q
\end{array}</annotation></semantics></math></p>
<p>Construct the probability mass function:</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable><mtr><mtd columnalign="right" style="text-align: right"><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mn>1</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>p</mi></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mn>0</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mn>1</mn><mo>−</mo><mi>p</mi></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{array}{r}
P(X = 1) = p \\
P(X = 0) = 1 - p
\end{array}</annotation></semantics></math></p>
<p>Write it as:</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>p</mi><mi>X</mi></msub><mrow><mo stretchy="true" form="prefix">(</mo><mi>k</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><msup><mi>p</mi><mi>k</mi></msup><msup><mrow><mo stretchy="true" form="prefix">(</mo><mn>1</mn><mo>−</mo><mi>p</mi><mo stretchy="true" form="postfix">)</mo></mrow><mrow><mn>1</mn><mo>−</mo><mi>k</mi></mrow></msup></mrow><annotation encoding="application/x-tex">p_{X}(k) = p^{k}(1 - p)^{1 - k}</annotation></semantics></math></p>
<p>for <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>k</mi><mo>=</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">k = 1</annotation></semantics></math> and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>k</mi><mo>=</mo><mn>0</mn></mrow><annotation encoding="application/x-tex">k = 0</annotation></semantics></math>.</p>
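<p>The compact form of the Bernoulli p.m.f. can be written directly as a small function. This is a sketch: the function name <code>bernoulli_pmf</code> and the value of <code>p</code> are assumptions chosen for illustration.</p>

```python
def bernoulli_pmf(k, p):
    """p.m.f. of a Bernoulli trial: p^k * (1 - p)^(1 - k) for k in {0, 1}."""
    if k not in (0, 1):
        raise ValueError("k must be 0 or 1")
    return p**k * (1 - p) ** (1 - k)

p = 0.3                         # assumed success probability
success = bernoulli_pmf(1, p)   # the exponent on (1 - p) is 0, leaving p
failure = bernoulli_pmf(0, p)   # the exponent on p is 0, leaving 1 - p
print(success, failure)
```

<p>The exponents act as switches: exactly one of the two factors survives for each value of k.</p>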
<h3 id="binomial-distribution">Binomial distribution</h3>
<p>The setup is very similar to the Bernoulli case: each trial has exactly 2 outcomes, and
we run a fixed number of Bernoulli trials in a row.</p>
<p>Importantly: <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>p</mi><annotation encoding="application/x-tex">p</annotation></semantics></math> and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>q</mi><annotation encoding="application/x-tex">q</annotation></semantics></math> are defined exactly the same in all trials.</p>
<p>This ties the binomial distribution to the sampling with replacement
model, since each trial does not affect the next.</p>
<p>We conduct <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>n</mi><annotation encoding="application/x-tex">n</annotation></semantics></math> <strong>independent</strong> trials of this experiment. For example, with
coins, each flip independently has a <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mfrac><mn>1</mn><mn>2</mn></mfrac><annotation encoding="application/x-tex">\frac{1}{2}</annotation></semantics></math> chance of heads or
tails (the same holds for a die, a rigged coin, etc.).</p>
<p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>n</mi><annotation encoding="application/x-tex">n</annotation></semantics></math> is fixed, i.e. known ahead of time.</p>
<h4 id="binomial-random-variable">Binomial random variable</h4>
<p>Let’s consider the random variable characterized by the binomial
distribution now.</p>
<p>Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mo>=</mo><mi>#</mi></mrow><annotation encoding="application/x-tex">X = \#</annotation></semantics></math> of successes in <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>n</mi><annotation encoding="application/x-tex">n</annotation></semantics></math> independent trials. Each outcome is a particular
sequence of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>n</mi><annotation encoding="application/x-tex">n</annotation></semantics></math> trial results, so the sample space takes the form
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>Ω</mi><mo>=</mo><mrow><mo stretchy="true" form="prefix">{</mo><mi>ω</mi><mo stretchy="true" form="postfix">}</mo></mrow><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> where </mtext><mspace width="0.333em"></mspace></mrow><mi>ω</mi><mo>=</mo><mi>S</mi><mi>F</mi><mi>F</mi><mi>⋯</mi><mi>F</mi></mrow><annotation encoding="application/x-tex">\Omega = \left\{ \omega \right\}\text{ where }\omega = SFF\cdots F</annotation></semantics></math>, with
each sequence of length <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>n</mi><annotation encoding="application/x-tex">n</annotation></semantics></math>.</p>
<p>Then <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>ω</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mn>0</mn><mo>,</mo><mn>1</mn><mo>,</mo><mn>2</mn><mo>,</mo><mi>…</mi><mo>,</mo><mi>n</mi></mrow><annotation encoding="application/x-tex">X(\omega) = 0,1,2,\ldots,n</annotation></semantics></math> can take <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>n</mi><mo>+</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">n + 1</annotation></semantics></math> possible values. The
probability of any particular sequence is given by the product of the
individual trial probabilities.</p>
<p><em>Example. </em></p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>ω</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>S</mi><mi>F</mi><mi>F</mi><mi>S</mi><mi>F</mi><mi>⋯</mi><mi>S</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>p</mi><mi>q</mi><mi>q</mi><mi>p</mi><mi>q</mi><mi>⋯</mi><mi>p</mi></mrow><annotation encoding="application/x-tex">P(\omega) = P(SFFSF\cdots S) = pqqpq\cdots p</annotation></semantics></math></p>
<p>So <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mn>0</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>F</mi><mi>F</mi><mi>F</mi><mi>⋯</mi><mi>F</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>q</mi><mo>⋅</mo><mi>q</mi><mo>⋅</mo><mi>⋯</mi><mo>⋅</mo><mi>q</mi><mo>=</mo><msup><mi>q</mi><mi>n</mi></msup></mrow><annotation encoding="application/x-tex">P(X = 0) = P(FFF\cdots F) = q \cdot q \cdot \cdots \cdot q = q^{n}</annotation></semantics></math>.</p>
<p>And <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable><mtr><mtd columnalign="right" style="text-align: right"><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mn>1</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>S</mi><mi>F</mi><mi>F</mi><mi>⋯</mi><mi>F</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>+</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>F</mi><mi>S</mi><mi>F</mi><mi>F</mi><mi>⋯</mi><mi>F</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>+</mo><mi>⋯</mi><mo>+</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>F</mi><mi>F</mi><mi>F</mi><mi>⋯</mi><mi>F</mi><mi>S</mi><mo stretchy="true" form="postfix">)</mo></mrow></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"><mo>=</mo><munder><munder><mi>n</mi><mo accent="true">⏟</mo></munder><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> possible outcomes</mtext></mrow></munder><mo>⋅</mo><msup><mi>p</mi><mn>1</mn></msup><mo>⋅</mo><msup><mi>q</mi><mrow><mi>n</mi><mo>−</mo><mn>1</mn></mrow></msup></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"><mo>=</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mi>n</mi></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>1</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><mo>⋅</mo><msup><mi>p</mi><mn>1</mn></msup><mo>⋅</mo><msup><mi>q</mi><mrow><mi>n</mi><mo>−</mo><mn>1</mn></mrow></msup></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"><mo>=</mo><mi>n</mi><mo>⋅</mo><msup><mi>p</mi><mn>1</mn></msup><mo>⋅</mo><msup><mi>q</mi><mrow><mi>n</mi><mo>−</mo><mn>1</mn></mrow></msup></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{array}{r}
P(X = 1) = P(SFF\cdots F) + P(FSFF\cdots F) + \cdots + P(FFF\cdots FS) \\
 = \underset{\text{ possible outcomes}}{\underbrace{n}} \cdot p^{1} \cdot q^{n - 1} \\
 = \begin{pmatrix}
n \\
1
\end{pmatrix} \cdot p^{1} \cdot q^{n - 1} \\
 = n \cdot p^{1} \cdot q^{n - 1}
\end{array}</annotation></semantics></math></p>
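<p>The counting argument for exactly one success can be verified by brute force: list every length-n sequence of S and F, keep those with a single S, and add up their probabilities. In this sketch the values of <code>n</code> and <code>p</code> and the helper <code>sequence_prob</code> are assumptions chosen for illustration.</p>

```python
from fractions import Fraction
from itertools import product

n = 4                    # assumed number of trials
p = Fraction(1, 3)       # assumed success probability
q = 1 - p

def sequence_prob(w):
    """Probability of one sequence: multiply p for each S and q for each F."""
    result = Fraction(1)
    for trial in w:
        result *= p if trial == "S" else q
    return result

# Add up the probabilities of every length-n sequence with exactly one success.
p_one = sum(sequence_prob(w) for w in product("SF", repeat=n) if w.count("S") == 1)
print(p_one, n * p * q ** (n - 1))
```

<p>Each of the n qualifying sequences contributes one factor of p and n - 1 factors of q, so the enumerated total agrees with n times p times q to the power n - 1.</p>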
<p>The same counting argument gives the case of two successes:</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mn>2</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mi>n</mi></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>2</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><msup><mi>p</mi><mn>2</mn></msup><msup><mi>q</mi><mrow><mi>n</mi><mo>−</mo><mn>2</mn></mrow></msup></mrow><annotation encoding="application/x-tex">P(X = 2) = \begin{pmatrix}
n \\
2
\end{pmatrix}p^{2}q^{n - 2}</annotation></semantics></math></p>
<p>How about all successes?</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mi>n</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>S</mi><mi>S</mi><mi>⋯</mi><mi>S</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><msup><mi>p</mi><mi>n</mi></msup></mrow><annotation encoding="application/x-tex">P(X = n) = P(SS\cdots S) = p^{n}</annotation></semantics></math></p>
<p>We see that for all failures we have <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>q</mi><mi>n</mi></msup><annotation encoding="application/x-tex">q^{n}</annotation></semantics></math> and all successes we have
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>p</mi><mi>n</mi></msup><annotation encoding="application/x-tex">p^{n}</annotation></semantics></math>. Otherwise we use our method above.</p>
<p>In general, here is the probability mass function for the binomial
random variable</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mi>k</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mi>n</mi></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mi>k</mi></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><msup><mi>p</mi><mi>k</mi></msup><msup><mi>q</mi><mrow><mi>n</mi><mo>−</mo><mi>k</mi></mrow></msup><mo>,</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> for </mtext><mspace width="0.333em"></mspace></mrow><mi>k</mi><mo>=</mo><mn>0</mn><mo>,</mo><mn>1</mn><mo>,</mo><mn>2</mn><mo>,</mo><mi>…</mi><mo>,</mo><mi>n</mi></mrow><annotation encoding="application/x-tex">P(X = k) = \begin{pmatrix}
n \\
k
\end{pmatrix}p^{k}q^{n - k},\text{ for }k = 0,1,2,\ldots,n</annotation></semantics></math></p>
<p>The binomial distribution is very powerful. Whenever each trial comes down to
one of two outcomes, it tells us the probability of any given number of successes.</p>
<p>To summarize the characterization of the binomial random variable:</p>
<ul>
<li><p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>n</mi><annotation encoding="application/x-tex">n</annotation></semantics></math> independent trials</p></li>
<li><p>each trial results in binary success or failure</p></li>
<li><p>with probability of success <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>p</mi><annotation encoding="application/x-tex">p</annotation></semantics></math>, identically across trials</p></li>
</ul>
<p>with <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mo>=</mo><mi>#</mi></mrow><annotation encoding="application/x-tex">X = \#</annotation></semantics></math> successes in <strong>fixed</strong> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>n</mi><annotation encoding="application/x-tex">n</annotation></semantics></math> trials.</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mo>∼</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> Bin</mtext></mrow><mrow><mo stretchy="true" form="prefix">(</mo><mi>n</mi><mo>,</mo><mi>p</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">X\sim\text{ Bin}(n,p)</annotation></semantics></math></p>
<p>with probability mass function</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mi>n</mi></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mi>x</mi></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><msup><mi>p</mi><mi>x</mi></msup><msup><mrow><mo stretchy="true" form="prefix">(</mo><mn>1</mn><mo>−</mo><mi>p</mi><mo stretchy="true" form="postfix">)</mo></mrow><mrow><mi>n</mi><mo>−</mo><mi>x</mi></mrow></msup><mo>=</mo><mi>p</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> for </mtext><mspace width="0.333em"></mspace></mrow><mi>x</mi><mo>=</mo><mn>0</mn><mo>,</mo><mn>1</mn><mo>,</mo><mn>2</mn><mo>,</mo><mi>…</mi><mo>,</mo><mi>n</mi></mrow><annotation encoding="application/x-tex">P(X = x) = \begin{pmatrix}
n \\
x
\end{pmatrix}p^{x}(1 - p)^{n - x} = p(x)\text{ for }x = 0,1,2,\ldots,n</annotation></semantics></math></p>
<p>This is a valid p.m.f.: every term is nonnegative, and the binomial theorem tells us what the terms sum to!</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>p</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>≥</mo><mn>0</mn><mo>,</mo><munderover><mo>∑</mo><mrow><mi>x</mi><mo>=</mo><mn>0</mn></mrow><mi>n</mi></munderover><mi>p</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><munderover><mo>∑</mo><mrow><mi>x</mi><mo>=</mo><mn>0</mn></mrow><mi>n</mi></munderover><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mi>n</mi></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mi>x</mi></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><msup><mi>p</mi><mi>x</mi></msup><msup><mi>q</mi><mrow><mi>n</mi><mo>−</mo><mi>x</mi></mrow></msup><mo>=</mo><msup><mrow><mo stretchy="true" form="prefix">(</mo><mi>p</mi><mo>+</mo><mi>q</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>n</mi></msup></mrow><annotation encoding="application/x-tex">p(x) \geq 0,\sum_{x = 0}^{n}p(x) = \sum_{x = 0}^{n}\begin{pmatrix}
n \\
x
\end{pmatrix}p^{x}q^{n - x} = (p + q)^{n}</annotation></semantics></math></p>
<p>In fact, <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mrow><mo stretchy="true" form="prefix">(</mo><mi>p</mi><mo>+</mo><mi>q</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>n</mi></msup><mo>=</mo><msup><mrow><mo stretchy="true" form="prefix">(</mo><mi>p</mi><mo>+</mo><mrow><mo stretchy="true" form="prefix">(</mo><mn>1</mn><mo>−</mo><mi>p</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo stretchy="true" form="postfix">)</mo></mrow><mi>n</mi></msup><mo>=</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">(p + q)^{n} = \left( p + (1 - p) \right)^{n} = 1</annotation></semantics></math></p>
<p><em>Example. </em></p>
<p>What is the probability of getting exactly three aces (1’s) out of 10
throws of a fair die?</p>
<p>This seems a little trickier, but we can still treat each throw as a well-defined
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>S</mi><annotation encoding="application/x-tex">S</annotation></semantics></math>/<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>F</mi><annotation encoding="application/x-tex">F</annotation></semantics></math> trial. Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>S</mi><annotation encoding="application/x-tex">S</annotation></semantics></math> be getting an ace and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>F</mi><annotation encoding="application/x-tex">F</annotation></semantics></math> be anything else.</p>
<p>Then <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>p</mi><mo>=</mo><mfrac><mn>1</mn><mn>6</mn></mfrac></mrow><annotation encoding="application/x-tex">p = \frac{1}{6}</annotation></semantics></math> and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>n</mi><mo>=</mo><mn>10</mn></mrow><annotation encoding="application/x-tex">n = 10</annotation></semantics></math>. We want <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mn>3</mn><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">P(X = 3)</annotation></semantics></math>. So</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable><mtr><mtd columnalign="right" style="text-align: right"><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mn>3</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>10</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>3</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><msup><mi>p</mi><mn>3</mn></msup><msup><mi>q</mi><mn>7</mn></msup><mo>=</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>10</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>3</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><msup><mrow><mo stretchy="true" form="prefix">(</mo><mfrac><mn>1</mn><mn>6</mn></mfrac><mo stretchy="true" form="postfix">)</mo></mrow><mn>3</mn></msup><msup><mrow><mo stretchy="true" form="prefix">(</mo><mfrac><mn>5</mn><mn>6</mn></mfrac><mo stretchy="true" form="postfix">)</mo></mrow><mn>7</mn></msup></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"><mo>≈</mo><mn>0.15505</mn></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{array}{r}
P(X = 3) = \begin{pmatrix}
10 \\
3
\end{pmatrix}p^{3}q^{7} = \begin{pmatrix}
10 \\
3
\end{pmatrix}\left( \frac{1}{6} \right)^{3}\left( \frac{5}{6} \right)^{7} \\
 \approx 0.15505
\end{array}</annotation></semantics></math></p>
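<p>As a quick sanity check on the arithmetic, here is a minimal Python sketch of the binomial p.m.f. The function name <code>binomial_pmf</code> is my own choice for illustration, and I use exact fractions so that the sum over all outcomes can be compared against 1 without rounding error.</p>

```python
from fractions import Fraction
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p): (n choose k) * p^k * q^(n - k)."""
    q = 1 - p
    return comb(n, k) * p**k * q ** (n - k)


# The die example: n = 10 throws, success probability p = 1/6, k = 3 aces.
p = Fraction(1, 6)
three_aces = binomial_pmf(3, 10, p)
print(three_aces, float(three_aces))

# The p.m.f. over k = 0, 1, ..., n should sum to (p + q)^n = 1.
total = sum(binomial_pmf(k, 10, p) for k in range(11))
print(total)
```

<p>The first printed value should agree with the hand computation above to about five decimal places, and the second should be exactly 1.</p>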
<h4 id="with-or-without-replacement">With or without replacement?</h4>
<p>I place particular emphasis on the fact that the binomial distribution
generally applies to cases where you’re sampling with <em>replacement</em>.
Consider the following:</p>
<p><em>Example. </em></p>
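<p>Suppose we have two types of candy, <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>a</mi><annotation encoding="application/x-tex">a</annotation></semantics></math> red and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>b</mi><annotation encoding="application/x-tex">b</annotation></semantics></math> black. Select <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>n</mi><annotation encoding="application/x-tex">n</annotation></semantics></math> candies.
Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> be the number of red candies among the <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>n</mi><annotation encoding="application/x-tex">n</annotation></semantics></math> selected.</p>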
<p>There are two cases.</p>
<ul>
<li>Case 1, with replacement: binomial distribution with parameters <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>n</mi><annotation encoding="application/x-tex">n</annotation></semantics></math> and
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>p</mi><mo>=</mo><mfrac><mi>a</mi><mrow><mi>a</mi><mo>+</mo><mi>b</mi></mrow></mfrac></mrow><annotation encoding="application/x-tex">p = \frac{a}{a + b}</annotation></semantics></math>.</li>
</ul>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mn>2</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mi>n</mi></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>2</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><msup><mrow><mo stretchy="true" form="prefix">(</mo><mfrac><mi>a</mi><mrow><mi>a</mi><mo>+</mo><mi>b</mi></mrow></mfrac><mo stretchy="true" form="postfix">)</mo></mrow><mn>2</mn></msup><msup><mrow><mo stretchy="true" form="prefix">(</mo><mfrac><mi>b</mi><mrow><mi>a</mi><mo>+</mo><mi>b</mi></mrow></mfrac><mo stretchy="true" form="postfix">)</mo></mrow><mrow><mi>n</mi><mo>−</mo><mn>2</mn></mrow></msup></mrow><annotation encoding="application/x-tex">P(X = 2) = \begin{pmatrix}
n \\
2
\end{pmatrix}\left( \frac{a}{a + b} \right)^{2}\left( \frac{b}{a + b} \right)^{n - 2}</annotation></semantics></math></p>
<ul>
<li>Case 2, without replacement: the draws are no longer independent, so we count outcomes directly.</li>
</ul>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mfrac><mrow><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mi>a</mi></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mi>x</mi></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mi>b</mi></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mi>n</mi><mo>−</mo><mi>x</mi></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow></mrow><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mi>a</mi><mo>+</mo><mi>b</mi></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mi>n</mi></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow></mfrac><mo>=</mo><mi>p</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">P(X = x) = \frac{\begin{pmatrix}
a \\
x
\end{pmatrix}\begin{pmatrix}
b \\
n - x
\end{pmatrix}}{\begin{pmatrix}
a + b \\
n
\end{pmatrix}} = p(x)</annotation></semantics></math></p>
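<p>In case 2, we used the elementary counting techniques we are already
familiar with: count the ways to pick the red candies and the black candies in
the sample, then divide by the number of ways to pick the whole sample. This is
a distinct case, similar to the binomial but for sampling without replacement.
Let’s formalize this as a random variable!</p>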
<h3 id="hypergeometric-distribution">Hypergeometric distribution</h3>
<p>Let’s introduce a random variable to represent a situation like case 2
above.</p>
<p><em>Definition. </em></p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mfrac><mrow><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mi>a</mi></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mi>x</mi></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mi>b</mi></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mi>n</mi><mo>−</mo><mi>x</mi></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow></mrow><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mi>a</mi><mo>+</mo><mi>b</mi></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mi>n</mi></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow></mfrac><mo>=</mo><mi>p</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">P(X = x) = \frac{\begin{pmatrix}
a \\
x
\end{pmatrix}\begin{pmatrix}
b \\
n - x
\end{pmatrix}}{\begin{pmatrix}
a + b \\
n
\end{pmatrix}} = p(x)</annotation></semantics></math></p>
<p>is known as a <strong>Hypergeometric distribution</strong>.</p>
<p>Abbreviate this by:</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mo>∼</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> Hypergeom</mtext></mrow><mrow><mo stretchy="true" form="prefix">(</mo><mi>#</mi><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> total</mtext></mrow><mo>,</mo><mi>#</mi><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> successes</mtext></mrow><mo>,</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> sample size</mtext></mrow><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">X\sim\text{ Hypergeom}\left( \#\text{ total},\#\text{ successes},\text{ sample size} \right)</annotation></semantics></math></p>
<p>For example,</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mo>∼</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> Hypergeom</mtext></mrow><mrow><mo stretchy="true" form="prefix">(</mo><mi>N</mi><mo>,</mo><msub><mi>N</mi><mi>A</mi></msub><mo>,</mo><mi>n</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">X\sim\text{ Hypergeom}\left( N,N_{A},n \right)</annotation></semantics></math></p>
<p><em>Remark. </em></p>
<p>If the sample size <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>n</mi><annotation encoding="application/x-tex">n</annotation></semantics></math> is very small relative to the population size <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>a</mi><mo>+</mo><mi>b</mi></mrow><annotation encoding="application/x-tex">a + b</annotation></semantics></math>, then both cases give
approximately the same answer.</p>
<p>For instance, if we’re sampling for blood types from UCSB, and we take a
student out without replacement, we don’t really change the makeup of the
remaining population substantially. So both models give a similar result.</p>
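<p>As a rough numerical check of this remark, here is a small Python sketch comparing the two cases from the candy example. The counts (4 red and 16 black for a small population, 4000 red and 16000 black for a large one, with a sample of 10) are made up purely for illustration, and the helper names are my own.</p>

```python
from math import comb


def binomial_pmf(x, n, p):
    """With replacement: P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def hypergeom_pmf(x, a, b, n):
    """Without replacement: x of the a red and n - x of the b black."""
    return comb(a, x) * comb(b, n - x) / comb(a + b, n)


def largest_gap(a, b, n):
    """Largest difference between the two p.m.f.s over x = 0, ..., n."""
    p = a / (a + b)
    return max(
        abs(binomial_pmf(x, n, p) - hypergeom_pmf(x, a, b, n))
        for x in range(n + 1)
    )


# Hypothetical counts: same proportion of red, very different population sizes.
gap_small_population = largest_gap(4, 16, 10)
gap_large_population = largest_gap(4000, 16000, 10)
print(gap_small_population, gap_large_population)
```

<p>With the same proportion of red candies in both populations, the gap should be clearly visible for the small population and close to zero for the large one.</p>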
<p>Suppose we have two types of items, type <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>A</mi><annotation encoding="application/x-tex">A</annotation></semantics></math> and type <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>B</mi><annotation encoding="application/x-tex">B</annotation></semantics></math>. Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>N</mi><mi>A</mi></msub><annotation encoding="application/x-tex">N_{A}</annotation></semantics></math>
be <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>#</mi><annotation encoding="application/x-tex">\#</annotation></semantics></math> type <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>A</mi><annotation encoding="application/x-tex">A</annotation></semantics></math>, <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>N</mi><mi>B</mi></msub><annotation encoding="application/x-tex">N_{B}</annotation></semantics></math> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>#</mi><annotation encoding="application/x-tex">\#</annotation></semantics></math> type <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>B</mi><annotation encoding="application/x-tex">B</annotation></semantics></math>. <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>N</mi><mo>=</mo><msub><mi>N</mi><mi>A</mi></msub><mo>+</mo><msub><mi>N</mi><mi>B</mi></msub></mrow><annotation encoding="application/x-tex">N = N_{A} + N_{B}</annotation></semantics></math> is the
total number of objects.</p>
<p>We sample <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>n</mi><annotation encoding="application/x-tex">n</annotation></semantics></math> items <strong>without replacement</strong> (<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>n</mi><mo>≤</mo><mi>N</mi></mrow><annotation encoding="application/x-tex">n \leq N</annotation></semantics></math>) with order not
mattering. Denote by <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> the number of type <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>A</mi><annotation encoding="application/x-tex">A</annotation></semantics></math> objects in our sample.</p>
<p><em>Definition. </em></p>
<p>Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>0</mn><mo>≤</mo><msub><mi>N</mi><mi>A</mi></msub><mo>≤</mo><mi>N</mi></mrow><annotation encoding="application/x-tex">0 \leq N_{A} \leq N</annotation></semantics></math> and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>1</mn><mo>≤</mo><mi>n</mi><mo>≤</mo><mi>N</mi></mrow><annotation encoding="application/x-tex">1 \leq n \leq N</annotation></semantics></math> be integers. A random
variable <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> has the <strong>hypergeometric distribution</strong> with parameters
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="true" form="prefix">(</mo><mi>N</mi><mo>,</mo><msub><mi>N</mi><mi>A</mi></msub><mo>,</mo><mi>n</mi><mo stretchy="true" form="postfix">)</mo></mrow><annotation encoding="application/x-tex">\left( N,N_{A},n \right)</annotation></semantics></math> if <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> takes values in the set
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="true" form="prefix">{</mo><mn>0</mn><mo>,</mo><mn>1</mn><mo>,</mo><mi>…</mi><mo>,</mo><mi>n</mi><mo stretchy="true" form="postfix">}</mo></mrow><annotation encoding="application/x-tex">\left\{ 0,1,\ldots,n \right\}</annotation></semantics></math> and has p.m.f.</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mi>k</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mfrac><mrow><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><msub><mi>N</mi><mi>A</mi></msub></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mi>k</mi></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mi>N</mi><mo>−</mo><msub><mi>N</mi><mi>A</mi></msub></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mi>n</mi><mo>−</mo><mi>k</mi></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow></mrow><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mi>N</mi></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mi>n</mi></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow></mfrac><mo>=</mo><mi>p</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>k</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">P(X = k) = \frac{\begin{pmatrix}
N_{A} \\
k
\end{pmatrix}\begin{pmatrix}
N - N_{A} \\
n - k
\end{pmatrix}}{\begin{pmatrix}
N \\
n
\end{pmatrix}} = p(k)</annotation></semantics></math></p>
<p><em>Example. </em></p>
<p>Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>N</mi><mi>A</mi></msub><mo>=</mo><mn>10</mn></mrow><annotation encoding="application/x-tex">N_{A} = 10</annotation></semantics></math> defectives. Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>N</mi><mi>B</mi></msub><mo>=</mo><mn>90</mn></mrow><annotation encoding="application/x-tex">N_{B} = 90</annotation></semantics></math> non-defectives. We select
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>n</mi><mo>=</mo><mn>5</mn></mrow><annotation encoding="application/x-tex">n = 5</annotation></semantics></math> without replacement. What is the probability that 2 of the 5
selected are defective?</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mo>∼</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> Hypergeom </mtext><mspace width="0.333em"></mspace></mrow><mrow><mo stretchy="true" form="prefix">(</mo><mi>N</mi><mo>=</mo><mn>100</mn><mo>,</mo><msub><mi>N</mi><mi>A</mi></msub><mo>=</mo><mn>10</mn><mo>,</mo><mi>n</mi><mo>=</mo><mn>5</mn><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">X\sim\text{ Hypergeom }\left( N = 100,N_{A} = 10,n = 5 \right)</annotation></semantics></math></p>
<p>We want <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mn>2</mn><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">P(X = 2)</annotation></semantics></math>.</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mn>2</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mfrac><mrow><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>10</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>2</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>90</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>3</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow></mrow><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>100</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>5</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow></mfrac><mo>≈</mo><mn>0.0702</mn></mrow><annotation encoding="application/x-tex">P(X = 2) = \frac{\begin{pmatrix}
10 \\
2
\end{pmatrix}\begin{pmatrix}
90 \\
3
\end{pmatrix}}{\begin{pmatrix}
100 \\
5
\end{pmatrix}} \approx 0.0702</annotation></semantics></math></p>
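<p>Here is a short Python sketch to double check this number. The function name <code>hypergeom_pmf</code> and its argument order are my own choices, mirroring the parameters <code>(N, N_A, n)</code> from the definition; exact fractions are used so the total probability can be checked without rounding.</p>

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(k, N, N_A, n):
    """P(X = k) for X ~ Hypergeom(N, N_A, n), as an exact fraction."""
    return Fraction(comb(N_A, k) * comb(N - N_A, n - k), comb(N, n))


# The defectives example: 10 defective out of 100 items, sample of 5.
two_defective = hypergeom_pmf(2, N=100, N_A=10, n=5)
print(float(two_defective))

# The p.m.f. over k = 0, 1, ..., n should sum to exactly 1.
total = sum(hypergeom_pmf(k, N=100, N_A=10, n=5) for k in range(6))
print(total)
```

<p>The first value should match the hand computation to about four decimal places.</p>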
<p><em>Remark. </em></p>
<p>Make sure you can distinguish when a problem is binomial or when it is
hypergeometric. This is very important on exams.</p>
<p>Recall that both ask about the number of successes in a fixed number of
trials. But the binomial applies when sampling with replacement (each trial is
independent), while the hypergeometric applies when sampling without replacement.</p>
<h3 id="geometric-distribution">Geometric distribution</h3>
<p>Consider an infinite sequence of independent trials, where we count the
number of attempts until the first success, e.g. the number of shots until I make a basket.</p>
<p>In fact we can think of this as a variation on the binomial
distribution. But in this case we don’t sample <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>n</mi><annotation encoding="application/x-tex">n</annotation></semantics></math> times and ask how
many successes we have; instead, we sample as many times as we need for <em>one</em>
success. Later on we’ll see this is really a specific case of another
distribution, the <em>negative binomial</em>.</p>
<p>Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>X</mi><mi>i</mi></msub><annotation encoding="application/x-tex">X_{i}</annotation></semantics></math> denote the outcome of the <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>i</mi><mtext mathvariant="normal">th</mtext></msup><annotation encoding="application/x-tex">i^{\text{th}}</annotation></semantics></math> trial, where
success is 1 and failure is 0. Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>N</mi><annotation encoding="application/x-tex">N</annotation></semantics></math> be the number of trials needed to
observe the first success in a sequence of independent trials with
probability of success <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>p</mi><annotation encoding="application/x-tex">p</annotation></semantics></math>.</p>
<p>Suppose we fail <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>k</mi><mo>−</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">k - 1</annotation></semantics></math> times and succeed on the <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>k</mi><mtext mathvariant="normal">th</mtext></msup><annotation encoding="application/x-tex">k^{\text{th}}</annotation></semantics></math> try. Then:</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>N</mi><mo>=</mo><mi>k</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><msub><mi>X</mi><mn>1</mn></msub><mo>=</mo><mn>0</mn><mo>,</mo><msub><mi>X</mi><mn>2</mn></msub><mo>=</mo><mn>0</mn><mo>,</mo><mi>…</mi><mo>,</mo><msub><mi>X</mi><mrow><mi>k</mi><mo>−</mo><mn>1</mn></mrow></msub><mo>=</mo><mn>0</mn><mo>,</mo><msub><mi>X</mi><mi>k</mi></msub><mo>=</mo><mn>1</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><msup><mrow><mo stretchy="true" form="prefix">(</mo><mn>1</mn><mo>−</mo><mi>p</mi><mo stretchy="true" form="postfix">)</mo></mrow><mrow><mi>k</mi><mo>−</mo><mn>1</mn></mrow></msup><mi>p</mi></mrow><annotation encoding="application/x-tex">P(N = k) = P\left( X_{1} = 0,X_{2} = 0,\ldots,X_{k - 1} = 0,X_{k} = 1 \right) = (1 - p)^{k - 1}p</annotation></semantics></math></p>
<p>This is the probability of failure raised to the number of failures,
times the probability of success.</p>
<p>The key characteristic of these trials is that we keep going until we succeed.
There’s no <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>n</mi><annotation encoding="application/x-tex">n</annotation></semantics></math> choose <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>k</mi><annotation encoding="application/x-tex">k</annotation></semantics></math> in front like the binomial distribution
because exactly one sequence of outcomes puts the first success on that try: all failures, then a success.</p>
<p><em>Definition. </em></p>
<p>Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>0</mn><mo>&lt;</mo><mi>p</mi><mo>≤</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">0 &lt; p \leq 1</annotation></semantics></math>. A random variable <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> has the geometric distribution
with success parameter <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>p</mi><annotation encoding="application/x-tex">p</annotation></semantics></math> if the possible values of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> are
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="true" form="prefix">{</mo><mn>1</mn><mo>,</mo><mn>2</mn><mo>,</mo><mn>3</mn><mo>,</mo><mi>…</mi><mo stretchy="true" form="postfix">}</mo></mrow><annotation encoding="application/x-tex">\left\{ 1,2,3,\ldots \right\}</annotation></semantics></math> and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> satisfies</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mi>k</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><msup><mrow><mo stretchy="true" form="prefix">(</mo><mn>1</mn><mo>−</mo><mi>p</mi><mo stretchy="true" form="postfix">)</mo></mrow><mrow><mi>k</mi><mo>−</mo><mn>1</mn></mrow></msup><mi>p</mi></mrow><annotation encoding="application/x-tex">P(X = k) = (1 - p)^{k - 1}p</annotation></semantics></math></p>
<p>for positive integers <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>k</mi><annotation encoding="application/x-tex">k</annotation></semantics></math>. Abbreviate this by <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mo>∼</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> Geom</mtext></mrow><mrow><mo stretchy="true" form="prefix">(</mo><mi>p</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">X\sim\text{ Geom}(p)</annotation></semantics></math>.</p>
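<p>To get a feel for this p.m.f., here is a minimal Python sketch. The name <code>geometric_pmf</code> is my own, and the value <code>p = 1/6</code> (a fair die) is chosen only for illustration. Since the support is infinite, the sketch only sums the first 50 terms, which should already be very close to 1.</p>

```python
from fractions import Fraction


def geometric_pmf(k, p):
    """P(X = k) for X ~ Geom(p): k - 1 failures, then one success."""
    return (1 - p) ** (k - 1) * p


p = Fraction(1, 6)  # illustrative choice of success probability

# The first few probabilities shrink by a factor of (1 - p) each step.
first_three = [geometric_pmf(k, p) for k in (1, 2, 3)]
print(first_three)

# Partial sum over k = 1, ..., 50; the missing tail is (1 - p)^50.
partial = sum(geometric_pmf(k, p) for k in range(1, 51))
print(float(partial))
```

<p>The partial sum falls short of 1 by exactly the probability that the first 50 trials all fail, which is a useful way to think about tail probabilities for this distribution.</p>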
<p><em>Example. </em></p>
<p>What is the probability it takes more than seven rolls of a fair die to
roll a six?</p>
<p>Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> be the number of rolls of a fair die until the first six. Then
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mo>∼</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> Geom</mtext></mrow><mrow><mo stretchy="true" form="prefix">(</mo><mfrac><mn>1</mn><mn>6</mn></mfrac><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">X\sim\text{ Geom}\left( \frac{1}{6} \right)</annotation></semantics></math>. Now we just want
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>&gt;</mo><mn>7</mn><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">P(X &gt; 7)</annotation></semantics></math>.</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>&gt;</mo><mn>7</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><munderover><mo>∑</mo><mrow><mi>k</mi><mo>=</mo><mn>8</mn></mrow><mo accent="false">∞</mo></munderover><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mi>k</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><munderover><mo>∑</mo><mrow><mi>k</mi><mo>=</mo><mn>8</mn></mrow><mo accent="false">∞</mo></munderover><msup><mrow><mo stretchy="true" form="prefix">(</mo><mfrac><mn>5</mn><mn>6</mn></mfrac><mo stretchy="true" form="postfix">)</mo></mrow><mrow><mi>k</mi><mo>−</mo><mn>1</mn></mrow></msup><mfrac><mn>1</mn><mn>6</mn></mfrac></mrow><annotation encoding="application/x-tex">P(X &gt; 7) = \sum_{k = 8}^{\infty}P(X = k) = \sum_{k = 8}^{\infty}\left( \frac{5}{6} \right)^{k - 1}\frac{1}{6}</annotation></semantics></math></p>
<p>Re-indexing,</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><munderover><mo>∑</mo><mrow><mi>k</mi><mo>=</mo><mn>8</mn></mrow><mo accent="false">∞</mo></munderover><msup><mrow><mo stretchy="true" form="prefix">(</mo><mfrac><mn>5</mn><mn>6</mn></mfrac><mo stretchy="true" form="postfix">)</mo></mrow><mrow><mi>k</mi><mo>−</mo><mn>1</mn></mrow></msup><mfrac><mn>1</mn><mn>6</mn></mfrac><mo>=</mo><mfrac><mn>1</mn><mn>6</mn></mfrac><msup><mrow><mo stretchy="true" form="prefix">(</mo><mfrac><mn>5</mn><mn>6</mn></mfrac><mo stretchy="true" form="postfix">)</mo></mrow><mn>7</mn></msup><munderover><mo>∑</mo><mrow><mi>j</mi><mo>=</mo><mn>0</mn></mrow><mo accent="false">∞</mo></munderover><msup><mrow><mo stretchy="true" form="prefix">(</mo><mfrac><mn>5</mn><mn>6</mn></mfrac><mo stretchy="true" form="postfix">)</mo></mrow><mi>j</mi></msup></mrow><annotation encoding="application/x-tex">\sum_{k = 8}^{\infty}\left( \frac{5}{6} \right)^{k - 1}\frac{1}{6} = \frac{1}{6}\left( \frac{5}{6} \right)^{7}\sum_{j = 0}^{\infty}\left( \frac{5}{6} \right)^{j}</annotation></semantics></math></p>
<p>Now we sum the geometric series:</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mfrac><mn>1</mn><mn>6</mn></mfrac><msup><mrow><mo stretchy="true" form="prefix">(</mo><mfrac><mn>5</mn><mn>6</mn></mfrac><mo stretchy="true" form="postfix">)</mo></mrow><mn>7</mn></msup><munderover><mo>∑</mo><mrow><mi>j</mi><mo>=</mo><mn>0</mn></mrow><mo accent="false">∞</mo></munderover><msup><mrow><mo stretchy="true" form="prefix">(</mo><mfrac><mn>5</mn><mn>6</mn></mfrac><mo stretchy="true" form="postfix">)</mo></mrow><mi>j</mi></msup><mo>=</mo><mfrac><mn>1</mn><mn>6</mn></mfrac><msup><mrow><mo stretchy="true" form="prefix">(</mo><mfrac><mn>5</mn><mn>6</mn></mfrac><mo stretchy="true" form="postfix">)</mo></mrow><mn>7</mn></msup><mo>⋅</mo><mfrac><mn>1</mn><mrow><mn>1</mn><mo>−</mo><mfrac><mn>5</mn><mn>6</mn></mfrac></mrow></mfrac><mo>=</mo><msup><mrow><mo stretchy="true" form="prefix">(</mo><mfrac><mn>5</mn><mn>6</mn></mfrac><mo stretchy="true" form="postfix">)</mo></mrow><mn>7</mn></msup></mrow><annotation encoding="application/x-tex">\frac{1}{6}\left( \frac{5}{6} \right)^{7}\sum_{j = 0}^{\infty}\left( \frac{5}{6} \right)^{j} = \frac{1}{6}\left( \frac{5}{6} \right)^{7} \cdot \frac{1}{1 - \frac{5}{6}} = \left( \frac{5}{6} \right)^{7}</annotation></semantics></math></p>
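<p>As a sanity check, here is a minimal Python sketch that confirms the closed form with exact rational arithmetic. The helper name <code>geom_pmf</code> is my own and not part of the notes; it simply encodes the p.m.f. from the definition above.</p>

```python
from fractions import Fraction

def geom_pmf(k, p):
    """P(X = k) for X ~ Geom(p): the first success happens on trial k (k = 1, 2, ...)."""
    return (1 - p) ** (k - 1) * p

p = Fraction(1, 6)

# P(X > 7) as the complement of the first seven terms of the p.m.f.
tail_by_sum = 1 - sum(geom_pmf(k, p) for k in range(1, 8))

# Closed form obtained from the geometric series: (1 - p)^7
tail_closed = (1 - p) ** 7

print(tail_by_sum == tail_closed)  # exact comparison of rationals
print(float(tail_closed))          # about 0.279
```

<p>Using <code>Fraction</code> avoids floating-point rounding, so the two expressions can be compared for exact equality.</p>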
<h3 id="negative-binomial">Negative binomial</h3>
<p>As promised, here’s the negative binomial.</p>
<p>Consider a sequence of Bernoulli trials with the following
characteristics:</p>
<ul>
<li><p>Each trial results in either success or failure</p></li>
<li><p>The probability of success <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>p</mi><annotation encoding="application/x-tex">p</annotation></semantics></math> is the same on each trial</p></li>
<li><p>Trials are independent (notice that the number of trials is not
fixed in advance)</p></li>
<li><p>Experiment continues until <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>r</mi><annotation encoding="application/x-tex">r</annotation></semantics></math> successes are observed, where <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>r</mi><annotation encoding="application/x-tex">r</annotation></semantics></math> is
a given parameter</p></li>
</ul>
<p>Then if <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> is the number of trials necessary until <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>r</mi><annotation encoding="application/x-tex">r</annotation></semantics></math> successes are
observed, we say <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> is a <strong>negative binomial</strong> random variable.</p>
<p>Immediately we see that the geometric distribution is just the negative
binomial with <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>r</mi><mo>=</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">r = 1</annotation></semantics></math>.</p>
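<p>The experiment described above can be sketched as a small simulation. This is only an illustration under my own assumptions: the function name <code>trials_until</code>, the seed, and the sample size are all hypothetical choices. With one required success and a fair die, the simulated tail should land near the geometric answer from the earlier example.</p>

```python
import random

def trials_until(r, p, rng):
    """Run independent Bernoulli(p) trials until r successes; return the trial count."""
    trials = 0
    successes = 0
    while successes != r:
        trials += 1
        # choices() draws 1 (success) with weight p and 0 (failure) with weight 1 - p
        successes += rng.choices([1, 0], weights=[p, 1 - p])[0]
    return trials

rng = random.Random(0)  # fixed seed so the run is repeatable

# One required success with p = 1/6 is the die-rolling setup from before
samples = [trials_until(1, 1 / 6, rng) for _ in range(20_000)]
tail_fraction = sum(x > 7 for x in samples) / len(samples)

print(tail_fraction)  # should land near (5/6)**7, about 0.279
```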
<p><em>Definition. </em></p>
<p>Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>k</mi><mo>∈</mo><msup><mi>ℤ</mi><mo>+</mo></msup></mrow><annotation encoding="application/x-tex">k \in {\mathbb{Z}}^{+}</annotation></semantics></math> be the required number of successes (called <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>r</mi><annotation encoding="application/x-tex">r</annotation></semantics></math> above) and let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>0</mn><mo>&lt;</mo><mi>p</mi><mo>≤</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">0 &lt; p \leq 1</annotation></semantics></math>. A random variable <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math>
has the negative binomial distribution with parameters
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="true" form="prefix">{</mo><mi>k</mi><mo>,</mo><mi>p</mi><mo stretchy="true" form="postfix">}</mo></mrow><annotation encoding="application/x-tex">\left\{ k,p \right\}</annotation></semantics></math> if the possible values of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> are the integers
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="true" form="prefix">{</mo><mi>k</mi><mo>,</mo><mi>k</mi><mo>+</mo><mn>1</mn><mo>,</mo><mi>k</mi><mo>+</mo><mn>2</mn><mo>,</mo><mi>…</mi><mo stretchy="true" form="postfix">}</mo></mrow><annotation encoding="application/x-tex">\left\{ k,k + 1,k + 2,\ldots \right\}</annotation></semantics></math> and the p.m.f. is</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mi>n</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mi>n</mi><mo>−</mo><mn>1</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mi>k</mi><mo>−</mo><mn>1</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><msup><mi>p</mi><mi>k</mi></msup><msup><mrow><mo stretchy="true" form="prefix">(</mo><mn>1</mn><mo>−</mo><mi>p</mi><mo stretchy="true" form="postfix">)</mo></mrow><mrow><mi>n</mi><mo>−</mo><mi>k</mi></mrow></msup><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> for </mtext><mspace width="0.333em"></mspace></mrow><mi>n</mi><mo>≥</mo><mi>k</mi></mrow><annotation encoding="application/x-tex">P(X = n) = \begin{pmatrix}
n - 1 \\
k - 1
\end{pmatrix}p^{k}(1 - p)^{n - k}\text{ for }n \geq k</annotation></semantics></math></p>
<p>Abbreviate this by <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mo>∼</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> Negbin</mtext></mrow><mrow><mo stretchy="true" form="prefix">(</mo><mi>k</mi><mo>,</mo><mi>p</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">X\sim\text{ Negbin}(k,p)</annotation></semantics></math>.</p>
<p><em>Example. </em></p>
<p>Steph Curry has a three-point percentage of approximately <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>43</mn><mi>%</mi></mrow><annotation encoding="application/x-tex">43\%</annotation></semantics></math>. What is the
probability that Steph makes his third three-point basket on his
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mn>5</mn><mtext mathvariant="normal">th</mtext></msup><annotation encoding="application/x-tex">5^{\text{th}}</annotation></semantics></math> attempt?</p>
<p>Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> be the number of attempts required to observe the 3rd success. Then,</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mo>∼</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> Negbin</mtext></mrow><mrow><mo stretchy="true" form="prefix">(</mo><mi>k</mi><mo>=</mo><mn>3</mn><mo>,</mo><mi>p</mi><mo>=</mo><mn>0.43</mn><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">X\sim\text{ Negbin}(k = 3,p = 0.43)</annotation></semantics></math></p>
<p>So, <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable><mtr><mtd columnalign="right" style="text-align: right"><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mn>5</mn><mo stretchy="true" form="postfix">)</mo></mrow></mtd><mtd columnalign="left" style="text-align: left"><mo>=</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>5</mn><mo>−</mo><mn>1</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>3</mn><mo>−</mo><mn>1</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><msup><mrow><mo stretchy="true" form="prefix">(</mo><mn>0.43</mn><mo stretchy="true" form="postfix">)</mo></mrow><mn>3</mn></msup><msup><mrow><mo stretchy="true" form="prefix">(</mo><mn>1</mn><mo>−</mo><mn>0.43</mn><mo stretchy="true" form="postfix">)</mo></mrow><mrow><mn>5</mn><mo>−</mo><mn>3</mn></mrow></msup></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"></mtd><mtd columnalign="left" style="text-align: left"><mo>=</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>4</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>2</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><msup><mrow><mo stretchy="true" form="prefix">(</mo><mn>0.43</mn><mo stretchy="true" form="postfix">)</mo></mrow><mn>3</mn></msup><msup><mrow><mo stretchy="true" form="prefix">(</mo><mn>0.57</mn><mo stretchy="true" form="postfix">)</mo></mrow><mn>2</mn></msup></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"></mtd><mtd columnalign="left" style="text-align: left"><mo>≈</mo><mn>0.155</mn></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{aligned}
P(X = 5) &amp; = \begin{pmatrix}
5 - 1 \\
3 - 1
\end{pmatrix}(0.43)^{3}(1 - 0.43)^{5 - 3} \\
 &amp; = \begin{pmatrix}
4 \\
2
\end{pmatrix}(0.43)^{3}(0.57)^{2} \\
 &amp; \approx 0.155
\end{aligned}</annotation></semantics></math></p>
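<p>The arithmetic in this example is easy to double-check in code. Below is a minimal sketch of the negative binomial p.m.f. using Python’s standard library; <code>negbin_pmf</code> is a name I made up for illustration.</p>

```python
from math import comb, isclose

def negbin_pmf(n, k, p):
    """P(X = n): the k-th success occurs exactly on trial n (n = k, k + 1, ...)."""
    return comb(n - 1, k - 1) * p**k * (1 - p) ** (n - k)

# Third made three-pointer on the fifth attempt, with p = 0.43
curry = negbin_pmf(5, 3, 0.43)
print(round(curry, 3))  # 0.155

# With k = 1 the formula collapses to the geometric p.m.f. (1 - p)^(n - 1) p
print(isclose(negbin_pmf(4, 1, 0.43), 0.57**3 * 0.43))  # True
```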
<h3 id="poisson-distribution">Poisson distribution</h3>
<p>The Poisson p.m.f., defined below, is a valid p.m.f. thanks to the Taylor expansion</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>e</mi><mi>λ</mi></msup><mo>=</mo><munderover><mo>∑</mo><mrow><mi>k</mi><mo>=</mo><mn>0</mn></mrow><mo accent="false">∞</mo></munderover><mfrac><msup><mi>λ</mi><mi>k</mi></msup><mrow><mi>k</mi><mi>!</mi></mrow></mfrac></mrow><annotation encoding="application/x-tex">e^{\lambda} = \sum_{k = 0}^{\infty}\frac{\lambda^{k}}{k!}</annotation></semantics></math></p>
<p>which implies that</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><munderover><mo>∑</mo><mrow><mi>k</mi><mo>=</mo><mn>0</mn></mrow><mo accent="false">∞</mo></munderover><msup><mi>e</mi><mrow><mi>−</mi><mi>λ</mi></mrow></msup><mfrac><msup><mi>λ</mi><mi>k</mi></msup><mrow><mi>k</mi><mi>!</mi></mrow></mfrac><mo>=</mo><msup><mi>e</mi><mrow><mi>−</mi><mi>λ</mi></mrow></msup><msup><mi>e</mi><mi>λ</mi></msup><mo>=</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">\sum_{k = 0}^{\infty}e^{- \lambda}\frac{\lambda^{k}}{k!} = e^{- \lambda}e^{\lambda} = 1</annotation></semantics></math></p>
<p><em>Definition. </em></p>
<p>For an integer-valued random variable <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math>, we say
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mo>∼</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> Poisson</mtext></mrow><mrow><mo stretchy="true" form="prefix">(</mo><mi>λ</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">X\sim\text{ Poisson}(\lambda)</annotation></semantics></math> if it has p.m.f.</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mi>k</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><msup><mi>e</mi><mrow><mi>−</mi><mi>λ</mi></mrow></msup><mfrac><msup><mi>λ</mi><mi>k</mi></msup><mrow><mi>k</mi><mi>!</mi></mrow></mfrac></mrow><annotation encoding="application/x-tex">P(X = k) = e^{- \lambda}\frac{\lambda^{k}}{k!}</annotation></semantics></math></p>
<p>for <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>k</mi><mo>∈</mo><mrow><mo stretchy="true" form="prefix">{</mo><mn>0</mn><mo>,</mo><mn>1</mn><mo>,</mo><mn>2</mn><mo>,</mo><mi>…</mi><mo stretchy="true" form="postfix">}</mo></mrow></mrow><annotation encoding="application/x-tex">k \in \left\{ 0,1,2,\ldots \right\}</annotation></semantics></math>, where <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>λ</mi><mo>&gt;</mo><mn>0</mn></mrow><annotation encoding="application/x-tex">\lambda &gt; 0</annotation></semantics></math> is a fixed parameter. By the Taylor expansion above, these probabilities sum to one:</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><munderover><mo>∑</mo><mrow><mi>k</mi><mo>=</mo><mn>0</mn></mrow><mo accent="false">∞</mo></munderover><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mi>k</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">\sum_{k = 0}^{\infty}P(X = k) = 1</annotation></semantics></math></p>
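<p>To see the normalization numerically, here is a small sketch that sums the first few dozen terms of the p.m.f. The rate <code>lam = 2.5</code> and the cutoff of 50 terms are arbitrary choices of mine, and <code>poisson_pmf</code> is a hypothetical helper name.</p>

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), with k = 0, 1, 2, ..."""
    return exp(-lam) * lam**k / factorial(k)

lam = 2.5  # arbitrary rate, chosen only for illustration

# The tail beyond 50 terms is negligible for a rate this small
partial_sum = sum(poisson_pmf(k, lam) for k in range(50))
print(partial_sum)  # very close to 1
```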
<p>The Poisson arises as a limit of the binomial. As a rule of thumb, it applies in the binomial context
when <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>n</mi><annotation encoding="application/x-tex">n</annotation></semantics></math> is very large (<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>n</mi><mo>≥</mo><mn>100</mn></mrow><annotation encoding="application/x-tex">n \geq 100</annotation></semantics></math>) and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>p</mi><annotation encoding="application/x-tex">p</annotation></semantics></math> is very small
(<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>p</mi><mo>≤</mo><mn>0.05</mn></mrow><annotation encoding="application/x-tex">p \leq 0.05</annotation></semantics></math>), such that <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>n</mi><mi>p</mi></mrow><annotation encoding="application/x-tex">np</annotation></semantics></math> is a moderate number (<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>n</mi><mi>p</mi><mo>&lt;</mo><mn>10</mn></mrow><annotation encoding="application/x-tex">np &lt; 10</annotation></semantics></math>).</p>
<p>Then a binomial random variable <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> approximately follows a Poisson distribution with <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>λ</mi><mo>=</mo><mi>n</mi><mi>p</mi></mrow><annotation encoding="application/x-tex">\lambda = np</annotation></semantics></math>:</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mtext mathvariant="normal">Bin</mtext><mrow><mo stretchy="true" form="prefix">(</mo><mi>n</mi><mo>,</mo><mi>p</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>k</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>≈</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mtext mathvariant="normal">Poisson</mtext><mrow><mo stretchy="true" form="prefix">(</mo><mi>λ</mi><mo>=</mo><mi>n</mi><mi>p</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>k</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">P\left( \text{Bin}(n,p) = k \right) \approx P\left( \text{Poisson}(\lambda = np) = k \right)</annotation></semantics></math></p>
<p>for <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>k</mi><mo>=</mo><mn>0</mn><mo>,</mo><mn>1</mn><mo>,</mo><mi>…</mi><mo>,</mo><mi>n</mi></mrow><annotation encoding="application/x-tex">k = 0,1,\ldots,n</annotation></semantics></math>.</p>
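<p>A side-by-side comparison makes the approximation concrete. The sketch below uses hypothetical values <code>n = 1000</code> and <code>p = 0.003</code> (so that np = 3), which I picked to satisfy the rule of thumb; the function names are also my own.</p>

```python
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    """P(Bin(n, p) = k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(Poisson(lam) = k)."""
    return exp(-lam) * lam**k / factorial(k)

n, p = 1000, 0.003  # hypothetical: large n, small p, moderate np = 3
lam = n * p

# Print the two p.m.f.s next to each other for small k
for k in range(7):
    print(k, round(binom_pmf(k, n, p), 5), round(poisson_pmf(k, lam), 5))

# Largest pointwise gap over the values of k that carry almost all the mass
max_gap = max(abs(binom_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(21))
print(max_gap)
```

<p>The gap shrinks further if you increase <code>n</code> while holding <code>n * p</code> fixed.</p>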
<p>The Poisson distribution is useful for finding the probabilities of rare
events over a continuous interval of time. Knowing only <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>λ</mi><mo>=</mo><mi>n</mi><mi>p</mi></mrow><annotation encoding="application/x-tex">\lambda = np</annotation></semantics></math>, for
large <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>n</mi><annotation encoding="application/x-tex">n</annotation></semantics></math> and small <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>p</mi><annotation encoding="application/x-tex">p</annotation></semantics></math>, we can calculate many probabilities.</p>
<p><em>Example. </em></p>
<p>Consider the number of typing errors on a page of a textbook.</p>
<p>Let</p>
<ul>
<li><p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>n</mi><annotation encoding="application/x-tex">n</annotation></semantics></math> be the number of letters or symbols per page (large)</p></li>
<li><p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>p</mi><annotation encoding="application/x-tex">p</annotation></semantics></math> be the probability that any given symbol is an error (small), such that</p></li>
<li><p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>n</mi><mi>p</mi><mo>=</mo><mi>λ</mi><mo>=</mo><mn>0.1</mn></mrow><annotation encoding="application/x-tex">np = \lambda = 0.1</annotation></semantics></math></p></li>
</ul>
<p>What is the probability of exactly 1 error?</p>
<p>Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> be the number of errors on the page. We can approximate the distribution of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> with a
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mtext mathvariant="normal">Poisson</mtext><mrow><mo stretchy="true" form="prefix">(</mo><mi>λ</mi><mo>=</mo><mn>0.1</mn><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">\text{Poisson}(\lambda = 0.1)</annotation></semantics></math> distribution:</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mn>1</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mfrac><mrow><msup><mi>e</mi><mrow><mi>−</mi><mn>0.1</mn></mrow></msup><msup><mrow><mo stretchy="true" form="prefix">(</mo><mn>0.1</mn><mo stretchy="true" form="postfix">)</mo></mrow><mn>1</mn></msup></mrow><mrow><mn>1</mn><mi>!</mi></mrow></mfrac><mo>≈</mo><mn>0.09048</mn></mrow><annotation encoding="application/x-tex">P(X = 1) = \frac{e^{- 0.1}(0.1)^{1}}{1!} \approx 0.09048</annotation></semantics></math></p>
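<p>Once the rate is known, other probabilities for the same page come almost for free. Here is a short sketch (the variable names are mine) that reproduces the number above and also computes the chance of no errors and of two or more errors.</p>

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

lam = 0.1  # errors per page, from the example

p_one = poisson_pmf(1, lam)          # exactly one error
p_none = poisson_pmf(0, lam)         # an error-free page
p_two_or_more = 1 - p_none - p_one   # complement of zero or one errors

print(round(p_one, 5))          # 0.09048
print(round(p_none, 5))         # 0.90484
print(round(p_two_or_more, 5))  # 0.00468
```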
<h2 id="continuous-distributions">Continuous distributions</h2>
<p>All of the distributions we’ve been analyzing have been discrete, that
is, they apply to random variables with a
<a href="https://en.wikipedia.org/wiki/Countable_set">countable</a> state space.
Even when the state space is infinite, as in the negative binomial, it
is countable. We can think of it as indexing each trial with a natural
number <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>0</mn><mo>,</mo><mn>1</mn><mo>,</mo><mn>2</mn><mo>,</mo><mn>3</mn><mo>,</mo><mi>…</mi></mrow><annotation encoding="application/x-tex">0,1,2,3,\ldots</annotation></semantics></math>.</p>
<p>Now we turn our attention to continuous random variables that operate on
uncountably infinite state spaces. For example, if we sample uniformly
inside of the interval <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false" form="prefix">[</mo><mn>0</mn><mo>,</mo><mn>1</mn><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">\lbrack 0,1\rbrack</annotation></semantics></math>, there are an uncountably
infinite number of possible values we could obtain. We cannot index
these values by the natural numbers: by Cantor’s diagonal argument the
interval <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false" form="prefix">[</mo><mn>0</mn><mo>,</mo><mn>1</mn><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">\lbrack 0,1\rbrack</annotation></semantics></math> is uncountable, and in fact it has a bijection to
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>ℝ</mi><annotation encoding="application/x-tex">\mathbb{R}</annotation></semantics></math>, so its cardinality is <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mn>2</mn><msub><mi>ℵ</mi><mn>0</mn></msub></msup><annotation encoding="application/x-tex">2^{\aleph_{0}}</annotation></semantics></math>, the cardinality of the continuum.</p>
<p>Additionally, asking for the probability that we pick a
particular point in the interval <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false" form="prefix">[</mo><mn>0</mn><mo>,</mo><mn>1</mn><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">\lbrack 0,1\rbrack</annotation></semantics></math> is not very informative: there
are infinitely many sample points! Intuitively, the probability of
choosing any particular point should be 0. However, we
should still be able to make statements about the probability that the chosen point
lies within a subset, like <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false" form="prefix">[</mo><mn>0</mn><mo>,</mo><mn>0.5</mn><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">\lbrack 0,0.5\rbrack</annotation></semantics></math>.</p>
<p>Let’s formalize these ideas.</p>
<p><em>Definition. </em></p>
<p>Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> be a random variable. If we have a function <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>f</mi><annotation encoding="application/x-tex">f</annotation></semantics></math> such that</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>≤</mo><mi>b</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><msubsup><mo>∫</mo><mrow><mi>−</mi><mi>∞</mi></mrow><mi>b</mi></msubsup><mi>f</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>d</mi><mi>x</mi></mrow><annotation encoding="application/x-tex">P(X \leq b) = \int_{- \infty}^{b}f(x)dx</annotation></semantics></math> for all
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>b</mi><mo>∈</mo><mi>ℝ</mi></mrow><annotation encoding="application/x-tex">b \in {\mathbb{R}}</annotation></semantics></math>, then <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>f</mi><annotation encoding="application/x-tex">f</annotation></semantics></math> is the <strong>probability density function</strong>
of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math>.</p>
<p>The probability that the value of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> lies in <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false" form="prefix">(</mo><mi>−</mi><mi>∞</mi><mo>,</mo><mi>b</mi><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">( - \infty,b\rbrack</annotation></semantics></math>
equals the area under the curve of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>f</mi><annotation encoding="application/x-tex">f</annotation></semantics></math> from <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>−</mi><mi>∞</mi></mrow><annotation encoding="application/x-tex">- \infty</annotation></semantics></math> to <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>b</mi><annotation encoding="application/x-tex">b</annotation></semantics></math>.</p>
<p>If <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>f</mi><annotation encoding="application/x-tex">f</annotation></semantics></math> satisfies this definition, then for any <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>B</mi><mo>⊂</mo><mi>ℝ</mi></mrow><annotation encoding="application/x-tex">B \subset {\mathbb{R}}</annotation></semantics></math>
for which integration makes sense,</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>∈</mo><mi>B</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><msub><mo>∫</mo><mi>B</mi></msub><mi>f</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>d</mi><mi>x</mi></mrow><annotation encoding="application/x-tex">P(X \in B) = \int_{B}f(x)dx</annotation></semantics></math></p>
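<p>To make this concrete, here is a rough numerical sketch for the uniform sample on the unit interval discussed earlier. It approximates the integral of the density with a midpoint rule; the function names and the step count are my own assumptions, and a real project would more likely reach for a numerical integration library.</p>

```python
def uniform_pdf(x):
    """Density of a uniform sample on [0, 1]: 1 inside the interval, 0 outside."""
    return 1.0 if (x >= 0.0 and 1.0 >= x) else 0.0

def prob_in_interval(pdf, a, b, steps=10_000):
    """Approximate the integral of pdf over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(pdf(a + (i + 0.5) * width) for i in range(steps)) * width

half = prob_in_interval(uniform_pdf, 0.0, 0.5)
point = prob_in_interval(uniform_pdf, 0.3, 0.3)

print(half)   # close to 0.5
print(point)  # 0.0, since a single point has zero width
```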
<p><em>Remark. </em></p>
<p>Recall from our previous discussion of random variables that the PDF is
the analogue of the PMF for discrete random variables.</p>
<p>Properties of a CDF:</p>
<p>Any CDF <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>F</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>≤</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">F(x) = P(X \leq x)</annotation></semantics></math> satisfies</p>
<ol>
<li><p>Limits at infinity: <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>F</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>−</mi><mi>∞</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mn>0</mn></mrow><annotation encoding="application/x-tex">F( - \infty) = 0</annotation></semantics></math>, <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>F</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>∞</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">F(\infty) = 1</annotation></semantics></math> (so the density integrates to unity)</p></li>
<li><p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>F</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">F(x)</annotation></semantics></math> is non-decreasing in <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>x</mi><annotation encoding="application/x-tex">x</annotation></semantics></math> (monotone, though not necessarily strictly increasing):</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>s</mi><mo>&lt;</mo><mi>t</mi><mo>⇒</mo><mi>F</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>s</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>≤</mo><mi>F</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>t</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">s &lt; t \Rightarrow F(s) \leq F(t)</annotation></semantics></math></p></li>
<li><p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>a</mi><mo>&lt;</mo><mi>X</mi><mo>≤</mo><mi>b</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>≤</mo><mi>b</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>−</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>≤</mo><mi>a</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>F</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>b</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>−</mo><mi>F</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>a</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">P(a &lt; X \leq b) = P(X \leq b) - P(X \leq a) = F(b) - F(a)</annotation></semantics></math></p></li>
</ol>
<p>As we mentioned before, for a continuous random variable we can only ask about things like
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>≤</mo><mi>k</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">P(X \leq k)</annotation></semantics></math>, but not <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mi>k</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">P(X = k)</annotation></semantics></math>. In fact <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mi>k</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mn>0</mn></mrow><annotation encoding="application/x-tex">P(X = k) = 0</annotation></semantics></math> for all <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>k</mi><annotation encoding="application/x-tex">k</annotation></semantics></math>.
An immediate corollary of this is that we can freely interchange <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mo>≤</mo><annotation encoding="application/x-tex">\leq</annotation></semantics></math>
and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mo>&lt;</mo><annotation encoding="application/x-tex">&lt;</annotation></semantics></math> and likewise for <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mo>≥</mo><annotation encoding="application/x-tex">\geq</annotation></semantics></math> and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mo>&gt;</mo><annotation encoding="application/x-tex">&gt;</annotation></semantics></math>, since <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>≤</mo><mi>k</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>&lt;</mo><mi>k</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">P(X \leq k) = P(X &lt; k)</annotation></semantics></math>
if <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mi>k</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mn>0</mn></mrow><annotation encoding="application/x-tex">P(X = k) = 0</annotation></semantics></math>.</p>
<p><em>Example. </em></p>
<p>Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> be a continuous random variable with density (pdf)</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>f</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mrow><mo stretchy="true" form="prefix">{</mo><mtable><mtr><mtd columnalign="left" style="text-align: left"><mi>c</mi><msup><mi>x</mi><mn>2</mn></msup></mtd><mtd columnalign="left" style="text-align: left"><mrow><mtext mathvariant="normal">for </mtext><mspace width="0.333em"></mspace></mrow><mn>0</mn><mo>&lt;</mo><mi>x</mi><mo>&lt;</mo><mn>2</mn></mtd></mtr><mtr><mtd columnalign="left" style="text-align: left"><mn>0</mn></mtd><mtd columnalign="left" style="text-align: left"><mrow><mtext mathvariant="normal">otherwise </mtext><mspace width="0.333em"></mspace></mrow></mtd></mtr></mtable></mrow></mrow><annotation encoding="application/x-tex">f(x) = \begin{cases}
cx^{2} &amp; \text{for }0 &lt; x &lt; 2 \\
0 &amp; \text{otherwise }
\end{cases}</annotation></semantics></math></p>
<ol>
<li>What is <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>c</mi><annotation encoding="application/x-tex">c</annotation></semantics></math>?</li>
</ol>
<p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>c</mi><annotation encoding="application/x-tex">c</annotation></semantics></math> is such that
<math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>1</mn><mo>=</mo><msubsup><mo>∫</mo><mrow><mi>−</mi><mi>∞</mi></mrow><mi>∞</mi></msubsup><mi>f</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>d</mi><mi>x</mi><mo>=</mo><msubsup><mo>∫</mo><mn>0</mn><mn>2</mn></msubsup><mi>c</mi><msup><mi>x</mi><mn>2</mn></msup><mi>d</mi><mi>x</mi></mrow><annotation encoding="application/x-tex">1 = \int_{- \infty}^{\infty}f(x)dx = \int_{0}^{2}cx^{2}dx</annotation></semantics></math></p>
<p>Evaluating the integral gives <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mfrac><mrow><mn>8</mn><mi>c</mi></mrow><mn>3</mn></mfrac><mo>=</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">\frac{8c}{3} = 1</annotation></semantics></math>, so <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>c</mi><mo>=</mo><mfrac><mn>3</mn><mn>8</mn></mfrac></mrow><annotation encoding="application/x-tex">c = \frac{3}{8}</annotation></semantics></math>.</p>
<ol>
<li>Find the probability that <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> is between 1 and 1.4.</li>
</ol>
<p>Integrate the curve between 1 and 1.4.</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable><mtr><mtd columnalign="right" style="text-align: right"><msubsup><mo>∫</mo><mn>1</mn><mn>1.4</mn></msubsup><mfrac><mn>3</mn><mn>8</mn></mfrac><msup><mi>x</mi><mn>2</mn></msup><mi>d</mi><mi>x</mi><mo>=</mo><mrow><mo stretchy="true" form="prefix">(</mo><mfrac><msup><mi>x</mi><mn>3</mn></msup><mn>8</mn></mfrac><mo stretchy="true" form="postfix">)</mo></mrow><msubsup><mo stretchy="false" form="prefix">|</mo><mn>1</mn><mn>1.4</mn></msubsup></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"><mo>=</mo><mn>0.218</mn></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{array}{r}
\int_{1}^{1.4}\frac{3}{8}x^{2}dx = \left( \frac{x^{3}}{8} \right)|_{1}^{1.4} \\
 = 0.218
\end{array}</annotation></semantics></math></p>
<p>This is the probability that <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> lies between 1 and 1.4.</p>
<ol>
<li>Find the probability that <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> is between 1 and 3.</li>
</ol>
<p>Idea: integrate between 1 and 3, remembering that the density is 0 beyond 2.</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msubsup><mo>∫</mo><mn>1</mn><mn>2</mn></msubsup><mfrac><mn>3</mn><mn>8</mn></mfrac><msup><mi>x</mi><mn>2</mn></msup><mi>d</mi><mi>x</mi><mo>+</mo><msubsup><mo>∫</mo><mn>2</mn><mn>3</mn></msubsup><mn>0</mn><mi>d</mi><mi>x</mi><mo>=</mo><mfrac><mn>7</mn><mn>8</mn></mfrac></mrow><annotation encoding="application/x-tex">\int_{1}^{2}\frac{3}{8}x^{2}dx + \int_{2}^{3}0dx = \frac{7}{8}</annotation></semantics></math></p>
<ol>
<li>What is the CDF <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>F</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>≤</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">F(x) = P(X \leq x)</annotation></semantics></math>? Integrate the density up to <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>x</mi><annotation encoding="application/x-tex">x</annotation></semantics></math>.</li>
</ol>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable><mtr><mtd columnalign="right" style="text-align: right"><mi>F</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>≤</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><msubsup><mo>∫</mo><mrow><mi>−</mi><mi>∞</mi></mrow><mi>x</mi></msubsup><mi>f</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>t</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>d</mi><mi>t</mi></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"><mo>=</mo><msubsup><mo>∫</mo><mn>0</mn><mi>x</mi></msubsup><mfrac><mn>3</mn><mn>8</mn></mfrac><msup><mi>t</mi><mn>2</mn></msup><mi>d</mi><mi>t</mi></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"><mo>=</mo><mfrac><msup><mi>x</mi><mn>3</mn></msup><mn>8</mn></mfrac></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{array}{r}
F(x) = P(X \leq x) = \int_{- \infty}^{x}f(t)dt \\
 = \int_{0}^{x}\frac{3}{8}t^{2}dt \\
 = \frac{x^{3}}{8}
\end{array}</annotation></semantics></math></p>
<p>Important: include the range!</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>F</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mrow><mo stretchy="true" form="prefix">{</mo><mtable><mtr><mtd columnalign="left" style="text-align: left"><mn>0</mn></mtd><mtd columnalign="left" style="text-align: left"><mrow><mtext mathvariant="normal">for </mtext><mspace width="0.333em"></mspace></mrow><mi>x</mi><mo>≤</mo><mn>0</mn></mtd></mtr><mtr><mtd columnalign="left" style="text-align: left"><mfrac><msup><mi>x</mi><mn>3</mn></msup><mn>8</mn></mfrac></mtd><mtd columnalign="left" style="text-align: left"><mrow><mtext mathvariant="normal">for </mtext><mspace width="0.333em"></mspace></mrow><mn>0</mn><mo>&lt;</mo><mi>x</mi><mo>&lt;</mo><mn>2</mn></mtd></mtr><mtr><mtd columnalign="left" style="text-align: left"><mn>1</mn></mtd><mtd columnalign="left" style="text-align: left"><mrow><mtext mathvariant="normal">for </mtext><mspace width="0.333em"></mspace></mrow><mi>x</mi><mo>≥</mo><mn>2</mn></mtd></mtr></mtable></mrow></mrow><annotation encoding="application/x-tex">F(x) = \begin{cases}
0 &amp; \text{for }x \leq 0 \\
\frac{x^{3}}{8} &amp; \text{for }0 &lt; x &lt; 2 \\
1 &amp; \text{for }x \geq 2
\end{cases}</annotation></semantics></math></p>
<ol>
<li>Find a point <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>a</mi><annotation encoding="application/x-tex">a</annotation></semantics></math> such that integrating the density up to <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>a</mi><annotation encoding="application/x-tex">a</annotation></semantics></math> covers
exactly <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mfrac><mn>1</mn><mn>2</mn></mfrac><annotation encoding="application/x-tex">\frac{1}{2}</annotation></semantics></math> of the total area.</li>
</ol>
<p>We want to find <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mfrac><mn>1</mn><mn>2</mn></mfrac><mo>=</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>≤</mo><mi>a</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">\frac{1}{2} = P(X \leq a)</annotation></semantics></math>.</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mfrac><mn>1</mn><mn>2</mn></mfrac><mo>=</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>≤</mo><mi>a</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>F</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>a</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mfrac><msup><mi>a</mi><mn>3</mn></msup><mn>8</mn></mfrac><mo>⇒</mo><mi>a</mi><mo>=</mo><mroot><mn>4</mn><mn>3</mn></mroot></mrow><annotation encoding="application/x-tex">\frac{1}{2} = P(X \leq a) = F(a) = \frac{a^{3}}{8} \Rightarrow a = \sqrt[3]{4}</annotation></semantics></math></p>
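<p>The whole example can be checked with exact arithmetic. The following is a minimal sketch, assuming Python's standard <code>fractions</code> module; the names <code>cdf</code>, <code>p_1_to_1_4</code>, <code>p_1_to_3</code> and <code>median</code> are illustrative and do not come from the notes.</p>

```python
from fractions import Fraction

# The antiderivative of x^2 is x^3 / 3, so the integral of c * x^2
# over (0, 2) equals c * 8/3. Setting that to 1 gives c.
c = 1 / Fraction(8, 3)


def cdf(x):
    """F(x) = x^3 / 8 on (0, 2), equal to 0 below 0 and 1 above 2."""
    x = min(max(Fraction(x), Fraction(0)), Fraction(2))
    return c * x**3 / 3


p_1_to_1_4 = cdf(Fraction(14, 10)) - cdf(1)   # X between 1 and 1.4
p_1_to_3 = cdf(3) - cdf(1)                    # X between 1 and 3
median = 4 ** (1 / 3)                         # solves a^3 / 8 = 1/2

print(c, float(p_1_to_1_4), p_1_to_3, median)
```

<p>Clamping the argument to the support inside <code>cdf</code> is what handles the "be careful after 2" case automatically: everything beyond 2 contributes no extra probability.</p>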
<p>Now let us discuss some named continuous distributions.</p>
<h3 id="the-continuous-uniform-distribution">The (continuous) uniform distribution</h3>
<p>The simplest and the best of the named distributions!</p>
<p><em>Definition. </em></p>
<p>Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false" form="prefix">[</mo><mi>a</mi><mo>,</mo><mi>b</mi><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">\lbrack a,b\rbrack</annotation></semantics></math> be a bounded interval on the real line. A
random variable <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> has the uniform distribution on the interval
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false" form="prefix">[</mo><mi>a</mi><mo>,</mo><mi>b</mi><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">\lbrack a,b\rbrack</annotation></semantics></math> if <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> has the density function</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>f</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mrow><mo stretchy="true" form="prefix">{</mo><mtable><mtr><mtd columnalign="left" style="text-align: left"><mfrac><mn>1</mn><mrow><mi>b</mi><mo>−</mo><mi>a</mi></mrow></mfrac></mtd><mtd columnalign="left" style="text-align: left"><mrow><mtext mathvariant="normal">for </mtext><mspace width="0.333em"></mspace></mrow><mi>x</mi><mo>∈</mo><mo stretchy="false" form="prefix">[</mo><mi>a</mi><mo>,</mo><mi>b</mi><mo stretchy="false" form="postfix">]</mo></mtd></mtr><mtr><mtd columnalign="left" style="text-align: left"><mn>0</mn></mtd><mtd columnalign="left" style="text-align: left"><mrow><mtext mathvariant="normal">for </mtext><mspace width="0.333em"></mspace></mrow><mi>x</mi><mo>∉</mo><mo stretchy="false" form="prefix">[</mo><mi>a</mi><mo>,</mo><mi>b</mi><mo stretchy="false" form="postfix">]</mo></mtd></mtr></mtable></mrow></mrow><annotation encoding="application/x-tex">f(x) = \begin{cases}
\frac{1}{b - a} &amp; \text{for }x \in \lbrack a,b\rbrack \\
0 &amp; \text{for }x \notin \lbrack a,b\rbrack
\end{cases}</annotation></semantics></math></p>
<p>Abbreviate this by <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mo>∼</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> Unif </mtext><mspace width="0.333em"></mspace></mrow><mo stretchy="false" form="prefix">[</mo><mi>a</mi><mo>,</mo><mi>b</mi><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">X\sim\text{ Unif }\lbrack a,b\rbrack</annotation></semantics></math>.</p>
<p>The graph of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mtext mathvariant="normal">Unif </mtext><mspace width="0.333em"></mspace></mrow><mo stretchy="false" form="prefix">[</mo><mi>a</mi><mo>,</mo><mi>b</mi><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">\text{Unif }\lbrack a,b\rbrack</annotation></semantics></math> is a constant line at
height <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mfrac><mn>1</mn><mrow><mi>b</mi><mo>−</mo><mi>a</mi></mrow></mfrac><annotation encoding="application/x-tex">\frac{1}{b - a}</annotation></semantics></math> defined across <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false" form="prefix">[</mo><mi>a</mi><mo>,</mo><mi>b</mi><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">\lbrack a,b\rbrack</annotation></semantics></math>. The
integral is just the area of a rectangle, and we can check it is 1.</p>
<p><em>Fact. </em></p>
<p>For <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mo>∼</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> Unif </mtext><mspace width="0.333em"></mspace></mrow><mo stretchy="false" form="prefix">[</mo><mi>a</mi><mo>,</mo><mi>b</mi><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">X\sim\text{ Unif }\lbrack a,b\rbrack</annotation></semantics></math>, its cumulative distribution
function (CDF) is given by:</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>F</mi><mi>x</mi></msub><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mrow><mo stretchy="true" form="prefix">{</mo><mtable><mtr><mtd columnalign="left" style="text-align: left"><mn>0</mn></mtd><mtd columnalign="left" style="text-align: left"><mrow><mtext mathvariant="normal">for </mtext><mspace width="0.333em"></mspace></mrow><mi>x</mi><mo>&lt;</mo><mi>a</mi></mtd></mtr><mtr><mtd columnalign="left" style="text-align: left"><mfrac><mrow><mi>x</mi><mo>−</mo><mi>a</mi></mrow><mrow><mi>b</mi><mo>−</mo><mi>a</mi></mrow></mfrac></mtd><mtd columnalign="left" style="text-align: left"><mrow><mtext mathvariant="normal">for </mtext><mspace width="0.333em"></mspace></mrow><mi>x</mi><mo>∈</mo><mo stretchy="false" form="prefix">[</mo><mi>a</mi><mo>,</mo><mi>b</mi><mo stretchy="false" form="postfix">]</mo></mtd></mtr><mtr><mtd columnalign="left" style="text-align: left"><mn>1</mn></mtd><mtd columnalign="left" style="text-align: left"><mrow><mtext mathvariant="normal">for </mtext><mspace width="0.333em"></mspace></mrow><mi>x</mi><mo>&gt;</mo><mi>b</mi></mtd></mtr></mtable></mrow></mrow><annotation encoding="application/x-tex">F_{x}(x) = \begin{cases}
0 &amp; \text{for }x &lt; a \\
\frac{x - a}{b - a} &amp; \text{for }x \in \lbrack a,b\rbrack \\
1 &amp; \text{for }x &gt; b
\end{cases}</annotation></semantics></math></p>
<p><em>Fact. </em></p>
<p>If <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mo>∼</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> Unif </mtext><mspace width="0.333em"></mspace></mrow><mo stretchy="false" form="prefix">[</mo><mi>a</mi><mo>,</mo><mi>b</mi><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">X\sim\text{ Unif }\lbrack a,b\rbrack</annotation></semantics></math>, and
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false" form="prefix">[</mo><mi>c</mi><mo>,</mo><mi>d</mi><mo stretchy="false" form="postfix">]</mo><mo>⊂</mo><mo stretchy="false" form="prefix">[</mo><mi>a</mi><mo>,</mo><mi>b</mi><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">\lbrack c,d\rbrack \subset \lbrack a,b\rbrack</annotation></semantics></math>, then
<math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>c</mi><mo>≤</mo><mi>X</mi><mo>≤</mo><mi>d</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><msubsup><mo>∫</mo><mi>c</mi><mi>d</mi></msubsup><mfrac><mn>1</mn><mrow><mi>b</mi><mo>−</mo><mi>a</mi></mrow></mfrac><mi>d</mi><mi>x</mi><mo>=</mo><mfrac><mrow><mi>d</mi><mo>−</mo><mi>c</mi></mrow><mrow><mi>b</mi><mo>−</mo><mi>a</mi></mrow></mfrac></mrow><annotation encoding="application/x-tex">P(c \leq X \leq d) = \int_{c}^{d}\frac{1}{b - a}dx = \frac{d - c}{b - a}</annotation></semantics></math></p>
<p><em>Example. </em></p>
<p>Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>Y</mi><annotation encoding="application/x-tex">Y</annotation></semantics></math> be a uniform random variable on <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false" form="prefix">[</mo><mi>−</mi><mn>2</mn><mo>,</mo><mn>5</mn><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">\lbrack - 2,5\rbrack</annotation></semantics></math>. Find the
probability that its absolute value is at least 1.</p>
<p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>Y</mi><annotation encoding="application/x-tex">Y</annotation></semantics></math> takes values in the interval <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false" form="prefix">[</mo><mi>−</mi><mn>2</mn><mo>,</mo><mn>5</mn><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">\lbrack - 2,5\rbrack</annotation></semantics></math>, so the absolute
value is at least 1 if and only if
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>Y</mi><mo>∈</mo><mo stretchy="false" form="prefix">[</mo><mi>−</mi><mn>2</mn><mo>,</mo><mi>−</mi><mn>1</mn><mo stretchy="false" form="postfix">]</mo><mo>∪</mo><mo stretchy="false" form="prefix">[</mo><mn>1</mn><mo>,</mo><mn>5</mn><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">Y \in \lbrack - 2, - 1\rbrack \cup \lbrack 1,5\rbrack</annotation></semantics></math>.</p>
<p>The density function of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>Y</mi><annotation encoding="application/x-tex">Y</annotation></semantics></math> is
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>f</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mfrac><mn>1</mn><mrow><mn>5</mn><mo>−</mo><mrow><mo stretchy="true" form="prefix">(</mo><mi>−</mi><mn>2</mn><mo stretchy="true" form="postfix">)</mo></mrow></mrow></mfrac><mo>=</mo><mfrac><mn>1</mn><mn>7</mn></mfrac></mrow><annotation encoding="application/x-tex">f(x) = \frac{1}{5 - ( - 2)} = \frac{1}{7}</annotation></semantics></math> on <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false" form="prefix">[</mo><mi>−</mi><mn>2</mn><mo>,</mo><mn>5</mn><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">\lbrack - 2,5\rbrack</annotation></semantics></math>
and 0 everywhere else.</p>
<p>So,</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable><mtr><mtd columnalign="right" style="text-align: right"><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mrow><mo stretchy="true" form="prefix">|</mo><mi>Y</mi><mo stretchy="true" form="postfix">|</mo></mrow><mo>≥</mo><mn>1</mn><mo stretchy="true" form="postfix">)</mo></mrow></mtd><mtd columnalign="left" style="text-align: left"><mo>=</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>Y</mi><mo>∈</mo><mo stretchy="false" form="prefix">[</mo><mi>−</mi><mn>2</mn><mo>,</mo><mi>−</mi><mn>1</mn><mo stretchy="false" form="postfix">]</mo><mo>∪</mo><mo stretchy="false" form="prefix">[</mo><mn>1</mn><mo>,</mo><mn>5</mn><mo stretchy="false" form="postfix">]</mo><mo stretchy="true" form="postfix">)</mo></mrow></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"></mtd><mtd columnalign="left" style="text-align: left"><mo>=</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>−</mi><mn>2</mn><mo>≤</mo><mi>Y</mi><mo>≤</mo><mi>−</mi><mn>1</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>+</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mn>1</mn><mo>≤</mo><mi>Y</mi><mo>≤</mo><mn>5</mn><mo stretchy="true" form="postfix">)</mo></mrow></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"></mtd><mtd columnalign="left" style="text-align: left"><mo>=</mo><mfrac><mn>5</mn><mn>7</mn></mfrac></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{aligned}
P\left( |Y| \geq 1 \right) &amp; = P\left( Y \in \lbrack - 2, - 1\rbrack \cup \lbrack 1,5\rbrack \right) \\
 &amp; = P( - 2 \leq Y \leq - 1) + P(1 \leq Y \leq 5) \\
 &amp; = \frac{5}{7}
\end{aligned}</annotation></semantics></math></p>
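<p>The same computation can be reproduced from the uniform CDF. This is a small sketch, assuming integer endpoints so that exact fractions can be used; the helper names <code>uniform_cdf</code> and <code>uniform_interval</code> are made up for illustration.</p>

```python
from fractions import Fraction


def uniform_cdf(x, a, b):
    """CDF of Unif[a, b]: (x - a) / (b - a), clamped to the range [0, 1]."""
    t = Fraction(x - a, b - a)
    return min(max(t, Fraction(0)), Fraction(1))


def uniform_interval(c, d, a, b):
    """Probability that Y lands in [c, d], as a difference of CDF values."""
    return uniform_cdf(d, a, b) - uniform_cdf(c, a, b)


a, b = -2, 5
# The event |Y| at least 1 splits into two disjoint intervals.
p = uniform_interval(-2, -1, a, b) + uniform_interval(1, 5, a, b)
print(p)  # 5/7
```

<p>The two pieces contribute 1/7 and 4/7 respectively, which is just (length of the piece) / (length of the whole interval).</p>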
<h3 id="the-exponential-distribution">The exponential distribution</h3>
<p>The geometric distribution can be viewed as modeling waiting times in a
discrete setting, i.e. we wait through <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>n</mi><mo>−</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">n - 1</annotation></semantics></math> failures to arrive at the
first success on the <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>n</mi><mtext mathvariant="normal">th</mtext></msup><annotation encoding="application/x-tex">n^{\text{th}}</annotation></semantics></math> trial.</p>
<p>The exponential distribution is the continuous analogue of the geometric
distribution, in that we often use it to model waiting times in the
continuous sense. For example, the time until the first customer enters
the barber shop.</p>
<p><em>Definition. </em></p>
<p>Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>0</mn><mo>&lt;</mo><mi>λ</mi><mo>&lt;</mo><mi>∞</mi></mrow><annotation encoding="application/x-tex">0 &lt; \lambda &lt; \infty</annotation></semantics></math>. A random variable <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> has the exponential
distribution with parameter <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>λ</mi><annotation encoding="application/x-tex">\lambda</annotation></semantics></math> if <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> has PDF</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>f</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mrow><mo stretchy="true" form="prefix">{</mo><mtable><mtr><mtd columnalign="left" style="text-align: left"><mi>λ</mi><msup><mi>e</mi><mrow><mi>−</mi><mi>λ</mi><mi>x</mi></mrow></msup></mtd><mtd columnalign="left" style="text-align: left"><mrow><mtext mathvariant="normal">for </mtext><mspace width="0.333em"></mspace></mrow><mi>x</mi><mo>≥</mo><mn>0</mn></mtd></mtr><mtr><mtd columnalign="left" style="text-align: left"><mn>0</mn></mtd><mtd columnalign="left" style="text-align: left"><mrow><mtext mathvariant="normal">for </mtext><mspace width="0.333em"></mspace></mrow><mi>x</mi><mo>&lt;</mo><mn>0</mn></mtd></mtr></mtable></mrow></mrow><annotation encoding="application/x-tex">f(x) = \begin{cases}
\lambda e^{- \lambda x} &amp; \text{for }x \geq 0 \\
0 &amp; \text{for }x &lt; 0
\end{cases}</annotation></semantics></math></p>
<p>Abbreviate this by <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mo>∼</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> Exp</mtext></mrow><mrow><mo stretchy="true" form="prefix">(</mo><mi>λ</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">X\sim\text{ Exp}(\lambda)</annotation></semantics></math>, the exponential
distribution with rate <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>λ</mi><annotation encoding="application/x-tex">\lambda</annotation></semantics></math>.</p>
<p>The CDF of the <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mtext mathvariant="normal">Exp</mtext><mrow><mo stretchy="true" form="prefix">(</mo><mi>λ</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">\text{Exp}(\lambda)</annotation></semantics></math> distribution is given by:</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>F</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>t</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mrow><mo stretchy="true" form="prefix">{</mo><mtable><mtr><mtd columnalign="left" style="text-align: left"><mn>0</mn></mtd><mtd columnalign="left" style="text-align: left"><mrow><mtext mathvariant="normal">if </mtext><mspace width="0.333em"></mspace></mrow><mi>t</mi><mo>&lt;</mo><mn>0</mn></mtd></mtr><mtr><mtd columnalign="left" style="text-align: left"><mn>1</mn><mo>−</mo><msup><mi>e</mi><mrow><mi>−</mi><mi>λ</mi><mi>t</mi></mrow></msup></mtd><mtd columnalign="left" style="text-align: left"><mrow><mtext mathvariant="normal">if </mtext><mspace width="0.333em"></mspace></mrow><mi>t</mi><mo>≥</mo><mn>0</mn></mtd></mtr></mtable></mrow></mrow><annotation encoding="application/x-tex">F(t) = \begin{cases}
0 &amp; \text{if }t &lt; 0 \\
1 - e^{- \lambda t} &amp; \text{if }t \geq 0
\end{cases}</annotation></semantics></math></p>
<p><em>Example. </em></p>
<p>Suppose the length of a phone call, in minutes, is well modeled by an
exponential random variable with a rate <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>λ</mi><mo>=</mo><mfrac><mn>1</mn><mn>10</mn></mfrac></mrow><annotation encoding="application/x-tex">\lambda = \frac{1}{10}</annotation></semantics></math>.</p>
<ol>
<li><p>What is the probability that a call takes more than 8 minutes?</p></li>
<li><p>What is the probability that a call takes between 8 and 22 minutes?</p></li>
</ol>
<p>Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> be the length of the phone call, so that
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mo>∼</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> Exp</mtext></mrow><mrow><mo stretchy="true" form="prefix">(</mo><mfrac><mn>1</mn><mn>10</mn></mfrac><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">X\sim\text{ Exp}\left( \frac{1}{10} \right)</annotation></semantics></math>. Then we can find the
desired probability by:</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable><mtr><mtd columnalign="right" style="text-align: right"><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>&gt;</mo><mn>8</mn><mo stretchy="true" form="postfix">)</mo></mrow></mtd><mtd columnalign="left" style="text-align: left"><mo>=</mo><mn>1</mn><mo>−</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>≤</mo><mn>8</mn><mo stretchy="true" form="postfix">)</mo></mrow></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"></mtd><mtd columnalign="left" style="text-align: left"><mo>=</mo><mn>1</mn><mo>−</mo><msub><mi>F</mi><mi>x</mi></msub><mrow><mo stretchy="true" form="prefix">(</mo><mn>8</mn><mo stretchy="true" form="postfix">)</mo></mrow></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"></mtd><mtd columnalign="left" style="text-align: left"><mo>=</mo><mn>1</mn><mo>−</mo><mrow><mo stretchy="true" form="prefix">(</mo><mn>1</mn><mo>−</mo><msup><mi>e</mi><mrow><mi>−</mi><mrow><mo stretchy="true" form="prefix">(</mo><mfrac><mn>1</mn><mn>10</mn></mfrac><mo stretchy="true" form="postfix">)</mo></mrow><mo>⋅</mo><mn>8</mn></mrow></msup><mo stretchy="true" form="postfix">)</mo></mrow></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"></mtd><mtd columnalign="left" style="text-align: left"><mo>=</mo><msup><mi>e</mi><mrow><mi>−</mi><mfrac><mn>8</mn><mn>10</mn></mfrac></mrow></msup><mo>≈</mo><mn>0.4493</mn></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{aligned}
P(X &gt; 8) &amp; = 1 - P(X \leq 8) \\
 &amp; = 1 - F_{x}(8) \\
 &amp; = 1 - \left( 1 - e^{- \left( \frac{1}{10} \right) \cdot 8} \right) \\
 &amp; = e^{- \frac{8}{10}} \approx 0.4493
\end{aligned}</annotation></semantics></math></p>
<p>Now to find <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mn>8</mn><mo>&lt;</mo><mi>X</mi><mo>&lt;</mo><mn>22</mn><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">P(8 &lt; X &lt; 22)</annotation></semantics></math>, we can take the difference of the two tail probabilities, which equals the difference in CDFs:</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable><mtr><mtd columnalign="right" style="text-align: right"></mtd><mtd columnalign="left" style="text-align: left"><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>&gt;</mo><mn>8</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>−</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>≥</mo><mn>22</mn><mo stretchy="true" form="postfix">)</mo></mrow></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"></mtd><mtd columnalign="left" style="text-align: left"><mo>=</mo><msup><mi>e</mi><mrow><mi>−</mi><mfrac><mn>8</mn><mn>10</mn></mfrac></mrow></msup><mo>−</mo><msup><mi>e</mi><mrow><mi>−</mi><mfrac><mn>22</mn><mn>10</mn></mfrac></mrow></msup></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"></mtd><mtd columnalign="left" style="text-align: left"><mo>≈</mo><mn>0.3385</mn></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{aligned}
 &amp; P(X &gt; 8) - P(X \geq 22) \\
 &amp; = e^{- \frac{8}{10}} - e^{- \frac{22}{10}} \\
 &amp; \approx 0.3385
\end{aligned}</annotation></semantics></math></p>
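<p>As a quick numerical check of the two results above, here is a minimal Python
sketch. The helper name <code>exp_cdf</code> and the constant <code>RATE</code> are illustrative
choices, not part of the original notes; the rate 1/10 is the one used in the example.</p>
<pre><code class="language-python">import math

# Rate parameter from the worked example: lambda = 1/10.
RATE = 1 / 10

def exp_cdf(x, rate=RATE):
    """CDF of Exp(rate): 1 - e^(-rate * x) for nonnegative x, and 0 for negative x."""
    # Clamping x at 0 makes the CDF vanish on the negative half-line.
    return 1 - math.exp(-rate * max(x, 0.0))

# Tail probability: one minus the CDF at 8.
p_more_than_8 = 1 - exp_cdf(8)

# Interval probability: difference of the CDF at the two endpoints.
p_between = exp_cdf(22) - exp_cdf(8)

print(round(p_more_than_8, 4))  # 0.4493
print(round(p_between, 4))      # 0.3385
</code></pre>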
<p><em>Fact (Memoryless property of the exponential distribution).</em></p>
<p>Suppose that <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mo>∼</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> Exp</mtext></mrow><mrow><mo stretchy="true" form="prefix">(</mo><mi>λ</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">X\sim\text{ Exp}(\lambda)</annotation></semantics></math>. Then for any <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>s</mi><mo>,</mo><mi>t</mi><mo>&gt;</mo><mn>0</mn></mrow><annotation encoding="application/x-tex">s,t &gt; 0</annotation></semantics></math>, we
have <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>&gt;</mo><mi>t</mi><mo>+</mo><mi>s</mi><mspace width="0.222em"></mspace><mo stretchy="false" form="prefix">|</mo><mspace width="0.222em"></mspace><mi>X</mi><mo>&gt;</mo><mi>t</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>&gt;</mo><mi>s</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">P\left( X &gt; t + s~|~X &gt; t \right) = P(X &gt; s)</annotation></semantics></math></p>
<p>In other words: suppose I’ve already waited 5 minutes for the bus. The
probability that I wait more than 5 + 3 minutes in total, given those first
5 minutes, is exactly the probability of waiting more than 3 minutes starting
from scratch. The time already spent waiting tells us nothing about the time
remaining.</p>
<p><em>Proof. </em></p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable><mtr><mtd columnalign="right" style="text-align: right"><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>&gt;</mo><mi>t</mi><mo>+</mo><mi>s</mi><mspace width="0.222em"></mspace><mo stretchy="false" form="prefix">|</mo><mspace width="0.222em"></mspace><mi>X</mi><mo>&gt;</mo><mi>t</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mfrac><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>&gt;</mo><mi>t</mi><mo>+</mo><mi>s</mi><mo>∩</mo><mi>X</mi><mo>&gt;</mo><mi>t</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>&gt;</mo><mi>t</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow></mfrac></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"><mo>=</mo><mfrac><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>&gt;</mo><mi>t</mi><mo>+</mo><mi>s</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>&gt;</mo><mi>t</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow></mfrac><mo>=</mo><mfrac><msup><mi>e</mi><mrow><mi>−</mi><mi>λ</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>t</mi><mo>+</mo><mi>s</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow></msup><msup><mi>e</mi><mrow><mi>−</mi><mi>λ</mi><mi>t</mi></mrow></msup></mfrac><mo>=</mo><msup><mi>e</mi><mrow><mi>−</mi><mi>λ</mi><mi>s</mi></mrow></msup></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"><mo>≡</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>&gt;</mo><mi>s</mi><mo stretchy="true" form="postfix">)</mo></mrow></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{array}{r}
P\left( X &gt; t + s~|~X &gt; t \right) = \frac{P(X &gt; t + s \cap X &gt; t)}{P(X &gt; t)} \\
 = \frac{P(X &gt; t + s)}{P(X &gt; t)} = \frac{e^{- \lambda(t + s)}}{e^{- \lambda t}} = e^{- \lambda s} \\
 \equiv P(X &gt; s)
\end{array}</annotation></semantics></math></p>
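<p>The identity can also be checked numerically. The sketch below is only an
illustration: the function names and the values of <code>rate</code>, <code>t</code>, and <code>s</code> are
hypothetical, chosen to mirror the bus example.</p>
<pre><code class="language-python">import math

def survival(x, rate):
    """Probability that an Exp(rate) variable exceeds a nonnegative x."""
    return math.exp(-rate * x)

def conditional_survival(t, s, rate):
    """Probability of exceeding t + s given that t was already exceeded.

    The joint event is just "exceeds t + s", so this is a ratio of survival values.
    """
    return survival(t + s, rate) / survival(t, rate)

# Hypothetical numbers: rate 1/10, 5 minutes already waited, 3 more in question.
rate, t, s = 0.1, 5.0, 3.0

lhs = conditional_survival(t, s, rate)
rhs = survival(s, rate)
print(math.isclose(lhs, rhs))  # True
</code></pre>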
<h3 id="gamma-distribution">Gamma distribution</h3>
<p><em>Definition. </em></p>
<p>Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>r</mi><mo>,</mo><mi>λ</mi><mo>&gt;</mo><mn>0</mn></mrow><annotation encoding="application/x-tex">r,\lambda &gt; 0</annotation></semantics></math>. A random variable <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> has the <strong>gamma
distribution</strong> with parameters <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="true" form="prefix">(</mo><mi>r</mi><mo>,</mo><mi>λ</mi><mo stretchy="true" form="postfix">)</mo></mrow><annotation encoding="application/x-tex">(r,\lambda)</annotation></semantics></math> if <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> is nonnegative and
has probability density function</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>f</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mrow><mo stretchy="true" form="prefix">{</mo><mtable><mtr><mtd columnalign="left" style="text-align: left"><mfrac><mrow><msup><mi>λ</mi><mi>r</mi></msup><msup><mi>x</mi><mrow><mi>r</mi><mo>−</mo><mn>1</mn></mrow></msup></mrow><mrow><mi>Γ</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>r</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow></mfrac><msup><mi>e</mi><mrow><mi>−</mi><mi>λ</mi><mi>x</mi></mrow></msup></mtd><mtd columnalign="left" style="text-align: left"><mrow><mtext mathvariant="normal">for </mtext><mspace width="0.333em"></mspace></mrow><mi>x</mi><mo>≥</mo><mn>0</mn></mtd></mtr><mtr><mtd columnalign="left" style="text-align: left"><mn>0</mn></mtd><mtd columnalign="left" style="text-align: left"><mrow><mtext mathvariant="normal">for </mtext><mspace width="0.333em"></mspace></mrow><mi>x</mi><mo>&lt;</mo><mn>0</mn></mtd></mtr></mtable></mrow></mrow><annotation encoding="application/x-tex">f(x) = \begin{cases}
\frac{\lambda^{r}x^{r - 1}}{\Gamma(r)}e^{- \lambda x} &amp; \text{for }x \geq 0 \\
0 &amp; \text{for }x &lt; 0
\end{cases}</annotation></semantics></math></p>
<p>Abbreviate this by <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mo>∼</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> Gamma</mtext></mrow><mrow><mo stretchy="true" form="prefix">(</mo><mi>r</mi><mo>,</mo><mi>λ</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">X\sim\text{ Gamma}(r,\lambda)</annotation></semantics></math>.</p>
<p>The gamma function <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>Γ</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>r</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">\Gamma(r)</annotation></semantics></math> generalizes the factorial function and is
defined as</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>Γ</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>r</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><msubsup><mo>∫</mo><mn>0</mn><mi>∞</mi></msubsup><msup><mi>x</mi><mrow><mi>r</mi><mo>−</mo><mn>1</mn></mrow></msup><msup><mi>e</mi><mrow><mi>−</mi><mi>x</mi></mrow></msup><mi>d</mi><mi>x</mi><mo>,</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> for </mtext><mspace width="0.333em"></mspace></mrow><mi>r</mi><mo>&gt;</mo><mn>0</mn></mrow><annotation encoding="application/x-tex">\Gamma(r) = \int_{0}^{\infty}x^{r - 1}e^{- x}dx,\text{ for }r &gt; 0</annotation></semantics></math></p>
<p>Special case: <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>Γ</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>n</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mrow><mo stretchy="true" form="prefix">(</mo><mi>n</mi><mo>−</mo><mn>1</mn><mo stretchy="true" form="postfix">)</mo></mrow><mi>!</mi></mrow><annotation encoding="application/x-tex">\Gamma(n) = (n - 1)!</annotation></semantics></math> if <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>n</mi><mo>∈</mo><msup><mi>ℤ</mi><mo>+</mo></msup></mrow><annotation encoding="application/x-tex">n \in {\mathbb{Z}}^{+}</annotation></semantics></math>.</p>
<p><em>Remark. </em></p>
<p>The <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mtext mathvariant="normal">Exp</mtext><mrow><mo stretchy="true" form="prefix">(</mo><mi>λ</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">\text{Exp}(\lambda)</annotation></semantics></math> distribution is a special case of the gamma
distribution, with parameter <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>r</mi><mo>=</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">r = 1</annotation></semantics></math>.</p>
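<p>Both the factorial special case and the remark can be spot-checked with
Python’s standard library. This is a sketch only: <code>gamma_pdf</code> and <code>exp_pdf</code> are
hypothetical helper names, and the density is written with the standard
exponent r − 1 on x, which is what makes the case r = 1 collapse to the
exponential density.</p>
<pre><code class="language-python">import math

def gamma_pdf(x, r, rate):
    """Density of Gamma(r, rate) at a nonnegative x (standard form, exponent r - 1)."""
    return rate**r * x**(r - 1) * math.exp(-rate * x) / math.gamma(r)

def exp_pdf(x, rate):
    """Density of Exp(rate) at a nonnegative x."""
    return rate * math.exp(-rate * x)

# Gamma(n) agrees with (n - 1)! for the first few positive integers.
factorial_ok = all(
    math.isclose(math.gamma(n), math.factorial(n - 1)) for n in range(1, 11)
)

# With r = 1 the gamma density matches the exponential density pointwise.
# The rate 0.1 and the sample points are arbitrary illustrative values.
reduces_ok = all(
    math.isclose(gamma_pdf(x, 1, 0.1), exp_pdf(x, 0.1)) for x in (0.5, 1.0, 8.0, 22.0)
)

print(factorial_ok, reduces_ok)  # True True
</code></pre>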
<h2 id="the-normal-distribution">The normal distribution</h2>
<p>Also known as the Gaussian distribution, this is so important it gets
its own section.</p>
<p><em>Definition. </em></p>
<p>A random variable <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>Z</mi><annotation encoding="application/x-tex">Z</annotation></semantics></math> has the <strong>standard normal distribution</strong> if <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>Z</mi><annotation encoding="application/x-tex">Z</annotation></semantics></math>
has density function</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>φ</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mfrac><mn>1</mn><msqrt><mrow><mn>2</mn><mi>π</mi></mrow></msqrt></mfrac><msup><mi>e</mi><mrow><mi>−</mi><mfrac><msup><mi>x</mi><mn>2</mn></msup><mn>2</mn></mfrac></mrow></msup></mrow><annotation encoding="application/x-tex">\varphi(x) = \frac{1}{\sqrt{2\pi}}e^{- \frac{x^{2}}{2}}</annotation></semantics></math> on the real
line. Abbreviate this by <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>Z</mi><mo>∼</mo><mi>N</mi><mrow><mo stretchy="true" form="prefix">(</mo><mn>0</mn><mo>,</mo><mn>1</mn><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">Z\sim N(0,1)</annotation></semantics></math>.</p>
<p><em>Fact (CDF of a standard normal random variable).</em></p>
<p>Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>Z</mi><mo>∼</mo><mi>N</mi><mrow><mo stretchy="true" form="prefix">(</mo><mn>0</mn><mo>,</mo><mn>1</mn><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">Z\sim N(0,1)</annotation></semantics></math> be normally distributed. Then its CDF is given by
<math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>Φ</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><msubsup><mo>∫</mo><mrow><mi>−</mi><mi>∞</mi></mrow><mi>x</mi></msubsup><mi>φ</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>s</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>d</mi><mi>s</mi><mo>=</mo><msubsup><mo>∫</mo><mrow><mi>−</mi><mi>∞</mi></mrow><mi>x</mi></msubsup><mfrac><mn>1</mn><msqrt><mrow><mn>2</mn><mi>π</mi></mrow></msqrt></mfrac><msup><mi>e</mi><mrow><mi>−</mi><mfrac><msup><mi>s</mi><mn>2</mn></msup><mn>2</mn></mfrac></mrow></msup><mi>d</mi><mi>s</mi></mrow><annotation encoding="application/x-tex">\Phi(x) = \int_{- \infty}^{x}\varphi(s)ds = \int_{- \infty}^{x}\frac{1}{\sqrt{2\pi}}e^{- \frac{s^{2}}{2}}ds</annotation></semantics></math></p>
<p>The normal distribution is so important that, instead of the usual
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>f</mi><mi>Z</mi></msub><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">f_{Z}(x)</annotation></semantics></math> and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>F</mi><mi>Z</mi></msub><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">F_{Z}(x)</annotation></semantics></math>, we use the special notation <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>φ</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">\varphi(x)</annotation></semantics></math> and
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>Φ</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">\Phi(x)</annotation></semantics></math> for its density and CDF.</p>
<p><em>Fact. </em></p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msubsup><mo>∫</mo><mrow><mi>−</mi><mi>∞</mi></mrow><mi>∞</mi></msubsup><msup><mi>e</mi><mrow><mi>−</mi><mfrac><msup><mi>s</mi><mn>2</mn></msup><mn>2</mn></mfrac></mrow></msup><mi>d</mi><mi>s</mi><mo>=</mo><msqrt><mrow><mn>2</mn><mi>π</mi></mrow></msqrt></mrow><annotation encoding="application/x-tex">\int_{- \infty}^{\infty}e^{- \frac{s^{2}}{2}}ds = \sqrt{2\pi}</annotation></semantics></math></p>
<p>The standard normal CDF <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>Φ</mi><annotation encoding="application/x-tex">\Phi</annotation></semantics></math> has no closed form in terms of elementary functions, so to
evaluate it we are left to either:</p>
<ul>
<li><p>approximate it numerically</p></li>
<li><p>use technology (a calculator or software)</p></li>
<li><p>use the standard normal probability table in the textbook</p></li>
</ul>
<p>To evaluate negative values, we can use the symmetry of the normal
distribution to apply the following identity:</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>Φ</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>−</mi><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mn>1</mn><mo>−</mo><mi>Φ</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">\Phi( - x) = 1 - \Phi(x)</annotation></semantics></math></p>
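<p>For the “use technology” route, one option is the error function in Python’s
<code>math</code> module, via the identity Φ(x) = (1 + erf(x / √2)) / 2. The function names
<code>phi</code> and <code>Phi</code> below are illustrative, and this is a sketch rather than a
replacement for a proper statistics library.</p>
<pre><code class="language-python">import math

def phi(x):
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def Phi(x):
    """Standard normal CDF, expressed through the error function."""
    return (1 + math.erf(x / math.sqrt(2))) / 2

# A familiar table value.
print(round(Phi(1.96), 4))  # 0.975

# Symmetry identity: the CDF at -x equals one minus the CDF at x.
print(math.isclose(Phi(-1.3), 1 - Phi(1.3)))  # True
</code></pre>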
<h3 id="general-normal-distributions">General normal distributions</h3>
<p>Probabilities for a normal distribution with any other parameters can be
computed from the standard normal.</p>
<p>The general family of normal distributions is obtained by linear or
affine transformations of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>Z</mi><annotation encoding="application/x-tex">Z</annotation></semantics></math>. Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>μ</mi><annotation encoding="application/x-tex">\mu</annotation></semantics></math> be real, and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>σ</mi><mo>&gt;</mo><mn>0</mn></mrow><annotation encoding="application/x-tex">\sigma &gt; 0</annotation></semantics></math>, then</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mo>=</mo><mi>σ</mi><mi>Z</mi><mo>+</mo><mi>μ</mi></mrow><annotation encoding="application/x-tex">X = \sigma Z + \mu</annotation></semantics></math> is also a normally distributed random variable
with parameters <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="true" form="prefix">(</mo><mi>μ</mi><mo>,</mo><msup><mi>σ</mi><mn>2</mn></msup><mo stretchy="true" form="postfix">)</mo></mrow><annotation encoding="application/x-tex">\left( \mu,\sigma^{2} \right)</annotation></semantics></math>. The CDF of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> in terms
of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>Φ</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>⋅</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">\Phi( \cdot )</annotation></semantics></math> can be expressed as</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable><mtr><mtd columnalign="right" style="text-align: right"><msub><mi>F</mi><mi>X</mi></msub><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow></mtd><mtd columnalign="left" style="text-align: left"><mo>=</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>≤</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"></mtd><mtd columnalign="left" style="text-align: left"><mo>=</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>σ</mi><mi>Z</mi><mo>+</mo><mi>μ</mi><mo>≤</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"></mtd><mtd columnalign="left" style="text-align: left"><mo>=</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>Z</mi><mo>≤</mo><mfrac><mrow><mi>x</mi><mo>−</mo><mi>μ</mi></mrow><mi>σ</mi></mfrac><mo stretchy="true" form="postfix">)</mo></mrow></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"></mtd><mtd columnalign="left" style="text-align: left"><mo>=</mo><mi>Φ</mi><mrow><mo stretchy="true" form="prefix">(</mo><mfrac><mrow><mi>x</mi><mo>−</mo><mi>μ</mi></mrow><mi>σ</mi></mfrac><mo stretchy="true" form="postfix">)</mo></mrow></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{aligned}
F_{X}(x) &amp; = P(X \leq x) \\
 &amp; = P(\sigma Z + \mu \leq x) \\
 &amp; = P\left( Z \leq \frac{x - \mu}{\sigma} \right) \\
 &amp; = \Phi(\frac{x - \mu}{\sigma})
\end{aligned}</annotation></semantics></math></p>
<p>Differentiating the CDF with the chain rule gives the density:</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>f</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><msup><mi>F</mi><mo>′</mo></msup><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mfrac><mi>d</mi><mrow><mi>d</mi><mi>x</mi></mrow></mfrac><mrow><mo stretchy="true" form="prefix">[</mo><mi>Φ</mi><mrow><mo stretchy="true" form="prefix">(</mo><mfrac><mrow><mi>x</mi><mo>−</mo><mi>μ</mi></mrow><mi>σ</mi></mfrac><mo stretchy="true" form="postfix">)</mo></mrow><mo stretchy="true" form="postfix">]</mo></mrow><mo>=</mo><mfrac><mn>1</mn><mi>σ</mi></mfrac><mi>φ</mi><mrow><mo stretchy="true" form="prefix">(</mo><mfrac><mrow><mi>x</mi><mo>−</mo><mi>μ</mi></mrow><mi>σ</mi></mfrac><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mfrac><mn>1</mn><msqrt><mrow><mn>2</mn><mi>π</mi><msup><mi>σ</mi><mn>2</mn></msup></mrow></msqrt></mfrac><msup><mi>e</mi><mfrac><mrow><mi>−</mi><mrow><mo stretchy="true" form="prefix">(</mo><msup><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo>−</mo><mi>μ</mi><mo stretchy="true" form="postfix">)</mo></mrow><mn>2</mn></msup><mo stretchy="true" form="postfix">)</mo></mrow></mrow><mrow><mn>2</mn><msup><mi>σ</mi><mn>2</mn></msup></mrow></mfrac></msup></mrow><annotation encoding="application/x-tex">f(x) = F'(x) = \frac{d}{dx}\left\lbrack \Phi(\frac{x - \mu}{\sigma}) \right\rbrack = \frac{1}{\sigma}\varphi(\frac{x - \mu}{\sigma}) = \frac{1}{\sqrt{2\pi\sigma^{2}}}e^{\frac{- \left( (x - \mu)^{2} \right)}{2\sigma^{2}}}</annotation></semantics></math></p>
<p><em>Definition. </em></p>
<p>Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>μ</mi><annotation encoding="application/x-tex">\mu</annotation></semantics></math> be real and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>σ</mi><mo>&gt;</mo><mn>0</mn></mrow><annotation encoding="application/x-tex">\sigma &gt; 0</annotation></semantics></math>. A random variable <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> has the
<strong>normal distribution</strong> with mean <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>μ</mi><annotation encoding="application/x-tex">\mu</annotation></semantics></math> and variance <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>σ</mi><mn>2</mn></msup><annotation encoding="application/x-tex">\sigma^{2}</annotation></semantics></math> if <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math>
has density function</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>f</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mfrac><mn>1</mn><msqrt><mrow><mn>2</mn><mi>π</mi><msup><mi>σ</mi><mn>2</mn></msup></mrow></msqrt></mfrac><msup><mi>e</mi><mfrac><mrow><mi>−</mi><mrow><mo stretchy="true" form="prefix">(</mo><msup><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo>−</mo><mi>μ</mi><mo stretchy="true" form="postfix">)</mo></mrow><mn>2</mn></msup><mo stretchy="true" form="postfix">)</mo></mrow></mrow><mrow><mn>2</mn><msup><mi>σ</mi><mn>2</mn></msup></mrow></mfrac></msup></mrow><annotation encoding="application/x-tex">f(x) = \frac{1}{\sqrt{2\pi\sigma^{2}}}e^{\frac{- \left( (x - \mu)^{2} \right)}{2\sigma^{2}}}</annotation></semantics></math></p>
<p>on the real line. Abbreviate this by
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mo>∼</mo><mi>N</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>μ</mi><mo>,</mo><msup><mi>σ</mi><mn>2</mn></msup><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">X\sim N\left( \mu,\sigma^{2} \right)</annotation></semantics></math>.</p>
<p><em>Fact. </em></p>
<p>Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mo>∼</mo><mi>N</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>μ</mi><mo>,</mo><msup><mi>σ</mi><mn>2</mn></msup><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">X\sim N\left( \mu,\sigma^{2} \right)</annotation></semantics></math> and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>Y</mi><mo>=</mo><mi>a</mi><mi>X</mi><mo>+</mo><mi>b</mi></mrow><annotation encoding="application/x-tex">Y = aX + b</annotation></semantics></math> with <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>a</mi><mo>≠</mo><mn>0</mn></mrow><annotation encoding="application/x-tex">a \neq 0</annotation></semantics></math>. Then
<math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>Y</mi><mo>∼</mo><mi>N</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>a</mi><mi>μ</mi><mo>+</mo><mi>b</mi><mo>,</mo><msup><mi>a</mi><mn>2</mn></msup><msup><mi>σ</mi><mn>2</mn></msup><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">Y\sim N\left( a\mu + b,a^{2}\sigma^{2} \right)</annotation></semantics></math></p>
<p>That is, <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>Y</mi><annotation encoding="application/x-tex">Y</annotation></semantics></math> is normally distributed with parameters
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="true" form="prefix">(</mo><mi>a</mi><mi>μ</mi><mo>+</mo><mi>b</mi><mo>,</mo><msup><mi>a</mi><mn>2</mn></msup><msup><mi>σ</mi><mn>2</mn></msup><mo stretchy="true" form="postfix">)</mo></mrow><annotation encoding="application/x-tex">\left( a\mu + b,a^{2}\sigma^{2} \right)</annotation></semantics></math>. In particular,
<math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>Z</mi><mo>=</mo><mfrac><mrow><mi>X</mi><mo>−</mo><mi>μ</mi></mrow><mi>σ</mi></mfrac><mo>∼</mo><mi>N</mi><mrow><mo stretchy="true" form="prefix">(</mo><mn>0</mn><mo>,</mo><mn>1</mn><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">Z = \frac{X - \mu}{\sigma}\sim N(0,1)</annotation></semantics></math> is a standard normal variable.</p>
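<p>Standardization is easy to try out with <code>statistics.NormalDist</code> from the Python
standard library. The parameters below (mean 100, standard deviation 15, and the
point 120) are made up for illustration and do not come from the notes.</p>
<pre><code class="language-python">import math
from statistics import NormalDist

# Hypothetical parameters for a general normal variable X.
mu, sigma = 100.0, 15.0
x = 120.0

standard = NormalDist()         # the standard normal, mean 0 and sigma 1
general = NormalDist(mu, sigma)

# Standardize the point, then evaluate the standard normal CDF there.
z = (x - mu) / sigma
cdf_via_standard = standard.cdf(z)
cdf_direct = general.cdf(x)

# The density picks up a factor of 1 / sigma from the chain rule.
pdf_via_standard = standard.pdf(z) / sigma
pdf_direct = general.pdf(x)

print(math.isclose(cdf_via_standard, cdf_direct))  # True
print(math.isclose(pdf_via_standard, pdf_direct))  # True
</code></pre>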
<h2 id="expectation">Expectation</h2>
<p>Let’s discuss the <em>expectation</em> of a random variable, which is a similar
idea to the basic concept of <em>mean</em>.</p>
<p><em>Definition. </em></p>
<p>The expectation or mean of a discrete random variable <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> is the
weighted average, with weights assigned by the corresponding
probabilities.</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>E</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><munder><mo>∑</mo><mrow><mrow><mtext mathvariant="normal">all </mtext><mspace width="0.333em"></mspace></mrow><msub><mi>x</mi><mi>i</mi></msub></mrow></munder><msub><mi>x</mi><mi>i</mi></msub><mo>⋅</mo><mi>p</mi><mrow><mo stretchy="true" form="prefix">(</mo><msub><mi>x</mi><mi>i</mi></msub><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">E(X) = \sum_{\text{all }x_{i}}x_{i} \cdot p\left( x_{i} \right)</annotation></semantics></math></p>
<p><em>Example. </em></p>
<p>Find the expected value of a single roll of a fair die.</p>
<ul>
<li><p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mo>=</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> score (number of dots showing)</mtext></mrow></mrow><annotation encoding="application/x-tex">X = \text{ score (number of dots showing)}</annotation></semantics></math></p></li>
<li><p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>x</mi><mo>=</mo><mn>1</mn><mo>,</mo><mn>2</mn><mo>,</mo><mn>3</mn><mo>,</mo><mn>4</mn><mo>,</mo><mn>5</mn><mo>,</mo><mn>6</mn></mrow><annotation encoding="application/x-tex">x = 1,2,3,4,5,6</annotation></semantics></math></p></li>
<li><p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>p</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mfrac><mn>1</mn><mn>6</mn></mfrac><mo>,</mo><mfrac><mn>1</mn><mn>6</mn></mfrac><mo>,</mo><mfrac><mn>1</mn><mn>6</mn></mfrac><mo>,</mo><mfrac><mn>1</mn><mn>6</mn></mfrac><mo>,</mo><mfrac><mn>1</mn><mn>6</mn></mfrac><mo>,</mo><mfrac><mn>1</mn><mn>6</mn></mfrac></mrow><annotation encoding="application/x-tex">p(x) = \frac{1}{6},\frac{1}{6},\frac{1}{6},\frac{1}{6},\frac{1}{6},\frac{1}{6}</annotation></semantics></math></p></li>
</ul>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>E</mi><mo stretchy="false" form="prefix">[</mo><mi>X</mi><mo stretchy="false" form="postfix">]</mo><mo>=</mo><mn>1</mn><mo>⋅</mo><mfrac><mn>1</mn><mn>6</mn></mfrac><mo>+</mo><mn>2</mn><mo>⋅</mo><mfrac><mn>1</mn><mn>6</mn></mfrac><mo>+</mo><mi>⋯</mi><mo>+</mo><mn>6</mn><mo>⋅</mo><mfrac><mn>1</mn><mn>6</mn></mfrac><mo>=</mo><mfrac><mn>21</mn><mn>6</mn></mfrac><mo>=</mo><mn>3.5</mn></mrow><annotation encoding="application/x-tex">E\lbrack X\rbrack = 1 \cdot \frac{1}{6} + 2 \cdot \frac{1}{6} + \cdots + 6 \cdot \frac{1}{6} = \frac{21}{6} = 3.5</annotation></semantics></math></p>
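<p>The weighted sum for the die can be carried out exactly with rational
arithmetic. A minimal sketch follows; the variable names are illustrative.</p>
<pre><code class="language-python">from fractions import Fraction

# Probability mass function of a fair six-sided die.
faces = range(1, 7)
pmf = {x: Fraction(1, 6) for x in faces}

# Expectation: each value weighted by its probability, then summed.
expected = sum(x * p for x, p in pmf.items())

print(expected)         # 7/2
print(float(expected))  # 3.5
</code></pre>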
<h3 id="binomial-expected-value">Binomial expected value</h3>
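<p>For a binomial random variable <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> counting the successes in <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>n</mi><annotation encoding="application/x-tex">n</annotation></semantics></math> independent
trials, each with success probability <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>p</mi><annotation encoding="application/x-tex">p</annotation></semantics></math>:</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>E</mi><mo stretchy="false" form="prefix">[</mo><mi>X</mi><mo stretchy="false" form="postfix">]</mo><mo>=</mo><mi>n</mi><mi>p</mi></mrow><annotation encoding="application/x-tex">E\lbrack X\rbrack = np</annotation></semantics></math></p>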
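<p>The formula np can be sanity-checked by summing k times the binomial
probability mass function over all k. The helper <code>binomial_pmf</code> and the values
of <code>n</code> and <code>p</code> below are assumptions made for this sketch.</p>
<pre><code class="language-python">import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical parameters: 10 trials with success probability 0.3.
n, p = 10, 0.3

# Expectation straight from the definition of a discrete mean.
mean = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))

print(math.isclose(mean, n * p))  # True
</code></pre>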
<h3 id="bernoulli-expected-value">Bernoulli expected value</h3>
<p>Bernoulli is just binomial with one trial.</p>
<p>Recall that <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mn>1</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>p</mi></mrow><annotation encoding="application/x-tex">P(X = 1) = p</annotation></semantics></math> and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mn>0</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mn>1</mn><mo>−</mo><mi>p</mi></mrow><annotation encoding="application/x-tex">P(X = 0) = 1 - p</annotation></semantics></math>.</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>E</mi><mo stretchy="false" form="prefix">[</mo><mi>X</mi><mo stretchy="false" form="postfix">]</mo><mo>=</mo><mn>1</mn><mo>⋅</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mn>1</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>+</mo><mn>0</mn><mo>⋅</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mn>0</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>p</mi></mrow><annotation encoding="application/x-tex">E\lbrack X\rbrack = 1 \cdot P(X = 1) + 0 \cdot P(X = 0) = p</annotation></semantics></math></p>
<p>Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>A</mi><annotation encoding="application/x-tex">A</annotation></semantics></math> be an event on <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>Ω</mi><annotation encoding="application/x-tex">\Omega</annotation></semantics></math>. Its <em>indicator random variable</em> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>I</mi><mi>A</mi></msub><annotation encoding="application/x-tex">I_{A}</annotation></semantics></math>
is defined for <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>ω</mi><mo>∈</mo><mi>Ω</mi></mrow><annotation encoding="application/x-tex">\omega \in \Omega</annotation></semantics></math> by</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>I</mi><mi>A</mi></msub><mrow><mo stretchy="true" form="prefix">(</mo><mi>ω</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mrow><mo stretchy="true" form="prefix">{</mo><mtable><mtr><mtd columnalign="left" style="text-align: left"><mn>1</mn><mrow><mtext mathvariant="normal">, if </mtext><mspace width="0.333em"></mspace></mrow></mtd><mtd columnalign="left" style="text-align: left"><mi>ω</mi><mo>∈</mo><mi>A</mi></mtd></mtr><mtr><mtd columnalign="left" style="text-align: left"><mn>0</mn><mrow><mtext mathvariant="normal">, if </mtext><mspace width="0.333em"></mspace></mrow></mtd><mtd columnalign="left" style="text-align: left"><mi>ω</mi><mo>∉</mo><mi>A</mi></mtd></mtr></mtable></mrow></mrow><annotation encoding="application/x-tex">I_{A}(\omega) = \begin{cases}
1\text{, if  } &amp; \omega \in A \\
0\text{, if } &amp; \omega \notin A
\end{cases}</annotation></semantics></math></p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>E</mi><mrow><mo stretchy="true" form="prefix">[</mo><msub><mi>I</mi><mi>A</mi></msub><mo stretchy="true" form="postfix">]</mo></mrow><mo>=</mo><mn>1</mn><mo>⋅</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>A</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>A</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">E\left\lbrack I_{A} \right\rbrack = 1 \cdot P(A) = P(A)</annotation></semantics></math></p>
<h2 id="geometric-expected-value">Geometric expected value</h2>
<p>Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>p</mi><mo>∈</mo><mo stretchy="false" form="prefix">(</mo><mn>0</mn><mo>,</mo><mn>1</mn><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">p \in (0,1\rbrack</annotation></semantics></math> and let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mo>∼</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> Geom</mtext></mrow><mo stretchy="false" form="prefix">[</mo><mi>p</mi><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">X\sim\text{ Geom}\lbrack p\rbrack</annotation></semantics></math>
be a geometric RV with probability of success <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>p</mi><annotation encoding="application/x-tex">p</annotation></semantics></math>. Recall that the
p.m.f. is <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mi>k</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>p</mi><msup><mi>q</mi><mrow><mi>k</mi><mo>−</mo><mn>1</mn></mrow></msup></mrow><annotation encoding="application/x-tex">P(X = k) = pq^{k - 1}</annotation></semantics></math> for <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>k</mi><mo>=</mo><mn>1</mn><mo>,</mo><mn>2</mn><mo>,</mo><mi>…</mi></mrow><annotation encoding="application/x-tex">k = 1,2,\ldots</annotation></semantics></math>, where the probability of failure is
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>q</mi><mo>≔</mo><mn>1</mn><mo>−</mo><mi>p</mi></mrow><annotation encoding="application/x-tex">q ≔ 1 - p</annotation></semantics></math>.</p>
<p>Then</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable><mtr><mtd columnalign="right" style="text-align: right"><mi>E</mi><mo stretchy="false" form="prefix">[</mo><mi>X</mi><mo stretchy="false" form="postfix">]</mo></mtd><mtd columnalign="left" style="text-align: left"><mo>=</mo><munderover><mo>∑</mo><mrow><mi>k</mi><mo>=</mo><mn>1</mn></mrow><mo accent="false">∞</mo></munderover><mi>k</mi><mi>p</mi><msup><mi>q</mi><mrow><mi>k</mi><mo>−</mo><mn>1</mn></mrow></msup></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"></mtd><mtd columnalign="left" style="text-align: left"><mo>=</mo><mi>p</mi><mo>⋅</mo><munderover><mo>∑</mo><mrow><mi>k</mi><mo>=</mo><mn>1</mn></mrow><mo accent="false">∞</mo></munderover><mi>k</mi><mo>⋅</mo><msup><mi>q</mi><mrow><mi>k</mi><mo>−</mo><mn>1</mn></mrow></msup></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{aligned}
E\lbrack X\rbrack &amp; = \sum_{k = 1}^{\infty}kpq^{k - 1} \\
 &amp; = p \cdot \sum_{k = 1}^{\infty}k \cdot q^{k - 1}
\end{aligned}</annotation></semantics></math></p>
<p>Now recall from calculus that you can differentiate a power series term
by term inside its radius of convergence. So for <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mo stretchy="true" form="prefix">|</mo><mi>t</mi><mo stretchy="true" form="postfix">|</mo></mrow><mo>&lt;</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">|t| &lt; 1</annotation></semantics></math>,</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable><mtr><mtd columnalign="right" style="text-align: right"><munderover><mo>∑</mo><mrow><mi>k</mi><mo>=</mo><mn>1</mn></mrow><mo accent="false">∞</mo></munderover><mi>k</mi><msup><mi>t</mi><mrow><mi>k</mi><mo>−</mo><mn>1</mn></mrow></msup><mo>=</mo><munderover><mo>∑</mo><mrow><mi>k</mi><mo>=</mo><mn>1</mn></mrow><mo accent="false">∞</mo></munderover><mfrac><mi>d</mi><mrow><mi>d</mi><mi>t</mi></mrow></mfrac><msup><mi>t</mi><mi>k</mi></msup><mo>=</mo><mfrac><mi>d</mi><mrow><mi>d</mi><mi>t</mi></mrow></mfrac><munderover><mo>∑</mo><mrow><mi>k</mi><mo>=</mo><mn>1</mn></mrow><mo accent="false">∞</mo></munderover><msup><mi>t</mi><mi>k</mi></msup><mo>=</mo><mfrac><mi>d</mi><mrow><mi>d</mi><mi>t</mi></mrow></mfrac><mrow><mo stretchy="true" form="prefix">(</mo><mfrac><mi>t</mi><mrow><mn>1</mn><mo>−</mo><mi>t</mi></mrow></mfrac><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mfrac><mn>1</mn><msup><mrow><mo stretchy="true" form="prefix">(</mo><mn>1</mn><mo>−</mo><mi>t</mi><mo stretchy="true" form="postfix">)</mo></mrow><mn>2</mn></msup></mfrac></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"><mo>∴</mo><mi>E</mi><mo stretchy="false" form="prefix">[</mo><mi>X</mi><mo stretchy="false" form="postfix">]</mo><mo>=</mo><munderover><mo>∑</mo><mrow><mi>k</mi><mo>=</mo><mn>1</mn></mrow><mo accent="false">∞</mo></munderover><mi>k</mi><mi>p</mi><msup><mi>q</mi><mrow><mi>k</mi><mo>−</mo><mn>1</mn></mrow></msup><mo>=</mo><mi>p</mi><munderover><mo>∑</mo><mrow><mi>k</mi><mo>=</mo><mn>1</mn></mrow><mo accent="false">∞</mo></munderover><mi>k</mi><msup><mi>q</mi><mrow><mi>k</mi><mo>−</mo><mn>1</mn></mrow></msup><mo>=</mo><mi>p</mi><mrow><mo stretchy="true" form="prefix">(</mo><mfrac><mn>1</mn><msup><mrow><mo stretchy="true" form="prefix">(</mo><mn>1</mn><mo>−</mo><mi>q</mi><mo stretchy="true" form="postfix">)</mo></mrow><mn>2</mn></msup></mfrac><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mfrac><mn>1</mn><mi>p</mi></mfrac></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{array}{r}
\sum_{k = 1}^{\infty}kt^{k - 1} = \sum_{k = 1}^{\infty}\frac{d}{dt}t^{k} = \frac{d}{dt}\sum_{k = 1}^{\infty}t^{k} = \frac{d}{dt}\left( \frac{t}{1 - t} \right) = \frac{1}{(1 - t)^{2}} \\
\therefore E\lbrack X\rbrack = \sum_{k = 1}^{\infty}kpq^{k - 1} = p\sum_{k = 1}^{\infty}kq^{k - 1} = p\left( \frac{1}{(1 - q)^{2}} \right) = \frac{1}{p}
\end{array}</annotation></semantics></math></p>
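<p>As a quick sanity check on the closed form <code>1/p</code>, the series can be truncated and summed numerically. This is a minimal sketch in Python and is not part of the original notes; the function name <code>geometric_mean_partial_sum</code>, the number of terms, and the sample values of <code>p</code> are illustrative assumptions.</p>

```python
def geometric_mean_partial_sum(p, terms=10_000):
    """Approximate E[X] = sum over k >= 1 of k * p * q**(k - 1) by truncating the series."""
    q = 1.0 - p
    total = 0.0
    weight = p  # equals p * q**(k - 1) for k = 1
    for k in range(1, terms + 1):
        total += k * weight
        weight *= q  # advance to p * q**k for the next term
    return total


# Illustrative success probabilities (assumed values, not from the notes).
for p in (0.1, 0.25, 0.5, 0.9):
    print(p, geometric_mean_partial_sum(p), 1 / p)
```

<p>For each <code>p</code> the truncated sum should agree with <code>1/p</code> up to floating-point error, since the neglected tail of the series is negligible after this many terms.</p>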
<h3 id="expected-value-of-a-continuous-rv">Expected value of a continuous RV</h3>
<p><em>Definition. </em></p>
<p>The expectation or mean of a continuous random variable <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> with density
function <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>f</mi><annotation encoding="application/x-tex">f</annotation></semantics></math> is</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>E</mi><mo stretchy="false" form="prefix">[</mo><mi>X</mi><mo stretchy="false" form="postfix">]</mo><mo>=</mo><msubsup><mo>∫</mo><mrow><mi>−</mi><mi>∞</mi></mrow><mi>∞</mi></msubsup><mi>x</mi><mo>⋅</mo><mi>f</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>d</mi><mi>x</mi></mrow><annotation encoding="application/x-tex">E\lbrack X\rbrack = \int_{- \infty}^{\infty}x \cdot f(x)dx</annotation></semantics></math></p>
<p>An alternative symbol is <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>μ</mi><mo>=</mo><mi>E</mi><mo stretchy="false" form="prefix">[</mo><mi>X</mi><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">\mu = E\lbrack X\rbrack</annotation></semantics></math>.</p>
<p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>μ</mi><annotation encoding="application/x-tex">\mu</annotation></semantics></math> is the “first moment” of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math>; by analogy with physics, it is the
“center of gravity” of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math>.</p>
<p><em>Remark. </em></p>
<p>In general, when moving between discrete and continuous RVs, replace sums
with integrals and p.m.f.s with p.d.f.s, and vice versa.</p>
<p><em>Example. </em></p>
<p>Suppose <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> is a continuous RV with p.d.f.</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>f</mi><mi>X</mi></msub><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mrow><mo stretchy="true" form="prefix">{</mo><mtable><mtr><mtd columnalign="left" style="text-align: left"><mn>2</mn><mi>x</mi><mrow><mtext mathvariant="normal">, </mtext><mspace width="0.333em"></mspace></mrow></mtd><mtd columnalign="left" style="text-align: left"><mn>0</mn><mo>&lt;</mo><mi>x</mi><mo>&lt;</mo><mn>1</mn></mtd></mtr><mtr><mtd columnalign="left" style="text-align: left"><mn>0</mn><mrow><mtext mathvariant="normal">, </mtext><mspace width="0.333em"></mspace></mrow></mtd><mtd columnalign="left" style="text-align: left"><mtext mathvariant="normal">elsewhere</mtext></mtd></mtr></mtable></mrow></mrow><annotation encoding="application/x-tex">f_{X}(x) = \begin{cases}
2x\text{,  } &amp; 0 &lt; x &lt; 1 \\
0\text{, } &amp; \text{elsewhere}
\end{cases}</annotation></semantics></math></p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>E</mi><mo stretchy="false" form="prefix">[</mo><mi>X</mi><mo stretchy="false" form="postfix">]</mo><mo>=</mo><msubsup><mo>∫</mo><mrow><mi>−</mi><mi>∞</mi></mrow><mi>∞</mi></msubsup><mi>x</mi><mo>⋅</mo><mi>f</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>d</mi><mi>x</mi><mo>=</mo><msubsup><mo>∫</mo><mn>0</mn><mn>1</mn></msubsup><mi>x</mi><mo>⋅</mo><mn>2</mn><mi>x</mi><mi>d</mi><mi>x</mi><mo>=</mo><mfrac><mn>2</mn><mn>3</mn></mfrac></mrow><annotation encoding="application/x-tex">E\lbrack X\rbrack = \int_{- \infty}^{\infty}x \cdot f(x)dx = \int_{0}^{1}x \cdot 2xdx = \frac{2}{3}</annotation></semantics></math></p>
<p><em>Example (Uniform expectation).</em></p>
<p>Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> be a uniform random variable on the interval
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false" form="prefix">[</mo><mi>a</mi><mo>,</mo><mi>b</mi><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">\lbrack a,b\rbrack</annotation></semantics></math> with <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mo>∼</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> Unif</mtext></mrow><mo stretchy="false" form="prefix">[</mo><mi>a</mi><mo>,</mo><mi>b</mi><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">X\sim\text{ Unif}\lbrack a,b\rbrack</annotation></semantics></math>. Find
the expected value of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math>.</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable><mtr><mtd columnalign="right" style="text-align: right"><mi>E</mi><mo stretchy="false" form="prefix">[</mo><mi>X</mi><mo stretchy="false" form="postfix">]</mo><mo>=</mo><msubsup><mo>∫</mo><mrow><mi>−</mi><mi>∞</mi></mrow><mi>∞</mi></msubsup><mi>x</mi><mo>⋅</mo><mi>f</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>d</mi><mi>x</mi><mo>=</mo><msubsup><mo>∫</mo><mi>a</mi><mi>b</mi></msubsup><mfrac><mi>x</mi><mrow><mi>b</mi><mo>−</mo><mi>a</mi></mrow></mfrac><mi>d</mi><mi>x</mi></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"><mo>=</mo><mfrac><mn>1</mn><mrow><mi>b</mi><mo>−</mo><mi>a</mi></mrow></mfrac><msubsup><mo>∫</mo><mi>a</mi><mi>b</mi></msubsup><mi>x</mi><mi>d</mi><mi>x</mi><mo>=</mo><mfrac><mn>1</mn><mrow><mi>b</mi><mo>−</mo><mi>a</mi></mrow></mfrac><mo>⋅</mo><mfrac><mrow><msup><mi>b</mi><mn>2</mn></msup><mo>−</mo><msup><mi>a</mi><mn>2</mn></msup></mrow><mn>2</mn></mfrac><mo>=</mo><munder><munder><mfrac><mrow><mi>b</mi><mo>+</mo><mi>a</mi></mrow><mn>2</mn></mfrac><mo accent="true">⏟</mo></munder><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> midpoint formula</mtext></mrow></munder></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{array}{r}
E\lbrack X\rbrack = \int_{- \infty}^{\infty}x \cdot f(x)dx = \int_{a}^{b}\frac{x}{b - a}dx \\
 = \frac{1}{b - a}\int_{a}^{b}xdx = \frac{1}{b - a} \cdot \frac{b^{2} - a^{2}}{2} = \underset{\text{ midpoint formula}}{\underbrace{\frac{b + a}{2}}}
\end{array}</annotation></semantics></math></p>
<p><em>Example (Exponential expectation).</em></p>
<p>Find the expected value of an exponential RV, with p.d.f.</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>f</mi><mi>X</mi></msub><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mrow><mo stretchy="true" form="prefix">{</mo><mtable><mtr><mtd columnalign="left" style="text-align: left"><mi>λ</mi><msup><mi>e</mi><mrow><mi>−</mi><mi>λ</mi><mi>x</mi></mrow></msup><mrow><mtext mathvariant="normal">, </mtext><mspace width="0.333em"></mspace></mrow></mtd><mtd columnalign="left" style="text-align: left"><mi>x</mi><mo>&gt;</mo><mn>0</mn></mtd></mtr><mtr><mtd columnalign="left" style="text-align: left"><mn>0</mn><mrow><mtext mathvariant="normal">, </mtext><mspace width="0.333em"></mspace></mrow></mtd><mtd columnalign="left" style="text-align: left"><mtext mathvariant="normal">elsewhere</mtext></mtd></mtr></mtable></mrow></mrow><annotation encoding="application/x-tex">f_{X}(x) = \begin{cases}
\lambda e^{- \lambda x}\text{,  } &amp; x &gt; 0 \\
0\text{, } &amp; \text{elsewhere}
\end{cases}</annotation></semantics></math></p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable><mtr><mtd columnalign="right" style="text-align: right"><mi>E</mi><mo stretchy="false" form="prefix">[</mo><mi>X</mi><mo stretchy="false" form="postfix">]</mo><mo>=</mo><msubsup><mo>∫</mo><mrow><mi>−</mi><mi>∞</mi></mrow><mi>∞</mi></msubsup><mi>x</mi><mo>⋅</mo><mi>f</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>d</mi><mi>x</mi><mo>=</mo><msubsup><mo>∫</mo><mn>0</mn><mi>∞</mi></msubsup><mi>x</mi><mo>⋅</mo><mi>λ</mi><msup><mi>e</mi><mrow><mi>−</mi><mi>λ</mi><mi>x</mi></mrow></msup><mi>d</mi><mi>x</mi></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"><mo>=</mo><mi>λ</mi><mo>⋅</mo><msubsup><mo>∫</mo><mn>0</mn><mi>∞</mi></msubsup><mi>x</mi><mo>⋅</mo><msup><mi>e</mi><mrow><mi>−</mi><mi>λ</mi><mi>x</mi></mrow></msup><mi>d</mi><mi>x</mi></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"><mo>=</mo><mi>λ</mi><mo>⋅</mo><mrow><mo stretchy="true" form="prefix">[</mo><msubsup><mrow><mi>−</mi><mi>x</mi><mfrac><mn>1</mn><mi>λ</mi></mfrac><msup><mi>e</mi><mrow><mi>−</mi><mi>λ</mi><mi>x</mi></mrow></msup><mo stretchy="true" form="postfix">|</mo></mrow><mrow><mi>x</mi><mo>=</mo><mn>0</mn></mrow><mrow><mi>x</mi><mo>=</mo><mi>∞</mi></mrow></msubsup><mo>−</mo><msubsup><mo>∫</mo><mn>0</mn><mi>∞</mi></msubsup><mo>−</mo><mfrac><mn>1</mn><mi>λ</mi></mfrac><msup><mi>e</mi><mrow><mi>−</mi><mi>λ</mi><mi>x</mi></mrow></msup><mi>d</mi><mi>x</mi><mo stretchy="true" form="postfix">]</mo></mrow></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"><mo>=</mo><mi>λ</mi><mo>⋅</mo><mfrac><mn>1</mn><msup><mi>λ</mi><mn>2</mn></msup></mfrac><mo>=</mo><mfrac><mn>1</mn><mi>λ</mi></mfrac></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{array}{r}
E\lbrack X\rbrack = \int_{- \infty}^{\infty}x \cdot f(x)dx = \int_{0}^{\infty}x \cdot \lambda e^{- \lambda x}dx \\
 = \lambda \cdot \int_{0}^{\infty}x \cdot e^{- \lambda x}dx \\
 = \lambda \cdot \left\lbrack \left. -x\frac{1}{\lambda}e^{- \lambda x} \right|_{x = 0}^{x = \infty} - \int_{0}^{\infty} - \frac{1}{\lambda}e^{- \lambda x}dx \right\rbrack \\
 = \lambda \cdot \frac{1}{\lambda^{2}} = \frac{1}{\lambda}
\end{array}</annotation></semantics></math></p>
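<p>The three continuous expectations computed so far (the density <code>2x</code> on (0, 1), the uniform, and the exponential) can be checked numerically. The following is a minimal sketch in Python and is not part of the original notes; the helper <code>expectation</code>, the step count, the values <code>a = 2</code>, <code>b = 5</code>, <code>lam = 1.5</code>, and the truncation of the infinite upper limit at <code>40 / lam</code> are all illustrative assumptions.</p>

```python
import math


def expectation(density, lower, upper, steps=200_000):
    """Midpoint-rule approximation of the integral of x * density(x) over [lower, upper]."""
    width = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * width  # midpoint of the i-th subinterval
        total += x * density(x)
    return total * width


# Density f(x) = 2x on (0, 1); the notes derive E[X] = 2/3.
mean_linear = expectation(lambda x: 2 * x, 0.0, 1.0)

# Unif[a, b] with assumed endpoints; the notes derive E[X] = (a + b) / 2.
a, b = 2.0, 5.0
mean_uniform = expectation(lambda x: 1 / (b - a), a, b)

# Exponential with assumed rate lam; the notes derive E[X] = 1 / lam.
# The infinite upper limit is truncated where the density is negligible.
lam = 1.5
mean_exponential = expectation(lambda x: lam * math.exp(-lam * x), 0.0, 40 / lam)

print(mean_linear, mean_uniform, mean_exponential)
```

<p>Each printed value should be close to the corresponding closed form; the density only needs to be supplied on its support because it is zero elsewhere.</p>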
<p><em>Example (Uniform dartboard).</em></p>
<p>Our dartboard is a disk of radius <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>r</mi><mn>0</mn></msub><annotation encoding="application/x-tex">r_{0}</annotation></semantics></math> and the dart lands uniformly
at random on the disk when thrown. Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>R</mi><annotation encoding="application/x-tex">R</annotation></semantics></math> be the distance of the dart
from the center of the disk. Find <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>E</mi><mo stretchy="false" form="prefix">[</mo><mi>R</mi><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">E\lbrack R\rbrack</annotation></semantics></math> given density
function</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>f</mi><mi>R</mi></msub><mrow><mo stretchy="true" form="prefix">(</mo><mi>t</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mrow><mo stretchy="true" form="prefix">{</mo><mtable><mtr><mtd columnalign="left" style="text-align: left"><mfrac><mrow><mn>2</mn><mi>t</mi></mrow><msubsup><mi>r</mi><mn>0</mn><mn>2</mn></msubsup></mfrac><mrow><mtext mathvariant="normal">, </mtext><mspace width="0.333em"></mspace></mrow></mtd><mtd columnalign="left" style="text-align: left"><mn>0</mn><mo>≤</mo><mi>t</mi><mo>≤</mo><msub><mi>r</mi><mn>0</mn></msub></mtd></mtr><mtr><mtd columnalign="left" style="text-align: left"><mn>0</mn><mrow><mtext mathvariant="normal">, </mtext><mspace width="0.333em"></mspace></mrow></mtd><mtd columnalign="left" style="text-align: left"><mi>t</mi><mo>&lt;</mo><mn>0</mn><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> or </mtext><mspace width="0.333em"></mspace></mrow><mi>t</mi><mo>&gt;</mo><msub><mi>r</mi><mn>0</mn></msub></mtd></mtr></mtable></mrow></mrow><annotation encoding="application/x-tex">f_{R}(t) = \begin{cases}
\frac{2t}{r_{0}^{2}}\text{,  } &amp; 0 \leq t \leq r_{0} \\
0\text{,  } &amp; t &lt; 0\text{ or }t &gt; r_{0}
\end{cases}</annotation></semantics></math></p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable><mtr><mtd columnalign="right" style="text-align: right"><mi>E</mi><mo stretchy="false" form="prefix">[</mo><mi>R</mi><mo stretchy="false" form="postfix">]</mo><mo>=</mo><msubsup><mo>∫</mo><mrow><mi>−</mi><mi>∞</mi></mrow><mi>∞</mi></msubsup><mi>t</mi><msub><mi>f</mi><mi>R</mi></msub><mrow><mo stretchy="true" form="prefix">(</mo><mi>t</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>d</mi><mi>t</mi></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"><mo>=</mo><msubsup><mo>∫</mo><mn>0</mn><msub><mi>r</mi><mn>0</mn></msub></msubsup><mi>t</mi><mo>⋅</mo><mfrac><mrow><mn>2</mn><mi>t</mi></mrow><msubsup><mi>r</mi><mn>0</mn><mn>2</mn></msubsup></mfrac><mi>d</mi><mi>t</mi></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"><mo>=</mo><mfrac><mn>2</mn><mn>3</mn></mfrac><msub><mi>r</mi><mn>0</mn></msub></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{array}{r}
E\lbrack R\rbrack = \int_{- \infty}^{\infty}tf_{R}(t)dt \\
 = \int_{0}^{r_{0}}t \cdot \frac{2t}{r_{0}^{2}}dt \\
 = \frac{2}{3}r_{0}
\end{array}</annotation></semantics></math></p>
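<p>The dartboard result can also be checked by simulation without using the density at all. The sketch below is an illustration in Python, not part of the original notes: it draws points uniformly in the bounding square, rejects those outside the disk, and averages the distance of the remaining points from the center. The function name <code>mean_dart_distance</code>, the number of throws, and the fixed seed are assumptions made for the example.</p>

```python
import math
import random


def mean_dart_distance(r0, throws=200_000, seed=0):
    """Monte Carlo estimate of E[R] for a dart landing uniformly on a disk of radius r0."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = 0.0
    hits = 0
    while hits != throws:
        x = rng.uniform(-r0, r0)
        y = rng.uniform(-r0, r0)
        distance = math.hypot(x, y)
        if distance > r0:
            continue  # point fell outside the disk: reject and redraw
        total += distance
        hits += 1
    return total / throws


estimate = mean_dart_distance(1.0)
print(estimate, 2 / 3)
```

<p>With a unit radius the estimate should land near 2/3, up to Monte Carlo error that shrinks as the number of throws grows.</p>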
<h3 id="expectation-of-derived-values">Expectation of derived values</h3>
<p>If we can find the expected value of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math>, can we find the expected value
of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>X</mi><mn>2</mn></msup><annotation encoding="application/x-tex">X^{2}</annotation></semantics></math>? More precisely, can we find
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>E</mi><mrow><mo stretchy="true" form="prefix">[</mo><msup><mi>X</mi><mn>2</mn></msup><mo stretchy="true" form="postfix">]</mo></mrow></mrow><annotation encoding="application/x-tex">E\left\lbrack X^{2} \right\rbrack</annotation></semantics></math>?</p>
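<p>If the distribution of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>X</mi><mn>2</mn></msup><annotation encoding="application/x-tex">X^{2}</annotation></semantics></math> is easy to find, we can compute this directly from the definition. Otherwise we
have the following useful property:</p>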
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>E</mi><mrow><mo stretchy="true" form="prefix">[</mo><msup><mi>X</mi><mn>2</mn></msup><mo stretchy="true" form="postfix">]</mo></mrow><mo>=</mo><msub><mo>∫</mo><mrow><mrow><mtext mathvariant="normal">all </mtext><mspace width="0.333em"></mspace></mrow><mi>x</mi></mrow></msub><msup><mi>x</mi><mn>2</mn></msup><msub><mi>f</mi><mi>X</mi></msub><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>d</mi><mi>x</mi></mrow><annotation encoding="application/x-tex">E\left\lbrack X^{2} \right\rbrack = \int_{\text{all }x}x^{2}f_{X}(x)dx</annotation></semantics></math></p>
<p>(for continuous RVs).</p>
<p>And in the discrete case,</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>E</mi><mrow><mo stretchy="true" form="prefix">[</mo><msup><mi>X</mi><mn>2</mn></msup><mo stretchy="true" form="postfix">]</mo></mrow><mo>=</mo><munder><mo>∑</mo><mrow><mrow><mtext mathvariant="normal">all </mtext><mspace width="0.333em"></mspace></mrow><mi>x</mi></mrow></munder><msup><mi>x</mi><mn>2</mn></msup><msub><mi>p</mi><mi>X</mi></msub><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">E\left\lbrack X^{2} \right\rbrack = \sum_{\text{all }x}x^{2}p_{X}(x)</annotation></semantics></math></p>
<p>In fact <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>E</mi><mrow><mo stretchy="true" form="prefix">[</mo><msup><mi>X</mi><mn>2</mn></msup><mo stretchy="true" form="postfix">]</mo></mrow></mrow><annotation encoding="application/x-tex">E\left\lbrack X^{2} \right\rbrack</annotation></semantics></math> is so important that we call
it the <strong>mean square</strong>.</p>
<p><em>Fact. </em></p>
<p>More generally, a real-valued function <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>g</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">g(X)</annotation></semantics></math> defined on the range of
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> is itself a random variable (with its own distribution).</p>
<p>We can find the expected value of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>g</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">g(X)</annotation></semantics></math> by</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>E</mi><mrow><mo stretchy="true" form="prefix">[</mo><mi>g</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo stretchy="true" form="postfix">]</mo></mrow><mo>=</mo><msubsup><mo>∫</mo><mrow><mi>−</mi><mi>∞</mi></mrow><mi>∞</mi></msubsup><mi>g</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>f</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>d</mi><mi>x</mi></mrow><annotation encoding="application/x-tex">E\left\lbrack g(X) \right\rbrack = \int_{- \infty}^{\infty}g(x)f(x)dx</annotation></semantics></math></p>
<p>or, in the discrete case,</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>E</mi><mrow><mo stretchy="true" form="prefix">[</mo><mi>g</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo stretchy="true" form="postfix">]</mo></mrow><mo>=</mo><munder><mo>∑</mo><mrow><mrow><mtext mathvariant="normal">all </mtext><mspace width="0.333em"></mspace></mrow><mi>x</mi></mrow></munder><mi>g</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><msub><mi>p</mi><mi>X</mi></msub><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">E\left\lbrack g(X) \right\rbrack = \sum_{\text{all }x}g(x)p_{X}(x)</annotation></semantics></math></p>
<p><em>Example. </em></p>
<p>You roll a fair die to determine the winnings (or losses) <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>W</mi><annotation encoding="application/x-tex">W</annotation></semantics></math> of a
player as follows:</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>W</mi><mo>=</mo><mrow><mo stretchy="true" form="prefix">{</mo><mtable><mtr><mtd columnalign="left" style="text-align: left"><mi>−</mi><mn>1</mn><mrow><mtext mathvariant="normal">, </mtext><mspace width="0.333em"></mspace></mrow></mtd><mtd columnalign="left" style="text-align: left"><mtext mathvariant="normal">if the roll is 1, 2, or 3</mtext></mtd></mtr><mtr><mtd columnalign="left" style="text-align: left"><mn>1</mn><mrow><mtext mathvariant="normal">, </mtext><mspace width="0.333em"></mspace></mrow></mtd><mtd columnalign="left" style="text-align: left"><mtext mathvariant="normal">if the roll is 4</mtext></mtd></mtr><mtr><mtd columnalign="left" style="text-align: left"><mn>3</mn><mrow><mtext mathvariant="normal">, </mtext><mspace width="0.333em"></mspace></mrow></mtd><mtd columnalign="left" style="text-align: left"><mtext mathvariant="normal">if the roll is 5 or 6</mtext></mtd></mtr></mtable></mrow></mrow><annotation encoding="application/x-tex">W = \begin{cases}
 - 1\text{, } &amp; \text{if the roll is 1, 2, or 3} \\
1\text{, } &amp; \text{if the roll is 4} \\
3\text{, } &amp; \text{if the roll is 5 or 6}
\end{cases}</annotation></semantics></math></p>
<p>What are the player’s expected winnings (or losses) on one roll of the
die?</p>
<p>Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> denote the outcome of the roll of the die. Then we can define
our random variable as <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>W</mi><mo>=</mo><mi>g</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">W = g(X)</annotation></semantics></math> where the function <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>g</mi><annotation encoding="application/x-tex">g</annotation></semantics></math> is defined by
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>g</mi><mrow><mo stretchy="true" form="prefix">(</mo><mn>1</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>g</mi><mrow><mo stretchy="true" form="prefix">(</mo><mn>2</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>g</mi><mrow><mo stretchy="true" form="prefix">(</mo><mn>3</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>−</mi><mn>1</mn></mrow><annotation encoding="application/x-tex">g(1) = g(2) = g(3) = - 1</annotation></semantics></math> and so on.</p>
<p>Note that <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>W</mi><mo>=</mo><mi>−</mi><mn>1</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mn>1</mn><mo>∪</mo><mi>X</mi><mo>=</mo><mn>2</mn><mo>∪</mo><mi>X</mi><mo>=</mo><mn>3</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mfrac><mn>1</mn><mn>2</mn></mfrac></mrow><annotation encoding="application/x-tex">P(W = - 1) = P(X = 1 \cup X = 2 \cup X = 3) = \frac{1}{2}</annotation></semantics></math>.
Likewise <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>W</mi><mo>=</mo><mn>1</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mn>4</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mfrac><mn>1</mn><mn>6</mn></mfrac></mrow><annotation encoding="application/x-tex">P(W = 1) = P(X = 4) = \frac{1}{6}</annotation></semantics></math>, and
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>W</mi><mo>=</mo><mn>3</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>=</mo><mn>5</mn><mo>∪</mo><mi>X</mi><mo>=</mo><mn>6</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mfrac><mn>1</mn><mn>3</mn></mfrac></mrow><annotation encoding="application/x-tex">P(W = 3) = P(X = 5 \cup X = 6) = \frac{1}{3}</annotation></semantics></math>.</p>
<p>Then <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable><mtr><mtd columnalign="right" style="text-align: right"><mi>E</mi><mrow><mo stretchy="true" form="prefix">[</mo><mi>g</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo stretchy="true" form="postfix">]</mo></mrow><mo>=</mo><mi>E</mi><mo stretchy="false" form="prefix">[</mo><mi>W</mi><mo stretchy="false" form="postfix">]</mo><mo>=</mo><mrow><mo stretchy="true" form="prefix">(</mo><mi>−</mi><mn>1</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>⋅</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>W</mi><mo>=</mo><mi>−</mi><mn>1</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>+</mo><mrow><mo stretchy="true" form="prefix">(</mo><mn>1</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>⋅</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>W</mi><mo>=</mo><mn>1</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>+</mo><mrow><mo stretchy="true" form="prefix">(</mo><mn>3</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>⋅</mo><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>W</mi><mo>=</mo><mn>3</mn><mo stretchy="true" form="postfix">)</mo></mrow></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"><mo>=</mo><mi>−</mi><mfrac><mn>1</mn><mn>2</mn></mfrac><mo>+</mo><mfrac><mn>1</mn><mn>6</mn></mfrac><mo>+</mo><mn>1</mn><mo>=</mo><mfrac><mn>2</mn><mn>3</mn></mfrac></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{array}{r}
E\left\lbrack g(X) \right\rbrack = E\lbrack W\rbrack = ( - 1) \cdot P(W = - 1) + (1) \cdot P(W = 1) + (3) \cdot P(W = 3) \\
 = - \frac{1}{2} + \frac{1}{6} + 1 = \frac{2}{3}
\end{array}</annotation></semantics></math></p>
<p><em>Example. </em></p>
<p>A stick of length <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>l</mi><annotation encoding="application/x-tex">l</annotation></semantics></math> is broken at a uniformly chosen random location.
What is the expected length of the longer piece?</p>
<p>Idea: if you break it before the halfway point, then the longer piece
has length given by <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>l</mi><mo>−</mo><mi>x</mi></mrow><annotation encoding="application/x-tex">l - x</annotation></semantics></math>. If you break it after the halfway point,
the longer piece has length <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>x</mi><annotation encoding="application/x-tex">x</annotation></semantics></math>.</p>
<p>Let the interval <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false" form="prefix">[</mo><mn>0</mn><mo>,</mo><mi>l</mi><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">\lbrack 0,l\rbrack</annotation></semantics></math> represent the stick and let
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mo>∼</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> Unif</mtext></mrow><mo stretchy="false" form="prefix">[</mo><mn>0</mn><mo>,</mo><mi>l</mi><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">X\sim\text{ Unif}\lbrack 0,l\rbrack</annotation></semantics></math> be the location where the stick is
broken. Then <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> has density <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>f</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mfrac><mn>1</mn><mi>l</mi></mfrac></mrow><annotation encoding="application/x-tex">f(x) = \frac{1}{l}</annotation></semantics></math> on
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false" form="prefix">[</mo><mn>0</mn><mo>,</mo><mi>l</mi><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">\lbrack 0,l\rbrack</annotation></semantics></math> and 0 elsewhere.</p>
<p>Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>g</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">g(x)</annotation></semantics></math> be the length of the longer piece when the stick is broken at
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>x</mi><annotation encoding="application/x-tex">x</annotation></semantics></math>,</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>g</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mrow><mo stretchy="true" form="prefix">{</mo><mtable><mtr><mtd columnalign="left" style="text-align: left"><mi>l</mi><mo>−</mo><mi>x</mi><mrow><mtext mathvariant="normal">, </mtext><mspace width="0.333em"></mspace></mrow></mtd><mtd columnalign="left" style="text-align: left"><mn>0</mn><mo>≤</mo><mi>x</mi><mo>&lt;</mo><mfrac><mi>l</mi><mn>2</mn></mfrac></mtd></mtr><mtr><mtd columnalign="left" style="text-align: left"><mi>x</mi><mrow><mtext mathvariant="normal">, </mtext><mspace width="0.333em"></mspace></mrow></mtd><mtd columnalign="left" style="text-align: left"><mfrac><mi>l</mi><mn>2</mn></mfrac><mo>≤</mo><mi>x</mi><mo>≤</mo><mi>l</mi></mtd></mtr></mtable></mrow></mrow><annotation encoding="application/x-tex">g(x) = \begin{cases}
l - x\text{,  } &amp; 0 \leq x &lt; \frac{l}{2} \\
x\text{,  } &amp; \frac{l}{2} \leq x \leq l
\end{cases}</annotation></semantics></math></p>
<p>Then <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable><mtr><mtd columnalign="right" style="text-align: right"><mi>E</mi><mrow><mo stretchy="true" form="prefix">[</mo><mi>g</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo stretchy="true" form="postfix">]</mo></mrow><mo>=</mo><msubsup><mo>∫</mo><mrow><mi>−</mi><mi>∞</mi></mrow><mi>∞</mi></msubsup><mi>g</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>f</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>d</mi><mi>x</mi><mo>=</mo><msubsup><mo>∫</mo><mn>0</mn><mfrac><mi>l</mi><mn>2</mn></mfrac></msubsup><mfrac><mrow><mi>l</mi><mo>−</mo><mi>x</mi></mrow><mi>l</mi></mfrac><mi>d</mi><mi>x</mi><mo>+</mo><msubsup><mo>∫</mo><mfrac><mi>l</mi><mn>2</mn></mfrac><mi>l</mi></msubsup><mfrac><mi>x</mi><mi>l</mi></mfrac><mi>d</mi><mi>x</mi></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"><mo>=</mo><mfrac><mn>3</mn><mn>4</mn></mfrac><mi>l</mi></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{array}{r}
E\left\lbrack g(X) \right\rbrack = \int_{- \infty}^{\infty}g(x)f(x)dx = \int_{0}^{\frac{l}{2}}\frac{l - x}{l}dx + \int_{\frac{l}{2}}^{l}\frac{x}{l}dx \\
 = \frac{3}{4}l
\end{array}</annotation></semantics></math></p>
<p>So we expect the longer piece to be <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mfrac><mn>3</mn><mn>4</mn></mfrac><annotation encoding="application/x-tex">\frac{3}{4}</annotation></semantics></math> of the total length,
which is a bit pathological.</p>
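<p>As a quick check on this value, here is a sketch of how the two integrals
evaluate. By symmetry, each half of the stick contributes the same amount:</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable><mtr><mtd columnalign="right" style="text-align: right"><msubsup><mo>∫</mo><mn>0</mn><mfrac><mi>l</mi><mn>2</mn></mfrac></msubsup><mfrac><mrow><mi>l</mi><mo>−</mo><mi>x</mi></mrow><mi>l</mi></mfrac><mi>d</mi><mi>x</mi><mo>=</mo><mfrac><mn>1</mn><mi>l</mi></mfrac><mrow><mo stretchy="true" form="prefix">(</mo><mfrac><msup><mi>l</mi><mn>2</mn></msup><mn>2</mn></mfrac><mo>−</mo><mfrac><msup><mi>l</mi><mn>2</mn></msup><mn>8</mn></mfrac><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mfrac><mn>3</mn><mn>8</mn></mfrac><mi>l</mi></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"><msubsup><mo>∫</mo><mfrac><mi>l</mi><mn>2</mn></mfrac><mi>l</mi></msubsup><mfrac><mi>x</mi><mi>l</mi></mfrac><mi>d</mi><mi>x</mi><mo>=</mo><mfrac><mn>1</mn><mi>l</mi></mfrac><mrow><mo stretchy="true" form="prefix">(</mo><mfrac><msup><mi>l</mi><mn>2</mn></msup><mn>2</mn></mfrac><mo>−</mo><mfrac><msup><mi>l</mi><mn>2</mn></msup><mn>8</mn></mfrac><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mfrac><mn>3</mn><mn>8</mn></mfrac><mi>l</mi></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{array}{r}
\int_{0}^{\frac{l}{2}}\frac{l - x}{l}dx = \frac{1}{l}\left( \frac{l^{2}}{2} - \frac{l^{2}}{8} \right) = \frac{3}{8}l \\
\int_{\frac{l}{2}}^{l}\frac{x}{l}dx = \frac{1}{l}\left( \frac{l^{2}}{2} - \frac{l^{2}}{8} \right) = \frac{3}{8}l
\end{array}</annotation></semantics></math></p>
<p>Adding the two pieces recovers <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mfrac><mn>3</mn><mn>8</mn></mfrac><mi>l</mi><mo>+</mo><mfrac><mn>3</mn><mn>8</mn></mfrac><mi>l</mi><mo>=</mo><mfrac><mn>3</mn><mn>4</mn></mfrac><mi>l</mi></mrow><annotation encoding="application/x-tex">\frac{3}{8}l + \frac{3}{8}l = \frac{3}{4}l</annotation></semantics></math>.</p>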
<h3 id="moments-of-a-random-variable">Moments of a random variable</h3>
<p>We continue discussing expectation but we introduce new terminology.</p>
<p><em>Fact. </em></p>
<p>The <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>n</mi><mtext mathvariant="normal">th</mtext></msup><annotation encoding="application/x-tex">n^{\text{th}}</annotation></semantics></math> moment (or <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>n</mi><mtext mathvariant="normal">th</mtext></msup><annotation encoding="application/x-tex">n^{\text{th}}</annotation></semantics></math> raw moment) of a discrete
random variable <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> with p.m.f. <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>p</mi><mi>X</mi></msub><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">p_{X}(x)</annotation></semantics></math> is the expectation</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>E</mi><mrow><mo stretchy="true" form="prefix">[</mo><msup><mi>X</mi><mi>n</mi></msup><mo stretchy="true" form="postfix">]</mo></mrow><mo>=</mo><munder><mo>∑</mo><mi>k</mi></munder><msup><mi>k</mi><mi>n</mi></msup><msub><mi>p</mi><mi>X</mi></msub><mrow><mo stretchy="true" form="prefix">(</mo><mi>k</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><msub><mi>μ</mi><mi>n</mi></msub></mrow><annotation encoding="application/x-tex">E\left\lbrack X^{n} \right\rbrack = \sum_{k}k^{n}p_{X}(k) = \mu_{n}</annotation></semantics></math></p>
<p>If <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> is continuous, then we have analogously</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>E</mi><mrow><mo stretchy="true" form="prefix">[</mo><msup><mi>X</mi><mi>n</mi></msup><mo stretchy="true" form="postfix">]</mo></mrow><mo>=</mo><msubsup><mo>∫</mo><mrow><mi>−</mi><mi>∞</mi></mrow><mi>∞</mi></msubsup><msup><mi>x</mi><mi>n</mi></msup><msub><mi>f</mi><mi>X</mi></msub><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>d</mi><mi>x</mi><mo>=</mo><msub><mi>μ</mi><mi>n</mi></msub></mrow><annotation encoding="application/x-tex">E\left\lbrack X^{n} \right\rbrack = \int_{- \infty}^{\infty}x^{n}f_{X}(x)dx = \mu_{n}</annotation></semantics></math></p>
<p>The <strong>standard deviation</strong> is denoted by <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>σ</mi><annotation encoding="application/x-tex">\sigma</annotation></semantics></math> and the <strong>variance</strong> by
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>σ</mi><mn>2</mn></msup><annotation encoding="application/x-tex">\sigma^{2}</annotation></semantics></math>, where</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>σ</mi><mn>2</mn></msup><mo>=</mo><msub><mi>μ</mi><mn>2</mn></msub><mo>−</mo><msup><mrow><mo stretchy="true" form="prefix">(</mo><msub><mi>μ</mi><mn>1</mn></msub><mo stretchy="true" form="postfix">)</mo></mrow><mn>2</mn></msup></mrow><annotation encoding="application/x-tex">\sigma^{2} = \mu_{2} - \left( \mu_{1} \right)^{2}</annotation></semantics></math></p>
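<p>A short derivation sketch of this identity, assuming the usual definition of
the variance as <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>σ</mi><mn>2</mn></msup><mo>=</mo><mi>E</mi><mrow><mo stretchy="true" form="prefix">[</mo><msup><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>−</mo><mi>μ</mi><mo stretchy="true" form="postfix">)</mo></mrow><mn>2</mn></msup><mo stretchy="true" form="postfix">]</mo></mrow></mrow><annotation encoding="application/x-tex">\sigma^{2} = E\left\lbrack (X - \mu)^{2} \right\rbrack</annotation></semantics></math> with
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>μ</mi><mo>=</mo><msub><mi>μ</mi><mn>1</mn></msub><mo>=</mo><mi>E</mi><mo stretchy="false" form="prefix">[</mo><mi>X</mi><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">\mu = \mu_{1} = E\lbrack X\rbrack</annotation></semantics></math>: expand the square and use linearity of
expectation.</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable><mtr><mtd columnalign="right" style="text-align: right"><msup><mi>σ</mi><mn>2</mn></msup><mo>=</mo><mi>E</mi><mrow><mo stretchy="true" form="prefix">[</mo><msup><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>−</mo><mi>μ</mi><mo stretchy="true" form="postfix">)</mo></mrow><mn>2</mn></msup><mo stretchy="true" form="postfix">]</mo></mrow><mo>=</mo><mi>E</mi><mrow><mo stretchy="true" form="prefix">[</mo><msup><mi>X</mi><mn>2</mn></msup><mo stretchy="true" form="postfix">]</mo></mrow><mo>−</mo><mn>2</mn><mi>μ</mi><mi>E</mi><mo stretchy="false" form="prefix">[</mo><mi>X</mi><mo stretchy="false" form="postfix">]</mo><mo>+</mo><msup><mi>μ</mi><mn>2</mn></msup></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"><mo>=</mo><msub><mi>μ</mi><mn>2</mn></msub><mo>−</mo><mn>2</mn><msubsup><mi>μ</mi><mn>1</mn><mn>2</mn></msubsup><mo>+</mo><msubsup><mi>μ</mi><mn>1</mn><mn>2</mn></msubsup><mo>=</mo><msub><mi>μ</mi><mn>2</mn></msub><mo>−</mo><msup><mrow><mo stretchy="true" form="prefix">(</mo><msub><mi>μ</mi><mn>1</mn></msub><mo stretchy="true" form="postfix">)</mo></mrow><mn>2</mn></msup></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{array}{r}
\sigma^{2} = E\left\lbrack (X - \mu)^{2} \right\rbrack = E\left\lbrack X^{2} \right\rbrack - 2\mu E\lbrack X\rbrack + \mu^{2} \\
 = \mu_{2} - 2\mu_{1}^{2} + \mu_{1}^{2} = \mu_{2} - \left( \mu_{1} \right)^{2}
\end{array}</annotation></semantics></math></p>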
<p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>μ</mi><mn>3</mn></msub><annotation encoding="application/x-tex">\mu_{3}</annotation></semantics></math> is used to measure “skewness” / asymmetry of a distribution.
For example, the normal distribution is very symmetric.</p>
<p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>μ</mi><mn>4</mn></msub><annotation encoding="application/x-tex">\mu_{4}</annotation></semantics></math> is used to measure kurtosis/peakedness of a distribution.</p>
<h3 id="central-moments">Central moments</h3>
<p>Previously we discussed “raw moments.” Be careful not to confuse them
with <em>central moments</em>.</p>
<p><em>Fact. </em></p>
<p>The <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>n</mi><mtext mathvariant="normal">th</mtext></msup><annotation encoding="application/x-tex">n^{\text{th}}</annotation></semantics></math> central moment of a discrete random variable <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math>
with p.m.f. <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>p</mi><mi>X</mi></msub><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">p_{X}(x)</annotation></semantics></math> is the expected value of the difference about the
mean raised to the <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>n</mi><mtext mathvariant="normal">th</mtext></msup><annotation encoding="application/x-tex">n^{\text{th}}</annotation></semantics></math> power</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>E</mi><mrow><mo stretchy="true" form="prefix">[</mo><msup><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>−</mo><mi>μ</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>n</mi></msup><mo stretchy="true" form="postfix">]</mo></mrow><mo>=</mo><munder><mo>∑</mo><mi>k</mi></munder><msup><mrow><mo stretchy="true" form="prefix">(</mo><mi>k</mi><mo>−</mo><mi>μ</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>n</mi></msup><msub><mi>p</mi><mi>X</mi></msub><mrow><mo stretchy="true" form="prefix">(</mo><mi>k</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>μ</mi><msub><mi>′</mi><mi>n</mi></msub></mrow><annotation encoding="application/x-tex">E\left\lbrack (X - \mu)^{n} \right\rbrack = \sum_{k}(k - \mu)^{n}p_{X}(k) = \mu\prime_{n}</annotation></semantics></math></p>
<p>And of course in the continuous case,</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>E</mi><mrow><mo stretchy="true" form="prefix">[</mo><msup><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>−</mo><mi>μ</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>n</mi></msup><mo stretchy="true" form="postfix">]</mo></mrow><mo>=</mo><msubsup><mo>∫</mo><mrow><mi>−</mi><mi>∞</mi></mrow><mi>∞</mi></msubsup><msup><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo>−</mo><mi>μ</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>n</mi></msup><msub><mi>f</mi><mi>X</mi></msub><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>d</mi><mi>x</mi><mo>=</mo><mi>μ</mi><msub><mi>′</mi><mi>n</mi></msub></mrow><annotation encoding="application/x-tex">E\left\lbrack (X - \mu)^{n} \right\rbrack = \int_{- \infty}^{\infty}(x - \mu)^{n}f_{X}(x)dx = \mu\prime_{n}</annotation></semantics></math></p>
<p>In particular,</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable><mtr><mtd columnalign="right" style="text-align: right"><mi>μ</mi><msub><mi>′</mi><mn>1</mn></msub><mo>=</mo><mi>E</mi><mrow><mo stretchy="true" form="prefix">[</mo><msup><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>−</mo><mi>μ</mi><mo stretchy="true" form="postfix">)</mo></mrow><mn>1</mn></msup><mo stretchy="true" form="postfix">]</mo></mrow><mo>=</mo><msubsup><mo>∫</mo><mrow><mi>−</mi><mi>∞</mi></mrow><mi>∞</mi></msubsup><msup><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo>−</mo><mi>μ</mi><mo stretchy="true" form="postfix">)</mo></mrow><mn>1</mn></msup><msub><mi>f</mi><mi>X</mi></msub><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>d</mi><mi>x</mi></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"><mo>=</mo><msubsup><mo>∫</mo><mrow><mi>−</mi><mi>∞</mi></mrow><mi>∞</mi></msubsup><mi>x</mi><msub><mi>f</mi><mi>X</mi></msub><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>d</mi><mi>x</mi><mo>−</mo><msubsup><mo>∫</mo><mrow><mi>−</mi><mi>∞</mi></mrow><mi>∞</mi></msubsup><mi>μ</mi><msub><mi>f</mi><mi>X</mi></msub><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>d</mi><mi>x</mi><mo>=</mo><mi>μ</mi><mo>−</mo><mi>μ</mi><mo>⋅</mo><mn>1</mn><mo>=</mo><mn>0</mn></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"><mi>μ</mi><msub><mi>′</mi><mn>2</mn></msub><mo>=</mo><mi>E</mi><mrow><mo stretchy="true" form="prefix">[</mo><msup><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>−</mo><mi>μ</mi><mo stretchy="true" form="postfix">)</mo></mrow><mn>2</mn></msup><mo stretchy="true" form="postfix">]</mo></mrow><mo>=</mo><msubsup><mi>σ</mi><mi>X</mi><mn>2</mn></msubsup><mo>=</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> Var</mtext></mrow><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo stretchy="true" form="postfix">)</mo></mrow></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{array}{r}
\mu\prime_{1} = E\left\lbrack (X - \mu)^{1} \right\rbrack = \int_{- \infty}^{\infty}(x - \mu)^{1}f_{X}(x)dx \\
 = \int_{- \infty}^{\infty}xf_{X}(x)dx - \int_{- \infty}^{\infty}\mu f_{X}(x)dx = \mu - \mu \cdot 1 = 0 \\
\mu\prime_{2} = E\left\lbrack (X - \mu)^{2} \right\rbrack = \sigma_{X}^{2} = \text{ Var}(X)
\end{array}</annotation></semantics></math></p>
<p><em>Example. </em></p>
<p>Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>Y</mi><annotation encoding="application/x-tex">Y</annotation></semantics></math> be a uniformly chosen integer from
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="true" form="prefix">{</mo><mn>0</mn><mo>,</mo><mn>1</mn><mo>,</mo><mn>2</mn><mo>,</mo><mi>…</mi><mo>,</mo><mi>m</mi><mo stretchy="true" form="postfix">}</mo></mrow><annotation encoding="application/x-tex">\left\{ 0,1,2,\ldots,m \right\}</annotation></semantics></math>. Find the first and second moment of
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>Y</mi><annotation encoding="application/x-tex">Y</annotation></semantics></math>.</p>
<p>The p.m.f. of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>Y</mi><annotation encoding="application/x-tex">Y</annotation></semantics></math> is <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>p</mi><mi>Y</mi></msub><mrow><mo stretchy="true" form="prefix">(</mo><mi>k</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mfrac><mn>1</mn><mrow><mi>m</mi><mo>+</mo><mn>1</mn></mrow></mfrac></mrow><annotation encoding="application/x-tex">p_{Y}(k) = \frac{1}{m + 1}</annotation></semantics></math> for
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>k</mi><mo>∈</mo><mrow><mo stretchy="true" form="prefix">{</mo><mn>0</mn><mo>,</mo><mn>1</mn><mo>,</mo><mi>…</mi><mo>,</mo><mi>m</mi><mo stretchy="true" form="postfix">}</mo></mrow></mrow><annotation encoding="application/x-tex">k \in \left\{ 0,1,\ldots,m \right\}</annotation></semantics></math>. Thus,</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable><mtr><mtd columnalign="right" style="text-align: right"><mi>E</mi><mo stretchy="false" form="prefix">[</mo><mi>Y</mi><mo stretchy="false" form="postfix">]</mo><mo>=</mo><munderover><mo>∑</mo><mrow><mi>k</mi><mo>=</mo><mn>0</mn></mrow><mi>m</mi></munderover><mi>k</mi><mfrac><mn>1</mn><mrow><mi>m</mi><mo>+</mo><mn>1</mn></mrow></mfrac><mo>=</mo><mfrac><mn>1</mn><mrow><mi>m</mi><mo>+</mo><mn>1</mn></mrow></mfrac><munderover><mo>∑</mo><mrow><mi>k</mi><mo>=</mo><mn>0</mn></mrow><mi>m</mi></munderover><mi>k</mi></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"><mo>=</mo><mfrac><mi>m</mi><mn>2</mn></mfrac></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{array}{r}
E\lbrack Y\rbrack = \sum_{k = 0}^{m}k\frac{1}{m + 1} = \frac{1}{m + 1}\sum_{k = 0}^{m}k \\
 = \frac{m}{2}
\end{array}</annotation></semantics></math></p>
<p>Then,</p>
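<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>E</mi><mrow><mo stretchy="true" form="prefix">[</mo><msup><mi>Y</mi><mn>2</mn></msup><mo stretchy="true" form="postfix">]</mo></mrow><mo>=</mo><munderover><mo>∑</mo><mrow><mi>k</mi><mo>=</mo><mn>0</mn></mrow><mi>m</mi></munderover><msup><mi>k</mi><mn>2</mn></msup><mfrac><mn>1</mn><mrow><mi>m</mi><mo>+</mo><mn>1</mn></mrow></mfrac><mo>=</mo><mfrac><mn>1</mn><mrow><mi>m</mi><mo>+</mo><mn>1</mn></mrow></mfrac><munderover><mo>∑</mo><mrow><mi>k</mi><mo>=</mo><mn>0</mn></mrow><mi>m</mi></munderover><msup><mi>k</mi><mn>2</mn></msup><mo>=</mo><mfrac><mn>1</mn><mrow><mi>m</mi><mo>+</mo><mn>1</mn></mrow></mfrac><mo>⋅</mo><mfrac><mrow><mi>m</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>m</mi><mo>+</mo><mn>1</mn><mo stretchy="true" form="postfix">)</mo></mrow><mrow><mo stretchy="true" form="prefix">(</mo><mn>2</mn><mi>m</mi><mo>+</mo><mn>1</mn><mo stretchy="true" form="postfix">)</mo></mrow></mrow><mn>6</mn></mfrac><mo>=</mo><mfrac><mrow><mi>m</mi><mrow><mo stretchy="true" form="prefix">(</mo><mn>2</mn><mi>m</mi><mo>+</mo><mn>1</mn><mo stretchy="true" form="postfix">)</mo></mrow></mrow><mn>6</mn></mfrac></mrow><annotation encoding="application/x-tex">E\left\lbrack Y^{2} \right\rbrack = \sum_{k = 0}^{m}k^{2}\frac{1}{m + 1} = \frac{1}{m + 1}\sum_{k = 0}^{m}k^{2} = \frac{1}{m + 1} \cdot \frac{m(m + 1)(2m + 1)}{6} = \frac{m(2m + 1)}{6}</annotation></semantics></math></p>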
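<p>As a quick worked check, the two moments can be combined through
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>σ</mi><mn>2</mn></msup><mo>=</mo><msub><mi>μ</mi><mn>2</mn></msub><mo>−</mo><msup><mrow><mo stretchy="true" form="prefix">(</mo><msub><mi>μ</mi><mn>1</mn></msub><mo stretchy="true" form="postfix">)</mo></mrow><mn>2</mn></msup></mrow><annotation encoding="application/x-tex">\sigma^{2} = \mu_{2} - \left( \mu_{1} \right)^{2}</annotation></semantics></math> to get the variance of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>Y</mi><annotation encoding="application/x-tex">Y</annotation></semantics></math>:</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> Var</mtext></mrow><mrow><mo stretchy="true" form="prefix">(</mo><mi>Y</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mfrac><mrow><mi>m</mi><mrow><mo stretchy="true" form="prefix">(</mo><mn>2</mn><mi>m</mi><mo>+</mo><mn>1</mn><mo stretchy="true" form="postfix">)</mo></mrow></mrow><mn>6</mn></mfrac><mo>−</mo><mfrac><msup><mi>m</mi><mn>2</mn></msup><mn>4</mn></mfrac><mo>=</mo><mfrac><mrow><mn>4</mn><msup><mi>m</mi><mn>2</mn></msup><mo>+</mo><mn>2</mn><mi>m</mi><mo>−</mo><mn>3</mn><msup><mi>m</mi><mn>2</mn></msup></mrow><mn>12</mn></mfrac><mo>=</mo><mfrac><mrow><mi>m</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>m</mi><mo>+</mo><mn>2</mn><mo stretchy="true" form="postfix">)</mo></mrow></mrow><mn>12</mn></mfrac></mrow><annotation encoding="application/x-tex">\text{ Var}(Y) = \frac{m(2m + 1)}{6} - \frac{m^{2}}{4} = \frac{4m^{2} + 2m - 3m^{2}}{12} = \frac{m(m + 2)}{12}</annotation></semantics></math></p>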
<p><em>Example. </em></p>
<p>Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>c</mi><mo>&gt;</mo><mn>0</mn></mrow><annotation encoding="application/x-tex">c &gt; 0</annotation></semantics></math> and let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>U</mi><annotation encoding="application/x-tex">U</annotation></semantics></math> be a uniform random variable on the interval
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false" form="prefix">[</mo><mn>0</mn><mo>,</mo><mi>c</mi><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">\lbrack 0,c\rbrack</annotation></semantics></math>. Find the <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>n</mi><mtext mathvariant="normal">th</mtext></msup><annotation encoding="application/x-tex">n^{\text{th}}</annotation></semantics></math> moment for <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>U</mi><annotation encoding="application/x-tex">U</annotation></semantics></math> for all
positive integers <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>n</mi><annotation encoding="application/x-tex">n</annotation></semantics></math>.</p>
<p>The density function of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>U</mi><annotation encoding="application/x-tex">U</annotation></semantics></math> is</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>f</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mrow><mo stretchy="true" form="prefix">{</mo><mtable><mtr><mtd columnalign="left" style="text-align: left"><mfrac><mn>1</mn><mi>c</mi></mfrac><mrow><mtext mathvariant="normal">, if </mtext><mspace width="0.333em"></mspace></mrow></mtd><mtd columnalign="left" style="text-align: left"><mi>x</mi><mo>∈</mo><mo stretchy="false" form="prefix">[</mo><mn>0</mn><mo>,</mo><mi>c</mi><mo stretchy="false" form="postfix">]</mo></mtd></mtr><mtr><mtd columnalign="left" style="text-align: left"><mn>0</mn><mrow><mtext mathvariant="normal">, </mtext><mspace width="0.333em"></mspace></mrow></mtd><mtd columnalign="left" style="text-align: left"><mtext mathvariant="normal">otherwise</mtext></mtd></mtr></mtable></mrow></mrow><annotation encoding="application/x-tex">f(x) = \begin{cases}
\frac{1}{c}\text{, if } &amp; x \in \lbrack 0,c\rbrack \\
0\text{,  } &amp; \text{otherwise}
\end{cases}</annotation></semantics></math></p>
<p>Therefore the <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>n</mi><mtext mathvariant="normal">th</mtext></msup><annotation encoding="application/x-tex">n^{\text{th}}</annotation></semantics></math> moment of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>U</mi><annotation encoding="application/x-tex">U</annotation></semantics></math> is,</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>E</mi><mrow><mo stretchy="true" form="prefix">[</mo><msup><mi>U</mi><mi>n</mi></msup><mo stretchy="true" form="postfix">]</mo></mrow><mo>=</mo><msubsup><mo>∫</mo><mrow><mi>−</mi><mi>∞</mi></mrow><mi>∞</mi></msubsup><msup><mi>x</mi><mi>n</mi></msup><mi>f</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>d</mi><mi>x</mi></mrow><annotation encoding="application/x-tex">E\left\lbrack U^{n} \right\rbrack = \int_{- \infty}^{\infty}x^{n}f(x)dx</annotation></semantics></math></p>
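<p>Since the density vanishes outside <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false" form="prefix">[</mo><mn>0</mn><mo>,</mo><mi>c</mi><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">\lbrack 0,c\rbrack</annotation></semantics></math>, the integral reduces to one over
that interval and can be evaluated directly:</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>E</mi><mrow><mo stretchy="true" form="prefix">[</mo><msup><mi>U</mi><mi>n</mi></msup><mo stretchy="true" form="postfix">]</mo></mrow><mo>=</mo><msubsup><mo>∫</mo><mn>0</mn><mi>c</mi></msubsup><mfrac><msup><mi>x</mi><mi>n</mi></msup><mi>c</mi></mfrac><mi>d</mi><mi>x</mi><mo>=</mo><mfrac><mn>1</mn><mi>c</mi></mfrac><mo>⋅</mo><mfrac><msup><mi>c</mi><mrow><mi>n</mi><mo>+</mo><mn>1</mn></mrow></msup><mrow><mi>n</mi><mo>+</mo><mn>1</mn></mrow></mfrac><mo>=</mo><mfrac><msup><mi>c</mi><mi>n</mi></msup><mrow><mi>n</mi><mo>+</mo><mn>1</mn></mrow></mfrac></mrow><annotation encoding="application/x-tex">E\left\lbrack U^{n} \right\rbrack = \int_{0}^{c}\frac{x^{n}}{c}dx = \frac{1}{c} \cdot \frac{c^{n + 1}}{n + 1} = \frac{c^{n}}{n + 1}</annotation></semantics></math></p>
<p>For <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>n</mi><mo>=</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">n = 1</annotation></semantics></math> this gives <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mfrac><mi>c</mi><mn>2</mn></mfrac><annotation encoding="application/x-tex">\frac{c}{2}</annotation></semantics></math>, the midpoint of the interval.</p>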
<p><em>Example. </em></p>
<p>Suppose the random variable <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mo>∼</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> Exp</mtext></mrow><mrow><mo stretchy="true" form="prefix">(</mo><mi>λ</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">X\sim\text{ Exp}(\lambda)</annotation></semantics></math>. Find the second
moment of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math>.</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable><mtr><mtd columnalign="right" style="text-align: right"><mi>E</mi><mrow><mo stretchy="true" form="prefix">[</mo><msup><mi>X</mi><mn>2</mn></msup><mo stretchy="true" form="postfix">]</mo></mrow><mo>=</mo><msubsup><mo>∫</mo><mn>0</mn><mi>∞</mi></msubsup><msup><mi>x</mi><mn>2</mn></msup><mi>λ</mi><msup><mi>e</mi><mrow><mi>−</mi><mi>λ</mi><mi>x</mi></mrow></msup><mi>d</mi><mi>x</mi></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"><mo>=</mo><mfrac><mn>1</mn><msup><mi>λ</mi><mn>2</mn></msup></mfrac><msubsup><mo>∫</mo><mn>0</mn><mi>∞</mi></msubsup><msup><mi>u</mi><mn>2</mn></msup><msup><mi>e</mi><mrow><mi>−</mi><mi>u</mi></mrow></msup><mi>d</mi><mi>u</mi></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"><mo>=</mo><mfrac><mn>1</mn><msup><mi>λ</mi><mn>2</mn></msup></mfrac><mi>Γ</mi><mrow><mo stretchy="true" form="prefix">(</mo><mn>2</mn><mo>+</mo><mn>1</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mfrac><mrow><mn>2</mn><mi>!</mi></mrow><msup><mi>λ</mi><mn>2</mn></msup></mfrac></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{array}{r}
E\left\lbrack X^{2} \right\rbrack = \int_{0}^{\infty}x^{2}\lambda e^{- \lambda x}dx \\
 = \frac{1}{\lambda^{2}}\int_{0}^{\infty}u^{2}e^{- u}du \\
 = \frac{1}{\lambda^{2}}\Gamma(2 + 1) = \frac{2!}{\lambda^{2}}
\end{array}</annotation></semantics></math></p>
<p><em>Fact. </em></p>
<p>In general, the <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>n</mi><mtext mathvariant="normal">th</mtext></msup><annotation encoding="application/x-tex">n^{\text{th}}</annotation></semantics></math> moment of
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mo>∼</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> Exp</mtext></mrow><mrow><mo stretchy="true" form="prefix">(</mo><mi>λ</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">X\sim\text{ Exp}(\lambda)</annotation></semantics></math> is
<math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>E</mi><mrow><mo stretchy="true" form="prefix">[</mo><msup><mi>X</mi><mi>n</mi></msup><mo stretchy="true" form="postfix">]</mo></mrow><mo>=</mo><msubsup><mo>∫</mo><mn>0</mn><mi>∞</mi></msubsup><msup><mi>x</mi><mi>n</mi></msup><mi>λ</mi><msup><mi>e</mi><mrow><mi>−</mi><mi>λ</mi><mi>x</mi></mrow></msup><mi>d</mi><mi>x</mi><mo>=</mo><mfrac><mrow><mi>n</mi><mi>!</mi></mrow><msup><mi>λ</mi><mi>n</mi></msup></mfrac></mrow><annotation encoding="application/x-tex">E\left\lbrack X^{n} \right\rbrack = \int_{0}^{\infty}x^{n}\lambda e^{- \lambda x}dx = \frac{n!}{\lambda^{n}}</annotation></semantics></math></p>
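<p>As one application, here is a sketch of the variance of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math>. Taking
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>n</mi><mo>=</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">n = 1</annotation></semantics></math> gives <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>E</mi><mo stretchy="false" form="prefix">[</mo><mi>X</mi><mo stretchy="false" form="postfix">]</mo><mo>=</mo><mfrac><mn>1</mn><mi>λ</mi></mfrac></mrow><annotation encoding="application/x-tex">E\lbrack X\rbrack = \frac{1}{\lambda}</annotation></semantics></math>, and combining this with the second
moment through <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>σ</mi><mn>2</mn></msup><mo>=</mo><msub><mi>μ</mi><mn>2</mn></msub><mo>−</mo><msup><mrow><mo stretchy="true" form="prefix">(</mo><msub><mi>μ</mi><mn>1</mn></msub><mo stretchy="true" form="postfix">)</mo></mrow><mn>2</mn></msup></mrow><annotation encoding="application/x-tex">\sigma^{2} = \mu_{2} - \left( \mu_{1} \right)^{2}</annotation></semantics></math> yields</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> Var</mtext></mrow><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>E</mi><mrow><mo stretchy="true" form="prefix">[</mo><msup><mi>X</mi><mn>2</mn></msup><mo stretchy="true" form="postfix">]</mo></mrow><mo>−</mo><msup><mrow><mo stretchy="true" form="prefix">(</mo><mi>E</mi><mo stretchy="false" form="prefix">[</mo><mi>X</mi><mo stretchy="false" form="postfix">]</mo><mo stretchy="true" form="postfix">)</mo></mrow><mn>2</mn></msup><mo>=</mo><mfrac><mn>2</mn><msup><mi>λ</mi><mn>2</mn></msup></mfrac><mo>−</mo><mfrac><mn>1</mn><msup><mi>λ</mi><mn>2</mn></msup></mfrac><mo>=</mo><mfrac><mn>1</mn><msup><mi>λ</mi><mn>2</mn></msup></mfrac></mrow><annotation encoding="application/x-tex">\text{ Var}(X) = E\left\lbrack X^{2} \right\rbrack - \left( E\lbrack X\rbrack \right)^{2} = \frac{2}{\lambda^{2}} - \frac{1}{\lambda^{2}} = \frac{1}{\lambda^{2}}</annotation></semantics></math></p>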
<h3 id="median-and-quartiles">Median and quartiles</h3>
<p>When a random variable has rare (abnormal) values, its expectation may
be a bad indicator of where the center of the distribution lies.</p>
<p><em>Definition. </em></p>
<p>The <strong>median</strong> of a random variable <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> is any real value <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>m</mi><annotation encoding="application/x-tex">m</annotation></semantics></math> that
satisfies</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>≥</mo><mi>m</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>≥</mo><mfrac><mn>1</mn><mn>2</mn></mfrac><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> and </mtext><mspace width="0.333em"></mspace></mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>≤</mo><mi>m</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>≥</mo><mfrac><mn>1</mn><mn>2</mn></mfrac></mrow><annotation encoding="application/x-tex">P(X \geq m) \geq \frac{1}{2}\text{ and }P(X \leq m) \geq \frac{1}{2}</annotation></semantics></math></p>
<p>With half the probability on both <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="true" form="prefix">{</mo><mi>X</mi><mo>≤</mo><mi>m</mi><mo stretchy="true" form="postfix">}</mo></mrow><annotation encoding="application/x-tex">\left\{ X \leq m \right\}</annotation></semantics></math> and
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="true" form="prefix">{</mo><mi>X</mi><mo>≥</mo><mi>m</mi><mo stretchy="true" form="postfix">}</mo></mrow><annotation encoding="application/x-tex">\left\{ X \geq m \right\}</annotation></semantics></math>, the median is representative of the
midpoint of the distribution. We say that the median is more <em>robust</em>
because it is less affected by outliers. It is not necessarily unique.</p>
<p><em>Example. </em></p>
<p>Let <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> be uniformly distributed on the set
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="true" form="prefix">{</mo><mi>−</mi><mn>100</mn><mo>,</mo><mn>1</mn><mo>,</mo><mn>2</mn><mo>,</mo><mn>3</mn><mo>,</mo><mi>…</mi><mo>,</mo><mn>9</mn><mo stretchy="true" form="postfix">}</mo></mrow><annotation encoding="application/x-tex">\left\{ - 100,1,2,3,\ldots,9 \right\}</annotation></semantics></math>, so <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math> has probability mass
function <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>p</mi><mi>X</mi></msub><mrow><mo stretchy="true" form="prefix">(</mo><mi>−</mi><mn>100</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><msub><mi>p</mi><mi>X</mi></msub><mrow><mo stretchy="true" form="prefix">(</mo><mn>1</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>⋯</mi><mo>=</mo><msub><mi>p</mi><mi>X</mi></msub><mrow><mo stretchy="true" form="prefix">(</mo><mn>9</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mfrac><mn>1</mn><mn>10</mn></mfrac></mrow><annotation encoding="application/x-tex">p_{X}( - 100) = p_{X}(1) = \cdots = p_{X}(9) = \frac{1}{10}</annotation></semantics></math></p>
<p>Find the expected value and median of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>X</mi><annotation encoding="application/x-tex">X</annotation></semantics></math>.</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>E</mi><mo stretchy="false" form="prefix">[</mo><mi>X</mi><mo stretchy="false" form="postfix">]</mo><mo>=</mo><mrow><mo stretchy="true" form="prefix">(</mo><mi>−</mi><mn>100</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>⋅</mo><mfrac><mn>1</mn><mn>10</mn></mfrac><mo>+</mo><mrow><mo stretchy="true" form="prefix">(</mo><mn>1</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>⋅</mo><mfrac><mn>1</mn><mn>10</mn></mfrac><mo>+</mo><mi>⋯</mi><mo>+</mo><mrow><mo stretchy="true" form="prefix">(</mo><mn>9</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>⋅</mo><mfrac><mn>1</mn><mn>10</mn></mfrac><mo>=</mo><mi>−</mi><mn>5.5</mn></mrow><annotation encoding="application/x-tex">E\lbrack X\rbrack = ( - 100) \cdot \frac{1}{10} + (1) \cdot \frac{1}{10} + \cdots + (9) \cdot \frac{1}{10} = - 5.5</annotation></semantics></math></p>
<p>The median, by contrast, is any number <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>m</mi><mo>∈</mo><mo stretchy="false" form="prefix">[</mo><mn>4</mn><mo>,</mo><mn>5</mn><mo stretchy="false" form="postfix">]</mo></mrow><annotation encoding="application/x-tex">m \in \lbrack 4,5\rbrack</annotation></semantics></math>.</p>
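<p>To see why, one can check the definition directly at the two endpoints, using
that each of the ten values carries probability <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mfrac><mn>1</mn><mn>10</mn></mfrac><annotation encoding="application/x-tex">\frac{1}{10}</annotation></semantics></math>:</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable><mtr><mtd columnalign="right" style="text-align: right"><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>≤</mo><mn>4</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mfrac><mn>5</mn><mn>10</mn></mfrac><mo>≥</mo><mfrac><mn>1</mn><mn>2</mn></mfrac><mo>,</mo><mspace width="1.0em"></mspace><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>≥</mo><mn>4</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mfrac><mn>6</mn><mn>10</mn></mfrac><mo>≥</mo><mfrac><mn>1</mn><mn>2</mn></mfrac></mtd></mtr><mtr><mtd columnalign="right" style="text-align: right"><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>≤</mo><mn>5</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mfrac><mn>6</mn><mn>10</mn></mfrac><mo>≥</mo><mfrac><mn>1</mn><mn>2</mn></mfrac><mo>,</mo><mspace width="1.0em"></mspace><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>≥</mo><mn>5</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mfrac><mn>5</mn><mn>10</mn></mfrac><mo>≥</mo><mfrac><mn>1</mn><mn>2</mn></mfrac></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{array}{r}
P(X \leq 4) = \frac{5}{10} \geq \frac{1}{2},\quad P(X \geq 4) = \frac{6}{10} \geq \frac{1}{2} \\
P(X \leq 5) = \frac{6}{10} \geq \frac{1}{2},\quad P(X \geq 5) = \frac{5}{10} \geq \frac{1}{2}
\end{array}</annotation></semantics></math></p>
<p>Any <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>m</mi><mo>&lt;</mo><mn>4</mn></mrow><annotation encoding="application/x-tex">m &lt; 4</annotation></semantics></math> fails because then <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>≤</mo><mi>m</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>≤</mo><mfrac><mn>4</mn><mn>10</mn></mfrac></mrow><annotation encoding="application/x-tex">P(X \leq m) \leq \frac{4}{10}</annotation></semantics></math>, and any
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>m</mi><mo>&gt;</mo><mn>5</mn></mrow><annotation encoding="application/x-tex">m &gt; 5</annotation></semantics></math> fails because then <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>X</mi><mo>≥</mo><mi>m</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>≤</mo><mfrac><mn>4</mn><mn>10</mn></mfrac></mrow><annotation encoding="application/x-tex">P(X \geq m) \leq \frac{4}{10}</annotation></semantics></math>.</p>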
<p>The median reflects the fact that 90% of the values (and of the probability mass) lie
in the range <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>1</mn><mo>,</mo><mn>2</mn><mo>,</mo><mi>…</mi><mo>,</mo><mn>9</mn></mrow><annotation encoding="application/x-tex">1,2,\ldots,9</annotation></semantics></math>, while the mean is heavily influenced by the
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>−</mi><mn>100</mn></mrow><annotation encoding="application/x-tex">- 100</annotation></semantics></math> value.</p></main>
</article>
]]></summary>
</entry>
<entry>
    <title>An assortment of preliminaries on linear algebra</title>
    <link href="https://blog.youwen.dev/an-assortment-of-preliminaries-on-linear-algebra.html" />
    <id>https://blog.youwen.dev/an-assortment-of-preliminaries-on-linear-algebra.html</id>
    <published>2025-02-15T00:00:00Z</published>
    <updated>2025-02-15T00:00:00Z</updated>
    <summary type="html"><![CDATA[<article>
  <header>
    <h1 class="text-4xl">
      <a href="./an-assortment-of-preliminaries-on-linear-algebra.html">An assortment of preliminaries on linear algebra</a>
    </h1>
    <p
      class="mb-1 mt-2 italic font-light text-lg text-accent-light dark:text-accent-dark"
    >
      and also a test for pandoc
    </p>
    <div class="mt-2">2025-02-15</div>
    <div class="mt-1 text-sm">
      
    </div>
  </header>
  <main class="post mt-4"><p>This document was written entirely in <a href="https://typst.app/">Typst</a> and
directly translated to this file by Pandoc. It serves as a proof of concept of
a way to do static site generation from Typst files instead of Markdown.</p>
<hr />
<p>I figured I should write this stuff down before I forgot it.</p>
<h1 id="basic-notions">Basic Notions</h1>
<h2 id="vector-spaces">Vector spaces</h2>
<p>Before we can understand vectors, we need to first discuss <em>vector
spaces</em>. Thus far, you have likely encountered vectors primarily in
physics classes, generally in the two-dimensional plane. You may
conceptualize them as arrows in space. For vectors of size <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo>&gt;</mo><mn>3</mn></mrow><annotation encoding="application/x-tex">&gt; 3</annotation></semantics></math>, a
hand-waving argument is made that they are essentially just arrows in
higher-dimensional spaces.</p>
<p>It is helpful to take a step back from this primitive geometric
understanding of the vector. Let us build up a rigorous idea of vectors
from first principles.</p>
<h3 id="vector-axioms">Vector axioms</h3>
<p>The so-called <em>axioms</em> of a <em>vector space</em> (which we’ll call the vector
space <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>V</mi><annotation encoding="application/x-tex">V</annotation></semantics></math>) are as follows:</p>
<ol>
<li><p>Commutativity: <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>u</mi><mo>+</mo><mi>v</mi><mo>=</mo><mi>v</mi><mo>+</mo><mi>u</mi><mo>,</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> </mtext><mspace width="0.333em"></mspace></mrow><mo>∀</mo><mi>u</mi><mo>,</mo><mi>v</mi><mo>∈</mo><mi>V</mi></mrow><annotation encoding="application/x-tex">u + v = v + u,\text{   }\forall u,v \in V</annotation></semantics></math></p></li>
<li><p>Associativity:
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mo stretchy="true" form="prefix">(</mo><mi>u</mi><mo>+</mo><mi>v</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>+</mo><mi>w</mi><mo>=</mo><mi>u</mi><mo>+</mo><mrow><mo stretchy="true" form="prefix">(</mo><mi>v</mi><mo>+</mo><mi>w</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>,</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> </mtext><mspace width="0.333em"></mspace></mrow><mo>∀</mo><mi>u</mi><mo>,</mo><mi>v</mi><mo>,</mo><mi>w</mi><mo>∈</mo><mi>V</mi></mrow><annotation encoding="application/x-tex">(u + v) + w = u + (v + w),\text{   }\forall u,v,w \in V</annotation></semantics></math></p></li>
<li><p>Zero vector: <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mo>∃</mo><annotation encoding="application/x-tex">\exists</annotation></semantics></math> a special vector, denoted <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mn>0</mn><annotation encoding="application/x-tex">0</annotation></semantics></math>, such that
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>v</mi><mo>+</mo><mn>0</mn><mo>=</mo><mi>v</mi><mo>,</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> </mtext><mspace width="0.333em"></mspace></mrow><mo>∀</mo><mi>v</mi><mo>∈</mo><mi>V</mi></mrow><annotation encoding="application/x-tex">v + 0 = v,\text{   }\forall v \in V</annotation></semantics></math></p></li>
<li><p>Additive inverse:
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo>∀</mo><mi>v</mi><mo>∈</mo><mi>V</mi><mo>,</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> </mtext><mspace width="0.333em"></mspace></mrow><mo>∃</mo><mi>w</mi><mo>∈</mo><mi>V</mi><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> such that </mtext><mspace width="0.333em"></mspace></mrow><mi>v</mi><mo>+</mo><mi>w</mi><mo>=</mo><mn>0</mn></mrow><annotation encoding="application/x-tex">\forall v \in V,\text{   }\exists w \in V\text{ such that }v + w = 0</annotation></semantics></math>.
Such an additive inverse is generally denoted <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>−</mi><mi>v</mi></mrow><annotation encoding="application/x-tex">- v</annotation></semantics></math></p></li>
<li><p>Multiplicative identity: <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>1</mn><mi>v</mi><mo>=</mo><mi>v</mi><mo>,</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> </mtext><mspace width="0.333em"></mspace></mrow><mo>∀</mo><mi>v</mi><mo>∈</mo><mi>V</mi></mrow><annotation encoding="application/x-tex">1v = v,\text{   }\forall v \in V</annotation></semantics></math></p></li>
<li><p>Multiplicative associativity:
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mo stretchy="true" form="prefix">(</mo><mi>α</mi><mi>β</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>v</mi><mo>=</mo><mi>α</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>β</mi><mi>v</mi><mo stretchy="true" form="postfix">)</mo></mrow><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> </mtext><mspace width="0.333em"></mspace></mrow><mo>∀</mo><mi>v</mi><mo>∈</mo><mi>V</mi><mo>,</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> scalars </mtext><mspace width="0.333em"></mspace></mrow><mi>α</mi><mo>,</mo><mi>β</mi></mrow><annotation encoding="application/x-tex">(\alpha\beta)v = \alpha(\beta v)\text{   }\forall v \in V,\text{ scalars }\alpha,\beta</annotation></semantics></math></p></li>
<li><p>Distributive property for vectors:
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>α</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>u</mi><mo>+</mo><mi>v</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>α</mi><mi>u</mi><mo>+</mo><mi>α</mi><mi>v</mi><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> </mtext><mspace width="0.333em"></mspace></mrow><mo>∀</mo><mi>u</mi><mo>,</mo><mi>v</mi><mo>∈</mo><mi>V</mi><mo>,</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> scalars </mtext><mspace width="0.333em"></mspace></mrow><mi>α</mi></mrow><annotation encoding="application/x-tex">\alpha(u + v) = \alpha u + \alpha v\text{   }\forall u,v \in V,\text{ scalars }\alpha</annotation></semantics></math></p></li>
<li><p>Distributive property for scalars:
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mo stretchy="true" form="prefix">(</mo><mi>α</mi><mo>+</mo><mi>β</mi><mo stretchy="true" form="postfix">)</mo></mrow><mi>v</mi><mo>=</mo><mi>α</mi><mi>v</mi><mo>+</mo><mi>β</mi><mi>v</mi><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> </mtext><mspace width="0.333em"></mspace></mrow><mo>∀</mo><mi>v</mi><mo>∈</mo><mi>V</mi><mo>,</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> scalars </mtext><mspace width="0.333em"></mspace></mrow><mi>α</mi><mo>,</mo><mi>β</mi></mrow><annotation encoding="application/x-tex">(\alpha + \beta)v = \alpha v + \beta v\text{   }\forall v \in V,\text{  scalars }\alpha,\beta</annotation></semantics></math></p></li>
</ol>
<p>It is easy to show that the zero vector <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mn>0</mn><annotation encoding="application/x-tex">0</annotation></semantics></math> and the additive inverse
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>−</mi><mi>v</mi></mrow><annotation encoding="application/x-tex">- v</annotation></semantics></math> are <em>unique</em>. We leave the proof of this fact as an exercise.</p>
<p>These may seem difficult to memorize, but they are essentially the same
familiar algebraic properties of numbers you know from high school. The
important thing to remember is which operations are valid for what
objects. For example, you cannot add a vector and scalar, as it does not
make sense.</p>
<p><em>Remark</em>. For those of you versed in computer science, you may recognize
this as essentially saying that you must ensure your operations are
<em>type-safe</em>. Adding a vector and scalar is not just “wrong” in the same
sense that <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>1</mn><mo>+</mo><mn>1</mn><mo>=</mo><mn>3</mn></mrow><annotation encoding="application/x-tex">1 + 1 = 3</annotation></semantics></math> is wrong, it is an <em>invalid question</em> entirely
because vectors and scalars are different types of mathematical objects.
See [@chen2024digression] for more.</p>
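<p>The type-safety analogy can be made concrete. Below is a small Python sketch, not taken from the original notes: the <code>Vector</code> class is a hypothetical illustration in which adding two vectors works, while adding a vector and a scalar is rejected as an undefined operation rather than returning a wrong answer.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vector:
    """A hypothetical fixed-length vector of real entries."""
    entries: tuple

    def __add__(self, other):
        # Addition is only defined between two vectors of the same length.
        if not isinstance(other, Vector) or len(other.entries) != len(self.entries):
            return NotImplemented
        return Vector(tuple(a + b for a, b in zip(self.entries, other.entries)))


v = Vector((1.0, 2.0))
print(v + Vector((3.0, 4.0)))  # Vector(entries=(4.0, 6.0))

try:
    v + 1.0  # a vector plus a scalar is not a defined operation
except TypeError as error:
    print("rejected:", error)
```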
<h3 id="vectors-big-and-small">Vectors big and small</h3>
<p>In order to begin your descent into what mathematicians colloquially
recognize as <em>abstract vapid nonsense</em>, let’s discuss which fields of scalars
a vector space can be built over. We have the familiar field <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>ℝ</mi><annotation encoding="application/x-tex">\mathbb{R}</annotation></semantics></math>,
where all scalars are real numbers, with corresponding vector spaces
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>ℝ</mi><mi>n</mi></msup><annotation encoding="application/x-tex">{\mathbb{R}}^{n}</annotation></semantics></math>, where <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>n</mi><annotation encoding="application/x-tex">n</annotation></semantics></math> is the length of the vector. We generally
discuss 2D or 3D vectors, corresponding to vectors of length 2 or 3; in
our case, <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>ℝ</mi><mn>2</mn></msup><annotation encoding="application/x-tex">{\mathbb{R}}^{2}</annotation></semantics></math> and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>ℝ</mi><mn>3</mn></msup><annotation encoding="application/x-tex">{\mathbb{R}}^{3}</annotation></semantics></math>.</p>
<p>However, vectors in <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>ℝ</mi><mi>n</mi></msup><annotation encoding="application/x-tex">{\mathbb{R}}^{n}</annotation></semantics></math> can really be of any length.
Vectors can be viewed as arbitrary length lists of numbers (for the
computer science folk: think C++ <code>std::vector</code>).</p>
<p><em>Example</em>. <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>1</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>2</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>3</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>4</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>5</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>6</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>7</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>8</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>9</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><mo>∈</mo><msup><mi>ℝ</mi><mn>9</mn></msup></mrow><annotation encoding="application/x-tex">\begin{pmatrix}
1 \\
2 \\
3 \\
4 \\
5 \\
6 \\
7 \\
8 \\
9
\end{pmatrix} \in {\mathbb{R}}^{9}</annotation></semantics></math></p>
<p>Keep in mind that vectors need not be in <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>ℝ</mi><mi>n</mi></msup><annotation encoding="application/x-tex">{\mathbb{R}}^{n}</annotation></semantics></math> at all.
Recall that a vector space need only satisfy the aforementioned <em>axioms
of a vector space</em>.</p>
<p><em>Example</em>. The vector space <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>ℂ</mi><mi>n</mi></msup><annotation encoding="application/x-tex">{\mathbb{C}}^{n}</annotation></semantics></math> is similar to
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>ℝ</mi><mi>n</mi></msup><annotation encoding="application/x-tex">{\mathbb{R}}^{n}</annotation></semantics></math>, except it includes complex numbers. All complex
vector spaces are real vector spaces (as you can simply restrict them to
only use the real numbers), but not the other way around.</p>
<p>From now on, let us refer to vector spaces <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>ℝ</mi><mi>n</mi></msup><annotation encoding="application/x-tex">{\mathbb{R}}^{n}</annotation></semantics></math> and
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>ℂ</mi><mi>n</mi></msup><annotation encoding="application/x-tex">{\mathbb{C}}^{n}</annotation></semantics></math> as <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>𝔽</mi><mi>n</mi></msup><annotation encoding="application/x-tex">{\mathbb{F}}^{n}</annotation></semantics></math>.</p>
<p>In general, we can have a vector space where the scalars are in an
arbitrary field, as long as the axioms are satisfied.</p>
<p><em>Example</em>. The vector space of all polynomials of degree at most 3, or
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>ℙ</mi><mn>3</mn></msup><annotation encoding="application/x-tex">{\mathbb{P}}^{3}</annotation></semantics></math>. It is not yet clear what this vector may look like.
We shall return to this example once we discuss <em>basis</em>.</p>
<h2 id="vector-addition-multiplication">Vector addition. Multiplication</h2>
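<p>Vector addition, represented by <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>+</mi><annotation encoding="application/x-tex">+</annotation></semantics></math>, is done entrywise. The entrywise product, written <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mo>⋅</mo><annotation encoding="application/x-tex">\cdot</annotation></semantics></math> in the example below, is computed the same way.</p>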
<p><em>Example.</em></p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>1</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>2</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>3</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><mo>+</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>4</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>5</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>6</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>1</mn><mo>+</mo><mn>4</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>2</mn><mo>+</mo><mn>5</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>3</mn><mo>+</mo><mn>6</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>5</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>7</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>9</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">\begin{pmatrix}
1 \\
2 \\
3
\end{pmatrix} + \begin{pmatrix}
4 \\
5 \\
6
\end{pmatrix} = \begin{pmatrix}
1 + 4 \\
2 + 5 \\
3 + 6
\end{pmatrix} = \begin{pmatrix}
5 \\
7 \\
9
\end{pmatrix}</annotation></semantics></math> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>1</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>2</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>3</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><mo>⋅</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>4</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>5</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>6</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>1</mn><mo>⋅</mo><mn>4</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>2</mn><mo>⋅</mo><mn>5</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>3</mn><mo>⋅</mo><mn>6</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>4</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>10</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>18</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">\begin{pmatrix}
1 \\
2 \\
3
\end{pmatrix} \cdot \begin{pmatrix}
4 \\
5 \\
6
\end{pmatrix} = \begin{pmatrix}
1 \cdot 4 \\
2 \cdot 5 \\
3 \cdot 6
\end{pmatrix} = \begin{pmatrix}
4 \\
10 \\
18
\end{pmatrix}</annotation></semantics></math></p>
<p>This is simple enough to understand. Again, the difficulty is simply
ensuring that you always perform operations with the correct <em>types</em>.
For example, once we introduce matrices, it doesn’t make sense to
multiply or add vectors and matrices in this fashion.</p>
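<p>As a rough illustration, the two entrywise operations from the example can be written in a few lines of Python. The helper names <code>add</code> and <code>entrywise_product</code> are assumptions of this sketch, and vectors are represented as plain lists.</p>

```python
def add(u, v):
    """Entrywise sum of two equal-length vectors given as lists."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return [a + b for a, b in zip(u, v)]


def entrywise_product(u, v):
    """Entrywise product of two equal-length vectors given as lists."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return [a * b for a, b in zip(u, v)]


print(add([1, 2, 3], [4, 5, 6]))                # [5, 7, 9]
print(entrywise_product([1, 2, 3], [4, 5, 6]))  # [4, 10, 18]
```

<p>The length check plays the role of the type check discussed above: operands of mismatched shape are rejected instead of being silently truncated.</p>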
<h2 id="vector-scalar-multiplication">Vector-scalar multiplication</h2>
<p>Multiplying a vector by a scalar simply results in each entry of the
vector being multiplied by the scalar.</p>
<p><em>Example</em>.</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>β</mi><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mi>a</mi></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mi>b</mi></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mi>c</mi></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mi>β</mi><mo>⋅</mo><mi>a</mi></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mi>β</mi><mo>⋅</mo><mi>b</mi></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mi>β</mi><mo>⋅</mo><mi>c</mi></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">\beta\begin{pmatrix}
a \\
b \\
c
\end{pmatrix} = \begin{pmatrix}
\beta \cdot a \\
\beta \cdot b \\
\beta \cdot c
\end{pmatrix}</annotation></semantics></math></p>
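<p>A short Python sketch of scalar multiplication follows, together with a numerical spot check of the distributive property for vectors from the axioms. The function names are illustrative assumptions, and one check on integers is of course not a proof.</p>

```python
def scale(beta, v):
    """Multiply every entry of the vector v by the scalar beta."""
    return [beta * entry for entry in v]


def add(u, v):
    """Entrywise sum of two equal-length vectors."""
    return [a + b for a, b in zip(u, v)]


u, v, alpha = [1, 2, 3], [4, 5, 6], 2

print(scale(alpha, v))  # [8, 10, 12]

# Distributive property for vectors: alpha(u + v) = alpha u + alpha v.
lhs = scale(alpha, add(u, v))
rhs = add(scale(alpha, u), scale(alpha, v))
print(lhs == rhs)  # True
```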
<h2 id="linear-combinations">Linear combinations</h2>
<p>Given a vector space <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>V</mi><annotation encoding="application/x-tex">V</annotation></semantics></math>, vectors <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>v</mi><mo>,</mo><mi>w</mi><mo>∈</mo><mi>V</mi></mrow><annotation encoding="application/x-tex">v,w \in V</annotation></semantics></math>, and scalars <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>α</mi><mo>,</mo><mi>β</mi></mrow><annotation encoding="application/x-tex">\alpha,\beta</annotation></semantics></math>,
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>α</mi><mi>v</mi><mo>+</mo><mi>β</mi><mi>w</mi></mrow><annotation encoding="application/x-tex">\alpha v + \beta w</annotation></semantics></math> is a <em>linear combination</em> of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>v</mi><annotation encoding="application/x-tex">v</annotation></semantics></math> and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>w</mi><annotation encoding="application/x-tex">w</annotation></semantics></math>.</p>
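<p>In general, a linear combination weights each vector by a scalar before adding; with every weight equal to 1 it reduces to the plain sum. Here is a minimal Python sketch, where the function name <code>linear_combination</code> and the list representation are assumptions of this example.</p>

```python
def linear_combination(coefficients, vectors):
    """Return the sum of coefficients[i] * vectors[i], computed entrywise."""
    if len(coefficients) != len(vectors):
        raise ValueError("need exactly one coefficient per vector")
    length = len(vectors[0])
    result = [0] * length
    for alpha, vector in zip(coefficients, vectors):
        for i in range(length):
            result[i] += alpha * vector[i]
    return result


v, w = [1, 0, 2], [0, 1, -1]

print(linear_combination([1, 1], [v, w]))   # [1, 1, 1]
print(linear_combination([3, -2], [v, w]))  # [3, -2, 8]
```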
<h3 id="spanning-systems">Spanning systems</h3>
<p>We say that a set of vectors <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>v</mi><mn>1</mn></msub><mo>,</mo><msub><mi>v</mi><mn>2</mn></msub><mo>,</mo><mi>…</mi><mo>,</mo><msub><mi>v</mi><mi>n</mi></msub><mo>∈</mo><mi>V</mi></mrow><annotation encoding="application/x-tex">v_{1},v_{2},\ldots,v_{n} \in V</annotation></semantics></math> <em>span</em> <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>V</mi><annotation encoding="application/x-tex">V</annotation></semantics></math>
if linear combinations of these vectors can represent any arbitrary
vector <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>v</mi><mo>∈</mo><mi>V</mi></mrow><annotation encoding="application/x-tex">v \in V</annotation></semantics></math>.</p>
<p>Precisely, for every <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>v</mi><mo>∈</mo><mi>V</mi></mrow><annotation encoding="application/x-tex">v \in V</annotation></semantics></math> there exist scalars <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>α</mi><mn>1</mn></msub><mo>,</mo><msub><mi>α</mi><mn>2</mn></msub><mo>,</mo><mi>…</mi><mo>,</mo><msub><mi>α</mi><mi>n</mi></msub></mrow><annotation encoding="application/x-tex">\alpha_{1},\alpha_{2},\ldots,\alpha_{n}</annotation></semantics></math> such that</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>α</mi><mn>1</mn></msub><msub><mi>v</mi><mn>1</mn></msub><mo>+</mo><msub><mi>α</mi><mn>2</mn></msub><msub><mi>v</mi><mn>2</mn></msub><mo>+</mo><mi>…</mi><mo>+</mo><msub><mi>α</mi><mi>n</mi></msub><msub><mi>v</mi><mi>n</mi></msub><mo>=</mo><mi>v</mi></mrow><annotation encoding="application/x-tex">\alpha_{1}v_{1} + \alpha_{2}v_{2} + \ldots + \alpha_{n}v_{n} = v</annotation></semantics></math></p>
<p>Note that any scalar <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>α</mi><mi>k</mi></msub><annotation encoding="application/x-tex">\alpha_{k}</annotation></semantics></math> could be 0. Therefore, it is possible
for a subset of a spanning system to also be a spanning system. The
proof of this fact is left as an exercise.</p>
<h3 id="intuition-for-linear-independence-and-dependence">Intuition for linear independence and dependence</h3>
<p>We say that <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>v</mi><annotation encoding="application/x-tex">v</annotation></semantics></math> and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>w</mi><annotation encoding="application/x-tex">w</annotation></semantics></math> are linearly independent if <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>v</mi><annotation encoding="application/x-tex">v</annotation></semantics></math> cannot be
represented by the scaling of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>w</mi><annotation encoding="application/x-tex">w</annotation></semantics></math>, and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>w</mi><annotation encoding="application/x-tex">w</annotation></semantics></math> cannot be represented by the
scaling of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>v</mi><annotation encoding="application/x-tex">v</annotation></semantics></math>. Otherwise, they are <em>linearly dependent</em>.</p>
<p>You may intuitively visualize linear dependence in the 2D plane as two
vectors lying on the same line through the origin, pointing in the same or
opposite directions. Clearly, scaling one vector will allow us to reach the
other. Linear independence therefore means the two vectors do not lie on a
common line.</p>
<p>Of course, this definition applies to vectors in any <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>𝔽</mi><mi>n</mi></msup><annotation encoding="application/x-tex">{\mathbb{F}}^{n}</annotation></semantics></math>.</p>
<h3 id="formal-definition-of-linear-dependence-and-independence">Formal definition of linear dependence and independence</h3>
<p>Let us formally define linear independence for arbitrary vectors in
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>𝔽</mi><mi>n</mi></msup><annotation encoding="application/x-tex">{\mathbb{F}}^{n}</annotation></semantics></math>. Given a set of vectors</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>v</mi><mn>1</mn></msub><mo>,</mo><msub><mi>v</mi><mn>2</mn></msub><mo>,</mo><mi>…</mi><mo>,</mo><msub><mi>v</mi><mi>n</mi></msub><mo>∈</mo><mi>V</mi></mrow><annotation encoding="application/x-tex">v_{1},v_{2},\ldots,v_{n} \in V</annotation></semantics></math></p>
<p>we say they are linearly independent if and only if the equation</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>α</mi><mn>1</mn></msub><msub><mi>v</mi><mn>1</mn></msub><mo>+</mo><msub><mi>α</mi><mn>2</mn></msub><msub><mi>v</mi><mn>2</mn></msub><mo>+</mo><mi>…</mi><mo>+</mo><msub><mi>α</mi><mi>n</mi></msub><msub><mi>v</mi><mi>n</mi></msub><mo>=</mo><mn>0</mn></mrow><annotation encoding="application/x-tex">\alpha_{1}v_{1} + \alpha_{2}v_{2} + \ldots + \alpha_{n}v_{n} = 0</annotation></semantics></math></p>
<p>has only the trivial solution, that is, the only scalars
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>α</mi><mn>1</mn></msub><mo>,</mo><msub><mi>α</mi><mn>2</mn></msub><mo>,</mo><mi>…</mi><mo>,</mo><msub><mi>α</mi><mi>n</mi></msub></mrow><annotation encoding="application/x-tex">\alpha_{1},\alpha_{2},\ldots,\alpha_{n}</annotation></semantics></math> satisfying it are those with every <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>α</mi><mi>i</mi></msub><annotation encoding="application/x-tex">\alpha_{i}</annotation></semantics></math> equal to
zero.</p>
<p>Equivalently, every solution of that equation satisfies</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mo stretchy="true" form="prefix">|</mo><msub><mi>α</mi><mn>1</mn></msub><mo stretchy="true" form="postfix">|</mo></mrow><mo>+</mo><mrow><mo stretchy="true" form="prefix">|</mo><msub><mi>α</mi><mn>2</mn></msub><mo stretchy="true" form="postfix">|</mo></mrow><mo>+</mo><mi>…</mi><mo>+</mo><mrow><mo stretchy="true" form="prefix">|</mo><msub><mi>α</mi><mi>n</mi></msub><mo stretchy="true" form="postfix">|</mo></mrow><mo>=</mo><mn>0</mn></mrow><annotation encoding="application/x-tex">\left| \alpha_{1} \right| + \left| \alpha_{2} \right| + \ldots + \left| \alpha_{n} \right| = 0</annotation></semantics></math></p>
<p>In summation notation,</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><munderover><mo>∑</mo><mrow><mi>i</mi><mo>=</mo><mn>1</mn></mrow><mi>n</mi></munderover><mrow><mo stretchy="true" form="prefix">|</mo><msub><mi>α</mi><mi>i</mi></msub><mo stretchy="true" form="postfix">|</mo></mrow><mo>=</mo><mn>0</mn></mrow><annotation encoding="application/x-tex">\sum_{i = 1}^{n}\left| \alpha_{i} \right| = 0</annotation></semantics></math></p>
<p>Therefore, a set of vectors <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>v</mi><mn>1</mn></msub><mo>,</mo><msub><mi>v</mi><mn>2</mn></msub><mo>,</mo><mi>…</mi><mo>,</mo><msub><mi>v</mi><mi>m</mi></msub></mrow><annotation encoding="application/x-tex">v_{1},v_{2},\ldots,v_{m}</annotation></semantics></math> is linearly
dependent if the opposite is true, that is, there exists a solution
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>α</mi><mn>1</mn></msub><mo>,</mo><msub><mi>α</mi><mn>2</mn></msub><mo>,</mo><mi>…</mi><mo>,</mo><msub><mi>α</mi><mi>m</mi></msub></mrow><annotation encoding="application/x-tex">\alpha_{1},\alpha_{2},\ldots,\alpha_{m}</annotation></semantics></math> to the equation</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>α</mi><mn>1</mn></msub><msub><mi>v</mi><mn>1</mn></msub><mo>+</mo><msub><mi>α</mi><mn>2</mn></msub><msub><mi>v</mi><mn>2</mn></msub><mo>+</mo><mi>…</mi><mo>+</mo><msub><mi>α</mi><mi>m</mi></msub><msub><mi>v</mi><mi>m</mi></msub><mo>=</mo><mn>0</mn></mrow><annotation encoding="application/x-tex">\alpha_{1}v_{1} + \alpha_{2}v_{2} + \ldots + \alpha_{m}v_{m} = 0</annotation></semantics></math></p>
<p>such that</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><munderover><mo>∑</mo><mrow><mi>i</mi><mo>=</mo><mn>1</mn></mrow><mi>m</mi></munderover><mrow><mo stretchy="true" form="prefix">|</mo><msub><mi>α</mi><mi>i</mi></msub><mo stretchy="true" form="postfix">|</mo></mrow><mo>≠</mo><mn>0</mn></mrow><annotation encoding="application/x-tex">\sum_{i = 1}^{m}\left| \alpha_{i} \right| \neq 0</annotation></semantics></math></p>
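<p>For concrete vectors with rational entries, the definition can be tested mechanically: only the trivial combination gives zero exactly when the matrix whose rows are the vectors has rank equal to the number of vectors. The Python sketch below uses Gaussian elimination with exact fractions; the function names <code>rank</code> and <code>linearly_independent</code> are assumptions of this example, not part of the original notes.</p>

```python
from fractions import Fraction


def rank(vectors):
    """Rank of the matrix whose rows are the given vectors (Gaussian elimination)."""
    rows = [[Fraction(x) for x in vector] for vector in vectors]
    rank_so_far = 0
    for col in range(len(rows[0])):
        # Find a row at or below rank_so_far with a nonzero entry in this column.
        pivot = next(
            (r for r in range(rank_so_far, len(rows)) if rows[r][col] != 0),
            None,
        )
        if pivot is None:
            continue
        rows[rank_so_far], rows[pivot] = rows[pivot], rows[rank_so_far]
        # Clear this column in every other row.
        for r in range(len(rows)):
            if r != rank_so_far and rows[r][col] != 0:
                factor = rows[r][col] / rows[rank_so_far][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank_so_far])]
        rank_so_far += 1
    return rank_so_far


def linearly_independent(vectors):
    """True when only the all-zero coefficients combine the vectors to 0."""
    return rank(vectors) == len(vectors)


print(linearly_independent([[1, 0], [0, 1]]))                   # True
print(linearly_independent([[1, 2], [2, 4]]))                   # False
print(linearly_independent([[1, 0, 0], [0, 1, 0], [1, 1, 0]]))  # False
```

<p>In the second call the vectors are scalar multiples of each other, and in the third the last vector is the sum of the first two, so each set admits a nontrivial combination equal to zero.</p>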
<h3 id="basis">Basis</h3>
<p>We say a system of vectors <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>v</mi><mn>1</mn></msub><mo>,</mo><msub><mi>v</mi><mn>2</mn></msub><mo>,</mo><mi>…</mi><mo>,</mo><msub><mi>v</mi><mi>n</mi></msub><mo>∈</mo><mi>V</mi></mrow><annotation encoding="application/x-tex">v_{1},v_{2},\ldots,v_{n} \in V</annotation></semantics></math> is a <em>basis</em>
in <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>V</mi><annotation encoding="application/x-tex">V</annotation></semantics></math> if the system is both linearly independent and spanning. That is,
the system must be able to represent any vector in <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>V</mi><annotation encoding="application/x-tex">V</annotation></semantics></math> as well as
satisfy our requirements for linear independence.</p>
<p>Equivalently, we may say that a system of vectors in <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>V</mi><annotation encoding="application/x-tex">V</annotation></semantics></math> is a basis in
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>V</mi><annotation encoding="application/x-tex">V</annotation></semantics></math> if any vector <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>v</mi><mo>∈</mo><mi>V</mi></mrow><annotation encoding="application/x-tex">v \in V</annotation></semantics></math> admits a <em>unique representation</em> as a linear
combination of vectors in the system. This is equivalent to our previous
statement, that the system must be spanning and linearly independent.</p>
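<p>To sketch why linear independence gives uniqueness, suppose some vector had two
representations in the system, with coefficients <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>α</mi><mi>i</mi></msub><annotation encoding="application/x-tex">\alpha_{i}</annotation></semantics></math> and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>β</mi><mi>i</mi></msub><annotation encoding="application/x-tex">\beta_{i}</annotation></semantics></math> (the
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>β</mi><mi>i</mi></msub><annotation encoding="application/x-tex">\beta_{i}</annotation></semantics></math> are names introduced here only for illustration). Subtracting one
representation from the other gives</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><munderover><mo>∑</mo><mrow><mi>i</mi><mo>=</mo><mn>1</mn></mrow><mi>n</mi></munderover><mrow><mo stretchy="true" form="prefix">(</mo><msub><mi>α</mi><mi>i</mi></msub><mo>−</mo><msub><mi>β</mi><mi>i</mi></msub><mo stretchy="true" form="postfix">)</mo></mrow><msub><mi>v</mi><mi>i</mi></msub><mo>=</mo><mn>0</mn></mrow><annotation encoding="application/x-tex">\sum_{i = 1}^{n}\left( \alpha_{i} - \beta_{i} \right)v_{i} = 0</annotation></semantics></math></p>
<p>and linear independence forces every coefficient <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>α</mi><mi>i</mi></msub><mo>−</mo><msub><mi>β</mi><mi>i</mi></msub></mrow><annotation encoding="application/x-tex">\alpha_{i} - \beta_{i}</annotation></semantics></math> to
vanish, so the two representations must coincide.</p>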
<h3 id="standard-basis">Standard basis</h3>
<p>We may define a <em>standard basis</em> for a vector space. By convention, the
standard basis in <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>ℝ</mi><mn>2</mn></msup><annotation encoding="application/x-tex">{\mathbb{R}}^{2}</annotation></semantics></math> is</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>1</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>0</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><mo>,</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>0</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>1</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">\begin{pmatrix}
1 \\
0
\end{pmatrix},\begin{pmatrix}
0 \\
1
\end{pmatrix}</annotation></semantics></math></p>
<p>Verify that the above is in fact a basis (that is, linearly independent
and generating).</p>
<p>Recalling the definition of the basis, we can represent any vector in
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>ℝ</mi><mn>2</mn></msup><annotation encoding="application/x-tex">{\mathbb{R}}^{2}</annotation></semantics></math> as the linear combination of the standard basis.</p>
<p>Therefore, for any arbitrary vector <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>v</mi><mo>∈</mo><msup><mi>ℝ</mi><mn>2</mn></msup></mrow><annotation encoding="application/x-tex">v \in {\mathbb{R}}^{2}</annotation></semantics></math>, we can
represent it as</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>v</mi><mo>=</mo><msub><mi>α</mi><mn>1</mn></msub><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>1</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>0</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><mo>+</mo><msub><mi>α</mi><mn>2</mn></msub><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>0</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>1</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">v = \alpha_{1}\begin{pmatrix}
1 \\
0
\end{pmatrix} + \alpha_{2}\begin{pmatrix}
0 \\
1
\end{pmatrix}</annotation></semantics></math></p>
<p>Let us call <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>α</mi><mn>1</mn></msub><annotation encoding="application/x-tex">\alpha_{1}</annotation></semantics></math> and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>α</mi><mn>2</mn></msub><annotation encoding="application/x-tex">\alpha_{2}</annotation></semantics></math> the <em>coordinates</em> of the
vector. Then, we can write <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>v</mi><annotation encoding="application/x-tex">v</annotation></semantics></math> as</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>v</mi><mo>=</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><msub><mi>α</mi><mn>1</mn></msub></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><msub><mi>α</mi><mn>2</mn></msub></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">v = \begin{pmatrix}
\alpha_{1} \\
\alpha_{2}
\end{pmatrix}</annotation></semantics></math></p>
<p>For example, the vector</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>1</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>2</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><annotation encoding="application/x-tex">\begin{pmatrix}
1 \\
2
\end{pmatrix}</annotation></semantics></math></p>
<p>represents</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>1</mn><mo>⋅</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>1</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>0</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><mo>+</mo><mn>2</mn><mo>⋅</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>0</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>1</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">1 \cdot \begin{pmatrix}
1 \\
0
\end{pmatrix} + 2 \cdot \begin{pmatrix}
0 \\
1
\end{pmatrix}</annotation></semantics></math></p>
<p>Verify that this aligns with your previous intuition of vectors.</p>
<p>You may recognize the standard basis in <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>ℝ</mi><mn>2</mn></msup><annotation encoding="application/x-tex">{\mathbb{R}}^{2}</annotation></semantics></math> as the
familiar unit vectors</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mover><mi>i</mi><mo accent="true">̂</mo></mover><mo>,</mo><mover><mi>j</mi><mo accent="true">̂</mo></mover></mrow><annotation encoding="application/x-tex">\hat{i},\hat{j}</annotation></semantics></math></p>
<p>This aligns with the fact that</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mi>α</mi></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mi>β</mi></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>α</mi><mover><mi>i</mi><mo accent="true">̂</mo></mover><mo>+</mo><mi>β</mi><mover><mi>j</mi><mo accent="true">̂</mo></mover></mrow><annotation encoding="application/x-tex">\begin{pmatrix}
\alpha \\
\beta
\end{pmatrix} = \alpha\hat{i} + \beta\hat{j}</annotation></semantics></math></p>
<p>However, the same construction works in any number of dimensions, not
only in the plane. So, let</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>e</mi><mn>1</mn></msub><mo>,</mo><msub><mi>e</mi><mn>2</mn></msub><mo>,</mo><mi>…</mi><mo>,</mo><msub><mi>e</mi><mi>n</mi></msub></mrow><annotation encoding="application/x-tex">e_{1},e_{2},\ldots,e_{n}</annotation></semantics></math></p>
<p>be a standard basis in <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>𝔽</mi><mi>n</mi></msup><annotation encoding="application/x-tex">{\mathbb{F}}^{n}</annotation></semantics></math>. Then, the coordinates
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>α</mi><mn>1</mn></msub><mo>,</mo><msub><mi>α</mi><mn>2</mn></msub><mo>,</mo><mi>…</mi><mo>,</mo><msub><mi>α</mi><mi>n</mi></msub></mrow><annotation encoding="application/x-tex">\alpha_{1},\alpha_{2},\ldots,\alpha_{n}</annotation></semantics></math> of a vector
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>v</mi><mo>∈</mo><msup><mi>𝔽</mi><mi>n</mi></msup></mrow><annotation encoding="application/x-tex">v \in {\mathbb{F}}^{n}</annotation></semantics></math> represent the following</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><msub><mi>α</mi><mn>1</mn></msub></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><msub><mi>α</mi><mn>2</mn></msub></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mi>⋮</mi></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><msub><mi>α</mi><mi>n</mi></msub></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><msub><mi>α</mi><mn>1</mn></msub><msub><mi>e</mi><mn>1</mn></msub><mo>+</mo><msub><mi>α</mi><mn>2</mn></msub><msub><mi>e</mi><mn>2</mn></msub><mo>+</mo><mi>…</mi><mo>+</mo><msub><mi>α</mi><mi>n</mi></msub><msub><mi>e</mi><mi>n</mi></msub></mrow><annotation encoding="application/x-tex">\begin{pmatrix}
\alpha_{1} \\
\alpha_{2} \\
 \vdots \\
\alpha_{n}
\end{pmatrix} = \alpha_{1}e_{1} + \alpha_{2}e_{2} + \ldots + \alpha_{n}e_{n}</annotation></semantics></math></p>
<p>Using our new notation, the standard basis in <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>ℝ</mi><mn>2</mn></msup><annotation encoding="application/x-tex">{\mathbb{R}}^{2}</annotation></semantics></math> is</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>e</mi><mn>1</mn></msub><mo>=</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>1</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>0</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><mo>,</mo><msub><mi>e</mi><mn>2</mn></msub><mo>=</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>0</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>1</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">e_{1} = \begin{pmatrix}
1 \\
0
\end{pmatrix},e_{2} = \begin{pmatrix}
0 \\
1
\end{pmatrix}</annotation></semantics></math></p>
<h2 id="matrices">Matrices</h2>
<p>Before discussing any properties of matrices, let’s simply reiterate
what we learned in class about their notation. We say a matrix with
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>m</mi><annotation encoding="application/x-tex">m</annotation></semantics></math> rows and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>n</mi><annotation encoding="application/x-tex">n</annotation></semantics></math> columns (in less precise terms, a matrix
with height <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>m</mi><annotation encoding="application/x-tex">m</annotation></semantics></math> and width <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>n</mi><annotation encoding="application/x-tex">n</annotation></semantics></math>) is an <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>m</mi><mo>×</mo><mi>n</mi></mrow><annotation encoding="application/x-tex">m \times n</annotation></semantics></math> matrix.</p>
<p>Given a matrix</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>A</mi><mo>=</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>1</mn></mtd><mtd columnalign="center" style="text-align: center"><mn>2</mn></mtd><mtd columnalign="center" style="text-align: center"><mn>3</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>4</mn></mtd><mtd columnalign="center" style="text-align: center"><mn>5</mn></mtd><mtd columnalign="center" style="text-align: center"><mn>6</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>7</mn></mtd><mtd columnalign="center" style="text-align: center"><mn>8</mn></mtd><mtd columnalign="center" style="text-align: center"><mn>9</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">A = \begin{pmatrix}
1 &amp; 2 &amp; 3 \\
4 &amp; 5 &amp; 6 \\
7 &amp; 8 &amp; 9
\end{pmatrix}</annotation></semantics></math></p>
<p>we refer to the entry in row <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>j</mi><annotation encoding="application/x-tex">j</annotation></semantics></math> and column <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>k</mi><annotation encoding="application/x-tex">k</annotation></semantics></math> as <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msub><mi>A</mi><mrow><mi>j</mi><mo>,</mo><mi>k</mi></mrow></msub><annotation encoding="application/x-tex">A_{j,k}</annotation></semantics></math>.</p>
<h3 id="matrix-transpose">Matrix transpose</h3>
<p>A formalism that is useful later on is called the <em>transpose</em>, and we
obtain it from a matrix <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>A</mi><annotation encoding="application/x-tex">A</annotation></semantics></math> by switching all the rows and columns. More
precisely, each row becomes a column instead. We use the notation
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>A</mi><mi>T</mi></msup><annotation encoding="application/x-tex">A^{T}</annotation></semantics></math> to represent the transpose of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>A</mi><annotation encoding="application/x-tex">A</annotation></semantics></math>.</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>1</mn></mtd><mtd columnalign="center" style="text-align: center"><mn>2</mn></mtd><mtd columnalign="center" style="text-align: center"><mn>3</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>4</mn></mtd><mtd columnalign="center" style="text-align: center"><mn>5</mn></mtd><mtd columnalign="center" style="text-align: center"><mn>6</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><mi>T</mi></msup><mo>=</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>1</mn></mtd><mtd columnalign="center" style="text-align: center"><mn>4</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>2</mn></mtd><mtd columnalign="center" style="text-align: center"><mn>5</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>3</mn></mtd><mtd columnalign="center" style="text-align: center"><mn>6</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">\begin{pmatrix}
1 &amp; 2 &amp; 3 \\
4 &amp; 5 &amp; 6
\end{pmatrix}^{T} = \begin{pmatrix}
1 &amp; 4 \\
2 &amp; 5 \\
3 &amp; 6
\end{pmatrix}</annotation></semantics></math></p>
<p>Formally, we can say <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mrow><mo stretchy="true" form="prefix">(</mo><msup><mi>A</mi><mi>T</mi></msup><mo stretchy="true" form="postfix">)</mo></mrow><mrow><mi>j</mi><mo>,</mo><mi>k</mi></mrow></msub><mo>=</mo><msub><mi>A</mi><mrow><mi>k</mi><mo>,</mo><mi>j</mi></mrow></msub></mrow><annotation encoding="application/x-tex">\left( A^{T} \right)_{j,k} = A_{k,j}</annotation></semantics></math></p>
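<p>As a quick check of the index formula against the example above, call the
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>2</mn><mo>×</mo><mn>3</mn></mrow><annotation encoding="application/x-tex">2 \times 3</annotation></semantics></math> matrix on the left-hand side <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>B</mi><annotation encoding="application/x-tex">B</annotation></semantics></math> (a label used only
here). The entry in row 3 and column 1 of its transpose should equal the entry
in row 1 and column 3 of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>B</mi><annotation encoding="application/x-tex">B</annotation></semantics></math>, and indeed</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mrow><mo stretchy="true" form="prefix">(</mo><msup><mi>B</mi><mi>T</mi></msup><mo stretchy="true" form="postfix">)</mo></mrow><mrow><mn>3</mn><mo>,</mo><mn>1</mn></mrow></msub><mo>=</mo><msub><mi>B</mi><mrow><mn>1</mn><mo>,</mo><mn>3</mn></mrow></msub><mo>=</mo><mn>3</mn></mrow><annotation encoding="application/x-tex">\left( B^{T} \right)_{3,1} = B_{1,3} = 3</annotation></semantics></math></p>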
<h2 id="linear-transformations">Linear transformations</h2>
<p>A linear transformation <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>T</mi><mo>:</mo><mi>V</mi><mo>→</mo><mi>W</mi></mrow><annotation encoding="application/x-tex">T:V \rightarrow W</annotation></semantics></math> is a mapping between two
vector spaces <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>V</mi><mo>→</mo><mi>W</mi></mrow><annotation encoding="application/x-tex">V \rightarrow W</annotation></semantics></math>, such that the following axioms are
satisfied:</p>
<ol>
<li><p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>T</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>v</mi><mo>+</mo><mi>w</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>T</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>v</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>+</mo><mi>T</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>w</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>,</mo><mo>∀</mo><mi>v</mi><mo>,</mo><mi>w</mi><mo>∈</mo><mi>V</mi></mrow><annotation encoding="application/x-tex">T(v + w) = T(v) + T(w),\forall v,w \in V</annotation></semantics></math></p></li>
<li><p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>T</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>α</mi><mi>v</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>α</mi><mi>T</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>v</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>,</mo><mo>∀</mo><mi>v</mi><mo>∈</mo><mi>V</mi></mrow><annotation encoding="application/x-tex">T(\alpha v) = \alpha T(v),\forall v \in V</annotation></semantics></math>,
for all scalars <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>α</mi><annotation encoding="application/x-tex">\alpha</annotation></semantics></math></p></li>
</ol>
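<p><em>Definition</em>. <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>T</mi><annotation encoding="application/x-tex">T</annotation></semantics></math> is a linear transformation if and only if, for all <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>v</mi><mo>,</mo><mi>w</mi><mo>∈</mo><mi>V</mi></mrow><annotation encoding="application/x-tex">v,w \in V</annotation></semantics></math> and all scalars <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>α</mi><mo>,</mo><mi>β</mi></mrow><annotation encoding="application/x-tex">\alpha,\beta</annotation></semantics></math>,</p>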
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>T</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>α</mi><mi>v</mi><mo>+</mo><mi>β</mi><mi>w</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>α</mi><mi>T</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>v</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>+</mo><mi>β</mi><mi>T</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>w</mi><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">T(\alpha v + \beta w) = \alpha T(v) + \beta T(w)</annotation></semantics></math></p>
<p><em>Abuse of notation</em>. From now on, we may elide the parentheses and say
that <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>T</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>v</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>T</mi><mi>v</mi><mo>,</mo><mo>∀</mo><mi>v</mi><mo>∈</mo><mi>V</mi></mrow><annotation encoding="application/x-tex">T(v) = Tv,\forall v \in V</annotation></semantics></math></p>
<p><em>Remark</em>. A phrase that you may commonly hear is that linear
transformations preserve <em>linearity</em>. Essentially, straight lines remain
straight, parallel lines remain parallel, and the origin remains fixed
at 0. Take a moment to think about why this is true (at least, in lower
dimensional spaces you can visualize).</p>
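<p>For the last claim, one short argument (a sketch, obtained by taking both
scalars to be zero in the definition above) is</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>T</mi><mrow><mo stretchy="true" form="prefix">(</mo><mn>0</mn><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>T</mi><mrow><mo stretchy="true" form="prefix">(</mo><mn>0</mn><mo>⋅</mo><mi>v</mi><mo>+</mo><mn>0</mn><mo>⋅</mo><mi>w</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mn>0</mn><mo>⋅</mo><mi>T</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>v</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>+</mo><mn>0</mn><mo>⋅</mo><mi>T</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>w</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mn>0</mn></mrow><annotation encoding="application/x-tex">T(0) = T(0 \cdot v + 0 \cdot w) = 0 \cdot T(v) + 0 \cdot T(w) = 0</annotation></semantics></math></p>
<p>so every linear transformation sends the zero vector of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>V</mi><annotation encoding="application/x-tex">V</annotation></semantics></math> to the zero
vector of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>W</mi><annotation encoding="application/x-tex">W</annotation></semantics></math>.</p>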
<p><em>Examples</em>.</p>
<ol>
<li><p>Rotation for <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>V</mi><mo>=</mo><mi>W</mi><mo>=</mo><msup><mi>ℝ</mi><mn>2</mn></msup></mrow><annotation encoding="application/x-tex">V = W = {\mathbb{R}}^{2}</annotation></semantics></math> (i.e. rotation in 2
dimensions). Given <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>v</mi><mo>,</mo><mi>w</mi><mo>∈</mo><msup><mi>ℝ</mi><mn>2</mn></msup></mrow><annotation encoding="application/x-tex">v,w \in {\mathbb{R}}^{2}</annotation></semantics></math>, and their linear
combination <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>v</mi><mo>+</mo><mi>w</mi></mrow><annotation encoding="application/x-tex">v + w</annotation></semantics></math>, a rotation of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>γ</mi><annotation encoding="application/x-tex">\gamma</annotation></semantics></math> radians of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>v</mi><mo>+</mo><mi>w</mi></mrow><annotation encoding="application/x-tex">v + w</annotation></semantics></math> is
equivalent to first rotating <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>v</mi><annotation encoding="application/x-tex">v</annotation></semantics></math> and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>w</mi><annotation encoding="application/x-tex">w</annotation></semantics></math> individually by <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>γ</mi><annotation encoding="application/x-tex">\gamma</annotation></semantics></math>
and then taking their linear combination.</p></li>
<li><p>Differentiation of polynomials. In this case <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>V</mi><mo>=</mo><msup><mi>ℙ</mi><mi>n</mi></msup></mrow><annotation encoding="application/x-tex">V = {\mathbb{P}}^{n}</annotation></semantics></math>
and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>W</mi><mo>=</mo><msup><mi>ℙ</mi><mrow><mi>n</mi><mo>−</mo><mn>1</mn></mrow></msup></mrow><annotation encoding="application/x-tex">W = {\mathbb{P}}^{n - 1}</annotation></semantics></math>, where <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>ℙ</mi><mi>n</mi></msup><annotation encoding="application/x-tex">{\mathbb{P}}^{n}</annotation></semantics></math> is the
vector space of all polynomials of degree at most <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>n</mi><annotation encoding="application/x-tex">n</annotation></semantics></math>.</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mfrac><mi>d</mi><mrow><mi>d</mi><mi>x</mi></mrow></mfrac><mrow><mo stretchy="true" form="prefix">(</mo><mi>α</mi><mi>v</mi><mo>+</mo><mi>β</mi><mi>w</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>α</mi><mfrac><mi>d</mi><mrow><mi>d</mi><mi>x</mi></mrow></mfrac><mi>v</mi><mo>+</mo><mi>β</mi><mfrac><mi>d</mi><mrow><mi>d</mi><mi>x</mi></mrow></mfrac><mi>w</mi><mo>,</mo><mo>∀</mo><mi>v</mi><mo>,</mo><mi>w</mi><mo>∈</mo><mi>V</mi><mo>,</mo><mo>∀</mo><mrow><mspace width="0.333em"></mspace><mtext mathvariant="normal"> scalars </mtext><mspace width="0.333em"></mspace></mrow><mi>α</mi><mo>,</mo><mi>β</mi></mrow><annotation encoding="application/x-tex">\frac{d}{dx}(\alpha v + \beta w) = \alpha\frac{d}{dx}v + \beta\frac{d}{dx}w,\forall v,w \in V,\forall\text{ scalars }\alpha,\beta</annotation></semantics></math></p></li>
</ol>
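<p>As a concrete instance of the differentiation example, take the polynomials
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>x</mi><mn>2</mn></msup><annotation encoding="application/x-tex">x^{2}</annotation></semantics></math> and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>x</mi><annotation encoding="application/x-tex">x</annotation></semantics></math> with the scalars 2 and 3 (values chosen purely for
illustration):</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mfrac><mi>d</mi><mrow><mi>d</mi><mi>x</mi></mrow></mfrac><mrow><mo stretchy="true" form="prefix">(</mo><mn>2</mn><msup><mi>x</mi><mn>2</mn></msup><mo>+</mo><mn>3</mn><mi>x</mi><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mn>2</mn><mfrac><mi>d</mi><mrow><mi>d</mi><mi>x</mi></mrow></mfrac><msup><mi>x</mi><mn>2</mn></msup><mo>+</mo><mn>3</mn><mfrac><mi>d</mi><mrow><mi>d</mi><mi>x</mi></mrow></mfrac><mi>x</mi><mo>=</mo><mn>4</mn><mi>x</mi><mo>+</mo><mn>3</mn></mrow><annotation encoding="application/x-tex">\frac{d}{dx}\left( 2x^{2} + 3x \right) = 2\frac{d}{dx}x^{2} + 3\frac{d}{dx}x = 4x + 3</annotation></semantics></math></p>
<p>Differentiating the combination directly gives the same answer, and the result
has degree one lower than the input, as expected.</p>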
<h2 id="matrices-represent-linear-transformations">Matrices represent linear transformations</h2>
<p>Suppose we wanted to represent a linear transformation
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>T</mi><mo>:</mo><msup><mi>𝔽</mi><mi>n</mi></msup><mo>→</mo><msup><mi>𝔽</mi><mi>m</mi></msup></mrow><annotation encoding="application/x-tex">T:{\mathbb{F}}^{n} \rightarrow {\mathbb{F}}^{m}</annotation></semantics></math>. I propose that we
need only encode how <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>T</mi><annotation encoding="application/x-tex">T</annotation></semantics></math> acts on the standard basis of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>𝔽</mi><mi>n</mi></msup><annotation encoding="application/x-tex">{\mathbb{F}}^{n}</annotation></semantics></math>.</p>
<p>Using our intuition from lower dimensional vector spaces, we know that
the standard basis in <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>ℝ</mi><mn>2</mn></msup><annotation encoding="application/x-tex">{\mathbb{R}}^{2}</annotation></semantics></math> is the unit vectors <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mover><mi>i</mi><mo accent="true">̂</mo></mover><annotation encoding="application/x-tex">\hat{i}</annotation></semantics></math>
and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mover><mi>j</mi><mo accent="true">̂</mo></mover><annotation encoding="application/x-tex">\hat{j}</annotation></semantics></math>. Because linear transformations preserve linearity (i.e.
all straight lines remain straight and parallel lines remain parallel),
we can encode any transformation as simply changing <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mover><mi>i</mi><mo accent="true">̂</mo></mover><annotation encoding="application/x-tex">\hat{i}</annotation></semantics></math> and
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mover><mi>j</mi><mo accent="true">̂</mo></mover><annotation encoding="application/x-tex">\hat{j}</annotation></semantics></math>. And indeed, if any vector <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>v</mi><mo>∈</mo><msup><mi>ℝ</mi><mn>2</mn></msup></mrow><annotation encoding="application/x-tex">v \in {\mathbb{R}}^{2}</annotation></semantics></math> can be
represented as the linear combination of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mover><mi>i</mi><mo accent="true">̂</mo></mover><annotation encoding="application/x-tex">\hat{i}</annotation></semantics></math> and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mover><mi>j</mi><mo accent="true">̂</mo></mover><annotation encoding="application/x-tex">\hat{j}</annotation></semantics></math> (this
is the definition of a basis), it makes sense both symbolically and
geometrically that we can represent all linear transformations as the
transformations of the basis vectors.</p>
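<p>Symbolically, the argument can be sketched in one line. Writing a vector with
coordinates <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>α</mi><annotation encoding="application/x-tex">\alpha</annotation></semantics></math> and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>β</mi><annotation encoding="application/x-tex">\beta</annotation></semantics></math> and applying linearity gives</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>T</mi><mrow><mo stretchy="true" form="prefix">(</mo><mi>α</mi><mover><mi>i</mi><mo accent="true">̂</mo></mover><mo>+</mo><mi>β</mi><mover><mi>j</mi><mo accent="true">̂</mo></mover><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mi>α</mi><mi>T</mi><mrow><mo stretchy="true" form="prefix">(</mo><mover><mi>i</mi><mo accent="true">̂</mo></mover><mo stretchy="true" form="postfix">)</mo></mrow><mo>+</mo><mi>β</mi><mi>T</mi><mrow><mo stretchy="true" form="prefix">(</mo><mover><mi>j</mi><mo accent="true">̂</mo></mover><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">T(\alpha\hat{i} + \beta\hat{j}) = \alpha T(\hat{i}) + \beta T(\hat{j})</annotation></semantics></math></p>
<p>so the two vectors <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>T</mi><mrow><mo stretchy="true" form="prefix">(</mo><mover><mi>i</mi><mo accent="true">̂</mo></mover><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">T(\hat{i})</annotation></semantics></math> and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>T</mi><mrow><mo stretchy="true" form="prefix">(</mo><mover><mi>j</mi><mo accent="true">̂</mo></mover><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">T(\hat{j})</annotation></semantics></math> determine
the transformation completely.</p>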
<p><em>Example</em>. To reflect all vectors <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>v</mi><mo>∈</mo><msup><mi>ℝ</mi><mn>2</mn></msup></mrow><annotation encoding="application/x-tex">v \in {\mathbb{R}}^{2}</annotation></semantics></math> across the
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>y</mi><annotation encoding="application/x-tex">y</annotation></semantics></math>-axis, we can simply change the standard basis to</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mi>−</mi><mn>1</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>0</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><mo>,</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>0</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>1</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">\begin{pmatrix}
 - 1 \\
0
\end{pmatrix},\begin{pmatrix}
0 \\
1
\end{pmatrix}</annotation></semantics></math></p>
<p>Then, any vector in <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>ℝ</mi><mn>2</mn></msup><annotation encoding="application/x-tex">{\mathbb{R}}^{2}</annotation></semantics></math> using this new basis will be
reflected across the <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>y</mi><annotation encoding="application/x-tex">y</annotation></semantics></math>-axis. Take a moment to justify this
geometrically.</p>
<h3 id="writing-a-linear-transformation-as-matrix">Writing a linear transformation as a matrix</h3>
<p>Any linear transformation
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>T</mi><mo>:</mo><msup><mi>𝔽</mi><mi>m</mi></msup><mo>→</mo><msup><mi>𝔽</mi><mi>n</mi></msup></mrow><annotation encoding="application/x-tex">T:{\mathbb{F}}^{m} \rightarrow {\mathbb{F}}^{n}</annotation></semantics></math> can be written as an
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>n</mi><mo>×</mo><mi>m</mi></mrow><annotation encoding="application/x-tex">n \times m</annotation></semantics></math> matrix <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>A</mi><annotation encoding="application/x-tex">A</annotation></semantics></math>. That is, for each such transformation there is a
matrix <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>A</mi><annotation encoding="application/x-tex">A</annotation></semantics></math> with <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>n</mi><annotation encoding="application/x-tex">n</annotation></semantics></math> rows
and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>m</mi><annotation encoding="application/x-tex">m</annotation></semantics></math> columns that represents it.</p>
<p>How should we write this matrix? Naturally, from our previous
discussion, we should write a matrix with each <em>column</em> being one of our
new transformed <em>basis</em> vectors.</p>
<p><em>Example</em>. Consider our <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>y</mi><annotation encoding="application/x-tex">y</annotation></semantics></math>-axis reflection transformation from earlier. We write
the transformed basis vectors as the columns of a matrix</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mi>−</mi><mn>1</mn></mtd><mtd columnalign="center" style="text-align: center"><mn>0</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>0</mn></mtd><mtd columnalign="center" style="text-align: center"><mn>1</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><annotation encoding="application/x-tex">\begin{pmatrix}
 - 1 &amp; 0 \\
0 &amp; 1
\end{pmatrix}</annotation></semantics></math></p>
<h3 id="matrix-vector-multiplication">Matrix-vector multiplication</h3>
<p>Perhaps you now see why the so-called matrix-vector multiplication is
defined the way it is. Recalling our definition of a basis, given a
basis in <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>V</mi><annotation encoding="application/x-tex">V</annotation></semantics></math>, any vector <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>v</mi><mo>∈</mo><mi>V</mi></mrow><annotation encoding="application/x-tex">v \in V</annotation></semantics></math> can be written as the linear
combination of the vectors in the basis. Then, given a linear
transformation represented by the matrix containing the new basis, we
simply write the linear combination with the new basis instead.</p>
<p><em>Example</em>. Let us first write a vector in the standard basis in
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>ℝ</mi><mn>2</mn></msup><annotation encoding="application/x-tex">{\mathbb{R}}^{2}</annotation></semantics></math> and then show how our matrix-vector multiplication
naturally corresponds to the definition of the linear transformation.</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>1</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>2</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><mo>∈</mo><msup><mi>ℝ</mi><mn>2</mn></msup></mrow><annotation encoding="application/x-tex">\begin{pmatrix}
1 \\
2
\end{pmatrix} \in {\mathbb{R}}^{2}</annotation></semantics></math></p>
<p>is the same as</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>1</mn><mo>⋅</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>1</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>0</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><mo>+</mo><mn>2</mn><mo>⋅</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>0</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>1</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">1 \cdot \begin{pmatrix}
1 \\
0
\end{pmatrix} + 2 \cdot \begin{pmatrix}
0 \\
1
\end{pmatrix}</annotation></semantics></math></p>
<p>Then, to perform our reflection, we need only replace the basis vector
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>1</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>0</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><annotation encoding="application/x-tex">\begin{pmatrix}
1 \\
0
\end{pmatrix}</annotation></semantics></math> with <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mi>−</mi><mn>1</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>0</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><annotation encoding="application/x-tex">\begin{pmatrix}
 - 1 \\
0
\end{pmatrix}</annotation></semantics></math>.</p>
<p>Then, the reflected vector is given by</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>1</mn><mo>⋅</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mi>−</mi><mn>1</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>0</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><mo>+</mo><mn>2</mn><mo>⋅</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>0</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>1</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><mo>=</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mi>−</mi><mn>1</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>2</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">1 \cdot \begin{pmatrix}
 - 1 \\
0
\end{pmatrix} + 2 \cdot \begin{pmatrix}
0 \\
1
\end{pmatrix} = \begin{pmatrix}
 - 1 \\
2
\end{pmatrix}</annotation></semantics></math></p>
<p>We can clearly see that this is exactly how the matrix multiplication</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mi>−</mi><mn>1</mn></mtd><mtd columnalign="center" style="text-align: center"><mn>0</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>0</mn></mtd><mtd columnalign="center" style="text-align: center"><mn>1</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow><mo>⋅</mo><mrow><mo stretchy="true" form="prefix">(</mo><mtable><mtr><mtd columnalign="center" style="text-align: center"><mn>1</mn></mtd></mtr><mtr><mtd columnalign="center" style="text-align: center"><mn>2</mn></mtd></mtr></mtable><mo stretchy="true" form="postfix">)</mo></mrow></mrow><annotation encoding="application/x-tex">\begin{pmatrix}
 - 1 &amp; 0 \\
0 &amp; 1
\end{pmatrix} \cdot \begin{pmatrix}
1 \\
2
\end{pmatrix}</annotation></semantics></math> is defined! The <em>column-by-coordinate</em> rule for
matrix-vector multiplication says that we multiply the <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>n</mi><mtext mathvariant="normal">th</mtext></msup><annotation encoding="application/x-tex">n^{\text{th}}</annotation></semantics></math>
entry of the vector by the corresponding <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msup><mi>n</mi><mtext mathvariant="normal">th</mtext></msup><annotation encoding="application/x-tex">n^{\text{th}}</annotation></semantics></math> column of the
matrix and sum them all up (take their linear combination). This
algorithm intuitively follows from our definition of matrices.</p>
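<p>As a rough illustration (not part of the original notes), the column-by-coordinate rule can be sketched in Python. The function name <code>mat_vec</code> and the list-of-rows representation are assumptions made for this example only.</p>
<pre class="python"><code>def mat_vec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector, column by column."""
    n_rows = len(matrix)
    result = [0] * n_rows
    for j, coordinate in enumerate(vector):
        # column j of the matrix is where the j-th basis vector lands
        column = [matrix[i][j] for i in range(n_rows)]
        # scale that column by the j-th coordinate and add it to the running sum
        result = [acc + coordinate * entry for acc, entry in zip(result, column)]
    return result


# the y-axis reflection from the example, applied to the vector (1, 2)
reflection = [[-1, 0],
              [0, 1]]
print(mat_vec(reflection, [1, 2]))  # [-1, 2]
</code></pre>
<p>Each pass through the loop adds one scaled column, so the result is exactly the linear combination of the transformed basis vectors.</p>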
<h3 id="matrix-matrix-multiplication">Matrix-matrix multiplication</h3>
<p>As you may have noticed, a very similar natural definition arises for
the <em>matrix-matrix</em> multiplication. Multiplying two matrices <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>A</mi><mo>⋅</mo><mi>B</mi></mrow><annotation encoding="application/x-tex">A \cdot B</annotation></semantics></math>
is essentially just taking each column of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>B</mi><annotation encoding="application/x-tex">B</annotation></semantics></math>, and applying the linear
transformation defined by the matrix <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mi>A</mi><annotation encoding="application/x-tex">A</annotation></semantics></math>!</p></main>
</article>
]]></summary>
</entry>
<entry>
    <title>Nix automatic hash updates made easy</title>
    <link href="https://blog.youwen.dev/nix-automatic-hash-updates-made-easy.html" />
    <id>https://blog.youwen.dev/nix-automatic-hash-updates-made-easy.html</id>
    <published>2024-12-28T00:00:00Z</published>
    <updated>2024-12-28T00:00:00Z</updated>
    <summary type="html"><![CDATA[<article>
  <header>
    <h1 class="text-4xl">
      <a href="./nix-automatic-hash-updates-made-easy.html">Nix automatic hash updates made easy</a>
    </h1>
    <p
      class="mb-1 mt-2 italic font-light text-lg text-accent-light dark:text-accent-dark"
    >
      keep your flakes up to date
    </p>
    <div class="mt-2">2024-12-28</div>
    <div class="mt-1 text-sm">
      
    </div>
  </header>
  <main class="post mt-4"><p>Nix users often create flakes to package software out of tree, like this <a href="https://github.com/youwen5/zen-browser-flake">Zen
Browser flake</a> I’ve been
maintaining. Keeping them up to date is a hassle though, since you have to
update the Subresource Integrity (SRI) hashes that Nix uses to ensure
reproducibility.</p>
<p>Here’s a neat method I’ve been using to cleanly handle automatic hash updates.
I use <a href="https://www.nushell.sh/">Nushell</a> to easily work with data, prefetch
some hashes, and put it all in a JSON file that can be read by Nix at build
time.</p>
<p>First, let’s create a file called <code>update.nu</code>. At the top, place this shebang:</p>
<pre class="nu"><code>#!/usr/bin/env -S nix shell nixpkgs#nushell --command nu</code></pre>
<p>This will execute the script in a Nushell environment, which is fetched by Nix.</p>
<h2 id="get-the-up-to-date-urls">Get the up to date URLs</h2>
<p>We need to obtain the latest version of whatever software we want to update.
In this case, I’ll use GitHub releases as my source of truth.</p>
<p>You can use the GitHub API to fetch metadata about all the releases of a repository.</p>
<pre><code>https://api.github.com/repos/($repo)/releases</code></pre>
<p>Roughly speaking, the raw JSON returned by the GitHub releases API looks something like:</p>
<pre><code>[
   {tag_name: &quot;foo&quot;, prerelease: false, ...},
   {tag_name: &quot;bar&quot;, prerelease: true, ...},
   {tag_name: &quot;foobar&quot;, prerelease: false, ...},
]
</code></pre>
<p>Note that the objects in the array are ordered from newest to oldest.</p>
<blockquote>
<p>Even if you aren’t using GitHub releases, as long as there is a reliable way to
programmatically fetch the latest download URLs of whatever software you’re
packaging, you can adapt this approach for your specific case.</p>
</blockquote>
<p>We use Nushell’s <code>http get</code> to make a network request. Nushell will
automatically detect and parse the JSON response into a Nushell table.</p>
<p>In my case, Zen Browser frequently publishes prerelease “twilight” builds which
we don’t want to update to. So, we ignore any releases tagged “twilight” or
marked “prerelease” by filtering them out with the <code>where</code> selector.</p>
<p>Finally, we retrieve the tag name of the item at the first index, which would
be the latest release (since the JSON array is sorted newest first).</p>
<pre class="nu"><code>#!/usr/bin/env -S nix shell nixpkgs#nushell --command nu

# get the latest tag of the latest release that isn&#39;t a prerelease
def get_latest_release [repo: string] {
  try {
	http get $&quot;https://api.github.com/repos/($repo)/releases&quot;
	  | where prerelease == false
	  | where tag_name != &quot;twilight&quot;
	  | get tag_name
	  | get 0
  } catch { |err| error make { msg: $&quot;Failed to fetch latest release, aborting: ($err.msg)&quot; } }
}</code></pre>
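<p>For readers who don’t use Nushell, the same selection logic can be sketched in Python. This is only an illustration on made-up sample data: <code>pick_latest_tag</code> is a hypothetical helper, and it assumes the list is already ordered newest first.</p>
<pre class="python"><code>def pick_latest_tag(releases):
    """Return the tag of the first release that is neither a prerelease nor "twilight"."""
    stable = [
        release["tag_name"]
        for release in releases
        if not release["prerelease"] and release["tag_name"] != "twilight"
    ]
    if not stable:
        # nothing usable was found, so fail loudly instead of returning a bogus tag
        raise ValueError("no stable release found")
    return stable[0]


# made-up sample data in the same shape as the API response
sample = [
    {"tag_name": "twilight", "prerelease": True},
    {"tag_name": "1.0.2-b.5", "prerelease": False},
    {"tag_name": "1.0.2-b.4", "prerelease": False},
]
print(pick_latest_tag(sample))  # 1.0.2-b.5
</code></pre>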
<h2 id="prefetching-sri-hashes">Prefetching SRI hashes</h2>
<p>Now that we have the latest tags, we can easily obtain the latest download URLs, which are of the form:</p>
<pre><code>https://github.com/zen-browser/desktop/releases/download/$tag/zen.linux-x86_64.tar.bz2
https://github.com/zen-browser/desktop/releases/download/$tag/zen.linux-aarch64.tar.bz2</code></pre>
<p>However, we still need the corresponding SRI hashes to pass to Nix.</p>
<div class="sourceCode" id="cb6"><pre class="sourceCode nix"><code class="sourceCode nix"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a>src = fetchurl <span class="op">{</span></span>
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a>   <span class="va">url</span> <span class="op">=</span> <span class="st">&quot;https://github.com/zen-browser/desktop/releases/download/1.0.2-b.5/zen.linux-x86_64.tar.bz2&quot;</span><span class="op">;</span></span>
<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a>   <span class="va">hash</span> <span class="op">=</span> <span class="st">&quot;sha256-00000000000000000000000000000000000000000000&quot;</span><span class="op">;</span></span>
<span id="cb6-4"><a href="#cb6-4" aria-hidden="true" tabindex="-1"></a><span class="op">}</span>;</span></code></pre></div>
<p>The easiest way to obtain these new hashes is to update the URL and then set
the hash property to an empty string (<code>""</code>). Nix will spit out a hash mismatch
error with the correct hash. However, this is inconvenient for automated
command line scripting.</p>
<p>The Nix documentation mentions
<a href="https://nix.dev/manual/nix/2.18/command-ref/nix-prefetch-url">nix-prefetch-url</a>
as a way to obtain these hashes, but as usual, it doesn’t work quite right and
has also been replaced by a more powerful but underdocumented experimental
feature instead.</p>
<p>The <a href="https://nix.dev/manual/nix/2.18/command-ref/new-cli/nix3-store-prefetch-file">nix store
prefetch-file</a>
command does what <code>nix-prefetch-url</code> is supposed to do, but automatically
handles the caveats that lead to the wrong hash being produced.</p>
<p>Let’s write a Nushell function that outputs the SRI hash of the given URL. We
tell <code>prefetch-file</code> to output structured JSON that we can parse.</p>
<p>Since Nushell <em>is</em> a shell, we can directly invoke shell commands like usual,
and then process their output with pipes.</p>
<pre class="nu"><code>def get_nix_hash [url: string] {
  nix store prefetch-file --hash-type sha256 --json $url | from json | get hash
}</code></pre>
<p>Cool! Now <code>get_nix_hash</code> can give us SRI hashes that look like this:</p>
<pre><code>sha256-K3zTCLdvg/VYQNsfeohw65Ghk8FAjhOl8hXU6REO4/s=</code></pre>
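<p>To make the format less mysterious: an SRI hash is the algorithm name, a dash, and the base64 encoding of the raw digest. Here is a minimal Python sketch of that encoding. The helper name <code>sri_sha256</code> is made up for this example, and it hashes bytes already in memory, so it illustrates the format and is not a replacement for <code>nix store prefetch-file</code>.</p>
<pre class="python"><code>import base64
import hashlib


def sri_sha256(data):
    """Format the SHA-256 digest of some bytes as an SRI string."""
    digest = hashlib.sha256(data).digest()  # 32 raw bytes
    return "sha256-" + base64.b64encode(digest).decode("ascii")


# the SRI hash of an empty input
print(sri_sha256(b""))  # sha256-47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=
</code></pre>
<p>A 32-byte digest always encodes to 44 base64 characters, which is why every <code>sha256-</code> hash in this post has the same length.</p>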
<h2 id="putting-it-all-together">Putting it all together</h2>
<p>Now that we’re able to fetch the latest release, obtain the download URLs, and
compute their SRI hashes, we have all the information we need to make an
automated update. However, these URLs are typically hardcoded in our Nix
expressions. The question remains as to how to update these values.</p>
<p>A common way I’ve seen updates performed is using something like <code>sed</code> to
modify the Nix expressions in place. However, there’s actually a more
maintainable and easy to understand approach.</p>
<p>Let’s have our Nushell script generate the URLs and hashes and place them in a
JSON file! Then, we’ll be able to read the JSON file from Nix and obtain the
URL and hash.</p>
<pre class="nu"><code>def generate_sources [] {
  let tag = get_latest_release &quot;zen-browser/desktop&quot;
  let prev_sources = open ./sources.json

  if $tag == $prev_sources.version {
	# everything up to date
	return $tag
  }

  # generate the download URLs with the new tag
  let x86_64_url = $&quot;https://github.com/zen-browser/desktop/releases/download/($tag)/zen.linux-x86_64.tar.bz2&quot;
  let aarch64_url = $&quot;https://github.com/zen-browser/desktop/releases/download/($tag)/zen.linux-aarch64.tar.bz2&quot;

  # create a Nushell record that maps cleanly to JSON
  let sources = {
    # add a version field as well for convenience
	version: $tag

	x86_64-linux: {
	  url:  $x86_64_url
	  hash: (get_nix_hash $x86_64_url)
	}
	aarch64-linux: {
	  url: $aarch64_url
	  hash: (get_nix_hash $aarch64_url)
	}
  }

  echo $sources | save --force &quot;sources.json&quot;

  return $tag
}</code></pre>
<p>Running this script with</p>
<div class="sourceCode" id="cb10"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb10-1"><a href="#cb10-1" aria-hidden="true" tabindex="-1"></a><span class="fu">chmod</span> +x ./update.nu</span>
<span id="cb10-2"><a href="#cb10-2" aria-hidden="true" tabindex="-1"></a><span class="ex">./update.nu</span></span></code></pre></div>
<p>gives us the file <code>sources.json</code>:</p>
<div class="sourceCode" id="cb11"><pre class="sourceCode json"><code class="sourceCode json"><span id="cb11-1"><a href="#cb11-1" aria-hidden="true" tabindex="-1"></a><span class="fu">{</span></span>
<span id="cb11-2"><a href="#cb11-2" aria-hidden="true" tabindex="-1"></a>  <span class="dt">&quot;version&quot;</span><span class="fu">:</span> <span class="st">&quot;1.0.2-b.5&quot;</span><span class="fu">,</span></span>
<span id="cb11-3"><a href="#cb11-3" aria-hidden="true" tabindex="-1"></a>  <span class="dt">&quot;x86_64-linux&quot;</span><span class="fu">:</span> <span class="fu">{</span></span>
<span id="cb11-4"><a href="#cb11-4" aria-hidden="true" tabindex="-1"></a>    <span class="dt">&quot;url&quot;</span><span class="fu">:</span> <span class="st">&quot;https://github.com/zen-browser/desktop/releases/download/1.0.2-b.5/zen.linux-x86_64.tar.bz2&quot;</span><span class="fu">,</span></span>
<span id="cb11-5"><a href="#cb11-5" aria-hidden="true" tabindex="-1"></a>    <span class="dt">&quot;hash&quot;</span><span class="fu">:</span> <span class="st">&quot;sha256-K3zTCLdvg/VYQNsfeohw65Ghk8FAjhOl8hXU6REO4/s=&quot;</span></span>
<span id="cb11-6"><a href="#cb11-6" aria-hidden="true" tabindex="-1"></a>  <span class="fu">},</span></span>
<span id="cb11-7"><a href="#cb11-7" aria-hidden="true" tabindex="-1"></a>  <span class="dt">&quot;aarch64-linux&quot;</span><span class="fu">:</span> <span class="fu">{</span></span>
<span id="cb11-8"><a href="#cb11-8" aria-hidden="true" tabindex="-1"></a>    <span class="dt">&quot;url&quot;</span><span class="fu">:</span> <span class="st">&quot;https://github.com/zen-browser/desktop/releases/download/1.0.2-b.5/zen.linux-aarch64.tar.bz2&quot;</span><span class="fu">,</span></span>
<span id="cb11-9"><a href="#cb11-9" aria-hidden="true" tabindex="-1"></a>    <span class="dt">&quot;hash&quot;</span><span class="fu">:</span> <span class="st">&quot;sha256-NwIYylGal2QoWhWKtMhMkAAJQ6iNHfQOBZaxTXgvxAk=&quot;</span></span>
<span id="cb11-10"><a href="#cb11-10" aria-hidden="true" tabindex="-1"></a>  <span class="fu">}</span></span>
<span id="cb11-11"><a href="#cb11-11" aria-hidden="true" tabindex="-1"></a><span class="fu">}</span></span></code></pre></div>
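<p>The record-building step is not specific to Nushell. As a rough sketch, here is the same idea in Python, with the hashing step passed in as a function so the example runs offline. <code>build_sources</code>, <code>needs_update</code>, and <code>placeholder_hash</code> are hypothetical names, and the placeholder hash is obviously not a real one.</p>
<pre class="python"><code>import json


def build_sources(tag, hash_for_url):
    """Build the same structure as sources.json for a given release tag."""
    base = "https://github.com/zen-browser/desktop/releases/download/" + tag
    sources = {"version": tag}
    for system, arch in [("x86_64-linux", "x86_64"), ("aarch64-linux", "aarch64")]:
        url = base + "/zen.linux-" + arch + ".tar.bz2"
        sources[system] = {"url": url, "hash": hash_for_url(url)}
    return sources


def needs_update(tag, previous_sources):
    """True when the recorded version differs from the latest tag."""
    return previous_sources.get("version") != tag


def placeholder_hash(url):
    # stand-in for the real prefetch step; never use this value in a build
    return "sha256-PLACEHOLDER"


print(json.dumps(build_sources("1.0.2-b.5", placeholder_hash), indent=2))
</code></pre>
<p>Keying the per-platform entries by Nix system name means the Nix side can look them up directly with the current system string.</p>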
<p>Now, let’s read this from Nix. My file organization looks like the following:</p>
<pre><code>./
| flake.nix
| zen-browser-unwrapped.nix
| ...other files...</code></pre>
<p><code>zen-browser-unwrapped.nix</code> contains the derivation for Zen Browser. Let’s add
<code>version</code>, <code>url</code>, and <code>hash</code> to its inputs:</p>
<div class="sourceCode" id="cb13"><pre class="sourceCode nix"><code class="sourceCode nix"><span id="cb13-1"><a href="#cb13-1" aria-hidden="true" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb13-2"><a href="#cb13-2" aria-hidden="true" tabindex="-1"></a>  <span class="va">stdenv</span><span class="op">,</span></span>
<span id="cb13-3"><a href="#cb13-3" aria-hidden="true" tabindex="-1"></a>  <span class="va">fetchurl</span><span class="op">,</span></span>
<span id="cb13-4"><a href="#cb13-4" aria-hidden="true" tabindex="-1"></a>  <span class="co"># add these below</span></span>
<span id="cb13-5"><a href="#cb13-5" aria-hidden="true" tabindex="-1"></a>  <span class="va">version</span><span class="op">,</span></span>
<span id="cb13-6"><a href="#cb13-6" aria-hidden="true" tabindex="-1"></a>  <span class="va">url</span><span class="op">,</span></span>
<span id="cb13-7"><a href="#cb13-7" aria-hidden="true" tabindex="-1"></a>  <span class="va">hash</span><span class="op">,</span></span>
<span id="cb13-8"><a href="#cb13-8" aria-hidden="true" tabindex="-1"></a>  <span class="op">...</span></span>
<span id="cb13-9"><a href="#cb13-9" aria-hidden="true" tabindex="-1"></a><span class="op">}</span>:</span>
<span id="cb13-10"><a href="#cb13-10" aria-hidden="true" tabindex="-1"></a>stdenv.mkDerivation <span class="op">{</span></span>
<span id="cb13-11"><a href="#cb13-11" aria-hidden="true" tabindex="-1"></a>   <span class="co"># inherit version from inputs</span></span>
<span id="cb13-12"><a href="#cb13-12" aria-hidden="true" tabindex="-1"></a>  <span class="kw">inherit</span> version<span class="op">;</span></span>
<span id="cb13-13"><a href="#cb13-13" aria-hidden="true" tabindex="-1"></a>  <span class="va">pname</span> <span class="op">=</span> <span class="st">&quot;zen-browser-unwrapped&quot;</span><span class="op">;</span></span>
<span id="cb13-14"><a href="#cb13-14" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb13-15"><a href="#cb13-15" aria-hidden="true" tabindex="-1"></a>  <span class="va">src</span> <span class="op">=</span> fetchurl <span class="op">{</span></span>
<span id="cb13-16"><a href="#cb13-16" aria-hidden="true" tabindex="-1"></a>    <span class="co"># inherit the URL and hash we obtain from the inputs</span></span>
<span id="cb13-17"><a href="#cb13-17" aria-hidden="true" tabindex="-1"></a>    <span class="kw">inherit</span> url hash<span class="op">;</span></span>
<span id="cb13-18"><a href="#cb13-18" aria-hidden="true" tabindex="-1"></a>  <span class="op">};</span></span>
<span id="cb13-19"><a href="#cb13-19" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>Then in <code>flake.nix</code>, let’s provide the derivation with the data from <code>sources.json</code>:</p>
<div class="sourceCode" id="cb14"><pre class="sourceCode nix"><code class="sourceCode nix"><span id="cb14-1"><a href="#cb14-1" aria-hidden="true" tabindex="-1"></a><span class="kw">let</span></span>
<span id="cb14-2"><a href="#cb14-2" aria-hidden="true" tabindex="-1"></a>   <span class="va">supportedSystems</span> <span class="op">=</span> <span class="op">[</span></span>
<span id="cb14-3"><a href="#cb14-3" aria-hidden="true" tabindex="-1"></a>     <span class="st">&quot;x86_64-linux&quot;</span></span>
<span id="cb14-4"><a href="#cb14-4" aria-hidden="true" tabindex="-1"></a>     <span class="st">&quot;aarch64-linux&quot;</span></span>
<span id="cb14-5"><a href="#cb14-5" aria-hidden="true" tabindex="-1"></a>   <span class="op">];</span></span>
<span id="cb14-6"><a href="#cb14-6" aria-hidden="true" tabindex="-1"></a>   <span class="va">forAllSystems</span> <span class="op">=</span> nixpkgs.lib.genAttrs supportedSystems<span class="op">;</span></span>
<span id="cb14-7"><a href="#cb14-7" aria-hidden="true" tabindex="-1"></a><span class="kw">in</span></span>
<span id="cb14-8"><a href="#cb14-8" aria-hidden="true" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb14-9"><a href="#cb14-9" aria-hidden="true" tabindex="-1"></a>   <span class="co"># rest of file omitted for simplicity</span></span>
<span id="cb14-10"><a href="#cb14-10" aria-hidden="true" tabindex="-1"></a>   <span class="va">packages</span> <span class="op">=</span> forAllSystems <span class="op">(</span></span>
<span id="cb14-11"><a href="#cb14-11" aria-hidden="true" tabindex="-1"></a>     <span class="va">system</span><span class="op">:</span></span>
<span id="cb14-12"><a href="#cb14-12" aria-hidden="true" tabindex="-1"></a>     <span class="kw">let</span></span>
<span id="cb14-13"><a href="#cb14-13" aria-hidden="true" tabindex="-1"></a>       <span class="va">pkgs</span> <span class="op">=</span> <span class="bu">import</span> nixpkgs <span class="op">{</span> <span class="kw">inherit</span> system<span class="op">;</span> <span class="op">};</span></span>
<span id="cb14-14"><a href="#cb14-14" aria-hidden="true" tabindex="-1"></a>       <span class="co"># parse sources.json into a Nix attrset</span></span>
<span id="cb14-15"><a href="#cb14-15" aria-hidden="true" tabindex="-1"></a>       <span class="va">sources</span> <span class="op">=</span> <span class="bu">builtins</span>.fromJSON <span class="op">(</span><span class="bu">builtins</span>.readFile <span class="ss">./sources.json</span><span class="op">);</span></span>
<span id="cb14-16"><a href="#cb14-16" aria-hidden="true" tabindex="-1"></a>     <span class="kw">in</span></span>
<span id="cb14-17"><a href="#cb14-17" aria-hidden="true" tabindex="-1"></a>     <span class="kw">rec</span> <span class="op">{</span></span>
<span id="cb14-18"><a href="#cb14-18" aria-hidden="true" tabindex="-1"></a>       <span class="va">zen-browser-unwrapped</span> <span class="op">=</span> pkgs.callPackage <span class="ss">./zen-browser-unwrapped.nix</span> <span class="op">{</span></span>
<span id="cb14-19"><a href="#cb14-19" aria-hidden="true" tabindex="-1"></a>         <span class="kw">inherit</span> <span class="op">(</span>sources.$<span class="op">{</span><span class="va">system</span><span class="op">})</span> hash url<span class="op">;</span></span>
<span id="cb14-20"><a href="#cb14-20" aria-hidden="true" tabindex="-1"></a>         <span class="kw">inherit</span> <span class="op">(</span>sources<span class="op">)</span> version<span class="op">;</span></span>
<span id="cb14-21"><a href="#cb14-21" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb14-22"><a href="#cb14-22" aria-hidden="true" tabindex="-1"></a>         <span class="co"># if the above is difficult to understand, it is equivalent to the following:</span></span>
<span id="cb14-23"><a href="#cb14-23" aria-hidden="true" tabindex="-1"></a>         <span class="co"># hash = sources.${system}.hash;</span></span>
<span id="cb14-24"><a href="#cb14-24" aria-hidden="true" tabindex="-1"></a>         <span class="co"># url = sources.${system}.url;</span></span>
<span id="cb14-25"><a href="#cb14-25" aria-hidden="true" tabindex="-1"></a>         <span class="co"># version = sources.version;</span></span>
<span id="cb14-26"><a href="#cb14-26" aria-hidden="true" tabindex="-1"></a>       <span class="op">};</span></span>
<span id="cb14-27"><a href="#cb14-27" aria-hidden="true" tabindex="-1"></a>     <span class="op">}</span></span>
<span id="cb14-28"><a href="#cb14-28" aria-hidden="true" tabindex="-1"></a>   <span class="op">);</span></span>
<span id="cb14-29"><a href="#cb14-29" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>Now, running <code>nix build .#zen-browser-unwrapped</code> will be able to use the hashes
and URLs from <code>sources.json</code> to build the package!</p>
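<p>If the Nix lookup feels opaque, it amounts to parsing the JSON and indexing it by system name. A small Python sketch of that idea follows. <code>select_source</code> is a hypothetical helper and the hash value is a dummy. Unlike the Nix version, it raises a readable error for systems that are missing from the file.</p>
<pre class="python"><code>import json


def select_source(sources_json, system):
    """Pick the version, URL and hash for one system out of the sources.json text."""
    sources = json.loads(sources_json)
    if system == "version" or system not in sources:
        raise KeyError("unsupported system: " + system)
    entry = sources[system]
    return {"version": sources["version"], "url": entry["url"], "hash": entry["hash"]}


example = json.dumps({
    "version": "1.0.2-b.5",
    "x86_64-linux": {"url": "https://example.invalid/zen.tar.bz2", "hash": "sha256-DUMMY"},
})
print(select_source(example, "x86_64-linux"))
</code></pre>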
<h2 id="automating-it-in-ci">Automating it in CI</h2>
<p>We now have a script that can automatically fetch releases and generate hashes
and URLs, as well as a way for Nix to use the outputted JSON to build
derivations. All that’s left is to fully automate it using CI!</p>
<p>We are going to use GitHub Actions for this, as it’s free and easy and you’re
probably already hosting on GitHub.</p>
<p>Ensure you’ve set up Actions for your repo and given it sufficient permissions.</p>
<p>We’re gonna run it on a cron timer that checks for updates at 04:00 UTC (8 PM PST) every day.</p>
<p>We use DeterminateSystems’ actions to help set up Nix. Then, we simply run our
update script. Since we made the script return the tag it fetched, we can store
it in a variable and then use it in our commit message.</p>
<pre><code>name: Update to latest version, and update flake inputs

on:
  schedule:
    - cron: &quot;0 4 * * *&quot;
  workflow_dispatch:

jobs:
  update:
    name: Update flake inputs and browser
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Repository
        uses: actions/checkout@v4

      - name: Check flake inputs
        uses: DeterminateSystems/flake-checker-action@v4

      - name: Install Nix
        uses: DeterminateSystems/nix-installer-action@main

      - name: Set up magic Nix cache
        uses: DeterminateSystems/magic-nix-cache-action@main

      - name: Check for update and perform update
        run: |
          git config --global user.name &quot;github-actions[bot]&quot;
          git config --global user.email &quot;github-actions[bot]@users.noreply.github.com&quot;

          chmod +x ./update.nu
          export ZEN_LATEST_VER=&quot;$(./update.nu)&quot;

          git add -A
          git commit -m &quot;github-actions: update to $ZEN_LATEST_VER&quot; || echo &quot;Latest version is $ZEN_LATEST_VER, no updates found&quot;

          nix flake update --commit-lock-file

          git push</code></pre>
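<p>A note on the schedule: GitHub Actions evaluates cron expressions in UTC, so <code>0 4 * * *</code> means 04:00 UTC. The quick Python check below shows how that maps to Pacific time. The fixed offsets and the calendar date are assumptions chosen only to keep the example self-contained.</p>
<pre class="python"><code>from datetime import datetime, timedelta, timezone

# fixed offsets, to avoid depending on a time zone database
PST = timezone(timedelta(hours=-8), "PST")
PDT = timezone(timedelta(hours=-7), "PDT")

# an arbitrary day; only the 04:00 UTC time of day matters here
run_utc = datetime(2025, 1, 1, 4, 0, tzinfo=timezone.utc)


def local_run_time(zone):
    """Convert the scheduled UTC run time into the given zone."""
    return run_utc.astimezone(zone)


print(local_run_time(PST).strftime("%H:%M %Z"))  # 20:00 PST
print(local_run_time(PDT).strftime("%H:%M %Z"))  # 21:00 PDT
</code></pre>
<p>So the job runs at 8 PM during standard time and at 9 PM while daylight saving time is in effect.</p>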
<p>Now, our repository will automatically check for and perform updates every day!</p></main>
</article>
]]></summary>
</entry>
<entry>
    <title>a haskellian blog</title>
    <link href="https://blog.youwen.dev/a-haskellian-blog.html" />
    <id>https://blog.youwen.dev/a-haskellian-blog.html</id>
    <published>2024-05-25T00:00:00Z</published>
    <updated>2024-05-25T12:00:00Z</updated>
    <summary type="html"><![CDATA[<article>
  <header>
    <h1 class="text-4xl">
      <a href="./a-haskellian-blog.html">a haskellian blog</a>
    </h1>
    <p
      class="mb-1 mt-2 italic font-light text-lg text-accent-light dark:text-accent-dark"
    >
      a purely functional...blog?
    </p>
    <div class="mt-2">2024-05-25</div>
    <div class="mt-1 text-sm">
      (last updated: 2024-05-25T12:00:00Z)
    </div>
  </header>
  <main class="post mt-4"><p>Welcome! This is the first post on <em>The Involution</em> and also one that tests all
of the features.</p>
<!--<img-->
<!--  alt="conditional finality"-->
<!--  src="./images/conditional-finality.png"-->
<!--/>-->
<blockquote>
<p>A monad is just a monoid in the category of endofunctors, what’s the problem?</p>
</blockquote>
<h2 id="haskell">haskell?</h2>
<p>This entire blog is generated with <a href="https://jaspervdj.be/hakyll/">hakyll</a>. It’s
a library for generating static sites, written in Haskell, a purely functional
programming language. It’s a <em>library</em> because it doesn’t come with as many
batteries included as tools like Hugo or Astro. You set up most of the site
yourself by calling the library from Haskell.</p>
<p>Here’s a brief excerpt:</p>
<div class="sourceCode" id="cb1"><pre class="sourceCode haskell"><code class="sourceCode haskell"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="ot">main ::</span> <span class="dt">IO</span> ()</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a>main <span class="ot">=</span> hakyllWith config <span class="op">$</span> <span class="kw">do</span></span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a>    forM_</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a>        [ <span class="st">&quot;CNAME&quot;</span></span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a>        , <span class="st">&quot;favicon.ico&quot;</span></span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a>        , <span class="st">&quot;robots.txt&quot;</span></span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>        , <span class="st">&quot;_config.yml&quot;</span></span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a>        , <span class="st">&quot;images/*&quot;</span></span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a>        , <span class="st">&quot;out/*&quot;</span></span>
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a>        , <span class="st">&quot;fonts/*&quot;</span></span>
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a>        ]</span>
<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a>        <span class="op">$</span> \f <span class="ot">-&gt;</span> match f <span class="op">$</span> <span class="kw">do</span></span>
<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a>            route idRoute</span>
<span id="cb1-14"><a href="#cb1-14" aria-hidden="true" tabindex="-1"></a>            compile copyFileCompiler</span></code></pre></div>
<p>The code highlighting is also generated by hakyll.</p>
<hr />
<h2 id="why">why?</h2>
<p>Haskell is a purely functional language where data is immutable by default. Its syntax
actually makes it pretty elegant for declaring routes and “rendering” pipelines.</p>
<ol>
<li>Haskell is cool.</li>
<li>It comes with enough features that I don’t feel like I have to build
everything from scratch.</li>
<li>It comes with Pandoc, a Haskell library for converting between markup
formats. It’s probably more powerful than anything you could do in <code>nodejs</code>.
It renders all of the markdown to HTML, including the math.
<ol>
<li>It supports KaTeX as well as MathML. I’m a little disappointed with the
KaTeX support, though. It doesn’t render the math at build time, but simply
injects the KaTeX files and renders it client-side.</li>
</ol></li>
</ol>
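<p>As a rough sketch of how the math output can be selected, Hakyll lets you pass
custom Pandoc writer options to its compiler. The name <code>mathCompiler</code> is made up
for illustration, and this is not necessarily how this site is configured. Pandoc’s
<code>HTMLMathMethod</code> type also has <code>KaTeX</code> and <code>MathJax</code> options, each of
which takes a URL.</p>
<div class="sourceCode"><pre class="sourceCode haskell"><code class="sourceCode haskell">import Hakyll
import Text.Pandoc.Options (HTMLMathMethod (MathML), WriterOptions (..))

-- Hypothetical helper: Hakyll's stock Pandoc compiler, except that math
-- is written out as MathML instead of being left as plain TeX.
mathCompiler :: Compiler (Item String)
mathCompiler =
    pandocCompilerWith
        defaultHakyllReaderOptions
        defaultHakyllWriterOptions{writerHTMLMathMethod = MathML}</code></pre></div>
<p>A rule would then call <code>compile mathCompiler</code> wherever it would otherwise call
<code>compile pandocCompiler</code>.</p>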
<h3 id="speaking-of-math">speaking of math</h3>
<p>We can have math inline, like so:
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msubsup><mo>∫</mo><mrow><mi>−</mi><mi>∞</mi></mrow><mi>∞</mi></msubsup><mspace width="0.167em"></mspace><msup><mi>e</mi><mrow><mi>−</mi><msup><mi>x</mi><mn>2</mn></msup></mrow></msup><mspace width="0.167em"></mspace><mi>d</mi><mi>x</mi><mo>=</mo><msqrt><mi>π</mi></msqrt></mrow><annotation encoding="application/x-tex">\int_{-\infty}^\infty \, e^{-x^2}\,dx = \sqrt{\pi}</annotation></semantics></math>. This site ships semantic
MathML with its HTML and sends the MathJax script to the client as well.</p>
<p>It’d be nice if MathML were supported consistently across all browsers, but
unfortunately we aren’t quite there yet. Firefox is the only one where
everything looks about 80% of the way to LaTeX. On Safari and Chrome, even simple
equations like <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><msqrt><mi>π</mi></msqrt><annotation encoding="application/x-tex">\sqrt{\pi}</annotation></semantics></math> render improperly.</p>
<p>Pros of MathML:</p>
<ul>
<li>A little more accessible</li>
<li>Can be rendered without additional stylesheets. I just installed the Latin
Modern font, but this isn’t even really necessary</li>
<li>Built into most browsers (#UseThePlatform)</li>
</ul>
<p>Cons:</p>
<ul>
<li>Isn’t implemented consistently. Might look different on different browsers</li>
<li>Rendering quality isn’t as good as KaTeX</li>
</ul>
<p>This site has MathJax render all of the math so it looks nice and consistent
across browsers, but thanks to MathML the math still displays even if MathJax
can’t load (say, on a slow network). Best of both worlds.</p>
<p>Let’s try it now. Here’s a simple theorem:</p>
<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>a</mi><mi>n</mi></msup><mo>+</mo><msup><mi>b</mi><mi>n</mi></msup><mo>≠</mo><msup><mi>c</mi><mi>n</mi></msup><mspace width="0.167em"></mspace><mo>∀</mo><mspace width="0.167em"></mspace><mi>a</mi><mo>,</mo><mspace width="0.167em"></mspace><mi>b</mi><mo>,</mo><mspace width="0.167em"></mspace><mi>c</mi><mo>∈</mo><msup><mi>ℤ</mi><mo>+</mo></msup><mo>∧</mo><mi>n</mi><mo>≥</mo><mn>3</mn></mrow><annotation encoding="application/x-tex">
a^n + b^n \ne c^n \, \forall\, a,\,b,\,c \in \mathbb{Z}^{+} \land n \ge 3
</annotation></semantics></math></p>
<p>The proof is trivial and will be left as an exercise to the reader.</p>
<h2 id="seems-a-little-overengineered">seems a little overengineered</h2>
<p>Probably is. Not as much as the old one, though.</p></main>
</article>
]]></summary>
</entry>

</feed>