#import "@youwen/zen:0.1.0": *
#import "@preview/cetz:0.3.1"

#set math.equation(numbering: "(1)")
#show math.equation: it => {
  if it.block and not it.has("label") [
    #counter(math.equation).update(v => v - 1)
    #math.equation(it.body, block: true, numbering: none)#label("")
  ] else {
    it
  }
}
|
|
|
|
#show: zen.with(
|
|
title: "Math 6A Course Notes",
|
|
author: "Youwen Wu",
|
|
date: "Winter 2025",
|
|
subtitle: [Taught by Nathan Schley],
|
|
)
|
|
|
|
#outline()
|
|
|
|
= Lecture #datetime(day: 7, month: 1, year: 2025).display()
|
|
|
|
== Review of fundamental concepts
|
|
|
|
You can parameterize curves.
|
|
|
|
#example[Unit circle][
|
|
$
|
|
x = cos(t) \
|
|
y = sin(t)
|
|
$
|
|
]
|
|
|
|
For an explicit function
$ y = f(x) $
parameterize it by setting
$ x = t \ y = f(t) $
|
|
|
|
Parameterize a line passing through two points $arrow(p)_1$ and $arrow(p)_2$ by
|
|
$ arrow(c)(t) = arrow(p)_1 + t (arrow(p)_2 - arrow(p)_1) $
|
|
|
|
Take the derivative of each component to find the velocity vector. The
|
|
magnitude of velocity is speed.
|
|
|
|
#example[
|
|
$
|
|
arrow(c)(t) = <5t, sin(t)> \
|
|
arrow(v)(t) = <5, cos(t)>
|
|
$
|
|
]
|
|
|
|
== Polar coordinates
|
|
|
|
Write a set of Cartesian coordinates in $RR^2$ as polar coordinates instead, by
|
|
a distance from origin $r$ and angle about the origin $theta$.
|
|
|
|
$ (x,y) -> (r, theta) $
|
|
|
|
= Lecture #datetime(day: 9, month: 1, year: 2025).display()
|
|
|
|
== Vectors
|
|
|
|
A dot product of two vectors is a generalization of the sense of size for a
|
|
point or vector.
|
|
|
|
#example[
|
|
  How far is the point $(x_1, x_2, x_3)$ from the origin? \
  Answer: $sqrt(x_1^2 + x_2^2 + x_3^2)$
|
|
]
|
|
|
|
#definition[
|
|
For vectors $u$ and $v$, where
|
|
  $ v = vec(v_1, v_2, dots.v, v_n), u = vec(u_1, u_2, dots.v, u_n) $
  The dot product is defined as
  $ v dot u = sum_(i=1)^n v_i u_i $
|
|
]
|
|
|
|
#proposition[
|
|
The dot product of two vectors is the product of their magnitudes and the cosine of the angle between.
|
|
|
|
$ arrow(v) dot arrow(w) = ||arrow(v)|| dot ||arrow(w)|| cos theta $
|
|
]
|
|
|
|
= Lecture #datetime(day: 23, month: 1, year: 2025).display()
|
|
|
|
Midterm is next Thursday in class!
|
|
|
|
== Arclength and curvature
|
|
|
|
Easy way of finding curvature: reparameterize the curve to have speed 1, then
the curvature is the magnitude of the acceleration. If we can't do that, then
we need some other technique.
|
|
|
|
Given $arrow(c)(t) = <2t^(-1), 6, 2t>$, find the curvature $kappa(t)$.
|
|
$
|
|
kappa (t) = (||arrow(c)'(t) times arrow(c)''(t)||) / (||arrow(c)'(t)||^3)
|
|
$
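
As a sketch of how the formula applies to this particular curve (assuming $t > 0$, so that absolute values can be dropped; the curve is undefined at $t = 0$ anyway):

$
  arrow(c)'(t) = < -2t^(-2), 0, 2>, quad arrow(c)''(t) = <4t^(-3), 0, 0> \
  arrow(c)'(t) times arrow(c)''(t) = <0, 8t^(-3), 0> \
  ||arrow(c)'(t)|| = sqrt(4t^(-4) + 4) = (2 sqrt(1 + t^4)) / t^2 \
  kappa(t) = (8 t^(-3)) / (8 (1 + t^4)^(3/2) t^(-6)) = t^3 / (1 + t^4)^(3/2)
$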
|
|
|
|
== Arclength parametrization
|
|
|
|
Find an arc-length parametrization of $arrow(c)(t) = <e^t sin(t), e^t cos(t), 5e^t>$.
|
|
|
|
Let $s = 0$ when $t = 0$ and let $s$ be the arc length traveled along the curve
after $t$ seconds. Then we can find $s$ by integrating the curve's speed over
$t$.
|
|
|
|
$
|
|
s(t) = integral^t_0 ||arrow(c)'(u)|| dif u
|
|
$
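
Carrying this out for the curve above, as a worked sketch (the speed simplifies because the $sin$ and $cos$ cross terms cancel):

$
  arrow(c)'(t) = <e^t (sin(t) + cos(t)), e^t (cos(t) - sin(t)), 5e^t> \
  ||arrow(c)'(t)|| = e^t sqrt((sin(t) + cos(t))^2 + (cos(t) - sin(t))^2 + 25) = 3 sqrt(3) e^t \
  s(t) = integral_0^t 3 sqrt(3) e^u dif u = 3 sqrt(3) (e^t - 1) \
  t = ln(1 + s / (3 sqrt(3)))
$

Substituting this $t$ back into $arrow(c)(t)$ gives the curve in terms of $s$.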
|
|
|
|
= Lecture #datetime(day: 12, year: 2025, month: 2).display()
|
|
|
|
== Chain rule for multivariate functions
|
|
|
|
We find motivation for the chain rule.
|
|
|
|
Consider a hiker whose path is given by
|
|
|
|
$
|
|
arrow(c) (t) = <x(t), y(t)>
|
|
$
|
|
|
|
and
|
|
|
|
$
|
|
f(x,y) = x dot y
|
|
$
|
|
|
|
What does $x'(t)$ represent? Speed in $x$-direction. Likewise for $y'(t)$.
|
|
|
|
Say $x'(t) = 3$, $y'(t) = 4$. Then how far did we travel in $t$ seconds?
|
|
|
|
Suppose our slope in the $x$ direction is given by $m_x = 2$. Suppose the slope
|
|
in $y$ is $m_y = -2$. In fact $m_x = f_x (x,y)$ and $m_y = f_y (x,y)$ (here
|
|
$f_k$ is the partial derivative with respect to $k$).
|
|
|
|
So each change in $t$ of 1 changes the elevation by $+6$ meters due to motion in
the $x$-direction ($m_x dot x'(t) = 2 dot 3$) and by $-8$ meters due to motion
in the $y$-direction ($m_y dot y'(t) = -2 dot 4$).
|
|
|
|
So the total change $Delta z$ is given by
|
|
$
|
|
Delta z = m_x dot Delta x + m_y dot Delta y
|
|
$
|
|
and analogously in calculus land
|
|
|
|
$
|
|
(dif z) / (dif t) = (diff f) / (diff x) dot x'(t) + (diff f) / (diff y) dot y'(t)
|
|
$<chain-rule>
|
|
|
|
In fact @chain-rule is the chain rule.
|
|
|
|
#fact[
|
|
$
|
|
    (dif f) / (dif t) = (diff f) / (diff x) dot (dif x) / (dif t) + (diff f) / (diff y) dot (dif y) / (dif t) + (diff f) / (diff z) dot (dif z) / (dif t)
|
|
$
|
|
]
|
|
|
|
#example[
|
|
Consider $f(x) = x^x$. What is $f'(x)$?
|
|
|
|
We can do this with logarithmic differentiation but we can also do this with the multivariable chain rule.
|
|
|
|
$
|
|
f(x,y) =
|
|
$
|
|
]
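
A sketch of how the multivariable chain rule handles $x^x$, assuming we write it as $g(u,v) = u^v$ and then set $u = v = t$ with $t > 0$ (the names $g$, $u$, $v$ are introduced here only for illustration):

$
  (diff g) / (diff u) = v u^(v-1), quad (diff g) / (diff v) = u^v ln(u) \
  dif / (dif t) g(t, t) = t dot t^(t-1) dot 1 + t^t ln(t) dot 1 = t^t (1 + ln(t))
$

This agrees with what logarithmic differentiation gives.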
|
|
|
|
#example[
|
|
Find the derivative $dif/(dif t) (f(x,y))$, where $f(x,y) = x^y$, $x(t) = t$,
|
|
and $y(t) = 1$. Assume $t > 0$.
|
|
]
|
|
|
|
#example[
|
|
Find the partial derivative $diff/(diff s) f(x,y,z)$ where $f(x,y,z) = x^2 y^2 + z^3$, and
|
|
|
|
$
|
|
x(s,t) = s t \
|
|
y(s, t) = s^2 t \
|
|
z(s,t) = s t^2
|
|
$
|
|
]
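
A worked sketch of the last example, applying the chain rule term by term and then substituting $x$, $y$, $z$ in terms of $s$ and $t$:

$
  (diff f) / (diff s) = (diff f) / (diff x) (diff x) / (diff s) + (diff f) / (diff y) (diff y) / (diff s) + (diff f) / (diff z) (diff z) / (diff s) \
  = 2 x y^2 dot t + 2 x^2 y dot 2 s t + 3 z^2 dot t^2 \
  = 2 s^5 t^4 + 4 s^5 t^4 + 3 s^2 t^6 = 6 s^5 t^4 + 3 s^2 t^6
$

As a check, substituting first gives $f = s^6 t^4 + s^3 t^6$, whose partial derivative with respect to $s$ is the same expression.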
|
|
|
|
== Implicit differentiation
|
|
|
|
Review from single variable: given an equation $f(x,y) = 0$, we can
differentiate each term with respect to $x$, then collect all $(dif y)/(dif x)$
terms together and solve for it as a variable to obtain $(dif y)/(dif x)$ in
terms of $x$ and $y$.
|
|
|
|
We do something similar for more variables. Main idea: extraneous variables are
|
|
held constant in practice.
|
|
|
|
Example: consider the surface $3x^2 + 5y z + z^3 = 0$. We want
$(diff y)/(diff z)$ at some point. Use implicit differentiation by viewing the
surface as a level set of some larger function $F(x,y,z) = 3x^2 + 5y z + z^3$
(the level set is where $F(x,y,z) = 0$).
|
|
|
|
Differentiate both sides with respect to $z$, holding $x$ constant and treating
$y$ as a function of $z$. The $5 y z$ term needs the product rule (really the
chain rule @chain-rule).
$
  0 = diff / (diff z) (3x^2 + 5 y z + z^3) = 0 + (5 (diff y) / (diff z) z + 5y) + 3z^2 \
  (diff y) / (diff z) = - (5y + 3z^2) / (5z)
$
|
|
|
|
= Lecture #datetime(day: 18, year: 2025, month: 2).display()
|
|
|
|
== Critical points
|
|
|
|
When optimizing in 2D, the strategy depends on whether we're
|
|
|
|
- optimizing for all of $RR^2$ (or a region in $RR^2$)
|
|
- optimizing on a constraint (like a curve through $RR^n$)
|
|
|
|
We find critical points where the tangent plane is "flat": $m_x = 0$ and $m_y =
|
|
0$.
|
|
|
|
We classify critical points using the determinant of the Hessian matrix of
second partial derivatives.

$
  D = f_(x x) f_(y y) - f_(x y)^2
$
|
|
|
|
- if $D >0$ and $f_(x x) (x_0, y_0) > 0$, then $f(x_0, y_0)$ is a relative minimum.
|
|
- if $D > 0$ and $f_(x x) (x_0, y_0) < 0$, $f(x_0, y_0)$ is a relative maximum.
|
|
- if $D < 0$ then $f(x_0, y_0)$ is neither and we call it a saddle point.
|
|
- if $D = 0$ then we don't know
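
The test above can be sketched in a few lines of Python. This is only an illustration: the function name `classify_critical_point` is made up here, and it assumes the second partials have already been evaluated at a critical point.

```python
def classify_critical_point(f_xx, f_yy, f_xy):
    """Second derivative test, given second partials at a critical point."""
    d = f_xx * f_yy - f_xy ** 2
    if d > 0:
        # d > 0 forces f_xx to be nonzero, so its sign decides min vs max.
        return "relative minimum" if f_xx > 0 else "relative maximum"
    if d < 0:
        return "saddle point"
    return "inconclusive"


# f(x, y) = x^2 + y^2 at (0, 0): f_xx = 2, f_yy = 2, f_xy = 0
print(classify_critical_point(2, 2, 0))   # relative minimum
# f(x, y) = x^2 - y^2 at (0, 0): f_xx = 2, f_yy = -2, f_xy = 0
print(classify_critical_point(2, -2, 0))  # saddle point
# f(x, y) = x^4 + y^4 at (0, 0): every second partial vanishes
print(classify_critical_point(0, 0, 0))   # inconclusive
```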
|
|
|
|
== Lagrange multipliers
|
|
|
|
Optimizing along a constraint curve. Idea: navigate along the curve and look
for where the directional derivative along the curve is zero.
|
|
|
|
#example[
|
|
  Find the highest and lowest points on $f(x,y) = 81x^2 + y^2$ with the
  constraint $4x^2 + y^2 = 9$.
|
|
|
|
For notational purposes, we'll call $g(x,y) = 4x^2 + y^2$ and keep in mind
|
|
we're looking for $g(x,y) = 9$.
|
|
|
|
1. Find the gradients of $f$ and $g$.
|
|
$
|
|
    vec(f_x,f_y) &= vec(162x,2y) \
|
|
vec(g_x,g_y) &= vec(8x,2y)
|
|
$
|
|
2. We want to find the points where the gradients "align", i.e. we want these
|
|
vectors to be parallel, that is:
|
|
$
|
|
    nabla f = lambda nabla g \
|
|
162x = 8x dot lambda \
|
|
2y = 2y dot lambda
|
|
$
|
|
Remember to keep the constraint!
|
|
$
|
|
4x^2 + y^2 = 9
|
|
$
|
|
Breaking it down into cases,
|
|
$
|
|
162 = 8lambda => lambda = 81 / 4 "for" x != 0
|
|
$
|
|
which implies
|
|
$
|
|
    2y (lambda - 1) = 0 \
|
|
y = 0
|
|
$
|
|
since $lambda - 1$ is nonzero. So if $x$ is nonzero, $y$ must be zero.
|
|
|
|
$
|
|
4x^2 = 9 \
|
|
x = plus.minus 3 / 2
|
|
$
|
|
  Now consider when $x = 0$. Then $y = plus.minus 3$ (with $lambda = 1$). So our
  critical points are $(plus.minus 3/2, 0)$ and $(0, plus.minus 3)$. Finally,
  just plug these 4 critical points into $f$ and find the biggest/smallest.
|
|
|
|
]
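
A quick numerical sanity check of this example, as a sketch in plain Python. The brute-force walk around the ellipse, parameterized here as $(3/2 cos(t), 3 sin(t))$, is not part of the Lagrange method; it is only there to confirm that nothing on the constraint beats the candidate points.

```python
import math


def f(x, y):
    return 81 * x**2 + y**2


def g(x, y):
    return 4 * x**2 + y**2


# Candidates from the two cases: x != 0 forces y = 0, and x = 0 gives y = +-3.
candidates = [(1.5, 0.0), (-1.5, 0.0), (0.0, 3.0), (0.0, -3.0)]

for x, y in candidates:
    assert math.isclose(g(x, y), 9.0)  # each candidate lies on the constraint

values = {point: f(*point) for point in candidates}
highest = max(values.values())  # attained at (+-3/2, 0)
lowest = min(values.values())   # attained at (0, +-3)

# Brute force: sample f along the constraint curve.
samples = []
for k in range(3600):
    t = 2 * math.pi * k / 3600
    samples.append(f(1.5 * math.cos(t), 3 * math.sin(t)))

print(highest, lowest)
print(max(samples), min(samples))
```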
|
|
|
|
= Speedrun
|
|
|
|
In this chapter I wrote up notes for the entirety of the course, starting from
|
|
week 1, ending at week 8, because I skipped 80% of the classes up to the
|
|
midterm.
|
|
|
|
== Vector review
|
|
|
|
We know about functions $y = f(x)$. We can parameterize functions by expressing
|
|
them as a pair of coordinates $(x(t), y(t))$, modeling for example a particle
|
|
traveling through space with respect to time.
|
|
|
|
=== Derivative of parameterized curve
|
|
|
|
Given $(x(t), y(t))$, the derivative is given by
|
|
$
|
|
(x'(t), y'(t))
|
|
$
|
|
|
|
=== Parameterizing an ellipse
|
|
|
|
Consider an ellipse
|
|
|
|
$
|
|
x^2 / n + y^2 / m = r^2
|
|
$
|
|
|
|
Then we note that this is just a circle with $x$ stretched by a factor of
|
|
$sqrt(n)$ and likewise for $y$ by a factor of $sqrt(m)$. Then we can
|
|
parameterize the ellipse by
|
|
|
|
$
|
|
(r sqrt(n) cos(theta), r sqrt(m) sin(theta))
|
|
$
|
|
|
|
A sanity check: when $n = 1$, $m = 1$, $r = 1$, we have a unit circle and the
|
|
parametrization reflects that.
|
|
|
|
#example[
|
|
Consider the ellipse $x^2/9 + y^2/4 = 16$. Then the parametrization is
|
|
$(12cos(theta), 8sin(theta))$, and this is indeed right.
|
|
]
|
|
|
|
To parameterize a line passing through points $arrow(p)_1$ and $arrow(p)_2$,
|
|
simply
|
|
$
|
|
arrow(c) (t) = arrow(p)_1 + t(arrow(p)_2 - arrow(p)_1)
|
|
$
|
|
Algebraically we can justify this by noting $t=0$ gives $arrow(p)_1$ and $t=1$ gives $arrow(p)_2$.
|
|
|
|
=== Polar coordinates
|
|
|
|
Notation: $(r,theta)$ instead of $(x,y)$. Note that this is just highlighting
that we're parameterizing a curve in terms of a radius $r$ (also called
_modulus_) and argument (angle) $theta$.

To get $r$, see that $r^2 = x^2 + y^2$, and for $x > 0$ we have
$theta = arctan(y/x)$ (in the other quadrants the angle must be adjusted by
$pi$).
|
|
|
|
To plot in terms of $x,y$, note that
|
|
$
|
|
x = r cos(theta) \
|
|
y = r sin(theta)
|
|
$
|
|
|
|
=== Vector properties
|
|
|
|
Many nonrigorous statements about vectors to add to our toolbox.
|
|
|
|
Two vectors are parallel if and only if the angle formed between them is 0 or
$pi$ (pointing the same or opposite ways). Vectors can be added in linear
combinations.

The magnitude of a vector with $n$ components is given by the $n$-dimensional
Pythagorean theorem.
|
|
|
|
$
|
|
sqrt(a_1^2 + a_2^2 + dots.c + a_n^2)
|
|
$
|
|
|
|
A unit vector is a vector with magnitude 1.
|
|
|
|
=== Dot product
|
|
|
|
Dot products are useful for seeing properties about orthogonality, parallelism,
|
|
and projection.
|
|
|
|
#definition[
|
|
The dot product takes two vectors and returns a single scalar. It is sometimes
|
|
called the inner product.
|
|
]
|
|
|
|
We can view the dot product algebraically, and geometrically. In the first
|
|
sense, it's just the sum of the products of each pair of coordinates in the
|
|
vectors.
|
|
|
|
Let $A,B$ be vectors with $n$ entries, and $a_i, b_i$ be the $i^"th"$ entries
of their respective vectors. Then
|
|
$
|
|
A dot B = a_1 dot b_1 + a_2 dot b_2 + dots.c + a_n dot b_n
|
|
$
|
|
|
|
Geometrically, it's the product of the magnitudes of the vectors and the cosine
|
|
of angle between them.
|
|
|
|
$
|
|
A dot B = |A| dot |B| dot cos(theta)
|
|
$
|
|
|
|
where $theta$ is the angle between the vectors. It's nontrivial to prove this
|
|
and we don't have time.
|
|
|
|
Therefore, we know two vectors are parallel (pointing the same way) when
$A dot B = |A| dot |B|$, because $cos(theta) = 1$. We know two nonzero vectors
are orthogonal if the dot product is 0, because $cos(pi/2 + pi n)$ is 0.
|
|
|
|
Dot products are very useful for projection. Pick a unit vector $arrow(u)$.
For any vector $arrow(w)$, $arrow(u) dot arrow(w)$ gives the size of the part
of $arrow(w)$ parallel to $arrow(u)$.

Also, note that subtracting this parallel part from $arrow(w)$ leaves the
perpendicular part, whose magnitude is the shortest distance between the tip
of $arrow(w)$ and the line through the origin in the direction of $arrow(u)$!
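
A minimal sketch of this split in plain Python. The helper names `dot`, `norm`, and `split_along` are made up for illustration, and the direction is normalized inside the function so it does not have to be a unit vector already.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def split_along(w, direction):
    """Return (signed size of the parallel part, length of the perpendicular part)."""
    length = norm(direction)
    u = [x / length for x in direction]      # unit vector along the direction
    parallel = dot(u, w)                     # size of the part of w along u
    perp = [wi - parallel * ui for wi, ui in zip(w, u)]
    return parallel, norm(perp)


parallel, distance = split_along([3.0, 4.0], [1.0, 0.0])
print(parallel, distance)  # 3.0 4.0
```

Here the tip of $(3, 4)$ sits 4 units away from the $x$-axis, and its shadow on the axis has length 3.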
|
|
|
|
=== Normalizing vectors
|
|
|
|
Normalizing a vector means obtaining a vector pointing in the same direction,
|
|
with magnitude 1. For a nonzero vector $v$, its normalized vector is given by
|
|
|
|
$
|
|
v / (|v|)
|
|
$
|
|
|
|
where $|v|$ is the magnitude.
|
|
|
|
=== Cross product
|
|
|
|
The cross product $times$ is a binary operation on vectors. For our purposes
|
|
it's only defined in $RR^3$ (in fact it's defined in some other dimensions, but
|
|
in general it is not defined).
|
|
|
|
It produces a third vector perpendicular to both original vectors. Its
|
|
direction is determined by the right hand rule.
|
|
|
|
The magnitude of the cross product is given by
|
|
|
|
$
|
|
|a times b| = |a| |b| sin theta
|
|
$
|
|
|
|
where $theta$ is the angle between $a$ and $b$. So we know two vectors are
|
|
parallel if the magnitude of the cross product is 0.
|
|
|
|
In fact this magnitude is also the area of the parallelogram spanned by the two
vectors. This gives another way to see why a zero cross product implies the
vectors are parallel: the parallelogram is flat.
|
|
|
|
We can compute the cross product by taking a determinant.
|
|
|
|
$
|
|
a times b = det mat(i,j,k; a_1, a_2, a_3; b_1,b_2,b_3)
|
|
$
|
|
|
|
The cross product is anticommutative, so $a times b = -(b times a)$.
|
|
|
|
== Vector applications, building geometric intuition
|
|
|
|
Let's build some geometric intuition for working with vectors, especially in
|
|
$RR^3$.
|
|
|
|
=== Moving the equation of a line or plane while maintaining orientation
|
|
|
|
We consider two cases. If we're working with a parametric equation, then we can
|
|
just add a vector to the equation to shift everything by said vector.
|
|
|
|
Otherwise, with an implicit equation, let's consider only the plane (since we
|
|
need two implicit equations to specify a line, and at that point it's better to
|
|
solve the system and parameterize).
|
|
|
|
The plane equation $a x + b y + c z = r$ can be shifted $n$ units in the
positive $x$, $y$, or $z$ direction by replacing all $x$, $y$, or $z$ with
$x - n$ and so on. Then we can just multiply out and collect terms.
|
|
|
|
=== Moving equation of a line/plane to pass through a specific point
|
|
|
|
We want to do this without changing direction or orientation. We can just use
|
|
our technique discussed above for this.
|
|
|
|
First let's make sure the plane passes through the origin. If we have
|
|
$
|
|
3x - 2y + 7z = 12
|
|
$
|
|
we can just set the right-hand side to $0$
|
|
$
|
|
3x - 2y + 7z = 0
|
|
$
|
|
Note that by our technique of shifting the plane or line, we see that the
|
|
constant on the right side is determined entirely by shifts in space that
|
|
preserve direction/orientation. So we are sure that setting it to 0 does
|
|
nothing but move the plane/line through 0.
|
|
|
|
=== Equation of a line through a given point perpendicular to a plane
|
|
|
|
Let's say we have a point $(5,2,3)$. Let's consider both parametric planes and
|
|
implicitly defined planes.
|
|
|
|
Suppose the plane is given by
|
|
$
|
|
2x + 3y + 4z = 12
|
|
$
|
|
Then observe that we need the line to be perpendicular to the plane but it
|
|
doesn't really matter where the plane is. Recall that we can easily shift a
|
|
plane around while preserving orientation. So let's just move the plane through
|
|
the origin again.
|
|
|
|
$
|
|
2x + 3y + 4z = 0
|
|
$
|
|
|
|
Now note that we can obtain a perpendicular vector to the plane by finding a
vector perpendicular to every vector lying in this plane.
|
|
|
|
Then note that $vec(2,3,4)$ is one such perpendicular vector. See this by
|
|
$
|
|
vec(2,3,4) dot vec(x,y,z) = 2x + 3y + 4z = 0
|
|
$
|
|
Then we can simply scale our perpendicular vector by a parameter $t$ to obtain
a parametric line that's perpendicular to the plane. Now we can just shift it
by our desired point, and it remains orthogonal while passing through the point
(at $t = 0$).
|
|
|
|
$
|
|
vec(5,2,3) + t vec(2,3,4)
|
|
$
|
|
|
|
Now consider when we have a parametric equation, say
|
|
$
|
|
vec(5,0,0) + s vec(2,0,-1) + t vec(0,4,-3)
|
|
$
|
|
Then as long as we're perpendicular to both of the vectors being multiplied by
$s$ and $t$, we're perpendicular to the plane. This is easy to show: the plane
is the span of the basis vectors $vec(2,0,-1)$ and $vec(0,4,-3)$ (shifted by
$vec(5,0,0)$), so any vector perpendicular to both is perpendicular to the
entire plane.
|
|
|
|
So just take their cross product to get the desired direction vector: here
$vec(2,0,-1) times vec(0,4,-3) = vec(4,6,8)$.
|
|
|
|
=== Distance between point and a plane
|
|
|
|
Think geometrically. We really want to move in a perpendicular line from the
|
|
plane to the point (because that's the closest distance between them). We
|
|
should start on the point, but where do we stop on the plane?
|
|
|
|
Consider
|
|
$
|
|
4x + y + 3z = 1
|
|
$
|
|
and we want the distance to $(1,1,-5)$. The perpendicular line passing through the point is
|
|
$
|
|
vec(1,1,-5) + t vec(4,1,3)
|
|
$
|
|
The line "starts" at the point when $t=0$, so let's find a value of $t$ that
makes it stop precisely on the plane. To do this, simply note that our line is
really a parametric equation $(x,y,z)$ where
$x = 1 + 4t, y = 1 + t, z = -5 + 3t$. Then we can simply plug these into the
equation of the plane and solve for $t$ to get the value of $t$ where the line
meets the plane. Then plug $t$ into our line equation to get the point where
the line meets the plane. The magnitude of the displacement from $(1,1,-5)$ to
that point, $|t| dot |vec(4,1,3)|$, is the distance between the point and
plane.
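
The steps above can be sketched in plain Python. The function name `distance_to_plane` and its argument order are made up for illustration; the plane is written as $arrow(n) dot (x,y,z) = r$ with $arrow(n) = (4,1,3)$ and $r = 1$.

```python
import math


def distance_to_plane(normal, r, point):
    """Distance from point to the plane normal . (x, y, z) = r.

    Walks along the perpendicular line point + t * normal until it hits
    the plane, then measures how far it walked.
    """
    n_dot_p = sum(n * p for n, p in zip(normal, point))
    n_dot_n = sum(n * n for n in normal)
    t = (r - n_dot_p) / n_dot_n              # parameter where the line meets the plane
    foot = [p + t * n for p, n in zip(point, normal)]
    return t, foot, abs(t) * math.sqrt(n_dot_n)


t, foot, dist = distance_to_plane([4, 1, 3], 1, [1, 1, -5])
print(t, foot, dist)
```

For this example the plane equation becomes $26t - 10 = 1$, so $t = 11/26$ and the distance is $11/sqrt(26)$.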
|
|
|
|
=== Area of a parallelogram formed by two vectors in $RR^3$
|
|
|
|
In $RR^2$ we can take the determinant. In $RR^3$ the determinant is the volume
|
|
of the parallelepiped. So instead we just take the magnitude of the cross
|
|
product.
|
|
|
|
=== Distance from point to line passing through two other points in $RR^2$
|
|
|
|
Note that there are multiple ways to do this. Let $P = (1,7)$, $A = (1,1)$, and
|
|
$B = (3,9)$. We want the distance from $P$ to the line between $A$ and $B$.
|
|
|
|
We could just find the line between them and then use a 2-dimensional version
of our point to plane technique (solve for a vector orthogonal to the line, in
the direction of $P$, passing through $P$), but since we're in $RR^2$, we can
instead work with the vectors from $A$ to $B$ and from $A$ to $P$, and split
the latter into parts parallel and perpendicular to the line.

In particular, for two vectors $A$ and $B$ based at the same point, the
magnitude of the cross product $A times B$ is $|A| dot |B| dot sin theta$. So
if we want the distance from the tip of $B$ to the line spanned by $A$, we
should do $(|A times B|)/(|A|)$.
|
|
|
|
If instead we want the length of the projection of $B$ onto $A$, we should do
|
|
$(A dot B)/(|A|)$. There are multiple ways to interpret this geometrically.
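
A sketch of the cross product approach for the points above, in plain Python. The function name `distance_point_to_line` is made up for illustration; it treats $B - A$ as the vector spanning the line and $P - A$ as the vector whose tip we measure from, and uses the $z$-component of their cross product (the only nonzero component for vectors in the plane).

```python
import math


def distance_point_to_line(p, a, b):
    """Distance from p to the line through a and b in the plane."""
    dx, dy = b[0] - a[0], b[1] - a[1]        # direction of the line, B - A
    px, py = p[0] - a[0], p[1] - a[1]        # P relative to A
    cross = dx * py - dy * px                # z-component of (B - A) x (P - A)
    return abs(cross) / math.hypot(dx, dy)


d = distance_point_to_line((1, 7), (1, 1), (3, 9))
print(d)
```

With $P = (1,7)$, $A = (1,1)$, $B = (3,9)$ the cross product is 12 and $|B - A| = sqrt(68)$, so the distance is $6/sqrt(17)$.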
|
|
|
|
== Derivative of a curve
|
|
|
|
What is the derivative of a curve? We can view the derivative at some point $x$
|
|
as the slope of the tangent line. But that doesn't give the derivative of a
|
|
parametric curve traveling through the plane.
|
|
|
|
However, this is simple. Because our curve is parameterized, each coordinate
$x,y,z$ and so on is a function of $t$ alone. Therefore, we can build another
vector by taking the derivative of each coordinate, which gives us a vector of
the rates at which each coordinate is changing.
|
|
|
|
== Arclength parametrization
|
|
|
|
We want to find an arc length parametrization of a curve. That is, we want to
|
|
express a curve in terms of how far we've traveled on it.
|
|
|
|
Idea: let $s=0$ when $t=0$, and let $s$ be the arclength traveled after $t$
|
|
seconds. Then we can integrate the curve's speed over $t$ to find the arc
|
|
length.
|
|
|
|
$
|
|
s(t) = integral_0^t ||arrow(c)'(u)|| dif u
|
|
$
|
|
|
|
Then, we can solve for $t$ in terms of $s$, and plug it back into our original
vector in terms of $t$, $arrow(c)(t)$. Then its position will be expressed in
terms of $s$, $arrow(c)(s)$, and we'll have a parametrization by arc length.

A key notion here is that the velocity vectors are now tangent, but also unit
length (since we should imagine that we are always moving at unit speed along
the curve at any given point).
|
|
|
|
What direction does the _acceleration_ vector point in, then?
|
|
|
|
For a parameterized curve $arrow(c)(t)$ with velocity $arrow(v)(t)$ and
acceleration $arrow(a)(t)$, the speed is the magnitude $|arrow(v)(t)|$. When
the speed is constant, $|arrow(v)(t)|$ doesn't change with time.
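
One way to fill in the step, as a sketch: constant speed means $|arrow(v)(t)|^2$ is constant, so its derivative is zero.

$
  0 = dif / (dif t) |arrow(v)(t)|^2 = dif / (dif t) (arrow(v)(t) dot arrow(v)(t)) = 2 arrow(v)(t) dot arrow(a)(t)
$

So under constant speed the acceleration is perpendicular to the velocity, which answers the question about its direction.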
|
|
|
|
A prototypical example: consider uniform circular motion. Then the speed is
always constant, yet there is always an acceleration vector pointing
perpendicular to the tangent velocity vector (the centripetal acceleration).
|
|
|
|
== Curvature
|
|
|
|
For a graph $y = f(x)$ in $RR^2$, the second derivative measures concavity, and
it matches the (signed) curvature wherever $f'(x) = 0$. But this is just a
lucky coincidence. Let's think about the notion of curvature.
|
|
|
|
Somehow, the curvature measures the best-fitting second order approximation of
|
|
a curve. Curvature is a measure of concavity with respect to the direction
|
|
perpendicular to the direction of motion.
|
|
|
|
We do have a formula
|
|
|
|
$
|
|
kappa(t) = (|arrow(c)'(t) times arrow(c)'' (t)|) / (|arrow(c)'(t)|^3)
|
|
$
|
|
|
|
== Building intuition for curvature
|
|
|
|
Curvature is essentially asking how closely our curve resembles a unit circle
|
|
at a given point. It follows that a unit circle has a curvature 1, and a
|
|
straight line has curvature 0.
|
|
|
|
Let's consider a parametric curve
|
|
$
|
|
arrow(s)(t) = vec(t-sin(t), 1-cos(t))
|
|
$
|
|
Consider the unit tangent vectors to the curve at some points. We're
|
|
essentially asking "how much do these tangent vectors change direction?" and
|
|
considering points arbitrarily close. This is the essence of curvature.
|
|
|
|
Now let's think about some geometric intuition. Suppose you're moving along a
|
|
curve. It follows that if the curve is bending very sharply, the tangent
|
|
vectors are changing direction very fast, in larger increments. Likewise, if
the curve is straighter, the tangent vectors change direction more slowly. And
|
|
when you're traveling on a straight line, the tangent vectors don't change
|
|
direction at all.
|
|
|
|
We want a mathematical model of this notion of "changing tangent vectors." The
|
|
idea is that we can capture this with some sort of derivative, but with respect
|
|
to what? If we just want to capture when the tangent vector is changing
|
|
directions, we clearly want to ignore any change in the actual magnitude of the
|
|
tangent vector itself, since this has no bearing on the directional change. Put
|
|
another way, if you're traveling along the curved path, the speed at which you
|
|
go (the magnitude of the tangent velocity vector) really doesn't matter with
|
|
regard to curvature. We only care when the tangent velocity vector changes
|
|
direction!
|
|
|
|
So suppose $T$ is the unit tangent vector at each point. We want the rate of
|
|
change of $T$, its derivative, but _not_ $(dif T)/(dif t)$, with respect to
|
|
time. This is because we don't really care how _fast_ the tangent vector
|
|
changes with respect to time, curvature is about measuring how much the tangent
|
|
vector changes as we move some arbitrary distance on the curve!
|
|
|
|
Instead, we really want $(dif T)/(dif s)$, where $s$ is the arc length we've
|
|
traveled so far (from some arbitrarily chosen starting point). And this makes
|
|
intuitive sense, because we just want how fast the tangent vector changes
|
|
direction with regards to the distance we travel on the curve.
|
|
|
|
Now we may note that we can actually find curvature if we can find an
|
|
arc-length parametrization of $arrow(s)$! Because an arclength
|
|
parametrization always has unit speed, its derivative gives the tangent
|
|
vectors at every point, and we can differentiate with respect to arclength and
|
|
take the magnitude to obtain the curvature. That is,
|
|
$
|
|
kappa = abs((dif T) / (dif s))
|
|
$
|
|
|
|
But if we can't find an arclength parametrization, we're out of luck. Let's
|
|
continue investigating.
|
|
|
|
Consider a prototypical example:
|
|
|
|
#example[
|
|
Let $arrow(s)(t) = vec(cos(t) R, sin(t) R)$. We're drawing a circle with
|
|
radius $R$.
|
|
|
|
Let's differentiate with respect to $t$.
|
|
$
|
|
arrow(s)'(t) = vec(-sin(t) R, cos(t) R)
|
|
$
|
|
But we want unit tangent vectors, so let's normalize it. Call our unit
|
|
tangent $T$.
|
|
|
|
$
|
|
T(t) = (arrow(s)'(t)) / (|arrow(s)'(t)|)
|
|
$
|
|
We have
|
|
$
|
|
|arrow(s)'(t)| = lr(|vec(-sin(t) R, cos(t) R)|) \
|
|
= sqrt(sin^2(t) R^2 + cos^2(t) R^2) = R
|
|
$
|
|
Now we have our unit tangent vectors in terms of $t$.
|
|
$
|
|
T(t) = (arrow(s)'(t)) / R
|
|
$
|
|
We take
|
|
$
|
|
(dif T) / (dif t) = vec(-cos(t), -sin(t))
|
|
$
|
|
and the magnitude of this is just 1.
|
|
|
|
We should immediately note that not all cases will be so easy. When taking
|
|
$|arrow(s)'(t)|$, in general, we have a very disgusting square root that cannot
|
|
be simplified.
|
|
|
|
Now note:
|
|
$
|
|
    abs((dif T)/(dif s)) = abs((dif T)/(dif t)) / abs((dif s)/(dif t)) = 1 / R
|
|
$
|
|
|
|
So in fact $kappa = 1/R$!
|
|
]
|
|
|
|
The key here is this equation:
|
|
$
|
|
abs((dif T)/(dif s)) = abs((dif T)/(dif t)) / abs((dif s)/(dif t))
|
|
$
|
|
Although we didn't have an arclength parametrization of the curve, the rate of
change of $T$ with respect to arclength is given by its rate of change with
respect to time, divided by the speed of the curve, which "corrects" for the
discrepancies introduced by taking the derivative with respect to time!
|
|
|
|
Obviously this is very nonrigorous. But I'm running out of time.
|
|
|
|
Now if we do a bunch more reasoning and nonsense we can obtain the formula
|
|
above, but at this point the goal seems to have been reached. We have an
|
|
intuitive understanding of what curvature should measure.
|
|
|
|
So recall the formula:
|
|
|
|
$
|
|
kappa(t) = (|arrow(c)'(t) times arrow(c)'' (t)|) / (|arrow(c)'(t)|^3)
|
|
$
|
|
|
|
Essentially, it's saying that the area of the parallelogram formed by the
|
|
curve's velocity and acceleration vectors, divided by the cube of the speed,
|
|
gives us the curvature. Intuitively we see that the more the acceleration
|
|
vector diverges from the velocity vector, the sharper the velocity vector is
|
|
changing direction, which gives us a notion of curvature. And somehow dividing
|
|
by the speed cubed is normalizing out any influence due to speed to give us our
|
|
curvature.
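
As a numerical sanity check of the formula (a minimal sketch in plain Python; the helper names `cross`, `norm`, `curvature`, and `circle_curvature` are made up for illustration), a circle of radius $R$ embedded in $RR^3$ should come out with curvature $1/R$ at every $t$:

```python
import math


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def norm(a):
    return math.sqrt(sum(x * x for x in a))


def curvature(velocity, acceleration):
    """kappa = |c' x c''| / |c'|^3, given c'(t) and c''(t) at one value of t."""
    return norm(cross(velocity, acceleration)) / norm(velocity) ** 3


def circle_curvature(R, t):
    # c(t) = (R cos t, R sin t, 0), differentiated by hand.
    v = (-R * math.sin(t), R * math.cos(t), 0.0)
    a = (-R * math.cos(t), -R * math.sin(t), 0.0)
    return curvature(v, a)


print(circle_curvature(2.0, 0.7))             # close to 0.5
print(curvature((1.0, 2.0, 3.0), (0.0, 0.0, 0.0)))  # straight line: 0.0
```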
|
|
|
|
== Quadric surfaces
|
|
|
|
We really only need to know the identities and derivatives to do some integral
|
|
hacks.
|
|
|
|
The quadric surface is the generalization of the conic section to higher dimensions.
|
|
|
|
Now recall that one conic section is the hyperbola. It turns out we can define
|
|
analogues of the trigonometric functions that parameterize a so-called unit
|
|
hyperbola instead of the unit circle. These functions are
|
|
|
|
$
|
|
  sinh(x) = (e^x - e^(-x)) / 2 \
  cosh(x) = (e^x + e^(-x)) / 2 \
|
|
tanh(x) = sinh(x) / cosh(x)
|
|
$
|
|
|
|
The derivatives are
|
|
|
|
$
|
|
(dif) / (dif x) sinh(x) = cosh(x) \
|
|
(dif) / (dif x) cosh(x) = sinh(x)
|
|
$
|
|
|
|
It's pretty easy to show these using their definitions, and derive the derivative of $tanh$.
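
For instance, as a sketch: differentiating the definition of $sinh$ term by term gives $cosh$, and the quotient rule together with the identity $cosh^2(x) - sinh^2(x) = 1$ handles $tanh$.

$
  dif / (dif x) sinh(x) = (e^x + e^(-x)) / 2 = cosh(x) \
  dif / (dif x) tanh(x) = (cosh(x) cosh(x) - sinh(x) sinh(x)) / (cosh^2(x)) = 1 / (cosh^2(x))
$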
|
|
|
|
== End of weeks 1-4
|
|
|
|
That was all of the content of week 1 to 4. Now we shift to weeks 5-7, where we studied more about vectors and their derivatives.
|
|
|
|
== Partial derivatives
|
|
|
|
These are the slopes of tangent lines for the graph in the direction of the
|
|
changing variable.
|
|
|
|
#theorem[Clairaut's theorem][
|
|
  Suppose $f : RR^2 -> RR$ is defined on a disk $D$ that contains a point
  $(a,b)$. If the functions $f_(x y)$ and $f_(y x)$ are continuous on this disk
  then they are equal at $(a,b)$.
|
|
]
|
|
|
|
#theorem[Extended Clairaut][
|
|
  Suppose $f : RR^2 -> RR$ is defined on a disk $D$ that contains $(a,b)$. If
  all of the mixed partial derivatives up to a given order are continuous
  everywhere in the disk $D$, then the order of differentiation in those mixed
  partials does not matter.
|
|
]
|
|
|
|
== Multivariable chain rule
|
|
|
|
The product rule actually follows from it. Recall:
|
|
$
|
|
(f(x) g(x))' = f'(x) g(x) + f(x) g'(x)
|
|
$
|
|
Then instead let's replace $f$ and $g$ with $x$ and $y$, such that we have something like
|
|
$
|
|
z = x y
|
|
$
|
|
Then let $x$ and $y$ be functions of $t$. So
|
|
$
|
|
z = x(t) y(t)
|
|
$
|
|
The partial derivatives
|
|
$
|
|
(diff z) / (diff x) = y, (diff z) / (diff y) = x
|
|
$
|
|
By the multivariable chain rule,
|
|
$
|
|
  (dif z) / (dif t) = x'(t) (diff z) / (diff x) + (diff z) / (diff y) y'(t) = x'(t) y(t) + x(t) y'(t)
|
|
$
|
|
|
|
== Implicit differentiation
|
|
|
|
It's similar to the single variable implicit differentiation, but remember to
|
|
hold the extraneous variables constant in practice.
|
|
|
|
#example[
|
|
Suppose you have a surface
|
|
$
|
|
3x^2 + 5 y z + z^3 = 0
|
|
$
|
|
And you want a partial derivative $(diff y)/(diff z)$ at some point. You can
|
|
use implicit differentiation by viewing the surface as a level set of some
|
|
larger function $F(x,y,z) = 3x^2 + 5y z + z^3$ where $F(x,y,z) = 0$.
|
|
|
|
  Now we differentiate both sides of $F(x,y,z) = 0$ with respect to $z$,
  holding $x$ constant and treating $y$ as a function of $z$:
  $
    0 = diff / (diff z)(3x^2) + diff / (diff z)(5y z) + diff / (diff z)(z^3) \
    = 0 + (5 (diff y) / (diff z) z + 5y) + 3z^2
  $
  Then solve for $(diff y)/(diff z)$ to get
  $(diff y)/(diff z) = -(5y + 3z^2)/(5z)$.
|
|
]
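
A numerical check of this example, as a sketch in plain Python. It relies on the fact that this particular surface can be solved for $y$ explicitly when $z != 0$, which is not true of implicit surfaces in general. The function names and the sample point $(x, z) = (1, 2)$ are chosen here only for illustration.

```python
def y_on_surface(x, z):
    """Solve 3x^2 + 5yz + z^3 = 0 for y (requires z != 0)."""
    return -(3 * x**2 + z**3) / (5 * z)


def dy_dz_formula(x, z):
    """The implicit differentiation result -(5y + 3z^2) / (5z)."""
    y = y_on_surface(x, z)
    return -(5 * y + 3 * z**2) / (5 * z)


def dy_dz_numeric(x, z, h=1e-6):
    """Central difference in z with x held constant."""
    return (y_on_surface(x, z + h) - y_on_surface(x, z - h)) / (2 * h)


x0, z0 = 1.0, 2.0
print(dy_dz_formula(x0, z0), dy_dz_numeric(x0, z0))
```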
|
|
|
|
== Multivariable chain rule as matrix
|
|
|
|
Consider $z = f(x,y)$. Then the derivative of $z$ with respect to $x$ and $y$ would be a matrix:
|
|
$
|
|
mat((diff z)/(diff x), (diff z)/(diff y))
|
|
$
|
|
|
|
Now suppose the coordinate system changes to
|
|
$
|
|
x = 3u - v \
|
|
y = 2v
|
|
$
|
|
Now suppose we want $z_u$ and $z_v$ at $(x,y) = (3,6)$. Then this is actually
|
|
$(u,v) = (2,3)$ in $u v$ coordinates. The partials of $z$ with respect to $u$
|
|
and $v$ are just matrix multiplication:
|
|
$
|
|
mat(z_u,z_v) = mat(z_x, z_y) mat(x_u,x_v;y_u,y_v)
|
|
$
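
For the change of coordinates above, the matrix entries are $x_u = 3$, $x_v = -1$, $y_u = 0$, $y_v = 2$, so as a worked sketch (with $z_x$ and $z_y$ evaluated at $(x,y) = (3,6)$):

$
  mat(z_u, z_v) = mat(z_x, z_y) mat(3, -1; 0, 2) = mat(3 z_x, -z_x + 2 z_y)
$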
|
|
|
|
== Differentials
|
|
|
|
Differentials are about linear approximation. Recall in the single variable
|
|
case we use the tangent line approximation and differentials to approximate
|
|
functions. In the multivariable case it's the tangent plane approximation and
|
|
the directional derivative.
|
|
|
|
Let $f$ be a function with two inputs and one output, say
|
|
$
|
|
f(x,y) = x^2 + x cos(y)
|
|
$
|
|
|
|
Then the tangent plane at $(x_0,y_0)$ is the plane that best approximates the
|
|
function at that point. Now we need two slopes instead of one.
|
|
|
|
Take a look at $(1,pi/2)$. Then the partial derivatives at that point are 2 and
|
|
-1. The idea is we start at $(1,pi/2,1)$ and then move a slight nudge in either
|
|
the $x$ or $y$ directions. In the $x$ direction, we move by $Delta x$ and get
|
|
an increase in $z$ (height) of $2 dot Delta x$.
|
|
|
|
If we move a slight nudge in the $y$ direction $Delta y$, then our height
|
|
should increase (decrease) by $-1 dot Delta y$.
|
|
|
|
Then we have a tangent plane approximation of
|
|
$
|
|
Delta z approx 2 dot Delta x - 1 dot Delta y
|
|
$
|
|
|
|
And like in single variable, we can replace the $Delta$ with $dif$, and the
relation becomes exact on the tangent plane.
$
  dif z = 2 dot dif x - 1 dot dif y
$
|
|
Sidenote: how can we actually have the equation of the tangent plane in terms
|
|
of $x$,$y$,$z$? Just note that $x = 1 + Delta x$, $y = pi/2 + Delta y$, and $z
|
|
= 1 + Delta z$. So just substitute
|
|
$
|
|
(z - 1) = 2(x-1) - 1(y-pi / 2)
|
|
$
|
|
|
|
So in general, the differential version
|
|
$
|
|
dif z = m_x dif x + m_y dif y
|
|
$
|
|
|
|
and the tangent plane equation at $(x_0, y_0, z_0)$:
|
|
$
|
|
z = z_0 + m_x (x-x_0) + m_y (y-y_0)
|
|
$
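
To see how good the approximation is near the point, here is a small sketch in plain Python comparing $f(x,y) = x^2 + x cos(y)$ with its tangent plane at $(1, pi/2, 1)$. The nudge sizes `dx` and `dy` are arbitrary values picked for illustration.

```python
import math


def f(x, y):
    return x**2 + x * math.cos(y)


def tangent_plane(x, y):
    """Tangent plane to f at (1, pi/2, 1), using slopes m_x = 2 and m_y = -1."""
    return 1 + 2 * (x - 1) - 1 * (y - math.pi / 2)


dx, dy = 0.01, -0.02
exact = f(1 + dx, math.pi / 2 + dy)
approx = tangent_plane(1 + dx, math.pi / 2 + dy)
print(exact, approx, abs(exact - approx))
```

The gap between the two values shrinks much faster than the nudges themselves, which is what "best linear approximation" means in practice.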
|
|
|
|
== Directional derivative
|
|
|
|
Now take another look at the linear approximation in $z$
|
|
$
|
|
Delta z approx m_x Delta x + m_y Delta y
|
|
$
|
|
|
|
The directional derivative reinterprets this as matrix multiplication
|
|
$
|
|
Delta z approx mat(m_x,m_y) vec(Delta x, Delta y)
|
|
$
|
|
This is the same as a dot product, in fact, the dot product of the gradient
|
|
given by $vec(m_x,m_y)$ and a tiny movement vector.
|
|
|
|
== Directional derivative
|
|
|
|
We mentioned this before, now let's discuss in more detail. If you move by
|
|
$Delta x$ and $Delta y$, in the $x$ and $y$ directions, then the directional
|
|
derivative computes your change in height on the tangent plane.
|
|
|
|
Consider the question: what is the derivative of $f$ in the direction of the
|
|
vector $vec(3,4)$?
|
|
|
|
Now consider $m_x (3) + m_y (4) = 3m_x + 4m_y$. This is almost the answer, but
|
|
really this is the change in $f$ resulting from a movement in the direction of
|
|
$vec(3,4)$. If we want the derivative, we're asking for the slope. We have
|
|
the rise; the run is $sqrt(3^2 + 4^2) = 5$, so the answer is $(3m_x + 4m_y)/5$.
|
|
|
|
A more intuitive way is to consider a unit vector $arrow(u) = 1/lr(|<3,4>|)
|
|
<3,4>$ that points in the same direction. So now the "run" is simply 1. Clearly
|
|
we get the same answer, but we have a good formula now:
|
|
|
|
$
|
|
"Directional derivative" = nabla f dot arrow(u)
|
|
$
|
|
|
|
where $arrow(u)$ is the *unit vector* in our desired direction.
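
For instance, take $f(x,y) = x^2 + x cos(y)$ at the point $(1, pi/2)$, where the partials are $2$ and $-1$ (this choice of function and point is only for illustration; the question above is stated for a general $f$). A sketch of the computation in the direction of $vec(3,4)$:

$
arrow(u) = 1/5 vec(3,4) \
nabla f dot arrow(u) = vec(2,-1) dot vec(3/5, 4/5) = (6 - 4)/5 = 2/5
$

This agrees with $(3m_x + 4m_y)/5$ when $m_x = 2$ and $m_y = -1$.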
|
|
|
|
A geometric interpretation is that the directional derivative in a given
|
|
direction is just the scalar projection of the gradient vector onto that direction.
|
|
|
|
To recap:
|
|
|
|
We discuss two questions: what is the slope in a given direction, and what
|
|
direction has the steepest slope?
|
|
|
|
Let $arrow(F)$ be the gradient vector (made up of our partial derivatives) and $arrow(v)$
|
|
be the movement vector in the $x y$-plane. Let $arrow(u)$ be a unit vector
|
|
pointing in the same direction.
|
|
|
|
The answer to the first question is $nabla f dot arrow(u)$, where
|
|
$arrow(u)$ is a unit vector in the given direction.
|
|
|
|
The answer to the second question can be derived.
|
|
|
|
$
|
|
arrow(F) dot arrow(u) = lr(|arrow(F)|) lr(|arrow(u)|) cos(theta)
|
|
$
|
|
and $theta$ is the angle between $arrow(F)$ and $arrow(u)$. So since $cos
|
|
theta$ reaches maximum value at $theta = 0$, the maximum possible slope is
|
|
actually in the same direction as $arrow(F)$ with a slope equal to the magnitude
|
|
of $arrow(F)$.
|
|
|
|
Takeaways:
|
|
|
|
- Movement in the direction of the gradient vector gives the "steepest" ascent of the function
|
|
- Movement perpendicular to the gradient has slope of 0
|
|
- Movement in the opposite direction of the gradient has maximum negative slope (sharpest descent) with the same magnitude as the gradient
|
|
- Any directional derivative in between can be calculated as a projection from the gradient
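
These takeaways can be checked on a concrete gradient. As an illustration only, take $arrow(F) = vec(2,-1)$ (the gradient of $x^2 + x cos(y)$ at $(1, pi/2)$), so that $lr(|arrow(F)|) = sqrt(5)$:

$
"along the gradient:" quad vec(2,-1) dot 1/sqrt(5) vec(2,-1) = 5/sqrt(5) = sqrt(5) \
"perpendicular:" quad vec(2,-1) dot 1/sqrt(5) vec(1,2) = 0 \
"opposite:" quad vec(2,-1) dot 1/sqrt(5) vec(-2,1) = -sqrt(5)
$

Every other unit direction gives a slope strictly between $-sqrt(5)$ and $sqrt(5)$.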
|
|
|
|
== Optimization
|
|
|
|
We spoke previously about the Lagrange multiplier. Now we discuss it in greater
|
|
detail.
|
|
|
|
When optimizing in two dimensions, we either optimize for all of $RR^2$, or on
|
|
a constraint in $RR^2$ (such as a curve). For the first case, we use critical
|
|
points. For the second, *Lagrange multipliers*.
|
|
|
|
== Critical points
|
|
|
|
Critical points occur when the gradient is zero or undefined. That is, both partials are
|
|
zero *or* at least one of them isn't defined.
|
|
|
|
Essentially, they occur when the tangent plane is flat. We can't just look at
|
|
$f''(x)$ like in single variable calculus, but we can take the determinant of
|
|
the second order partials for some sort of multivariable concavity. It measures
|
|
how much the pure partial derivatives dominate the mixed partial derivatives,
|
|
and they need to dominate to a certain extent such that there is consistently
|
|
upward or downward curvature in every direction.
|
|
|
|
We find the critical points where $m_x = 0$ and $m_y = 0$, or where either is
|
|
undefined. Then we classify them as follows.
|
|
|
|
Recall the second derivative test for single variable functions; now consider the
|
|
two-variable case.
|
|
|
|
$
|
|
Dif (x_0,y_0) = det mat(f_(x x) (x_0, y_0), f_(x y) (x_0, y_0); f_(y x) (x_0,y_0), f_(y y) (x_0, y_0)) = f_(x x) (x_0, y_0) f_(y y) (x_0, y_0) - f_(x y) (x_0, y_0)^2
|
|
$
|
|
|
|
- If $Dif > 0$ and $f_(x x) (x_0, y_0) > 0$, then $f$ has a relative minimum at $(x_0, y_0)$.
|
|
- If $Dif > 0$ and $f_(x x) (x_0, y_0) < 0$, then $f$ has a relative maximum at $(x_0, y_0)$.
|
|
- If $Dif < 0$, then $f(x_0, y_0)$ is neither: $(x_0, y_0)$ is a saddle point.
|
|
- If $Dif = 0$, then the test is inconclusive.
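
A minimal illustration of the test, using two functions that are not from the lecture and are chosen here only because their second partials are constant. Both have their only critical point at $(0,0)$.

For $f(x,y) = x^2 + y^2$:

$
f_(x x) = 2, quad f_(y y) = 2, quad f_(x y) = 0 \
Dif (0,0) = 2 dot 2 - 0^2 = 4
$

Here $Dif$ is positive and $f_(x x)$ is positive, so $(0,0)$ is a relative minimum.

For $f(x,y) = x^2 + 4 x y + y^2$:

$
f_(x x) = 2, quad f_(y y) = 2, quad f_(x y) = 4 \
Dif (0,0) = 2 dot 2 - 4^2 = -12
$

Here the mixed partial dominates the pure partials, $Dif$ is negative, and $(0,0)$ is a saddle point.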
|
|
|
|
== Lagrange multipliers
|
|
|
|
We discuss optimization on a restricted curve in our domain.
|
|
|
|
Idea: we should navigate along the curve and find where the directional
|
|
derivative is 0. Recall that this is the same as when the velocity vector is
|
|
perpendicular to the gradient. The issue is that we always have to parametrize
|
|
the curve.
|
|
|
|
To avoid this, the method of Lagrange multipliers views the (implicit) constraint equation as
|
|
a level curve of another surface. If we take the gradient of that surface
|
|
everywhere on the level curve, then that gradient is parallel to the original
|
|
function's gradient at critical points.
|
|
|
|
So, we should be able to take the gradients of both the function and the
|
|
constraint function, and look for when one is a scalar multiple of the other.
|
|
|
|
Consider a function $f$ and a constraint $g$. Then we compute $nabla f$ and
|
|
$nabla g$, then solve for when $nabla f = lambda nabla g$. Then we can just plug in
|
|
each solution and compare the resulting values of $f$.
|
|
|
|
#exercise[
|
|
Find the highest and lowest points on $f(x,y) = 81x^2 + y^2$ with the
|
|
constraint $4x^2 + y^2 = 9$. Let the second function be $g(x,y)$, and keep in
|
|
mind our constraint is essentially the level set where $g(x,y) = 9$.
|
|
]
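
One possible way to work this exercise with the method described above (a sketch, not necessarily the solution given in lecture). With $f(x,y) = 81x^2 + y^2$ and $g(x,y) = 4x^2 + y^2$, the condition $nabla f = lambda nabla g$ together with the constraint reads

$
162 x = 8 lambda x \
2 y = 2 lambda y \
4x^2 + y^2 = 9
$

From the second equation, either $y = 0$ or $lambda = 1$. If $lambda = 1$, the first equation forces $x = 0$, and the constraint gives $y = plus.minus 3$. If $y = 0$, the constraint gives $x = plus.minus 3/2$ (with $lambda = 81/4$). Comparing values:

$
f(0, plus.minus 3) = 9 \
f(plus.minus 3/2, 0) = 81 dot 9/4 = 729/4
$

So on the constraint the lowest value is $9$ and the highest is $729/4$.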
|
|
|
|
Intuition: consider a constraint function $g(x,y) = x^2 + y^2$, and our constraint is
|
|
the level set where $g(x,y) = 25$. Notice, the gradient of $g$ is perpendicular
|
|
to its level set at any given point. So, when optimizing a function $f$ that
|
|
is $g$-constrained, we are really looking for where the gradient of $g$ is
|
|
parallel to the gradient of $f$. That is why we are using a scalar multiple
|
|
$lambda$ to relate them.
|
|
|
|
Computation:
|
|
|
|
We have $g(x,y) = 25$. We construct the equation
|
|
|
|
$
|
|
vec(f_x, f_y) = lambda vec(g_x,g_y)
|
|
$
|
|
|
|
This gives three equations
|
|
|
|
$
|
|
f_x = lambda g_x \
|
|
f_y = lambda g_y \
|
|
g(x,y) = 25
|
|
$
|
|
|
|
We find
|
|
$
|
|
y = 4 lambda^2 y
|
|
$
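
The notes do not state the objective function used in this computation. As an assumption, taking $f(x,y) = x y$ (which is consistent with the equation above) and $g(x,y) = x^2 + y^2$, the step can be reconstructed as follows. The first two equations become

$
y = 2 lambda x \
x = 2 lambda y
$

and substituting the second into the first gives

$
y = 2 lambda (2 lambda y) = 4 lambda^2 y
$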
|
|
|
|
If $y != 0$, then $lambda = plus.minus 1/2$. So we're looking for points on the
|
|
circle, with radius 5, such that $x = plus.minus y$. This gives 4 points to
|
|
consider: $(plus.minus 5/sqrt(2), plus.minus 5/sqrt(2))$.
|
|
|
|
If $y = 0$, then $x$ is forced to be 0, and $(0,0)$ is not on the circle. So we
|
|
ignore it.
|
|
|
|
Now we just compare our four candidates and find the greatest (or least) for
|
|
optimization!
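
As a sketch of this last step, assume the objective is $f(x,y) = x y$ (the notes do not state $f$ explicitly; this choice is an assumption consistent with the equations above). Evaluating at the four candidates:

$
f(5/sqrt(2), 5/sqrt(2)) = f(-5/sqrt(2), -5/sqrt(2)) = 25/2 \
f(5/sqrt(2), -5/sqrt(2)) = f(-5/sqrt(2), 5/sqrt(2)) = -25/2
$

Under that assumption the maximum on the circle is $25/2$ and the minimum is $-25/2$.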
|
|
|
|
=== Notes from Week 7 section
|
|
|
|
We have a function $f : RR^n -> RR$ that is subject to a constraint $g : RR^n -> RR^c$, where $c$ is our number of constraints. It's really a vector of $c$ constraints,
|
|
$
|
|
g = vec(g_1,g_2,dots.v,g_c)
|
|
$
|
|
|
|
Idea: define the so-called *Lagrangian* $cal(L) = f + (g,lambda)$, where $(g,lambda)$ denotes the dot product of $g$ with a vector $lambda$ of $c$ multipliers.
|
|
|
|
#theorem[
|
|
If $f$ and $g$ are "nice" (partials continuous), there are no redundant constraints, and the problem is not overconstrained ($"Rank" Dif g = c < n$), then any optimal solution that respects $g = 0$ solves $gradient f = lambda dot Dif g$ for some $lambda$.
|
|
]
|
|
|
|
= Lecture #datetime(day: 27, year: 2025, month:2).display()
|
|
|
|
== Volume
|
|
|
|
Any 3D shape can be built recursively out of atomic objects.
|
|
|
|
#exercise[
|
|
Derive formulae for the volume of a pyramid and cone.
|
|
]
|
|
|
|
Schley what are you doing???
|
|
|
|
== Signed area and volume
|
|
|