diff --git a/documents/by-course/math-6a/course-notes/main.typ b/documents/by-course/math-6a/course-notes/main.typ
index 69d4719..25e2415 100644
--- a/documents/by-course/math-6a/course-notes/main.typ
+++ b/documents/by-course/math-6a/course-notes/main.typ
@@ -772,5 +772,309 @@ $ (dif) / (dif x) cosh(x) = sinh(x) $
-It's pretty easy to show these using their definitions, and derive the
-derivative of $tanh$.
+It's pretty easy to show these using their definitions and to derive the derivative of $tanh$.
+
+== End of weeks 1-4
+
+That was all of the content of weeks 1 to 4. Now we shift to weeks 5-7, where we studied more about vectors and their derivatives.
+
+== Partial derivatives
+
+These are the slopes of tangent lines to the graph in the direction of the changing variable.
+
+#theorem[Clairaut's theorem][
+  Suppose $f : RR^2 -> RR$ is defined on a disk $D$ that contains a point
+  $(a,b)$. If the functions $f_(x y)$ and $f_(y x)$ are continuous on this disk,
+  then $f_(x y) (a,b) = f_(y x) (a,b)$.
+]
+
+#theorem[Extended Clairaut][
+  Suppose $f : RR^2 -> RR$ is defined on a disk $D$ that contains $(a,b)$. If
+  all of the mixed partial derivatives are continuous everywhere on the disk $D$,
+  then the mixed partials are equal there.
+]
+
+== Multivariable chain rule
+
+The product rule actually follows from the multivariable chain rule. Recall:
+$
+  (f(x) g(x))' = f'(x) g(x) + f(x) g'(x)
+$
+Instead, let's replace $f$ and $g$ with $x$ and $y$, so that we have something like
+$
+  z = x y
+$
+Then let $x$ and $y$ be functions of $t$, so
+$
+  z = x(t) y(t)
+$
+The partial derivatives are
+$
+  (diff z) / (diff x) = y, quad (diff z) / (diff y) = x
+$
+By the multivariable chain rule,
+$
+  (dif z) / (dif t) = x'(t) (diff z) / (diff x) + (diff z) / (diff y) y'(t) = x'(t) y(t) + x(t) y'(t)
+$
+
+== Implicit differentiation
+
+It's similar to single variable implicit differentiation, but remember to hold the extraneous variables constant in practice.
+
+#example[
+  Suppose you have a surface
+  $
+    3x^2 + 5 y z + z^3 = 0
+  $
+  and you want a partial derivative $(diff y)/(diff z)$ at some point. You can
+  use implicit differentiation by viewing the surface as a level set of a
+  larger function $F(x,y,z) = 3x^2 + 5y z + z^3$ where $F(x,y,z) = 0$.
+
+  Now we differentiate both sides with respect to $z$, treating $y$ as a
+  function of $z$ and holding $x$ constant:
+  $
+    (diff F) / (diff z) = diff / (diff z)(3x^2) + diff / (diff z)(5y z) + diff / (diff z)(z^3) \
+    = 0 + (5 z (diff y) / (diff z) + 5y) + 3z^2
+  $
+  Since $F = 0$ on the surface, this derivative is 0, so we can solve for
+  $(diff y)/(diff z)$.
+]
+
+== Multivariable chain rule as matrix
+
+Consider $z = f(x,y)$. Then the derivative of $z$ with respect to $x$ and $y$ would be a matrix:
+$
+  mat((diff z)/(diff x), (diff z)/(diff y))
+$
+
+Now suppose the coordinate system changes to
+$
+  x = 3u - v \
+  y = 2v
+$
+Suppose we want $z_u$ and $z_v$ at $(x,y) = (3,6)$. This is actually $(u,v) = (2,3)$ in $u v$ coordinates. The partials of $z$ with respect to $u$ and $v$ are just matrix multiplication:
+$
+  mat(z_u,z_v) = mat(z_x, z_y) mat(x_u,x_v;y_u,y_v)
+$
+
+== Differentials
+
+Differentials are about linear approximation. Recall that in the single variable case we use the tangent line approximation and differentials to approximate functions. In the multivariable case it's the tangent plane approximation and the directional derivative.
+
+Let $f$ be a function with two inputs and one output, say
+$
+  f(x,y) = x^2 + x cos(y)
+$
+
+Then the tangent plane at $(x_0,y_0)$ is the plane that best approximates the function at that point. Now we need two slopes instead of one.
+
+Take a look at $(1,pi/2)$.
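+For this particular $f$, the two slopes come from its partial derivatives (a quick side computation):
+$
+  f_x (x,y) = 2x + cos(y), quad f_y (x,y) = -x sin(y)
+$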
+Evaluating there, the partial derivatives are $2$ and $-1$. The idea is that we start at $(1,pi/2,1)$ and then move a slight nudge in either the $x$ or $y$ direction. In the $x$ direction, we move by $Delta x$ and get an increase in $z$ (height) of $2 dot Delta x$.
+
+If we move a slight nudge $Delta y$ in the $y$ direction, then our height changes by $-1 dot Delta y$ (a decrease when $Delta y > 0$).
+
+Then we have a tangent plane approximation of
+$
+  Delta z approx 2 dot Delta x - 1 dot Delta y
+$
+
+And like in single variable, we can replace each $Delta$ with $dif$:
+$
+  dif z = 2 dot dif x - 1 dot dif y
+$
+Sidenote: how can we actually get the equation of the tangent plane in terms of $x$, $y$, $z$? Just note that $x = 1 + Delta x$, $y = pi/2 + Delta y$, and $z = 1 + Delta z$, and substitute:
+$
+  (z - 1) = 2(x-1) - 1(y-pi / 2)
+$
+
+So in general, the differential version is
+$
+  dif z = m_x dif x + m_y dif y
+$
+
+and the tangent plane equation at $(x_0, y_0, z_0)$ is
+$
+  z = z_0 + m_x (x-x_0) + m_y (y-y_0)
+$
+
+== Directional derivative
+
+Now take another look at the linear approximation in $z$:
+$
+  Delta z approx m_x Delta x + m_y Delta y
+$
+
+The directional derivative reinterprets this as matrix multiplication:
+$
+  Delta z approx mat(m_x,m_y) vec(Delta x, Delta y)
+$
+This is the same as a dot product; in fact, it is the dot product of the gradient given by $vec(m_x,m_y)$ and a tiny movement vector.
+
+== Directional derivative, continued
+
+We mentioned this before; now let's discuss it in more detail. If you move by $Delta x$ and $Delta y$ in the $x$ and $y$ directions, then the directional derivative computes your change in height on the tangent plane.
+
+Consider the question "What is the derivative of $f$ in the direction of the vector $vec(3,4)$?"
+
+Now consider $m_x (3) + m_y (4) = 3m_x + 4m_y$. This is almost the answer, but really it is the change in $f$ resulting from a movement along $vec(3,4)$. If we want the derivative, we're asking for the slope. We have the rise; the run is $sqrt(3^2 + 4^2) = 5$, so the answer is $(3m_x + 4m_y)/5$.
+
+A more intuitive way is to consider a unit vector $arrow(u) = 1/lr(|vec(3,4)|) vec(3,4)$ that points in the same direction. Now the "run" is simply 1. Clearly we get the same answer, but we have a good formula now:
+$
+  "Directional derivative" = nabla f dot arrow(u)
+$
+where $arrow(u)$ is the *unit vector* in our desired direction.
+
+A geometric interpretation is that the directional derivative in a given direction is just the gradient vector projected onto that direction.
+
+To recap, we discuss two questions: what is the slope in a given direction, and what direction has the steepest slope?
+
+Let $arrow(F) = nabla f$ be the gradient vector of our partial derivatives, let $arrow(v)$ be the movement vector in the $x y$-plane, and let $arrow(u)$ be a unit vector pointing in the same direction as $arrow(v)$.
+
+The answer to the first question is $nabla f dot arrow(u)$, where $arrow(u)$ is a unit vector in the given direction.
+
+The answer to the second question can be derived:
+$
+  arrow(F) dot arrow(u) = lr(|arrow(F)|) lr(|arrow(u)|) cos(theta)
+$
+where $theta$ is the angle between $arrow(F)$ and $arrow(u)$. Since $lr(|arrow(u)|) = 1$ and $cos theta$ reaches its maximum value at $theta = 0$, the maximum possible slope is attained in the same direction as $arrow(F)$, and it equals the magnitude of $arrow(F)$.
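+As a quick concrete check, take the earlier example $f(x,y) = x^2 + x cos(y)$ at $(1, pi/2)$, where $nabla f = vec(2,-1)$, and ask about the direction $vec(3,4)$:
+$
+  arrow(u) = 1/5 vec(3,4), quad nabla f dot arrow(u) = (2 dot 3 + (-1) dot 4) / 5 = 2/5, quad lr(|nabla f|) = sqrt(2^2 + (-1)^2) = sqrt(5)
+$
+So the slope in the direction of $vec(3,4)$ is $2/5$, while the steepest possible slope at that point is $sqrt(5)$, attained in the direction of $vec(2,-1)$.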
+Takeaways:
+
+- Movement in the direction of the gradient vector gives the "steepest" ascent of the function.
+- Movement perpendicular to the gradient has a slope of 0.
+- Movement in the opposite direction of the gradient has the maximum negative slope (sharpest descent), with the same magnitude as the gradient.
+- Any directional derivative in between can be calculated as a projection of the gradient onto that direction.
+
+== Optimization
+
+We spoke previously about the Lagrange multiplier. Now we discuss it in greater detail.
+
+When optimizing in two dimensions, we either optimize over all of $RR^2$, or on a constraint in $RR^2$ (such as a curve). For the first case, we use critical points. For the second, *Lagrange multipliers*.
+
+== Critical points
+
+Critical points occur when the gradient is zero or undefined: both partials are zero *or* at least one of them isn't defined.
+
+Essentially, they occur when the tangent plane is flat. We can't just look at $f''(x)$ like in single variable calculus, but we can take the determinant of the second order partials for some sort of multivariable concavity. It measures how much the pure partial derivatives dominate the mixed partial derivatives, and they need to dominate enough that there is consistently upward or downward curvature in every direction.
+
+We find the critical points where $m_x = 0$ and $m_y = 0$, or where either is undefined. Then we classify them as follows.
+
+Recall the second derivative test for a single variable function; now consider the two-variable case:
+$
+  Dif (x_0,y_0) = det mat(f_(x x) (x_0, y_0), f_(x y) (x_0, y_0); f_(y x) (x_0,y_0), f_(y y) (x_0, y_0)) = f_(x x) (x_0, y_0) f_(y y) (x_0, y_0) - f_(x y) (x_0, y_0)^2
+$
+
+- If $Dif > 0$ and $f_(x x) (x_0, y_0) > 0$, then $f(x_0, y_0)$ is a relative minimum.
+- If $Dif > 0$ and $f_(x x) (x_0, y_0) < 0$, then $f(x_0, y_0)$ is a relative maximum.
+- If $Dif < 0$, then $f(x_0, y_0)$ is neither; it's a saddle point.
+- If $Dif = 0$, then the test is inconclusive.
+
+== Lagrange multipliers
+
+We discuss optimization on a restricted curve in our domain.
+
+Idea: we could navigate along the curve and find where the directional derivative is 0. Recall that this is the same as where the velocity vector is perpendicular to the gradient. The issue is that we always have to parametrize the curve.
+
+To avoid this, the method of Lagrange multipliers views the (implicit) constraint equation as a level curve of another surface. If we take the gradient of that surface at points of the level curve, then at critical points that gradient is parallel to the original function's gradient.
+
+So, we should be able to take the gradients of both the function and the constraint function, and look for where one is a scalar multiple of the other.
+
+Consider a function $f$ and a constraint $g$. We compute $nabla f$ and $nabla g$, then solve for when $nabla f = lambda nabla g$. Then we can just plug in points and figure it out.
+
+#exercise[
+  Find the highest and lowest points on $f(x,y) = 81x^2 + y^2$ with the
+  constraint $4x^2 + y^2 = 9$. Let the second function be $g(x,y)$, and keep in
+  mind our constraint is essentially the level set where $g(x,y) = 9$.
+]
+
+Intuition: consider the constraint function $g(x,y) = x^2 + y^2$, where our constraint is the level set $g(x,y) = 25$. Notice that the gradient of $g$ is perpendicular to its level set at any given point. So, when optimizing a function $f$ that is $g$-constrained, we are really looking for where the gradient of $g$ is parallel to the gradient of $f$. That is why we use a scalar multiple $lambda$ to relate them.
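+Why is the gradient of $g$ perpendicular to its level set? Here is a quick sketch: parametrize the level curve by some path $arrow(r)(t)$ with $g(arrow(r)(t)) = 25$, and differentiate both sides using the chain rule:
+$
+  dif / (dif t) g(arrow(r)(t)) = nabla g dot arrow(r)'(t) = dif / (dif t) (25) = 0
+$
+So $nabla g$ is perpendicular to the velocity vector $arrow(r)'(t)$, which is tangent to the level curve.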
+Computation:
+
+Take, say, $f(x,y) = x y$ as the function we are optimizing, with the constraint $g(x,y) = 25$. We construct the equation
+$
+  vec(f_x, f_y) = lambda vec(g_x,g_y)
+$
+Together with the constraint, this gives three equations:
+$
+  f_x = lambda g_x \
+  f_y = lambda g_y \
+  g(x,y) = 25
+$
+Here $f_x = y$, $f_y = x$, $g_x = 2x$, and $g_y = 2y$, so the first two equations read $y = 2 lambda x$ and $x = 2 lambda y$. Substituting one into the other, we find
+$
+  y = 4 lambda^2 y
+$
+If $y != 0$, then $lambda = plus.minus 1/2$. So we're looking for points on the circle of radius 5 such that $x = plus.minus y$. This gives 4 points to consider: $(plus.minus 5/sqrt(2), plus.minus 5/sqrt(2))$.
+
+If $y = 0$, then $x$ is forced to be 0 (from $x = 2 lambda y$), and $(0,0)$ is not on the circle, so we ignore it.
+
+Now we just compare our four candidates and find the greatest (or least) for optimization! Here $f$ takes the values $plus.minus 25/2$ at these points, so the maximum is $25/2$ and the minimum is $-25/2$.
diff --git a/documents/by-course/math-8/course-notes/main.typ b/documents/by-course/math-8/course-notes/main.typ
index 7c65073..bdf19c2 100644
--- a/documents/by-course/math-8/course-notes/main.typ
+++ b/documents/by-course/math-8/course-notes/main.typ
@@ -920,13 +920,15 @@ nonempty subsets of $A$ whose union is $A$.
 
 == Functions
 
-Let $A$ and $B$ be sets. A relation $R$ from $A$ to $B$ is a subset $R subset.eq A times B$.
-
 #definition[
-  A *function* $f$ from $A$ to $B$ relates each element of $"Dom"(R)$ to to exactly
-  one element of $"Rng"(R)$.
+  A *function* $f$ is a relation from $A$ to $B$ such that
+  1. $"Dom"(f) = A$.
+  2. If $(x,y) in f$ and $(x,z) in f$, then $y = z$.
 ]
 
+That is, every element in $A$ is related to exactly one element in $B$. Note that (2) is the vertical line test.
+
 #fact[
   A function from $A$ to $B$ is written $