Derivatives
This unit covers the following ideas. In
preparation for the quiz and exam, make sure you have a lesson
plan containing examples that explain and illustrate the
following concepts.
- Find limits, and be able to explain when a function does
not have a limit by considering different approaches.
- Compute partial derivatives. Explain how to obtain the
total derivative from the partial derivatives (using a
matrix).
- Find equations of tangent lines and tangent planes to
surfaces. We'll do this three ways.
- Find derivatives of composite functions, using the chain
rule (matrix multiplication).
You'll have a chance to teach your examples to your peers
prior to the exam.
Limits
See Larson 13.2 for more about limits and continuity.
In the previous chapter, we learned how to
describe lots of different functions. In first-semester calculus,
after reviewing functions, you learned how to compute limits of
functions, and then used those ideas to develop the derivative of
a function. The exact same process is used to develop calculus in
high dimensions. One obstacle to developing calculus this way in
high dimensions is the epsilon-delta definition of a limit, so
we'll only review it briefly. Those of you who want to pursue
further mathematical study will spend much more time on this
topic in future courses. Here's the formal epsilon-delta
definition of a limit from first-semester calculus.
Let $f:\RR\to\RR$ be a function. We write $\ds \lim_{x\to c}
f(x)=L$ if and only if for every $\epsilon>0$, there exists
a $\delta>0$ such that $0<|x-c|<\delta$ implies
$|f(x)-L|<\epsilon$.
We're looking at this formal definition here because we can
compare it with the formal definition of limits in higher
dimensions. The only difference is that we just put vector
symbols above the input $x$ and the output $f(x)$.
Let $\vec f:\RR^n\to\RR^m$ be a function. We write $\ds
\lim_{\vec x\to \vec c} \vec f(\vec x)=\vec L$ if and only if
for every $\epsilon>0$, there exists a $\delta>0$ such
that $0<|\vec x-\vec c|<\delta$ implies $|\vec f(\vec
x)-\vec L|<\epsilon$.
We'll find that throughout this course, the key difference
between first-semester calculus and multivariate calculus is that
we replace the input $x$ and output $y$ of functions with the
vectors $\vec x$ and $\vec y$.
The point of this problem is to help you learn to recognize
the dimensions of the domain and codomain of the function. If
we write $\vec f:\RR^n\to \RR^m$, then $\vec x$ is a vector in
$\RR^n$ with $n$ components, and $\vec y$ is a vector in
$\RR^m$ with $m$ components.
For the function $f(x,y)=z$, we can write $f$ in the vector
notation $\vec y=\vec f(\vec x)$ if we let $\vec x=(x,y)$ and
$\vec y=(z)$. Notice that $\vec x$ is a vector of inputs, and
$\vec y$ is a vector of outputs. For each of the functions
below, state what $\vec x$ and $\vec y$ should be so that the
function can be written in the form $\vec y = \vec f (\vec x)$.
- $f(x,y,z)=w$
- $\vec r(t)=(x,y,z)$
- $\vec r(u,v)=(x,y,z)$
- $\vec F(x,y)=(M,N)$
- $\vec F(\rho,\phi,\theta)=(x,y,z)$
You learned to work with limits in first-semester calculus
without needing the formal definitions above. Many of those
techniques apply in higher dimensions. The following problem has
you review some of these techniques and apply them in higher
dimensions.
See 14.2: 1-30 for more practice.
Do these problems (without using L'Hopital's rule).
- Compute $\ds \lim_{x\to 2} x^2-3x+5$ and then
$\ds\lim_{(x,y)\to (2,1)} 9-x^2-y^2$.
- Compute $\ds\lim_{x\to 3}\frac{x^2-9}{x-3}$ and then
$\ds\lim_{(x,y)\to (4,4)} \frac{x-y}{x^2-y^2}$.
- Explain why $\ds\lim_{x\to 0}\frac{x}{|x|}$ does not
exist. [Hint: graph the function.]
In first-semester calculus, we can show that a limit does
or does not exist by considering what happens from the left, and
comparing it to what happens on the right. You probably used the
following theorem extensively.
If $y=f(x)$ is a function defined on some open interval
containing $c$, then $\ds\lim_{x\to c}f(x)$ exists if and only
if $\ds\lim_{x\to c^-}f(x) = \ds\lim_{x\to c^+}f(x)$.
A limit exists precisely when the limits from every
direction exist, and all directional limits are equal. In
first-semester calculus, this required that you check two
directions (left and right). This theorem generalizes to higher
dimensions, but it becomes much more difficult to apply.
Consider the function $\ds f(x,y)=\frac{x^2-y^2}{x^2+y^2}$. Our
goal is to determine if the function has a limit at the origin
$(0,0)$. We can approach the origin along many different lines.
One line through the origin is the line $y=2x$. If we stay on
this line, then we can replace each $y$ with $2x$ and then
compute $$\ds\lim_{\substack{(x,y)\to(0,0)\\ y=2x
}}\frac{x^2-y^2}{x^2+y^2} = \lim_{x\to 0}
\frac{x^2-(2x)^2}{x^2+(2x)^2} = \lim_{x\to 0}
\frac{-3x^2}{5x^2} = \lim_{x\to 0} \frac{-3}{5}
=\frac{-3}{5}.$$ This means that if we approach the origin
along the line $y=2x$, we will have a height of $-3/5$ when we
arrive at the origin.
If the function $\ds f(x,y)=\frac{x^2-y^2}{x^2+y^2}$ has a
limit at the origin, the computation above suggests that the
limit will be $-3/5$.
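The computation along $y=2x$ can also be checked numerically. The following is a minimal sketch in plain Python, not a proof: the helper name `values_along_line` and the sample points are assumptions made for illustration, and evaluating at finitely many points only suggests a limit.

```python
def f(x, y):
    """The function from the example above."""
    return (x**2 - y**2) / (x**2 + y**2)

def values_along_line(func, slope, xs):
    """Evaluate func at the points (x, slope*x) for each x in xs."""
    return [func(x, slope * x) for x in xs]

# Points approaching the origin along y = 2x (x itself is never 0).
xs = [10.0 ** (-k) for k in range(1, 6)]
heights = values_along_line(f, 2.0, xs)
for x, h in zip(xs, heights):
    print(f"x = {x:g}, f(x, 2x) = {h:.6f}")
```

Every printed height should agree with $-3/5$ up to rounding, which matches the algebra above. Changing the slope argument lets you experiment with other lines through the origin.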
Please read the previous example. Recall that we are looking
for the limit of the function $\ds
f(x,y)=\frac{x^2-y^2}{x^2+y^2}$ at the origin (0,0).
You may want to look at a graph in Sage or Wolfram Alpha (try
using the “contour lines” option).
As you compute each limit, make sure you understand what that
limit means in the graph.
Our goal is to determine if the function has a limit at
the origin $(0,0)$.
- In the $xy$-plane, how many lines pass through the origin
$(0,0)$? Give an equation of a line other than $y=2x$ that
passes through the origin. Then compute
$$\ds\lim_{\substack{(x,y)\to(0,0)\\ \text{your line}
}}\frac{x^2-y^2}{x^2+y^2} = \lim_{x\to 0}
\frac{x^2-(?)^2}{x^2+(?)^2}=\ldots.$$
- Give another equation of a line that passes through the
origin. Then compute $$\ds\lim_{\substack{(x,y)\to(0,0)\\ \text{your line}}}\frac{x^2-y^2}{x^2+y^2}.$$
- Does this function have a limit at $(0,0)$? Explain.
See 14.2: 41-50 for more practice. See Larson 13.2:23–36 and
example 4 for more practice.
The theorem from first-semester calculus generalizes as
follows.
If $\vec y=\vec f(\vec x)$ is a function defined on some open
region containing $\vec c$, then $\ds\lim_{\vec x\to \vec
c}\vec f(\vec x)$ exists if and only if the limit exists along
every possible approach to $\vec c$ and all these limits are
equal.
There's a fundamental problem with using this theorem to
check if a limit exists. Once the domain is 2-dimensional or
higher, there are infinitely many ways to approach a point. There
is no longer just a left and right side. To prove a limit exists,
you must check infinitely many cases, and that usually takes a
really long time. The real power of this theorem is that it lets
us show that a limit does not exist. All we have to do is find
two approaches with different limits.
See 14.2: 41-50 for more practice. See Larson 13.2:9–36 for more
practice.
Consider the function $\ds f(x,y) =
\frac{xy}{x^2+y^2}$. Does this function have a limit at
$(0,0)$?
- Examine the function at $(0,0)$ by considering the limit
as you approach the origin along several lines.
- Convert $\frac{xy}{x^2+y^2}$ to polar coordinates (i.e.,
a function of $r$ and $\theta$). As $(x,y)$ approaches the
origin, what does $r$ approach? Take the limit of your polar
coordinate function as $r$ approaches that value and
interpret your result.
In all the examples above, we considered
approaching a point by traveling along a line. However, even if a
function has a consistent limit along every line, that is not
enough to always guarantee the function has a limit. The theorem
requires every approach be consistent, which includes parabolic
approaches, spiraling approaches, and more. Sometimes the
straight-line paths happen to be consistent with each other, but
a different path gives a different limit. Give some thought to
this in the optional challenge problem below.
Give an example of a function $f(x,y)$ so that the limit at
$(0,0)$ along every straight line $y=mx$ exists and equals 0.
However, show that the function has no limit at $(0,0)$ by
considering an approach that is not a straight line.
The Derivative
Before we introduce derivatives, let's
recall the definition of a differential. If $y=f(x)$ is a
function, then we say the differential $dy$ is the expression
$dy=f'(x) dx$ (we could also write this as $dy =
\frac{dy}{dx}dx$). Think of differential notation $dy=f'(x)dx$ in
the following way:
A small change in the output $y$ equals the derivative
multiplied by a small change in the input $x$. In other words,
if the input $x$ changes by a small amount $dx$, then we
multiply that small change by the derivative to determine how
much $y$ changes (i.e., $dy$).
To get the derivative in all dimensions, we just substitute
in vectors to obtain the differential notation $d\vec y = f'(\vec
x) d\vec x$. The derivative is precisely the thing that tells us
how to get $d\vec y$ from $d\vec x$. We'll quickly see that the
derivative $f'(\vec x)$ must be a matrix, and we'll start writing
it as $Df$ instead of $f'$. We've actually already dealt with
problems involving derivatives of multiple-variable functions in
first-semester calculus. The next few problems are very similar
to related rates or differential problems from first-semester
calculus, and we'll see how the derivative $f'$ in $dy=f'(x) dx$
naturally generalizes to a matrix.
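Before moving to several variables, here is a minimal sketch of the one-variable statement $dy=f'(x)dx$. The function $f(x)=x^3$, the point $x=2$, and the step $dx=0.1$ are assumptions chosen only for illustration. They do not come from the problems in this unit.

```python
def f(x):
    return x**3          # hypothetical example function

def f_prime(x):
    return 3 * x**2      # its derivative, computed by hand

x, dx = 2.0, 0.1
dy = f_prime(x) * dx             # differential estimate of the change
actual = f(x + dx) - f(x)        # true change in the output
print(dy, actual)                # roughly 1.2 versus roughly 1.261
```

The estimate and the true change are close but not equal, because the differential follows the tangent line rather than the curve.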
See 3.10 for more practice. See Larson 3.7 and 4.8.
The volume of a right circular cylinder is $V(r,h)= \pi
r^2 h$. Imagine that each of $V$, $r$, and $h$ depends on $t$
(we might be collecting rain water in a can, or crushing a
cylindrical concentrated juice can, etc.).
- If the height remains constant, but the radius changes,
what is $dV/dt$ in terms of $dr/dt$? Use this to find a
formula for $dV$ in terms of $dr$ when $h$ is constant.
- If the radius remains constant, but the height changes,
what is $dV/dt$ in terms of $dh/dt$? What is $dV$ when $r$ is
constant?
- If both the radius and height change, what is $dV/dt$ in
terms of $dh/dt$ and $dr/dt$? Solve for $dV$.
-
Make sure you ask me in class to show you physically
exactly how you can see these differential formulas.
Show that we can write $dV$ as the matrix product
of a 1-row by 2-column matrix with a 2-row by 1-column
matrix: $$dV = \begin{bmatrix}2\pi rh& \pi
r^2\end{bmatrix}\begin{bmatrix}dr\\dh\end{bmatrix}.$$ How
do the columns of the first matrix relate to the
calculations you did above?
The matrix $\begin{bmatrix}2\pi rh& \pi
r^2\end{bmatrix}$ is the derivative of $V$. The columns
of this matrix are the partial derivatives of $V$.
-
If we know that $r=3$ and $h=4$, and we know that
$r$ increases by about $.1$ and $h$ increases by about
$.2$, then approximate how much $V$ will increase. Use your
formula for $dV$ to approximate this.
The volume of a box is $V(x,y,z)=xyz$. Imagine that each
variable depends on $t$.
- If both $y$ and $z$ remain constant, what is $dV/dt$? Use
this to find a formula for $dV$ in terms of $dx$, assuming
that $y$ and $z$ are constant.
- Repeat the last step for when $y$ is the only variable
that changes, and then for when $z$ is the only variable that
changes.
- What is $dV/dt$ in terms of $dx/dt$, $dy/dt$, and $dz/dt$
when all three variables are changing? Solve for $dV$.
-
Show that we can write $dV$ as the matrix product
of a 1-row by 3-column matrix with a 3-row by 1-column
matrix: $$dV = \begin{bmatrix}yz&
?&?\end{bmatrix}\begin{bmatrix}dx\\dy\\dz\end{bmatrix}.$$
How do the columns of the first matrix relate to the
previous portions of the problem?
The matrix $\begin{bmatrix}yz& ?&?\end{bmatrix}$
is the derivative of $V$. The columns of this matrix are
the partial derivatives of $V$.
- If the current measurements of a box are $x=2$, $y=3$,
and $z=5$, and we know that $x$ increases by .01, $y$
increases by .02, and $z$ decreases by .03, then by about how
much will the volume change? Use your formula for $dV$ to
approximate this.
Part 4 in each problem above, expressing relationships
between changes in terms of differentials, is the KEY idea, let
me repeat, THE KEY IDEA, to the rest of this course. The
essential thing is that we can use differentials to understand
and approximate how small changes in the inputs of a function
will change the outputs of the function. For example, we can
approximate the change in a function $f(x,y)$ if we know how much
$x$ and $y$ will change.
Using differentials to analyze how outputs of a function change
when the inputs change comes up in many applications, such as
analyzing numerical roundoff error in calculations or analyzing
manufacturing tolerances.
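As a rough sketch of this kind of tolerance analysis, the code below estimates how the area of a rectangle changes when both measurements are slightly off. The area function $A(w,h)=wh$ and all of the numbers are assumptions made for illustration. The row matrix `DA` holds the partial derivatives, and the estimate is the product of that row with the column of input changes.

```python
def dot(row, col):
    """Product of a 1-row matrix with a 1-column matrix."""
    return sum(a * b for a, b in zip(row, col))

def area(w, h):
    return w * h   # hypothetical example: area of a w-by-h rectangle

w, h = 5.0, 2.0
DA = [h, w]              # partial derivatives: dA/dw = h, dA/dh = w
d_input = [0.1, -0.05]   # small changes dw and dh
dA = dot(DA, d_input)    # differential estimate of the change in area
actual = area(w + d_input[0], h + d_input[1]) - area(w, h)
print(dA, actual)        # roughly -0.05 versus roughly -0.055
```

The gap between the two numbers is the product $dw\,dh$, which the differential ignores.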
Consider the function $f(x,y) = x^2y +3x+4\sin(5y)$.
- If both $x$ and $y$ depend on $t$, then use implicit
differentiation to obtain a formula for $df/dt$ in terms of
$dx/dt$ and $dy/dt$.
- Solve for $df$, and write your answer as the matrix
product (fill in the question mark) $$df = \begin{bmatrix}?&
x^2+20\cos(5y)\end{bmatrix}\begin{bmatrix}dx\\dy\end{bmatrix}.$$
- If you hold $y$ constant (so $dy=0$), then what is
$df/dx$? If you are at $(x,y)=(1,2)$, and if $x$ changes by
$.2$, then about how much does $z=f(x,y)$ change?
- If you hold $x$ constant (so $dx=0$), then what is
$df/dy$? If you are at $(x,y)=(1,2)$, and if $y$ changes by
$-.3$, then about how much does $z=f(x,y)$ change?
- If you are at the point $(x,y)=(1,2)$, and you move
to $(1.2, 1.7)$, then what is $dx$ and $dy$?
Approximate how much $z=f(x,y)$ changes.
- (Challenge) In the last part, why was our
calculation an approximation, rather than an exact
calculation of how much $f(x,y)$ changed?
We need to add some vocabulary to make it easier to talk
about what we just did. Let's introduce the vocabulary in terms
of the problem above, and then make a formal definition.
- The derivative of $f$ in the previous problem is the matrix
$$Df(x,y) = \begin{bmatrix} 2xy+3& x^2+20\cos(5y)
\end{bmatrix}.$$ Some people call $Df$ the total
derivative or the matrix derivative of $f$.
- The first column of this matrix is just part of the whole
derivative—the part that deals with how changes in $x$ affect
the output. We can get the first column by holding $y$
constant, and then differentiating with respect to $x$. We call
this the partial derivative of $f$ with respect to $x$.
We'll write this as $\frac{\partial f}{\partial x} = 2xy+3$ or
$f_x = 2xy+3$.
- The second column of the derivative is the partial
derivative of $f$ with respect to $y$—it tells us how
changes in $y$ affect the output. We can get the second column
by holding $x$ constant and differentiating with respect to
$y$. We'll write this as $\frac{\partial f}{\partial y} =
x^2+20\cos(5y) $ or $f_y = x^2+20\cos(5y)$.
- Remember, the derivative of $f$ is a matrix. The columns of
the matrix are the partial derivatives with respect to the
corresponding input variables—the first column is the partial
derivative with respect to the first input variable, the second column with respect to the second input variable, etc.
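One way to sanity-check a derivative matrix is to compare it with difference quotients. The sketch below does this for the function from the problem above, using only the Python standard library. The helper names, the step size `h`, and the test point $(0.5, 1)$ are assumptions made for illustration.

```python
import math

def f(x, y):
    return x**2 * y + 3 * x + 4 * math.sin(5 * y)

def numeric_derivative(func, x, y, h=1e-6):
    """Approximate the 1-by-2 matrix [f_x, f_y] with central differences."""
    fx = (func(x + h, y) - func(x - h, y)) / (2 * h)   # hold y constant
    fy = (func(x, y + h) - func(x, y - h)) / (2 * h)   # hold x constant
    return [fx, fy]

def exact_derivative(x, y):
    """The matrix Df(x, y) stated in the text."""
    return [2 * x * y + 3, x**2 + 20 * math.cos(5 * y)]

approx = numeric_derivative(f, 0.5, 1.0)
exact = exact_derivative(0.5, 1.0)
print(approx)
print(exact)
```

Each column of the numeric result is computed by changing only one input, which is exactly the "hold the other variable constant" recipe for a partial derivative.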
Now we generalize the above example.
The textbook only talks about partial derivatives (see Larson 13.3). We
emphasize the total derivative because it is more
powerful, simpler, and helps us understand the concepts
much better.
Let $f$ be a function.
- The partial derivative of $f$ with respect to $x$ is the
normal first-semester calculus derivative of $f$, provided we
hold every input variable constant except $x$. We'll
use the notations $$ \frac{\partial f}{\partial x}, \quad
\frac{\partial}{\partial x}[f], \quad f_x, \quad \text{ and
}D_x f $$ to mean the partial of $f$ with respect to
$x$.
- The partial of $f$ with respect to $y$, written $\ds
\frac{\partial f}{\partial y}$, $\ds \frac{\partial
}{\partial y}[f]$, $f_y$, or $D_yf$, is the normal
first-semester calculus derivative of $f$ assuming that $y$ is the only variable changing (all other variables are constant). A similar
definition holds for partial derivatives with respect to any
variable.
-
The derivative of $f$ is a matrix. The columns of
the derivative are the partial derivatives. When there's
more than one input variable, we'll use the notation $Df$ rather than
$f'$ for the derivative. The order of the columns
must match the order of the variables that are inputs to the function. If the function is $f(x,y)$, then the derivative
is $Df(x,y) = \begin{bmatrix}\frac{\partial f}{\partial
x}&\frac{\partial f}{\partial y}\end{bmatrix}.$ If the
function is $V(x,y,z)$, then the derivative is $DV(x,y,z) =
\begin{bmatrix}\frac{\partial V}{\partial
x}&\frac{\partial V}{\partial y}&\frac{\partial
V}{\partial z}\end{bmatrix}.$
It's time to practice these new words in some problems.
Remember, we're doing the exact same thing as we did at the start of this section. We're just using the vocabulary of differentiation.
Let's apply the
problem and definition above. Fill in the blanks.
- If $f(x,y)=9-x^2-y^2$, then $df =
f_xdx+f_ydy=\blank{1in}\,dx+\blank{1in}\,dy$. If we are on
the surface at the point $(x,y,z)=(2,1,4)$, then the
differential is $df=\blank{1cm}\,dx+\blank{1cm}\,dy$ (just
plug $x=2$ and $y=1$ into the partial derivatives). If we
move along the surface to $(x,y)=(2.1,1.1)$, then our
change in $x$ is $\Delta x=\blank{1cm}$, our change in $y$
is $\Delta y=\blank{1cm}$, and the differential $df$ at
$(x,y)=(2,1)$ estimates our change in height $\Delta f$ to
be about $\Delta f\approx \blank{1cm}$ (just plug in
$\Delta x$ for $dx$ and $\Delta y$ for $dy$ to get a single
number).
- If $f(x,y,z)=xy^2+yz^2$, then $$df =
\blank{1in}\,dx+\blank{1in}\,dy+\blank{1in}\,dz.$$ If we
are at the input vector $(x,y,z)=(1,-2,3)$, then the
differential is $df =
\blank{1cm}\,dx+\blank{1cm}\,dy+\blank{1cm}\,dz$. If we
move to $(x,y,z)=(0.9, -2.2, 2.8)$, then the change in $x$
is $\Delta x=\blank{1cm}$, the change in $y$ is $\Delta
y=\blank{1cm}$, and the change in $z$ is $\Delta
z=\blank{1cm}$. The differential $df$ at $(x,y,z)=(1,-2,3)$
helps us estimate the change in $f$ to be about $\Delta
f\approx \blank{1cm}$. [Hint: plug in our numeric $x$, $y$,
$z$, $\Delta x$, $\Delta y$, and $\Delta z$.]
- When a function has multiple outputs, we view the
differential as a multiple-row and 1-column matrix, where
each row has the partial derivative of one of the outputs.
If $\vec r(t)=(3t^2, 2/t)$, then $$d\vec r =
\begin{bmatrix}(3t^2)' \\ (2/t)'\end{bmatrix}
dt=\begin{bmatrix}\blank{1in}\\\blank{1in}\end{bmatrix}dt.$$
If we are at $t=2$, then our differential becomes $d\vec r
= \begin{bmatrix}\blank{1cm}\\\blank{1cm}\end{bmatrix}$. If
we move to $t=2.1$, then the change in $t$ is $\Delta
t=\blank{1cm}$ and the change in $\vec r$, as estimated by
the differential $d\vec r$ at $t=2$, is approximately
$\Delta \vec r\approx
\begin{bmatrix}\blank{1cm}\\\blank{1cm}\end{bmatrix}$.
- If $\vec r(u,\theta)=(u\cos(\theta), u\sin(\theta),
u^2)$, then $$d\vec r =
\begin{bmatrix}\blank{1in}\\\blank{1in}\\\blank{1in}\end{bmatrix}du
+
\begin{bmatrix}\blank{1in}\\\blank{1in}\\\blank{1in}\end{bmatrix}d\theta.$$
If we are at $(u,\theta)=(2,\pi/3)$, then the differential
is $$d\vec r =
\begin{bmatrix}\blank{1cm}\\\blank{1cm}\\\blank{1cm}\end{bmatrix}du
+
\begin{bmatrix}\blank{1cm}\\\blank{1cm}\\\blank{1cm}\end{bmatrix}d\theta.$$
If we move to $(u,\theta)=(1.9, \pi/2)$, then the change in
$u$ is $\Delta u=\blank{1cm}$, the change in $\theta$ is
$\Delta \theta=\blank{1cm}$, and the change in $\vec r$, as
estimated by the differential $d\vec r$ at
$(u,\theta)=(2,\pi/3)$, is approximately $\Delta \vec
r\approx
\begin{bmatrix}\blank{1cm}\\\blank{1cm}\\\blank{1cm}\end{bmatrix}$.
Use
Sage to check your answers.
See 14.3: 1-40 for more practice. See Larson 13.3:9–40 for more
practice in doing partial derivatives. I strongly suggest you
practice a lot of this type of problem until you can compute
partial derivatives with ease.
Compute the requested partial and total derivatives.
- For $f(x,y)=x^2+2xy+3y^2$, compute both
$\ds\frac{\partial f}{\partial x}$ and $f_y$. Then write down
$Df(x,y)$.
- For $f(x,y,z)=x^2y^3z^4$, compute $f_x$,
$\ds\frac{\partial f}{\partial y}$, and $D_z f$. Then write down
$Df(x,y,z)$.
When a function has multiple outputs, its partial
derivatives will have multiple components, which we write as a
column vector. For example, if $f(x,y)=(3x^2, \sin(x)+xy)$, then
$f_x=\left(\begin{matrix}6x\\\cos(x)+y\end{matrix}\right)$. This
is similar to when we were computing derivatives of space curves,
for example in this problem.
Let $\vec F(x,y)=(-y^3,2xy)$, a 2D vector field. Then $\vec
F_x=\begin{bmatrix}0\\2y\end{bmatrix}$, $\vec
F_y=\begin{bmatrix}-3y^2\\2x\end{bmatrix}$, and $D\vec F =
\begin{bmatrix}0&-3y^2\\2y&2x\end{bmatrix}$. Also,
$d\vec
F=\begin{bmatrix}0&-3y^2\\2y&2x\end{bmatrix}\begin{bmatrix}dx\\dy\end{bmatrix}$.
Remember that we can visualize a vector field by drawing arrows
on the plane (for example, like wind velocities at every point
on the plane). We can interpret this last equation as saying
that if we are at a point $(x,y)$ on the plane, and we move a
small distance away to $(x+dx, y+dy)$, then $d\vec F$
approximates how much the wind vector changes between the two
points (i.e., how much the output of $\vec F$ changes). The
derivative $D\vec F$ relates a small change in inputs (moving a
small distance) to changes in the outputs (the wind vector). We
can then ask questions like: what direction should I move so
that the velocity of the wind goes down? What direction should
I move so that the wind blows more in a northern direction? If
I walk in this direction, will the wind keep pushing me in the
same direction? Will it get stronger or weaker?
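The relationship $d\vec F = D\vec F\, d\vec x$ for this vector field can be tried out numerically. The following sketch uses the matrix $D\vec F$ stated above. The starting point $(1,2)$, the step $(dx,dy)=(0.01,0.02)$, and the helper `mat_vec` are assumptions made for illustration.

```python
def F(x, y):
    """The vector field from the example above."""
    return (-y**3, 2 * x * y)

def DF(x, y):
    """Its derivative: each column is a partial derivative of F."""
    return [[0.0, -3 * y**2],
            [2 * y, 2 * x]]

def mat_vec(matrix, vec):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]

x, y = 1.0, 2.0
dx, dy = 0.01, 0.02
dF = mat_vec(DF(x, y), [dx, dy])             # estimated change in the wind vector
new, old = F(x + dx, y + dy), F(x, y)
actual = [n - o for n, o in zip(new, old)]   # true change in the wind vector
print(dF, actual)
```

The estimate should come out near $(-0.24, 0.08)$, with the true change close to it. The two agree better as the step gets smaller.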
Use
Sage to check your work in each part.
Do the following for each of the functions below:
- Compute the partial derivatives and the total (matrix)
derivative.
- Tell what the domain and codomain are (i.e.,
$\mathbb{R}^?\to\mathbb{R}^?$).
- Write down the relationship between small changes in
inputs and outputs (in the form analogous to $d\vec y = D\vec
f d\vec x$) and interpret this relationship graphically.
- The parametric curve $\vec r(t)=(t,\cos t,\sin t)$.
- The parametric surface $\vec r(u,v) = (u,v,u\cos(v))$.
[Hint: compute partial derivatives $\vec r_u$, $\vec r_v$ and
the total derivative $D\vec r(u,v)$, a 3-row by 2-column
matrix].
- The vector field $\vec F(x,y) = (-y,xe^{3y})$. [Hint:
compute partial derivatives $\vec F_x$ and $\vec F_y$ and the
total derivative $D\vec F(x,y)$, a 2-row by 2-column
matrix.]
- The parametric surface $\vec f(u,v)=(u^2,v^2,u-v)$.
- The space transformation $\vec
T(r,\theta,z)=(r\cos\theta, r\sin\theta, z)$.
As you completed the problems above, did you notice any
connections between the size of the matrix and the size of the
input and output vectors? Make sure you ask in class about this.
We'll make a connection. We've now seen that the derivative of
$z=f(x,y)$ is a matrix $Df(x,y) = \begin{bmatrix}f_x &
f_y\end{bmatrix}$. This means that $Df$ is itself a function from
$\mathbb{R}^2$ to $\mathbb{R}^2$ that has inputs $x$ and $y$ and
outputs $f_x$ and $f_y$. Therefore we can draw $Df$ as a 2d
vector field.
Check your work with
Sage.
Consider the function $f(x,y)=y-x^2$.
- In the $xy$ plane, please draw several level curves of
$f$ (maybe $z=0$, $z=2$, $z=-4$, etc.). Write the height on
each curve (so you're making a contour plot).
- Compute the derivative $Df$ (which we'll think of as a
vector field).
-
We'll examine the connection between the derivative and
level curves much more when we study optimization later.
Pick 8 points in the $xy$ plane that lie on the
level curves you drew above. At these 8 points, add the
vector given by the derivative evaluated at that point. For
example, at $(0,0)$, draw the vector $Df(0,0)=(0,1)$, and
at the point $(1,1)$, draw the vector $Df(1,1)=(-2,1)$.
What do you observe about the relationship between the
vectors and the contour lines?
The Geometry of Derivatives
Tangent planes of $z=f(x,y)$
We promised earlier in this
chapter that you can obtain most of the results in multivariate
calculus by replacing the $x$ and $y$ in $dy=f'dx$ with $\vec x$
and $\vec y$. Let's review how to find the tangent line for
functions of the form $y=f(x)$, and then generalize to finding
tangent planes for functions of the form $z=f(x,y)$.
Consider the function $y=f(x)=x^2$.
- The derivative is $f'(x) = ?$. At the point $x=3$ the
derivative is $f'(3)=?$ and the output $y$ is
$y=f(3)=?$.
- If we move from the point $(3,f(3))$ to the point $(x,y)$
along the tangent line, then a small change in $x$ is
$dx=x-3$. What is $dy$?
- Differential notation states that a change in the output
$dy$ equals the derivative times a change in the input $dx$,
which gives us the equation $dy=f'(3)dx$. Replace $dx$, $dy$,
and $f'(3)$ with what we know they equal, to obtain an
equation $y-?=?(x-?)$. What line does this equation
represent?
- Draw both $f$ and the equation from the previous part on
the same axes.
In first-semester calculus, differential notation says
$dy=f' dx$. A small change in the inputs times the derivative
gives the change in the outputs. For the next problem, the output
is $z$, and input is $(x,y)$, which means differential notation
says $dz = Df \begin{bmatrix}dx\\dy\end{bmatrix}$.
See 14.6: 9-12 for more practice. See Larson 13.7:17–30 for more
practice.
Let $z=f(x,y)=9-x^2-y^2$.
- The derivative is $Df(x,y) =
\begin{bmatrix}-2x&?\end{bmatrix}$. At the point
$(x,y)=(2,1)$, the derivative is $Df(2,1) =
\begin{bmatrix}-4&?\end{bmatrix}$ and the output $z$ is
$z=f(2,1)=?$.
- If we move from the point $(2,1,f(2,1))$ to the point
$(x,y,z)$ along the tangent plane, then a small change in $x$
is $dx=x-2$. What are $dy$ and $dz$?
- Explain why an equation of the tangent plane is
$$ z-4=\begin{bmatrix}-4 & -2
\end{bmatrix}\begin{bmatrix}x-2\\y-1\end{bmatrix} \quad
\text{or}\quad z-4=-4(x-2)-2(y-1).$$ [Hint: What does
differential notation tell us?]
We'll construct a graph of $f$ and its tangent plane in
class.
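The tangent plane equation $z-4=-4(x-2)-2(y-1)$ can be compared with the surface numerically. This is a small sketch under the assumption that we only care about inputs near $(2,1)$. The function name `plane` and the nearby test point are made up for illustration.

```python
def f(x, y):
    return 9 - x**2 - y**2

def plane(x, y):
    """Tangent plane at (2, 1): z - 4 = -4(x - 2) - 2(y - 1), solved for z."""
    return 4 - 4 * (x - 2) - 2 * (y - 1)

# The plane touches the surface at (2, 1) ...
print(f(2.0, 1.0), plane(2.0, 1.0))
# ... and stays close to it for nearby inputs.
print(f(2.01, 1.01), plane(2.01, 1.01))
```

At the point of tangency both heights are 4. Near that point the heights differ only slightly, which is what it means for the plane to approximate the surface.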
Look back at the previous two problems. The first-semester
calculus tangent line equation, with differential notation,
generalized immediately to the tangent plane equation for
functions of the form $z=f(x,y)$. We just used the differential
notation $dy=f'dx$ in 2D, and generalized to $dz = Df \begin{bmatrix}dx\\dy\end{bmatrix}$. Let's repeat this on another
problem.
See 14.6: 9-12 for more practice. See Larson 13.7:17–30 for more
practice.
Let $f(x,y)=x^2+4xy+y^2$. Give an equation of the
tangent plane at $(3,-1)$. [Hint: Just as in the previous
problem, find $Df(x,y)$, $dx$, $dy$, and $dz$. Then use
differential notation.]
Partial Derivatives of $z=f(x,y)$ functions
We can also
understand tangent lines to surfaces using partial derivatives.
The next problem will help you visualize what a partial
derivative means in the graph of a surface.
See Larson 13.3:53–58 for more
practice.
Consider the function $f(x,y)=9-x^2-y^2$. Construct a
3D surface plot of $f$.
We'll focus on the point $(2,1)$.
- Let $y=1$ and construct a graph in the $xz$ plane of the
curve $z=f(x,1)=9-x^2-1^2$. Find an equation of the tangent
line to this curve at $x=2$. Write the equation in the form
$(z-z_0)=m(x-x_0)$ (find $z_0,m,x_0$). Also, find a direction
vector $(1,0,?)$ for this line.
- Let $x=2$ and construct a graph in the $yz$ plane of the
curve $z=f(2,y)=9-2^2-y^2$. Find an equation of the tangent
line to this curve at $y=1$. Write the equation in the form
$(z-z_0)=m(y-y_0)$ (find $z_0,m,y_0$). Also, find a direction
vector $(0,1,?)$ for this line.
- Compute $f_x$ and $f_y$ and then evaluate each at
$(2,1)$. What does this have to do with the previous two
parts?
- If the slope of a line $y=mx+b$ is $m$, then we know that
an increase of $1$ unit in the $x$ direction will increase
$y$ by $m$ units. Fill in the blanks by using the slopes of
tangent lines calculated above for the function
$z=f(x,y)=9-x^2-y^2$.
- Increasing $x$ by 1 unit when $y$ does not change
will cause $z$ to increase by about ?
units.
- Increasing $y$ by 1 unit when $x$ does not change
will cause $z$ to increase by about ?
units.
- Increasing $x$ by 1 unit and $y$ by 1 unit will cause
$z$ to increase by about ? units.
- In the previous part, we said that $z=9-x^2-y^2$
increased by about a certain amount each time. Why did
we not say that $z=9-x^2-y^2$ increases by exactly
that amount?
We'll conclude this section with a note about taking
derivatives of higher orders. Since a partial derivative is a
function, we can take partial derivatives of that function as
well. If we want to first compute a partial with respect to $x$,
and then with respect to $y$, we would use one of the following
notations: $$f_{xy}=\ds\frac{\partial}{\partial
y}\frac{\partial}{\partial x}f = \frac{\partial}{\partial
y}\frac{\partial f}{\partial x} = \frac{\partial^2 f}{\partial y
\partial x}.$$
See Larson 13.3:71–80 for more
practice.
Complete the following:
- Let $f(x,y)=3xy^3+e^{x}.$ Compute the four second
partials $$\ds \frac{\partial^2 f}{ \partial x^2},\quad
\ds\frac{\partial^2 f}{\partial y \partial x},\quad
\ds\frac{\partial^2 f}{\partial y^2}, \quad \text{ and
}\ds\frac{\partial^2 f}{\partial x \partial y}.$$
- For $f(x,y)=x^2\sin(y)+y^3$, compute both $f_{xy}$ and
$f_{yx}$.
- Make a conjecture about a relationship between $f_{xy}$
and $f_{yx}$. Then use your conjecture to quickly compute
$f_{xy}$ if $$f(x,y)=3xy^2+\tan^{2}(\cos(x))
(x^{49}+x)^{1000}.$$
Clairaut's theorem implies that if a function $f$ is
“nice” (that is, if $f$, its partial derivatives, and its
second partials are defined and continuous around a point
$(a,b)$), then $f_{xy}=f_{yx}$ at that point. We will be dealing
with nice functions of this sort in this class, so we will have
this relationship between $f_{xy}$ and $f_{yx}$.
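The equality of mixed partials can be observed numerically. The sketch below uses nested central differences on a hypothetical function $g(x,y)=x^3y^2$, which is an assumption chosen for illustration and not one of the functions from the problems. By hand, $g_{xy}=g_{yx}=6x^2y$.

```python
def g(x, y):
    return x**3 * y**2   # hypothetical example function

def partial_x(func, h=1e-4):
    """Return a function approximating the partial of func with respect to x."""
    return lambda x, y: (func(x + h, y) - func(x - h, y)) / (2 * h)

def partial_y(func, h=1e-4):
    """Return a function approximating the partial of func with respect to y."""
    return lambda x, y: (func(x, y + h) - func(x, y - h)) / (2 * h)

g_xy = partial_y(partial_x(g))   # differentiate in x first, then y
g_yx = partial_x(partial_y(g))   # differentiate in y first, then x
print(g_xy(1.0, 2.0), g_yx(1.0, 2.0))   # both should be close to 12
```

The two printed values agree up to the error of the difference quotients, in line with Clairaut's theorem for this smooth function.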
Let $z=f(x,y)=9-x^2-y^2$. We'll look at the point $(2,1)$, like
above.
- Compute $f_{xx}$, $f_{yy}$, $f_{xy}$, and $f_{yx}$ at the
point $(2,1)$.
- How do you interpret each of those second partials
graphically on this function?
Let $f(x,y)=3xy^2+y\ln x$.
- Compute $Df$, the matrix derivative.
- We can think of $Df$ as being a function with some inputs and some outputs. How many inputs and how many outputs does $Df$ have?
- Let $g(x,y)=Df(x,y)$. Compute $Dg$ (which is the second derivative of $f$, or $D^2f$). How does each entry in $D^2f$ relate to $f(x,y)$?
Partial derivatives of parametric functions
Now let's
examine computations similar to those in this problem in the light of parametric
surfaces. With parametric functions, partial derivatives are
vectors instead of just numbers. But they still represent how the
outputs change relative to changes in the inputs.
See 16.5: 27-30 for more practice. See Larson 15.5:35–38 for more
practice.
Let $z=f(x,y)=9-x^2-y^2$. We'll parameterize this
function by writing $x=x, y=y, z=9-x^2-y^2$, or in vector
notation we'd write $$\vec r(x,y) = (x,y,f(x,y)).$$
- Compute $\ds \frac{\partial \vec r}{\partial x}$ and $\ds
\frac{\partial \vec r}{\partial y}$. Then evaluate these
partials at $(x,y)=(2,1)$. What do these vectors mean? [Hint:
Draw the surface, and at the point $(2,1,4)$, draw these
vectors. See the Sage plot. Think about how the vectors are
telling you about how changes in the inputs are related to
changes in the outputs.]
- The vectors above are tangent to the surface. Use them to
obtain a normal vector to the tangent plane, and then give an
equation of the tangent plane. (You should compare it to your
equation from this problem.)
If the vectors you found in the previous problem matched up
with the direction vectors of the lines in this problem, you are doing things right.
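The same steps (partial derivatives as tangent vectors, then a cross product to get a normal vector) can be sketched on a different surface. The surface $z=xy$, the point $(1,2)$, and the helper names below are assumptions made for illustration, so that the problem above is left for you to work.

```python
def cross(a, b):
    """Cross product of two vectors in R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

# Hypothetical surface r(x, y) = (x, y, x*y), examined at (x, y) = (1, 2).
x0, y0 = 1.0, 2.0
r_x = (1.0, 0.0, y0)     # partial of r with respect to x: (1, 0, y)
r_y = (0.0, 1.0, x0)     # partial of r with respect to y: (0, 1, x)
normal = cross(r_x, r_y)
print(normal)            # (-2.0, -1.0, 1.0)

# A normal vector is orthogonal to both tangent vectors.
print(dot(normal, r_x), dot(normal, r_y))
```

With this normal vector and the point $(1,2,2)$ on the surface, an equation of the tangent plane is $-2(x-1)-(y-2)+(z-2)=0$.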
Partial derivatives of parametric functions tell us tangent
directions. We can interpret this also in terms of motion.
Consider the change of coordinates $\vec T(r,\theta) = (r\cos
\theta, r\sin \theta)$.
-
Use
Sage to check your work.
Compute the partial derivatives $\ds\frac{\partial
\vec T}{\partial r}$ and $\ds\frac{\partial \vec
T}{\partial \theta}$, and then state the derivative $D\vec
T(r,\theta)$. [Hint: $D\vec T$ is a 2 by 2 matrix, and each
partial derivative is a column. Use Sage to check your
answer (see the link to Sage in the margin of
this problem for help
with how to do this).]
- Consider the polar point $(r,\theta) = (4,\pi/2)$:
- Compute $\vec T(4,\pi/2)$ (i.e., the $x,y$ coordinates for
the polar point). Draw the point.
- Compute $\ds \frac{\partial \vec T}{\partial
r}(4,\pi/2)$, i.e., the partial with respect to $r$
evaluated at $r=4$, $\theta=\pi/2$. Plot this vector on
the graph you drew in the previous part, starting at the
point you drew.
- Compute $\ds \frac{\partial \vec T}{\partial
\theta}(4,\pi/2)$, i.e., the partial with respect to
$\theta$ evaluated at $r=4$, $\theta=\pi/2$. Plot this
vector on the graph you drew in the previous part,
starting at the point you drew.
- If you were standing at the polar point $(4,\pi/2)$ and
someone said, “Hey you, keep your angle constant, but
increase your radius,” then which direction would
you move? What if someone said, “Hey you, keep your
radius constant, but increase your angle”, which
direction would you move?
- Now change the polar point to $(r,\theta) = (2,3\pi/4)$.
Try, without doing any computations, to repeat part 2 (at the
point, draw both partial derivative vectors). Explain.
If your answers to the 2nd and 3rd part above were the
same, then you're doing this correctly. The partial derivatives
of parametric functions tell us about motion and tangents. The
next problem reinforces this concept. But first, a short review
about equations of lines.
If you know that a line passes through the point $(1,2,3)$ and
is parallel to the vector $(4,5,6)$, give a vector equation,
and parametric equations, of the line.
Answer: A vector
equation is $\vec r(t) = (4,5,6)t+(1,2,3)$ or $\vec r(t) =
(4t+1, 5t+2, 6t+3)$. Parametric equations for this line are
$x=4t+1$, $y=5t+2$, and $z=6t+3$.
Consider the parametric surface $\vec r(a,t) = (a\cos t, a\sin
t, t)$ for $2\leq a\leq 4$ and $0\leq t\leq 4\pi$. We
encountered this parametric surface in chapter 5 when we
considered a smoke screen left by multiple jets.
- Compute the partial derivatives $\vec r_a$ and $\vec r_t$
(they are vectors), and state the total derivative. (How big
is the matrix? What are the domain and range of $\vec
r$?)
- Look at a plot of the surface (use one of the links
to the right). Now, suppose an object is on this surface at
the point $\vec r(3,\pi) = (-3,0,\pi)$. At that point,
please draw the partial derivatives $\vec r_a(3,\pi)$ and
$\vec r_t(3,\pi)$.
- If you were standing at $\vec r(3,\pi)$ and someone told
you, “Hey you, hold $t$ constant and increase
$a$,” then in which direction would you move? What
if someone told you, “Hey you, hold $a$ constant and
increase $t$”?
- Give vector equations for two tangent lines to the
surface at $\vec r(3,\pi)$. [Hint: You've got the point by
plugging $(3,\pi)$ into $\vec r$, and you've got two
different direction vectors from $D\vec r$. Once you have a
point and a vector, we know (from chapter 2) how to get an
equation of a line.]
In the previous problem, you should have noticed that the
partial derivatives of $\vec r(a,t)$ are tangent vectors to the
surface. Because we have two tangent vectors to the surface, we
should be able to use them to construct a normal vector to the
surface, and from that, a tangent plane. That's just cool.
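Here is a minimal Python sketch of that idea, offered as an illustration rather than as a solution to any problem above. The surface $\vec r(u,v)=(u,v,u^2+v^2)$, the point $(u,v)=(1,2)$, and the helper names `partial`, `cross`, and `dot` are all assumptions made up for this example. The partial derivatives are estimated with small changes in the inputs, which matches the idea that they measure how the outputs change relative to changes in the inputs.

```python
def r(u, v):
    # Hypothetical parametric surface (a paraboloid), not one from the text.
    return (u, v, u * u + v * v)

def partial(func, u, v, which, h=1e-6):
    # Central-difference estimate of a partial derivative of a
    # vector-valued function: which=0 changes u, which=1 changes v.
    if which == 0:
        plus, minus = func(u + h, v), func(u - h, v)
    else:
        plus, minus = func(u, v + h), func(u, v - h)
    return tuple((p - m) / (2 * h) for p, m in zip(plus, minus))

def cross(a, b):
    # Cross product of two vectors in 3D.
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

u0, v0 = 1.0, 2.0
point = r(u0, v0)              # the point (1, 2, 5) on the surface
r_u = partial(r, u0, v0, 0)    # tangent vector, approximately (1, 0, 2)
r_v = partial(r, u0, v0, 1)    # tangent vector, approximately (0, 1, 4)
normal = cross(r_u, r_v)       # approximately (-2, -4, 1)

print("point:", point)
print("r_u:", r_u)
print("r_v:", r_v)
print("normal:", normal)
```

With these assumed values, the normal vector $(-2,-4,1)$ and the point $(1,2,5)$ give the tangent plane $-2(x-1)-4(y-2)+(z-5)=0$. The normal is orthogonal to both tangent vectors, which you can confirm with `dot`.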
If you know that a plane passes through the point $(1,2,3)$ and
has normal vector $(4,5,6)$, then give an equation of the
plane.
An equation of the plane is
$4(x-1)+5(y-2)+6(z-3)=0$. If $(x,y,z)$ is any point in the
plane, then the vector $(x-1,y-2,z-3)$ is a vector in the
plane, and hence orthogonal to $(4,5,6)$. The dot product of
these two vectors should be equal to zero, which is why the
plane's equation is $(4,5,6)\cdot (x-1,y-2,z-3)=0$.
Consider again the parametric surface $\vec r(a,t) = (a\cos t,
a\sin t, t)$ for $2\leq a\leq 4$ and $0\leq t\leq 4\pi$. We'd
like to obtain an equation of the tangent plane to this surface
at the point $\vec r(3,2\pi)$. Once we have a point on the
plane and a normal vector to the surface, we can use the
concepts in chapter 2 to get an equation of the plane. Give an
equation of the tangent plane. [Hint: To get the point, what is
$\vec r(3,2\pi)$? The partial derivatives at $(3,2\pi)$ give us
two tangent vectors. How do I obtain a vector orthogonal to
both?]
See 16.5: 27-30 for more practice.
See Larson 15.5:35–38 for more practice.
Consider the cone parametrized by $\vec r(u,v)=(u\cos
v, u\sin v,u)$.
- Give vector equations of two tangent lines to the surface
at $\vec r(2,\pi/2)$ (so $u=2$ and $v=\pi/2$).
- Give a normal vector to the surface at $\vec
r(2,\pi/2)$.
- Give an equation of the tangent plane at $\vec
r(2,\pi/2)$.
We now have two different ways to compute tangent planes.
One way generalizes differential notation $dy=f'dx$ to $dz = Df
\begin{bmatrix}dx\\dy\end{bmatrix}$ and then uses matrix
multiplication. This way will extend to tangent objects in EVERY
dimension. It's the key idea needed to work on really large
problems. The other way requires that we parametrize the surface
$z=f(x,y)$ as $\vec r(x,y)=(x,y,f(x,y))$ and then use the cross
product on the partial derivatives. Both give the same answer.
The next problem has you give a general formula for a tangent
plane. To tackle this problem, you'll need to make sure you can
use symbolic notation. The review problem should help with this.
Joe wants to find the tangent line to $y=x^3$ at $x=2$. He
knows the derivative is $y'=3x^2$, and when $x=2$ the curve
passes through $(2,8)$. So he writes an equation of the tangent
line as $y-8=3x^2(x-2)$. What's wrong? What part of the general
formula $y-f(c) = f'(c) (x-c)$ did Joe forget?
Joe forgot to replace $x$ with $2$ in the derivative.
The equation should be $y-8=12(x-2)$. The notation $f'(c)$ is
the part he forgot. He used $f'(x)=3x^2$ instead of $f'(2)=12$.
Consider the function $z=f(x,y)$. Explain why an equation of
the tangent plane to $f$ at $(x,y)=(a,b)$ is given by
$$z-f(a,b) = \frac{\partial f}{\partial x}(a,b) (x-a) +
\frac{\partial f}{\partial y}(a,b) (y-b).$$ Then give an
equation of the tangent plane to $f(x,y) = x^2+3xy$ at
$(3,-1)$. [Hint: Use either differential notation or a
parametrization, or try both ways.]
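To see that the two approaches to tangent planes agree, here is a small Python sketch on a made-up example. The function $f(x,y)=x^2y+y$, the point $(a,b)=(1,2)$, and all helper names are assumptions chosen for illustration (this is not the function from the problem above), and the partial derivatives $f_x=2xy$ and $f_y=x^2+1$ were computed by hand.

```python
def f(x, y):
    # Hypothetical function for illustration only.
    return x * x * y + y

def f_x(x, y):
    return 2 * x * y

def f_y(x, y):
    return x * x + 1

def cross(a, b):
    # Cross product of two vectors in 3D.
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

a, b = 1.0, 2.0
c = f(a, b)                       # height of the surface at (a, b)

def plane_from_differentials(x, y):
    # z - f(a,b) = f_x(a,b)(x - a) + f_y(a,b)(y - b)
    return c + f_x(a, b) * (x - a) + f_y(a, b) * (y - b)

# Parametrize the surface as r(x, y) = (x, y, f(x, y)).
r_x = (1.0, 0.0, f_x(a, b))       # partial of r with respect to x
r_y = (0.0, 1.0, f_y(a, b))       # partial of r with respect to y
normal = cross(r_x, r_y)          # equals (-f_x, -f_y, 1)

def plane_from_normal(x, y):
    # Solve normal . (x - a, y - b, z - c) = 0 for z.
    return c - (normal[0] * (x - a) + normal[1] * (y - b)) / normal[2]

samples = [(0.0, 0.0), (1.0, 2.0), (3.0, -1.0), (-2.5, 4.0)]
differences = [abs(plane_from_differentials(x, y) - plane_from_normal(x, y))
               for x, y in samples]
print("normal:", normal)
print("largest difference between the two planes:", max(differences))
```

In this sketch the cross product comes out as $(-f_x,-f_y,1)$, so solving the normal-vector equation for $z$ reproduces the formula from differential notation.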
The Chain Rule
We'll now see how the chain rule
generalizes to all dimensions. Just as before, we'll find that
the first semester calculus rule will generalize to all
dimensions if we replace $f'$ with the matrix $Df$. Let's recall
the chain rule from first-semester calculus.
Let $x$ be a real number and $f$ and $g$ be functions of a
single real variable. Suppose $f$ is differentiable at $g(x)$
and $g$ is differentiable at $x$. The derivative of $f\circ g$
at $x$ is $$(f\circ g)'(x) = \frac{d}{dx}(f\circ g)(x) =
f'(g(x))\cdot g'(x).$$
Some people remember the theorem above as “the
derivative of a composition is the derivative of the outside
(evaluated at the inside) multiplied by the derivative of the
inside.” If $u=g(x)$, we sometimes write $\ds
\frac{df}{dx}=\frac{df}{du}\frac{du}{dx}$. The following problem
should help us master this notation.
Suppose we know that $\ds
f'(x) = \frac{\sin(x)}{2x^2+3}$ and $g(x)=\sqrt{x^2+1}$. Notice
we don't know $f(x)$.
Not knowing a function $f$ is actually quite common in real
life. We can often measure how something changes (a
derivative) without knowing the function itself.
- State $f'(x)$ and $g'(x)$.
- State $f'(g(x))$, and explain the difference between
$f'(x)$ and $f'(g(x))$.
- Use the chain rule to compute $(f\circ g)'(x)$.
We now generalize to higher dimensions. If I want to write
$\vec f(\vec g(\vec x))$, then $\vec x$ must be a vector in the
domain of $\vec g$. After computing $\vec g(\vec x)$, we must get a
vector that is in the domain of $\vec f$. Since the chain rule in
first-semester calculus states $(f(g(x)))'=f'(g(x))g'(x)$, in
higher dimensions it should state $D(f(g(x))) = Df(g(x))Dg(x)$, the
product of two matrices.
In
this problem, we
showed that for a circular cylinder with volume $V=\pi r^2 h$,
the derivative is $$DV(r,h)=\begin{bmatrix} 2\pi rh & \pi
r^2 \end{bmatrix}.$$ Suppose that the radius and height are
both changing with respect to time, where $r=3t$ and $h=t^2$.
We'll write this parametrically as $\vec g(t) =(3t, t^2)$
(i.e., $\vec g(t)=(r,h)$).
- In $V=\pi r^2 h$, replace $r$ and $h$ with what they are
in terms of $t$. Then compute $\dfrac{dV}{dt}$. This is a
first-semester calculus derivative; we'll use it to check our
work below.
- We know $DV(r,h)=\begin{bmatrix} 2\pi rh & \pi r^2
\end{bmatrix}$ and $Dg(t)= \begin{bmatrix} 3\\ 2t
\end{bmatrix}.$ In first semester calculus, the chain rule
was the product of derivatives. Multiply these matrices
together to get $$\dfrac{dV}{dt}=DV(g(t))\, Dg(t).$$ Did
you get the same answer as the first part?
- To get the correct answer to the previous part, you had
to replace $r$ and $h$ with what they equaled in terms of
$t$. What part of the notation $\dfrac{dV}{dt}=DV(g(t))\,
Dg(t)$ tells you to replace $r$ and $h$ with what they equal
in terms of $t$?
Let's look at some physical examples involving motion and
temperature, and try to connect what we know should happen to
what the chain rule states.
Consider
$f(x,y)=9-x^2-y^2$ and $\vec r(t)=(2\cos t, 3\sin t)$. Imagine
the following scenario: a horse runs around outside in the
cold. The horse's position at time $t$ is given parametrically
by the elliptical path $\vec r(t)$. The function $T=f(x,y)$
gives the temperature of the air at any point $(x,y)$.
- At time $t=0$, what is the horse's
position $\vec r(0)$, and what is the temperature $f(\vec
r(0))$ at that position? Find the temperatures at $t=\pi/2$,
$t=\pi$, and $t=3\pi/2$ as well.
- In the plane, draw the path of the horse for $t\in
[0,2\pi]$. Then, on the same 2D graph, include a contour
plot of the temperature function $f$. Make sure you include
the level curves that pass through the points from
the first part, and write the temperature
on each level curve you draw. If you end up with an ellipse
and several concentric circles, then you've done this right.
- As the horse runs around, the temperature of the
air around the horse is constantly changing. At which $t$
does the temperature around the horse reach a maximum? A
minimum? Explain, using your graph. (This idea leads to an
optimization technique, Lagrange multipliers, later in the
semester.)
- As the horse moves past the point at
$t=\pi/4$, is the temperature of the surrounding air
increasing or decreasing? In other words, is $\dfrac{df}{dt}$
positive or negative? Use your graph to explain.
- We'll complete this part in class, but you're welcome to
give it a try yourself.
Draw the 3D surface plot of $f$. In the
$xy$-plane of your 3D plot (so $z=0$) add the path of the
horse. In class, we'll project the path of the horse up into
the 3D surface.
Consider again $f(x,y)=9-x^2-y^2$ and $\vec r(t)=(2\cos t,
3\sin t)$, which means $x=2\cos t$ and $y=3\sin t$.
- At the point $\vec r(t)$, we'd like a formula for the
temperature $f(\vec r(t))$. What is the temperature of the
horse at any time $t$? [In $f(x,y)$, replace $x$ and $y$ with
what they are in terms of $t$.]
- Compute $df/dt$ (the derivative as you did in
first-semester calculus).
- Construct a graph of $f(t)$ (use software to draw this if
you like). From your graph, at what time values do the maxima
and minima occur?
- What is $df/dt$ at $t=\pi/4$?
- Compare your work with the previous problem.
Consider again $f(x,y)=9-x^2-y^2$ and $\vec r(t)=(2\cos t,
3\sin t)$.
- Compute both $Df(x,y)$ and $D\vec r(t)$ as matrices. One
should have two columns. The other should have one column
(but two rows).
- The temperature at any time $t$ we can write symbolically
as $f(\vec r(t))$. First-semester calculus suggests the derivative
should be the product $(f(\vec r(t))) ' = f'(\vec r(t))\vec
r'(t)$. Write this using $D$ notation instead of prime
notation.
- Compute the matrix product $DfD\vec r$, and then
substitute $x=2\cos t$ and $y=3\sin t$.
- What is the change in temperature with respect to time at
$t=\pi/4$? Is it positive or negative? Compare with the
previous problem.
The previous three problems all focused on exactly the same
concept. The first looked at the concept graphically, showing
what it means to write $(f\circ \vec r)(t)=f(\vec r(t))$. The
second reduced the problem to first-semester calculus. The third
tackled the problem by considering matrix derivatives. In all
three cases, we wanted to understand the following problem.
If $z=f(x,y)$ is a function of $x$ and $y$, and both $x$ and
$y$ are functions of $t$ (i.e., $\vec r(t)=(x(t),y(t))$), then
how do we discover how changes in $t$ affect $f$? In other
words, what is the derivative of $f$ with respect to $t$?
Notationally, we seek $\ds \frac{df}{dt}$ which we formally
write as $\ds \frac{d}{dt}[f(x(t),y(t))]$ or $\ds \frac{d}{dt}
[f(\vec r(t))].$
To answer this problem, we use the chain rule, which is
just matrix multiplication.
The Chain Rule
Let $\vec x$ be a vector and $\vec f$
and $\vec g$ be functions so that the composition $\vec f(\vec
g(\vec x))$ makes sense (we can use the output of $g$ as an
input to $f$). Suppose $\vec f$ is differentiable at $\vec
g(\vec x)$ and that $\vec g$ is differentiable at $\vec x$.
Then the derivative of $\vec f\circ \vec g$ at $\vec x$ is
$$D(\vec f\circ \vec g)(\vec x) = D\vec f(\vec g(\vec x))\cdot
D\vec g(\vec x).$$ The derivative of a composition is equal to
the derivative of the outside (evaluated at the inside),
multiplied by the derivative of the inside.
This is exactly the same as the chain rule in
first-semester calculus. The only difference is that now we have
vectors above every variable and function, and we replaced the
one-by-one matrices $f'$ and $g'$ with potentially larger
matrices $Df$ and $Dg$. If we write everything in vector
notation, the chain rule in all dimensions is the EXACT same as
the chain rule in one dimension.
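The statement above can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the functions $\vec g(s,t)=(st,\,s+t^2)$ and $\vec f(x,y)=(x+y,\,xy,\,y^2)$, the point $(s,t)=(2,3)$, and the helper names `matmul` and `numeric_jacobian` are all made up for this example. The matrices `Dg` and `Df` were computed by hand, and the product $D\vec f(\vec g(s,t))\,D\vec g(s,t)$ is compared with a finite-difference estimate of the derivative of the composition.

```python
def g(s, t):
    # Hypothetical inside function from R^2 to R^2.
    return (s * t, s + t * t)

def Dg(s, t):
    # Each column holds the partials with respect to one input.
    return [[t, s],
            [1.0, 2 * t]]

def f(x, y):
    # Hypothetical outside function from R^2 to R^3.
    return (x + y, x * y, y * y)

def Df(x, y):
    return [[1.0, 1.0],
            [y, x],
            [0.0, 2 * y]]

def matmul(A, B):
    # Plain matrix product of two lists of rows.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def numeric_jacobian(func, point, h=1e-6):
    # Central-difference estimate of the derivative matrix of func.
    columns = []
    for j in range(len(point)):
        plus, minus = list(point), list(point)
        plus[j] += h
        minus[j] -= h
        columns.append([(p - m) / (2 * h)
                        for p, m in zip(func(*plus), func(*minus))])
    # columns[j] is column j, so transpose into a list of rows.
    return [[col[i] for col in columns] for i in range(len(columns[0]))]

def composite(s, t):
    return f(*g(s, t))

s0, t0 = 2.0, 3.0
chain = matmul(Df(*g(s0, t0)), Dg(s0, t0))   # derivative of outside at inside, times inside
direct = numeric_jacobian(composite, (s0, t0))
max_error = max(abs(chain[i][j] - direct[i][j])
                for i in range(3) for j in range(2))
print("chain rule product:", chain)
print("largest disagreement with finite differences:", max_error)
```

Notice that `Df` is evaluated at `g(s0, t0)`, not at `(s0, t0)`. That is the "evaluated at the inside" part of the theorem. The result is a 3 by 2 matrix because the composition has two inputs and three outputs.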
See 14.4: 1-6 for more practice. Don't use
the formulas in the chapter; instead, practice using matrix
multiplication. The formulas are just a way of writing matrix
multiplication without writing down the matrices, and only
work for functions from $\RR^n\to\RR$. Our matrix
multiplication method works for any function from
$\RR^n\to\RR^m$. See Larson
13.5:1–6 for more practice (you can check answers in the
back of the book). Don't use the formulas on pages 925–930.
Instead, use matrix multiplication. The formulas are just a
way of writing matrix multiplication without writing down the
matrices, and only work for functions from $\RR^n\to\RR$. Our
matrix multiplication method works for any function from
$\RR^n\to\RR^m$.
In class, I also replace $x$ and $y$ in
$f=x^2+xy$ with what they are in terms of $t$, and then use
first-semester calculus to find $df/dt$.
Suppose that $f(x,y)
= x^2+xy$ and that $x=2t+3$ and $y=3t^2+4$.
- Rewrite the parametric equations $x=2t+3$ and $y=3t^2+4$
in vector form, so we can apply the chain rule. This means
you need to create a function $\vec r(t) = (\blank{1in},
\blank{1in})$.
- Compute the derivatives $Df(x,y)$ and $D\vec r(t)$, and
then multiply the matrices together to obtain
$\dfrac{df}{dt}$. How can you make your answer only depend on
$t$ (not $x$ or $y$)?
- The chain rule states that $D(f\circ \vec r)(t) = Df(\vec
r(t))D\vec r(t)$. Explain why we write $Df(\vec r(t))$
instead of $Df(x,y)$.
If you'd like to make sure you are correct, try the
following: replace $x$ and $y$ in $f=x^2+xy$ with what they are
in terms of $t$, and then just use first-semester calculus to
find $df/dt$. Is it the same?
See 14.4: 7-12 for more
practice. See Larson
13.5:7–10 for more practice (remember to use matrix
multiplication, not the formulas from the book).
Suppose $f(x,y,z) = x+2y+3z^2$ and $x=u+v$, $y=2u-3v$,
and $z=uv$. Our goal is to find how much $f$ changes if we were
to change $u$ (so $\partial f/\partial u$) or if we were to
change $v$ (so $\partial f/\partial v$). Try doing this problem
without looking at the steps below, but instead try to follow
the patterns in the previous problem on your own.
- Rewrite the equations for $x$, $y$, and $z$ in vector form
$\vec r(u,v)=(x,y,z)$. If you were to graph $\vec r$, what
kind of graph would you make?
- Compute the derivatives $Df(x,y,z)$ and $D\vec r(u,v)$,
and then multiply them together. Notice that since this
composite function has 2 inputs, namely $u$ and $v$, we
should expect to get two columns when we are done.
- What are $\partial f/\partial u$ and $\partial f/\partial
v$? [Hint: remember, each input variable gets a column.]
Let $\vec F(s,t) = (2s+t,3s-4t,t)$ and $s=3pq$ and $t=2p+q^2$.
This means that changing $p$ and/or $q$ should cause $\vec F$
to change. Our goal is to find $\partial \vec F/\partial p$ and
$\partial \vec F/\partial q$. Note that since $\vec F$ is a
vector-valued function, the two partial derivatives should be
vectors. Try doing this problem without looking at the steps
below, but instead try to follow the patterns in the previous
problems.
- Rewrite the parametric equations for $s$
and $t$ in vector form.
- Compute $D\vec F(s,t)$ and the derivative of your vector
function from the previous part, and then
multiply them together to find the derivative of $\vec F$
with respect to $p$ and $q$. How many columns should we
expect to have when we are done multiplying matrices?
- What are $\partial \vec F/\partial p$ and $\partial \vec
F/\partial q$?
(Optional challenge)
Suppose $\vec F(u,v) = (3u-v,u+2v,3v)$, $\vec G(x,y,z)=(x^2+z,
4y-x)$, and $\vec r(t) = (t^3, 2t+1, 1-t)$. We want to examine
$\vec F(\vec G(\vec r(t)))$. This means that $\vec F\circ \vec
G\circ \vec r$ is a function from $\RR^n\to\RR^m$ for what $n$
and $m$? Similar to first-semester calculus, since we have two
functions nested inside of each other, we'll just need to apply
the chain rule twice. Our goal is to find $d\vec F/dt$. Try to
do this problem without looking at the steps below.
- Compute $D\vec F(u,v)$, $D\vec G(x,y,z)$, and $D\vec
r(t)$.
- Use the chain rule (matrix multiplication) to find the
derivative of $\vec F$ with respect to $t$. What size of
matrix should we expect for the derivative?
Suppose $f(x,y)=x^2+3xy$ and $(x,y) = \vec r(t) = (3t,t^2)$.
Compute both $Df(x,y)$ and $D\vec r(t)$. Then explain how you
got your answer by writing what you did in terms of partial
derivatives and regular derivatives.
Answer: We have
$Df(x,y) = \begin{bmatrix}2x+3y&3x\end{bmatrix}$ and $D\vec
r(t) = \begin{bmatrix}3\\2t\end{bmatrix}$. We just computed
$f_x$ and $f_y$, and $dx/dt$ and $dy/dt$, which gave us
$Df(x,y) = \begin{bmatrix}\partial f/\partial x&\partial
f/\partial y\end{bmatrix}$ and $D\vec r(t) =
\begin{bmatrix}dx/dt\\dy/dt\end{bmatrix}$.
See 14.4: 13-24 for more practice. Practice
these problems by using matrix multiplication. The example
problems in the text use a “branch diagram,”
which is just a way to express matrix multiplication without
having to introduce matrices. See Larson 13.5:7–10 for more practice.
Complete the following:
- Suppose that $w=f(x,y,z)$ and that $x,y,z$ are all
functions of one variable $t$ (so $x=g(t), y=h(t), z=k(t)$).
Use the chain rule with matrix multiplication to explain why
$$\frac{dw}{dt} = \frac{\partial f}{\partial
x}\frac{dg}{dt}+\frac{\partial f}{\partial
y}\frac{dh}{dt}+\frac{\partial f}{\partial z}\frac{dk}{dt}
.$$ which is equivalent to writing $$\frac{dw}{dt} =
\frac{\partial f}{\partial x}\frac{dx}{dt}+\frac{\partial
f}{\partial y}\frac{dy}{dt}+\frac{\partial f}{\partial
z}\frac{dz}{dt} .$$ [Hint: Rewrite the parametric equations
for $x$, $y$, and $z$ in vector form $\vec r(t) = (x,y,z)$
and compute $Dw(x,y,z)$ and $D\vec r(t)$.]
- Suppose that $R=f(V,T,n,P)$, and that $V,T,n,P$ are all
functions of $x$. Give a formula (similar to the above) for
$\dfrac{dR}{dx}.$
See Larson 13.5:19–26 for more
practice.
Make sure you practice problems 14.4:
13-24. Use matrix multiplication, rather than the
“branch diagram” referenced in the
text. See Larson 13.5:7–10
for more practice.
Suppose $z=f(s,t)$ and $s$ and $t$ are functions of
$u$, $v$ and $w$. Use the chain rule to give a general formula
for $\partial z/\partial u$, $\partial z/\partial v$, and
$\partial z/\partial w$.
If $w=f(x,y,z)$ and $x,y,z$ are functions of $u$ and $v$,
obtain formulas for $\dfrac{\partial f}{\partial u}$ and
$\dfrac{\partial f}{\partial v}$.
We have
$Df(x,y,z) =\begin{bmatrix}\dfrac{\partial f}{\partial
x}&\dfrac{\partial f}{\partial y}&\dfrac{\partial
f}{\partial z}\end{bmatrix}$. The parametrization $\vec
r(u,v)=(x,y,z)$ has derivative $D\vec r =\begin{bmatrix}
\dfrac{\partial x}{\partial u}&\dfrac{\partial x}{\partial
v}\\ \dfrac{\partial y}{\partial u}&\dfrac{\partial
y}{\partial v}\\ \dfrac{\partial z}{\partial
u}&\dfrac{\partial z}{\partial v} \end{bmatrix}$. The
product is $D(f(\vec r(u,v))) =\begin{bmatrix} \dfrac{\partial
f}{\partial x}\dfrac{\partial x}{\partial u}+ \dfrac{\partial
f}{\partial y}\dfrac{\partial y}{\partial u}+ \dfrac{\partial
f}{\partial z}\dfrac{\partial z}{\partial u}&
\dfrac{\partial f}{\partial x}\dfrac{\partial x}{\partial v}+
\dfrac{\partial f}{\partial y}\dfrac{\partial y}{\partial v}+
\dfrac{\partial f}{\partial z}\dfrac{\partial z}{\partial v}
\end{bmatrix} $. The first column is $\dfrac{\partial
f}{\partial u}$, and the second column is $\dfrac{\partial
f}{\partial v}$.
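If you'd like to see these two formulas in action, here is a short Python sketch on a made-up example. The choices $f(x,y,z)=xy+z$, $x=uv$, $y=u+v$, $z=v^2$, the point $(u,v)=(1,2)$, and the function names are all assumptions for illustration, not one of the problems above. Every partial derivative in the code was computed by hand, and the chain rule sums are compared with the result of substituting first (which gives $f=u^2v+uv^2+v^2$) and then differentiating.

```python
def f_u_chain(u, v):
    # First column of the product: f_x x_u + f_y y_u + f_z z_u.
    x, y = u * v, u + v
    f_x, f_y, f_z = y, x, 1.0      # partials of f(x, y, z) = x*y + z
    x_u, y_u, z_u = v, 1.0, 0.0    # partials of x, y, z with respect to u
    return f_x * x_u + f_y * y_u + f_z * z_u

def f_v_chain(u, v):
    # Second column of the product: f_x x_v + f_y y_v + f_z z_v.
    x, y = u * v, u + v
    f_x, f_y, f_z = y, x, 1.0
    x_v, y_v, z_v = u, 1.0, 2 * v  # partials of x, y, z with respect to v
    return f_x * x_v + f_y * y_v + f_z * z_v

def f_u_direct(u, v):
    # Partial with respect to u of u^2 v + u v^2 + v^2.
    return 2 * u * v + v * v

def f_v_direct(u, v):
    # Partial with respect to v of u^2 v + u v^2 + v^2.
    return u * u + 2 * u * v + 2 * v

u0, v0 = 1.0, 2.0
print("chain rule: ", f_u_chain(u0, v0), f_v_chain(u0, v0))
print("substitute: ", f_u_direct(u0, v0), f_v_direct(u0, v0))
```

Each input variable ($u$ and $v$) gets its own column, and each column is a sum with one term for every intermediate variable ($x$, $y$, and $z$).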
You've now got the key ideas needed to use the chain rule
in all dimensions. You'll find this shows up many places in
upper-level math, physics, and engineering courses. The following
problem will show you how you can use the general chain rule to
get an extremely quick way to perform implicit differentiation
from first-semester calculus.
See 14.4: 25-32 to practice using the
formula you developed. To practice the idea developed
in this problem, show that if $w=F(x,y,z)$ is held constant
at $w=c$ and we assume that $z=f(x,y)$ depends on $x$ and
$y$, then $\frac{\partial z}{\partial x} = -\frac{F_x}{F_z}$
and $\frac{\partial z}{\partial y} = -\frac{F_y}{F_z}$.
This is done on page 798 at the
bottom.
See Larson 13.5:27–30 for more
practice, and see pages 929–930 for how the book derives
these formulas.
Suppose $z=f(x,y)$. If $z$ is held constant, this
produces a level curve. As an example, if $f(x,y) =
x^2+3xy-y^3$ then $5=x^2+3xy-y^3$ is a level curve. Our goal in
this problem is to find $dy/dx$ in terms of partial derivatives
of $f$.
- Suppose $x=x$ and $y=y(x)$, so $y$ is a function of $x$.
We can write this in parametric form as $\vec r(x) =
(x,y(x))$. We now have $z=f(x,y)$ and $\vec r(x)=(x,y(x))$.
Compute both $Df(x,y)$ and $D\vec r(x)$ symbolically. Don't
use the function $f(x,y)=x^2+3xy-y^3$ until the last
step.
- Use the chain rule to compute $D(f(\vec
r(x)))$. What is $dz/dx$ (i.e., $df/dx$)?
- Since $z$ is held constant, we know that $dz/dx=0$. Use
this fact, together with the previous part, to
explain why $\ds \frac{dy}{dx} = -\frac{f_x}{f_y} =
-\frac{\partial f/ \partial x}{\partial f/ \partial y}$.
- For the curve $5=x^2+3xy-y^3$, use this formula to
compute $dy/dx$.
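The formula $\ds\frac{dy}{dx}=-\frac{f_x}{f_y}$ can be sanity-checked on a curve where we can also solve for $y$. The Python sketch below is only an illustration: the level curve $x^2+y^2=25$, the point $(3,4)$, and the function names are assumptions chosen for this check (this is not the curve from the problem above). The partial derivatives $f_x=2x$ and $f_y=2y$ were computed by hand.

```python
import math

def f_x(x, y):
    # Partial of the hypothetical f(x, y) = x^2 + y^2 with respect to x.
    return 2 * x

def f_y(x, y):
    # Partial of f(x, y) = x^2 + y^2 with respect to y.
    return 2 * y

def slope_from_partials(x, y):
    # dy/dx = -f_x / f_y along a level curve of f.
    return -f_x(x, y) / f_y(x, y)

def y_upper(x):
    # Top half of the level curve x^2 + y^2 = 25, solved for y.
    return math.sqrt(25 - x * x)

def numeric_slope(x, h=1e-6):
    # Central-difference estimate of dy/dx from the explicit formula.
    return (y_upper(x + h) - y_upper(x - h)) / (2 * h)

x0 = 3.0
y0 = y_upper(x0)
formula_slope = slope_from_partials(x0, y0)
check_slope = numeric_slope(x0)
print("point on the curve:", (x0, y0))
print("slope from -f_x/f_y:", formula_slope)
print("slope from the explicit curve:", check_slope)
```

The appeal of the formula is that it never requires solving for $y$. The explicit solution here is used only to confirm the answer.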