Derivatives

This unit covers the following ideas. In preparation for the quiz and exam, make sure you have a lesson plan containing examples that explain and illustrate the following concepts.

  1. Find limits, and be able to explain when a function does not have a limit by considering different approaches.
  2. Compute partial derivatives. Explain how to obtain the total derivative from the partial derivatives (using a matrix).
  3. Find equations of tangent lines and tangent planes to surfaces. We'll do this three ways.
  4. Find derivatives of composite functions, using the chain rule (matrix multiplication).
You'll have a chance to teach your examples to your peers prior to the exam.

Limits

See Larson 13.2 for more about limits and continuity.

In the previous chapter, we learned how to describe lots of different functions. In first-semester calculus, after reviewing functions, you learned how to compute limits of functions, and then used those ideas to develop the derivative of a function. The exact same process is used to develop calculus in high dimensions. One glitch that will prevent us from developing calculus this way in high dimensions is the epsilon-delta definition of a limit. We'll review it briefly; those of you who want to pursue further mathematical study will spend much more time on this topic in future courses. Here's the formal epsilon-delta definition of a limit from first-semester calculus.

Let $f:\RR\to\RR$ be a function. We write $\ds \lim_{x\to c} f(x)=L$ if and only if for every $\epsilon>0$, there exists a $\delta>0$ such that $0<|x-c|<\delta$ implies $|f(x)-L|<\epsilon$.

We're looking at this formal definition here because we can compare it with the formal definition of limits in higher dimensions. The only difference is that we just put vector symbols above the input $x$ and the output $f(x)$.

Let $\vec f:\RR^n\to\RR^m$ be a function. We write $\ds \lim_{\vec x\to \vec c} \vec f(\vec x)=\vec L$ if and only if for every $\epsilon>0$, there exists a $\delta>0$ such that $0<|\vec x-\vec c|<\delta$ implies $|\vec f(\vec x)-\vec L|<\epsilon$.

We'll find that throughout this course, the key difference between first-semester calculus and multivariate calculus is that we replace the input $x$ and output $y$ of functions with the vectors $\vec x$ and $\vec y$.

The point of this problem is to help you learn to recognize the dimensions of the domain and codomain of the function. If we write $\vec f:\RR^n\to \RR^m$, then $\vec x$ is a vector in $\RR^n$ with $n$ components, and $\vec y$ is a vector in $\RR^m$ with $m$ components.
For the function $f(x,y)=z$, we can write $f$ in the vector notation $\vec y=\vec f(\vec x)$ if we let $\vec x=(x,y)$ and $\vec y=(z)$. Notice that $\vec x$ is a vector of inputs, and $\vec y$ is a vector of outputs. For each of the functions below, state what $\vec x$ and $\vec y$ should be so that the function can be written in the form $\vec y = \vec f (\vec x)$.
  1. $f(x,y,z)=w$
  2. $\vec r(t)=(x,y,z)$
  3. $\vec r(u,v)=(x,y,z)$
  4. $\vec F(x,y)=(M,N)$
  5. $\vec F(\rho,\phi,\theta)=(x,y,z)$

You learned to work with limits in first-semester calculus without needing the formal definitions above. Many of those techniques apply in higher dimensions. The following problem has you review some of these techniques and apply them in higher dimensions.

See 14.2: 1-30 for more practice.
Do these problems (without using L'Hopital's rule).
  1. Compute $\ds \lim_{x\to 2} x^2-3x+5$ and then $\ds\lim_{(x,y)\to (2,1)} 9-x^2-y^2$.
  2. Compute $\ds\lim_{x\to 3}\frac{x^2-9}{x-3}$ and then $\ds\lim_{(x,y)\to (4,4)} \frac{x-y}{x^2-y^2}$.
  3. Explain why $\ds\lim_{x\to 0}\frac{x}{|x|}$ does not exist. [Hint: graph the function.]

In first-semester calculus, we can show that a limit does or does not exist by considering what happens from the left, and comparing it to what happens from the right. You probably used the following theorem extensively.

If $y=f(x)$ is a function defined on some open interval containing $c$, then $\ds\lim_{x\to c}f(x)$ exists if and only if both one-sided limits exist and $\ds\lim_{x\to c^-}f(x) = \ds\lim_{x\to c^+}f(x)$.

A limit exists precisely when the limits from every direction exist, and all directional limits are equal. In first-semester calculus, this required that you check two directions (left and right). This theorem generalizes to higher dimensions, but it becomes much more difficult to apply.

Consider the function $\ds f(x,y)=\frac{x^2-y^2}{x^2+y^2}$. Our goal is to determine if the function has a limit at the origin $(0,0)$. We can approach the origin along many different lines. One line through the origin is the line $y=2x$. If we stay on this line, then we can replace each $y$ with $2x$ and then compute $$\ds\lim_{\substack{(x,y)\to(0,0)\\ y=2x }}\frac{x^2-y^2}{x^2+y^2} = \lim_{x\to 0} \frac{x^2-(2x)^2}{x^2+(2x)^2} = \lim_{x\to 0} \frac{-3x^2}{5x^2} = \lim_{x\to 0} \frac{-3}{5} =\frac{-3}{5}.$$ This means that if we approach the origin along the line $y=2x$, we will have a height of $-3/5$ when we arrive at the origin.

If the function $\ds f(x,y)=\frac{x^2-y^2}{x^2+y^2}$ has a limit at the origin, the previous example suggests that limit will be $-3/5$.
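The substitution $y=2x$ in the example above can also be checked numerically. Here is a minimal Python sketch, assuming nothing beyond the formula for $f$; the helper name `f_along_line` is ours and is not part of the text. It evaluates $f$ at points on the line $y=2x$ that get closer and closer to the origin.

```python
def f(x, y):
    """The function from the example: (x^2 - y^2) / (x^2 + y^2)."""
    return (x**2 - y**2) / (x**2 + y**2)


def f_along_line(slope, x):
    """Value of f at the point (x, slope*x) on the line y = slope*x (hypothetical helper)."""
    return f(x, slope * x)


# Walk toward the origin along y = 2x and record the heights we see.
xs = [10.0 ** (-k) for k in range(1, 7)]
heights = [f_along_line(2.0, x) for x in xs]

for x, height in zip(xs, heights):
    print(f"x = {x:.0e}   f(x, 2x) = {height:.6f}")
```

Every height agrees with $-3/5$ up to rounding, which matches the algebra: on this line the $x^2$ factors cancel, so the height does not depend on how close we are to the origin. Changing the slope passed to `f_along_line` lets you explore other lines.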

Please read the previous example. Recall that we are looking for the limit of the function $\ds f(x,y)=\frac{x^2-y^2}{x^2+y^2}$ at the origin $(0,0)$.
You may want to look at a graph in Sage or Wolfram Alpha (try using the “contour lines” option). As you compute each limit, make sure you understand what that limit means in the graph.
  1. In the $xy$-plane, how many lines pass through the origin $(0,0)$? Give an equation of a line other than $y=2x$ that passes through the origin. Then compute $$\ds\lim_{\substack{(x,y)\to(0,0)\\ \text{your line} }}\frac{x^2-y^2}{x^2+y^2} = \lim_{x\to 0} \frac{x^2-(?)^2}{x^2+(?)^2}=\ldots.$$
  2. Give an equation of another line that passes through the origin. Then compute $$\ds\lim_{\substack{(x,y)\to(0,0)\\ \text{your line}}}\frac{x^2-y^2}{x^2+y^2}.$$
  3. Does this function have a limit at $(0,0)$? Explain.
    See 14.2: 41-50 for more practice. See Larson 13.2:23–36 and example 4 for more practice.

The theorem from first-semester calculus generalizes as follows.

If $\vec y=\vec f(\vec x)$ is a function defined on some open region containing $\vec c$, then $\ds\lim_{\vec x\to \vec c}\vec f(\vec x)$ exists if and only if the limit exists along every possible approach to $\vec c$ and all these limits are equal.

There's a fundamental problem with using this theorem to check if a limit exists. Once the domain is 2-dimensional or higher, there are infinitely many ways to approach a point. There is no longer just a left and right side. To prove a limit exists with this theorem, you would have to check infinitely many approaches, which is usually not practical. The real power of this theorem is that it lets us show that a limit does not exist: all we have to do is find two approaches with different limits.
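To see the "two approaches with different limits" strategy in action, here is a small Python sketch. The function `g` is an illustrative choice of ours, not one of the functions in the problems, and the two approaches are simply the two coordinate axes.

```python
def g(x, y):
    """Illustrative function (not from the text): x^2 / (x^2 + y^2)."""
    return x**2 / (x**2 + y**2)


# Parameter values shrinking toward 0.
ts = [10.0 ** (-k) for k in range(1, 7)]

# Approach (0, 0) along the x-axis (y = 0) and along the y-axis (x = 0).
along_x_axis = [g(t, 0.0) for t in ts]
along_y_axis = [g(0.0, t) for t in ts]

print("along y = 0:", along_x_axis)
print("along x = 0:", along_y_axis)
```

Along the $x$-axis the values are all 1, and along the $y$-axis they are all 0. Two approaches give two different answers, so by the theorem this particular $g$ has no limit at the origin.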

See Sage.
See 14.2: 41-50 for more practice. See Larson 13.2:9–36 for more practice.
Consider the function $\ds f(x,y) = \frac{xy}{x^2+y^2}$. Does this function have a limit at $(0,0)$?
  1. Examine the function at $(0,0)$ by considering the limit as you approach the origin along several lines.
  2. Convert $\frac{xy}{x^2+y^2}$ to polar coordinates (i.e., a function of $r$ and $\theta$). As $(x,y)$ approaches the origin, what does $r$ approach? Take the limit of your polar coordinate function as $r$ approaches that value and interpret your result.

In all the examples above, we considered approaching a point by traveling along a line. However, even if a function has a consistent limit along every line, that is not enough to always guarantee the function has a limit. The theorem requires every approach be consistent, which includes parabolic approaches, spiraling approaches, and more. Sometimes the straight-line paths happen to be consistent with each other, but a different path gives a different limit. Give some thought to this in the optional challenge problem below.

Give an example of a function $f(x,y)$ so that the limit at $(0,0)$ along every straight line $y=mx$ exists and equals 0. However, show that the function has no limit at $(0,0)$ by considering an approach that is not a straight line.

The Derivative

Before we introduce derivatives, let's recall the definition of a differential. If $y=f(x)$ is a function, then we say the differential $dy$ is the expression $dy=f'(x) dx$ (we could also write this as $dy = \frac{dy}{dx}dx$). Think of differential notation $dy=f'(x)dx$ in the following way:

A small change in the output $y$ equals the derivative multiplied by a small change in the input $x$. In other words, if the input $x$ changes by a small amount $dx$, then we multiply that small change by the derivative to determine how much $y$ changes (i.e., $dy$).
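Here is a quick numerical illustration of $dy=f'(x)dx$ as a Python sketch. The function $y=x^3$, the point $x=2$, and the step $dx=0.1$ are illustrative choices of ours, not taken from the text.

```python
def f(x):
    """Illustrative function y = x^3."""
    return x**3


def f_prime(x):
    """Its first-semester calculus derivative, 3x^2."""
    return 3 * x**2


x0 = 2.0   # where we start
dx = 0.1   # small change in the input

dy = f_prime(x0) * dx            # change predicted by the differential
actual = f(x0 + dx) - f(x0)      # true change in the output

print("dy     =", dy)
print("actual =", actual)
```

The differential predicts a change of about $1.2$, while the true change is about $1.261$. The two agree better and better as $dx$ shrinks.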

To get the derivative in all dimensions, we just substitute in vectors to obtain the differential notation $d\vec y = f'(\vec x) d\vec x$. The derivative is precisely the thing that tells us how to get $d\vec y$ from $d\vec x$. We'll quickly see that the derivative $f'(\vec x)$ must be a matrix, and we'll start writing it as $Df$ instead of $f'$. We've actually already dealt with problems involving derivatives of multiple-variable functions in first-semester calculus. The next few problems are very similar to related rates or differential problems from first-semester calculus, and we'll see how the derivative $f'$ in $dy=f'(x) dx$ naturally generalizes to a matrix.

See 3.10 for more practice. See Larson 3.7 and 4.8.
The volume of a right circular cylinder is $V(r,h)= \pi r^2 h$. Imagine that each of $V$, $r$, and $h$ depends on $t$ (we might be collecting rain water in a can, or crushing a cylindrical concentrated juice can, etc.).
  1. If the height remains constant, but the radius changes, what is $dV/dt$ in terms of $dr/dt$? Use this to find a formula for $dV$ in terms of $dr$ when $h$ is constant.
  2. If the radius remains constant, but the height changes, what is $dV/dt$ in terms of $dh/dt$? What is $dV$ when $r$ is constant?
  3. If both the radius and height change, what is $dV/dt$ in terms of $dh/dt$ and $dr/dt$? Solve for $dV$.
  4. Show that we can write $dV$ as the matrix product of a 1-row by 2-column matrix with a 2-row by 1-column matrix: $$dV = \begin{bmatrix}2\pi rh& \pi r^2\end{bmatrix}\begin{bmatrix}dr\\dh\end{bmatrix}.$$ How do the columns of the first matrix relate to the calculations you did above?
    Make sure you ask me in class to show you physically exactly how you can see these differential formulas.

    The matrix $\begin{bmatrix}2\pi rh& \pi r^2\end{bmatrix}$ is the derivative of $V$. The columns of this matrix are the partial derivatives of $V$.

  5. If we know that $r=3$ and $h=4$, and we know that $r$ increases by about $.1$ and $h$ increases by about $.2$, then approximate how much $V$ will increase. Use your formula for $dV$ to approximate this.
The volume of a box is $V(x,y,z)=xyz$. Imagine that each variable depends on $t$.
  1. If both $y$ and $z$ remain constant, what is $dV/dt$? Use this to find a formula for $dV$ in terms of $dx$, assuming that $y$ and $z$ are constant.
  2. Repeat the last step for when $y$ is the only variable that changes, and then for when $z$ is the only variable that changes.
  3. What is $dV/dt$ in terms of $dx/dt$, $dy/dt$, and $dz/dt$ when all three variables are changing? Solve for $dV$.
  4. Show that we can write $dV$ as the matrix product of a 1-row by 3-column matrix with a 3-row by 1-column matrix: $$dV = \begin{bmatrix}yz& ?&?\end{bmatrix}\begin{bmatrix}dx\\dy\\dz\end{bmatrix}.$$ How do the columns of the first matrix relate to the previous portions of the problem?

    The matrix $\begin{bmatrix}yz& ?&?\end{bmatrix}$ is the derivative of $V$. The columns of this matrix are the partial derivatives of $V$.

  5. If the current measurements of a box are $x=2$, $y=3$, and $z=5$, and we know that $x$ increases by .01, $y$ increases by .02, and $z$ decreases by .03, then by about how much will the volume change? Use your formula for $dV$ to approximate this.

Part 4 in each problem above, expressing relationships between changes in terms of differentials, is the KEY idea, let me repeat, THE KEY IDEA, to the rest of this course. The essential thing is that we can use differentials to understand and approximate how small changes in the inputs of a function will change the outputs of the function. For example, we can approximate the change in a function $f(x,y)$ if we know how much $x$ and $y$ will change.

Using differentials to analyze how outputs of a function change when the inputs change comes up in many applications, such as analyzing numerical roundoff error in calculations or analyzing manufacturing tolerances.
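As a sketch of how such an approximation can be carried out on a computer, consider the area of a rectangle, $A(x,y)=xy$. This function and the numbers below are illustrative choices of ours, not from the problems above. Its derivative is the 1-row matrix $\begin{bmatrix}y&x\end{bmatrix}$, and the helper `differential` (a name we made up) performs the row-times-column product.

```python
def area(x, y):
    """Illustrative function: area of an x-by-y rectangle."""
    return x * y


def D_area(x, y):
    """Derivative of the area as a 1-row matrix [dA/dx, dA/dy] = [y, x]."""
    return [y, x]


def differential(row, changes):
    """Product of a 1-by-n row matrix with an n-by-1 column of small changes."""
    return sum(partial * change for partial, change in zip(row, changes))


x0, y0 = 3.0, 4.0    # current measurements
dx, dy = 0.1, 0.2    # small changes in the inputs

dA = differential(D_area(x0, y0), [dx, dy])          # estimated change
actual = area(x0 + dx, y0 + dy) - area(x0, y0)       # true change

print("estimated change dA =", dA)
print("actual change       =", actual)
```

The estimate is about $1.0$ and the true change is about $1.02$, so the differential captures nearly all of the change with a single matrix product.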

Consider the function $f(x,y) = x^2y +3x+4\sin(5y)$.
  1. If both $x$ and $y$ depend on $t$, then use implicit differentiation to obtain a formula for $df/dt$ in terms of $dx/dt$ and $dy/dt$.
  2. Solve for $df$, and write your answer as the matrix product (fill in the question mark) $$df = \begin{bmatrix}?& x^2+20\cos(5y)\end{bmatrix}\begin{bmatrix}dx\\dy\end{bmatrix}.$$
  3. If you hold $y$ constant (so $dy=0$), then what is $df/dx$? If you are at $(x,y)=(1,2)$, and if $x$ changes by $.2$, then about how much does $z=f(x,y)$ change?
  4. If you hold $x$ constant (so $dx=0$), then what is $df/dy$? If you are at $(x,y)=(1,2)$, and if $y$ changes by $-.3$, then about how much does $z=f(x,y)$ change?
  5. If you are at the point $(x,y)=(1,2)$, and you move to $(1.2, 1.7)$, then what is $dx$ and $dy$? Approximate how much $z=f(x,y)$ changes.
  6. (Challenge) In the last part, why was our calculation an approximation, rather than an exact calculation of how much $f(x,y)$ changed?

We need to add some vocabulary to make it easier to talk about what we just did. Let's introduce the vocabulary in terms of the problem above, and then make a formal definition.

Now we generalize the above example.

The textbook only talks about partial derivatives (see Larson 13.3). We emphasize the total derivative because it is more powerful, simpler, and helps us understand the concepts much better.
Let $f$ be a function.
  • The partial derivative of $f$ with respect to $x$ is the normal first-semester calculus derivative of $f$, provided we hold every input variable constant except $x$. We'll use the notations $$ \frac{\partial f}{\partial x}, \quad \frac{\partial}{\partial x}[f], \quad f_x, \quad \text{ and }D_x f $$ to mean the partial of $f$ with respect to $x$.
  • The partial of $f$ with respect to $y$, written $\ds \frac{\partial f}{\partial y}$, $\ds \frac{\partial }{\partial y}[f]$, $f_y$, or $D_yf$, is the normal first-semester calculus derivative of $f$ assuming that $y$ is the only variable changing (all other variables are constant). A similar definition holds for partial derivatives with respect to any variable.
  • The derivative of $f$ is a matrix. The columns of the derivative are the partial derivatives. When there's more than one input variable, we'll use the notation $Df$ rather than $f'$ for the derivative. The order of the columns must match the order of the variables that are inputs to the function. If the function is $f(x,y)$, then the derivative is $Df(x,y) = \begin{bmatrix}\frac{\partial f}{\partial x}&\frac{\partial f}{\partial y}\end{bmatrix}.$ If the function is $V(x,y,z)$, then the derivative is $DV(x,y,z) = \begin{bmatrix}\frac{\partial V}{\partial x}&\frac{\partial V}{\partial y}&\frac{\partial V}{\partial z}\end{bmatrix}.$
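The definition above translates directly into a numerical recipe: to estimate a partial derivative, nudge one input while holding the others fixed. Below is a minimal Python sketch under that assumption. The helper names `partial` and `derivative_row`, the central-difference step size, and the test function $g(x,y)=x\sin y$ are all our own illustrative choices.

```python
import math


def partial(f, point, index, h=1e-6):
    """Central-difference estimate of the partial derivative of f
    with respect to input number `index`, holding the other inputs constant."""
    up = list(point)
    down = list(point)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2 * h)


def derivative_row(f, point, h=1e-6):
    """Estimate Df at `point` as a 1-row matrix, one column per input variable."""
    return [partial(f, point, i, h) for i in range(len(point))]


def g(x, y):
    """Illustrative function (not from the text)."""
    return x * math.sin(y)


point = (2.0, 1.0)
numeric = derivative_row(g, point)
by_hand = [math.sin(1.0), 2.0 * math.cos(1.0)]   # [g_x, g_y] computed by hand

print("numeric Dg:", numeric)
print("by hand   :", by_hand)
```

The columns come out in the same order as the inputs $(x,y)$, as the definition requires. A numerical estimate like this is only a check on hand computation; it does not replace computing the partials symbolically.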

It's time to practice these new words in some problems. Remember, we're doing the exact same thing as we did at the start of this section. We're just using the vocabulary of differentiation.

Let's apply the problem and definition above. Fill in the blanks.
  1. If $f(x,y)=9-x^2-y^2$, then $df = f_xdx+f_ydy=\blank{1in}\,dx+\blank{1in}\,dy$. If we are on the surface at the point $(x,y,z)=(2,1,4)$, then the differential is $df=\blank{1cm}\,dx+\blank{1cm}\,dy$ (just plug $x=2$ and $y=1$ into the partial derivatives). If we move along the surface to $(x,y)=(2.1,1.1)$, then our change in $x$ is $\Delta x=\blank{1cm}$, our change in $y$ is $\Delta y=\blank{1cm}$, and the differential $df$ at $(x,y)=(2,1)$ estimates our change in height $\Delta f$ to be about $\Delta f\approx \blank{1cm}$ (just plug in $\Delta x$ for $dx$ and $\Delta y$ for $dy$ to get a single number).
  2. If $f(x,y,z)=xy^2+yz^2$, then $$df = \blank{1in}\,dx+\blank{1in}\,dy+\blank{1in}\,dz.$$ If we are at the input vector $(x,y,z)=(1,-2,3)$, then the differential is $df = \blank{1cm}\,dx+\blank{1cm}\,dy+\blank{1cm}\,dz$. If we move to $(x,y,z)=(0.9, -2.2, 2.8)$, then the change in $x$ is $\Delta x=\blank{1cm}$, the change in $y$ is $\Delta y=\blank{1cm}$, and the change in $z$ is $\Delta z=\blank{1cm}$. The differential $df$ at $(x,y,z)=(1,-2,3)$ helps us estimate the change in $f$ to be about $\Delta f\approx \blank{1cm}$. [Hint: plug in our numeric $x$, $y$, $z$, $\Delta x$, $\Delta y$, and $\Delta z$.]
  3. When a function has multiple outputs, we view the differential as a multiple-row and 1-column matrix, where each row has the partial derivative of one of the outputs. If $\vec r(t)=(3t^2, 2/t)$, then $$d\vec r = \begin{bmatrix}(3t^2)' \\ (2/t)'\end{bmatrix} dt=\begin{bmatrix}\blank{1in}\\\blank{1in}\end{bmatrix}dt.$$ If we are at $t=2$, then our differential becomes $d\vec r = \begin{bmatrix}\blank{1cm}\\\blank{1cm}\end{bmatrix}$. If we move to $t=2.1$, then the change in $t$ is $\Delta t=\blank{1cm}$ and the change in $\vec r$, as estimated by the differential $d\vec r$ at $t=2$, is approximately $\Delta \vec r\approx \begin{bmatrix}\blank{1cm}\\\blank{1cm}\end{bmatrix}$.
  4. If $\vec r(u,\theta)=(u\cos(\theta), u\sin(\theta), u^2)$, then $$d\vec r = \begin{bmatrix}\blank{1in}\\\blank{1in}\\\blank{1in}\end{bmatrix}du + \begin{bmatrix}\blank{1in}\\\blank{1in}\\\blank{1in}\end{bmatrix}d\theta.$$ If we are at $(u,\theta)=(2,\pi/3)$, then the differential is $$d\vec r = \begin{bmatrix}\blank{1cm}\\\blank{1cm}\\\blank{1cm}\end{bmatrix}du + \begin{bmatrix}\blank{1cm}\\\blank{1cm}\\\blank{1cm}\end{bmatrix}d\theta.$$ If we move to $(u,\theta)=(1.9, \pi/2)$, then the change in $u$ is $\Delta u=\blank{1cm}$, the change in $\theta$ is $\Delta \theta=\blank{1cm}$, and the change in $\vec r$, as estimated by the differential $d\vec r$ at $(u,\theta)=(2,\pi/3)$, is approximately $\Delta \vec r\approx \begin{bmatrix}\blank{1cm}\\\blank{1cm}\\\blank{1cm}\end{bmatrix}$.
Use Sage to check your answers. See 14.3: 1-40 for more practice. See Larson 13.3:9–40 for more practice in doing partial derivatives. I strongly suggest you practice a lot of this type of problem until you can compute partial derivatives with ease.
Compute the requested partial and total derivatives.
  1. For $f(x,y)=x^2+2xy+3y^2$, compute both $\ds\frac{\partial f}{\partial x}$ and $f_y$. Then write down $Df(x,y)$.
  2. For $f(x,y,z)=x^2y^3z^4$, compute $f_x$, $\ds\frac{\partial f}{\partial y}$, and $D_z f$. Then write down $Df(x,y,z)$.
When a function has multiple outputs, its partial derivatives will have multiple components, which we write as a column vector. For example, if $f(x,y)=(3x^2, \sin(x)+xy)$, then $f_x=\left(\begin{matrix}6x\\\cos(x)+y\end{matrix}\right)$. This is similar to when we were computing derivatives of space curves, for example in this problem.
Let $\vec F(x,y)=(-y^3,2xy)$, a 2D vector field. Then $\vec F_x=\begin{bmatrix}0\\2y\end{bmatrix}$, $\vec F_y=\begin{bmatrix}-3y^2\\2x\end{bmatrix}$, and $D\vec F = \begin{bmatrix}0&-3y^2\\2y&2x\end{bmatrix}$. Also, $d\vec F=\begin{bmatrix}0&-3y^2\\2y&2x\end{bmatrix}\begin{bmatrix}dx\\dy\end{bmatrix}$. Remember that we can visualize a vector field by drawing arrows on the plane (for example, like wind velocities at every point on the plane). We can interpret this last equation as saying that if we are at a point $(x,y)$ on the plane, and we move a small distance away to $(x+dx, y+dy)$, then $d\vec F$ approximates how much the wind vector changes between the two points (i.e., how much the output of $\vec F$ changes). The derivative $D\vec F$ relates a small change in the inputs (moving a small distance) to changes in the outputs (the wind vector). We can then ask questions like: what direction should I move so that the velocity of the wind goes down? What direction should I move so that the wind blows more in a northern direction? If I walk in this direction, will the wind keep pushing me in the same direction? Will it get stronger or weaker?
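The relationship $d\vec F = D\vec F\, d\vec x$ for this vector field can be tried out numerically. The Python sketch below uses the matrix stated above; the sample point $(1,2)$, the small move $(dx,dy)=(0.01,-0.02)$, and the helper `mat_vec` are our own illustrative choices.

```python
def F(x, y):
    """The vector field from the example: F(x, y) = (-y^3, 2xy)."""
    return (-y**3, 2 * x * y)


def DF(x, y):
    """The derivative matrix stated in the example, stored as a list of rows."""
    return [[0.0, -3 * y**2],
            [2 * y, 2 * x]]


def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


x0, y0 = 1.0, 2.0        # sample point, chosen for illustration
dx, dy = 0.01, -0.02     # a small move away from that point

dF = mat_vec(DF(x0, y0), [dx, dy])                               # estimated change
actual = [new - old for new, old in zip(F(x0 + dx, y0 + dy), F(x0, y0))]

print("estimated change in F:", dF)
print("actual change in F   :", actual)
```

The estimate is about $(0.24, 0)$ and the actual change is about $(0.2376, -0.0004)$, so the matrix product predicts how the wind vector changes to within a small error.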
Use Sage to check your work in each part.
Do the following for each of the functions below:
  • Compute the partial derivatives and the total (matrix) derivative.
  • Tell what the domain and codomain are (i.e., $\mathbb{R}^?\to\mathbb{R}^?$).
  • Write down the relationship between small changes in inputs and outputs (in the form analogous to $d\vec y = D\vec f d\vec x$) and interpret this relationship graphically.
  1. The parametric curve $\vec r(t)=(t,\cos t,\sin t)$.
  2. The parametric surface $\vec r(u,v) = (u,v,u\cos(v))$. [Hint: compute partial derivatives $\vec r_u$, $\vec r_v$ and the total derivative $D\vec r(u,v)$, a 3-row by 2-column matrix].
  3. The vector field $\vec F(x,y) = (-y,xe^{3y})$. [Hint: compute partial derivatives $\vec F_x$ and $\vec F_y$ and the total derivative $D\vec F(x,y)$, a 2-row by 2-column matrix.]
  4. The parametric surface $\vec f(u,v)=(u^2,v^2,u-v)$.
  5. The space transformation $\vec T(r,\theta,z)=(r\cos\theta, r\sin\theta, z)$.
As you completed the problems above, did you notice any connections between the size of the matrix and the size of the input and output vectors? Make sure you ask in class about this. We'll make a connection. We've now seen that the derivative of $z=f(x,y)$ is a matrix $Df(x,y) = \begin{bmatrix}f_x & f_y\end{bmatrix}$. This means that $Df$ is itself a function from $\mathbb{R}^2$ to $\mathbb{R}^2$ that has inputs $x$ and $y$ and outputs $f_x$ and $f_y$. Therefore we can draw $Df$ as a 2d vector field.
Check your work with Sage.
Consider the function $f(x,y)=y-x^2$.
  1. In the $xy$ plane, please draw several level curves of $f$ (maybe $z=0$, $z=2$, $z=-4$, etc.) Write the height on each curve (so you're making a contour plot).
  2. Compute the derivative $Df$ (which we'll think of as a vector field).
  3. Pick 8 points in the $xy$ plane that lie on the level curves you drew above. At these 8 points, add the vector given by the derivative evaluated at that point. For example, at $(0,0)$, draw the vector $Df(0,0)=(0,1)$, and at the point $(1,1)$, draw the vector $Df(1,1)=(-2,1)$. What do you observe about the relationship between the vectors and the contour lines?
    We'll examine the connection between the derivative and level curves much more when we study optimization later.

The Geometry of Derivatives

Tangent planes of $z=f(x,y)$

We promised earlier in this chapter that you can obtain most of the results in multivariate calculus by replacing the $x$ and $y$ in $dy=f'dx$ with $\vec x$ and $\vec y$. Let's review how to find the tangent line for functions of the form $y=f(x)$, and then generalize to finding tangent planes for functions of the form $z=f(x,y)$.

Consider the function $y=f(x)=x^2$.
  1. The derivative is $f'(x) = ?$. At the point $x=3$ the derivative is $f'(3)=?$ and the output $y$ is $y=f(3)=?$.
  2. If we move from the point $(3,f(3))$ to the point $(x,y)$ along the tangent line, then a small change in $x$ is $dx=x-3$. What is $dy$?
  3. Differential notation states that a change in the output $dy$ equals the derivative times a change in the input $dx$, which gives us the equation $dy=f'(3)dx$. Replace $dx$, $dy$, and $f'(3)$ with what we know they equal, to obtain an equation $y-?=?(x-?)$. What line does this equation represent?
  4. Draw both $f$ and the equation from the previous part on the same axes.

In first-semester calculus, differential notation says $dy=f' dx$. A small change in the inputs times the derivative gives the change in the outputs. For the next problem, the output is $z$, and the input is $(x,y)$, which means differential notation says $dz = Df \begin{bmatrix}dx\\dy\end{bmatrix}$.

See Sage for a picture.
See 14.6: 9-12 for more practice. See Larson 13.7:17–30 for more practice.
Let $z=f(x,y)=9-x^2-y^2$.
  1. The derivative is $Df(x,y) = \begin{bmatrix}-2x&?\end{bmatrix}$. At the point $(x,y)=(2,1)$, the derivative is $Df(2,1) = \begin{bmatrix}-4&?\end{bmatrix}$ and the output $z$ is $z=f(2,1)=?$.
  2. If we move from the point $(2,1,f(2,1))$ to the point $(x,y,z)$ along the tangent plane, then a small change in $x$ is $dx=x-2$. What are $dy$ and $dz$?
  3. Explain why an equation of the tangent plane is
    $$ z-4=\begin{bmatrix}-4 & -2 \end{bmatrix}\begin{bmatrix}x-2\\y-1\end{bmatrix} \quad \text{or}\quad z-4=-4(x-2)-2(y-1).$$ [Hint: What does differential notation tell us?]
    We'll construct a graph of $f$ and its tangent plane in class.

Look back at the previous two problems. The first-semester calculus tangent line equation, with differential notation, generalized immediately to the tangent plane equation for functions of the form $z=f(x,y)$. We just used the differential notation $dy=f'dx$ in 2D, and generalized to $dz = Df \begin{bmatrix}dx\\dy\end{bmatrix}$. Let's repeat this on another problem.
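One way to convince yourself that the plane $z-4=-4(x-2)-2(y-1)$ from the problem above really hugs the surface $z=9-x^2-y^2$ near $(2,1,4)$ is to compare heights numerically. This Python sketch does that; the step sizes and the diagonal direction we move in are our own illustrative choices.

```python
def f(x, y):
    """The surface from the problem above: z = 9 - x^2 - y^2."""
    return 9 - x**2 - y**2


def tangent_plane(x, y):
    """Height of the plane z - 4 = -4(x - 2) - 2(y - 1), solved for z."""
    return 4 - 4 * (x - 2) - 2 * (y - 1)


# Move away from (2, 1) by smaller and smaller steps and measure the gap
# between the surface and the plane.
steps = [0.1, 0.01, 0.001]
gaps = [abs(f(2 + s, 1 + s) - tangent_plane(2 + s, 1 + s)) for s in steps]

for s, gap in zip(steps, gaps):
    print(f"step = {s}   gap between surface and plane = {gap:.8f}")
```

The plane and the surface agree at $(2,1)$, and each time the step shrinks by a factor of 10 the gap shrinks by a factor of about 100. That is the sense in which the tangent plane is a good approximation near the point of tangency.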

See Sage.
See 14.6: 9-12 for more practice. See Larson 13.7:17–30 for more practice.
Let $f(x,y)=x^2+4xy+y^2$. Give an equation of the tangent plane at $(3,-1)$. [Hint: Just as in the previous problem, find $Df(x,y)$, $dx$, $dy$, and $dz$. Then use differential notation.]

Partial Derivatives of $z=f(x,y)$ functions

We can also understand tangent lines to surfaces using partial derivatives. The next problem will help you visualize what a partial derivative means in the graph of a surface.

See Sage.
See Larson 13.3:53–58 for more practice.
Consider the function $f(x,y)=9-x^2-y^2$. Construct a 3D surface plot of $f$. We'll focus on the point $(2,1)$.
  1. Let $y=1$ and construct a graph in the $xz$ plane of the curve $z=f(x,1)=9-x^2-1^2$. Find an equation of the tangent line to this curve at $x=2$. Write the equation in the form $(z-z_0)=m(x-x_0)$ (find $z_0,m,x_0$). Also, find a direction vector $(1,0,?)$ for this line.
  2. Let $x=2$ and construct a graph in the $yz$ plane of the curve $z=f(2,y)=9-2^2-y^2$. Find an equation of the tangent line to this curve at $y=1$. Write the equation in the form $(z-z_0)=m(y-y_0)$ (find $z_0,m,y_0$). Also, find a direction vector $(0,1,?)$ for this line.
  3. Compute $f_x$ and $f_y$ and then evaluate each at $(2,1)$. What does this have to do with the previous two parts?
  4. If the slope of a line $y=mx+b$ is $m$, then we know that an increase of $1$ unit in the $x$ direction will increase $y$ by $m$ units. Fill in the blanks by using the slopes of tangent lines calculated above for the function $z=f(x,y)=9-x^2-y^2$.
    • Increasing $x$ by 1 unit when $y$ does not change will cause $z$ to increase by about ? units.
    • Increasing $y$ by 1 unit when $x$ does not change will cause $z$ to increase by about ? units.
    • Increasing $x$ by 1 unit and $y$ by 1 unit will cause $z$ to increase by about ? units.
  5. In the previous part, we said that $z=9-x^2-y^2$ increased by about a certain amount each time. Why did we not say that $z=9-x^2-y^2$ increases by exactly that amount?
We'll conclude this section with a note about taking derivatives of higher orders. Since a partial derivative is a function, we can take partial derivatives of that function as well. If we want to first compute a partial with respect to $x$, and then with respect to $y$, we would use one of the following notations: $$f_{xy}=\ds\frac{\partial}{\partial y}\frac{\partial}{\partial x}f = \frac{\partial}{\partial y}\frac{\partial f}{\partial x} = \frac{\partial^2 f}{\partial y \partial x}.$$
See Larson 13.3:71–80 for more practice.
Complete the following:
  1. Let $f(x,y)=3xy^3+e^{x}.$ Compute the four second partials $$\ds \frac{\partial^2 f}{ \partial x^2},\quad \ds\frac{\partial^2 f}{\partial y \partial x},\quad \ds\frac{\partial^2 f}{\partial y^2}, \quad \text{ and }\ds\frac{\partial^2 f}{\partial x \partial y}.$$
  2. For $f(x,y)=x^2\sin(y)+y^3$, compute both $f_{xy}$ and $f_{yx}$.
  3. Make a conjecture about a relationship between $f_{xy}$ and $f_{yx}$. Then use your conjecture to quickly compute $f_{xy}$ if $$f(x,y)=3xy^2+\tan^{2}(\cos(x)) (x^{49}+x)^{1000}.$$
Clairaut's theorem implies that if a function $f$ is “nice” (that is, if $f$ and its partial derivatives and second partials are defined and continuous around a point $(a,b)$), then $f_{xy}=f_{yx}$ at that point. We will be dealing with nice functions of this sort in this class, so we will have this relationship between $f_{xy}$ and $f_{yx}$.
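Here is a small numerical sanity check of Clairaut's theorem, written as a Python sketch. The function $h(x,y)=x^3y^2$, the sample point, and the helper names are our own illustrative choices. We code the first partials by hand, then differentiate each one numerically in the other variable.

```python
def h_x(x, y):
    """Partial of h(x, y) = x^3 y^2 with respect to x, computed by hand."""
    return 3 * x**2 * y**2


def h_y(x, y):
    """Partial of h(x, y) = x^3 y^2 with respect to y, computed by hand."""
    return 2 * x**3 * y


def d_dx(func, x, y, step=1e-5):
    """Central-difference estimate of the partial of func with respect to x."""
    return (func(x + step, y) - func(x - step, y)) / (2 * step)


def d_dy(func, x, y, step=1e-5):
    """Central-difference estimate of the partial of func with respect to y."""
    return (func(x, y + step) - func(x, y - step)) / (2 * step)


x0, y0 = 1.5, -0.5           # sample point, chosen for illustration

h_xy = d_dy(h_x, x0, y0)     # differentiate in x first, then in y
h_yx = d_dx(h_y, x0, y0)     # differentiate in y first, then in x

print("h_xy is approximately", h_xy)
print("h_yx is approximately", h_yx)
```

Both estimates are close to $6x^2y=-6.75$ at this point, in agreement with the theorem. A numerical check at one point is only evidence, not a proof.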
Let $z=f(x,y)=9-x^2-y^2$. We'll look at the point $(2,1)$, like above.
  1. Compute $f_{xx}$, $f_{yy}$, $f_{xy}$, and $f_{yx}$ at the point $(2,1)$.
  2. How do you interpret each of those second partials graphically on this function?
Let $f(x,y)=3xy^2+y\ln x$.
  1. Compute $Df$, the matrix derivative.
  2. We can think of $Df$ as being a function with some inputs and some outputs. How many inputs and how many outputs does $Df$ have?
  3. Let $g(x,y)=Df(x,y)$. Compute $Dg$ (which is the second derivative of $f$, or $D^2f$). How does each entry in $D^2f$ relate to $f(x,y)$?

Partial derivatives of parametric functions

Now let's examine computations similar to those in this problem in the light of parametric surfaces. With parametric functions, partial derivatives are vectors instead of just numbers. But they still represent how the outputs change relative to changes in the inputs.

See Sage for a picture.
See 16.5: 27-30 for more practice. See Larson 15.5:35–38 for more practice.
Let $z=f(x,y)=9-x^2-y^2$. We'll parameterize this function by writing $x=x, y=y, z=9-x^2-y^2$, or in vector notation we'd write $$\vec r(x,y) = (x,y,f(x,y)).$$
  1. Compute $\ds \frac{\partial \vec r}{\partial x}$ and $\ds \frac{\partial \vec r}{\partial y}$. Then evaluate these partials at $(x,y)=(2,1)$. What do these vectors mean? [Hint: Draw the surface, and at the point $(2,1,4)$, draw these vectors. See the Sage plot. Think about how the vectors are telling you about how changes in the inputs are related to changes in the outputs.]
  2. The vectors above are tangent to the surface. Use them to obtain a normal vector to the tangent plane, and then give an equation of the tangent plane. (You should compare it to your equation from this problem.)

If the vectors you found in the previous problem matched up with the direction vectors of the lines in this problem, you are doing things right. Partial derivatives of parametric functions tell us tangent directions. We can interpret this also in terms of motion.
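The idea that partial derivatives of a parametric surface give tangent directions, and that their cross product gives a normal vector, can be sketched in Python. To avoid repeating the problem above, this sketch uses a different surface, $z=x^2+y^2$ parameterized as $\vec r(x,y)=(x,y,x^2+y^2)$, at the point $(1,1,2)$. The surface, the point, and the helper names are our own illustrative choices.

```python
def r_x(x, y):
    """Partial of r(x, y) = (x, y, x^2 + y^2) with respect to x."""
    return (1.0, 0.0, 2 * x)


def r_y(x, y):
    """Partial of r(x, y) = (x, y, x^2 + y^2) with respect to y."""
    return (0.0, 1.0, 2 * y)


def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two vectors."""
    return sum(p * q for p, q in zip(a, b))


x0, y0 = 1.0, 1.0
tangent_1 = r_x(x0, y0)               # direction of motion when only x changes
tangent_2 = r_y(x0, y0)               # direction of motion when only y changes
normal = cross(tangent_1, tangent_2)  # perpendicular to both tangent vectors

print("r_x    =", tangent_1)
print("r_y    =", tangent_2)
print("normal =", normal)
```

The normal vector $(-2,-2,1)$ gives the tangent plane $-2(x-1)-2(y-1)+(z-2)=0$, which rearranges to $z-2=2(x-1)+2(y-1)$, the same plane that differential notation produces with $Df(1,1)=\begin{bmatrix}2&2\end{bmatrix}$.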

Consider the change of coordinates $\vec T(r,\theta) = (r\cos \theta, r\sin \theta)$.
  1. Compute the partial derivatives $\ds\frac{\partial \vec T}{\partial r}$ and $\ds\frac{\partial \vec T}{\partial \theta}$, and then state the derivative $D\vec T(r,\theta)$. [Hint: $D\vec T$ is a 2 by 2 matrix, and each partial derivative is a column. Use Sage to check your answer (see the link to Sage in the margin of this problem for help with how to do this).]
  2. Consider the polar point $(r,\theta) = (4,\pi/2)$:
    1. Compute $\vec T(4,\pi/2)$ (i.e., the $x,y$ coordinates for the polar point). Draw the point.
    2. Compute $\ds \frac{\partial \vec T}{\partial r}(4,\pi/2)$, i.e., the partial with respect to $r$ evaluated at $r=4$, $\theta=\pi/2$. Plot this vector on the graph you drew in the previous part, starting at the point you drew.
    3. Compute $\ds \frac{\partial \vec T}{\partial \theta}(4,\pi/2)$, i.e., the partial with respect to $\theta$ evaluated at $r=4$, $\theta=\pi/2$. Plot this vector on the graph you drew in the previous part, starting at the point you drew.
  3. If you were standing at the polar point $(4,\pi/2)$ and someone said, “Hey you, keep your angle constant, but increase your radius,” then which direction would you move? What if someone said, “Hey you, keep your radius constant, but increase your angle”, which direction would you move?
  4. Now change the polar point to $(r,\theta) = (2,3\pi/4)$. Try, without doing any computations, to repeat part 2 (at the point, draw both partial derivative vectors). Explain.

If your answers to the 2nd and 3rd part above were the same, then you're doing this correctly. The partial derivatives of parametric functions tell us about motion and tangents. The next problem reinforces this concept. But first, a short review about equations of lines.

If you know that a line passes through the point $(1,2,3)$ and is parallel to the vector $(4,5,6)$, give a vector equation, and parametric equations, of the line.
Answer: A vector equation is $\vec r(t) = (4,5,6)t+(1,2,3)$ or $\vec r(t) = (4t+1, 5t+2, 6t+3)$. Parametric equations for this line are $x=4t+1$, $y=5t+2$, and $z=6t+3$.
Consider the parametric surface $\vec r(a,t) = (a\cos t, a\sin t, t)$ for $2\leq a\leq 4$ and $0\leq t\leq 4\pi$. We encountered this parametric surface in chapter 5 when we considered a smoke screen left by multiple jets.
  1. Compute the partial derivatives $\vec r_a$ and $\vec r_t$ (they are vectors), and state the total derivative. (How big is the matrix? What are the domain and range of $\vec r$?)
  2. Look at a plot of the surface (see Sage or Wolfram Alpha). Now, suppose an object is on this surface at the point $\vec r(3,\pi) = (-3,0,\pi)$. At that point, please draw the partial derivatives $\vec r_a(3,\pi)$ and $\vec r_t(3,\pi)$.
  3. If you were standing at $\vec r(3,\pi)$ and someone told you, “Hey you, hold $t$ constant and increase $a$,” then in which direction would you move? What if someone told you, “Hey you, hold $a$ constant and increase $t$”?
  4. Give vector equations for two tangent lines to the surface at $\vec r(3,\pi)$. [Hint: You've got the point by plugging $(3,\pi)$ into $\vec r$, and you've got two different direction vectors from $D\vec r$. Once you have a point and a vector, we know (from chapter 2) how to get an equation of a line.]
In the previous problem, you should have noticed that the partial derivatives of $\vec r(a,t)$ are tangent vectors to the surface. Because we have two tangent vectors to the surface, we should be able to use them to construct a normal vector to the surface, and from that, a tangent plane. That's just cool.
If you know that a plane passes through the point $(1,2,3)$ and has normal vector $(4,5,6)$, then give an equation of the plane.
An equation of the plane is $4(x-1)+5(y-2)+6(z-3)=0$. If $(x,y,z)$ is any point in the plane, then the vector $(x-1,y-2,z-3)$ is a vector in the plane, and hence orthogonal to $(4,5,6)$. The dot product of these two vectors should be equal to zero, which is why the plane's equation is $(4,5,6)\cdot (x-1,y-2,z-3)=0$.
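The two review ideas above (partial derivatives as tangent vectors, and a plane built from a point and a normal vector) can be combined numerically. What follows is a minimal Python sketch, not a prescribed method. It assumes an illustrative surface $\vec r(u,v)=(u,v,uv)$ that does not appear in the problems, and the helper names `partial`, `cross`, `dot`, and `plane_residual` are hypothetical. The partial derivatives are estimated with central differences rather than computed symbolically.

```python
# Illustrative surface (an assumption, not from the text): r(u, v) = (u, v, u*v).
def r(u, v):
    return (u, v, u * v)

def partial(func, u, v, wrt, h=1e-6):
    """Central-difference estimate of a partial derivative of a vector function."""
    if wrt == "u":
        plus, minus = func(u + h, v), func(u - h, v)
    else:
        plus, minus = func(u, v + h), func(u, v - h)
    return tuple((p - m) / (2 * h) for p, m in zip(plus, minus))

def cross(a, b):
    """Cross product of two 3D vectors; the result is orthogonal to both."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

u0, v0 = 1.0, 2.0
point = r(u0, v0)                # the point on the surface, (1, 2, 2)
r_u = partial(r, u0, v0, "u")    # tangent vector, approximately (1, 0, 2)
r_v = partial(r, u0, v0, "v")    # tangent vector, approximately (0, 1, 1)
normal = cross(r_u, r_v)         # normal vector, approximately (-2, -1, 1)

def plane_residual(x, y, z):
    """normal . ((x, y, z) - point); zero exactly when (x, y, z) is on the tangent plane."""
    return dot(normal, (x - point[0], y - point[1], z - point[2]))

print("normal:", normal)
print("residual at the point:", plane_residual(*point))
```

Adding either tangent vector to the point gives another point whose residual is (up to rounding) zero, which is one way to confirm the two tangent lines lie in the plane.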
Consider again the parametric surface $\vec r(a,t) = (a\cos t, a\sin t, t)$ for $2\leq a\leq 4$ and $0\leq t\leq 4\pi$. We'd like to obtain an equation of the tangent plane to this surface at the point $\vec r(3,2\pi)$. Once we have a point on the plane and a normal vector to the surface, we can use the concepts in chapter 2 to get an equation of the plane. Give an equation of the tangent plane. [Hint: To get the point, what is $\vec r(3,2\pi)$? The partial derivatives at $(3,2\pi)$ give us two tangent vectors. How do I obtain a vector orthogonal to both?]
See Sage.
See 16.5: 27-30 for more practice. See Larson 15.5:35–38 for more practice.
Consider the cone parametrized by $\vec r(u,v)=(u\cos v, u\sin v,u)$.
  1. Give vector equations of two tangent lines to the surface at $\vec r(2,\pi/2)$ (so $u=2$ and $v=\pi/2$).
  2. Give a normal vector to the surface at $\vec r(2,\pi/2)$.
  3. Give an equation of the tangent plane at $\vec r(2,\pi/2)$.

We now have two different ways to compute tangent planes. One way generalizes differential notation $dy=f'dx$ to $dz = Df \begin{bmatrix}dx\\dy\end{bmatrix}$ and then uses matrix multiplication. This way will extend to tangent objects in EVERY dimension. It's the key idea needed to work on really large problems. The other way requires that we parametrize the surface $z=f(x,y)$ as $\vec r(x,y)=(x,y,f(x,y))$ and then use the cross product on the partial derivatives. Both give the same answer. The next problem has you give a general formula for a tangent plane. To tackle this problem, you'll need to make sure you can use symbolic notation. The review problem should help with this.

Joe wants to find the tangent line to $y=x^3$ at $x=2$. He knows the derivative is $y'=3x^2$, and when $x=2$ the curve passes through $y=8$. So he writes an equation of the tangent line as $y-8=3x^2(x-2)$. What's wrong? What part of the general formula $y-f(c) = f'(c) (x-c)$ did Joe forget?
Joe forgot to replace $x$ with $2$ in the derivative. The equation should be $y-8=12(x-2)$. The notation $f'(c)$ is the part he forgot. He used $f'(x)=3x^2$ instead of $f'(2)=12$.
Consider the function $z=f(x,y)$. Explain why an equation of the tangent plane to $f$ at $(x,y)=(a,b)$ is given by $$z-f(a,b) = \frac{\partial f}{\partial x}(a,b) (x-a) + \frac{\partial f}{\partial y}(a,b) (y-b).$$ Then give an equation of the tangent plane to $f(x,y) = x^2+3xy$ at $(3,-1)$. [Hint: Use either differential notation or a parametrization, or try both ways.]

The Chain Rule

We'll now see how the chain rule generalizes to all dimensions. Just as before, we'll find that the first semester calculus rule will generalize to all dimensions if we replace $f'$ with the matrix $Df$. Let's recall the chain rule from first-semester calculus.
Let $x$ be a real number and $f$ and $g$ be functions of a single real variable. Suppose $f$ is differentiable at $g(x)$ and $g$ is differentiable at $x$. The derivative of $f\circ g$ at $x$ is $$(f\circ g)'(x) = \frac{d}{dx}(f\circ g)(x) = f'(g(x))\cdot g'(x).$$
Some people remember the theorem above as “the derivative of a composition is the derivative of the outside (evaluated at the inside) multiplied by the derivative of the inside.” If $u=g(x)$, we sometimes write $\ds \frac{df}{dx}=\frac{df}{du}\frac{du}{dx}$. The following problem should help us master this notation.
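As a quick illustration of this notation before the problem (the functions here are chosen only for illustration and are not from the text), take $f(u)=u^5$ and $u=g(x)=x^2+1$. Then $$\frac{df}{du}=5u^4,\qquad \frac{du}{dx}=2x,\qquad \frac{df}{dx}=\frac{df}{du}\frac{du}{dx}=5(x^2+1)^4\cdot 2x.$$ Notice that $\dfrac{df}{du}$ is evaluated at $u=g(x)=x^2+1$, which is exactly what $f'(g(x))$ means.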
Suppose we know that $\ds f'(x) = \frac{\sin(x)}{2x^2+3}$ and $g(x)=\sqrt{x^2+1}$. Notice we don't know $f(x)$.
Not knowing a function $f$ is actually quite common in real life. We can often measure how something changes (a derivative) without knowing the function itself.
  1. State $f'(x)$ and $g'(x)$.
  2. State $f'(g(x))$, and explain the difference between $f'(x)$ and $f'(g(x))$.
  3. Use the chain rule to compute $(f\circ g)'(x)$.
We now generalize to higher dimensions. If I want to write $\vec f(\vec g(\vec x))$, then $\vec x$ must be a vector in the domain of $g$. After computing $\vec g(\vec x)$, we must get a vector that is in the domain of $f$. Since the chain rule in first-semester calculus states $(f(g(x)))'=f'(g(x))g'(x)$, in high dimensions it should state $D(f(g(x))) = Df(g(x))Dg(x)$, the product of two matrices.
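One way to see why this product makes sense is to track matrix sizes. As a sketch, suppose $\vec g:\RR^n\to\RR^k$ and $\vec f:\RR^k\to\RR^m$ (the letter $k$ is introduced here only for this illustration). Each input variable gets a column and each output component gets a row, so $D\vec g(\vec x)$ is $k\times n$ and $D\vec f(\vec g(\vec x))$ is $m\times k$. Then $$\underbrace{D(\vec f\circ \vec g)(\vec x)}_{m\times n} = \underbrace{D\vec f(\vec g(\vec x))}_{m\times k}\ \underbrace{D\vec g(\vec x)}_{k\times n}.$$ The inner dimension $k$ matches precisely because the output of $\vec g$ is the input of $\vec f$, and the result has one column for each input of the composite and one row for each output.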
In this problem, we showed that for a circular cylinder with volume $V=\pi r^2 h$, the derivative is $$DV(r,h)=\begin{bmatrix} 2\pi rh & \pi r^2 \end{bmatrix}.$$ Suppose that the radius and height are both changing with respect to time, where $r=3t$ and $h=t^2$. We'll write this parametrically as $\vec g(t) =(3t, t^2)$ (i.e., $\vec g(t)=(r,h)$).
  1. In $V=\pi r^2 h$, replace $r$ and $h$ with what they are in terms of $t$. Then compute $\dfrac{dV}{dt}$. This is a first-semester calculus derivative; we'll use it to check our work below.
  2. We know $DV(r,h)=\begin{bmatrix} 2\pi rh & \pi r^2 \end{bmatrix}$ and $Dg(t)= \begin{bmatrix} 3\\ 2t \end{bmatrix}.$ In first-semester calculus, the chain rule was the product of derivatives. Multiply these matrices together to get $$\dfrac{dV}{dt}=DV(g(t))\, Dg(t).$$ Did you get the same answer as the first part?
  3. To get the correct answer to the previous part, you had to replace $r$ and $h$ with what they equaled in terms of $t$. What part of the notation $\dfrac{dV}{dt}=DV(g(t))\, Dg(t)$ tells you to replace $r$ and $h$ with what they equal in terms of $t$?
Let's look at some physical examples involving motion and temperature, and try to connect what we know should happen to what the chain rule states.
Consider $f(x,y)=9-x^2-y^2$ and $\vec r(t)=(2\cos t, 3\sin t)$. Imagine the following scenario: a horse runs around outside in the cold. The horse's position at time $t$ is given parametrically by the elliptical path $\vec r(t)$. The function $T=f(x,y)$ gives the temperature of the air at any point $(x,y)$.
  1. At time $t=0$, what is the horse's position $\vec r(0)$, and what is the temperature $f(\vec r(0))$ at that position? Find the temperatures at $t=\pi/2$, $t=\pi$, and $t=3\pi/2$ as well.
  2. In the plane, draw the path of the horse for $t\in [0,2\pi]$. Then, on the same 2D graph, include a contour plot of the temperature function $f$. Make sure you include the level curves that pass through the points in this part, and write the temperature on each level curve you draw. [If you end up with an ellipse and several concentric circles, then you've done this right.]
  3. As the horse runs around, the temperature of the air around the horse is constantly changing. At which $t$ does the temperature around the horse reach a maximum? A minimum? Explain, using your graph. (This idea leads to an optimization technique, Lagrange multipliers, later in the semester.)
  4. As the horse moves past the point at $t=\pi/4$, is the temperature of the surrounding air increasing or decreasing? In other words, is $\dfrac{df}{dt}$ positive or negative? Use your graph to explain.
  5. Draw the 3D surface plot of $f$ (see Sage). In the $xy$-plane of your 3D plot (so $z=0$) add the path of the horse. In class, we'll project the path of the horse up into the 3D surface. We'll complete this part in class, but you're welcome to give it a try yourself.
Consider again $f(x,y)=9-x^2-y^2$ and $\vec r(t)=(2\cos t, 3\sin t)$, which means $x=2\cos t$ and $y=3\sin t$.
  1. At the point $\vec r(t)$, we'd like a formula for the temperature $f(\vec r(t))$. What is the temperature of the horse at any time $t$? [In $f(x,y)$, replace $x$ and $y$ with what they are in terms of $t$.]
  2. Compute $df/dt$ (the derivative as you did in first-semester calculus).
  3. Construct a graph of $f(t)$ (use software to draw this if you like). From your graph, at what time values do the maxima and minima occur?
  4. What is $df/dt$ at $t=\pi/4$?
  5. Compare your work with the previous problem.
Consider again $f(x,y)=9-x^2-y^2$ and $\vec r(t)=(2\cos t, 3\sin t)$.
  1. Compute both $Df(x,y)$ and $D\vec r(t)$ as matrices. One should have two columns. The other should have one column (but two rows).
  2. The temperature at any time $t$ we can write symbolically as $f(\vec r(t))$. First-semester calculus suggests the derivative should be the product $(f(\vec r(t))) ' = f'(\vec r(t))\vec r'(t)$. Write this using $D$ notation instead of prime notation.
  3. Compute the matrix product $DfD\vec r$, and then substitute $x=2\cos t$ and $y=3\sin t$.
  4. What is the change in temperature with respect to time at $t=\pi/4$? Is it positive or negative? Compare with the previous problem.
The previous three problems all focused on exactly the same concept. The first looked at the concept graphically, showing what it means to write $(f\circ \vec r)(t)=f(\vec r(t))$. The second reduced the problem to first-semester calculus. The third tackled the problem by considering matrix derivatives. In all three cases, we wanted to understand the following problem.
If $z=f(x,y)$ is a function of $x$ and $y$, and both $x$ and $y$ are functions of $t$ (i.e., $\vec r(t)=(x(t),y(t))$), then how do changes in $t$ affect $f$? In other words, what is the derivative of $f$ with respect to $t$? Notationally, we seek $\ds \frac{df}{dt}$, which we formally write as $\ds \frac{d}{dt}[f(x(t),y(t))]$ or $\ds \frac{d}{dt} [f(\vec r(t))].$
To answer this problem, we use the chain rule, which is just matrix multiplication.
The Chain Rule. Let $\vec x$ be a vector and $\vec f$ and $\vec g$ be functions so that the composition $\vec f(\vec g(\vec x))$ makes sense (we can use the output of $g$ as an input to $f$). Suppose $\vec f$ is differentiable at $\vec g(\vec x)$ and that $\vec g$ is differentiable at $\vec x$. Then the derivative of $\vec f\circ \vec g$ at $\vec x$ is $$D(\vec f\circ \vec g)(\vec x) = D\vec f(\vec g(\vec x))\cdot D\vec g(\vec x).$$ The derivative of a composition is equal to the derivative of the outside (evaluated at the inside), multiplied by the derivative of the inside.
This is exactly the same as the chain rule in first-semester calculus. The only difference is that now we have vectors above every variable and function, and we replaced the one-by-one matrices $f'$ and $g'$ with potentially larger matrices $Df$ and $Dg$. If we write everything in vector notation, the chain rule in all dimensions is the EXACT same as the chain rule in one dimension.
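The matrix form of the chain rule can also be checked numerically. The following is a minimal Python sketch under stated assumptions: the functions `f` and `g` are illustrative choices that do not come from the text, the helpers `matmul` and `numeric_jacobian` are hypothetical names, and the comparison uses a central-difference estimate of the derivative of the composite rather than an exact computation.

```python
import math

# Illustrative functions (assumptions, not from the text):
#   g(s, t) = (s*t, s + t)          maps R^2 -> R^2
#   f(x, y) = (x**2 + y, sin(y))    maps R^2 -> R^2
def g(s, t):
    return (s * t, s + t)

def f(x, y):
    return (x**2 + y, math.sin(y))

def Dg(s, t):
    # Each column holds the partial derivative with respect to one input.
    return [[t, s],
            [1.0, 1.0]]

def Df(x, y):
    return [[2 * x, 1.0],
            [0.0, math.cos(y)]]

def matmul(A, B):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def numeric_jacobian(func, point, h=1e-6):
    """Central-difference Jacobian: column j is the partial in input j."""
    n_out = len(func(*point))
    J = [[0.0] * len(point) for _ in range(n_out)]
    for j in range(len(point)):
        up = list(point); up[j] += h
        dn = list(point); dn[j] -= h
        f_up, f_dn = func(*up), func(*dn)
        for i in range(n_out):
            J[i][j] = (f_up[i] - f_dn[i]) / (2 * h)
    return J

def composite(s, t):
    return f(*g(s, t))

s0, t0 = 1.5, -0.5
# Derivative of the outside (evaluated at the inside) times derivative of the inside.
chain_rule = matmul(Df(*g(s0, t0)), Dg(s0, t0))
direct = numeric_jacobian(composite, (s0, t0))

max_gap = max(abs(chain_rule[i][j] - direct[i][j])
              for i in range(2) for j in range(2))
print("largest entrywise difference:", max_gap)
```

The key line is `Df(*g(s0, t0))`: the outer derivative is evaluated at $\vec g(\vec x)$, not at $\vec x$. Replacing it with `Df(s0, t0)` should make the two matrices disagree.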
See 14.4: 1-6 for more practice, or Larson 13.5:1–6 (you can check answers in the back of the book). Don't use the formulas in the chapter (pages 925–930 in Larson); instead, practice using matrix multiplication. The formulas are just a way of writing matrix multiplication without writing down the matrices, and only work for functions from $\RR^n\to\RR$. Our matrix multiplication method works for any function from $\RR^n\to\RR^m$.
Suppose that $f(x,y) = x^2+xy$ and that $x=2t+3$ and $y=3t^2+4$.
  1. Rewrite the parametric equations $x=2t+3$ and $y=3t^2+4$ in vector form, so we can apply the chain rule. This means you need to create a function $\vec r(t) = (\blank{1in}, \blank{1in})$.
  2. Compute the derivatives $Df(x,y)$ and $D\vec r(t)$, and then multiply the matrices together to obtain $\dfrac{df}{dt}$. How can you make your answer only depend on $t$ (not $x$ or $y$)?
  3. The chain rule states that $D(f\circ \vec r)(t) = Df(\vec r(t))D\vec r(t)$. Explain why we write $Df(\vec r(t))$ instead of $Df(x,y)$.
If you'd like to make sure you are correct, try the following: replace $x$ and $y$ in $f=x^2+xy$ with what they are in terms of $t$, and then just use first-semester calculus to find $df/dt$. Is it the same?
See 14.4: 7-12 for more practice. See Larson 13.5:7–10 for more practice (remember to use matrix multiplication, not the formulas from the book).
Suppose $f(x,y,z) = x+2y+3z^2$ and $x=u+v$, $y=2u-3v$, and $z=uv$. Our goal is to find how much $f$ changes if we were to change $u$ (so $\partial f/\partial u$) or if we were to change $v$ (so $\partial f/\partial v$). Try doing this problem without looking at the steps below, but instead try to follow the patterns in the previous problem on your own.
  1. Rewrite the equations for $x$, $y$, and $z$ in vector form $\vec r(u,v)=(x,y,z)$. If you were to graph $\vec r$, what kind of graph would you make?
  2. Compute the derivatives $Df(x,y,z)$ and $D\vec r(u,v)$, and then multiply them together. Notice that since this composite function has 2 inputs, namely $u$ and $v$, we should expect to get two columns when we are done.
  3. What are $\partial f/\partial u$ and $\partial f/\partial v$? [Hint: remember, each input variable gets a column.]
Let $\vec F(s,t) = (2s+t,3s-4t,t)$ and $s=3pq$ and $t=2p+q^2$. This means that changing $p$ and/or $q$ should cause $\vec F$ to change. Our goal is to find $\partial \vec F/\partial p$ and $\partial \vec F/\partial q$. Note that since $\vec F$ is a vector-valued function, the two partial derivatives should be vectors. Try doing this problem without looking at the steps below, but instead try to follow the patterns in the previous problems.
  1. Rewrite the parametric equations for $s$ and $t$ in vector form.
  2. Compute $D\vec F(s,t)$ and the derivative of your vector function from the previous part, and then multiply them together to find the derivative of $\vec F$ with respect to $p$ and $q$. How many columns should we expect to have when we are done multiplying matrices?
  3. What are $\partial \vec F/\partial p$ and $\partial \vec F/\partial q$?
(Optional challenge) Suppose $\vec F(u,v) = (3u-v,u+2v,3v)$, $\vec G(x,y,z)=(x^2+z, 4y-x)$, and $\vec r(t) = (t^3, 2t+1, 1-t)$. We want to examine $\vec F(\vec G(\vec r(t)))$. This means that $\vec F\circ \vec G\circ \vec r$ is a function from $\RR^n\to\RR^m$ for what $n$ and $m$? Similar to first-semester calculus, since we have two functions nested inside of each other, we'll just need to apply the chain rule twice. Our goal is to find $d\vec F/dt$. Try to do this problem without looking at the steps below.
  1. Compute $D\vec F(u,v)$, $D\vec G(x,y,z)$, and $D\vec r(t)$.
  2. Use the chain rule (matrix multiplication) to find the derivative of $\vec F$ with respect to $t$. What size of matrix should we expect for the derivative?
Suppose $f(x,y)=x^2+3xy$ and $(x,y) = \vec r(t) = (3t,t^2)$. Compute both $Df(x,y)$ and $D\vec r(t)$. Then explain how you got your answer by writing what you did in terms of partial derivatives and regular derivatives.
Answer: We have $Df(x,y) = \begin{bmatrix}2x+3y&3x\end{bmatrix}$ and $D\vec r(t) = \begin{bmatrix}3\\2t\end{bmatrix}$. We just computed $f_x$ and $f_y$, and $dx/dt$ and $dy/dt$, which gave us $Df(x,y) = \begin{bmatrix}\partial f/\partial x&\partial f/\partial y\end{bmatrix}$ and $D\vec r(t) = \begin{bmatrix}dx/dt\\dy/dt\end{bmatrix}$.
See 14.4: 13-24 for more practice. Practice these problems by using matrix multiplication. The example problems in the text use a “branch diagram,” which is just a way to express matrix multiplication without having to introduce matrices. See Larson 13.5:7–10 for more practice.
Complete the following:
  1. Suppose that $w=f(x,y,z)$ and that $x,y,z$ are all functions of one variable $t$ (so $x=g(t), y=h(t), z=k(t)$). Use the chain rule with matrix multiplication to explain why $$\frac{dw}{dt} = \frac{\partial f}{\partial x}\frac{dg}{dt}+\frac{\partial f}{\partial y}\frac{dh}{dt}+\frac{\partial f}{\partial z}\frac{dk}{dt},$$ which is equivalent to writing $$\frac{dw}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt}+\frac{\partial f}{\partial y}\frac{dy}{dt}+\frac{\partial f}{\partial z}\frac{dz}{dt} .$$ [Hint: Rewrite the parametric equations for $x$, $y$, and $z$ in vector form $\vec r(t) = (x,y,z)$ and compute $Dw(x,y,z)$ and $D\vec r(t)$.]
  2. Suppose that $R=f(V,T,n,P)$, and that $V,T,n,P$ are all functions of $x$. Give a formula (similar to the above) for $\dfrac{dR}{dx}.$
See Larson 13.5:19–26 for more practice.
Make sure you practice problems 14.4: 13-24. Use matrix multiplication, rather than the “branch diagram” referenced in the text. See Larson 13.5:7–10 for more practice.
Suppose $z=f(s,t)$ and $s$ and $t$ are functions of $u$, $v$ and $w$. Use the chain rule to give a general formula for $\partial z/\partial u$, $\partial z/\partial v$, and $\partial z/\partial w$.
If $w=f(x,y,z)$ and $x,y,z$ are functions of $u$ and $v$, obtain formulas for $\dfrac{\partial f}{\partial u}$ and $\dfrac{\partial f}{\partial v}$.
We have $Df(x,y,z) =\begin{bmatrix}\dfrac{\partial f}{\partial x}&\dfrac{\partial f}{\partial y}&\dfrac{\partial f}{\partial z}\end{bmatrix}$. The parametrization $\vec r(u,v)=(x,y,z)$ has derivative $D\vec r =\begin{bmatrix} \dfrac{\partial x}{\partial u}&\dfrac{\partial x}{\partial v}\\ \dfrac{\partial y}{\partial u}&\dfrac{\partial y}{\partial v}\\ \dfrac{\partial z}{\partial u}&\dfrac{\partial z}{\partial v} \end{bmatrix}$. The product is $D(f(\vec r(u,v))) =\begin{bmatrix} \dfrac{\partial f}{\partial x}\dfrac{\partial x}{\partial u}+ \dfrac{\partial f}{\partial y}\dfrac{\partial y}{\partial u}+ \dfrac{\partial f}{\partial z}\dfrac{\partial z}{\partial u}& \dfrac{\partial f}{\partial x}\dfrac{\partial x}{\partial v}+ \dfrac{\partial f}{\partial y}\dfrac{\partial y}{\partial v}+ \dfrac{\partial f}{\partial z}\dfrac{\partial z}{\partial v} \end{bmatrix} $. The first column is $\dfrac{\partial f}{\partial u}$, and the second column is $\dfrac{\partial f}{\partial v}$.

You've now got the key ideas needed to use the chain rule in all dimensions. You'll find this shows up many places in upper-level math, physics, and engineering courses. The following problem will show you how you can use the general chain rule to get an extremely quick way to perform implicit differentiation from first-semester calculus.

See 14.4: 25-32 to practice using the formula you developed. To practice the idea developed in this problem, show that if $w=F(x,y,z)$ is held constant at $w=c$ and we assume that $z=f(x,y)$ depends on $x$ and $y$, then $\frac{\partial z}{\partial x} = -\frac{F_x}{F_z}$ and $\frac{\partial z}{\partial y} = -\frac{F_y}{F_z}$. This is done on page 798 at the bottom.
See Larson 13.5:27–30 for more practice, and see pages 929–930 for how the book derives these formulas.
Suppose $z=f(x,y)$. If $z$ is held constant, this produces a level curve. As an example, if $f(x,y) = x^2+3xy-y^3$ then $5=x^2+3xy-y^3$ is a level curve. Our goal in this problem is to find $dy/dx$ in terms of partial derivatives of $f$.
  1. Suppose $x=x$ and $y=y(x)$, so $y$ is a function of $x$. We can write this in parametric form as $\vec r(x) = (x,y(x))$. We now have $z=f(x,y)$ and $\vec r(x)=(x,y(x))$. Compute both $Df(x,y)$ and $D\vec r(x)$ symbolically. Don't use the function $f(x,y)=x^2+3xy-y^3$ until the last step.
  2. Use the chain rule to compute $D(f(\vec r(x)))$. What is $dz/dx$ (i.e., $df/dx$)?
  3. Since $z$ is held constant, we know that $dz/dx=0$. Use this fact, together with the previous part, to explain why $\ds \frac{dy}{dx} = -\frac{f_x}{f_y} = -\frac{\partial f/ \partial x}{\partial f/ \partial y}$.
  4. For the curve $5=x^2+3xy-y^3$, use this formula to compute $dy/dx$.
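To build confidence in the formula $dy/dx=-f_x/f_y$, it can be tested on a level curve where $y$ can also be solved for explicitly. The following is a minimal Python sketch, assuming an illustrative curve $x^2+y^2=25$ near the point $(3,4)$ (this is not the curve from the problem). The function names are hypothetical, and the explicit slope is estimated with a central difference.

```python
import math

# Illustrative level curve (an assumption, not the curve from the problem):
#   f(x, y) = x**2 + y**2 held at the constant value 25, near the point (3, 4).
def f_x(x, y):
    return 2 * x

def f_y(x, y):
    return 2 * y

def slope_from_partials(x, y):
    """dy/dx = -f_x / f_y, valid where f_y is nonzero."""
    return -f_x(x, y) / f_y(x, y)

def y_upper(x):
    # On the upper half of the circle we can solve for y explicitly.
    return math.sqrt(25 - x**2)

def slope_numeric(x, h=1e-6):
    """Central-difference estimate of dy/dx using the explicit formula for y."""
    return (y_upper(x + h) - y_upper(x - h)) / (2 * h)

formula = slope_from_partials(3.0, 4.0)   # -6/8 = -0.75
numeric = slope_numeric(3.0)
print(formula, numeric)
```

The two slopes should agree to several decimal places. The formula has the advantage that it never requires solving for $y$, which matters for curves like the one in the problem where solving for $y$ is impractical.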