Implicit function theorem
In multivariable calculus, the implicit function theorem is a tool that allows relations to be converted to functions of several real variables. It does so by representing the relation as the graph of a function. There may not be a single function whose graph can represent the entire relation, but there may be such a function on a restriction of the domain of the relation. The implicit function theorem gives a sufficient condition to ensure that there is such a function.
More precisely, given a system of $m$ equations $f_i(x_1, \ldots, x_n, y_1, \ldots, y_m) = 0$, $i = 1, \ldots, m$ (often abbreviated into $F(\mathbf{x}, \mathbf{y}) = \mathbf{0}$), the theorem states that, under a mild condition on the partial derivatives (with respect to each $y_i$) at a point, the $m$ variables $y_i$ are differentiable functions of the $x_j$ in some neighborhood of the point. As these functions generally cannot be expressed in closed form, they are implicitly defined by the equations, and this motivated the name of the theorem.[1]
In other words, under a mild condition on the partial derivatives, the set of zeros of a system of equations is locally the graph of a function.
History
Augustin-Louis Cauchy (1789–1857) is credited with the first rigorous form of the implicit function theorem. Ulisse Dini (1845–1918) generalized the real-variable version of the implicit function theorem to the context of functions of any number of real variables.[2]
First example

If we define the function $f(x, y) = x^2 + y^2$, then the equation $f(x, y) = 1$ cuts out the unit circle as the level set $\{(x, y) \mid f(x, y) = 1\}$. There is no way to represent the unit circle as the graph of a function of one variable $y = g(x)$ because for each choice of $x \in (-1, 1)$, there are two choices of $y$, namely $\pm\sqrt{1 - x^2}$.
However, it is possible to represent part of the circle as the graph of a function of one variable. If we let $g_1(x) = \sqrt{1 - x^2}$ for $-1 < x < 1$, then the graph of $y = g_1(x)$ provides the upper half of the circle. Similarly, if $g_2(x) = -\sqrt{1 - x^2}$, then the graph of $y = g_2(x)$ gives the lower half of the circle.
The purpose of the implicit function theorem is to tell us that functions like $g_1(x)$ and $g_2(x)$ almost always exist, even in situations where we cannot write down explicit formulas. It guarantees that $g_1(x)$ and $g_2(x)$ are differentiable, and it even works in situations where we do not have a formula for $f(x, y)$.
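As an informal check of this example, the following short Python sketch (using NumPy; the sampling points are arbitrary choices for illustration) confirms numerically that the graphs of $g_1$ and $g_2$ lie on the level set $f(x, y) = 1$:

```python
import numpy as np

# f(x, y) = x^2 + y^2; the unit circle is the level set f(x, y) = 1.
def f(x, y):
    return x**2 + y**2

def g1(x):             # upper branch of the circle
    return np.sqrt(1 - x**2)

def g2(x):             # lower branch of the circle
    return -np.sqrt(1 - x**2)

x = np.linspace(-0.99, 0.99, 201)
print(np.allclose(f(x, g1(x)), 1.0))  # True: the graph of g1 lies on the circle
print(np.allclose(f(x, g2(x)), 1.0))  # True: the graph of g2 lies on the circle
```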
Definitions
Let $f : \mathbb{R}^{n+m} \to \mathbb{R}^m$ be a continuously differentiable function. We think of $\mathbb{R}^{n+m}$ as the Cartesian product $\mathbb{R}^n \times \mathbb{R}^m$, and we write a point of this product as $(\mathbf{x}, \mathbf{y}) = (x_1, \ldots, x_n, y_1, \ldots, y_m)$. Starting from the given function $f$, our goal is to construct a function $g : \mathbb{R}^n \to \mathbb{R}^m$ whose graph $(\mathbf{x}, g(\mathbf{x}))$ is precisely the set of all $(\mathbf{x}, \mathbf{y})$ such that $f(\mathbf{x}, \mathbf{y}) = \mathbf{0}$.
As noted above, this may not always be possible. We will therefore fix a point $(\mathbf{a}, \mathbf{b}) = (a_1, \ldots, a_n, b_1, \ldots, b_m)$ which satisfies $f(\mathbf{a}, \mathbf{b}) = \mathbf{0}$, and we will ask for a $g$ that works near the point $(\mathbf{a}, \mathbf{b})$. In other words, we want an open set $U \subset \mathbb{R}^n$ containing $\mathbf{a}$, an open set $V \subset \mathbb{R}^m$ containing $\mathbf{b}$, and a function $g : U \to V$ such that the graph of $g$ satisfies the relation $f = \mathbf{0}$ on $U \times V$, and that no other points within $U \times V$ do so. In symbols,
$$\{ (\mathbf{x}, g(\mathbf{x})) \mid \mathbf{x} \in U \} = \{ (\mathbf{x}, \mathbf{y}) \in U \times V \mid f(\mathbf{x}, \mathbf{y}) = \mathbf{0} \}.$$
To state the implicit function theorem, we need the Jacobian matrix of $f$, which is the matrix of the partial derivatives of $f$. Abbreviating $(a_1, \ldots, a_n, b_1, \ldots, b_m)$ to $(\mathbf{a}, \mathbf{b})$, the Jacobian matrix is
$$(Df)(\mathbf{a}, \mathbf{b}) = \left[\begin{array}{ccc|ccc}
\frac{\partial f_1}{\partial x_1}(\mathbf{a}, \mathbf{b}) & \cdots & \frac{\partial f_1}{\partial x_n}(\mathbf{a}, \mathbf{b}) & \frac{\partial f_1}{\partial y_1}(\mathbf{a}, \mathbf{b}) & \cdots & \frac{\partial f_1}{\partial y_m}(\mathbf{a}, \mathbf{b}) \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
\frac{\partial f_m}{\partial x_1}(\mathbf{a}, \mathbf{b}) & \cdots & \frac{\partial f_m}{\partial x_n}(\mathbf{a}, \mathbf{b}) & \frac{\partial f_m}{\partial y_1}(\mathbf{a}, \mathbf{b}) & \cdots & \frac{\partial f_m}{\partial y_m}(\mathbf{a}, \mathbf{b})
\end{array}\right] = [X \mid Y],$$
where $X$ is the matrix of partial derivatives in the variables $x_i$ and $Y$ is the matrix of partial derivatives in the variables $y_j$. The implicit function theorem says that if $Y$ is an invertible matrix, then there are $U$, $V$, and $g$ as desired. Writing all the hypotheses together gives the following statement.
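As a concrete illustration of this block decomposition (an example chosen here for clarity, not drawn from the cited sources), take the unit sphere $f(x_1, x_2, y) = x_1^2 + x_2^2 + y^2 - 1$ with $n = 2$ and $m = 1$. At a point $(a_1, a_2, b)$ on the sphere,
$$(Df)(a_1, a_2, b) = \left[\, 2a_1 \;\; 2a_2 \mid 2b \,\right], \qquad X = \left[\, 2a_1 \;\; 2a_2 \,\right], \quad Y = \left[\, 2b \,\right],$$
so $Y$ is invertible exactly when $b \neq 0$, and near any such point the sphere is locally the graph of a function $y = g(x_1, x_2)$.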
Statement of the theorem
Let $f : \mathbb{R}^{n+m} \to \mathbb{R}^m$ be a continuously differentiable function, and let $\mathbb{R}^{n+m}$ have coordinates $(\mathbf{x}, \mathbf{y})$. Fix a point $(\mathbf{a}, \mathbf{b}) = (a_1, \ldots, a_n, b_1, \ldots, b_m)$ with $f(\mathbf{a}, \mathbf{b}) = \mathbf{0}$, where $\mathbf{0} \in \mathbb{R}^m$ is the zero vector. If the Jacobian matrix (this is the right-hand panel of the Jacobian matrix shown in the previous section)
$$J_{f, \mathbf{y}}(\mathbf{a}, \mathbf{b}) = \left[ \frac{\partial f_i}{\partial y_j}(\mathbf{a}, \mathbf{b}) \right]$$
is invertible, then there exists an open set $U \subset \mathbb{R}^n$ containing $\mathbf{a}$ such that there exists a unique function $g : U \to \mathbb{R}^m$ such that $g(\mathbf{a}) = \mathbf{b}$ and $f(\mathbf{x}, g(\mathbf{x})) = \mathbf{0}$ for all $\mathbf{x} \in U$. Moreover, $g$ is continuously differentiable and, denoting the left-hand panel of the Jacobian matrix shown in the previous section as
$$J_{f, \mathbf{x}}(\mathbf{a}, \mathbf{b}) = \left[ \frac{\partial f_i}{\partial x_j}(\mathbf{a}, \mathbf{b}) \right],$$
the Jacobian matrix of partial derivatives of $g$ in $U$ is given by the matrix product[3]
$$\left[ \frac{\partial g_i}{\partial x_j}(\mathbf{x}) \right]_{m \times n} = - \left[ J_{f, \mathbf{y}}(\mathbf{x}, g(\mathbf{x})) \right]^{-1}_{m \times m} \left[ J_{f, \mathbf{x}}(\mathbf{x}, g(\mathbf{x})) \right]_{m \times n}.$$
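The displayed formula for the Jacobian of $g$ can be illustrated with a short SymPy sketch; the two equations and the base point below are hypothetical choices made purely for illustration, not taken from the references:

```python
import sympy as sp

# A hypothetical system f(x, y1, y2) = 0 with n = 1, m = 2, chosen so that
# the point (x, y1, y2) = (1, 1, 1) satisfies f = 0.
x, y1, y2 = sp.symbols('x y1 y2')
f = sp.Matrix([x**2 + y1**2 + y2**2 - 3,
               x*y1 - y2])

X = f.jacobian([x])        # partial derivatives in the x variable
Y = f.jacobian([y1, y2])   # partial derivatives in the y variables

point = {x: 1, y1: 1, y2: 1}
assert f.subs(point) == sp.Matrix([0, 0])   # f(a, b) = 0 at the chosen point
assert Y.subs(point).det() != 0             # Y is invertible there

# Jacobian of the implicit function g at the point: dg/dx = -Y^{-1} X
dg_dx = (-Y.inv() * X).subs(point)
print(dg_dx)   # Matrix([[-1], [0]]): the derivatives dg1/dx and dg2/dx at x = 1
```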
Higher derivatives
If, moreover, $f$ is analytic or continuously differentiable $k$ times in a neighborhood of $(\mathbf{a}, \mathbf{b})$, then one may choose $U$ in order that the same holds true for $g$ inside $U$.[4] In the analytic case, this is called the analytic implicit function theorem.
Proof for 2D case
Suppose $F : \mathbb{R}^2 \to \mathbb{R}$ is a continuously differentiable function defining a curve $F(x, y) = 0$. Let $(x_0, y_0)$ be a point on the curve. The statement of the theorem above can be rewritten for this simple case as follows:
Theorem. If $\left.\frac{\partial F}{\partial y}\right|_{(x_0, y_0)} \neq 0$, then in a neighbourhood of $(x_0, y_0)$ the equation $F(x, y) = 0$ can be written as $y = f(x)$, where $f$ is a real function.
Proof. Since $F$ is differentiable, we write the differential of $F$ through partial derivatives:
$$\mathrm{d}F = \frac{\partial F}{\partial x}\,\mathrm{d}x + \frac{\partial F}{\partial y}\,\mathrm{d}y.$$
Since we are restricted to movement on the curve, $\mathrm{d}F = 0$, and by assumption $\frac{\partial F}{\partial y} \neq 0$ around the point $(x_0, y_0)$ (since $\frac{\partial F}{\partial y}$ is continuous at $(x_0, y_0)$ and $\left.\frac{\partial F}{\partial y}\right|_{(x_0, y_0)} \neq 0$). Therefore we have a first-order ordinary differential equation:
$$\partial_x F\,\mathrm{d}x + \partial_y F\,\mathrm{d}y = 0, \qquad y(x_0) = y_0.$$
Now we are looking for a solution to this ODE in an open interval around the point $x_0$ for which, at every point in it, $\partial_y F \neq 0$. Since $F$ is continuously differentiable and from the assumption we have
$$|\partial_x F| < \infty, \qquad |\partial_y F| < \infty, \qquad \partial_y F \neq 0.$$
From this we know that $-\frac{\partial_x F}{\partial_y F}$ is continuous and bounded on both ends. From here we know that $-\frac{\partial_x F}{\partial_y F}$ is Lipschitz continuous in both $x$ and $y$. Therefore, by the Cauchy–Lipschitz theorem, there exists a unique $y(x)$ that is the solution to the given ODE with the initial conditions. Q.E.D.
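The construction used in this proof can be illustrated numerically: for $F(x, y) = x^2 + y^2 - 1$, integrating the ODE $\frac{\mathrm{d}y}{\mathrm{d}x} = -\partial_x F / \partial_y F = -x/y$ from $y(0) = 1$ recovers the upper branch of the circle. A minimal Python sketch, using SciPy's general-purpose initial value solver (the interval and tolerances are arbitrary choices):

```python
import numpy as np
from scipy.integrate import solve_ivp

# For F(x, y) = x^2 + y^2 - 1, the implicit function y = f(x) solves
# dy/dx = -F_x / F_y = -x / y with initial condition y(0) = 1.
def rhs(x, y):
    return -x / y

sol = solve_ivp(rhs, (0.0, 0.9), [1.0], rtol=1e-8, atol=1e-10,
                dense_output=True)

xs = np.linspace(0.0, 0.9, 50)
ys = sol.sol(xs)[0]
# The ODE solution matches the explicit upper branch sqrt(1 - x^2).
print(np.allclose(ys, np.sqrt(1 - xs**2), atol=1e-6))  # True
```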
The circle example
Let us go back to the example of the unit circle. In this case $n = m = 1$ and $f(x, y) = x^2 + y^2 - 1$. The matrix of partial derivatives is just a 1 × 2 matrix, given by
$$(Df)(a, b) = \begin{bmatrix} \dfrac{\partial f}{\partial x}(a, b) & \dfrac{\partial f}{\partial y}(a, b) \end{bmatrix} = \begin{bmatrix} 2a & 2b \end{bmatrix}.$$
Thus, here, the $Y$ in the statement of the theorem is just the number $2b$; the linear map defined by it is invertible if and only if $b \neq 0$. By the implicit function theorem we see that we can locally write the circle in the form $y = g(x)$ for all points where $y \neq 0$. For $(\pm 1, 0)$ we run into trouble, as noted before. The implicit function theorem may still be applied to these two points, by writing $x$ as a function of $y$, that is, $x = h(y)$; now the graph of the function will be $(h(y), y)$, since where $b = 0$ we have $a = \pm 1$, and the conditions to locally express the function in this form are satisfied.
The implicit derivative of $y$ with respect to $x$, and that of $x$ with respect to $y$, can be found by totally differentiating the implicit function $x^2 + y^2 - 1$ and equating to 0:
$$2x\,\mathrm{d}x + 2y\,\mathrm{d}y = 0,$$
giving
$$\frac{\mathrm{d}y}{\mathrm{d}x} = -\frac{x}{y} \qquad \text{and} \qquad \frac{\mathrm{d}x}{\mathrm{d}y} = -\frac{y}{x}.$$
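As a quick consistency check (a sketch, not part of the cited derivation), the implicit derivative $-x/y$ agrees with the derivative of the explicit upper branch $g_1(x) = \sqrt{1 - x^2}$ from the first example:

```python
import sympy as sp

x = sp.symbols('x')
g1 = sp.sqrt(1 - x**2)          # explicit upper branch y = g1(x)

explicit = sp.diff(g1, x)       # derivative of the explicit formula
implicit = -x / g1              # implicit derivative dy/dx = -x/y with y = g1(x)
print(sp.simplify(explicit - implicit))   # 0: the two derivatives agree
```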
Application: change of coordinates
Suppose we have an $m$-dimensional space, parametrised by a set of coordinates $(x_1, \ldots, x_m)$. We can introduce a new coordinate system $(x'_1, \ldots, x'_m)$ by supplying $m$ functions $h_1, \ldots, h_m$, each being continuously differentiable. These functions allow us to calculate the new coordinates $(x'_1, \ldots, x'_m)$ of a point, given the point's old coordinates $(x_1, \ldots, x_m)$, using $x'_i = h_i(x_1, \ldots, x_m)$. One might want to verify if the opposite is possible: given coordinates $(x'_1, \ldots, x'_m)$, can we 'go back' and calculate the same point's original coordinates $(x_1, \ldots, x_m)$? The implicit function theorem will provide an answer to this question. The (new and old) coordinates $(x'_1, \ldots, x'_m, x_1, \ldots, x_m)$ are related by $f = 0$, with
$$f(x'_1, \ldots, x'_m, x_1, \ldots, x_m) = \big(h_1(x_1, \ldots, x_m) - x'_1, \ldots, h_m(x_1, \ldots, x_m) - x'_m\big).$$
Now the Jacobian matrix of $f$ at a certain point $(a, b)$ [where $a = (x'_1, \ldots, x'_m)$ and $b = (x_1, \ldots, x_m)$] is given by
$$(Df)(a, b) = \left[ -I_m \mid J \right],$$
where $I_m$ denotes the $m \times m$ identity matrix, and $J$ is the $m \times m$ matrix of partial derivatives $\partial h_i / \partial x_j$, evaluated at $(a, b)$. (In the above, these blocks were denoted by $X$ and $Y$. As it happens, in this particular application of the theorem, neither matrix depends on $a$.) The implicit function theorem now states that we can locally express $(x_1, \ldots, x_m)$ as a function of $(x'_1, \ldots, x'_m)$ if $J$ is invertible. Demanding that $J$ is invertible is equivalent to $\det J \neq 0$, thus we see that we can go back from the primed to the unprimed coordinates if the determinant of the Jacobian $J$ is non-zero. This statement is also known as the inverse function theorem.
Example: polar coordinates
As a simple application of the above, consider the plane, parametrised by polar coordinates $(R, \theta)$. We can go to a new coordinate system (Cartesian coordinates) by defining functions $x(R, \theta) = R \cos\theta$ and $y(R, \theta) = R \sin\theta$. This makes it possible, given any point $(R, \theta)$, to find corresponding Cartesian coordinates $(x, y)$. When can we go back and convert Cartesian into polar coordinates? By the previous example, it is sufficient to have $\det J \neq 0$, with
$$J = \begin{bmatrix} \dfrac{\partial x(R, \theta)}{\partial R} & \dfrac{\partial x(R, \theta)}{\partial \theta} \\[1ex] \dfrac{\partial y(R, \theta)}{\partial R} & \dfrac{\partial y(R, \theta)}{\partial \theta} \end{bmatrix} = \begin{bmatrix} \cos\theta & -R\sin\theta \\ \sin\theta & R\cos\theta \end{bmatrix}.$$
Since $\det J = R$, conversion back to polar coordinates is possible if $R \neq 0$. So it remains to check the case $R = 0$. It is easy to see that in case $R = 0$, our coordinate transformation is not invertible: at the origin, the value of $\theta$ is not well-defined.
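This determinant can also be reproduced symbolically; the following SymPy sketch (variable names chosen to match the text) computes $\det J = R$:

```python
import sympy as sp

R, theta = sp.symbols('R theta')
x = R * sp.cos(theta)
y = R * sp.sin(theta)

# Jacobian of (x, y) with respect to (R, theta)
J = sp.Matrix([x, y]).jacobian([R, theta])
print(sp.simplify(J.det()))   # R: the map is invertible wherever R != 0
```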
Generalizations
Banach space version
Based on the inverse function theorem in Banach spaces, it is possible to extend the implicit function theorem to Banach space valued mappings.[5][6]
Let $X$, $Y$, $Z$ be Banach spaces. Let the mapping $f : X \times Y \to Z$ be continuously Fréchet differentiable. If $(x_0, y_0) \in X \times Y$, $f(x_0, y_0) = 0$, and $y \mapsto Df(x_0, y_0)(0, y)$ is a Banach space isomorphism from $Y$ onto $Z$, then there exist neighbourhoods $U$ of $x_0$ and $V$ of $y_0$ and a Fréchet differentiable function $g : U \to V$ such that $f(x, g(x)) = 0$ and $f(x, y) = 0$ if and only if $y = g(x)$, for all $(x, y) \in U \times V$.
Implicit functions from non-differentiable functions
Various forms of the implicit function theorem exist for the case when the function f is not differentiable. It is standard that local strict monotonicity suffices in one dimension.[7] The following more general form was proven by Kumagai based on an observation by Jittorntrum.[8][9]
Consider a continuous function $f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^n$ such that $f(x_0, y_0) = 0$. If there exist open neighbourhoods $A \subset \mathbb{R}^n$ and $B \subset \mathbb{R}^m$ of $x_0$ and $y_0$, respectively, such that, for all $y$ in $B$, $f(\cdot, y) : A \to \mathbb{R}^n$ is locally one-to-one, then there exist open neighbourhoods $A_0 \subset \mathbb{R}^n$ and $B_0 \subset \mathbb{R}^m$ of $x_0$ and $y_0$, such that, for all $y \in B_0$, the equation $f(x, y) = 0$ has a unique solution $x = g(y) \in A_0$, where $g$ is a continuous function from $B_0$ into $A_0$.
Collapsing manifolds
Perelman’s collapsing theorem for 3-manifolds, the capstone of his proof of Thurston's geometrization conjecture, can be understood as an extension of the implicit function theorem.[10]
See also
- Inverse function theorem
- Constant rank theorem: Both the implicit function theorem and the inverse function theorem can be seen as special cases of the constant rank theorem.