Kantorovich theorem

The Kantorovich theorem, or Newton–Kantorovich theorem, is a mathematical statement on the semi-local convergence of Newton's method. It was first stated by Leonid Kantorovich in 1948.^[1]^[2] It is similar in form to the Banach fixed-point theorem, although it states existence and uniqueness of a zero rather than a fixed point.^[3]

Newton's method constructs a sequence of points that under certain conditions will converge to a solution $x$ of an equation $f (x) = 0$ or a vector solution of a system of equations $F (x) = 0$. The Kantorovich theorem gives conditions on the initial point of this sequence. If those conditions are satisfied, then a solution exists close to the initial point and the sequence converges to that solution.^[1]^[2]

Assumptions

Let $X \subset ℝ^{n}$ be an open subset and $F : X \subset ℝ^{n} \to ℝ^{n}$ a differentiable function with a Jacobian $F^{'} (𝐱)$ that is locally Lipschitz continuous (for instance if $F$ is twice continuously differentiable). That is, it is assumed that for any $𝐱 \in X$ there is an open subset $U \subset X$ such that $𝐱 \in U$ and there exists a constant $L > 0$ such that for any $𝐱, 𝐲 \in U$

‖ F^{'} (𝐱) - F^{'} (𝐲) ‖ \leq L ‖ 𝐱 - 𝐲 ‖

holds. The norm on the left is the operator norm. In other words, for any vector $𝐯 \in ℝ^{n}$ the inequality

‖ F^{'} (𝐱) (𝐯) - F^{'} (𝐲) (𝐯) ‖ \leq L ‖ 𝐱 - 𝐲 ‖ ‖ 𝐯 ‖

must hold.

Now choose any initial point $𝐱_{0} \in X$ . Assume that $F^{'} (𝐱_{0})$ is invertible and construct the Newton step $𝐡_{0} = - F^{'} (𝐱_{0})^{- 1} F (𝐱_{0}) .$

The next assumption is that not only the next point $𝐱_{1} = 𝐱_{0} + 𝐡_{0}$ but the entire ball $B (𝐱_{1}, ‖ 𝐡_{0} ‖)$ is contained inside the set $X$ . Let $M$ be the Lipschitz constant for the Jacobian over this ball (assuming it exists).

As a last preparation, construct recursively, as long as it is possible, the sequences $(𝐱_{k})_{k}$ , $(𝐡_{k})_{k}$ , $(α_{k})_{k}$ according to

\begin{matrix} 𝐡_{k} & = - F^{'} (𝐱_{k})^{- 1} F (𝐱_{k}) \\ α_{k} & = M ‖ F^{'} (𝐱_{k})^{- 1} ‖ ‖ 𝐡_{k} ‖ \\ 𝐱_{k + 1} & = 𝐱_{k} + 𝐡_{k} . \end{matrix}
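The recursion above can be sketched numerically. The following is a minimal Python sketch, assuming NumPy, the spectral norm as the operator norm, and a Lipschitz constant `M` supplied by the caller. The function name `newton_kantorovich_sequences` and the test system are illustrative choices, not part of the theorem.

```python
import numpy as np

def newton_kantorovich_sequences(F, J, x0, M, steps=5):
    """Return the lists (x_k), (h_k), (alpha_k) for `steps` Newton steps.

    F, J : callables returning F(x) and the Jacobian F'(x) (assumed names).
    M    : Lipschitz constant of the Jacobian, supplied by the caller.
    """
    x = np.asarray(x0, dtype=float)
    xs, hs, alphas = [x], [], []
    for _ in range(steps):
        Jx = J(x)
        # Newton step h_k = -F'(x_k)^{-1} F(x_k); np.linalg.solve raises
        # LinAlgError if the Jacobian is exactly singular, ending the recursion.
        h = -np.linalg.solve(Jx, F(x))
        # alpha_k = M * ||F'(x_k)^{-1}|| * ||h_k||, using the spectral norm.
        inv_norm = np.linalg.norm(np.linalg.inv(Jx), 2)
        alphas.append(M * inv_norm * np.linalg.norm(h))
        hs.append(h)
        x = x + h
        xs.append(x)
    return xs, hs, alphas

# Illustrative system: the unit circle intersected with the line x = y.
F = lambda v: np.array([v[0] ** 2 + v[1] ** 2 - 1.0, v[0] - v[1]])
J = lambda v: np.array([[2.0 * v[0], 2.0 * v[1]], [1.0, -1.0]])
# J(x) - J(y) = [[2 dx, 2 dy], [0, 0]], so M = 2 works in the spectral norm.
xs, hs, alphas = newton_kantorovich_sequences(F, J, [0.8, 0.6], M=2.0)
print(alphas[0] <= 0.5, xs[-1])
```

For this starting point $α_{0}$ is roughly 0.21, and the iterates approach the root with both coordinates equal to $1 / \sqrt{2}$.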

Statement

Now if $α_{0} \leq \frac{1}{2}$ then

a solution $𝐱^{*}$ of $F (𝐱^{*}) = 0$ exists inside the closed ball $\bar{B} (𝐱_{1}, ‖ 𝐡_{0} ‖)$ and
the Newton iteration starting in $𝐱_{0}$ converges to $𝐱^{*}$ with at least linear order of convergence.
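As a small worked illustration of the condition $α_{0} \leq \frac{1}{2}$, consider the scalar equation $x^{2} - 2 = 0$ on $X = ℝ$. The function and the starting point $x_{0} = \frac{3}{2}$ are chosen here only for illustration. Since $f^{'} (x) = 2 x$, the derivative has Lipschitz constant 2 everywhere, so $M = 2$.

```latex
\begin{aligned}
f(x) &= x^{2} - 2, \qquad x_{0} = \tfrac{3}{2}, \qquad f'(x) = 2x, \qquad M = 2, \\
h_{0} &= -\frac{f(x_{0})}{f'(x_{0})} = -\frac{1/4}{3} = -\frac{1}{12}, \qquad
x_{1} = x_{0} + h_{0} = \frac{17}{12}, \\
\alpha_{0} &= M \, |f'(x_{0})^{-1}| \, |h_{0}|
            = 2 \cdot \frac{1}{3} \cdot \frac{1}{12} = \frac{1}{18} \leq \frac{1}{2}, \\
\bar{B}(x_{1}, |h_{0}|) &= \left[ \tfrac{4}{3}, \tfrac{3}{2} \right] \ni \sqrt{2} \approx 1.4142 .
\end{aligned}
```

The condition holds, and the closed ball indeed contains the root $\sqrt{2}$, as the theorem guarantees.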

A statement that is more precise but slightly more difficult to prove uses the roots $t^{*} \leq t^{* *}$ of the quadratic polynomial

p (t) = (\frac{1}{2} M ‖ F^{'} (𝐱_{0})^{- 1} ‖) t^{2} - t + ‖ 𝐡_{0} ‖ ,

namely

t^{*} = \frac{2 ‖ 𝐡_{0} ‖}{1 + \sqrt{1 - 2 α_{0}}} , \quad t^{* *} = \frac{2 ‖ 𝐡_{0} ‖}{1 - \sqrt{1 - 2 α_{0}}} ,

and their ratio

θ = \frac{t^{*}}{t^{* *}} = \frac{1 - \sqrt{1 - 2 α_{0}}}{1 + \sqrt{1 - 2 α_{0}}} .

Then

a solution $𝐱^{*}$ exists inside the closed ball $\bar{B} (𝐱_{1}, θ ‖ 𝐡_{0} ‖) \subset \bar{B} (𝐱_{0}, t^{*})$
it is unique inside the bigger ball $B (𝐱_{0}, t^{* *})$
and the convergence to the solution of $F$ is dominated by the convergence of the Newton iteration of the quadratic polynomial $p (t)$ towards its smaller root $t^{*}$:^[4] if $t_{0} = 0$ and $t_{k + 1} = t_{k} - \frac{p (t_{k})}{p^{'} (t_{k})}$ , then for all $k, m \geq 0$
$‖ 𝐱_{k + m} - 𝐱_{k} ‖ \leq t_{k + m} - t_{k} .$
The quadratic convergence is obtained from the error estimate^[5]
$‖ 𝐱_{n + 1} - 𝐱^{*} ‖ \leq θ^{2^{n}} ‖ 𝐱_{n + 1} - 𝐱_{n} ‖ \leq \frac{θ^{2^{n}}}{2^{n}} ‖ 𝐡_{0} ‖ .$
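The radii and the majorant iteration can be checked numerically. The sketch below is a minimal Python illustration, assuming NumPy and the spectral norm. It writes the quadratic with leading coefficient $α_{0} / (2 ‖ 𝐡_{0} ‖)$, which is the choice that makes its roots equal to the $t^{*}$ and $t^{* *}$ given above. The helper names `kantorovich_bounds` and `majorant_sequence` and the test system are hypothetical.

```python
import numpy as np

def kantorovich_bounds(alpha0, h0_norm):
    """Return (t_star, t_2star, theta) from alpha_0 and ||h_0||."""
    if not 0.0 < alpha0 <= 0.5:
        raise ValueError("requires 0 < alpha_0 <= 1/2")
    s = np.sqrt(1.0 - 2.0 * alpha0)
    t_star = 2.0 * h0_norm / (1.0 + s)
    t_2star = 2.0 * h0_norm / (1.0 - s)
    return t_star, t_2star, (1.0 - s) / (1.0 + s)

def majorant_sequence(alpha0, h0_norm, steps):
    """Newton iterates t_k of p(t) = (a/2) t^2 - t + ||h_0||, t_0 = 0."""
    a = alpha0 / h0_norm  # this coefficient gives p the roots t_star, t_2star
    ts = [0.0]
    for _ in range(steps):
        t = ts[-1]
        ts.append(t - (0.5 * a * t * t - t + h0_norm) / (a * t - 1.0))
    return ts

# Illustrative data (assumed): F(x, y) = (x^2 + y^2 - 1, x - y), whose
# Jacobian has Lipschitz constant M = 2 in the spectral norm.
F = lambda v: np.array([v[0] ** 2 + v[1] ** 2 - 1.0, v[0] - v[1]])
J = lambda v: np.array([[2.0 * v[0], 2.0 * v[1]], [1.0, -1.0]])
M, n_steps = 2.0, 4

x = np.array([0.8, 0.6])
xs = [x]
for _ in range(n_steps):
    x = x - np.linalg.solve(J(x), F(x))  # plain Newton step
    xs.append(x)

h0_norm = np.linalg.norm(xs[1] - xs[0])
alpha0 = M * np.linalg.norm(np.linalg.inv(J(xs[0])), 2) * h0_norm
t_star, t_2star, theta = kantorovich_bounds(alpha0, h0_norm)
ts = majorant_sequence(alpha0, h0_norm, n_steps)

# Each Newton step should be no longer than the matching majorant step
# (a small tolerance absorbs rounding; the first step is an equality).
tol = 1e-12
steps_ok = all(
    np.linalg.norm(xs[k + 1] - xs[k]) <= ts[k + 1] - ts[k] + tol
    for k in range(n_steps)
)
# The known root (1, 1)/sqrt(2) should lie in the ball around x_1.
x_true = np.array([1.0, 1.0]) / np.sqrt(2.0)
ball_ok = np.linalg.norm(x_true - xs[1]) <= theta * h0_norm
print(steps_ok, ball_ok)
```

Four steps are used on purpose: beyond that, the step lengths in this example fall to the level of floating-point rounding and the comparison stops being informative.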

Corollary

In 1986, Yamamoto proved that the error estimates for Newton's method due to Döring (1969), Ostrowski (1971, 1973),^[6]^[7] Gragg–Tapia (1974), Potra–Pták (1980),^[8] Miel (1981),^[9] and Potra (1984)^[10] can all be derived from the Kantorovich theorem.^[11]

Generalizations

There is a q-analog for the Kantorovich theorem.^[12]^[13] For other generalizations/variations, see Ortega & Rheinboldt (1970).^[14]

Applications

Oishi and Tanabe claimed that the Kantorovich theorem can be applied to obtain reliable solutions of linear programming.^[15]

