Matrix sign function

In mathematics, the matrix sign function is a matrix function on square matrices analogous to the complex sign function.^[1]

It was introduced by J. D. Roberts in 1971 as a tool for model reduction and for solving Lyapunov and algebraic Riccati equations, in a technical report of Cambridge University that was later published in a journal in 1980.^[2]^[3]

Definition

The matrix sign function is a generalization of the complex signum function

$csgn (z) = \begin{cases} 1 & \text{if } \operatorname{Re}(z) > 0, \\ - 1 & \text{if } \operatorname{Re}(z) < 0, \end{cases}$

to the matrix-valued analogue $csgn (A)$ . Although the sign function is not analytic on the whole complex plane, the matrix function is well defined for every matrix that has no eigenvalue on the imaginary axis; see, for example, the Jordan-form-based definition of a matrix function (where the derivatives are all zero).

Properties

Theorem: Let $A \in ℂ^{n \times n}$ have no eigenvalue on the imaginary axis. Then $csgn (A)^{2} = I$ .^[1]

Theorem: Let $A \in ℂ^{n \times n}$ have no eigenvalue on the imaginary axis. Then $csgn (A)$ is diagonalizable and all of its eigenvalues are $\pm 1$ .^[1]

Theorem: Let $A \in ℂ^{n \times n}$ have no eigenvalue on the imaginary axis. Then $(I + csgn (A)) / 2$ is a projector onto the invariant subspace associated with the eigenvalues in the right half-plane, and analogously for $(I - csgn (A)) / 2$ and the left half-plane.^[1]

Theorem: Let $A \in ℂ^{n \times n}$ , and let $A = P [\begin{matrix} J_{+} & 0 \\ 0 & J_{-} \end{matrix}] P^{- 1}$ be a Jordan decomposition such that $J_{+}$ corresponds to the eigenvalues with positive real part and $J_{-}$ to the eigenvalues with negative real part. Then $csgn (A) = P [\begin{matrix} I_{+} & 0 \\ 0 & - I_{-} \end{matrix}] P^{- 1}$ , where $I_{+}$ and $I_{-}$ are identity matrices of sizes corresponding to $J_{+}$ and $J_{-}$ , respectively.^[1]
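The decomposition in the last theorem can be illustrated numerically. The following is a minimal sketch, assuming that $A$ is diagonalizable (so that an eigendecomposition can stand in for the Jordan form) and using NumPy; the function name `sign_via_eig` is chosen here for illustration only. It also checks the involution and projector properties stated above.

```python
import numpy as np

def sign_via_eig(A):
    """Sketch: csgn(A) for a diagonalizable A with no eigenvalue
    on the imaginary axis, via A = P diag(w) P^{-1}."""
    w, P = np.linalg.eig(A)
    if np.any(np.isclose(w.real, 0.0)):
        raise ValueError("eigenvalue on (or too close to) the imaginary axis")
    s = np.sign(w.real)                 # +1 or -1 for each eigenvalue
    return (P * s) @ np.linalg.inv(P)   # P diag(s) P^{-1}

A = np.array([[2.0, 1.0],
              [0.0, -3.0]])
S = sign_via_eig(A)

# Projector onto the invariant subspace of the right half-plane eigenvalues.
proj_right = (np.eye(2) + S) / 2

print(np.allclose(S @ S, np.eye(2)))                    # involution
print(np.allclose(proj_right @ proj_right, proj_right)) # idempotent
```

This route is not recommended for matrices that are defective or have an ill-conditioned eigenvector matrix; it is meant only to make the theorem concrete.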

Computational methods

The function can be computed with generic methods for matrix functions, but there are also specialized methods.

Newton iteration

The Newton iteration can be derived by observing that $csgn (x) = \sqrt{x^{2}} / x$ , which in terms of matrices can be written as $csgn (A) = A^{- 1} \sqrt{A^{2}}$ , where $\sqrt{A^{2}}$ denotes the principal matrix square root. If we apply the Babylonian method to compute the square root of the matrix $A^{2}$ , that is, the iteration $X_{k + 1} = \frac{1}{2} (X_{k} + A^{2} X_{k}^{- 1})$ , and define the new iterate $Z_{k} = A^{- 1} X_{k}$ , we arrive at the iteration

$Z_{k + 1} = \frac{1}{2} (Z_{k} + Z_{k}^{- 1})$ ,

where typically $Z_{0} = A$ . Convergence is global, and locally it is quadratic.^[1]^[2]

The Newton iteration uses the explicit inverse of the iterates $Z_{k}$ .
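The iteration above can be sketched in a few lines of NumPy. This is an unscaled illustration under the assumption that $A$ has no eigenvalue on the imaginary axis; the stopping rule, the tolerance and the name `sign_newton` are choices made here rather than part of the cited method.

```python
import numpy as np

def sign_newton(A, tol=1e-12, max_iter=100):
    """Sketch: approximate csgn(A) by Z_{k+1} = (Z_k + Z_k^{-1}) / 2, Z_0 = A."""
    Z = np.array(A, dtype=complex if np.iscomplexobj(A) else float)
    for _ in range(max_iter):
        Z_next = 0.5 * (Z + np.linalg.inv(Z))  # explicit inverse of the iterate
        # Assumed stopping rule: successive iterates agree to a relative tolerance.
        if np.linalg.norm(Z_next - Z, 1) <= tol * np.linalg.norm(Z_next, 1):
            return Z_next
        Z = Z_next
    raise RuntimeError("Newton iteration did not converge")

A = np.array([[2.0, 1.0],
              [0.0, -3.0]])
S = sign_newton(A)
print(S)
```

For this triangular example the diagonal entries of the iterates follow the scalar map $x \mapsto (x + 1/x)/2$ and converge to the signs of the eigenvalues.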

Newton–Schulz iteration

To avoid the explicit inverse used in the Newton iteration, the inverse can be approximated with one step of the Newton iteration for the inverse, $Z_{k}^{- 1} \approx Z_{k} (2 I - Z_{k}^{2})$ , derived by Schulz in 1933.^[4] Substituting this approximation into the previous method, the new method becomes

$Z_{k + 1} = \frac{1}{2} Z_{k} (3 I - Z_{k}^{2})$ .

Convergence is (still) quadratic, but only local (guaranteed for $‖ I - A^{2} ‖ < 1$ ).^[1]
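An inverse-free sketch of this iteration, again in NumPy, is given below. The example matrix is chosen here so that $‖ I - A^{2} ‖ < 1$ holds; the function name and the stopping rule are illustrative assumptions.

```python
import numpy as np

def sign_newton_schulz(A, tol=1e-12, max_iter=100):
    """Sketch: Z_{k+1} = Z_k (3 I - Z_k^2) / 2 with Z_0 = A (no inverses)."""
    Z = np.array(A, dtype=complex if np.iscomplexobj(A) else float)
    I = np.eye(Z.shape[0])
    for _ in range(max_iter):
        Z_next = 0.5 * Z @ (3.0 * I - Z @ Z)   # only matrix products
        if np.linalg.norm(Z_next - Z, 1) <= tol * np.linalg.norm(Z_next, 1):
            return Z_next
        Z = Z_next
    raise RuntimeError("Newton-Schulz iteration did not converge")

A = np.array([[0.9, 0.2],
              [0.0, -0.8]])
# The local convergence condition from the text holds for this example.
print(np.linalg.norm(np.eye(2) - A @ A, 2) < 1)
S = sign_newton_schulz(A)
print(S)
```

Because only matrix multiplications are involved, this variant tends to suit hardware where products are much cheaper than inversions, at the price of the restricted convergence region.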

Applications

Solutions of Sylvester equations

Theorem:^[2]^[3] Let $A, B, C \in ℝ^{n \times n}$ and assume that $A$ and $B$ are stable (all of their eigenvalues have negative real part). Then the unique solution to the Sylvester equation $A X + X B = C$ is given by the matrix $X$ such that

$[\begin{matrix} - I & 2 X \\ 0 & I \end{matrix}] = csgn ([\begin{matrix} A & - C \\ 0 & - B \end{matrix}]) .$

Proof sketch: The result follows from the similarity transform

$[\begin{matrix} A & - C \\ 0 & - B \end{matrix}] = [\begin{matrix} I & X \\ 0 & I \end{matrix}] [\begin{matrix} A & 0 \\ 0 & - B \end{matrix}] {[\begin{matrix} I & X \\ 0 & I \end{matrix}]}^{- 1},$

since

$csgn ([\begin{matrix} A & - C \\ 0 & - B \end{matrix}]) = [\begin{matrix} I & X \\ 0 & I \end{matrix}] [\begin{matrix} - I & 0 \\ 0 & I \end{matrix}] [\begin{matrix} I & - X \\ 0 & I \end{matrix}],$

due to the stability of $A$ and $B$ .

The theorem is, naturally, also applicable to the Lyapunov equation. However, due to the structure of that equation, the Newton iteration simplifies so that it only involves inverses of $A$ and $A^{T}$ .
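The Sylvester theorem above translates directly into a short numerical sketch: form the block matrix, compute its sign, and read $X$ off the upper-right block. The helper `sign_newton` is the unscaled Newton iteration with an assumed stopping rule, and the matrices are example data chosen here with $A$ and $B$ stable.

```python
import numpy as np

def sign_newton(M, tol=1e-12, max_iter=100):
    """Unscaled Newton iteration Z <- (Z + Z^{-1}) / 2 (illustrative helper)."""
    Z = np.array(M, dtype=float)
    for _ in range(max_iter):
        Z_next = 0.5 * (Z + np.linalg.inv(Z))
        if np.linalg.norm(Z_next - Z, 1) <= tol * np.linalg.norm(Z_next, 1):
            return Z_next
        Z = Z_next
    raise RuntimeError("Newton iteration did not converge")

def solve_sylvester_sign(A, B, C):
    """Sketch: solve A X + X B = C for stable A and B via the sign function."""
    n = A.shape[0]
    M = np.block([[A, -C],
                  [np.zeros((n, n)), -B]])
    S = sign_newton(M)        # expected block form [[-I, 2X], [0, I]]
    return S[:n, n:] / 2.0

A = np.array([[-2.0, 1.0], [0.0, -1.0]])   # eigenvalues -2, -1
B = np.array([[-3.0, 0.0], [1.0, -4.0]])   # eigenvalues -3, -4
C = np.array([[1.0, 2.0], [3.0, 4.0]])
X = solve_sylvester_sign(A, B, C)
print(np.allclose(A @ X + X @ B, C))
```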

Solutions of algebraic Riccati equations

There is a similar result applicable to the algebraic Riccati equation, $A^{H} P + P A - P F P + Q = 0$ .^[1]^[2] Define $V, W \in ℂ^{2 n \times n}$ as

$[\begin{matrix} V & W \end{matrix}] = csgn ([\begin{matrix} A^{H} & Q \\ F & - A \end{matrix}]) - [\begin{matrix} I & 0 \\ 0 & I \end{matrix}] .$

Under the assumption that $F, Q \in ℂ^{n \times n}$ are Hermitian and there exists a unique stabilizing solution, in the sense that $A - F P$ is stable, that solution is given by the over-determined, but consistent, linear system

$V P = - W .$

Proof sketch: The similarity transform

$[\begin{matrix} A^{H} & Q \\ F & - A \end{matrix}] = [\begin{matrix} P & - I \\ I & 0 \end{matrix}] [\begin{matrix} - (A - F P) & - F \\ 0 & (A - F P)^{H} \end{matrix}] {[\begin{matrix} P & - I \\ I & 0 \end{matrix}]}^{- 1},$

and the stability of $A - F P$ implies that

$(csgn ([\begin{matrix} A^{H} & Q \\ F & - A \end{matrix}]) - [\begin{matrix} I & 0 \\ 0 & I \end{matrix}]) [\begin{matrix} P & - I \\ I & 0 \end{matrix}] = [\begin{matrix} P & - I \\ I & 0 \end{matrix}] [\begin{matrix} 0 & Y \\ 0 & - 2 I \end{matrix}],$

for some matrix $Y \in ℂ^{n \times n}$ .
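As a hedged illustration of the Riccati result, the sketch below forms the block matrix, computes its sign with an unscaled Newton iteration, splits $csgn(\cdot) - I$ into the column blocks $V$ and $W$, and solves the consistent over-determined system $V P = - W$ by least squares. The function names and the double-integrator example data are assumptions made here; for that example the stabilizing solution is known in closed form.

```python
import numpy as np

def sign_newton(M, tol=1e-12, max_iter=100):
    """Unscaled Newton iteration Z <- (Z + Z^{-1}) / 2 (illustrative helper)."""
    Z = np.array(M, dtype=complex if np.iscomplexobj(M) else float)
    for _ in range(max_iter):
        Z_next = 0.5 * (Z + np.linalg.inv(Z))
        if np.linalg.norm(Z_next - Z, 1) <= tol * np.linalg.norm(Z_next, 1):
            return Z_next
        Z = Z_next
    raise RuntimeError("Newton iteration did not converge")

def solve_riccati_sign(A, F, Q):
    """Sketch: stabilizing solution of A^H P + P A - P F P + Q = 0."""
    n = A.shape[0]
    H = np.block([[A.conj().T, Q],
                  [F, -A]])
    VW = sign_newton(H) - np.eye(2 * n)
    V, W = VW[:, :n], VW[:, n:]            # each block is 2n-by-n
    # V P = -W is over-determined but consistent; least squares recovers P.
    P, *_ = np.linalg.lstsq(V, -W, rcond=None)
    return P

A = np.array([[0.0, 1.0], [0.0, 0.0]])     # double integrator (example data)
F = np.array([[0.0, 0.0], [0.0, 1.0]])
Q = np.eye(2)
P = solve_riccati_sign(A, F, Q)
residual = A.conj().T @ P + P @ A - P @ F @ P + Q
print(np.allclose(residual, 0.0))
```

For this example the computed $P$ should agree with $[\begin{matrix} \sqrt{3} & 1 \\ 1 & \sqrt{3} \end{matrix}]$ , for which $A - F P$ has eigenvalues with negative real part.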

Computations of matrix square-root

The Denman–Beavers iteration for the square root of a matrix can be derived from the Newton iteration for the matrix sign function by noticing that $A - P I P = 0$ is a degenerate algebraic Riccati equation^[3] and that, by definition, a solution $P$ is a square root of $A$ .
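The article does not state the iteration itself. In its commonly cited form it is the coupled recurrence $Y_{k + 1} = \frac{1}{2} (Y_{k} + Z_{k}^{- 1})$ , $Z_{k + 1} = \frac{1}{2} (Z_{k} + Y_{k}^{- 1})$ with $Y_{0} = A$ and $Z_{0} = I$ , where $Y_{k}$ tends to $A^{1/2}$ and $Z_{k}$ to $A^{- 1/2}$ . A minimal NumPy sketch follows, assuming $A$ has no eigenvalue on the closed negative real axis; the stopping rule and the function name are choices made here.

```python
import numpy as np

def sqrt_denman_beavers(A, tol=1e-12, max_iter=100):
    """Sketch of the Denman-Beavers iteration; returns (Y, Z) with
    Y ~ A^{1/2} and Z ~ A^{-1/2}."""
    Y = np.array(A, dtype=complex if np.iscomplexobj(A) else float)
    Z = np.eye(Y.shape[0], dtype=Y.dtype)
    for _ in range(max_iter):
        # Both updates use the previous iterates (computed before reassignment).
        Y_next = 0.5 * (Y + np.linalg.inv(Z))
        Z_next = 0.5 * (Z + np.linalg.inv(Y))
        if np.linalg.norm(Y_next - Y, 1) <= tol * np.linalg.norm(Y_next, 1):
            return Y_next, Z_next
        Y, Z = Y_next, Z_next
    raise RuntimeError("Denman-Beavers iteration did not converge")

A = np.array([[4.0, 1.0],
              [0.0, 9.0]])
Y, Z = sqrt_denman_beavers(A)
print(np.allclose(Y @ Y, A))
```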

References

[1] Template:Cite book
[2] Template:Cite journal
[3] Template:Cite journal
[4] Template:Cite journal
