Eigenvalue perturbation


In mathematics, an eigenvalue perturbation problem is that of finding the eigenvectors and eigenvalues of a system $Ax = \lambda x$ that is perturbed from one with known eigenvectors and eigenvalues $A_0 x_0 = \lambda_0 x_0$. This is useful for studying how sensitive the original system's eigenvectors and eigenvalues $x_{0i}, \lambda_{0i}, i = 1, \dots, n$ are to changes in the system. This type of analysis was popularized by Lord Rayleigh in his investigation of harmonic vibrations of a string perturbed by small inhomogeneities.[1]

The derivations in this article are essentially self-contained and can be found in many texts on numerical linear algebra or numerical functional analysis. This article focuses on the case of the perturbation of a simple eigenvalue (see multiplicity of eigenvalues).

Why generalized eigenvalues?

In the entry applications of eigenvalues and eigenvectors we find numerous scientific fields in which eigenvalues are used to obtain solutions. Generalized eigenvalue problems are less widespread but are key in the study of vibrations. They are useful when we use the Galerkin method or the Rayleigh–Ritz method to find approximate solutions of partial differential equations modeling vibrations of structures such as strings and plates; the paper of Courant (1943) [2] is fundamental. The finite element method is a widespread particular case.

In classical mechanics, generalized eigenvalues may crop up when we look for vibrations of systems with multiple degrees of freedom close to equilibrium; the kinetic energy provides the mass matrix $M$, and the potential strain energy provides the rigidity matrix $K$. For further details, see the first section of the article by Weinstein (1941, in French).[3]

With both methods, we obtain a system of differential equations (a matrix differential equation) $M\ddot{x} + B\dot{x} + Kx = 0$, with the mass matrix $M$, the damping matrix $B$ and the rigidity matrix $K$. If we neglect the damping effect, we set $B = 0$ and can look for a solution of the form $x = e^{i\omega t} u$; we obtain that $u$ and $\omega^2$ are solutions of the generalized eigenvalue problem $-\omega^2 M u + K u = 0$.
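As a concrete sketch (the mass and stiffness values below are arbitrary illustrations, not taken from a specific structure), the undamped problem $-\omega^2 Mu + Ku = 0$ can be solved with `scipy.linalg.eigh`, which handles the symmetric-definite generalized eigenvalue problem $Ku = \omega^2 Mu$ directly:

```python
import numpy as np
from scipy.linalg import eigh

# Illustrative two-degree-of-freedom system: mass matrix M and rigidity matrix K.
M = np.array([[2.0, 0.0],
              [0.0, 1.0]])
K = np.array([[3.0, -1.0],
              [-1.0, 1.0]])

# Solve K u = omega^2 M u; eigh returns omega^2 in ascending order,
# with eigenvectors normalized so that U.T @ M @ U = I.
w2, U = eigh(K, M)
omega = np.sqrt(w2)  # natural angular frequencies

print("omega^2:", w2)
print("M-orthonormal:", np.allclose(U.T @ M @ U, np.eye(2)))
```

Each column of `U` is a vibration mode; the normalization `U.T @ M @ U = I` is exactly the scaling assumed in equation (2) of the perturbation setting.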

Setting of perturbation for a generalized eigenvalue problem

Suppose we have solutions to the generalized eigenvalue problem,

$$\mathbf{K}_0 \mathbf{x}_{0i} = \lambda_{0i} \mathbf{M}_0 \mathbf{x}_{0i}. \qquad (0)$$

where $\mathbf{K}_0$ and $\mathbf{M}_0$ are matrices. That is, we know the eigenvalues $\lambda_{0i}$ and eigenvectors $\mathbf{x}_{0i}$ for $i = 1, \dots, N$. It is also required that the eigenvalues be distinct.

Now suppose we want to change the matrices by a small amount. That is, we want to find the eigenvalues and eigenvectors of

$$\mathbf{K}\mathbf{x}_i = \lambda_i \mathbf{M}\mathbf{x}_i \qquad (1)$$

where

$$\mathbf{K} = \mathbf{K}_0 + \delta\mathbf{K}, \qquad \mathbf{M} = \mathbf{M}_0 + \delta\mathbf{M}$$

with the perturbations δ𝐊 and δ𝐌 much smaller than 𝐊 and 𝐌 respectively. Then we expect the new eigenvalues and eigenvectors to be similar to the original, plus small perturbations:

$$\lambda_i = \lambda_{0i} + \delta\lambda_i, \qquad \mathbf{x}_i = \mathbf{x}_{0i} + \delta\mathbf{x}_i$$

Steps

We assume that the matrices are symmetric and positive definite, and assume we have scaled the eigenvectors such that

$$\mathbf{x}_{0j}^T \mathbf{M}_0 \mathbf{x}_{0i} = \delta_{ij}, \qquad \mathbf{x}_i^T \mathbf{M} \mathbf{x}_j = \delta_{ij} \qquad (2)$$

where $\delta_{ij}$ is the Kronecker delta. Now we want to solve the equation

$$\mathbf{K}\mathbf{x}_i - \lambda_i \mathbf{M}\mathbf{x}_i = 0.$$

In this article we restrict the study to first order perturbation.

First order expansion of the equation

Substituting in (1), we get

$$(\mathbf{K}_0 + \delta\mathbf{K})(\mathbf{x}_{0i} + \delta\mathbf{x}_i) = (\lambda_{0i} + \delta\lambda_i)(\mathbf{M}_0 + \delta\mathbf{M})(\mathbf{x}_{0i} + \delta\mathbf{x}_i),$$

which expands to

$$\mathbf{K}_0\mathbf{x}_{0i} + \delta\mathbf{K}\,\mathbf{x}_{0i} + \mathbf{K}_0\,\delta\mathbf{x}_i + \delta\mathbf{K}\,\delta\mathbf{x}_i = \lambda_{0i}\mathbf{M}_0\mathbf{x}_{0i} + \lambda_{0i}\mathbf{M}_0\,\delta\mathbf{x}_i + \lambda_{0i}\,\delta\mathbf{M}\,\mathbf{x}_{0i} + \delta\lambda_i\,\mathbf{M}_0\mathbf{x}_{0i} + \lambda_{0i}\,\delta\mathbf{M}\,\delta\mathbf{x}_i + \delta\lambda_i\,\delta\mathbf{M}\,\mathbf{x}_{0i} + \delta\lambda_i\,\mathbf{M}_0\,\delta\mathbf{x}_i + \delta\lambda_i\,\delta\mathbf{M}\,\delta\mathbf{x}_i.$$

Canceling the zeroth-order terms using (0) ($\mathbf{K}_0\mathbf{x}_{0i} = \lambda_{0i}\mathbf{M}_0\mathbf{x}_{0i}$) leaves

$$\delta\mathbf{K}\,\mathbf{x}_{0i} + \mathbf{K}_0\,\delta\mathbf{x}_i + \delta\mathbf{K}\,\delta\mathbf{x}_i = \lambda_{0i}\mathbf{M}_0\,\delta\mathbf{x}_i + \lambda_{0i}\,\delta\mathbf{M}\,\mathbf{x}_{0i} + \delta\lambda_i\,\mathbf{M}_0\mathbf{x}_{0i} + \lambda_{0i}\,\delta\mathbf{M}\,\delta\mathbf{x}_i + \delta\lambda_i\,\delta\mathbf{M}\,\mathbf{x}_{0i} + \delta\lambda_i\,\mathbf{M}_0\,\delta\mathbf{x}_i + \delta\lambda_i\,\delta\mathbf{M}\,\delta\mathbf{x}_i.$$

Removing the higher-order terms, this simplifies to

$$\mathbf{K}_0\,\delta\mathbf{x}_i + \delta\mathbf{K}\,\mathbf{x}_{0i} = \lambda_{0i}\mathbf{M}_0\,\delta\mathbf{x}_i + \lambda_{0i}\,\delta\mathbf{M}\,\mathbf{x}_{0i} + \delta\lambda_i\,\mathbf{M}_0\mathbf{x}_{0i}. \qquad (3)$$

In other words, $\delta\lambda_i$ no longer denotes the exact variation of the eigenvalue but its first-order approximation.

As the matrices are symmetric, the unperturbed eigenvectors are $\mathbf{M}_0$-orthogonal, so we use them as a basis for the perturbed eigenvectors. That is, we want to construct

$$\delta\mathbf{x}_i = \sum_{j=1}^N \varepsilon_{ij}\,\mathbf{x}_{0j} \qquad (4)$$ with $\varepsilon_{ij} = \mathbf{x}_{0j}^T \mathbf{M}_0\,\delta\mathbf{x}_i$,

where the $\varepsilon_{ij}$ are small constants that are to be determined.

In the same way, substituting in (2) and removing higher-order terms, we get
$$\delta\mathbf{x}_j^T \mathbf{M}_0 \mathbf{x}_{0i} + \mathbf{x}_{0j}^T \mathbf{M}_0\,\delta\mathbf{x}_i + \mathbf{x}_{0j}^T\,\delta\mathbf{M}\,\mathbf{x}_{0i} = 0. \qquad (5)$$

The derivation can proceed along either of two forks.

First fork: get first eigenvalue perturbation

Eigenvalue perturbation
We start with (3):
$$\mathbf{K}_0\,\delta\mathbf{x}_i + \delta\mathbf{K}\,\mathbf{x}_{0i} = \lambda_{0i}\mathbf{M}_0\,\delta\mathbf{x}_i + \lambda_{0i}\,\delta\mathbf{M}\,\mathbf{x}_{0i} + \delta\lambda_i\,\mathbf{M}_0\mathbf{x}_{0i};$$

we left-multiply by $\mathbf{x}_{0i}^T$ and use (2) as well as its first-order variation (5); we get

$$\mathbf{x}_{0i}^T\,\delta\mathbf{K}\,\mathbf{x}_{0i} = \lambda_{0i}\,\mathbf{x}_{0i}^T\,\delta\mathbf{M}\,\mathbf{x}_{0i} + \delta\lambda_i$$

or

$$\delta\lambda_i = \mathbf{x}_{0i}^T\,\delta\mathbf{K}\,\mathbf{x}_{0i} - \lambda_{0i}\,\mathbf{x}_{0i}^T\,\delta\mathbf{M}\,\mathbf{x}_{0i}.$$

We notice that this is the first-order perturbation of the generalized Rayleigh quotient with fixed $x_{0i}$: $R(K, M; x_{0i}) = x_{0i}^T K x_{0i} / \left(x_{0i}^T M x_{0i}\right)$, with $x_{0i}^T M x_{0i} = 1$.

Moreover, for $M = I$, the formula $\delta\lambda_i = x_{0i}^T\,\delta K\,x_{0i}$ should be compared with the Bauer–Fike theorem, which provides a bound on eigenvalue perturbation.
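This first-order prediction for $\delta\lambda_i$ is easy to verify numerically. The sketch below (random symmetric positive definite matrices with an arbitrary size and seed, assuming distinct eigenvalues) compares the formula $\delta\lambda_i = x_{0i}^T\,\delta K\,x_{0i} - \lambda_{0i}\,x_{0i}^T\,\delta M\,x_{0i}$ with exactly recomputed eigenvalues:

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
sym = lambda A: (A + A.T) / 2  # symmetrize a random matrix

n = 4
X = rng.standard_normal((n, n)); K0 = X @ X.T + n * np.eye(n)  # random SPD
Y = rng.standard_normal((n, n)); M0 = Y @ Y.T + n * np.eye(n)

lam0, X0 = eigh(K0, M0)  # X0.T @ M0 @ X0 = I, the scaling of equation (2)

# Small symmetric perturbations of K0 and M0.
dK = 1e-6 * sym(rng.standard_normal((n, n)))
dM = 1e-6 * sym(rng.standard_normal((n, n)))
lam, _ = eigh(K0 + dK, M0 + dM)  # exact eigenvalues of the perturbed pencil

# First-order prediction: dlam_i = x0i^T dK x0i - lam0_i x0i^T dM x0i.
dlam = np.array([X0[:, i] @ dK @ X0[:, i] - lam0[i] * (X0[:, i] @ dM @ X0[:, i])
                 for i in range(n)])
print("residual:", np.max(np.abs(lam - (lam0 + dlam))))  # second order in the perturbation
```

The residual is of second order in $(\delta K, \delta M)$, consistent with a first-order formula.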

Eigenvector perturbation

We left-multiply (3) by $\mathbf{x}_{0j}^T$ for $j \neq i$ and get

$$\mathbf{x}_{0j}^T\mathbf{K}_0\,\delta\mathbf{x}_i + \mathbf{x}_{0j}^T\,\delta\mathbf{K}\,\mathbf{x}_{0i} = \lambda_{0i}\,\mathbf{x}_{0j}^T\mathbf{M}_0\,\delta\mathbf{x}_i + \lambda_{0i}\,\mathbf{x}_{0j}^T\,\delta\mathbf{M}\,\mathbf{x}_{0i} + \delta\lambda_i\,\mathbf{x}_{0j}^T\mathbf{M}_0\mathbf{x}_{0i}.$$

We use $\mathbf{x}_{0j}^T \mathbf{K}_0 = \lambda_{0j}\,\mathbf{x}_{0j}^T \mathbf{M}_0$ and $\mathbf{x}_{0j}^T \mathbf{M}_0 \mathbf{x}_{0i} = 0$ for $j \neq i$, which gives

$$\lambda_{0j}\,\mathbf{x}_{0j}^T\mathbf{M}_0\,\delta\mathbf{x}_i + \mathbf{x}_{0j}^T\,\delta\mathbf{K}\,\mathbf{x}_{0i} = \lambda_{0i}\,\mathbf{x}_{0j}^T\mathbf{M}_0\,\delta\mathbf{x}_i + \lambda_{0i}\,\mathbf{x}_{0j}^T\,\delta\mathbf{M}\,\mathbf{x}_{0i},$$

or

$$(\lambda_{0j} - \lambda_{0i})\,\mathbf{x}_{0j}^T\mathbf{M}_0\,\delta\mathbf{x}_i + \mathbf{x}_{0j}^T\,\delta\mathbf{K}\,\mathbf{x}_{0i} = \lambda_{0i}\,\mathbf{x}_{0j}^T\,\delta\mathbf{M}\,\mathbf{x}_{0i}.$$

As the eigenvalues are assumed to be simple, for $j \neq i$

$$\varepsilon_{ij} = \mathbf{x}_{0j}^T\mathbf{M}_0\,\delta\mathbf{x}_i = \frac{-\,\mathbf{x}_{0j}^T\,\delta\mathbf{K}\,\mathbf{x}_{0i} + \lambda_{0i}\,\mathbf{x}_{0j}^T\,\delta\mathbf{M}\,\mathbf{x}_{0i}}{\lambda_{0j} - \lambda_{0i}}, \quad i = 1, \dots, N;\ j = 1, \dots, N;\ j \neq i.$$

Moreover, (5) (the first-order variation of (2)) yields $2\varepsilon_{ii} = 2\,\mathbf{x}_{0i}^T\mathbf{M}_0\,\delta\mathbf{x}_i = -\,\mathbf{x}_{0i}^T\,\delta\mathbf{M}\,\mathbf{x}_{0i}$. We have obtained all the components of $\delta\mathbf{x}_i$.
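Under the same assumptions (symmetric positive definite matrices, simple eigenvalues), the components $\varepsilon_{ij}$ can be assembled into $\delta x_i$ and compared with an exactly recomputed eigenvector; the matrix sizes, seed, and index below are arbitrary illustrations. The only subtlety is fixing the arbitrary sign of the computed eigenvector before comparing:

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(1)
sym = lambda A: (A + A.T) / 2

n = 4
X = rng.standard_normal((n, n)); K0 = X @ X.T + n * np.eye(n)
Y = rng.standard_normal((n, n)); M0 = Y @ Y.T + n * np.eye(n)
lam0, X0 = eigh(K0, M0)  # M0-orthonormal eigenvectors

dK = 1e-6 * sym(rng.standard_normal((n, n)))
dM = 1e-6 * sym(rng.standard_normal((n, n)))
lam, Xp = eigh(K0 + dK, M0 + dM)

i = 2  # any index with a simple eigenvalue works
dx = np.zeros(n)
for j in range(n):
    if j == i:
        # From the normalization condition: eps_ii = -1/2 x0i^T dM x0i.
        eps = -0.5 * (X0[:, i] @ dM @ X0[:, i])
    else:
        # eps_ij = x0j^T (dK - lam0_i dM) x0i / (lam0_i - lam0_j).
        eps = (X0[:, j] @ (dK - lam0[i] * dM) @ X0[:, i]) / (lam0[i] - lam0[j])
    dx += eps * X0[:, j]

xi = Xp[:, i] * np.sign(Xp[:, i] @ M0 @ X0[:, i])  # fix the sign ambiguity
print("residual:", np.max(np.abs(xi - (X0[:, i] + dx))))  # second order
```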

Second fork: Straightforward manipulations

Substituting (4) into (3) and rearranging gives

$$\begin{aligned}
\mathbf{K}_0 \sum_{j=1}^N \varepsilon_{ij}\mathbf{x}_{0j} + \delta\mathbf{K}\,\mathbf{x}_{0i} &= \lambda_{0i}\mathbf{M}_0 \sum_{j=1}^N \varepsilon_{ij}\mathbf{x}_{0j} + \lambda_{0i}\,\delta\mathbf{M}\,\mathbf{x}_{0i} + \delta\lambda_i\,\mathbf{M}_0\mathbf{x}_{0i} && (5) \\
\sum_{j=1}^N \varepsilon_{ij}\mathbf{K}_0\mathbf{x}_{0j} + \delta\mathbf{K}\,\mathbf{x}_{0i} &= \lambda_{0i}\mathbf{M}_0 \sum_{j=1}^N \varepsilon_{ij}\mathbf{x}_{0j} + \lambda_{0i}\,\delta\mathbf{M}\,\mathbf{x}_{0i} + \delta\lambda_i\,\mathbf{M}_0\mathbf{x}_{0i} && \text{(applying } \mathbf{K}_0 \text{ to the sum)} \\
\sum_{j=1}^N \varepsilon_{ij}\lambda_{0j}\mathbf{M}_0\mathbf{x}_{0j} + \delta\mathbf{K}\,\mathbf{x}_{0i} &= \lambda_{0i}\mathbf{M}_0 \sum_{j=1}^N \varepsilon_{ij}\mathbf{x}_{0j} + \lambda_{0i}\,\delta\mathbf{M}\,\mathbf{x}_{0i} + \delta\lambda_i\,\mathbf{M}_0\mathbf{x}_{0i} && \text{(using Eq. (0))}
\end{aligned}$$

Because the eigenvectors are $\mathbf{M}_0$-orthogonal when $\mathbf{M}_0$ is positive definite, we can remove the summations by left-multiplying by $\mathbf{x}_{0i}^T$:

$$\varepsilon_{ii}\,\lambda_{0i}\,\mathbf{x}_{0i}^T\mathbf{M}_0\mathbf{x}_{0i} + \mathbf{x}_{0i}^T\,\delta\mathbf{K}\,\mathbf{x}_{0i} = \lambda_{0i}\,\varepsilon_{ii}\,\mathbf{x}_{0i}^T\mathbf{M}_0\mathbf{x}_{0i} + \lambda_{0i}\,\mathbf{x}_{0i}^T\,\delta\mathbf{M}\,\mathbf{x}_{0i} + \delta\lambda_i\,\mathbf{x}_{0i}^T\mathbf{M}_0\mathbf{x}_{0i}.$$

By use of equation (0) again:

$$\varepsilon_{ii}\,\mathbf{x}_{0i}^T\mathbf{K}_0\mathbf{x}_{0i} + \mathbf{x}_{0i}^T\,\delta\mathbf{K}\,\mathbf{x}_{0i} = \lambda_{0i}\,\varepsilon_{ii}\,\mathbf{x}_{0i}^T\mathbf{M}_0\mathbf{x}_{0i} + \lambda_{0i}\,\mathbf{x}_{0i}^T\,\delta\mathbf{M}\,\mathbf{x}_{0i} + \delta\lambda_i\,\mathbf{x}_{0i}^T\mathbf{M}_0\mathbf{x}_{0i}. \qquad (6)$$

The two terms containing $\varepsilon_{ii}$ are equal because left-multiplying (0) by $\mathbf{x}_{0i}^T$ gives

$$\mathbf{x}_{0i}^T\mathbf{K}_0\mathbf{x}_{0i} = \lambda_{0i}\,\mathbf{x}_{0i}^T\mathbf{M}_0\mathbf{x}_{0i}.$$

Canceling those terms in (6) leaves

$$\mathbf{x}_{0i}^T\,\delta\mathbf{K}\,\mathbf{x}_{0i} = \lambda_{0i}\,\mathbf{x}_{0i}^T\,\delta\mathbf{M}\,\mathbf{x}_{0i} + \delta\lambda_i\,\mathbf{x}_{0i}^T\mathbf{M}_0\mathbf{x}_{0i}.$$

Rearranging gives

$$\delta\lambda_i = \frac{\mathbf{x}_{0i}^T(\delta\mathbf{K} - \lambda_{0i}\,\delta\mathbf{M})\mathbf{x}_{0i}}{\mathbf{x}_{0i}^T\mathbf{M}_0\mathbf{x}_{0i}}.$$

But by (2), this denominator is equal to 1. Thus

$$\delta\lambda_i = \mathbf{x}_{0i}^T(\delta\mathbf{K} - \lambda_{0i}\,\delta\mathbf{M})\mathbf{x}_{0i}.$$

Then, as $\lambda_i \neq \lambda_k$ for $i \neq k$ (simple-eigenvalue assumption), left-multiplying equation (5) by $\mathbf{x}_{0k}^T$ gives:

$$\varepsilon_{ik} = \frac{\mathbf{x}_{0k}^T(\delta\mathbf{K} - \lambda_{0i}\,\delta\mathbf{M})\mathbf{x}_{0i}}{\lambda_{0i} - \lambda_{0k}}, \quad i \neq k.$$

Or by changing the name of the indices:

$$\varepsilon_{ij} = \frac{\mathbf{x}_{0j}^T(\delta\mathbf{K} - \lambda_{0i}\,\delta\mathbf{M})\mathbf{x}_{0i}}{\lambda_{0i} - \lambda_{0j}}, \quad i \neq j.$$

To find $\varepsilon_{ii}$, use the fact that

$$\mathbf{x}_i^T\mathbf{M}\mathbf{x}_i = 1$$

implies:

$$\varepsilon_{ii} = -\tfrac{1}{2}\,\mathbf{x}_{0i}^T\,\delta\mathbf{M}\,\mathbf{x}_{0i}.$$

Summary of the first order perturbation result

In the case where all the matrices are Hermitian positive definite and all the eigenvalues are distinct,

$$\lambda_i = \lambda_{0i} + \mathbf{x}_{0i}^T(\delta\mathbf{K} - \lambda_{0i}\,\delta\mathbf{M})\mathbf{x}_{0i}$$
$$\mathbf{x}_i = \mathbf{x}_{0i}\left(1 - \tfrac{1}{2}\,\mathbf{x}_{0i}^T\,\delta\mathbf{M}\,\mathbf{x}_{0i}\right) + \sum_{\substack{j=1 \\ j \neq i}}^N \frac{\mathbf{x}_{0j}^T(\delta\mathbf{K} - \lambda_{0i}\,\delta\mathbf{M})\mathbf{x}_{0i}}{\lambda_{0i} - \lambda_{0j}}\,\mathbf{x}_{0j}$$

for infinitesimal δ𝐊 and δ𝐌 (the higher order terms in (3) being neglected).

So far, we have not proved that these higher-order terms may be neglected. This point may be established using the implicit function theorem; in the next section, we summarize the use of this theorem to obtain a first-order expansion.

Theoretical derivation

Perturbation of an implicit function.

In the next paragraph, we shall use the implicit function theorem; we notice that for a continuously differentiable function $f : \mathbb{R}^{n+m} \to \mathbb{R}^m$, $f : (x, y) \mapsto f(x, y)$, with an invertible Jacobian matrix $J_{f,y}(x_0, y_0)$, from a point $(x_0, y_0)$ solution of $f(x_0, y_0) = 0$, we get solutions of $f(x, y) = 0$ with $x$ close to $x_0$ in the form $y = g(x)$, where $g$ is a continuously differentiable function; moreover, the Jacobian matrix of $g$ is provided by the linear system

$$J_{f,y}(x, g(x))\,J_{g,x}(x) + J_{f,x}(x, g(x)) = 0. \qquad (6)$$

As soon as the hypotheses of the theorem are satisfied, the Jacobian matrix of $g$ may be computed from a first-order expansion of $f(x_0 + \delta x, y_0 + \delta y) = 0$; we get

$$J_{f,x}(x, g(x))\,\delta x + J_{f,y}(x, g(x))\,\delta y = 0;$$ as $\delta y = J_{g,x}(x)\,\delta x$, this is equivalent to equation (6).
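As a minimal illustration of this computation (with the hypothetical function $f(x, y) = x^2 + y^2 - 1$, which defines $y = g(x) = \sqrt{1 - x^2}$ near the point $(0.6, 0.8)$), equation (6) reduces to $g'(x) = -J_{f,x}/J_{f,y} = -x/y$:

```python
import numpy as np

# f(x, y) = x^2 + y^2 - 1 defines y = g(x) = sqrt(1 - x^2) near (0.6, 0.8).
x0 = 0.6
y0 = np.sqrt(1 - x0**2)  # 0.8

# Equation (6): J_{f,y} g'(x) + J_{f,x} = 0, i.e. 2*y0*g'(x0) + 2*x0 = 0.
g_prime = -(2 * x0) / (2 * y0)  # -x0 / y0 = -0.75

# Cross-check against a finite difference of g itself.
h = 1e-7
fd = (np.sqrt(1 - (x0 + h) ** 2) - y0) / h
print(g_prime, fd)  # both close to -0.75
```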

Eigenvalue perturbation: a theoretical basis.

We use the previous paragraph (Perturbation of an implicit function) with somewhat different notations suited to eigenvalue perturbation; we introduce $\tilde{f} : \mathbb{R}^{2n^2} \times \mathbb{R}^{n+1} \to \mathbb{R}^{n+1}$, with

$$\tilde{f}(K, M, \lambda, x) = \begin{pmatrix} f(K, M, \lambda, x) \\ f_{n+1}(M, x) \end{pmatrix},$$

where $f(K, M, \lambda, x) = Kx - \lambda Mx$ and $f_{n+1}(M, x) = x^T M x - 1$. In order to use the implicit function theorem, we study the invertibility of the Jacobian $J_{\tilde{f}; \lambda, x}(K, M; \lambda_{0i}, x_{0i})$ with

$$J_{\tilde{f}; \lambda, x}(K, M; \lambda_i, x_i)(\delta\lambda, \delta x) = \begin{pmatrix} -M x_i \\ 0 \end{pmatrix} \delta\lambda + \begin{pmatrix} K - \lambda M \\ 2\,x_i^T M \end{pmatrix} \delta x.$$ Indeed, the solution of

$$J_{\tilde{f}; \lambda, x}(K, M; \lambda_{0i}, x_{0i})(\delta\lambda_i, \delta x_i) = \begin{pmatrix} y \\ y_{n+1} \end{pmatrix}$$ may be derived with computations similar to the derivation of the expansion above:

$$\delta\lambda_i = -x_{0i}^T y, \qquad (\lambda_{0j} - \lambda_{0i})\,x_{0j}^T M\,\delta x_i = x_{0j}^T y, \quad j = 1, \dots, n,\ j \neq i,$$

or $x_{0j}^T M\,\delta x_i = x_{0j}^T y / (\lambda_{0j} - \lambda_{0i})$, and $2\,x_{0i}^T M\,\delta x_i = y_{n+1}$.

When $\lambda_i$ is a simple eigenvalue, the eigenvectors $x_{0j}, j = 1, \dots, n$, form an $M$-orthonormal basis, so we have obtained one solution for any right-hand side; therefore, the Jacobian is invertible.

The implicit function theorem provides a continuously differentiable function $(K, M) \mapsto (\lambda_i(K, M), x_i(K, M))$, hence the expansion with little-o notation:
$$\lambda_i = \lambda_{0i} + \delta\lambda_i + o(\|\delta K\| + \|\delta M\|), \qquad x_i = x_{0i} + \delta x_i + o(\|\delta K\| + \|\delta M\|),$$
with

$$\delta\lambda_i = \mathbf{x}_{0i}^T\,\delta\mathbf{K}\,\mathbf{x}_{0i} - \lambda_{0i}\,\mathbf{x}_{0i}^T\,\delta\mathbf{M}\,\mathbf{x}_{0i}, \qquad \delta\mathbf{x}_i = -\tfrac{1}{2}\left(\mathbf{x}_{0i}^T\,\delta\mathbf{M}\,\mathbf{x}_{0i}\right)\mathbf{x}_{0i} + \sum_{j \neq i} \left(\mathbf{x}_{0j}^T\mathbf{M}_0\,\delta\mathbf{x}_i\right)\mathbf{x}_{0j}$$

with

$$\mathbf{x}_{0j}^T\mathbf{M}_0\,\delta\mathbf{x}_i = \frac{-\,\mathbf{x}_{0j}^T\,\delta\mathbf{K}\,\mathbf{x}_{0i} + \lambda_{0i}\,\mathbf{x}_{0j}^T\,\delta\mathbf{M}\,\mathbf{x}_{0i}}{\lambda_{0j} - \lambda_{0i}}, \quad i = 1, \dots, n;\ j = 1, \dots, n;\ j \neq i.$$

This is the first-order expansion of the perturbed eigenvalues and eigenvectors, which is thereby proved.

Results of sensitivity analysis with respect to the entries of the matrices

The results

This means it is possible to efficiently do a sensitivity analysis on $\lambda_i$ as a function of changes in the entries of the matrices. (Recall that the matrices are symmetric, so changing $\mathbf{K}_{(k\ell)}$ will also change $\mathbf{K}_{(\ell k)}$, hence the $(2 - \delta_{k\ell})$ term.)

$$\frac{\partial \lambda_i}{\partial \mathbf{K}_{(k\ell)}} = \frac{\partial}{\partial \mathbf{K}_{(k\ell)}}\left(\lambda_{0i} + \mathbf{x}_{0i}^T(\delta\mathbf{K} - \lambda_{0i}\,\delta\mathbf{M})\mathbf{x}_{0i}\right) = x_{0i(k)}\,x_{0i(\ell)}\,(2 - \delta_{k\ell})$$
$$\frac{\partial \lambda_i}{\partial \mathbf{M}_{(k\ell)}} = \frac{\partial}{\partial \mathbf{M}_{(k\ell)}}\left(\lambda_{0i} + \mathbf{x}_{0i}^T(\delta\mathbf{K} - \lambda_{0i}\,\delta\mathbf{M})\mathbf{x}_{0i}\right) = -\lambda_{0i}\,x_{0i(k)}\,x_{0i(\ell)}\,(2 - \delta_{k\ell}).$$

Similarly

$$\frac{\partial \mathbf{x}_i}{\partial \mathbf{K}_{(k\ell)}} = \sum_{\substack{j=1 \\ j \neq i}}^N \frac{x_{0j(k)}\,x_{0i(\ell)}\,(2 - \delta_{k\ell})}{\lambda_{0i} - \lambda_{0j}}\,\mathbf{x}_{0j}$$
$$\frac{\partial \mathbf{x}_i}{\partial \mathbf{M}_{(k\ell)}} = -\mathbf{x}_{0i}\,\frac{x_{0i(k)}\,x_{0i(\ell)}}{2}(2 - \delta_{k\ell}) - \sum_{\substack{j=1 \\ j \neq i}}^N \frac{\lambda_{0i}\,x_{0j(k)}\,x_{0i(\ell)}}{\lambda_{0i} - \lambda_{0j}}\,\mathbf{x}_{0j}\,(2 - \delta_{k\ell}).$$
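These entry-wise derivatives can be checked by finite differences. The sketch below (arbitrary random matrices; the eigenvalue index and entry $(k, \ell)$ are arbitrary choices) perturbs the symmetric pair $K_{(k\ell)}, K_{(\ell k)}$ together, as the $(2 - \delta_{k\ell})$ factor assumes:

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(2)
n = 3
X = rng.standard_normal((n, n)); K0 = X @ X.T + n * np.eye(n)
Y = rng.standard_normal((n, n)); M0 = Y @ Y.T + n * np.eye(n)
lam0, X0 = eigh(K0, M0)  # M0-orthonormal eigenvectors

i, k, ell = 1, 0, 2  # eigenvalue index and matrix entry (arbitrary choices)
h = 1e-7

# Move K[k, ell] and K[ell, k] together so that K stays symmetric.
dK = np.zeros((n, n)); dK[k, ell] += h; dK[ell, k] += h
lam_p, _ = eigh(K0 + dK, M0)

fd = (lam_p[i] - lam0[i]) / h                       # finite difference
formula = X0[k, i] * X0[ell, i] * (2 - (k == ell))  # x0i(k) x0i(l) (2 - delta_kl)
print(fd, formula)
```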

Eigenvalue sensitivity, a small example

A simple case is $K = \begin{bmatrix} 2 & b \\ b & 0 \end{bmatrix}$; its eigenvalues and eigenvectors can be computed by hand or with online tools such as WIMS or with SageMath. The smallest eigenvalue is $\lambda = 1 - \sqrt{b^2 + 1}$, and an explicit computation gives $\frac{\partial \lambda}{\partial b} = \frac{-b}{\sqrt{b^2 + 1}}$. Moreover, an associated eigenvector is $\tilde{x}_0 = \left[b,\; -\left(\sqrt{b^2 + 1} + 1\right)\right]^T$; it is not a unit vector, so $x_{01} x_{02} = \tilde{x}_{01}\tilde{x}_{02} / \|\tilde{x}_0\|^2$. We get $\|\tilde{x}_0\|^2 = 2\sqrt{b^2 + 1}\left(\sqrt{b^2 + 1} + 1\right)$ and $\tilde{x}_{01}\tilde{x}_{02} = -b\left(\sqrt{b^2 + 1} + 1\right)$; hence $x_{01} x_{02} = \frac{-b}{2\sqrt{b^2 + 1}}$. For this example, we have checked that $\frac{\partial \lambda}{\partial b} = 2 x_{01} x_{02}$, or $\delta\lambda = 2 x_{01} x_{02}\,\delta b$.
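The computations above can be reproduced numerically (the value $b = 0.7$ is an arbitrary choice); note that the product $x_{01} x_{02}$ is invariant under the sign ambiguity of the computed eigenvector:

```python
import numpy as np

b = 0.7
K = np.array([[2.0, b],
              [b, 0.0]])
lam, V = np.linalg.eigh(K)  # ascending order: lam[0] is the smallest eigenvalue
x0 = V[:, 0]                # unit eigenvector of the smallest eigenvalue

print(lam[0], 1 - np.sqrt(b**2 + 1))              # closed form for the smallest eigenvalue
print(-b / np.sqrt(b**2 + 1), 2 * x0[0] * x0[1])  # d(lambda)/db = 2 x01 x02
```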

Existence of eigenvectors

Note that in the above example we assumed that both the unperturbed and the perturbed systems involved symmetric matrices, which guaranteed the existence of N linearly independent eigenvectors. An eigenvalue problem involving non-symmetric matrices is not guaranteed to have N linearly independent eigenvectors, though a sufficient condition is that 𝐊 and 𝐌 be simultaneously diagonalizable.

The case of repeated eigenvalues

A technical report of Rellich [4] on perturbation of eigenvalue problems provides several examples; the elementary ones are in chapter 2. The report may be downloaded from archive.org. We reproduce an example in which the eigenvectors behave badly.

Example 1

Consider the matrices
$$B(\epsilon) = \begin{bmatrix} \cos(2/\epsilon) & \sin(2/\epsilon) \\ \sin(2/\epsilon) & -\cos(2/\epsilon) \end{bmatrix}, \qquad A(\epsilon) = I - e^{-1/\epsilon^2} B(\epsilon), \qquad A(0) = I.$$
For $\epsilon \neq 0$, the matrix $A(\epsilon)$ has eigenvectors
$$\Phi_1 = [\cos(1/\epsilon), \sin(1/\epsilon)]^T, \qquad \Phi_2 = [-\sin(1/\epsilon), \cos(1/\epsilon)]^T$$
belonging to the eigenvalues $\lambda_1 = 1 - e^{-1/\epsilon^2}$ and $\lambda_2 = 1 + e^{-1/\epsilon^2}$. Since $\lambda_1 \neq \lambda_2$ for $\epsilon \neq 0$, if $u_j(\epsilon), j = 1, 2$, are any normalized eigenvectors belonging to $\lambda_j(\epsilon), j = 1, 2$, respectively, then $u_j = e^{i\alpha_j(\epsilon)} \Phi_j(\epsilon)$, where the $\alpha_j, j = 1, 2$, are real for $\epsilon \neq 0$. It is obviously impossible to define $\alpha_1(\epsilon)$, say, in such a way that $u_1(\epsilon)$ tends to a limit as $\epsilon \to 0$, because the modulus of its first component, $|\cos(1/\epsilon)|$, has no limit as $\epsilon \to 0$.
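The oscillation can be observed numerically (the values of $\epsilon$ below are arbitrary, chosen large enough that $e^{-1/\epsilon^2}$ stays well above machine precision):

```python
import numpy as np

def A(eps):
    c2, s2 = np.cos(2 / eps), np.sin(2 / eps)
    B = np.array([[c2, s2],
                  [s2, -c2]])
    return np.eye(2) - np.exp(-1 / eps**2) * B

# A(eps) converges smoothly to I, yet the eigenvector of the smaller
# eigenvalue stays aligned with [cos(1/eps), sin(1/eps)], which keeps rotating.
for eps in (0.3, 0.25):
    lam, V = np.linalg.eigh(A(eps))  # ascending eigenvalues
    pred = np.array([np.cos(1 / eps), np.sin(1 / eps)])
    print(eps, abs(V[:, 0] @ pred))  # ~1: computed eigenvector is +/- the prediction
```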

Note in this example that $A_{jk}(\epsilon)$ is not only continuous but also has continuous derivatives of all orders. Rellich draws the following important consequence: "Since in general the individual eigenvectors do not depend continuously on the perturbation parameter even though the operator $A(\epsilon)$ does, it is necessary to work, not with an eigenvector, but rather with the space spanned by all the eigenvectors belonging to the same eigenvalue."

Example 2

This example is less pathological than the previous one. Suppose $[K_0]$ is the $2 \times 2$ identity matrix; then any vector is an eigenvector, and $u_0 = [1, 1]^T/\sqrt{2}$ is one possible eigenvector. But if one makes a small perturbation, such as

$$[K] = [K_0] + \begin{bmatrix} \epsilon & 0 \\ 0 & 0 \end{bmatrix},$$

then the eigenvectors are $v_1 = [1, 0]^T$ and $v_2 = [0, 1]^T$; they are constant with respect to $\epsilon$, so that $\|u_0 - v_1\|$ is constant and does not go to zero.
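A short computation confirms this behavior (the values of $\epsilon$ are arbitrary):

```python
import numpy as np

u0 = np.array([1.0, 1.0]) / np.sqrt(2)  # an eigenvector of the unperturbed identity

for eps in (1e-2, 1e-6, 1e-10):
    K = np.eye(2) + np.array([[eps, 0.0],
                              [0.0, 0.0]])
    lam, V = np.linalg.eigh(K)
    v1 = V[:, 1] * np.sign(V[0, 1])  # eigenvector of 1 + eps, sign fixed
    # The distance stays at sqrt(2 - sqrt(2)) ~ 0.765, however small eps is.
    print(eps, np.linalg.norm(u0 - v1))
```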


References

Further reading

Journal papers

  • Simon, B. (1982). Large orders and summability of eigenvalue perturbation theory: a mathematical overview. International Journal of Quantum Chemistry, 21(1), 3–25.
  • Crandall, M. G., & Rabinowitz, P. H. (1973). Bifurcation, perturbation of simple eigenvalues, and linearized stability. Archive for Rational Mechanics and Analysis, 52(2), 161–180.
  • Stewart, G. W. (1973). Error and perturbation bounds for subspaces associated with certain eigenvalue problems. SIAM Review, 15(4), 727–764.
  • Löwdin, P. O. (1962). Studies in perturbation theory. IV. Solution of eigenvalue problem by projection operator formalism. Journal of Mathematical Physics, 3(5), 969–982.