Weyr canonical form


Latest revision as of 00:27, 31 January 2025


The image shows an example of a general Weyr matrix consisting of two blocks each of which is a basic Weyr matrix. The basic Weyr matrix in the top-left corner has the structure (4,2,1) and the other one has the structure (2,2,1,1).

In linear algebra, a Weyr canonical form (also called a Weyr form or Weyr matrix) is a square matrix which (in some sense) induces "nice" properties with matrices it commutes with. It also has a particularly simple structure, and the conditions for possessing a Weyr form are fairly weak, making it a suitable tool for studying classes of commuting matrices. A square matrix is said to be in Weyr canonical form if it has the block structure defined below. The Weyr form was discovered by the Czech mathematician Eduard Weyr in 1885.[1][2][3] It did not become popular among mathematicians and was overshadowed by the closely related, but distinct, Jordan canonical form.[3] The Weyr form has been rediscovered several times since Weyr's original discovery.[4] It has been variously called the modified Jordan form, reordered Jordan form, second Jordan form, and H-form.[4] The current terminology is credited to Shapiro, who introduced it in a paper published in the American Mathematical Monthly in 1999.[4][5]

Recently several applications have been found for the Weyr matrix. Of particular interest is an application of the Weyr matrix in the study of phylogenetic invariants in biomathematics.

Definitions

Basic Weyr matrix

Definition

A basic Weyr matrix with eigenvalue $\lambda$ is an $n \times n$ matrix $W$ of the following form: There is an integer partition

$$n_1 + n_2 + \cdots + n_r = n \quad \text{of } n \text{ with} \quad n_1 \ge n_2 \ge \cdots \ge n_r \ge 1$$

such that, when $W$ is viewed as an $r \times r$ block matrix $(W_{ij})$, where the $(i,j)$ block $W_{ij}$ is an $n_i \times n_j$ matrix, the following three features are present:

  1. The main diagonal blocks $W_{ii}$ are the $n_i \times n_i$ scalar matrices $\lambda I$ for $i = 1, \ldots, r$.
  2. The first superdiagonal blocks $W_{i,i+1}$ are full column rank $n_i \times n_{i+1}$ matrices in reduced row-echelon form (that is, an identity matrix followed by zero rows) for $i = 1, \ldots, r-1$.
  3. All other blocks of $W$ are zero (that is, $W_{ij} = 0$ when $j \ne i, i+1$).

In this case, we say that $W$ has Weyr structure $(n_1, n_2, \ldots, n_r)$.
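The three conditions above translate directly into a construction. The following is a minimal sketch in plain Python, assuming matrices are stored as lists of rows; the function name `basic_weyr_matrix` and its parameters are illustrative and not part of any library.

```python
def basic_weyr_matrix(structure, eigenvalue):
    """Return the basic Weyr matrix with the given structure as a list of rows."""
    # The structure must be a partition: non-increasing positive integers.
    if any(a < b for a, b in zip(structure, structure[1:])) or min(structure) < 1:
        raise ValueError("structure must be a non-increasing sequence of positive integers")
    n = sum(structure)
    W = [[0] * n for _ in range(n)]
    # Condition 1: diagonal blocks W_ii = eigenvalue * I.
    for k in range(n):
        W[k][k] = eigenvalue
    # offsets[i] is the index of the first row/column of block i.
    offsets = [sum(structure[:i]) for i in range(len(structure))]
    # Condition 2: each superdiagonal block W_{i,i+1} is an identity matrix
    # of size n_{i+1} stacked on top of zero rows.
    for i in range(len(structure) - 1):
        for k in range(structure[i + 1]):
            W[offsets[i] + k][offsets[i + 1] + k] = 1
    # Condition 3: every other entry stays zero.
    return W


# Illustrative call: Weyr structure (4, 2, 2, 1) with eigenvalue 5.
W = basic_weyr_matrix((4, 2, 2, 1), 5)
```

Storing the block offsets once keeps the index arithmetic for the superdiagonal blocks in one place; a sparse representation would be preferable for large matrices.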

Example

The following is an example of a basic Weyr matrix.

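[Figure: a basic Weyr matrix with structure (4,2,2,1)]

$$W = \begin{bmatrix} W_{11} & W_{12} & & \\ & W_{22} & W_{23} & \\ & & W_{33} & W_{34} \\ & & & W_{44} \end{bmatrix}$$

In this matrix, $n = 9$ and $n_1 = 4$, $n_2 = 2$, $n_3 = 2$, $n_4 = 1$. So $W$ has the Weyr structure $(4,2,2,1)$. Also,

$$W_{11} = \begin{bmatrix} \lambda & 0 & 0 & 0 \\ 0 & \lambda & 0 & 0 \\ 0 & 0 & \lambda & 0 \\ 0 & 0 & 0 & \lambda \end{bmatrix} = \lambda I_4, \quad W_{22} = \begin{bmatrix} \lambda & 0 \\ 0 & \lambda \end{bmatrix} = \lambda I_2, \quad W_{33} = \begin{bmatrix} \lambda & 0 \\ 0 & \lambda \end{bmatrix} = \lambda I_2, \quad W_{44} = \begin{bmatrix} \lambda \end{bmatrix} = \lambda I_1$$

and

$$W_{12} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}, \quad W_{23} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad W_{34} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}.$$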
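Assembling the diagonal and superdiagonal blocks listed above according to the definition gives the full matrix entry by entry. The display below is a reconstruction from those blocks for the structure (4,2,2,1), offered as a worked check rather than a reproduction of the original figure:

$$W = \begin{bmatrix}
\lambda & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & \lambda & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & \lambda & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & \lambda & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \lambda & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & \lambda & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & \lambda & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & \lambda & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \lambda
\end{bmatrix}$$

The five off-diagonal 1's come from the identity parts of the three superdiagonal blocks: two from the 4×2 block, two from the 2×2 block, and one from the 2×1 block.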

General Weyr matrix

Definition

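Let $W$ be a square matrix and let $\lambda_1, \ldots, \lambda_k$ be the distinct eigenvalues of $W$. We say that $W$ is in Weyr form (or is a Weyr matrix) if $W$ has the following form:

$$W = \begin{bmatrix} W_1 & & & \\ & W_2 & & \\ & & \ddots & \\ & & & W_k \end{bmatrix}$$

where $W_i$ is a basic Weyr matrix with eigenvalue $\lambda_i$ for $i = 1, \ldots, k$.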

Example

The following image shows an example of a general Weyr matrix consisting of three basic Weyr matrix blocks. The basic Weyr matrix in the top-left corner has the structure (4,2,1) with eigenvalue 4, the middle block has structure (2,2,1,1) with eigenvalue -3 and the one in the lower-right corner has the structure (3, 2) with eigenvalue 0.
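A general Weyr matrix is a block diagonal arrangement of basic Weyr matrices with distinct eigenvalues. The sketch below, in plain Python with matrices as lists of rows, assembles such a matrix from blocks that are assumed to be basic Weyr matrices already; the name `general_weyr_matrix` is hypothetical, and the two small blocks are illustrative rather than the ones in the image.

```python
def general_weyr_matrix(blocks):
    """Place basic Weyr blocks (lists of rows) along the diagonal of a zero matrix."""
    # A basic Weyr block carries its eigenvalue on the diagonal, so the
    # top-left entry identifies it. The eigenvalues must be distinct.
    eigenvalues = [block[0][0] for block in blocks]
    if len(set(eigenvalues)) != len(eigenvalues):
        raise ValueError("basic Weyr blocks must have distinct eigenvalues")
    n = sum(len(block) for block in blocks)
    M = [[0] * n for _ in range(n)]
    offset = 0
    for block in blocks:
        for i, row in enumerate(block):
            for j, entry in enumerate(row):
                M[offset + i][offset + j] = entry
        offset += len(block)
    return M


W1 = [[4, 0, 1],
      [0, 4, 0],
      [0, 0, 4]]   # basic Weyr block, structure (2, 1), eigenvalue 4
W2 = [[-3, 1],
      [0, -3]]     # basic Weyr block, structure (1, 1), eigenvalue -3
W = general_weyr_matrix([W1, W2])
```

The function does not verify that each block is itself a basic Weyr matrix; it only enforces the distinct-eigenvalue requirement from the definition.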

Relation between Weyr and Jordan forms

The Weyr canonical form $W = P^{-1} J P$ is related to the Jordan form $J$ by a simple permutation $P$ for each Weyr basic block as follows: The first index of each Weyr subblock forms the largest Jordan chain. After crossing out these rows and columns, the first index of each new subblock forms the second largest Jordan chain, and so forth.[6]
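One consequence of this chain-by-chain correspondence is that, for a single eigenvalue, the Weyr structure and the list of Jordan block sizes (in non-increasing order) are conjugate partitions of each other. The sketch below computes the conjugate partition in plain Python; the function name is illustrative.

```python
def conjugate_partition(partition):
    """Return the conjugate (dual) of a partition given in non-increasing order."""
    if not partition:
        return ()
    # The k-th part of the conjugate counts the parts that are larger than k.
    return tuple(sum(1 for part in partition if part > k)
                 for k in range(partition[0]))


# Weyr structure (4, 2, 2, 1) corresponds to Jordan blocks of sizes 4, 3, 1, 1.
jordan_sizes = conjugate_partition((4, 2, 2, 1))   # (4, 3, 1, 1)
```

The largest Jordan block has size 4 because the Weyr structure has four parts, which matches the description of the first index of each subblock forming the longest chain.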

The Weyr form is canonical

That the Weyr form is a canonical form of a matrix is a consequence of the following result:[3] Each square matrix A over an algebraically closed field is similar to a Weyr matrix W which is unique up to permutation of its basic blocks. The matrix W is called the Weyr (canonical) form of A.

Computation of the Weyr canonical form

Reduction to the nilpotent case

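Let $A$ be a square matrix of order $n$ over an algebraically closed field and let the distinct eigenvalues of $A$ be $\lambda_1, \lambda_2, \ldots, \lambda_k$. The Jordan–Chevalley decomposition theorem states that $A$ is similar to a block diagonal matrix of the form

$$A = \begin{bmatrix} \lambda_1 I + N_1 & & & \\ & \lambda_2 I + N_2 & & \\ & & \ddots & \\ & & & \lambda_k I + N_k \end{bmatrix} = \begin{bmatrix} \lambda_1 I & & & \\ & \lambda_2 I & & \\ & & \ddots & \\ & & & \lambda_k I \end{bmatrix} + \begin{bmatrix} N_1 & & & \\ & N_2 & & \\ & & \ddots & \\ & & & N_k \end{bmatrix} = D + N$$

where $D$ is a diagonal matrix, $N$ is a nilpotent matrix, and $[D, N] = 0$, justifying the reduction of $N$ into subblocks $N_i$. So the problem of reducing $A$ to the Weyr form reduces to the problem of reducing the nilpotent matrices $N_i$ to the Weyr form. This leads to the generalized eigenspace decomposition theorem.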
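As a small worked instance of this decomposition (an illustrative matrix chosen here, with eigenvalues 2 and 5), take

$$A = \begin{bmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 5 \end{bmatrix} = \underbrace{\begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 5 \end{bmatrix}}_{D} + \underbrace{\begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}}_{N}, \qquad N_1 = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \quad N_2 = \begin{bmatrix} 0 \end{bmatrix}.$$

Here $N^2 = 0$, and

$$DN = ND = \begin{bmatrix} 0 & 2 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},$$

so $[D, N] = 0$. Only the nilpotent blocks $N_1$ and $N_2$ remain to be brought to Weyr form.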

Reduction of a nilpotent matrix to the Weyr form

Given a nilpotent square matrix $A$ of order $n$ over an algebraically closed field $F$, the following algorithm produces an invertible matrix $C$ and a Weyr matrix $W$ such that $W = C^{-1} A C$.

Step 1

Let $A_1 = A$.

Step 2

  1. Compute a basis for the null space of $A_1$.
  2. Extend the basis for the null space of $A_1$ to a basis for the $n$-dimensional vector space $F^n$.
  3. Form the matrix $P_1$ consisting of these basis vectors.
  4. Compute $P_1^{-1} A_1 P_1 = \begin{bmatrix} 0 & B_2 \\ 0 & A_2 \end{bmatrix}$. $A_2$ is a square matrix of size $n$ − nullity($A_1$).

Step 3

If $A_2$ is nonzero, repeat Step 2 on $A_2$.

  1. Compute a basis for the null space of $A_2$.
  2. Extend the basis for the null space of $A_2$ to a basis for the vector space having dimension $n$ − nullity($A_1$).
  3. Form the matrix $P_2$ consisting of these basis vectors.
  4. Compute $P_2^{-1} A_2 P_2 = \begin{bmatrix} 0 & B_3 \\ 0 & A_3 \end{bmatrix}$. $A_3$ is a square matrix of size $n$ − nullity($A_1$) − nullity($A_2$).

Step 4

Continue the processes of Steps 2 and 3 to obtain increasingly smaller square matrices $A_1, A_2, A_3, \ldots$ and associated invertible matrices $P_1, P_2, P_3, \ldots$ until the first zero matrix $A_r$ is obtained.

Step 5

The Weyr structure of $A$ is $(n_1, n_2, \ldots, n_r)$ where $n_i$ = nullity($A_i$).
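When only the Weyr structure is needed, and not the transforming matrices, the nullities can be read off from powers of $A$, because nullity($A_i$) equals nullity($A^i$) − nullity($A^{i-1}$). The sketch below uses this rank-based shortcut instead of the block-by-block reduction of Steps 2 to 4. It works in exact rational arithmetic with Python's `fractions` module to avoid the rank misjudgements that floating point can cause for nilpotent matrices; the helper names `rank`, `matmul` and `weyr_structure` are illustrative.

```python
from fractions import Fraction


def rank(matrix):
    """Rank by Gaussian elimination over exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def weyr_structure(A):
    """Weyr structure of a nilpotent matrix A via nullities of its powers."""
    n = len(A)
    structure = []
    power = A              # holds A^i on the i-th pass
    previous_nullity = 0   # nullity of A^(i-1), with A^0 = I
    while previous_nullity < n:
        nullity = n - rank(power)
        if nullity == previous_nullity:
            raise ValueError("matrix is not nilpotent")
        structure.append(nullity - previous_nullity)
        previous_nullity = nullity
        power = matmul(power, A)
    return tuple(structure)
```

For the nilpotent part of a basic Weyr matrix with structure (4,2,2,1), the successive nullities are 4, 6, 8 and 9, so the function returns that same structure.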

Step 6

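  1. Compute the matrix $P = P_1 \begin{bmatrix} I & 0 \\ 0 & P_2 \end{bmatrix} \begin{bmatrix} I & 0 \\ 0 & P_3 \end{bmatrix} \cdots \begin{bmatrix} I & 0 \\ 0 & P_r \end{bmatrix}$ (here the $I$'s are appropriately sized identity matrices).
  2. Compute $X = P^{-1} A P$. $X$ is a matrix of the following form:

$$X = \begin{bmatrix} 0 & X_{12} & X_{13} & \cdots & X_{1,r-1} & X_{1r} \\ & 0 & X_{23} & \cdots & X_{2,r-1} & X_{2r} \\ & & \ddots & & & \vdots \\ & & & & 0 & X_{r-1,r} \\ & & & & & 0 \end{bmatrix}.$$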

Step 7

Use elementary row operations to find an invertible matrix $Y_{r-1}$ of appropriate size such that the product $Y_{r-1} X_{r,r-1}$ is a matrix of the form $I_{r,r-1} = \begin{bmatrix} I \\ O \end{bmatrix}$.

Step 8

Set $Q_1 = \operatorname{diag}(I, I, \ldots, Y_{r-1}^{-1}, I)$ and compute $Q_1^{-1} X Q_1$. In this matrix, the $(r, r-1)$-block is $I_{r,r-1}$.

Step 9

Find a matrix $R_1$ formed as a product of elementary matrices such that $R_1^{-1} Q_1^{-1} X Q_1 R_1$ is a matrix in which all the blocks above the block $I_{r,r-1}$ contain only 0's.

Step 10

Repeat Steps 8 and 9 on column $r-1$, converting the $(r-1, r-2)$-block to $I_{r-1,r-2}$ via conjugation by some invertible matrix $Q_2$. Use this block to clear out the blocks above, via conjugation by a product $R_2$ of elementary matrices.

Step 11

Repeat these processes on columns $r-2, r-3, \ldots, 3, 2$, using conjugations by $Q_3, R_3, \ldots, Q_{r-2}, R_{r-2}, Q_{r-1}$. The resulting matrix $W$ is now in Weyr form.

Step 12

Let $C = P_1 \operatorname{diag}(I, P_2) \cdots \operatorname{diag}(I, P_{r-1}) Q_1 R_1 Q_2 \cdots R_{r-2} Q_{r-1}$. Then $W = C^{-1} A C$.
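The output of the algorithm can be checked without computing an inverse, since $W = C^{-1} A C$ holds exactly when $C$ is invertible and $AC = CW$. The sketch below performs that check in exact arithmetic on a small hand-built example: a nilpotent matrix with Jordan blocks of sizes 2 and 1, its Weyr form with structure (2,1), and a permutation matrix relating them. All names, and the example matrices, are illustrative assumptions.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def is_invertible(C):
    """Exact Gaussian elimination: True when every column yields a pivot."""
    rows = [[Fraction(x) for x in row] for row in C]
    n = len(rows)
    for c in range(n):
        pivot = next((i for i in range(c, n) if rows[i][c] != 0), None)
        if pivot is None:
            return False
        rows[c], rows[pivot] = rows[pivot], rows[c]
        for i in range(c + 1, n):
            factor = rows[i][c] / rows[c][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[c])]
    return True


def conjugates_to(A, C, W):
    """Check W = C^{-1} A C in the inverse-free form A C = C W."""
    return is_invertible(C) and matmul(A, C) == matmul(C, W)


A = [[0, 1, 0],
     [0, 0, 0],
     [0, 0, 0]]   # nilpotent, Jordan blocks of sizes 2 and 1
W = [[0, 0, 1],
     [0, 0, 0],
     [0, 0, 0]]   # Weyr form with structure (2, 1)
C = [[1, 0, 0],
     [0, 0, 1],
     [0, 1, 0]]   # permutation swapping the second and third basis vectors
ok = conjugates_to(A, C, W)
```

The check confirms only the similarity relation; whether $W$ is in Weyr form has to be verified separately against the definition.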

Applications of the Weyr form

Some well-known applications of the Weyr form are listed below:[3]

  1. The Weyr form can be used to simplify the proof of Gerstenhaber's Theorem, which asserts that the subalgebra generated by two commuting $n \times n$ matrices has dimension at most $n$.
  2. A finite set of matrices is said to be approximately simultaneously diagonalizable if they can be perturbed to simultaneously diagonalizable matrices. The Weyr form is used to prove approximate simultaneous diagonalizability of various classes of matrices. The approximate simultaneous diagonalizability property has applications in the study of phylogenetic invariants in biomathematics.
  3. The Weyr form can be used to simplify the proofs of the irreducibility of the variety of all k-tuples of commuting complex matrices.

References

Template:Reflist