Scatter matrix


For the notion in quantum mechanics, see scattering matrix.

In multivariate statistics and probability theory, the scatter matrix is a statistic used to estimate the covariance matrix, for instance that of a multivariate normal distribution.

Definition

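Given n samples of m-dimensional data, represented as the m-by-n matrix $X = [\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n]$, the sample mean is

$$\overline{\mathbf{x}} = \frac{1}{n} \sum_{j=1}^{n} \mathbf{x}_j$$

where $\mathbf{x}_j$ is the j-th column of $X$.[1]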

The scatter matrix is the m-by-m positive semi-definite matrix

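$$S = \sum_{j=1}^{n} (\mathbf{x}_j - \overline{\mathbf{x}})(\mathbf{x}_j - \overline{\mathbf{x}})^T = \sum_{j=1}^{n} (\mathbf{x}_j - \overline{\mathbf{x}}) \otimes (\mathbf{x}_j - \overline{\mathbf{x}}) = \left( \sum_{j=1}^{n} \mathbf{x}_j \mathbf{x}_j^T \right) - n\, \overline{\mathbf{x}}\, \overline{\mathbf{x}}^T$$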

where $(\cdot)^T$ denotes matrix transpose,[2] and each product of a column vector with a row vector is an outer product, yielding an m-by-m matrix. The scatter matrix may be expressed more succinctly as

$$S = X C_n X^T$$

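where $C_n = I_n - \frac{1}{n} \mathbf{1} \mathbf{1}^T$ is the n-by-n centering matrix, with $\mathbf{1}$ the all-ones column vector of length n.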
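
The three expressions for the scatter matrix can be checked numerically. The following is a minimal sketch using NumPy, assuming the data are stored with one sample per column as in the definition above. The function names and the small example matrix are illustrative and do not come from the article.

```python
import numpy as np


def scatter_outer(X):
    """Scatter matrix as a sum of outer products of centered columns."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=1, keepdims=True)  # m-by-1 sample mean
    D = X - mean                          # columns are x_j - mean
    return D @ D.T                        # equals sum_j d_j d_j^T


def scatter_moments(X):
    """Scatter matrix as (sum_j x_j x_j^T) - n * mean mean^T."""
    X = np.asarray(X, dtype=float)
    n = X.shape[1]
    mean = X.mean(axis=1, keepdims=True)
    return X @ X.T - n * (mean @ mean.T)


def scatter_centering(X):
    """Scatter matrix as X C_n X^T with C_n = I - (1/n) 1 1^T."""
    X = np.asarray(X, dtype=float)
    n = X.shape[1]
    C = np.eye(n) - np.ones((n, n)) / n   # n-by-n centering matrix
    return X @ C @ X.T


# Illustrative data: m = 2 dimensions, n = 4 samples (one per column).
X = np.array([[1.0, 2.0, 3.0, 4.0],
              [2.0, 1.0, 4.0, 3.0]])

S = scatter_outer(X)
print(S)
```

For this example all three functions agree up to floating-point rounding. The centering-matrix form is compact on paper, but it builds an n-by-n matrix, so for large n the first form is usually the cheaper one to compute.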

Application

The maximum likelihood estimate, given n samples, for the covariance matrix of a multivariate normal distribution can be expressed as the normalized scatter matrix

$$C_{ML} = \frac{1}{n} S.$$ [3]
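
As a rough illustration, the normalized scatter matrix can be compared with NumPy's built-in covariance estimate. This sketch assumes the same column-per-sample layout as the definition. The helper name `ml_covariance` and the example data are hypothetical. `np.cov` treats rows as variables by default, and `bias=True` makes it divide by n rather than n - 1.

```python
import numpy as np


def ml_covariance(X):
    """Scatter matrix divided by n, for an m-by-n matrix of column samples."""
    X = np.asarray(X, dtype=float)
    n = X.shape[1]
    D = X - X.mean(axis=1, keepdims=True)  # center each row (variable)
    S = D @ D.T                            # scatter matrix
    return S / n


# Illustrative data: m = 2 dimensions, n = 4 samples (one per column).
X = np.array([[1.0, 2.0, 3.0, 4.0],
              [2.0, 1.0, 4.0, 3.0]])

C_ml = ml_covariance(X)
C_np = np.cov(X, bias=True)  # rows are variables; normalizes by n
print(np.allclose(C_ml, C_np))
```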

When the columns of X are sampled independently from a multivariate normal distribution, S has a Wishart distribution.
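
One consequence can be written out explicitly. The symbols $\boldsymbol{\mu}$ and $\Sigma$ below are not used elsewhere in this article; they stand for the mean and covariance of the sampling distribution, assuming the columns of X are independent draws from $N_m(\boldsymbol{\mu}, \Sigma)$. Because the sample mean is estimated from the data, the Wishart distribution has n - 1 degrees of freedom:

```latex
S \sim W_m(\Sigma,\; n-1), \qquad
\operatorname{E}[S] = (n-1)\,\Sigma, \qquad
\operatorname{E}[C_{ML}] = \frac{1}{n}\operatorname{E}[S] = \frac{n-1}{n}\,\Sigma .
```

Under these assumptions the maximum likelihood estimate is biased by the factor (n - 1)/n, whereas dividing S by n - 1 gives an unbiased estimate of $\Sigma$.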

See also

References
