Normal-inverse-Wishart distribution

In probability theory and statistics, the normal-inverse-Wishart distribution (or Gaussian-inverse-Wishart distribution) is a multivariate four-parameter family of continuous probability distributions. It is the conjugate prior of a multivariate normal distribution with unknown mean and covariance matrix (the inverse of the precision matrix).[1]

Definition

Suppose

๐|๐0,ฮป,๐œฎโˆผ๐’ฉ(๐|๐0,1ฮป๐œฎ)

has a multivariate normal distribution with mean ๐0 and covariance matrix 1ฮป๐œฎ, where

๐œฎ|๐œณ,ฮฝโˆผ๐’ฒโˆ’1(๐œฎ|๐œณ,ฮฝ)

has an inverse Wishart distribution. Then (๐,๐œฎ) has a normal-inverse-Wishart distribution, denoted as

(๐,๐œฎ)โˆผNIW(๐0,ฮป,๐œณ,ฮฝ).

Characterization

Probability density function

f(๐,๐œฎ|๐0,ฮป,๐œณ,ฮฝ)=๐’ฉ(๐|๐0,1ฮป๐œฎ)๐’ฒโˆ’1(๐œฎ|๐œณ,ฮฝ)

The full version of the PDF is as follows:[2]

f(๐,๐œฎ|๐0,ฮป,๐œณ,ฮฝ)=ฮปD/2|๐œณ|ฮฝ/2|๐œฎ|โˆ’ฮฝ+D+22(2ฯ€)D/22ฮฝD2ฮ“D(ฮฝ2)exp{โˆ’12Tr(๐œณ๐œฎโˆ’1)โˆ’ฮป2(๐โˆ’๐0)T๐œฎโˆ’1(๐โˆ’๐0)}

Here ฮ“D[โ‹…] is the multivariate gamma function and Tr(๐œณ) is the Trace of the given matrix.
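The closed-form density can be checked numerically against the factorized form. The following is a minimal sketch, assuming NumPy and SciPy are available; the function name `niw_logpdf` and the test values are illustrative and not part of any library. It works on the log scale for numerical stability and takes D to be the dimension of the mean vector.

```python
import numpy as np
from scipy.special import multigammaln
from scipy.stats import invwishart, multivariate_normal


def niw_logpdf(mu, sigma, mu0, lam, psi, nu):
    """Log of the closed-form NIW density at (mu, sigma). Illustrative helper."""
    d = mu0.shape[0]
    sigma_inv = np.linalg.inv(sigma)
    diff = mu - mu0
    _, logdet_psi = np.linalg.slogdet(psi)
    _, logdet_sigma = np.linalg.slogdet(sigma)
    # Log of the normalizing constant in front of the exponential.
    log_norm = (
        0.5 * d * np.log(lam)
        + 0.5 * nu * logdet_psi
        - 0.5 * d * np.log(2.0 * np.pi)
        - 0.5 * nu * d * np.log(2.0)
        - multigammaln(0.5 * nu, d)
    )
    return (
        log_norm
        - 0.5 * (nu + d + 2) * logdet_sigma
        - 0.5 * np.trace(psi @ sigma_inv)
        - 0.5 * lam * diff @ sigma_inv @ diff
    )


# Arbitrary test point and hyperparameters (assumed values, D = 2).
mu0 = np.array([0.0, 1.0])
lam, nu = 2.0, 5.0
psi = np.array([[2.0, 0.3], [0.3, 1.0]])
mu = np.array([0.5, 0.2])
sigma = np.array([[1.0, 0.2], [0.2, 0.5]])

closed_form = niw_logpdf(mu, sigma, mu0, lam, psi, nu)
# Factorized form: normal density for mu given sigma, times inverse Wishart for sigma.
factorized = (
    multivariate_normal.logpdf(mu, mean=mu0, cov=sigma / lam)
    + invwishart.logpdf(sigma, df=nu, scale=psi)
)
print(np.isclose(closed_form, factorized))
```

The two values should agree up to floating-point error, since the closed form is just the product of the two factor densities.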

Properties

Scaling

Marginal distributions

By construction, the marginal distribution over $\boldsymbol\Sigma$ is an inverse Wishart distribution, and the conditional distribution over $\boldsymbol\mu$ given $\boldsymbol\Sigma$ is a multivariate normal distribution. The marginal distribution over $\boldsymbol\mu$ is a multivariate t-distribution.
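For concreteness, the t-distribution marginal can be written out with its parameters. This is stated here as the standard result for this parameterization rather than derived, with D denoting the dimension of the mean vector and assuming $\nu > D - 1$ so that the degrees of freedom are positive:

$$\boldsymbol\mu \sim t_{\nu - D + 1}\left(\boldsymbol\mu_0,\ \frac{\boldsymbol\Psi}{\lambda(\nu - D + 1)}\right)$$

that is, a multivariate t-distribution with $\nu - D + 1$ degrees of freedom, location $\boldsymbol\mu_0$, and scale matrix $\boldsymbol\Psi / (\lambda(\nu - D + 1))$.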

Posterior distribution of the parameters

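Suppose the sampling density is a multivariate normal distribution

$$\boldsymbol{y}_i \mid \boldsymbol\mu, \boldsymbol\Sigma \sim \mathcal{N}_p(\boldsymbol\mu, \boldsymbol\Sigma)$$

where $\boldsymbol{y}$ is an $n \times p$ matrix and $\boldsymbol{y}_i$ (of length $p$) is row $i$ of the matrix.

When the mean and covariance matrix of the sampling distribution are unknown, we can place a normal-inverse-Wishart prior on the mean and covariance parameters jointly

$$(\boldsymbol\mu, \boldsymbol\Sigma) \sim \mathrm{NIW}(\boldsymbol\mu_0, \lambda, \boldsymbol\Psi, \nu).$$

The resulting posterior distribution for the mean and covariance matrix will also be a normal-inverse-Wishart distribution

$$(\boldsymbol\mu, \boldsymbol\Sigma \mid \boldsymbol{y}) \sim \mathrm{NIW}(\boldsymbol\mu_n, \lambda_n, \boldsymbol\Psi_n, \nu_n),$$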

where

๐n=ฮป๐0+n๐’šยฏฮป+n
ฮปn=ฮป+n
ฮฝn=ฮฝ+n
๐œณn=๐œณ+๐‘บ+ฮปnฮป+n(๐’šยฏโˆ’๐0)(๐’šยฏโˆ’๐0)Twith๐‘บ=โˆ‘i=1n(๐’š๐’Šโˆ’๐’šยฏ)(๐’š๐’Šโˆ’๐’šยฏ)T.


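To sample from the joint posterior of $(\boldsymbol\mu, \boldsymbol\Sigma)$, first draw $\boldsymbol\Sigma \mid \boldsymbol{y} \sim \mathcal{W}^{-1}(\boldsymbol\Psi_n, \nu_n)$, then draw $\boldsymbol\mu \mid \boldsymbol\Sigma, \boldsymbol{y} \sim \mathcal{N}_p(\boldsymbol\mu_n, \boldsymbol\Sigma / \lambda_n)$. To draw from the posterior predictive distribution of a new observation, draw $\tilde{\boldsymbol{y}} \mid \boldsymbol\mu, \boldsymbol\Sigma, \boldsymbol{y} \sim \mathcal{N}_p(\boldsymbol\mu, \boldsymbol\Sigma)$, given the already drawn values of $\boldsymbol\mu$ and $\boldsymbol\Sigma$.[3]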
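The sampling scheme just described can be sketched as follows, assuming NumPy and SciPy and a dimension of at least 2 (so that `invwishart.rvs` returns a matrix). The function name `sample_posterior` and the posterior parameter values are hypothetical; in practice they would come from the update equations.

```python
import numpy as np
from scipy.stats import invwishart


def sample_posterior(mu_n, lam_n, psi_n, nu_n, size, rng):
    """Draw (mu, Sigma, y_new) triples from an NIW posterior and its predictive."""
    d = mu_n.shape[0]
    mus = np.empty((size, d))
    sigmas = np.empty((size, d, d))
    y_new = np.empty((size, d))
    for k in range(size):
        # Sigma | y ~ inverse Wishart(psi_n, nu_n)
        sigmas[k] = invwishart.rvs(df=nu_n, scale=psi_n, random_state=rng)
        # mu | Sigma, y ~ N(mu_n, Sigma / lam_n)
        mus[k] = rng.multivariate_normal(mu_n, sigmas[k] / lam_n)
        # y_new | mu, Sigma ~ N(mu, Sigma), using the values just drawn
        y_new[k] = rng.multivariate_normal(mus[k], sigmas[k])
    return mus, sigmas, y_new


# Hypothetical posterior parameters for a 2-dimensional problem.
rng = np.random.default_rng(0)
mu_n = np.array([2.0, -1.0])
psi_n = np.array([[3.0, 0.5], [0.5, 2.0]])
mus, sigmas, y_new = sample_posterior(
    mu_n, lam_n=5.0, psi_n=psi_n, nu_n=10, size=1000, rng=rng
)
print(mus.mean(axis=0), y_new.mean(axis=0))
```

Both sample means should land near `mu_n`, with the predictive draws noticeably more spread out because they carry the sampling variance on top of the parameter uncertainty.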

Generating normal-inverse-Wishart random variates

Generation of random variates is straightforward:

  1. Sample ๐œฎ from an inverse Wishart distribution with parameters ๐œณ and ฮฝ
  2. Sample ๐ from a multivariate normal distribution with mean ๐0 and variance 1ฮป๐œฎ
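The two steps above can be sketched as a single function, assuming NumPy and SciPy and a dimension of at least 2. The name `sample_niw` is illustrative. Step 2 is done here through a Cholesky factor of the drawn covariance, which is one common way to sample a multivariate normal, not the only one.

```python
import numpy as np
from scipy.stats import invwishart


def sample_niw(mu0, lam, psi, nu, rng):
    """Draw one (mu, Sigma) pair from NIW(mu0, lam, psi, nu). Illustrative helper."""
    # Step 1: Sigma ~ inverse Wishart(psi, nu)
    sigma = invwishart.rvs(df=nu, scale=psi, random_state=rng)
    # Step 2: mu = mu0 + L z / sqrt(lam), with L L^T = Sigma and z standard normal,
    # which gives mu ~ N(mu0, Sigma / lam)
    chol = np.linalg.cholesky(sigma)
    z = rng.standard_normal(mu0.shape[0])
    mu = mu0 + chol @ z / np.sqrt(lam)
    return mu, sigma


rng = np.random.default_rng(0)
mu0 = np.array([1.0, -1.0])
psi = np.eye(2)
draws = [sample_niw(mu0, lam=4.0, psi=psi, nu=10, rng=rng) for _ in range(2000)]
mu_mean = np.mean([m for m, _ in draws], axis=0)
sigma_mean = np.mean([s for _, s in draws], axis=0)
print(mu_mean)     # should be close to mu0
print(sigma_mean)  # should be close to psi / (nu - D - 1) = psi / 7
```

The second check uses the mean of the inverse Wishart distribution, $\boldsymbol\Psi / (\nu - D - 1)$, which exists when $\nu > D + 1$.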

Notes

  1. Murphy, Kevin P. (2007). "Conjugate Bayesian analysis of the Gaussian distribution."
  2. Prince, Simon J. D. (June 2012). Computer Vision: Models, Learning, and Inference. Cambridge University Press. 3.8: "Normal inverse Wishart distribution".
  3. Gelman, Andrew, et al. Bayesian Data Analysis. Vol. 2, p. 73. Boca Raton, FL, USA: Chapman & Hall/CRC, 2014.

References

  • Bishop, Christopher M. (2006). Pattern Recognition and Machine Learning. Springer Science+Business Media.
  • Murphy, Kevin P. (2007). "Conjugate Bayesian analysis of the Gaussian distribution."