Bayes space


A Bayes space is a function space whose elements are equivalence classes of measures with the same null sets. Two measures are defined to be equivalent if they are proportional. The basic ideas of Bayes spaces have their roots in compositional data analysis and the Aitchison geometry.[1] Applications are mainly in statistics, specifically in the functional data analysis of density functions.[2][3]

The basic structure of a Bayes space is that of a vector space, with addition and scalar multiplication defined by perturbation and powering.[4] The space is built over a σ-finite reference (base) measure, denoted λ or P depending on whether it is infinite or finite. Densities are considered as Radon-Nikodym derivatives of measures with the same null sets as the base measure, and are equivalent if they are proportional. For finite base measures, a Hilbert space structure can be obtained by defining a centered log-ratio on the measures, mapping them to the subset of $L^2(P)$ consisting of functions integrating to 0.[5]

Definitions and main results

Consider a finite base measure P (not necessarily a probability measure) on a domain Ω. This may be a uniform distribution on a bounded interval, or a measure that has a density with respect to the Lebesgue measure (the Gaussian distribution, for example). Two densities f, g with respect to P are said to be B-equivalent, denoted $f =_B g$, if there exists a constant $c > 0$ such that $f(x) = c\,g(x)$ (the convention $c \cdot \infty = \infty$ is used in cases where a measure is infinite). It can be shown that $=_B$ is an equivalence relation. The Bayes space B(P) is defined as the quotient space of all measures with the same null sets in Ω as P under the equivalence relation $=_B$.
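The claim that B-equivalence is an equivalence relation follows directly from the definition. A short derivation, written here for finite constants and for densities f, g, h with respect to P, could read:

```latex
\begin{align*}
\text{Reflexivity:}\quad  & f(x) = 1 \cdot f(x)
    \;\Rightarrow\; f =_B f, \\
\text{Symmetry:}\quad     & f(x) = c\, g(x),\ c > 0
    \;\Rightarrow\; g(x) = \tfrac{1}{c}\, f(x),\ \tfrac{1}{c} > 0
    \;\Rightarrow\; g =_B f, \\
\text{Transitivity:}\quad & f(x) = c_1\, g(x),\ g(x) = c_2\, h(x),\ c_1, c_2 > 0
    \;\Rightarrow\; f(x) = (c_1 c_2)\, h(x),\ c_1 c_2 > 0
    \;\Rightarrow\; f =_B h.
\end{align*}
```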

The first challenge in analysing density functions is that B(P) is not a linear space under ordinary addition and scalar multiplication, since the ordinary difference between two densities need not be non-negative everywhere. As in the Aitchison geometry for finite-dimensional data, perturbation and powering are defined for densities:

Perturbation

$(f \oplus g)(x) =_B f(x)\, g(x)$

Powering

$(\alpha \odot f)(x) =_B f(x)^{\alpha}$

where f(x), g(x) are densities in B(P) and α is a real number. It can be shown using the properties of multiplication and powering of real numbers that $(B(P), \oplus, \odot)$ forms a vector space over the real numbers.
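As an illustration of perturbation and powering, the following is a minimal numerical sketch, not a reference implementation. It assumes densities with respect to the uniform measure on [0, 1], stored as strictly positive values on a grid; the grid and the helper names `perturb`, `power` and `b_equal` are hypothetical choices made for this example.

```python
import numpy as np

# Assumption: densities w.r.t. the uniform base measure P on [0, 1],
# represented by strictly positive values on a midpoint grid.
n = 200
x = (np.arange(n) + 0.5) / n


def perturb(f, g):
    """Perturbation: pointwise product of two densities."""
    return f * g


def power(alpha, f):
    """Powering: pointwise power of a density by a real number alpha."""
    return f ** alpha


def b_equal(f, g):
    """B-equivalence on the grid: f = c * g for one constant c > 0."""
    ratio = f / g
    return bool(ratio[0] > 0 and np.allclose(ratio, ratio[0]))


f = np.exp(-x)        # example densities (unnormalised, strictly positive)
g = 1.0 + x ** 2
one = np.ones(n)      # density of P itself: the zero vector of the space
alpha = 2.5

# Neutral element: perturbing by the constant density leaves f unchanged.
has_zero = b_equal(perturb(f, one), f)
# Additive inverse: powering by -1 gives the inverse under perturbation.
has_inverse = b_equal(perturb(f, power(-1.0, f)), one)
# Distributivity of powering over perturbation.
distributes = b_equal(power(alpha, perturb(f, g)),
                      perturb(power(alpha, f), power(alpha, g)))
# Rescaling a representative does not change the equivalence class.
well_defined = b_equal(perturb(3.0 * f, g), perturb(f, g))

print(has_zero, has_inverse, distributes, well_defined)
```

The checks only cover a few of the vector space axioms on one pair of example densities; they illustrate the structure rather than prove it.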

The definition of a Bayes space does not strictly require a finite reference measure P. If a Bayes space is defined over an infinite reference measure λ, that measure must be σ-finite (like the Lebesgue measure). A finite reference measure is, however, necessary for adding a Hilbert space structure to a subset of B(P). Consider the subspace

$B^p(P) = \left\{ f \in B(P) : \int_\Omega |\log f(x)|^p \, dP(x) < \infty \right\}.$

For $p = 2$, this is a linear subspace and isometrically isomorphic to the Hilbert space $L_0^2(P) = \left\{ g \in L^2(P) : \int_\Omega g(x) \, dP(x) = 0 \right\}$ via the centered log-ratio (clr) transformation $\operatorname{clr}(f)(x) = \log f(x) - \frac{1}{P(\Omega)} \int_\Omega \log f(x) \, dP(x) \in L_0^2(P)$. The subspace of log-square integrable functions is termed the Bayes Hilbert space. It can be shown that the clr transformation is a linear isomorphism between the two spaces. Defining the inner product on $B^2(P)$ as the $L^2(P)$ inner product of the clr transformations provides the Hilbert space structure for $B^2(P)$ and makes the centered log-ratio a linear isometry.
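The clr transformation (the log-density minus its P-average) and the induced inner product can be sketched numerically. The code below is a rough illustration under stated assumptions: P is taken to be the uniform measure on [0, 1], integrals are approximated by a midpoint rule, and the function names `clr`, `clr_inverse` and `inner` are hypothetical.

```python
import numpy as np

# Assumption: P is the uniform measure on [0, 1], approximated by a
# midpoint rule with n nodes; w holds the quadrature weights.
n = 400
x = (np.arange(n) + 0.5) / n
w = np.full(n, 1.0 / n)


def clr(f, w):
    """Centered log-ratio: log f minus its P-average."""
    log_f = np.log(f)
    mean_log = np.sum(log_f * w) / np.sum(w)   # (1 / P(Omega)) * integral
    return log_f - mean_log


def clr_inverse(h):
    """One representative of the density class whose clr is h."""
    return np.exp(h)


def inner(f, g, w):
    """Inner product of two densities, taken as the L2(P) inner product of their clr images."""
    return float(np.sum(clr(f, w) * clr(g, w) * w))


f = np.exp(-x)
g = 1.0 + x ** 2

# The clr image integrates to zero with respect to P.
zero_mean = float(np.sum(clr(f, w) * w))
# Linearity: perturbation becomes addition, powering becomes scaling.
additive = np.allclose(clr(f * g, w), clr(f, w) + clr(g, w))
homogeneous = np.allclose(clr(f ** 2.5, w), 2.5 * clr(f, w))
# Proportional densities have the same clr image.
scale_invariant = np.allclose(clr(7.0 * f, w), clr(f, w))
# For f(x) = exp(-x), clr(f)(x) = 1/2 - x, so the squared norm is close to 1/12.
sq_norm_f = inner(f, f, w)

print(zero_mean, additive, homogeneous, scale_invariant, sq_norm_f)
```

Because proportional densities share one clr image, the map is well defined on equivalence classes, which is why `clr_inverse` can only return a representative up to a positive constant.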

Multivariate densities

The measure P does not have to be univariate (one-dimensional); it can also be defined as a product measure on a Cartesian product, characterising bivariate (two-dimensional) or multivariate densities. The geometric structure of Hilbert spaces can be used to decompose multivariate densities into orthogonal independent and interaction parts, using the concept of "clr-marginals".[6][7] This decomposition is related to copula theory.[7] The geometry in $B^2(P)$ defines norms on densities that can be used to quantify the "relative simplicial deviance", which is a measure of how much of a bivariate distribution can be explained by assuming independence.[6]
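The article does not state the decomposition formulas, so the following is only a sketch under explicit assumptions. It assumes that P is the uniform product measure on the unit square, that the clr-marginals are obtained by averaging the clr image of the joint density over the other variable, and that the deviance is summarised by the ratio of the squared norm of the interaction part to that of the whole density. The names `decompose` and `sq_norm` are hypothetical.

```python
import numpy as np

# Assumption: uniform product measure on [0, 1] x [0, 1], equal-weight grid,
# so that a grid mean approximates an integral with P(Omega) = 1.
nx, ny = 60, 50
x = (np.arange(nx) + 0.5) / nx
y = (np.arange(ny) + 0.5) / ny
X, Y = np.meshgrid(x, y, indexing="ij")


def decompose(f):
    """Split the clr image of a bivariate density into assumed
    independent (additive) and interaction parts."""
    log_f = np.log(f)
    c = log_f - log_f.mean()               # clr of the joint density
    c_x = c.mean(axis=1, keepdims=True)    # assumed clr-marginal in x
    c_y = c.mean(axis=0, keepdims=True)    # assumed clr-marginal in y
    c_ind = c_x + c_y                      # independent part (broadcast to the grid)
    c_int = c - c_ind                      # interaction part
    return c, c_ind, c_int


def sq_norm(h):
    """Squared L2(P) norm on the equal-weight grid."""
    return float(np.mean(h ** 2))


rho = 0.6
f = np.exp(-(X ** 2 - 2.0 * rho * X * Y + Y ** 2))   # example dependent density

c, c_ind, c_int = decompose(f)
orthogonality = float(np.mean(c_ind * c_int))         # should vanish
pythagoras_gap = sq_norm(c) - sq_norm(c_ind) - sq_norm(c_int)
interaction_share = sq_norm(c_int) / sq_norm(c)       # assumed deviance ratio

print(orthogonality, pythagoras_gap, interaction_share)
```

Under these assumptions the two parts are orthogonal, so the squared norms add up, and a density that factorises into a function of x times a function of y has a vanishing interaction part.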
