Chain rule for Kolmogorov complexity

Template:Short description Template:Inline The chain ruleTemplate:Cn for Kolmogorov complexity is an analogue of the chain rule for information entropy, which states:

H (X, Y) = H (X) + H (Y | X)

That is, the combined randomness of two sequences X and Y is the sum of the randomness of X plus whatever randomness is left in Y once we know X. This follows immediately from the definitions of conditional and joint entropy, and the fact from probability theory that the joint probability is the product of the marginal and conditional probability:

P (X, Y) = P (X) P (Y | X)

\Rightarrow \log P (X, Y) = \log P (X) + \log P (Y | X)

The equivalent statement for Kolmogorov complexity does not hold exactly; it is true only up to a logarithmic term:

K (x, y) = K (x) + K (y | x) + O (\log (K (x, y)))

(An exact version, Template:Math, holds for the prefix complexity KP, where Template:Math is a shortest program for x.)

It states that the shortest program printing X and Y is obtained by concatenating a shortest program printing X with a program printing Y given X, plus at most a logarithmic factor. The results implies that algorithmic mutual information, an analogue of mutual information for Kolmogorov complexity is symmetric: Template:Tmath for all x,y.

Proof

The ≤ direction is obvious: we can write a program to produce x and y by concatenating a program to produce x, a program to produce y given access to x, and (whence the log term) the length of one of the programs, so that we know where to separate the two programs for x and Template:Math upper-bounds this length).

For the ≥ direction, it suffices to show that for all Template:Mvar such that Template:Tmath we have that either

K (x | k, l) \leq k + O (1)

or

K (y | x, k, l) \leq l + O (1)

.

Consider the list (a₁,b₁), (a₂,b₂), ..., (a_e,b_e) of all pairs Template:Tmath produced by programs of length exactly Template:Tmath [hence Template:Tmath]. Note that this list

contains the pair Template:Tmath,
can be enumerated given Template:Mvar and Template:Mvar (by running all programs of length Template:Tmath in parallel),
has at most 2^K(x,y) elements (because there are at most 2ⁿ programs of length Template:Mvar).

First, suppose that x appears less than Template:Math times as first element. We can specify y given Template:Mvar by enumerating (a₁,b₁), (a₂,b₂), ... and then selecting Template:Tmath in the sub-list of pairs Template:Tmath. By assumption, the index of Template:Tmath in this sub-list is less than Template:Math and hence, there is a program for y given Template:Mvar of length Template:Tmath. Now, suppose that x appears at least Template:Math times as first element. This can happen for at most Template:Math different strings. These strings can be enumerated given Template:Mvar and hence x can be specified by its index in this enumeration. The corresponding program for x has size Template:Tmath. Theorem proved.

References

Template:Cite book

Template:Cite journal

Template:Cite journal

Chain rule for Kolmogorov complexity

Proof

References

Navigation menu

Search