Classical capacity


In quantum information theory, the classical capacity of a quantum channel is the maximum rate at which classical data can be sent over it error-free in the limit of many uses of the channel. Holevo, Schumacher, and Westmoreland proved the following least upper bound on the classical capacity of any quantum channel $\mathcal{N}$:

$\chi(\mathcal{N}) = \max_{\rho^{XA}} I(X;B)_{\mathcal{N}(\rho)}$

where $\rho^{XA}$ is a classical–quantum state of the following form:

$\rho^{XA} = \sum_x p_X(x) |x\rangle\langle x|^X \otimes \rho_x^A,$

$p_X(x)$ is a probability distribution, and each $\rho_x^A$ is a density operator that can be input to the channel $\mathcal{N}$.
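For a classical–quantum state of this form, the mutual information $I(X;B)$ reduces to the Holevo quantity $S(\sum_x p_X(x)\rho_x) - \sum_x p_X(x) S(\rho_x)$, which is easy to evaluate numerically. The following sketch computes it for a hypothetical two-state qubit ensemble (the states $|0\rangle$ and $|+\rangle$ with equal probability are an illustrative assumption, not from the text):

```python
import numpy as np

def entropy(rho):
    # von Neumann entropy in bits: S(rho) = -Tr{rho log2 rho}
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-(w * np.log2(w)).sum())

# Hypothetical ensemble: |0> and |+> each with probability 1/2
ket0 = np.array([1.0, 0.0])
ketp = np.array([1.0, 1.0]) / np.sqrt(2)
probs = [0.5, 0.5]
states = [np.outer(ket0, ket0), np.outer(ketp, ketp)]

# Holevo information I(X;B) = S(sum_x p(x) rho_x) - sum_x p(x) S(rho_x)
rho_bar = sum(p * r for p, r in zip(probs, states))
holevo = entropy(rho_bar) - sum(p * entropy(r) for p, r in zip(probs, states))
print(round(holevo, 3))  # about 0.601 bits for this ensemble
```

Since the two input states are pure, the second term vanishes and the Holevo information is just the entropy of the average state.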

Achievability using sequential decoding

We briefly review the HSW coding theorem (the statement of the achievability of the Holevo information rate I(X;B) for communicating classical data over a quantum channel). We first review the minimal amount of quantum mechanics needed for the theorem. We then cover quantum typicality, and finally we prove the theorem using a recent sequential decoding technique.

Review of quantum mechanics

In order to prove the HSW coding theorem, we really just need a few basic things from quantum mechanics. First, a quantum state is a unit trace, positive operator known as a density operator. Usually, we denote it by ρ, σ, ω, etc. The simplest model for a quantum channel is known as a classical-quantum channel:

$x \to \rho_x.$

The meaning of the above notation is that inputting the classical letter $x$ at the transmitting end leads to a quantum state $\rho_x$ at the receiving end. It is the task of the receiver to perform a measurement to determine the input of the sender. If the states $\rho_x$ are perfectly distinguishable from one another (i.e., if they have orthogonal supports, so that $\mathrm{Tr}\{\rho_x \rho_{x'}\} = 0$ for $x \neq x'$), then the channel is a noiseless channel. We are interested in situations for which this is not the case. If the states $\rho_x$ all commute with one another, then the situation is effectively identical to that of a classical channel, so we are not interested in these situations either. Thus, the situation in which we are interested is that in which the states $\rho_x$ have overlapping support and are non-commutative.

The most general way to describe a quantum measurement is with a positive operator-valued measure (POVM). We usually denote the elements of a POVM as $\{\Lambda_m\}_m$. These operators should satisfy positivity and completeness in order to form a valid POVM:

$\Lambda_m \geq 0 \quad \forall m,$
$\sum_m \Lambda_m = I.$

The probabilistic interpretation of quantum mechanics states that if someone measures a quantum state $\rho$ using a measurement device corresponding to the POVM $\{\Lambda_m\}$, then the probability $p(m)$ for obtaining outcome $m$ is equal to

$p(m) = \mathrm{Tr}\{\Lambda_m \rho\},$

and the post-measurement state is

$\rho_m' = \frac{\sqrt{\Lambda_m}\,\rho\,\sqrt{\Lambda_m}}{p(m)},$

if the person measuring obtains outcome m. These rules are sufficient for us to consider classical communication schemes over cq channels.
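These rules can be checked numerically. The sketch below verifies positivity and completeness for a three-outcome qubit POVM, computes the Born-rule probabilities $p(m) = \mathrm{Tr}\{\Lambda_m\rho\}$, and forms a post-measurement state (the particular POVM and input state are illustrative assumptions, not from the text):

```python
import numpy as np

def psd_sqrt(A):
    # Square root of a positive semidefinite operator via eigendecomposition
    w, v = np.linalg.eigh(A)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T

# Illustrative three-outcome qubit POVM (an assumption, not from the text)
ket0 = np.array([[1.0], [0.0]])
ketp = np.array([[1.0], [1.0]]) / np.sqrt(2)
L0 = 0.4 * (ket0 @ ket0.T)
L1 = 0.4 * (ketp @ ketp.T)
povm = [L0, L1, np.eye(2) - L0 - L1]

# Positivity and completeness of the POVM
assert all(np.linalg.eigvalsh(L).min() >= -1e-12 for L in povm)
assert np.allclose(sum(povm), np.eye(2))

rho = np.array([[0.7, 0.2], [0.2, 0.3]])  # a valid qubit density operator

# Born rule: p(m) = Tr{Lambda_m rho}; the probabilities sum to 1
probs = [float(np.trace(L @ rho)) for L in povm]
print([round(p, 2) for p in probs])  # [0.28, 0.28, 0.44]

# Post-measurement state for outcome m = 0 has unit trace after renormalization
s = psd_sqrt(povm[0])
post = s @ rho @ s / probs[0]
assert np.isclose(np.trace(post), 1.0)
```

Note that the third POVM element is forced by completeness and is positive for this choice of weights; changing the factors 0.4 without checking positivity could break the POVM.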

Quantum typicality

The reader can find a good review of this topic in the article about the typical subspace.

Gentle operator lemma

The following lemma is important for our proofs. It demonstrates that a measurement that succeeds with high probability on average does not disturb the state too much on average:

Lemma: [Winter] Given an ensemble $\{p_X(x), \rho_x\}$ with expected density operator $\rho \equiv \sum_x p_X(x) \rho_x$, suppose that an operator $\Lambda$ such that $I \geq \Lambda \geq 0$ succeeds with high probability on the state $\rho$:

$\mathrm{Tr}\{\Lambda \rho\} \geq 1 - \epsilon.$

Then the subnormalized state $\sqrt{\Lambda}\,\rho_x\,\sqrt{\Lambda}$ is close in expected trace distance to the original state $\rho_x$:

$\mathbb{E}_X\left\{\left\|\sqrt{\Lambda}\,\rho_X\,\sqrt{\Lambda} - \rho_X\right\|_1\right\} \leq 2\sqrt{\epsilon}.$

(Note that $\|A\|_1$ is the nuclear norm of the operator $A$, so that $\|A\|_1 \equiv \mathrm{Tr}\{\sqrt{A^\dagger A}\}$.)

The following inequality is useful for us as well. It holds for any operators $\rho$, $\sigma$, $\Lambda$ such that $0 \leq \rho, \sigma, \Lambda \leq I$:

$\mathrm{Tr}\{\Lambda \rho\} \leq \mathrm{Tr}\{\Lambda \sigma\} + \|\rho - \sigma\|_1.$

The quantum information-theoretic interpretation of the above inequality is that the probability of obtaining outcome $\Lambda$ from a quantum measurement acting on the state $\rho$ is upper bounded by the probability of obtaining outcome $\Lambda$ on the state $\sigma$ summed with the distinguishability of the two states $\rho$ and $\sigma$.
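Both bounds are easy to sanity-check numerically for a single state. The sketch below verifies the gentle operator lemma and the trace inequality for one concrete choice of $\rho$, $\sigma$, and $\Lambda$ (the particular operators are illustrative assumptions, not from the text):

```python
import numpy as np

def psd_sqrt(A):
    # Square root of a positive semidefinite operator via eigendecomposition
    w, v = np.linalg.eigh(A)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T

def trace_norm(A):
    # ||A||_1 = Tr{sqrt(A^dag A)} = sum of singular values
    return float(np.linalg.svd(A, compute_uv=False).sum())

# Illustrative qubit state and measurement operator with 0 <= Lambda <= I
rho = np.array([[0.9, 0.1], [0.1, 0.1]])
Lam = np.diag([0.99, 0.2])

# Lambda succeeds on rho with probability 1 - eps
eps = 1.0 - float(np.trace(Lam @ rho))
assert eps < 0.1

# Gentle operator lemma: || sqrt(Lam) rho sqrt(Lam) - rho ||_1 <= 2 sqrt(eps)
s = psd_sqrt(Lam)
disturbance = trace_norm(s @ rho @ s - rho)
assert disturbance <= 2 * np.sqrt(eps)

# Trace inequality: Tr{Lam rho} <= Tr{Lam sigma} + ||rho - sigma||_1
sigma = 0.5 * np.eye(2)
lhs = float(np.trace(Lam @ rho))
rhs = float(np.trace(Lam @ sigma)) + trace_norm(rho - sigma)
assert lhs <= rhs
```

Here the disturbance is about 0.13 while the bound $2\sqrt{\epsilon}$ is about 0.60, so the measurement barely disturbs the state, as the lemma predicts.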

Non-commutative union bound

Lemma: [Sen's bound] The following bound holds for a subnormalized state $\sigma$ such that $0 \leq \sigma$ and $\mathrm{Tr}\{\sigma\} \leq 1$, with $\Pi_1$, ..., $\Pi_N$ being projectors:

$\mathrm{Tr}\{\sigma\} - \mathrm{Tr}\{\Pi_N \cdots \Pi_1\,\sigma\,\Pi_1 \cdots \Pi_N\} \leq 2\sqrt{\sum_{i=1}^N \mathrm{Tr}\{(I - \Pi_i)\sigma\}}.$

We can think of Sen's bound as a "non-commutative union bound" because it is analogous to the following union bound from probability theory:

$\Pr\{(A_1 \cap \cdots \cap A_N)^c\} = \Pr\{A_1^c \cup \cdots \cup A_N^c\} \leq \sum_{i=1}^N \Pr\{A_i^c\},$

where $A_1, \ldots, A_N$ are events. The analogous bound for projector logic would be

$\mathrm{Tr}\{(I - \Pi_1 \cdots \Pi_N \cdots \Pi_1)\rho\} \leq \sum_{i=1}^N \mathrm{Tr}\{(I - \Pi_i)\rho\},$

if we think of $\Pi_1 \cdots \Pi_N$ as a projector onto the intersection of subspaces. However, the above bound only holds if the projectors $\Pi_1$, ..., $\Pi_N$ are commuting (choosing $\Pi_1 = |+\rangle\langle+|$, $\Pi_2 = |0\rangle\langle 0|$, and $\rho = |0\rangle\langle 0|$ gives a counterexample). If the projectors are non-commuting, then Sen's bound is the next best thing and suffices for our purposes here.
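The counterexample from the text can be verified numerically, and the same choice of projectors also illustrates that Sen's bound (with its square root) still holds. This sketch takes $N = 2$:

```python
import numpy as np

# Counterexample from the text: Pi1 = |+><+|, Pi2 = |0><0|, rho = |0><0|
ket0 = np.array([[1.0], [0.0]])
ketp = np.array([[1.0], [1.0]]) / np.sqrt(2)
Pi1 = ketp @ ketp.T
Pi2 = ket0 @ ket0.T
rho = ket0 @ ket0.T
I2 = np.eye(2)

# The commuting-projector union bound fails for these non-commuting projectors
lhs = float(np.trace((I2 - Pi1 @ Pi2 @ Pi1) @ rho))
rhs = float(np.trace((I2 - Pi1) @ rho)) + float(np.trace((I2 - Pi2) @ rho))
print(lhs, rhs)  # lhs = 0.75 exceeds rhs = 0.5, so the bound is violated

# Sen's non-commutative union bound still holds (note the square root)
sen_lhs = float(np.trace(rho)) - float(np.trace(Pi2 @ Pi1 @ rho @ Pi1 @ Pi2))
sen_rhs = 2.0 * np.sqrt(rhs)
assert sen_lhs <= sen_rhs
```

Intuitively, measuring $\Pi_1$ first rotates $|0\rangle$ toward $|+\rangle$, so the subsequent $\Pi_2$ measurement fails with probability that the classical union bound cannot account for.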

HSW theorem with the non-commutative union bound

We now prove the HSW theorem with Sen's non-commutative union bound. We divide up the proof into a few parts: codebook generation, POVM construction, and error analysis.

Codebook Generation. We first describe how Alice and Bob agree on a random choice of code. They have the channel $x \to \rho_x$ and a distribution $p_X(x)$. They choose $M$ classical sequences $x^n$ according to the i.i.d. distribution $p_{X^n}(x^n)$. After selecting them, they label them with indices as $\{x^n(m)\}_{m \in [M]}$. This leads to the following quantum codewords:

$\rho_{x^n(m)} = \rho_{x_1(m)} \otimes \cdots \otimes \rho_{x_n(m)}.$

The quantum codebook is then $\{\rho_{x^n(m)}\}$. The average state of the codebook is then

$\mathbb{E}_{X^n}\{\rho_{X^n}\} = \rho^{\otimes n},$

where $\rho = \sum_x p_X(x) \rho_x$.
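The identity $\mathbb{E}_{X^n}\{\rho_{X^n}\} = \rho^{\otimes n}$ can be checked directly for small block lengths by averaging the product-state codewords over all sequences. In the sketch below, the binary qubit ensemble and the block length $n = 3$ are illustrative assumptions:

```python
import numpy as np
from itertools import product

def tensor(ops):
    # Tensor (Kronecker) product of a list of operators
    out = np.array([[1.0]])
    for op in ops:
        out = np.kron(out, op)
    return out

# Illustrative binary ensemble of qubit states
ket0 = np.array([[1.0], [0.0]])
ketp = np.array([[1.0], [1.0]]) / np.sqrt(2)
rho_x = {0: ket0 @ ket0.T, 1: ketp @ ketp.T}
p = {0: 0.5, 1: 0.5}
n = 3  # block length

# Average over all sequences x^n of the product-state codewords
expected = sum(
    np.prod([p[x] for x in xn]) * tensor([rho_x[x] for x in xn])
    for xn in product([0, 1], repeat=n)
)

# It equals rho^{tensor n} with rho = sum_x p(x) rho_x
rho = p[0] * rho_x[0] + p[1] * rho_x[1]
assert np.allclose(expected, tensor([rho] * n))
assert np.isclose(np.trace(expected), 1.0)
```

The check is exact (an average over the distribution, not a sampled estimate), which is why the tensor-power identity holds to machine precision.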

POVM Construction. Sen's bound from the above lemma suggests a method for Bob to decode a state that Alice transmits. Bob should first ask "Is the received state in the average typical subspace?" He can do this operationally by performing a typical subspace measurement corresponding to $\{\Pi_{\rho,\delta}^n, I - \Pi_{\rho,\delta}^n\}$. Next, he asks in sequential order, "Is the received codeword in the $m$th conditionally typical subspace?" This is in some sense equivalent to the question, "Is the received codeword the $m$th transmitted codeword?" He can ask these questions operationally by performing the measurements corresponding to the conditionally typical projectors $\{\Pi_{\rho_{x^n(m)},\delta}, I - \Pi_{\rho_{x^n(m)},\delta}\}$.

Why should this sequential decoding scheme work well? The reason is that the transmitted codeword lies in the typical subspace on average:

$\mathbb{E}_{X^n}\{\mathrm{Tr}\{\Pi_{\rho,\delta}^n\,\rho_{X^n}\}\} = \mathrm{Tr}\{\Pi_{\rho,\delta}^n\,\mathbb{E}_{X^n}\{\rho_{X^n}\}\}$
$= \mathrm{Tr}\{\Pi_{\rho,\delta}^n\,\rho^{\otimes n}\}$
$\geq 1 - \epsilon,$

where the inequality follows from the unit-probability property of the typical subspace (the typical subspace carries almost all of the probability). Also, the projectors $\Pi_{\rho_{x^n(m)},\delta}$ are "good detectors" for the states $\rho_{x^n(m)}$ (on average) because the following condition holds from conditional quantum typicality:

$\mathbb{E}_{X^n}\{\mathrm{Tr}\{\Pi_{\rho_{X^n},\delta}\,\rho_{X^n}\}\} \geq 1 - \epsilon.$

Error Analysis. The probability of detecting the mth codeword correctly under our sequential decoding scheme is equal to

$\mathrm{Tr}\{\Pi_{\rho_{X^n(m)},\delta}\hat{\Pi}_{\rho_{X^n(m-1)},\delta} \cdots \hat{\Pi}_{\rho_{X^n(1)},\delta}\;\Pi_{\rho,\delta}^n\;\rho_{x^n(m)}\;\Pi_{\rho,\delta}^n\;\hat{\Pi}_{\rho_{X^n(1)},\delta} \cdots \hat{\Pi}_{\rho_{X^n(m-1)},\delta}\Pi_{\rho_{X^n(m)},\delta}\},$

where we make the abbreviation $\hat{\Pi} \equiv I - \Pi$. (Observe that we project into the average typical subspace just once.) Thus, the probability of an incorrect detection for the $m$th codeword is given by

$1 - \mathrm{Tr}\{\Pi_{\rho_{X^n(m)},\delta}\hat{\Pi}_{\rho_{X^n(m-1)},\delta} \cdots \hat{\Pi}_{\rho_{X^n(1)},\delta}\;\Pi_{\rho,\delta}^n\;\rho_{x^n(m)}\;\Pi_{\rho,\delta}^n\;\hat{\Pi}_{\rho_{X^n(1)},\delta} \cdots \hat{\Pi}_{\rho_{X^n(m-1)},\delta}\Pi_{\rho_{X^n(m)},\delta}\},$

and the average error probability of this scheme is equal to

$1 - \frac{1}{M}\sum_m \mathrm{Tr}\{\Pi_{\rho_{X^n(m)},\delta}\hat{\Pi}_{\rho_{X^n(m-1)},\delta} \cdots \hat{\Pi}_{\rho_{X^n(1)},\delta}\;\Pi_{\rho,\delta}^n\;\rho_{x^n(m)}\;\Pi_{\rho,\delta}^n\;\hat{\Pi}_{\rho_{X^n(1)},\delta} \cdots \hat{\Pi}_{\rho_{X^n(m-1)},\delta}\Pi_{\rho_{X^n(m)},\delta}\}.$

Instead of analyzing the average error probability, we analyze the expectation of the average error probability, where the expectation is with respect to the random choice of code:

$1 - \mathbb{E}_{X^n}\left\{\frac{1}{M}\sum_m \mathrm{Tr}\{\Pi_{\rho_{X^n(m)},\delta}\hat{\Pi}_{\rho_{X^n(m-1)},\delta} \cdots \hat{\Pi}_{\rho_{X^n(1)},\delta}\;\Pi_{\rho,\delta}^n\;\rho_{X^n(m)}\;\Pi_{\rho,\delta}^n\;\hat{\Pi}_{\rho_{X^n(1)},\delta} \cdots \hat{\Pi}_{\rho_{X^n(m-1)},\delta}\Pi_{\rho_{X^n(m)},\delta}\}\right\}.$

Our first step is to apply Sen's bound to the above quantity. But before doing so, we should rewrite the above expression just slightly, by observing that

$1 = \mathbb{E}_{X^n}\left\{\frac{1}{M}\sum_m \mathrm{Tr}\{\rho_{X^n(m)}\}\right\}$
$= \mathbb{E}_{X^n}\left\{\frac{1}{M}\sum_m \left[\mathrm{Tr}\{\Pi_{\rho,\delta}^n \rho_{X^n(m)}\} + \mathrm{Tr}\{\hat{\Pi}_{\rho,\delta}^n \rho_{X^n(m)}\}\right]\right\}$
$= \mathbb{E}_{X^n}\left\{\frac{1}{M}\sum_m \mathrm{Tr}\{\Pi_{\rho,\delta}^n \rho_{X^n(m)} \Pi_{\rho,\delta}^n\}\right\} + \frac{1}{M}\sum_m \mathrm{Tr}\{\hat{\Pi}_{\rho,\delta}^n\,\mathbb{E}_{X^n}\{\rho_{X^n(m)}\}\}$
$= \mathbb{E}_{X^n}\left\{\frac{1}{M}\sum_m \mathrm{Tr}\{\Pi_{\rho,\delta}^n \rho_{X^n(m)} \Pi_{\rho,\delta}^n\}\right\} + \mathrm{Tr}\{\hat{\Pi}_{\rho,\delta}^n \rho^{\otimes n}\}$
$\leq \mathbb{E}_{X^n}\left\{\frac{1}{M}\sum_m \mathrm{Tr}\{\Pi_{\rho,\delta}^n \rho_{X^n(m)} \Pi_{\rho,\delta}^n\}\right\} + \epsilon.$

Substituting this bound for $1$ into the expression for the expected average error probability (and forgetting about the small $\epsilon$ term for now) gives an upper bound of

$\mathbb{E}_{X^n}\left\{\frac{1}{M}\sum_m \mathrm{Tr}\{\Pi_{\rho,\delta}^n \rho_{X^n(m)} \Pi_{\rho,\delta}^n\}\right\}$
$- \mathbb{E}_{X^n}\left\{\frac{1}{M}\sum_m \mathrm{Tr}\{\Pi_{\rho_{X^n(m)},\delta}\hat{\Pi}_{\rho_{X^n(m-1)},\delta} \cdots \hat{\Pi}_{\rho_{X^n(1)},\delta}\;\Pi_{\rho,\delta}^n\;\rho_{X^n(m)}\;\Pi_{\rho,\delta}^n\;\hat{\Pi}_{\rho_{X^n(1)},\delta} \cdots \hat{\Pi}_{\rho_{X^n(m-1)},\delta}\Pi_{\rho_{X^n(m)},\delta}\}\right\}.$

We then apply Sen's bound to this expression with $\sigma = \Pi_{\rho,\delta}^n \rho_{X^n(m)} \Pi_{\rho,\delta}^n$ and the sequential projectors as $\Pi_{\rho_{X^n(m)},\delta}$, $\hat{\Pi}_{\rho_{X^n(m-1)},\delta}$, ..., $\hat{\Pi}_{\rho_{X^n(1)},\delta}$. This gives the upper bound

$\mathbb{E}_{X^n}\left\{\frac{1}{M}\sum_m 2\left[\mathrm{Tr}\{(I - \Pi_{\rho_{X^n(m)},\delta})\Pi_{\rho,\delta}^n \rho_{X^n(m)} \Pi_{\rho,\delta}^n\} + \sum_{i=1}^{m-1}\mathrm{Tr}\{\Pi_{\rho_{X^n(i)},\delta}\Pi_{\rho,\delta}^n \rho_{X^n(m)} \Pi_{\rho,\delta}^n\}\right]^{1/2}\right\}.$

Due to concavity of the square root, we can bound this expression from above by

$2\left[\mathbb{E}_{X^n}\left\{\frac{1}{M}\sum_m \mathrm{Tr}\{(I - \Pi_{\rho_{X^n(m)},\delta})\Pi_{\rho,\delta}^n \rho_{X^n(m)} \Pi_{\rho,\delta}^n\} + \sum_{i=1}^{m-1}\mathrm{Tr}\{\Pi_{\rho_{X^n(i)},\delta}\Pi_{\rho,\delta}^n \rho_{X^n(m)} \Pi_{\rho,\delta}^n\}\right\}\right]^{1/2}$
$\leq 2\left[\mathbb{E}_{X^n}\left\{\frac{1}{M}\sum_m \mathrm{Tr}\{(I - \Pi_{\rho_{X^n(m)},\delta})\Pi_{\rho,\delta}^n \rho_{X^n(m)} \Pi_{\rho,\delta}^n\} + \sum_{i \neq m}\mathrm{Tr}\{\Pi_{\rho_{X^n(i)},\delta}\Pi_{\rho,\delta}^n \rho_{X^n(m)} \Pi_{\rho,\delta}^n\}\right\}\right]^{1/2},$

where the second bound follows by summing over all of the codewords not equal to the mth codeword (this sum can only be larger).

We now focus exclusively on showing that the term inside the square root can be made small. Consider the first term:

$\mathbb{E}_{X^n}\left\{\frac{1}{M}\sum_m \mathrm{Tr}\{(I - \Pi_{\rho_{X^n(m)},\delta})\Pi_{\rho,\delta}^n \rho_{X^n(m)} \Pi_{\rho,\delta}^n\}\right\}$
$\leq \mathbb{E}_{X^n}\left\{\frac{1}{M}\sum_m \left[\mathrm{Tr}\{(I - \Pi_{\rho_{X^n(m)},\delta})\rho_{X^n(m)}\} + \left\|\rho_{X^n(m)} - \Pi_{\rho,\delta}^n \rho_{X^n(m)} \Pi_{\rho,\delta}^n\right\|_1\right]\right\}$
$\leq \epsilon + 2\sqrt{\epsilon},$

where the first inequality follows from the trace inequality $\mathrm{Tr}\{\Lambda\rho\} \leq \mathrm{Tr}\{\Lambda\sigma\} + \|\rho - \sigma\|_1$ given earlier, and the second inequality follows from the gentle operator lemma and the properties of unconditional and conditional typicality. Consider now the second term and the following chain of inequalities:

$\sum_{i \neq m} \mathbb{E}_{X^n}\{\mathrm{Tr}\{\Pi_{\rho_{X^n(i)},\delta}\;\Pi_{\rho,\delta}^n\;\rho_{X^n(m)}\;\Pi_{\rho,\delta}^n\}\}$
$= \sum_{i \neq m} \mathrm{Tr}\{\mathbb{E}_{X^n}\{\Pi_{\rho_{X^n(i)},\delta}\}\;\Pi_{\rho,\delta}^n\;\mathbb{E}_{X^n}\{\rho_{X^n(m)}\}\;\Pi_{\rho,\delta}^n\}$
$= \sum_{i \neq m} \mathrm{Tr}\{\mathbb{E}_{X^n}\{\Pi_{\rho_{X^n(i)},\delta}\}\;\Pi_{\rho,\delta}^n\;\rho^{\otimes n}\;\Pi_{\rho,\delta}^n\}$
$\leq \sum_{i \neq m} 2^{-n[H(B)-\delta]}\;\mathrm{Tr}\{\mathbb{E}_{X^n}\{\Pi_{\rho_{X^n(i)},\delta}\}\;\Pi_{\rho,\delta}^n\}.$

The first equality follows because the codewords $X^n(m)$ and $X^n(i)$ are independent since they are different. The second equality follows because the expected codeword state is $\rho^{\otimes n}$. The first inequality follows from the equipartition property of the typical subspace, $\Pi_{\rho,\delta}^n\,\rho^{\otimes n}\,\Pi_{\rho,\delta}^n \leq 2^{-n[H(B)-\delta]}\,\Pi_{\rho,\delta}^n$. Continuing, we have

$\leq \sum_{i \neq m} 2^{-n[H(B)-\delta]}\;\mathbb{E}_{X^n}\{\mathrm{Tr}\{\Pi_{\rho_{X^n(i)},\delta}\}\}$
$\leq \sum_{i \neq m} 2^{-n[H(B)-\delta]}\;2^{n[H(B|X)+\delta]}$
$= \sum_{i \neq m} 2^{-n[I(X;B)-2\delta]}$
$\leq M\;2^{-n[I(X;B)-2\delta]}.$

The first inequality follows from $\Pi_{\rho,\delta}^n \leq I$ and from exchanging the trace with the expectation. The second inequality follows from the dimension bound $\mathrm{Tr}\{\Pi_{\rho_{X^n},\delta}\} \leq 2^{n[H(B|X)+\delta]}$ on the conditionally typical subspace. The next two steps are straightforward.

Putting everything together, we get our final bound on the expectation of the average error probability:

$1 - \mathbb{E}_{X^n}\left\{\frac{1}{M}\sum_m \mathrm{Tr}\{\Pi_{\rho_{X^n(m)},\delta}\hat{\Pi}_{\rho_{X^n(m-1)},\delta} \cdots \hat{\Pi}_{\rho_{X^n(1)},\delta}\;\Pi_{\rho,\delta}^n\;\rho_{X^n(m)}\;\Pi_{\rho,\delta}^n\;\hat{\Pi}_{\rho_{X^n(1)},\delta} \cdots \hat{\Pi}_{\rho_{X^n(m-1)},\delta}\Pi_{\rho_{X^n(m)},\delta}\}\right\}$
$\leq \epsilon + 2\left[(\epsilon + 2\sqrt{\epsilon}) + M\;2^{-n[I(X;B)-2\delta]}\right]^{1/2}.$

Thus, as long as we choose $M = 2^{n[I(X;B)-3\delta]}$ (so that $M\,2^{-n[I(X;B)-2\delta]} = 2^{-n\delta}$ vanishes as $n$ grows), the expected error probability vanishes, and so there exists a particular code with vanishing error probability.
