Second moment method


In mathematics, the second moment method is a technique used in probability theory and analysis to show that a random variable has positive probability of being positive. More generally, the "moment method" consists of bounding the probability that a random variable fluctuates far from its mean, by using its moments.[1]

The method is often quantitative, in that one can often deduce a lower bound on the probability that the random variable is larger than some constant times its expectation. The method involves comparing the second moment of random variables to the square of the first moment.

First moment method

The first moment method is a simple application of Markov's inequality for integer-valued variables. For a non-negative, integer-valued random variable $X$, we may want to prove that $X = 0$ with high probability. To obtain an upper bound for $\Pr(X > 0)$, and thus a lower bound for $\Pr(X = 0)$, we first note that since $X$ takes only integer values, $\Pr(X > 0) = \Pr(X \ge 1)$. Since $X$ is non-negative, we can now apply Markov's inequality to obtain $\Pr(X \ge 1) \le \mathrm{E}[X]$. Combining these we have $\Pr(X > 0) \le \mathrm{E}[X]$; the first moment method is simply the use of this inequality.
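As a quick numerical sanity check of this inequality, the following sketch simulates a small non-negative integer-valued random variable (a Binomial, chosen here purely for illustration) and compares the empirical $\Pr(X > 0)$ with $\mathrm{E}[X]$:

```python
import random

# Illustrative choice: X ~ Binomial(50, 0.002), a non-negative
# integer-valued random variable with E[X] = 0.1, so the first
# moment method gives Pr(X > 0) <= 0.1.
random.seed(0)
TRIALS = 100_000
n, p = 50, 0.002
mean = n * p  # E[X] = 0.1

hits = 0
for _ in range(TRIALS):
    # One Binomial draw: count of n independent successes.
    x = sum(1 for _ in range(n) if random.random() < p)
    if x > 0:
        hits += 1

prob_positive = hits / TRIALS  # empirical Pr(X > 0)
assert prob_positive <= mean   # Markov: Pr(X > 0) <= E[X]
```

Here the true value is $\Pr(X > 0) = 1 - 0.998^{50} \approx 0.095$, comfortably below the first moment bound of $0.1$.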

Second moment method

In the other direction, $\mathrm{E}[X]$ being "large" does not directly imply that $\Pr(X = 0)$ is small. However, we can often use the second moment to derive such a conclusion, using the Cauchy–Schwarz inequality.

Theorem. If $X \ge 0$ is a random variable with finite variance, then $$\Pr(X > 0) \ge \frac{(\mathrm{E}[X])^2}{\mathrm{E}[X^2]}.$$

Proof. Using the Cauchy–Schwarz inequality, $$\mathrm{E}[X] = \mathrm{E}[X \,\mathbf{1}_{X > 0}] \le \mathrm{E}[X^2]^{1/2} \,\Pr(X > 0)^{1/2}.$$ Squaring both sides and dividing by $\mathrm{E}[X^2]$ gives the claim. $\square$
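The theorem can be checked exactly on any small distribution with finite support. The following sketch uses a distribution chosen only for illustration:

```python
# Hypothetical discrete distribution (chosen for illustration):
# X = 0 w.p. 0.7, X = 1 w.p. 0.2, X = 5 w.p. 0.1.
support = {0: 0.7, 1: 0.2, 5: 0.1}

first_moment = sum(x * p for x, p in support.items())        # E[X]   = 0.7
second_moment = sum(x * x * p for x, p in support.items())   # E[X^2] = 2.7
prob_positive = sum(p for x, p in support.items() if x > 0)  # Pr(X > 0) = 0.3

# Second moment method: Pr(X > 0) >= E[X]^2 / E[X^2].
lower_bound = first_moment**2 / second_moment  # 0.49 / 2.7 ~= 0.181
assert lower_bound <= prob_positive
```

The bound $0.49/2.7 \approx 0.181$ is indeed below the true probability $0.3$; the slack reflects how far $X$ is from being constant on $\{X > 0\}$.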

The method can also be used on distributional limits of random variables. Furthermore, the estimate of the previous theorem can be refined by means of the so-called Paley–Zygmund inequality. Suppose that $X_n$ is a sequence of non-negative real-valued random variables which converge in law to a random variable $X$. If there are finite positive constants $c_1$, $c_2$ such that $$\mathrm{E}[X_n^2] \le c_1\, \mathrm{E}[X_n]^2, \qquad \mathrm{E}[X_n] \ge c_2$$

hold for every $n$, then it follows from the Paley–Zygmund inequality that for every $n$ and $\theta$ in $(0, 1)$, $$\Pr(X_n \ge c_2 \theta) \ge \frac{(1 - \theta)^2}{c_1}.$$

Consequently, the same inequality is satisfied by the limit $X$.
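The Paley–Zygmund inequality itself, $\Pr(X \ge \theta\, \mathrm{E}[X]) \ge (1 - \theta)^2\, \mathrm{E}[X]^2 / \mathrm{E}[X^2]$, can likewise be verified exactly on a small distribution. The distribution below is the same illustrative choice as before:

```python
# Exact check of the Paley–Zygmund inequality
#   Pr(X >= theta * E[X]) >= (1 - theta)^2 * E[X]^2 / E[X^2]
# on a hypothetical distribution: X = 0 w.p. 0.7, 1 w.p. 0.2, 5 w.p. 0.1.
support = {0: 0.7, 1: 0.2, 5: 0.1}
ex = sum(x * p for x, p in support.items())       # E[X]   = 0.7
ex2 = sum(x * x * p for x, p in support.items())  # E[X^2] = 2.7

for theta in [0.1 * k for k in range(1, 10)]:
    tail = sum(p for x, p in support.items() if x >= theta * ex)
    bound = (1 - theta) ** 2 * ex**2 / ex2
    assert tail >= bound  # holds for every theta in (0, 1)
```

Note how the bound degrades as $\theta \to 1$: asking for $X$ to exceed nearly all of its mean yields a weaker guarantee.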

Example application of method

Setup of problem

The Bernoulli bond percolation subgraph of a graph $G$ at parameter $p$ is a random subgraph obtained from $G$ by deleting every edge of $G$ with probability $1 - p$, independently. The infinite complete binary tree $T$ is an infinite tree where one vertex (called the root) has two neighbors and every other vertex has three neighbors. The second moment method can be used to show that at every parameter $p \in (1/2, 1]$, with positive probability the connected component of the root in the percolation subgraph of $T$ is infinite.

Application of method

Let $K$ be the percolation component of the root, and let $T_n$ be the set of vertices of $T$ that are at distance $n$ from the root. Let $X_n$ be the number of vertices in $T_n \cap K$.

To prove that $K$ is infinite with positive probability, it is enough to show that $\Pr(X_n > 0 \ \forall n) > 0$. Since the events $\{X_n > 0\}$ form a decreasing sequence, by continuity of probability measures this is equivalent to showing that $\inf_n \Pr(X_n > 0) > 0$.

The Cauchy–Schwarz inequality gives $$\mathrm{E}[X_n]^2 = \mathrm{E}[X_n \mathbf{1}_{X_n > 0}]^2 \le \mathrm{E}[X_n^2]\, \mathrm{E}[(\mathbf{1}_{X_n > 0})^2] = \mathrm{E}[X_n^2]\, \Pr(X_n > 0).$$ Therefore, it is sufficient to show that $$\inf_n \frac{\mathrm{E}[X_n]^2}{\mathrm{E}[X_n^2]} > 0,$$ that is, that the second moment is bounded from above by a constant times the first moment squared (and both are nonzero). In many applications of the second moment method, one is not able to calculate the moments precisely, but can nevertheless establish this inequality.

In this particular application, these moments can be calculated. For every specific $v$ in $T_n$, $$\Pr(v \in K) = p^n.$$ Since $|T_n| = 2^n$, it follows that $$\mathrm{E}[X_n] = 2^n p^n,$$ which is the first moment. Now comes the second moment calculation: $$\mathrm{E}[X_n^2] = \mathrm{E}\Bigl[\sum_{v \in T_n} \sum_{u \in T_n} \mathbf{1}_{v \in K} \mathbf{1}_{u \in K}\Bigr] = \sum_{v \in T_n} \sum_{u \in T_n} \Pr(v, u \in K).$$ For each pair $v$, $u$ in $T_n$, let $w(v, u)$ denote the vertex in $T$ that is farthest away from the root and lies on the simple path in $T$ to each of the two vertices $v$ and $u$, and let $k(v, u)$ denote the distance from $w$ to the root. In order for $v$, $u$ to both be in $K$, it is necessary and sufficient for the three simple paths from $w(v, u)$ to $v$, $u$ and the root to be in $K$. Since the number of edges contained in the union of these three paths is $2n - k(v, u)$, we obtain $$\Pr(v, u \in K) = p^{2n - k(v, u)}.$$ The number of pairs $(v, u)$ such that $k(v, u) = s$ is equal to $2^s \cdot 2^{n-s} \cdot 2^{n-s-1} = 2^{2n-s-1}$ for $s = 0, 1, \dots, n - 1$, and equal to $2^n$ for $s = n$. Hence, for $p > \tfrac{1}{2}$, $$\mathrm{E}[X_n^2] = (2p)^n + \sum_{s=0}^{n-1} 2^{2n-s-1} p^{2n-s} = \frac{(2p)^{n+1} - 2(2p)^n + (2p)^{2n+1}}{4p - 2},$$ so that $$\frac{(\mathrm{E}[X_n])^2}{\mathrm{E}[X_n^2]} = \frac{4p - 2}{(2p)^{1-n} - 2(2p)^{-n} + 2p} \ge 2 - \frac{1}{p} > 0,$$ which completes the proof.
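On the binary tree the probability $\Pr(X_n > 0)$ can also be computed exactly by a standard recursion (each child subtree survives independently), which makes it possible to check the moment bound numerically. The sketch below compares this exact probability with the ratio derived above, for one illustrative choice of $p$:

```python
# Exact check of Pr(X_n > 0) >= (E[X_n])^2 / E[X_n^2] on the binary tree.
# q_n = Pr(X_n > 0) satisfies q_n = 1 - (1 - p*q_{n-1})^2, since the root
# reaches depth n iff at least one of its two child subtrees (each kept
# with probability p, independently) reaches depth n - 1.
p = 0.75  # any p in (1/2, 1] works; 0.75 is an illustrative choice

q = 1.0  # q_0 = 1: the root itself always belongs to K
for n in range(1, 31):
    q = 1 - (1 - p * q) ** 2
    # Closed form from the moment calculation:
    # (E[X_n])^2 / E[X_n^2] = (4p - 2) / ((2p)^(1-n) - 2(2p)^(-n) + 2p)
    ratio = (4 * p - 2) / ((2 * p) ** (1 - n) - 2 * (2 * p) ** (-n) + 2 * p)
    assert q >= ratio > 0

# The ratios increase to their limit 2 - 1/p, uniformly bounding
# inf_n Pr(X_n > 0) away from zero.
assert q >= 2 - 1 / p
```

For $p = 0.75$ the recursion converges to a survival probability of roughly $0.89$ per level, while the second moment bound guarantees at least $2 - 1/p \approx 0.67$.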

Discussion

  • The choice of the random variables $X_n$ was rather natural in this setup. In some more difficult applications of the method, some ingenuity might be required in order to choose the random variables $X_n$ for which the argument can be carried through.
  • The Paley–Zygmund inequality is sometimes used instead of the Cauchy–Schwarz inequality and may occasionally give more refined results.
  • Under the (incorrect) assumption that the events $\{v \in K\}$, $v \in T_n$, are always independent, one has $\Pr(v, u \in K) = \Pr(v \in K)\, \Pr(u \in K)$, and the second moment is equal to the first moment squared. The second moment method typically works in situations in which the corresponding events or random variables are "nearly independent".
  • In this application, the random variables $X_n$ are given as sums $X_n = \sum_{v \in T_n} \mathbf{1}_{v \in K}$. In other applications, the corresponding useful random variables are integrals $X_n = \int f_n(t)\, d\mu(t)$, where the functions $f_n$ are random. In such a situation, one considers the product measure $\mu \times \mu$ and calculates $$\mathrm{E}[X_n^2] = \mathrm{E}\Bigl[\int\!\!\int f_n(x) f_n(y)\, d\mu(x)\, d\mu(y)\Bigr] = \int\!\!\int \mathrm{E}[f_n(x) f_n(y)]\, d\mu(x)\, d\mu(y),$$ where the last step is typically justified using Fubini's theorem.
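The Fubini step in the last point can be illustrated on a toy discrete measure. In the sketch below (all choices hypothetical: $\mu$ uniform on four points, $f(t) = Z\, g(t)$ with $Z$ a random scalar), the second moment is computed both by averaging the squared integral over draws of $f$ and by integrating $\mathrm{E}[f(x)f(y)]$ against $\mu \times \mu$; the two orders of integration agree:

```python
import itertools
import random

random.seed(1)
points = [0.1, 0.3, 0.6, 0.9]  # support of mu, weight 1/4 each (toy choice)

def g(t):
    return 1 + t  # deterministic part of the random function f = Z * g

# Order 1: draw f, integrate over mu, then square and average over draws.
draws = 200_000
acc = 0.0
for _ in range(draws):
    z = random.choice([-1.0, 2.0])                     # E[Z^2] = 2.5
    x = sum(z * g(t) for t in points) / len(points)    # X = integral f d(mu)
    acc += x * x
mc_second_moment = acc / draws

# Order 2 (Fubini): integrate E[f(x)f(y)] = E[Z^2] g(x) g(y) over mu x mu.
ez2 = 2.5
exact = sum(ez2 * g(x) * g(y)
            for x, y in itertools.product(points, points)) / len(points) ** 2

assert abs(mc_second_moment - exact) < 0.05  # agree up to Monte Carlo error
```

Swapping the expectation inside the double integral is what lets one reduce the second moment of an integral to pointwise correlations $\mathrm{E}[f_n(x) f_n(y)]$, exactly as the pair probabilities $\Pr(v, u \in K)$ did in the tree example.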

References
