Giant component

From testwiki
Jump to navigation Jump to search

Template:Short description

An Erdős–Rényi–Gilbert random graph with 1000 vertices at the critical edge probability p=1/(n1), showing a large component and many small ones. At this edge probability, the large component is not yet a giant component: it contains only a sublinear number of vertices.

In network theory, a giant component is a connected component of a given random graph that contains a significant fraction of the entire graph's vertices.

More precisely, in graphs drawn randomly from a probability distribution over arbitrarily large graphs, a giant component is a connected component whose fraction of the overall number of vertices is bounded away from zero. In sufficiently dense graphs distributed according to the Erdős–Rényi model, a giant component exists with high probability.

Giant component in Erdős–Rényi model

Giant components are a prominent feature of the Erdős–Rényi model (ER) of random graphs, in which each possible edge connecting pairs of a given set of Template:Mvar vertices is present, independently of the other edges, with probability Template:Mvar. In this model, if p1ϵn for any constant ϵ>0, then with high probability (in the limit as n goes to infinity) all connected components of the graph have size Template:Math, and there is no giant component. However, for p1+ϵn there is with high probability a single giant component, with all other components having size Template:Math. For p=pc=1n, intermediate between these two possibilities, the number of vertices in the largest component of the graph, Pinf is with high probability proportional to n2/3.[1]

Giant component is also important in percolation theory.[1][2] When a fraction of nodes, q=1p, is removed randomly from an ER network of degree k, there exists a critical threshold, pc=1k. Above pc there exists a giant component (largest cluster) of size, Pinf. Pinf fulfills, Pinf=p(1exp(kPinf)). For p<pc the solution of this equation is Pinf=0, i.e., there is no giant component.

At pc, the distribution of cluster sizes behaves as a power law, n(s)~s5/2 which is a feature of phase transition.

Alternatively, if one adds randomly selected edges one at a time, starting with an empty graph, then it is not until approximately n/2 edges have been added that the graph contains a large component, and soon after that the component becomes giant. More precisely, when Template:Mvar edges have been added, for values of Template:Mvar close to but larger than n/2, the size of the giant component is approximately 4t2n.[1] However, according to the coupon collector's problem, Θ(nlogn) edges are needed in order to have high probability that the whole random graph is connected.

Graphs with arbitrary degree distributions

A similar sharp threshold between parameters that lead to graphs with all components small and parameters that lead to a giant component also occurs in tree-like random graphs with non-uniform degree distributions P(k). The degree distribution does not define a graph uniquely. However, under the assumption that in all respects other than their degree distribution, the graphs are treated as entirely random, many results on finite/infinite-component sizes are known. In this model, the existence of the giant component depends only on the first two (mixed) moments of the degree distribution. Let a randomly chosen vertex have degree k, then the giant component exists[3] if and only ifk22k>0.This is known as the Molloy and Reed condition.[4] The first moment of P(k) is the mean degree of the network. In general, the n-th moment is defined as kn=𝔼[kn]=knP(k).

When there is no giant component, the expected size of the small component can also be determined by the first and second moments and it is 1+k22k+k2. However, when there is a giant component, the size of the giant component is more tricky to evaluate.[2]

Criteria for giant component existence in directed and undirected configuration graphs

Similar expressions are also valid for directed graphs, in which case the degree distribution is two-dimensional.[5] There are three types of connected components in directed graphs. For a randomly chosen vertex:

  1. out-component is a set of vertices that can be reached by recursively following all out-edges forward;
  2. in-component is a set of vertices that can be reached by recursively following all in-edges backward;
  3. weak component is a set of vertices that can be reached by recursively following all edges regardless of their direction.

Let a randomly chosen vertex has kin in-edges and kout out edges. By definition, the average number of in- and out-edges coincides so that c=𝔼[kin]=𝔼[kout]. If G0(x)=kP(k)xk is the generating function of the degree distribution P(k) for an undirected network, then G1(x) can be defined as G1(x)=kkkP(k)xk1. For directed networks, generating function assigned to the joint probability distribution P(kin,kout) can be written with two valuables x and y as: 𝒢(x,y)=kin,koutP(kin,kout)xkinykout, then one can define g(x)=1c𝒢x|y=1 and f(y)=1c𝒢y|x=1. The criteria for giant component existence in directed and undirected random graphs are given in the table below:

Type Criteria
undirected: giant component 𝔼[k2]2𝔼[k]>0,[3] or G'1(1)=1[5]
directed: giant in/out-component 𝔼[kinkout]𝔼[kin]>0,[5] or g'1(1)=f'1(1)=1[5]
directed: giant weak component 2𝔼[kin]𝔼[kinkout]𝔼[kin]𝔼[kout2]𝔼[kin]𝔼[kin2]+𝔼[kin2]𝔼[kout2]𝔼[kinkout]2>0[6]

See also

References

Template:Reflist