Weibull distribution


In probability theory and statistics, the Weibull distribution is a continuous probability distribution. It models a broad range of random variables, most often a time to failure or a time between events. Examples are maximum one-day rainfalls and the time a user spends on a web page.

The distribution is named after Swedish mathematician Waloddi Weibull, who described it in detail in 1939,[1][2] although it was first identified by René Maurice Fréchet and first applied by Rosin and Rammler (1933) to describe a particle size distribution.

Definition

Standard parameterization

The probability density function of a Weibull random variable is[3][4]

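$$f(x;\lambda,k) = \begin{cases} \dfrac{k}{\lambda}\left(\dfrac{x}{\lambda}\right)^{k-1} e^{-(x/\lambda)^{k}}, & x \geq 0, \\ 0, & x < 0, \end{cases}$$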

where k > 0 is the shape parameter and λ > 0 is the scale parameter of the distribution. Its complementary cumulative distribution function is a stretched exponential function. The Weibull distribution is related to a number of other probability distributions; in particular, it interpolates between the exponential distribution (k = 1) and the Rayleigh distribution (k = 2 and $\lambda = \sqrt{2}\,\sigma$).[5]

If the quantity, x, is a "time-to-failure", the Weibull distribution gives a distribution for which the failure rate is proportional to a power of time. The shape parameter, k, is that power plus one, and so this parameter can be interpreted directly as follows:[6]

  • A value of k < 1 indicates that the failure rate decreases over time (as in the case of the Lindy effect, which however corresponds to Pareto distributions[7] rather than Weibull distributions). This happens if there is significant "infant mortality", that is, defective items fail early and the failure rate decreases over time as the defective items are weeded out of the population. In the context of the diffusion of innovations, this means negative word of mouth: the hazard function is a monotonically decreasing function of the proportion of adopters;
  • A value of k=1 indicates that the failure rate is constant over time. This might suggest random external events are causing mortality, or failure. The Weibull distribution reduces to an exponential distribution;
  • A value of k > 1 indicates that the failure rate increases with time. This happens if there is an "aging" process, or parts that are more likely to fail as time goes on. In the context of the diffusion of innovations, this means positive word of mouth: the hazard function is a monotonically increasing function of the proportion of adopters. The function is first convex, then concave with an inflection point at $(e^{1/k} - 1)/e^{1/k}$, $k > 1$.
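The three regimes listed above can be checked numerically. The following is a minimal sketch, assuming the failure-rate formula h(x) = (k/λ)(x/λ)^(k−1) stated later in the article; the function name `weibull_hazard` and the chosen evaluation points are illustrative only.

```python
def weibull_hazard(x, lam, k):
    """Failure rate h(x) = (k/lam) * (x/lam)**(k-1), intended for x > 0."""
    return (k / lam) * (x / lam) ** (k - 1)

# Evaluate the hazard at three increasing times for each regime of k.
times = (0.5, 1.0, 2.0)
for k in (0.5, 1.0, 2.0):
    rates = [weibull_hazard(t, 1.0, k) for t in times]
    print(f"k={k}: {rates}")
```

With λ = 1, the printed rates fall over time for k = 0.5, stay at 1 for k = 1, and rise for k = 2.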

In the field of materials science, the shape parameter k of a distribution of strengths is known as the Weibull modulus. In the context of diffusion of innovations, the Weibull distribution is a "pure" imitation/rejection model.

Alternative parameterizations

First alternative

Applications in medical statistics and econometrics often adopt a different parameterization.[8][9] The shape parameter k is the same as above, while the scale parameter is $b = \lambda^{-k}$. In this case, for x ≥ 0, the probability density function is

$$f(x;k,b) = b k x^{k-1} e^{-b x^{k}},$$

the cumulative distribution function is

$$F(x;k,b) = 1 - e^{-b x^{k}},$$

the quantile function is

$$Q(p;k,b) = \left(-\frac{1}{b}\ln(1-p)\right)^{1/k},$$

the hazard function is

$$h(x;k,b) = b k x^{k-1},$$

and the mean is

$$b^{-1/k}\,\Gamma(1 + 1/k).$$

Second alternative

A second alternative parameterization can also be found.[10][11] The shape parameter k is the same as in the standard case, while the scale parameter λ is replaced with a rate parameter β = 1/λ. Then, for x ≥ 0, the probability density function is

$$f(x;k,\beta) = \beta k (\beta x)^{k-1} e^{-(\beta x)^{k}},$$

the cumulative distribution function is

$$F(x;k,\beta) = 1 - e^{-(\beta x)^{k}},$$

the quantile function is

$$Q(p;k,\beta) = \frac{1}{\beta}\left(-\ln(1-p)\right)^{1/k},$$

and the hazard function is

$$h(x;k,\beta) = \beta k (\beta x)^{k-1}.$$

In all three parameterizations, the hazard is decreasing for k < 1, increasing for k > 1 and constant for k = 1, in which case the Weibull distribution reduces to an exponential distribution.
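As a quick consistency check, the three parameterizations should give the same cumulative probability once the parameters are converted with b = λ^(−k) and β = 1/λ. The sketch below assumes those conversions; the function names are illustrative and not taken from any library.

```python
import math

def cdf_standard(x, lam, k):
    """Standard form: F = 1 - exp(-(x/lam)**k)."""
    return 1.0 - math.exp(-((x / lam) ** k))

def cdf_first_alt(x, k, b):
    """First alternative: F = 1 - exp(-b * x**k), with b = lam**(-k)."""
    return 1.0 - math.exp(-b * x ** k)

def cdf_second_alt(x, k, beta):
    """Second alternative: F = 1 - exp(-(beta*x)**k), with beta = 1/lam."""
    return 1.0 - math.exp(-((beta * x) ** k))

lam, k = 2.0, 1.5
b = lam ** (-k)      # scale parameter of the first alternative
beta = 1.0 / lam     # rate parameter of the second alternative
x = 1.3
values = (cdf_standard(x, lam, k), cdf_first_alt(x, k, b), cdf_second_alt(x, k, beta))
print(values)
```

The three printed values agree up to floating-point rounding.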

Properties

Density function

The form of the density function of the Weibull distribution changes drastically with the value of k. For 0 < k < 1, the density function tends to ∞ as x approaches zero from above and is strictly decreasing. For k = 1, the density function tends to 1/λ as x approaches zero from above and is strictly decreasing. For k > 1, the density function tends to zero as x approaches zero from above, increases until its mode and decreases after it. The density function has infinite negative slope at x = 0 if 0 < k < 1, infinite positive slope at x = 0 if 1 < k < 2 and null slope at x = 0 if k > 2. For k = 1 the density has a finite negative slope at x = 0. For k = 2 the density has a finite positive slope at x = 0. As k goes to infinity, the Weibull distribution converges to a Dirac delta distribution centered at x = λ. Moreover, the skewness and coefficient of variation depend only on the shape parameter. A generalization of the Weibull distribution is the hyperbolastic distribution of type III.

Cumulative distribution function

The cumulative distribution function for the Weibull distribution is

$$F(x;k,\lambda) = 1 - e^{-(x/\lambda)^{k}}$$

for x ≥ 0, and F(x; k, λ) = 0 for x < 0.

If x = λ then F(x; k, λ) = 1 − e^(−1) ≈ 0.632 for all values of k. Conversely, F(x; k, λ) = 0.632 corresponds to x ≈ λ.

The quantile (inverse cumulative distribution) function for the Weibull distribution is

$$Q(p;k,\lambda) = \lambda\left(-\ln(1-p)\right)^{1/k}$$

for 0 ≤ p < 1.

The failure rate h (or hazard function) is given by

$$h(x;k,\lambda) = \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1}.$$

The mean time between failures (MTBF) is

$$\text{MTBF}(k,\lambda) = \lambda\,\Gamma(1 + 1/k).$$
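The cumulative distribution function and its inverse can be sketched in a few lines. Because the quantile function is available in closed form, inverse transform sampling (feeding a uniform random number into the quantile function) is one simple way to draw Weibull variates. The names below are illustrative, and the seeded generator is only there to make the example repeatable.

```python
import math
import random

def weibull_cdf(x, lam, k):
    """F(x; k, lam) = 1 - exp(-(x/lam)**k) for x >= 0, and 0 for x < 0."""
    return 0.0 if x < 0 else 1.0 - math.exp(-((x / lam) ** k))

def weibull_quantile(p, lam, k):
    """Q(p; k, lam) = lam * (-ln(1 - p))**(1/k) for 0 <= p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return lam * (-math.log1p(-p)) ** (1.0 / k)

def weibull_sample(rng, lam, k):
    """Inverse transform sampling: rng.random() is uniform on [0, 1)."""
    return weibull_quantile(rng.random(), lam, k)

lam, k = 2.0, 1.5
rng = random.Random(0)
draws = [weibull_sample(rng, lam, k) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
mtbf = lam * math.gamma(1.0 + 1.0 / k)   # lam * Gamma(1 + 1/k)
print(weibull_cdf(lam, lam, k), sample_mean, mtbf)
```

The first printed value is 1 − e^(−1) regardless of k, and the sample mean should land close to the MTBF expression.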

Moments

The moment generating function of the logarithm of a Weibull distributed random variable is given by[12]

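$$\operatorname{E}\left[e^{t\log X}\right] = \lambda^{t}\,\Gamma\left(\frac{t}{k} + 1\right)$$

where Γ is the gamma function. Similarly, the characteristic function of log X is given by

$$\operatorname{E}\left[e^{it\log X}\right] = \lambda^{it}\,\Gamma\left(\frac{it}{k} + 1\right).$$

In particular, the nth raw moment of X is given by

$$m_n = \lambda^{n}\,\Gamma\left(1 + \frac{n}{k}\right).$$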

The mean and variance of a Weibull random variable can be expressed as

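$$\operatorname{E}(X) = \lambda\,\Gamma\left(1 + \frac{1}{k}\right)$$

and

$$\operatorname{var}(X) = \lambda^{2}\left[\Gamma\left(1 + \frac{2}{k}\right) - \left(\Gamma\left(1 + \frac{1}{k}\right)\right)^{2}\right].$$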

The skewness is given by

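$$\gamma_1 = \frac{2\Gamma_1^3 - 3\Gamma_1\Gamma_2 + \Gamma_3}{\left[\Gamma_2 - \Gamma_1^2\right]^{3/2}}$$

where $\Gamma_i = \Gamma(1 + i/k)$, which may also be written as

$$\gamma_1 = \frac{\Gamma\left(1 + \frac{3}{k}\right)\lambda^3 - 3\mu\sigma^2 - \mu^3}{\sigma^3}$$

where the mean is denoted by μ and the standard deviation is denoted by σ.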

The excess kurtosis is given by

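$$\gamma_2 = \frac{-6\Gamma_1^4 + 12\Gamma_1^2\Gamma_2 - 3\Gamma_2^2 - 4\Gamma_1\Gamma_3 + \Gamma_4}{\left[\Gamma_2 - \Gamma_1^2\right]^2}$$

where $\Gamma_i = \Gamma(1 + i/k)$. The excess kurtosis may also be written as

$$\gamma_2 = \frac{\lambda^4\,\Gamma\left(1 + \frac{4}{k}\right) - 4\gamma_1\sigma^3\mu - 6\mu^2\sigma^2 - \mu^4}{\sigma^4} - 3.$$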
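The moment formulas above translate directly into code through the gamma function. The following sketch, with the illustrative name `weibull_moments`, evaluates the mean, variance, skewness and excess kurtosis from the $\Gamma_i$ expressions. A convenient sanity check is k = 1, where the distribution is exponential and the skewness and excess kurtosis are 2 and 6.

```python
import math

def weibull_moments(lam, k):
    """Return (mean, variance, skewness, excess kurtosis) of Weibull(lam, k)."""
    # G_i = Gamma(1 + i/k) for i = 1..4, as in the formulas above.
    g1, g2, g3, g4 = (math.gamma(1.0 + i / k) for i in (1, 2, 3, 4))
    mean = lam * g1
    var = lam ** 2 * (g2 - g1 ** 2)
    skew = (2 * g1 ** 3 - 3 * g1 * g2 + g3) / (g2 - g1 ** 2) ** 1.5
    ex_kurt = (-6 * g1 ** 4 + 12 * g1 ** 2 * g2 - 3 * g2 ** 2
               - 4 * g1 * g3 + g4) / (g2 - g1 ** 2) ** 2
    return mean, var, skew, ex_kurt

print(weibull_moments(3.0, 1.0))   # exponential case with mean 3
```

Note that the skewness and excess kurtosis do not involve λ, consistent with the statement that they depend only on the shape parameter.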

Moment generating function

A variety of expressions are available for the moment generating function of X itself. As a power series, since the raw moments are already known, one has

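$$\operatorname{E}\left[e^{tX}\right] = \sum_{n=0}^{\infty} \frac{t^{n}\lambda^{n}}{n!}\,\Gamma\left(1 + \frac{n}{k}\right).$$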
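A truncated version of this power series is easy to evaluate. The sketch below is only meant for arguments where the series converges quickly (for example k ≥ 1 with |tλ| < 1); the function name and the default number of terms are assumptions made for illustration. For k = 1 the terms reduce to (tλ)^n, so the sum can be compared with the exponential-distribution value 1/(1 − λt).

```python
import math

def weibull_mgf_series(t, lam, k, terms=60):
    """Partial sum of sum_n (t*lam)**n / n! * Gamma(1 + n/k).

    Very small k can overflow math.gamma; this sketch does not guard against that.
    """
    total = 0.0
    for n in range(terms):
        total += (t * lam) ** n / math.factorial(n) * math.gamma(1.0 + n / k)
    return total

approx = weibull_mgf_series(0.5, 1.0, 1.0)
exact = 1.0 / (1.0 - 0.5)   # closed form for k = 1, lam = 1, t = 0.5
print(approx, exact)
```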

Alternatively, one can attempt to deal directly with the integral

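$$\operatorname{E}\left[e^{tX}\right] = \int_0^{\infty} e^{tx}\,\frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1} e^{-(x/\lambda)^{k}}\,dx.$$

If the parameter k is assumed to be a rational number, expressed as k = p/q where p and q are integers, then this integral can be evaluated analytically.[13] With t replaced by −t, one finds

$$\operatorname{E}\left[e^{-tX}\right] = \frac{1}{\lambda^{k} t^{k}}\,\frac{p^{k}\sqrt{q/p}}{(\sqrt{2\pi})^{q+p-2}}\,G_{p,q}^{\,q,p}\!\left(\left.\begin{matrix}\frac{1-k}{p}, \frac{2-k}{p}, \dots, \frac{p-k}{p} \\ \frac{0}{q}, \frac{1}{q}, \dots, \frac{q-1}{q}\end{matrix}\;\right|\,\frac{p^{p}}{\left(q\,\lambda^{k} t^{k}\right)^{q}}\right)$$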

where G is the Meijer G-function.

The characteristic function of X has also been obtained in the literature, and the characteristic function and moment generating function of the 3-parameter Weibull distribution have been derived by a direct approach.

Minima

Let $X_1, X_2, \ldots, X_n$ be independent and identically distributed Weibull random variables with scale parameter λ and shape parameter k. If the minimum of these n random variables is $Z = \min(X_1, X_2, \ldots, X_n)$, then the cumulative probability distribution of Z is given by

$$F(z) = 1 - e^{-n(z/\lambda)^{k}}.$$

That is, Z will also be Weibull distributed with scale parameter $n^{-1/k}\lambda$ and with shape parameter k.
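The result follows because the minimum exceeds z exactly when all n variables do, so the survival function of Z is the nth power of the individual survival function. A small numerical sketch of that identity, with illustrative function names and parameter values:

```python
import math

def weibull_sf(x, lam, k):
    """Survival function 1 - F(x) = exp(-(x/lam)**k) for x >= 0."""
    return math.exp(-((x / lam) ** k))

def min_scale(lam, k, n):
    """Scale parameter of the minimum of n i.i.d. Weibull(lam, k) variables."""
    return n ** (-1.0 / k) * lam

lam, k, n = 2.0, 1.5, 5
z = 0.7
lhs = weibull_sf(z, lam, k) ** n                # P(all n variables exceed z)
rhs = weibull_sf(z, min_scale(lam, k, n), k)    # survival of Weibull(n**(-1/k)*lam, k)
print(lhs, rhs)
```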

Reparametrization tricks

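Fix some $\alpha > 0$. Let $(\pi_1, \ldots, \pi_n)$ be nonnegative and not all zero, and let $g_1, \ldots, g_n$ be independent samples of $\text{Weibull}(1, \alpha^{-1})$. Then[14]

  • $\arg\min_i \left(g_i \pi_i^{-\alpha}\right) \sim \text{Categorical}\left(\frac{\pi_j}{\sum_i \pi_i}\right)_j$
  • $\min_i \left(g_i \pi_i^{-\alpha}\right) \sim \text{Weibull}\left(\left(\sum_i \pi_i\right)^{-\alpha}, \alpha^{-1}\right)$.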
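A seeded Monte Carlo sketch can illustrate both statements. It reads Weibull(a, b) as scale a and shape b, which is an assumption about the notation rather than something the text states; under that reading the minimum has mean $(\sum_i \pi_i)^{-\alpha}\,\Gamma(1+\alpha)$. The weights, the value of α and the helper name `argmin_trick` are illustrative choices.

```python
import math
import random

def argmin_trick(rng, weights, alpha):
    """Return (index, value) of min_i g_i * pi_i**(-alpha).

    g_i is drawn as Weibull with scale 1 and shape 1/alpha (assumed reading).
    """
    best_i, best_v = None, float("inf")
    for i, w in enumerate(weights):
        if w == 0:
            continue  # a zero weight gives an infinite value, never the minimum
        g = rng.weibullvariate(1.0, 1.0 / alpha)  # (scale, shape) in Python's API
        v = g * w ** (-alpha)
        if v < best_v:
            best_i, best_v = i, v
    return best_i, best_v

rng = random.Random(0)
weights = [1.0, 2.0, 3.0]
alpha = 2.0
trials = 20000
counts = [0] * len(weights)
total_min = 0.0
for _ in range(trials):
    idx, val = argmin_trick(rng, weights, alpha)
    counts[idx] += 1
    total_min += val

freqs = [c / trials for c in counts]
mean_min = total_min / trials
expected_mean = sum(weights) ** (-alpha) * math.gamma(1.0 + alpha)
print(freqs, mean_min, expected_mean)
```

The empirical frequencies should be close to the normalized weights 1/6, 1/3 and 1/2.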

Shannon entropy

The information entropy is given by[15]

$$H(\lambda,k) = \gamma\left(1 - \frac{1}{k}\right) + \ln\left(\frac{\lambda}{k}\right) + 1$$

where γ is the Euler–Mascheroni constant. The Weibull distribution is the maximum entropy distribution for a non-negative real random variate with a fixed expected value of $x^k$ equal to $\lambda^k$ and a fixed expected value of $\ln(x^k)$ equal to $\ln(\lambda^k) - \gamma$.
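The entropy formula can be evaluated directly. In the sketch below the constant `EULER_GAMMA` is hard-coded because Python's `math` module does not provide it, and the function name is illustrative. For k = 1 the expression reduces to 1 + ln λ, the entropy of an exponential distribution with mean λ.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler–Mascheroni constant

def weibull_entropy(lam, k):
    """H(lam, k) = gamma * (1 - 1/k) + ln(lam / k) + 1 (in nats)."""
    return EULER_GAMMA * (1.0 - 1.0 / k) + math.log(lam / k) + 1.0

print(weibull_entropy(2.0, 1.0), 1.0 + math.log(2.0))
```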

Kullback–Leibler divergence

The Kullback–Leibler divergence between two Weibull distributions is given by[16]

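$$D_{\text{KL}}(\mathrm{Weib}_1 \parallel \mathrm{Weib}_2) = \log\frac{k_1}{\lambda_1^{k_1}} - \log\frac{k_2}{\lambda_2^{k_2}} + (k_1 - k_2)\left[\log\lambda_1 - \frac{\gamma}{k_1}\right] + \left(\frac{\lambda_1}{\lambda_2}\right)^{k_2}\Gamma\left(\frac{k_2}{k_1} + 1\right) - 1$$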
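A direct transcription of this expression is sketched below, with γ taken to be the Euler–Mascheroni constant as in the entropy formula. Two simple checks are that the divergence of a distribution from itself is zero, and that for k₁ = k₂ = 1 the expression reduces to ln(λ₂/λ₁) + λ₁/λ₂ − 1, the divergence between two exponential distributions. The function name is illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler–Mascheroni constant

def weibull_kl(lam1, k1, lam2, k2):
    """Kullback–Leibler divergence D(Weib1 || Weib2) in nats."""
    return (math.log(k1 / lam1 ** k1)
            - math.log(k2 / lam2 ** k2)
            + (k1 - k2) * (math.log(lam1) - EULER_GAMMA / k1)
            + (lam1 / lam2) ** k2 * math.gamma(k2 / k1 + 1.0)
            - 1.0)

print(weibull_kl(2.0, 1.5, 2.0, 1.5))   # identical distributions
print(weibull_kl(1.0, 1.0, 2.0, 1.0))   # two exponential distributions
```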

Parameter estimation

Ordinary least squares using Weibull plot

[Figure: Weibull plot]

The fit of a Weibull distribution to data can be visually assessed using a Weibull plot.[17] The Weibull plot is a plot of the empirical cumulative distribution function $\widehat{F}(x)$ of data on special axes in a type of Q–Q plot. The axes are $\ln(-\ln(1 - \widehat{F}(x)))$ versus $\ln(x)$. The reason for this change of variables is that the cumulative distribution function can be linearized:

$$\begin{aligned} F(x) &= 1 - e^{-(x/\lambda)^{k}} \\ -\ln(1 - F(x)) &= (x/\lambda)^{k} \\ \underbrace{\ln(-\ln(1 - F(x)))}_{y} &= \underbrace{k \ln x}_{mx} + \underbrace{(-k \ln \lambda)}_{c} \end{aligned}$$

which can be seen to be in the standard form of a straight line. Therefore, if the data came from a Weibull distribution then a straight line is expected on a Weibull plot.

There are various approaches to obtaining the empirical distribution function from data. One method is to obtain the vertical coordinate for each point using

$$\widehat{F} = \frac{i - 0.3}{n + 0.4},$$

where i is the rank of the data point and n is the number of data points.[18][19] Another common estimator[20] is

$$\widehat{F} = \frac{i - 0.5}{n}.$$

Linear regression can also be used to numerically assess goodness of fit and estimate the parameters of the Weibull distribution. The gradient informs one directly about the shape parameter k and the scale parameter λ can also be inferred.
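The regression step can be sketched as follows: the slope of the fitted line estimates k, and since the intercept is c = −k ln λ, the scale follows as λ = exp(−c/k). The sketch uses the (i − 0.3)/(n + 0.4) plotting positions and an ordinary least-squares line; the function name and the synthetic data are illustrative, and the data are assumed to be strictly positive.

```python
import math

def weibull_plot_fit(data):
    """Estimate (k, lam) by least squares on the Weibull plot coordinates."""
    xs = sorted(data)
    n = len(xs)
    us, vs = [], []
    for i, x in enumerate(xs, start=1):
        f_hat = (i - 0.3) / (n + 0.4)                # plotting position for rank i
        us.append(math.log(x))                       # horizontal axis: ln x
        vs.append(math.log(-math.log(1.0 - f_hat)))  # vertical axis: ln(-ln(1 - F))
    mean_u = sum(us) / n
    mean_v = sum(vs) / n
    sxy = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs))
    sxx = sum((u - mean_u) ** 2 for u in us)
    slope = sxy / sxx
    intercept = mean_v - slope * mean_u
    k_hat = slope
    lam_hat = math.exp(-intercept / slope)
    return k_hat, lam_hat

# Synthetic data placed exactly at the Weibull(lam=2, k=1.5) quantiles
# of the plotting positions, so the points lie on a straight line.
n = 20
data = [2.0 * (-math.log(1.0 - (i - 0.3) / (n + 0.4))) ** (1.0 / 1.5)
        for i in range(1, n + 1)]
k_hat, lam_hat = weibull_plot_fit(data)
print(k_hat, lam_hat)
```

On real data the points scatter around the line, and the quality of the linear fit gives a rough numerical indication of how well the Weibull model describes the sample.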

Method of moments

The coefficient of variation of Weibull distribution depends only on the shape parameter:[21]

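$$CV^2 = \frac{\sigma^2}{\mu^2} = \frac{\Gamma\left(1 + \frac{2}{k}\right) - \left(\Gamma\left(1 + \frac{1}{k}\right)\right)^2}{\left(\Gamma\left(1 + \frac{1}{k}\right)\right)^2}.$$

Equating the sample quantities $s^2/\bar{x}^2$ to $\sigma^2/\mu^2$, the moment estimate of the shape parameter k can be read off either from a lookup table or a graph of $CV^2$ versus k. A more accurate estimate $\hat{k}$ can be found using a root-finding algorithm to solve

$$\frac{\Gamma\left(1 + \frac{2}{k}\right) - \left(\Gamma\left(1 + \frac{1}{k}\right)\right)^2}{\left(\Gamma\left(1 + \frac{1}{k}\right)\right)^2} = \frac{s^2}{\bar{x}^2}.$$

The moment estimate of the scale parameter can then be found using the first moment equation as

$$\hat{\lambda} = \frac{\bar{x}}{\Gamma\left(1 + \frac{1}{\hat{k}}\right)}.$$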
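One possible root-finding approach is bisection, sketched below. It relies on $CV^2$ decreasing as k increases, and on the root lying inside the bracket [0.05, 50]; both the bracket and the function names are assumptions of this sketch rather than part of the method itself.

```python
import math

def cv_squared(k):
    """Squared coefficient of variation of a Weibull distribution with shape k."""
    g1 = math.gamma(1.0 + 1.0 / k)
    g2 = math.gamma(1.0 + 2.0 / k)
    return (g2 - g1 * g1) / (g1 * g1)

def weibull_moment_fit(sample_mean, sample_var, lo=0.05, hi=50.0, iters=200):
    """Method-of-moments estimates (k_hat, lam_hat) by bisection on CV^2."""
    target = sample_var / sample_mean ** 2
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if cv_squared(mid) > target:
            lo = mid   # CV^2 too large means k is too small
        else:
            hi = mid
    k_hat = 0.5 * (lo + hi)
    lam_hat = sample_mean / math.gamma(1.0 + 1.0 / k_hat)
    return k_hat, lam_hat

# Feed in the exact mean and variance of Weibull(lam=2, k=1.5) as a check.
true_mean = 2.0 * math.gamma(1.0 + 1.0 / 1.5)
true_var = 4.0 * (math.gamma(1.0 + 2.0 / 1.5) - math.gamma(1.0 + 1.0 / 1.5) ** 2)
k_hat, lam_hat = weibull_moment_fit(true_mean, true_var)
print(k_hat, lam_hat)
```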

Maximum likelihood

The maximum likelihood estimator for the λ parameter given k is[21]

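$$\hat{\lambda} = \left(\frac{1}{n}\sum_{i=1}^{n} x_i^{k}\right)^{1/k}$$

The maximum likelihood estimator for k is the solution for k of the following equation[22]

$$0 = \frac{\sum_{i=1}^{n} x_i^{k}\ln x_i}{\sum_{i=1}^{n} x_i^{k}} - \frac{1}{k} - \frac{1}{n}\sum_{i=1}^{n}\ln x_i$$

This equation defines $\hat{k}$ only implicitly, so one must generally solve for k by numerical means.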
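One way to solve the implicit equation numerically is bisection, sketched below. The right-hand side of the equation increases with k, so a sign change inside the bracket pins down the root; the bracket [0.01, 100], the function name and the synthetic sample are assumptions of this sketch. The data are divided by their maximum before taking powers, which leaves the equation unchanged but avoids overflow. The sample is assumed to be strictly positive and not all equal.

```python
import math
import random

def weibull_mle(data, lo=0.01, hi=100.0, iters=60):
    """Maximum likelihood estimates (k_hat, lam_hat) for complete data."""
    n = len(data)
    x_max = max(data)
    scaled = [x / x_max for x in data]     # values in (0, 1], so powers cannot overflow
    logs = [math.log(s) for s in scaled]
    mean_log = sum(logs) / n

    def score(k):
        # sum(x^k ln x) / sum(x^k) - 1/k - mean(ln x), evaluated on rescaled data
        powers = [s ** k for s in scaled]
        weighted = sum(p * l for p, l in zip(powers, logs))
        return weighted / sum(powers) - 1.0 / k - mean_log

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if score(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    k_hat = 0.5 * (lo + hi)
    # lam_hat = (mean of x^k)^(1/k), undoing the rescaling by x_max
    lam_hat = x_max * (sum(s ** k_hat for s in scaled) / n) ** (1.0 / k_hat)
    return k_hat, lam_hat

rng = random.Random(1)
data = [rng.weibullvariate(2.0, 1.5) for _ in range(2000)]  # scale 2, shape 1.5
k_hat, lam_hat = weibull_mle(data)
print(k_hat, lam_hat)
```

With 2000 synthetic observations the estimates should land near the generating values k = 1.5 and λ = 2, up to sampling error.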

When $x_1 > x_2 > \cdots > x_N$ are the N largest observed samples from a dataset of more than N samples, then the maximum likelihood estimator for the λ parameter given k is[22]

$$\hat{\lambda}^{k} = \frac{1}{N}\sum_{i=1}^{N}\left(x_i^{k} - x_N^{k}\right)$$

Also given that condition, the maximum likelihood estimator for k is[citation needed]

$$0 = \frac{\sum_{i=1}^{N}\left(x_i^{k}\ln x_i - x_N^{k}\ln x_N\right)}{\sum_{i=1}^{N}\left(x_i^{k} - x_N^{k}\right)} - \frac{1}{N}\sum_{i=1}^{N}\ln x_i$$

Again, since this equation defines k only implicitly, one must generally solve for k by numerical means.

Applications

[Figure: Fitted cumulative Weibull distribution to maximum one-day rainfalls using CumFreq; see also distribution fitting.[23]]
[Figure: Fitted curves for oil production time series data.[24]]

The Weibull distribution is used[citation needed]

  • In survival analysis
  • In reliability engineering and failure analysis
  • In electrical engineering to represent overvoltage occurring in an electrical system
  • In industrial engineering to represent manufacturing and delivery times
  • In extreme value theory
  • In weather forecasting and the wind power industry to describe wind speed distributions, as the natural distribution often matches the Weibull shape[25]
  • In communications systems engineering
    • In radar systems to model the dispersion of the received signals level produced by some types of clutters
    • To model fading channels in wireless communications, as the Weibull fading model seems to exhibit good fit to experimental fading channel measurements
  • In information retrieval to model dwell times on web pages.[26]
  • In general insurance to model the size of reinsurance claims, and the cumulative development of asbestosis losses
  • In forecasting technological change (also known as the Sharif-Islam model)[27]
  • In hydrology the Weibull distribution is applied to extreme events such as annual maximum one-day rainfalls and river discharges.
  • In decline curve analysis to model oil production rate curve of shale oil wells.[24]
  • In describing the size of particles generated by grinding, milling and crushing operations, the two-parameter Weibull distribution is used, and in these applications it is sometimes known as the Rosin–Rammler distribution.[28] In this context it predicts fewer fine particles than the log-normal distribution and it is generally most accurate for narrow particle size distributions.[29] The interpretation of the cumulative distribution function is that F(x; k, λ) is the mass fraction of particles with diameter smaller than x, where λ is the mean particle size and k is a measure of the spread of particle sizes.
  • In describing random point clouds (such as the positions of particles in an ideal gas): the probability to find the nearest-neighbor particle at a distance x from a given particle is given by a Weibull distribution with k = 3 and $\rho = 1/\lambda^3$ equal to the density of the particles.[30]
  • In calculating the rate of radiation-induced single event effects onboard spacecraft, a four-parameter Weibull distribution is used to fit experimentally measured device cross section probability data to a particle linear energy transfer spectrum.[31] The Weibull fit was originally used because of a belief that particle energy levels align to a statistical distribution, but this belief was later proven false[citation needed] and the Weibull fit continues to be used because of its many adjustable parameters, rather than a demonstrated physical basis.[32]
