Hurdle model

Revision as of 10:49, 20 February 2025 by 185.104.138.94 (talk)

A hurdle model is a class of statistical models in which a random variable is modelled in two parts: the first gives the probability of attaining the value 0, and the second models the probability of the non-zero values. Hurdle models are often motivated by an excess of zeroes in the data that more standard statistical models do not adequately account for.

In a hurdle model, a random variable x is modelled as

$$\Pr(x = 0) = \theta$$
$$\Pr(x \neq 0) = p_{x \neq 0}(x)$$

where $p_{x \neq 0}(x)$ is a truncated probability distribution function, truncated at 0.
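As an illustration of the two-part definition, the following minimal sketch evaluates a hurdle model for count data whose non-zero part is a zero-truncated Poisson distribution. The function names and the parameter values (`theta=0.6`, `lam=2.0`) are hypothetical, chosen only for this example. It also assumes the usual convention that the truncated distribution is scaled by 1 − θ, so that all probabilities sum to one.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Ordinary Poisson probability Pr(X = k) with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def hurdle_poisson_pmf(k: int, theta: float, lam: float) -> float:
    """Hurdle-Poisson probability of the count k (illustrative helper).

    theta is Pr(x = 0). Positive counts follow a Poisson(lam)
    distribution truncated at zero, scaled by (1 - theta); the
    scaling is an assumption of this sketch.
    """
    if k < 0:
        return 0.0
    if k == 0:
        return theta
    # Zero-truncated Poisson: renormalise by the Poisson mass above zero.
    truncated = poisson_pmf(k, lam) / (1.0 - math.exp(-lam))
    return (1.0 - theta) * truncated


# Probabilities for counts 0..59; the tail beyond that is negligible for lam = 2.
probs = [hurdle_poisson_pmf(k, theta=0.6, lam=2.0) for k in range(60)]
total = sum(probs)
print(f"Pr(x = 0) = {probs[0]:.3f}, total mass = {total:.6f}")
```

In this sketch the zero probability is set entirely by `theta`, while `lam` only shapes how the remaining mass is spread over the positive counts.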

Hurdle models were introduced by John G. Cragg in 1971,[1] in a formulation where the non-zero values of x were modelled using a normal model and a probit model was used to model the zeros. The probit part of the model was said to model the presence of "hurdles" that must be overcome for x to attain non-zero values, hence the designation hurdle model. Hurdle models were later developed for count data, with Poisson, geometric,[2] and negative binomial[3] models for the non-zero counts.

Relationship with zero-inflated models

Hurdle models differ from zero-inflated models in that zero-inflated models model the zeros using a two-component mixture model. With a mixture model, the probability of the variable being zero is determined by both the main distribution function p(x=0) and the mixture weight π. Specifically, a zero-inflated model for a random variable x is

$$\Pr(x = 0) = \pi + (1 - \pi) \times p(x = 0)$$
$$\Pr(x = h_i) = (1 - \pi) \times p(x = h_i)$$

where π is the mixture weight that determines the amount of zero-inflation. Because π cannot be negative, a zero-inflated model can only increase Pr(x=0) relative to the main distribution's p(x=0); hurdle models are not subject to this restriction.[4]
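The contrast can be checked numerically. The sketch below assumes a Poisson main distribution with a hypothetical rate `lam = 2.0`; the function names and the sample values of the mixture weight are illustrative only. It computes the zero probability of a zero-inflated Poisson model for several mixture weights and compares it with the zero probability of the main distribution.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Ordinary Poisson probability Pr(X = k) with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k: int, pi: float, lam: float) -> float:
    """Zero-inflated Poisson: a point mass at 0 mixed with Poisson(lam)."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


lam = 2.0  # hypothetical rate of the main distribution
base_zero = poisson_pmf(0, lam)  # p(x = 0) under the main distribution

# For every mixture weight in [0, 1], Pr(x = 0) is at least p(x = 0).
zip_zeros = [zip_pmf(0, pi, lam) for pi in (0.0, 0.25, 0.5, 1.0)]
never_deflates = all(z >= base_zero for z in zip_zeros)

# A hurdle model sets Pr(x = 0) = theta directly, so theta may lie
# below p(x = 0), which no valid mixture weight can achieve.
hurdle_theta = 0.05

print(f"p(x = 0) = {base_zero:.3f}")
print("zero-inflated Pr(x = 0):", [round(z, 3) for z in zip_zeros])
print("hurdle theta below p(x = 0):", hurdle_theta < base_zero)
```

Here the zero-inflated values never fall below `base_zero`, whereas the hurdle parameter can be chosen on either side of it, so a hurdle model can also describe data with fewer zeros than the main distribution would predict.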
