In probability theory and statistics, the discrete Weibull distribution is the discrete analog of the continuous Weibull distribution. It was first introduced by Toshio Nakagawa and Shunji Osaki and is used predominantly in reliability engineering, where it is particularly suited to modeling failure data measured in discrete units such as cycles or shocks. More generally, it applies wherever the time to an event is counted in distinct intervals rather than measured continuously. The discrete Weibull distribution is infinitely divisible only for $0<\beta\leq 1$.
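In the original paper by Nakagawa and Osaki they used the parametrization $q = e^{-\alpha^{-\beta}}$, making the cumulative distribution function $1 - q^{(x+1)^{\beta}}$ with $q \in (0,1)$, and the probability mass function $q^{k^{\beta}} - q^{(k+1)^{\beta}}$. Setting $\beta = 1$ makes the relationship with the geometric distribution apparent.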
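As a minimal sketch of this parametrization, the following Python evaluates the probability mass function $q^{k^{\beta}} - q^{(k+1)^{\beta}}$ and the cumulative distribution function $1 - q^{(x+1)^{\beta}}$ directly, assuming the support $k = 0, 1, 2, \ldots$ The names `dw_pmf`, `dw_cdf` and `geometric_pmf` are illustrative and do not come from any library.

```python
def dw_pmf(k: int, q: float, beta: float) -> float:
    """P(X = k) = q**(k**beta) - q**((k + 1)**beta) for k = 0, 1, 2, ..."""
    if k < 0:
        return 0.0
    return q ** (k ** beta) - q ** ((k + 1) ** beta)


def dw_cdf(x: int, q: float, beta: float) -> float:
    """P(X <= x) = 1 - q**((x + 1)**beta) for x = 0, 1, 2, ..."""
    if x < 0:
        return 0.0
    return 1.0 - q ** ((x + 1) ** beta)


def geometric_pmf(k: int, p: float) -> float:
    """Geometric pmf counting failures before the first success."""
    return p * (1.0 - p) ** k


if __name__ == "__main__":
    q = 0.7
    for k in range(4):
        # With beta = 1 the two columns should agree, with p = 1 - q.
        print(k, dw_pmf(k, q, 1.0), geometric_pmf(k, 1.0 - q))
```

With $\beta = 1$ the mass function reduces to $q^{k} - q^{k+1} = (1-q)\,q^{k}$, which is the geometric distribution with success probability $1 - q$; the loop above compares the two numerically.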
An alternative parametrization, related to the Pareto distribution, has been used to estimate parameters in infectious disease modelling.[3] This parametrization introduces a parameter $\kappa = \frac{\beta}{\alpha^{\beta}}$, meaning that the term $\left(\frac{1}{\alpha}\right)^{\beta}$ can be replaced with $\frac{\kappa}{\beta}$. Therefore, the probability mass function can be expressed as
$$\exp\left[-\frac{\kappa x^{\beta}}{\beta}\right] - \exp\left[-\frac{\kappa\left(x+1\right)^{\beta}}{\beta}\right]$$
and the cumulative distribution function can be expressed as
$$1 - \exp\left[-\frac{\kappa\left(x+1\right)^{\beta}}{\beta}\right]$$
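The substitution can be checked numerically. The sketch below assumes the $(\alpha, \beta)$ form of the probability mass function obtained by differencing the cumulative distribution function $1 - \exp\left[-\left(\frac{x+1}{\alpha}\right)^{\beta}\right]$, and compares it with the $\kappa$ form for $\kappa = \beta/\alpha^{\beta}$. All function names are hypothetical.

```python
import math


def kappa_from_alpha(alpha: float, beta: float) -> float:
    """kappa = beta / alpha**beta, so that (1 / alpha)**beta == kappa / beta."""
    return beta / alpha ** beta


def pmf_alpha(x: int, alpha: float, beta: float) -> float:
    """pmf in the (alpha, beta) parametrization, for x = 0, 1, 2, ..."""
    return math.exp(-((x / alpha) ** beta)) - math.exp(-(((x + 1) / alpha) ** beta))


def pmf_kappa(x: int, kappa: float, beta: float) -> float:
    """pmf in the (kappa, beta) parametrization, for x = 0, 1, 2, ..."""
    return (math.exp(-kappa * x ** beta / beta)
            - math.exp(-kappa * (x + 1) ** beta / beta))


def cdf_kappa(x: int, kappa: float, beta: float) -> float:
    """cdf in the (kappa, beta) parametrization, for x = 0, 1, 2, ..."""
    return 1.0 - math.exp(-kappa * (x + 1) ** beta / beta)


if __name__ == "__main__":
    alpha, beta = 2.5, 1.3  # arbitrary illustrative values
    kappa = kappa_from_alpha(alpha, beta)
    for x in range(5):
        # The two pmf columns should agree up to floating-point rounding.
        print(x, pmf_alpha(x, alpha, beta), pmf_kappa(x, kappa, beta))
```

The two forms describe the same distribution; only the role of the scale parameter changes, which is why the printed columns agree up to rounding.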
The continuous Weibull distribution has a close relationship with the Gumbel distribution, which is easy to see when log-transforming the variable. A similar transformation can be made on the discrete Weibull.
Define $e^{Y} - 1 = X$, where (unconventionally) $Y = \log(X+1) \in \{\log(1), \log(2), \ldots\}$, and define the parameters $\mu = \log(\alpha)$ and $\sigma = \frac{1}{\beta}$. Replacing $x$ in the cumulative distribution function gives
$$\Pr(X \leq x) = \Pr(X \leq e^{y} - 1).$$
We see that we get a location-scale parametrization:
$$= 1 - \exp\left[-\left(\frac{x+1}{\alpha}\right)^{\beta}\right] = 1 - \exp\left[-\left(\frac{e^{y}}{e^{\mu}}\right)^{\frac{1}{\sigma}}\right] = 1 - \exp\left[-\exp\left[\frac{y-\mu}{\sigma}\right]\right]$$
which is convenient in estimation settings. This opens up the possibility of regression with frameworks developed for Weibull regression and extreme value theory.[4]
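The equality of the first and last expressions in this chain can be verified numerically. The following sketch assumes $\mu = \log(\alpha)$, $\sigma = 1/\beta$ and $y = \log(x+1)$ as defined above; the function names are hypothetical.

```python
import math


def cdf_alpha_beta(x: int, alpha: float, beta: float) -> float:
    """P(X <= x) = 1 - exp(-((x + 1) / alpha)**beta)."""
    return 1.0 - math.exp(-(((x + 1) / alpha) ** beta))


def cdf_location_scale(y: float, mu: float, sigma: float) -> float:
    """P(Y <= y) = 1 - exp(-exp((y - mu) / sigma)), with Y = log(X + 1)."""
    return 1.0 - math.exp(-math.exp((y - mu) / sigma))


def max_abs_difference(alpha: float, beta: float, n: int = 20) -> float:
    """Largest gap between the two forms over x = 0, ..., n - 1."""
    mu, sigma = math.log(alpha), 1.0 / beta
    return max(
        abs(cdf_alpha_beta(x, alpha, beta)
            - cdf_location_scale(math.log(x + 1), mu, sigma))
        for x in range(n)
    )


if __name__ == "__main__":
    # Arbitrary illustrative parameters; the gap should be at rounding level.
    print(max_abs_difference(alpha=3.0, beta=1.7))
```

Because $\left(\frac{x+1}{\alpha}\right)^{\beta} = \exp\left[\beta\left(\log(x+1) - \log\alpha\right)\right]$, the difference is limited to floating-point rounding.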
The discrete Weibull distribution can be compared with other common discrete distributions such as the Poisson, geometric, and negative binomial distributions, each of which has unique characteristics and applications.
The Poisson distribution is often used to model the number of rare events occurring during a fixed period of time. It is characterized by a single parameter, λ, which is both the mean and the variance of the distribution, so it cannot represent counts whose variance differs from their mean. The discrete Weibull distribution, on the other hand, is more flexible and can handle both over- and under-dispersion in count data. It has two parameters, q and β, which together determine the shape and scale of the distribution. Unlike the Poisson distribution, which assumes that events occur independently at a constant rate, the discrete Weibull can adapt to different event occurrence patterns.
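To illustrate the dispersion claim, the sketch below computes the mean and variance of a discrete Weibull variable numerically, assuming the Nakagawa-Osaki form with survival function $\Pr(X \geq k) = q^{k^{\beta}}$ on the support $0, 1, 2, \ldots$ The index of dispersion (variance divided by mean) equals 1 for a Poisson distribution. The function name `dw_mean_var` and the truncation rule are illustrative choices, not part of any library.

```python
def dw_mean_var(q: float, beta: float, tol: float = 1e-15,
                max_terms: int = 1_000_000) -> tuple[float, float]:
    """Approximate mean and variance from survival-function sums.

    Uses E[X] = sum_{k>=1} S(k) and E[X^2] = sum_{k>=1} (2k - 1) S(k),
    where S(k) = P(X >= k) = q**(k**beta). The sums are truncated once
    a term drops below `tol`, a crude rule that suits moderate beta.
    """
    mean = 0.0
    second_moment = 0.0
    for k in range(1, max_terms + 1):
        survival = q ** (k ** beta)
        term = (2 * k - 1) * survival
        mean += survival
        second_moment += term
        if term < tol:
            break
    return mean, second_moment - mean ** 2


if __name__ == "__main__":
    for q, beta in [(0.5, 1.0), (0.9, 3.0)]:
        mean, var = dw_mean_var(q, beta)
        print(f"q={q}, beta={beta}: dispersion index = {var / mean:.3f}")
```

For q = 0.5 and β = 1 the index is about 2 (over-dispersion relative to Poisson), while for q = 0.9 and β = 3 it falls below 1 (under-dispersion).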
The geometric distribution models the number of Bernoulli trials until the first success and is characterized by a single parameter, p, the probability of success on an individual trial. The discrete Weibull distribution includes it as the special case β = 1 and, with its second parameter, can model a broader range of data patterns, including those where the probability of success changes over trials.
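The changing probability of success can be made concrete through the discrete hazard rate $h(k) = \Pr(X = k \mid X \geq k)$. Assuming the Nakagawa-Osaki form with $\Pr(X \geq k) = q^{k^{\beta}}$, this works out to $h(k) = 1 - q^{(k+1)^{\beta} - k^{\beta}}$. The sketch below, with the hypothetical name `dw_hazard`, evaluates it for three values of β.

```python
def dw_hazard(k: int, q: float, beta: float) -> float:
    """h(k) = P(X = k | X >= k) = 1 - q**((k + 1)**beta - k**beta)."""
    return 1.0 - q ** ((k + 1) ** beta - k ** beta)


if __name__ == "__main__":
    q = 0.7  # arbitrary illustrative value
    for beta in (0.5, 1.0, 2.0):
        hazards = [round(dw_hazard(k, q, beta), 4) for k in range(5)]
        print(f"beta={beta}: {hazards}")
```

For β = 1 the hazard is constant at 1 − q, as in the geometric distribution; for β > 1 it increases with k and for β < 1 it decreases.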
The negative binomial distribution is used to model the number of Bernoulli trials needed before a particular number of successes is achieved. It is characterized by the probability of success and the number of successes. The discrete Weibull distribution, with its flexibility in modeling different data patterns, can be a better fit for data that does not conform to the specific scenario modeled by the negative binomial distribution.
Overall, the discrete Weibull distribution is preferred over these alternatives when the data are over- or under-dispersed, or when the data patterns do not fit the specific scenarios that the Poisson, geometric, or negative binomial distributions are best suited for. Its adaptability in terms of shape and scale makes it a versatile tool in statistical modeling of discrete data.[5]
The discrete Weibull distribution has been applied in several areas of statistical analysis. One study uses it to model count data on fertility plans, capturing relationships with factors such as education and family background. Unlike the Poisson distribution, it handles both overdispersed and underdispersed data, which makes it useful in social science research and extends its use beyond its traditional role in reliability engineering.[6]
"On Bivariate Discrete Weibull Distribution" extends the distribution to bivariate data. The paper covers maximum likelihood estimation and Bayesian inference for bivariate discrete data and presents practical analyses, including football match scores and nasal drainage severity, that illustrate the distribution's applicability across varied fields.[7]
"The Exponentiated Discrete Weibull Distribution" introduces a generalization termed the exponentiated discrete Weibull (EDW) distribution. The generalization increases the model's flexibility, enabling it to represent increasing, decreasing, bathtub-shaped, and inverted bathtub-shaped hazard rate functions. The EDW distribution can model data that are overdispersed or underdispersed relative to a Poisson distribution, and it is applicable to fields including reliability engineering and failure time studies.[8]