In probability theory and statistics, the Hermite distribution, named after Charles Hermite, is a discrete probability distribution used to model count data with more than one parameter. This distribution is flexible in terms of its ability to allow a moderate over-dispersion in the data.
Kemp and Kemp [1] named it the "Hermite distribution" because its probability function and moment generating function can be expressed in terms of the coefficients of (modified) Hermite polynomials.
The distribution first appeared in the paper Applications of Mathematics to Medical Problems,[2] by Anderson Gray McKendrick in 1926. In this work the author explains several mathematical methods that can be applied to medical research. In one of these methods he considered the bivariate Poisson distribution and showed that the sum of two correlated Poisson variables follows a distribution that would later be known as the Hermite distribution.
As a practical application, McKendrick considered the distribution of counts of bacteria in leucocytes. Using the method of moments he fitted the data with the Hermite distribution and found the model more satisfactory than fitting it with a Poisson distribution.
The distribution was formally introduced and published by C. D. Kemp and Adrienne W. Kemp in 1965 in their work Some Properties of ‘Hermite’ Distribution. The work focuses on the properties of this distribution, for instance a necessary condition on the parameters and their maximum likelihood estimators (MLE), the analysis of the probability generating function (PGF), and how it can be expressed in terms of the coefficients of (modified) Hermite polynomials. One example they used is the distribution of counts of bacteria in leucocytes that McKendrick had analysed, but Kemp and Kemp estimated the model by maximum likelihood.
The Hermite distribution is a special case of the discrete compound Poisson distribution with only two parameters.[3][4]
The same authors published in 1966 the paper An alternative Derivation of the Hermite Distribution.[5] In this work they established that the Hermite distribution can be obtained formally by combining a Poisson distribution with a normal distribution.
In 1971, Y. C. Patel[6] carried out a comparative study of various estimation procedures for the Hermite distribution in his doctoral thesis. It included maximum likelihood, moment estimators, mean and zero frequency estimators, and the method of even points.
In 1974, Gupta and Jain[7] studied a generalized form of the Hermite distribution.
Let X1 and X2 be two independent Poisson variables with parameters a1 and a2. The random variable Y = X1 + 2X2 then follows the Hermite distribution with parameters a1 and a2, and its probability mass function is given by [8]
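p_n = P(Y=n) = e^{-(a_1+a_2)} \sum_{j=0}^{\lfloor n/2 \rfloor} \frac{a_1^{n-2j} a_2^{j}}{(n-2j)!\, j!}
where n = 0, 1, 2, ..., the parameters satisfy a1, a2 ≥ 0, and \lfloor n/2 \rfloor is the integer part of n/2.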
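As an illustration, the finite sum in the probability mass function can be evaluated directly. The following is a minimal sketch; the function name `hermite_pmf` and the parameter values are chosen here for illustration and do not come from the cited references.

```python
import math

def hermite_pmf(n: int, a1: float, a2: float) -> float:
    """P(Y = n) for Y = X1 + 2*X2 with independent Poisson X1 (rate a1), X2 (rate a2)."""
    if n < 0:
        return 0.0
    total = 0.0
    # j counts the X2 events; each contributes 2 to Y, so X1 must equal n - 2j.
    for j in range(n // 2 + 1):
        total += a1 ** (n - 2 * j) * a2 ** j / (
            math.factorial(n - 2 * j) * math.factorial(j)
        )
    return math.exp(-(a1 + a2)) * total

# Illustrative parameters (an assumption of this sketch).
probs = [hermite_pmf(n, 1.0, 0.5) for n in range(60)]
total_mass = sum(probs)  # close to 1 because the tail beyond n = 59 is negligible here
```

The direct sum is adequate for small counts; for large n the factorials grow quickly, so a recurrence or log-space evaluation would be preferable.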
The probability generating function of the probability mass is,[8]
G_Y(s) = \sum_{n=0}^{\infty} p_n s^n = \exp(a_1(s-1) + a_2(s^2-1))
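The closed form of the probability generating function can be checked numerically against the truncated power series. This is only a sketch under assumed parameter values; `hermite_pgf` and `hermite_pmf` are illustrative names.

```python
import math

def hermite_pmf(n, a1, a2):
    """P(Y = n) from the finite-sum formula."""
    s = sum(
        a1 ** (n - 2 * j) * a2 ** j / (math.factorial(n - 2 * j) * math.factorial(j))
        for j in range(n // 2 + 1)
    )
    return math.exp(-(a1 + a2)) * s

def hermite_pgf(s, a1, a2):
    """Closed form G_Y(s) = exp(a1 (s - 1) + a2 (s^2 - 1))."""
    return math.exp(a1 * (s - 1.0) + a2 * (s * s - 1.0))

a1, a2, s = 1.0, 0.5, 0.7  # illustrative values
series = sum(hermite_pmf(n, a1, a2) * s ** n for n in range(60))
closed = hermite_pgf(s, a1, a2)
```

The closed form is the product of the Poisson generating functions of X1 evaluated at s and of X2 evaluated at s^2, which reflects the definition Y = X1 + 2X2.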
When a random variable Y = X1 + 2X2 is distributed by an Hermite distribution, where X1 and X2 are two independent Poisson variables with parameters a1 and a2, we write
Y \sim \operatorname{Herm}(a_1, a_2)
The moment generating function of a random variable X is defined as the expected value of e^{tX}, as a function of the real parameter t. For an Hermite distribution with parameters a1 and a2, the moment generating function exists and is equal to
M(t) = G(e^t) = \exp(a_1(e^t-1) + a_2(e^{2t}-1))
The cumulant generating function is the logarithm of the moment generating function and is equal to [4]
K(t) = \log(M(t)) = a_1(e^t-1) + a_2(e^{2t}-1)
Taking the coefficient of t^r/r! in the expansion of K(t) gives the r-th cumulant
k_r = a_1 + 2^r a_2
Hence the mean and the succeeding three moments about it are
Order | Moment | Cumulant
---|---|---
1 | \mu_1 = k_1 = a_1 + 2a_2 | \mu
2 | \mu_2 = k_2 = a_1 + 4a_2 | \sigma^2
3 | \mu_3 = k_3 = a_1 + 8a_2 | k_3
4 | \mu_4 = k_4 + 3k_2^2 = a_1 + 16a_2 + 3(a_1 + 4a_2)^2 | k_4
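The entries of the table can be verified numerically from a truncated probability table. The sketch below builds the probabilities with the recurrence (n+1) p_{n+1} = a_1 p_n + 2 a_2 p_{n-1}, which is not stated in the text above but follows from differentiating the probability generating function, G'(s) = (a_1 + 2 a_2 s) G(s). The function name and the parameter values are assumptions made for illustration.

```python
import math

def hermite_pmf_table(a1, a2, n_max):
    """Return [p_0, ..., p_n_max] using (n+1) p_{n+1} = a1 p_n + 2 a2 p_{n-1}."""
    p = [math.exp(-(a1 + a2))]
    if n_max >= 1:
        p.append(a1 * p[0])
    for n in range(1, n_max):
        p.append((a1 * p[n] + 2.0 * a2 * p[n - 1]) / (n + 1))
    return p

a1, a2 = 1.2, 0.4  # illustrative values
p = hermite_pmf_table(a1, a2, 80)

# Mean and central moments of orders 2 to 4 from the truncated table.
mean = sum(n * pn for n, pn in enumerate(p))
central = {r: sum((n - mean) ** r * pn for n, pn in enumerate(p)) for r in (2, 3, 4)}

# Values predicted by the table of moments and cumulants.
expected = {
    2: a1 + 4 * a2,
    3: a1 + 8 * a2,
    4: a1 + 16 * a2 + 3 * (a1 + 4 * a2) ** 2,
}
```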
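The skewness is the third moment centered around the mean divided by the 3/2 power of the variance (the cube of the standard deviation), and for the Hermite distribution it is [4]
\gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} = \frac{a_1 + 8a_2}{(a_1 + 4a_2)^{3/2}}
Since the parameters are non-negative, \gamma_1 > 0 always, so the distribution is skewed to the right.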
The kurtosis is the fourth moment centered around the mean, divided by the square of the variance, and for the Hermite distribution is,[4]
\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{a_1 + 16a_2 + 3(a_1 + 4a_2)^2}{(a_1 + 4a_2)^2} = \frac{a_1 + 16a_2}{(a_1 + 4a_2)^2} + 3
The excess kurtosis is just a correction to make the kurtosis of the normal distribution equal to zero, and it is the following,
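\gamma_2 = \frac{\mu_4}{\mu_2^2} - 3 = \frac{a_1 + 16a_2}{(a_1 + 4a_2)^2}
Hence \beta_2 > 3, or equivalently the excess kurtosis \gamma_2 > 0, for all admissible parameter values.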
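The shape measures depend only on a1 and a2, so they are simple to compute. A minimal sketch, with an illustrative function name and parameter values:

```python
def hermite_shape(a1: float, a2: float):
    """Return (skewness, kurtosis, excess kurtosis) of Herm(a1, a2)."""
    variance = a1 + 4.0 * a2
    skewness = (a1 + 8.0 * a2) / variance ** 1.5
    excess = (a1 + 16.0 * a2) / variance ** 2
    return skewness, excess + 3.0, excess

# Illustrative values: variance 3, third cumulant 5, fourth cumulant 9.
skew, kurt, excess = hermite_shape(1.0, 0.5)
```

With a2 = 0 the expressions reduce to the Poisson values 1/sqrt(a1) and 1/a1.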
For a discrete distribution on the non-negative integers, the characteristic function of a random variable X is defined as the expected value of e^{itX}, where i is the imaginary unit and t is real:
\phi(t) = E[e^{itX}] = \sum_{j=0}^{\infty} e^{ijt} P[X=j]
This function is related to the moment-generating function via
\phi_X(t) = M_X(it)
Hence, for the Hermite distribution,
\phi_X(t) = \exp(a_1(e^{it}-1) + a_2(e^{2it}-1))
The cumulative distribution function is,[1]
\begin{align} F(x; a_1, a_2) &= P(X \leq x) \\ &= \exp(-(a_1+a_2)) \sum_{i=0}^{\lfloor x \rfloor} \sum_{j=0}^{\lfloor i/2 \rfloor} \frac{a_1^{i-2j} a_2^{j}}{(i-2j)!\, j!} \end{align}
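The double sum in the cumulative distribution function translates directly into two nested loops. This is a sketch with an illustrative function name; it accepts real x and uses the integer part, as in the formula.

```python
import math

def hermite_cdf(x: float, a1: float, a2: float) -> float:
    """F(x; a1, a2) = P(X <= x), evaluated with the double finite sum."""
    k = math.floor(x)
    if k < 0:
        return 0.0
    total = 0.0
    for i in range(k + 1):           # outer sum over the support points 0..floor(x)
        for j in range(i // 2 + 1):  # inner sum from the probability mass function
            total += a1 ** (i - 2 * j) * a2 ** j / (
                math.factorial(i - 2 * j) * math.factorial(j)
            )
    return math.exp(-(a1 + a2)) * total

# Illustrative evaluation: P(X <= 1) for a1 = 1.0, a2 = 0.5.
f1 = hermite_cdf(1.7, 1.0, 0.5)
```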
\hat{a}_1 = 0.0135
\hat{a}_2 = 0.0932
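If X_1 \sim \operatorname{Herm}(a_1, a_2) and X_2 \sim \operatorname{Herm}(b_1, b_2) are independent, then their sum Y = X_1 + X_2 is again Hermite distributed:
Y \sim \operatorname{Herm}(a_1 + b_1, a_2 + b_2)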
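That the parameters add when two independent Hermite variables are summed can be checked by convolving two probability tables and comparing the result with the table for the summed parameters. The sketch below uses illustrative function names and parameter values.

```python
import math

def hermite_pmf(n, a1, a2):
    """P(Y = n) from the finite-sum formula."""
    s = sum(
        a1 ** (n - 2 * j) * a2 ** j / (math.factorial(n - 2 * j) * math.factorial(j))
        for j in range(n // 2 + 1)
    )
    return math.exp(-(a1 + a2)) * s

def convolve_prefix(p, q):
    """First len(p) terms of the convolution of two equally long pmf prefixes."""
    return [sum(p[k] * q[n - k] for k in range(n + 1)) for n in range(len(p))]

N = 25
px = [hermite_pmf(n, 0.8, 0.3) for n in range(N)]
py = [hermite_pmf(n, 1.1, 0.6) for n in range(N)]
direct = [hermite_pmf(n, 0.8 + 1.1, 0.3 + 0.6) for n in range(N)]
max_gap = max(abs(c - d) for c, d in zip(convolve_prefix(px, py), direct))
```

The first N convolution terms only involve indices below N, so the comparison is exact up to rounding and does not depend on the truncation point.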
The index of dispersion is
d = \frac{\operatorname{Var}(Y)}{\operatorname{E}(Y)} = \frac{a_1 + 4a_2}{a_1 + 2a_2} = 1 + \frac{2a_2}{a_1 + 2a_2}
The mean and the variance of the Hermite distribution are
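\mu = a_1 + 2a_2
\sigma^2 = a_1 + 4a_2
Equating these to the sample mean \bar{x} and the sample variance (also written \sigma^2 here) gives the system
\begin{cases} \bar{x} = a_1 + 2a_2 \\ \sigma^2 = a_1 + 4a_2 \end{cases}
Solving these two equations we get the moment estimators \hat{a_1} and \hat{a_2} of a1 and a2:
\hat{a_1} = 2\bar{x} - \sigma^2
\hat{a_2} = \frac{\sigma^2 - \bar{x}}{2}
Since a1 and a2 are both positive, the estimators \hat{a_1} and \hat{a_2} are admissible only if
\bar{x} < \sigma^2 < 2\bar{x}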
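The moment estimators and the admissibility check can be sketched as follows. The data set is made up for illustration, and the use of the variance with divisor m (rather than m - 1) is an assumption of this sketch.

```python
def hermite_moment_estimates(sample):
    """Method-of-moments estimates (a1_hat, a2_hat) and an admissibility flag."""
    m = len(sample)
    mean = sum(sample) / m
    var = sum((x - mean) ** 2 for x in sample) / m  # divisor m: an assumption
    a1_hat = 2.0 * mean - var
    a2_hat = (var - mean) / 2.0
    admissible = mean < var < 2.0 * mean
    return a1_hat, a2_hat, admissible

counts = [0, 0, 1, 2, 0, 3, 2, 0, 1, 4, 0, 2]  # hypothetical count data
a1_hat, a2_hat, ok = hermite_moment_estimates(counts)
```

For these hypothetical counts the sample mean is 1.25 and the variance is 1.6875, so both estimates are positive and the admissibility condition holds.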
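Given a sample X1, ..., Xm of independent random variables, each having an Hermite distribution, we wish to estimate the parameters a1 and a2. The mean and the variance of the distribution are
\mu = a_1 + 2a_2
\sigma^2 = a_1 + 4a_2
Writing d = \sigma^2/\mu for the index of dispersion, these two equations give
\begin{cases} a_1 = \mu(2-d) \\[4pt] a_2 = \dfrac{\mu(d-1)}{2} \end{cases}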
We can parameterize the probability function by μ and d
P(X=x) = \exp\left(-\left(\mu(2-d) + \frac{\mu(d-1)}{2}\right)\right) \sum_{j=0}^{\lfloor x/2 \rfloor} \frac{(\mu(2-d))^{x-2j} \left(\frac{\mu(d-1)}{2}\right)^{j}}{(x-2j)!\, j!}
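The change of parameters between (a1, a2) and (μ, d) is a simple bijection, and the reparametrized probability function can be evaluated by converting back. A minimal sketch with illustrative function names:

```python
import math

def to_natural(mu, d):
    """Map (mean, dispersion index) to (a1, a2); assumes mu > 0 and 1 <= d <= 2."""
    return mu * (2.0 - d), mu * (d - 1.0) / 2.0

def to_mean_dispersion(a1, a2):
    """Map (a1, a2) to (mu, d) with mu = a1 + 2 a2 and d = variance / mean."""
    mu = a1 + 2.0 * a2
    return mu, (a1 + 4.0 * a2) / mu

def hermite_pmf_mu_d(x, mu, d):
    """P(X = x) in the (mu, d) parametrization."""
    a1, a2 = to_natural(mu, d)
    s = sum(
        a1 ** (x - 2 * j) * a2 ** j / (math.factorial(x - 2 * j) * math.factorial(j))
        for j in range(x // 2 + 1)
    )
    return math.exp(-(a1 + a2)) * s

mu, d = to_mean_dispersion(1.0, 0.5)  # illustrative starting values
back = to_natural(mu, d)
```

Setting d = 1 gives a2 = 0, and the probability function reduces to that of a Poisson distribution with mean μ.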
Hence the log-likelihood function is,[9]
\begin{align} \ell(x_1,\ldots,x_m;\mu,d) &= \log(\mathcal{L}(x_1,\ldots,x_m;\mu,d)) \\ &= m\mu\left(-1 + \frac{d-1}{2}\right) + \log(\mu(2-d)) \sum_{i=1}^{m} x_i + \sum_{i=1}^{m} \log(q_i(\theta)) \end{align}
where
q_i(\theta) = \sum_{j=0}^{\lfloor x_i/2 \rfloor} \frac{\theta^{j}}{(x_i-2j)!\, j!}
\theta = \frac{d-1}{2\mu(2-d)^2}
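The log-likelihood in the (μ, d) parametrization can be coded term by term and compared with the sum of log-probabilities computed from a1 = μ(2 - d) and a2 = μ(d - 1)/2. The sketch below uses illustrative function names, a made-up data set, and assumes μ > 0 and 1 ≤ d < 2.

```python
import math

def q(x, theta):
    """q_i(theta) = sum_j theta^j / ((x - 2j)! j!)."""
    return sum(
        theta ** j / (math.factorial(x - 2 * j) * math.factorial(j))
        for j in range(x // 2 + 1)
    )

def hermite_loglik(sample, mu, d):
    """Log-likelihood in the (mu, d) parametrization; assumes mu > 0, 1 <= d < 2."""
    m = len(sample)
    theta = (d - 1.0) / (2.0 * mu * (2.0 - d) ** 2)
    return (
        m * mu * (-1.0 + (d - 1.0) / 2.0)
        + math.log(mu * (2.0 - d)) * sum(sample)
        + sum(math.log(q(x, theta)) for x in sample)
    )

def direct_loglik(sample, a1, a2):
    """Reference value: sum of log P(X = x) in the (a1, a2) parametrization."""
    def pmf(n):
        s = sum(
            a1 ** (n - 2 * j) * a2 ** j / (math.factorial(n - 2 * j) * math.factorial(j))
            for j in range(n // 2 + 1)
        )
        return math.exp(-(a1 + a2)) * s
    return sum(math.log(pmf(x)) for x in sample)

counts = [0, 0, 1, 2, 0, 3, 2, 0, 1, 4, 0, 2]  # hypothetical count data
mu, d = 1.25, 1.4
gap = abs(hermite_loglik(counts, mu, d)
          - direct_loglik(counts, mu * (2 - d), mu * (d - 1) / 2))
```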
From the log-likelihood function, the likelihood equations are,[9]
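\frac{\partial \ell}{\partial \mu} = m\left(-1 + \frac{d-1}{2}\right) + \frac{1}{\mu} \sum_{i=1}^{m} x_i - \frac{d-1}{2\mu^2(2-d)^2} \sum_{i=1}^{m} \frac{q_i'(\theta)}{q_i(\theta)}
\frac{\partial \ell}{\partial d} = m\frac{\mu}{2} - \frac{\sum_{i=1}^{m} x_i}{2-d} + \frac{d}{2\mu(2-d)^3} \sum_{i=1}^{m} \frac{q_i'(\theta)}{q_i(\theta)}
where q_i'(\theta) denotes the derivative of q_i with respect to \theta. The coefficients of the last sums are \partial\theta/\partial\mu = -(d-1)/(2\mu^2(2-d)^2) and \partial\theta/\partial d = d/(2\mu(2-d)^3).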
Straightforward calculations show that,[9]
\hat{\mu} = \bar{x}
and that d can be found by solving
\sum_{i=1}^{m} \frac{q_i'(\tilde{\theta})}{q_i(\tilde{\theta})} = m(\bar{x}(2-d))^2
where
\tilde{\theta} = \frac{d-1}{2\bar{x}(2-d)^2}
The likelihood equations do not always have a solution, as the following proposition shows.
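Proposition:[9] Let X1, ..., Xm come from a generalized Hermite distribution with fixed n. Then the MLEs of the parameters are \hat{\mu} = \bar{x} and \tilde{d} if and only if m^{(2)}/\bar{x}^2 > 1, where
m^{(2)} = \sum_{i=1}^{m} x_i(x_i-1)/m
is the empirical factorial moment of order 2.
The condition m^{(2)}/\bar{x}^2 > 1 is equivalent to \tilde{d} > 1, where \tilde{d} = \sigma^2/\bar{x} is the empirical dispersion index.
If the condition is not satisfied, the MLEs of the parameters are \hat{\mu} = \bar{x} and \tilde{d} = 1, that is, the data are fitted with the Poisson distribution.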
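One way to solve the likelihood equation for d numerically is bisection on the interval (1, 2), after first checking the condition m^{(2)}/\bar{x}^2 > 1. The sketch below is one possible approach under stated assumptions, not the procedure of the cited reference: the function names, the made-up data, the upper bracket 2 - 10^{-6}, and the handling of the case without a sign change are all choices made here. It is intended for small counts only, since powers of θ can overflow for large observations.

```python
import math

def q(x, theta):
    """q_i(theta) = sum_j theta^j / ((x - 2j)! j!)."""
    return sum(theta ** j / (math.factorial(x - 2 * j) * math.factorial(j))
               for j in range(x // 2 + 1))

def q_prime(x, theta):
    """Derivative of q_i with respect to theta."""
    return sum(j * theta ** (j - 1) / (math.factorial(x - 2 * j) * math.factorial(j))
               for j in range(1, x // 2 + 1))

def likelihood_residual(d, sample):
    """Left side minus right side of the likelihood equation, with mu = sample mean."""
    m = len(sample)
    xbar = sum(sample) / m
    theta = (d - 1.0) / (2.0 * xbar * (2.0 - d) ** 2)
    lhs = sum(q_prime(x, theta) / q(x, theta) for x in sample)
    return lhs - m * (xbar * (2.0 - d)) ** 2

def hermite_mle(sample, tol=1e-12):
    """Return (mu_hat, d_hat); falls back to the Poisson fit d = 1 when m2 <= xbar^2."""
    m = len(sample)
    xbar = sum(sample) / m
    m2 = sum(x * (x - 1) for x in sample) / m  # empirical factorial moment of order 2
    if m2 <= xbar ** 2:
        return xbar, 1.0
    lo, hi = 1.0, 2.0 - 1e-6
    if likelihood_residual(hi, sample) >= 0.0:
        return xbar, hi  # no sign change in the bracket: boundary case of this sketch
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if likelihood_residual(mid, sample) > 0.0:
            lo = mid
        else:
            hi = mid
    return xbar, 0.5 * (lo + hi)

counts = [0, 0, 1, 2, 0, 3, 2, 0, 1, 4, 0, 2]  # hypothetical count data
mu_hat, d_hat = hermite_mle(counts)
```

At d = 1 the residual equals m(m^{(2)} - \bar{x}^2), so the condition of the proposition is exactly what guarantees a positive value at the left end of the bracket.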
A usual choice for discrete distributions is the zero relative frequency of the data set which is equated to the probability of zero under the assumed distribution. Observing that
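f_0 = \exp(-(a_1 + a_2))
and
\mu = a_1 + 2a_2
we have the system of equations
\begin{cases} \bar{x} = a_1 + 2a_2 \\ f_0 = \exp(-(a_1 + a_2)) \end{cases}
Solving it, we obtain the zero frequency and mean estimators \hat{a_1} of a1 and \hat{a_2} of a2:
\hat{a_1} = -(\bar{x} + 2\log(f_0))
\hat{a_2} = \bar{x} + \log(f_0)
where
f_0 = \frac{n_0}{n}
is the zero relative frequency, that is, the number n_0 of zeros divided by the sample size n.
It can be seen that for distributions with a high probability at 0, the efficiency is high.
For \hat{a_1} and \hat{a_2} to be admissible, we must have
-\log\left(\frac{n_0}{n}\right) < \bar{x} < -2\log\left(\frac{n_0}{n}\right)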
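The zero frequency and mean estimators only need the sample mean and the proportion of zeros. A minimal sketch, with an illustrative function name and made-up data:

```python
import math

def hermite_zero_frequency_estimates(sample):
    """Zero-frequency-and-mean estimates (a1_hat, a2_hat) and an admissibility flag."""
    n = len(sample)
    n0 = sum(1 for x in sample if x == 0)
    if n0 == 0:
        raise ValueError("the estimator needs at least one zero in the sample")
    xbar = sum(sample) / n
    log_f0 = math.log(n0 / n)
    a1_hat = -(xbar + 2.0 * log_f0)
    a2_hat = xbar + log_f0
    admissible = -log_f0 < xbar < -2.0 * log_f0
    return a1_hat, a2_hat, admissible

counts = [0, 0, 1, 2, 0, 3, 2, 0, 1, 4, 0, 2]  # hypothetical count data
a1_hat, a2_hat, ok = hermite_zero_frequency_estimates(counts)
```

By construction the fitted distribution reproduces both the sample mean and the observed proportion of zeros.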
When the Hermite distribution is used to model a data sample, it is important to check whether the Poisson distribution is sufficient to fit the data. Using the (\mu, d) parametrization of the probability mass function introduced for maximum likelihood estimation, this amounts to testing the hypotheses
\begin{cases} H_0: d = 1 \\ H_1: d > 1 \end{cases}
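The likelihood-ratio test statistic [9] for the Hermite distribution is
W = 2(\ell(X; \hat{\mu}, \hat{d}) - \ell(X; \hat{\mu}, 1))
where \ell(\cdot) is the log-likelihood function. Because d = 1 lies on the boundary of the parameter domain, under the null hypothesis W does not have the usual asymptotic \chi_1^2 distribution. Instead, the asymptotic distribution of W is a 50:50 mixture of the constant 0 and the \chi_1^2 distribution, so the \alpha upper-tail percentage points of this mixture coincide with the 2\alpha upper-tail percentage points of a \chi_1^2.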
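A p-value for the likelihood-ratio statistic can be sketched under the assumption that, because d = 1 is a boundary value, W is asymptotically a 50:50 mixture of the constant 0 and a \chi_1^2 variable. The function names are illustrative, and the two log-likelihood values are taken as inputs rather than computed here.

```python
import math

def lr_statistic(loglik_hermite: float, loglik_poisson: float) -> float:
    """W = 2 (l(mu_hat, d_hat) - l(mu_hat, 1)), clipped at 0 against rounding error."""
    return max(0.0, 2.0 * (loglik_hermite - loglik_poisson))

def mixture_p_value(w: float) -> float:
    """Upper-tail probability of the assumed 50:50 mixture of 0 and chi-square(1)."""
    if w <= 0.0:
        return 1.0
    # For chi-square with 1 degree of freedom, P(chi2 > w) = erfc(sqrt(w / 2)).
    return 0.5 * math.erfc(math.sqrt(w / 2.0))

p = mixture_p_value(2.70554)  # 2.70554 is the upper 10% point of chi-square(1)
```

Under this assumption the mixture halves the usual chi-square tail probability, so a statistic at the upper 10% point of \chi_1^2 corresponds to a p-value of about 0.05.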
The score statistic is,[9]
S^2 = 2m\left[\frac{m^{(2)} - \bar{x}^2}{2\bar{x}}\right]^2 = \frac{m(\tilde{d}-1)^2}{2}
where m is the number of observations.
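The asymptotic distribution of the score test statistic under the null hypothesis is a \chi_1^2 distribution. It may be convenient to use a signed version of the score test, namely
\operatorname{sgn}(m^{(2)} - \bar{x}^2)\sqrt{S^2}
which asymptotically follows a standard normal distribution.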
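The score statistic needs only the sample mean and the empirical factorial moment of order 2, so no model fitting is required. A minimal sketch, with an illustrative function name and made-up data:

```python
import math

def hermite_score_test(sample):
    """Return (S^2, signed statistic) for testing H0: d = 1 against H1: d > 1."""
    m = len(sample)
    xbar = sum(sample) / m
    m2 = sum(x * (x - 1) for x in sample) / m  # empirical factorial moment of order 2
    s2 = 2.0 * m * ((m2 - xbar ** 2) / (2.0 * xbar)) ** 2
    # The sign is that of m2 - xbar^2: positive when the data are over-dispersed.
    signed = math.copysign(math.sqrt(s2), m2 - xbar ** 2)
    return s2, signed

counts = [0, 0, 1, 2, 0, 3, 2, 0, 1, 4, 0, 2]  # hypothetical count data
s2, signed = hermite_score_test(counts)
```

For these hypothetical counts the empirical dispersion index is 1.35 (variance with divisor m), and both forms of the formula give S^2 = 12 × 0.35^2 / 2 = 0.735.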