Conway–Maxwell–Poisson
Type: mass
Parameters: λ > 0, ν ≥ 0
Support: x ∈ {0, 1, 2, ...}
PMF: \frac{\lambda^x}{(x!)^\nu}\,\frac{1}{Z(\lambda,\nu)}
CDF: \sum_{i=0}^{x}\Pr(X=i)
Mean: \sum_{j=0}^{\infty}\frac{j\lambda^j}{(j!)^\nu\,Z(\lambda,\nu)}
Median: no closed form
Mode: see text
Variance: \sum_{j=0}^{\infty}\frac{j^2\lambda^j}{(j!)^\nu\,Z(\lambda,\nu)}-\operatorname{mean}^2
Skewness: not listed
Kurtosis: not listed
Entropy: not listed
MGF: \frac{Z(\lambda e^{t},\nu)}{Z(\lambda,\nu)}
CF: \frac{Z(\lambda e^{it},\nu)}{Z(\lambda,\nu)}
PGF: \frac{Z(\lambda t,\nu)}{Z(\lambda,\nu)}
In probability theory and statistics, the Conway–Maxwell–Poisson (CMP or COM–Poisson) distribution is a discrete probability distribution named after Richard W. Conway, William L. Maxwell, and Siméon Denis Poisson that generalizes the Poisson distribution by adding a parameter to model overdispersion and underdispersion. It is a member of the exponential family,[1] has the Poisson distribution and geometric distribution as special cases and the Bernoulli distribution as a limiting case.
The CMP distribution was originally proposed by Conway and Maxwell in 1962 as a solution for handling queueing systems with state-dependent service rates. It was introduced into the statistics literature by Boatwright et al. (2003)[2] and Shmueli et al. (2005).[3] The first detailed investigation into the probabilistic and statistical properties of the distribution was published by Shmueli et al. (2005).[3] Some theoretical probability results for the COM-Poisson distribution, especially its characterizations, are studied and reviewed by Li et al. (2019).[4]
The CMP distribution is defined to be the distribution with probability mass function
P(X=x) = f(x;\lambda,\nu) = \frac{\lambda^x}{(x!)^\nu}\,\frac{1}{Z(\lambda,\nu)},

where

Z(\lambda,\nu) = \sum_{j=0}^{\infty}\frac{\lambda^j}{(j!)^\nu}.

The function Z(\lambda,\nu) serves as a normalization constant, ensuring that the probabilities sum to one; in general it does not have a closed form.
The domain of admissible parameters is \lambda,\nu>0, together with 0<\lambda<1 when \nu=0 (so that the series defining Z(\lambda,\nu) converges).
The additional parameter \nu, which does not appear in the Poisson distribution, allows adjustment of the rate of decay of successive ratios of probabilities. These ratios satisfy

\frac{P(X=x-1)}{P(X=x)} = \frac{x^\nu}{\lambda}.
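Since Z(λ, ν) is a rapidly converging series, the probability mass function can be evaluated numerically by truncating the sum. A minimal sketch (the truncation length and parameter values are arbitrary illustrative choices):

```python
import math

def Z(lam, nu, terms=100):
    # Truncated normalizing constant Z(lambda, nu) = sum_j lambda^j / (j!)^nu
    return sum(lam**j / math.factorial(j)**nu for j in range(terms))

def cmp_pmf(x, lam, nu, terms=100):
    # P(X = x) = lambda^x / ((x!)^nu * Z(lambda, nu))
    return lam**x / (math.factorial(x)**nu * Z(lam, nu, terms))

lam, nu = 2.5, 1.3
probs = [cmp_pmf(x, lam, nu) for x in range(60)]
print(sum(probs))  # ~ 1.0: the pmf is normalized
# successive-probability ratio P(X=x-1)/P(X=x) = x^nu / lambda, here with x = 5
print(cmp_pmf(4, lam, nu) / cmp_pmf(5, lam, nu), 5**nu / lam)
```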
When \nu=1, the CMP distribution reduces to the standard Poisson distribution, and as \nu\to\infty it converges to a Bernoulli distribution with parameter \lambda/(1+\lambda). When \nu=0 and \lambda<1, it reduces to a geometric distribution with success probability 1-\lambda.
For the CMP distribution, moments can be found through the recursive formula [3]
\operatorname{E}[X^{r+1}] = \begin{cases} \lambda\,\operatorname{E}[(X+1)^{1-\nu}] & \text{if } r=0,\\ \lambda\,\dfrac{d}{d\lambda}\operatorname{E}[X^r] + \operatorname{E}[X]\operatorname{E}[X^r] & \text{if } r>0. \end{cases}
For general \nu, there does not exist a closed-form formula for the cumulative distribution function of X\sim\operatorname{CMP}(\lambda,\nu). If \nu\geq 1 is an integer, however, the CDF can be expressed in terms of generalized hypergeometric functions:

F(n) = P(X\leq n) = 1 - \frac{\lambda^{n+1}\,{}_1F_{\nu}(1;n+2,\ldots,n+2;\lambda)}{\{(n+1)!\}^{\nu}\,Z(\lambda,\nu)},

where the hypergeometric function has \nu lower parameters equal to n+2.
Many important summary statistics, such as moments and cumulants, of the CMP distribution can be expressed in terms of the normalizing constant Z(\lambda,\nu). Indeed, the probability generating function is \operatorname{E}[s^X] = Z(s\lambda,\nu)/Z(\lambda,\nu), and the mean and variance are given by

\operatorname{E}X = \lambda\,\frac{d}{d\lambda}\ln(Z(\lambda,\nu)),

\operatorname{var}(X) = \lambda\,\frac{d}{d\lambda}\operatorname{E}X.
The cumulant generating function is

g(t) = \ln(\operatorname{E}[e^{tX}]) = \ln(Z(\lambda e^{t},\nu)) - \ln(Z(\lambda,\nu)),

and the cumulants are given by

\kappa_n = g^{(n)}(0) = \frac{\partial^n}{\partial t^n}\ln(Z(\lambda e^{t},\nu))\Big|_{t=0}, \quad n\geq 1.
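These identities can be checked numerically by differentiating \ln Z with a finite difference (a sketch; the step size and parameter values are arbitrary):

```python
import math

def Z(lam, nu, terms=100):
    # Truncated normalizing constant Z(lambda, nu)
    return sum(lam**j / math.factorial(j)**nu for j in range(terms))

def mean_direct(lam, nu, terms=100):
    # E X computed directly from the pmf
    z = Z(lam, nu, terms)
    return sum(j * lam**j / math.factorial(j)**nu for j in range(terms)) / z

def mean_via_Z(lam, nu, h=1e-5):
    # E X = lambda * d/dlambda ln Z(lambda, nu), via a central difference
    return lam * (math.log(Z(lam + h, nu)) - math.log(Z(lam - h, nu))) / (2 * h)

print(mean_direct(3.0, 1.5), mean_via_Z(3.0, 1.5))  # the two values agree
```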
Whilst the normalizing constant Z(\lambda,\nu)=\sum_{i=0}^{\infty}\frac{\lambda^i}{(i!)^\nu} does not have a closed form in general, there are some noteworthy special cases:

- Z(\lambda,1) = e^{\lambda};
- Z(\lambda,0) = (1-\lambda)^{-1}, for 0<\lambda<1;
- \lim_{\nu\to\infty} Z(\lambda,\nu) = 1+\lambda;
- Z(\lambda,2) = I_0(2\sqrt{\lambda}), where I_0(x) = \sum_{k=0}^{\infty}\frac{1}{(k!)^2}\left(\frac{x}{2}\right)^{2k} is a modified Bessel function of the first kind;
- for integer \nu, Z(\lambda,\nu) = {}_0F_{\nu-1}(;1,\ldots,1;\lambda), a generalized hypergeometric function.
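These special cases are easy to verify numerically; the sketch below evaluates Z in log space (via `lgamma`) so that large powers of factorials do not overflow, and uses the series definition of I_0 to avoid external dependencies:

```python
import math

def Z(lam, nu, terms=100):
    # Z(lambda, nu) = sum_j lambda^j / (j!)^nu, summed in log space
    return sum(math.exp(j * math.log(lam) - nu * math.lgamma(j + 1))
               for j in range(terms))

def I0(x, terms=60):
    # Modified Bessel function of the first kind, order 0 (series definition)
    return sum((x / 2)**(2 * k) / math.factorial(k)**2 for k in range(terms))

print(Z(1.7, 1), math.exp(1.7))           # Z(lambda, 1) = e^lambda
print(Z(0.4, 0), 1 / (1 - 0.4))           # Z(lambda, 0) = 1/(1 - lambda), lambda < 1
print(Z(2.0, 2), I0(2 * math.sqrt(2.0)))  # Z(lambda, 2) = I_0(2 sqrt(lambda))
print(Z(0.3, 50.0), 1 + 0.3)              # Z -> 1 + lambda as nu grows
```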
Because the normalizing constant does not in general have a closed form, the following asymptotic expansion is of interest. Fix \nu>0. Then, as \lambda\to\infty,

Z(\lambda,\nu) = \frac{\exp\left\{\nu\lambda^{1/\nu}\right\}}{\lambda^{(\nu-1)/(2\nu)}\,(2\pi)^{(\nu-1)/2}\,\sqrt{\nu}}\left(1 + \sum_{k=1}^{\infty}\frac{c_k}{(\nu\lambda^{1/\nu})^{k}}\right),

where the c_j are uniquely determined by the expansion

\left(\Gamma(t+1)\right)^{-\nu} = \frac{\nu^{\nu t+\nu/2}}{\left(2\pi\right)^{(\nu-1)/2}}\sum_{j=0}^{\infty}\frac{c_j}{\Gamma\!\left(\nu t+\frac{1+\nu}{2}+j\right)}.

In particular, c_0 = 1, c_1 = \frac{\nu^2-1}{24} and c_2 = \frac{\nu^2-1}{1152}\left(\nu^2+23\right).
For general values of \nu, closed-form formulas are, however, available for moments of a modified type. Write (j)_r = j(j-1)\cdots(j-r+1) for the falling factorial. If X\sim\operatorname{CMP}(\lambda,\nu), with \lambda,\nu>0, then

\operatorname{E}[((X)_r)^{\nu}] = \lambda^r,

for r\in\mathbb{N}.
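This identity follows from a shift of the summation index and can be confirmed by direct summation (a sketch; parameter values are arbitrary):

```python
import math

def falling(j, r):
    # Falling factorial (j)_r = j (j-1) ... (j-r+1); equals 0 for j < r
    out = 1
    for i in range(r):
        out *= j - i
    return out

def E_falling_pow(lam, nu, r, terms=100):
    # E[((X)_r)^nu] under CMP(lam, nu), by direct (truncated) summation
    w = [lam**j / math.factorial(j)**nu for j in range(terms)]
    z = sum(w)
    return sum(falling(j, r)**nu * wj for j, wj in enumerate(w)) / z

lam, nu = 2.0, 1.7
for r in (1, 2, 3):
    print(E_falling_pow(lam, nu, r), lam**r)  # identity: E[((X)_r)^nu] = lambda^r
```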
Since in general closed-form formulas are not available for moments and cumulants of the CMP distribution, the following asymptotic formulas are of interest. Let X\sim\operatorname{CMP}(\lambda,\nu), where \nu>0, and let \gamma_1 = \kappa_3/\sigma^3 denote the skewness and \gamma_2 = \kappa_4/\sigma^4 the excess kurtosis, where \sigma^2 = \operatorname{Var}(X). Then, as \lambda\to\infty,

\operatorname{E}X = \lambda^{1/\nu}\left(1 - \frac{\nu-1}{2\nu}\lambda^{-1/\nu} - \frac{\nu^2-1}{24\nu^2}\lambda^{-2/\nu} - \frac{\nu^2-1}{24\nu^3}\lambda^{-3/\nu} + \mathcal{O}(\lambda^{-4/\nu})\right),

\operatorname{Var}(X) = \frac{\lambda^{1/\nu}}{\nu}\left(1 + \frac{\nu^2-1}{24\nu^2}\lambda^{-2/\nu} + \frac{\nu^2-1}{12\nu^3}\lambda^{-3/\nu} + \mathcal{O}(\lambda^{-4/\nu})\right),

\kappa_n = \frac{\lambda^{1/\nu}}{\nu^{n-1}}\left(1 + \frac{(-1)^n(\nu^2-1)}{24\nu^2}\lambda^{-2/\nu} + \frac{(-2)^n(\nu^2-1)}{48\nu^3}\lambda^{-3/\nu} + \mathcal{O}(\lambda^{-4/\nu})\right),

\gamma_1 = \frac{\lambda^{-1/(2\nu)}}{\sqrt{\nu}}\left(1 - \frac{5(\nu^2-1)}{48\nu^2}\lambda^{-2/\nu} + \mathcal{O}(\lambda^{-3/\nu})\right),

\gamma_2 = \frac{\lambda^{-1/\nu}}{\nu}\left(1 - \frac{\nu^2-1}{24\nu^2}\lambda^{-2/\nu} + \frac{\nu^2-1}{6\nu^3}\lambda^{-3/\nu} + \mathcal{O}(\lambda^{-4/\nu})\right),

\operatorname{E}[X^n] = \lambda^{n/\nu}\left(1 + \frac{n(n-\nu)}{2\nu}\lambda^{-1/\nu} + a_2\lambda^{-2/\nu} + \mathcal{O}(\lambda^{-3/\nu})\right),

where

a_2 = \frac{n(n-1)(\nu-1)^2}{8\nu^2} - \frac{n(\nu^2-1)}{24\nu^2} - \frac{n(n-1)(n-2)(\nu-1)}{4\nu^2} + \frac{1}{\nu^2}\left\{\binom{n}{3}+3\binom{n}{4}\right\}.

The asymptotic series for \kappa_n holds for all n\geq 2; the expansion of \kappa_1 = \operatorname{E}X, given above, contains an additional term of order \lambda^{-1/\nu}.
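The leading terms of the expansion for E X can be sanity-checked against direct summation for a moderately large λ (a sketch; the parameter values are arbitrary, and the pmf weights are computed in log space to avoid overflow):

```python
import math

def cmp_mean(lam, nu, terms=500):
    # E X by direct summation of the pmf, computed in log space via lgamma
    logw = [j * math.log(lam) - nu * math.lgamma(j + 1) for j in range(terms)]
    m = max(logw)
    w = [math.exp(lw - m) for lw in logw]
    z = sum(w)
    return sum(j * wj for j, wj in enumerate(w)) / z

def cmp_mean_asymptotic(lam, nu):
    # First terms of the asymptotic expansion of E X as lambda -> infinity
    u = lam**(-1.0 / nu)
    return lam**(1.0 / nu) * (1 - (nu - 1) / (2 * nu) * u
                              - (nu**2 - 1) / (24 * nu**2) * u**2
                              - (nu**2 - 1) / (24 * nu**3) * u**3)

lam, nu = 200.0, 2.0
print(cmp_mean(lam, nu), cmp_mean_asymptotic(lam, nu))  # close agreement
```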
When \nu is an integer, explicit formulas for moments can be obtained. The case \nu=1 is the Poisson distribution. Suppose now that \nu=2. For m\in\mathbb{N},

\operatorname{E}[(X)_m] = \frac{\lambda^{m/2}\,I_m(2\sqrt{\lambda})}{I_0(2\sqrt{\lambda})},

where I_r(x) is a modified Bessel function of the first kind.

Using the connecting formula for moments and factorial moments gives

\operatorname{E}[X^m] = \sum_{k=1}^{m}\left\{{m\atop k}\right\}\frac{\lambda^{k/2}\,I_k(2\sqrt{\lambda})}{I_0(2\sqrt{\lambda})},

where \left\{{m\atop k}\right\} is a Stirling number of the second kind. In particular, the mean of X is given by

\operatorname{E}X = \frac{\sqrt{\lambda}\,I_1(2\sqrt{\lambda})}{I_0(2\sqrt{\lambda})}.

Also, since \operatorname{E}[X^2] = \lambda,

\operatorname{Var}(X) = \lambda\left(1 - \frac{(I_1(2\sqrt{\lambda}))^2}{(I_0(2\sqrt{\lambda}))^2}\right).
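The ν = 2 formulas can be checked numerically; the sketch below uses the series I_r(x) = \sum_k (x/2)^{2k+r}/(k!\,(k+r)!) directly to avoid external dependencies (parameter values are arbitrary):

```python
import math

def bessel_I(r, x, terms=60):
    # Modified Bessel function of the first kind: sum_k (x/2)^(2k+r) / (k! (k+r)!)
    return sum((x / 2)**(2 * k + r) / (math.factorial(k) * math.factorial(k + r))
               for k in range(terms))

def cmp2_mean_direct(lam, terms=80):
    # E X for CMP(lam, 2) by direct summation of lam^j / (j!)^2
    w = [lam**j / math.factorial(j)**2 for j in range(terms)]
    return sum(j * wj for j, wj in enumerate(w)) / sum(w)

lam = 3.0
a = 2 * math.sqrt(lam)
mean_bessel = math.sqrt(lam) * bessel_I(1, a) / bessel_I(0, a)
var_bessel = lam * (1 - (bessel_I(1, a) / bessel_I(0, a))**2)
print(cmp2_mean_direct(lam), mean_bessel)      # the two means agree
# E[X^2] = lambda for nu = 2, so Var X = lambda - (E X)^2
print(var_bessel, lam - cmp2_mean_direct(lam)**2)
```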
Suppose now that \nu\geq 1 is an integer. Then

\operatorname{E}[(X)_r] = \frac{\lambda^r}{(r!)^{\nu-1}}\,\frac{{}_0F_{\nu-1}(;r+1,\ldots,r+1;\lambda)}{{}_0F_{\nu-1}(;1,\ldots,1;\lambda)}.

In particular,

\operatorname{E}[X] = \lambda\,\frac{{}_0F_{\nu-1}(;2,\ldots,2;\lambda)}{{}_0F_{\nu-1}(;1,\ldots,1;\lambda)},

and, since \operatorname{Var}(X) = \operatorname{E}[(X)_2] + \operatorname{E}[X] - (\operatorname{E}[X])^2,

\operatorname{Var}(X) = \frac{\lambda^2}{2^{\nu-1}}\,\frac{{}_0F_{\nu-1}(;3,\ldots,3;\lambda)}{{}_0F_{\nu-1}(;1,\ldots,1;\lambda)} + \operatorname{E}[X] - (\operatorname{E}[X])^2.
Let X\sim\operatorname{CMP}(\lambda,\nu). If \lambda^{1/\nu} is not an integer, then X has the unique mode \lfloor\lambda^{1/\nu}\rfloor. If \lambda^{1/\nu} is an integer, then X is bimodal, with modes \lambda^{1/\nu} and \lambda^{1/\nu}-1.

The mean deviation of X^{\nu} about its mean \lambda is given by

\operatorname{E}|X^{\nu}-\lambda| = \frac{2\lambda^{\lfloor\lambda^{1/\nu}\rfloor+1}}{(\lfloor\lambda^{1/\nu}\rfloor!)^{\nu}\,Z(\lambda,\nu)}.
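Both the mode and the mean-deviation formula can be confirmed by direct computation (a sketch; parameter values are arbitrary, and the weights are computed in log space):

```python
import math

lam, nu = 9.0, 1.4
terms = 100
w = [math.exp(j * math.log(lam) - nu * math.lgamma(j + 1)) for j in range(terms)]
z = sum(w)
p = [wj / z for wj in w]

# mode = floor(lambda^(1/nu)) when lambda^(1/nu) is not an integer
mode = max(range(terms), key=lambda j: p[j])
print(mode, math.floor(lam**(1 / nu)))

# mean deviation of X^nu about lambda: direct sum vs closed form
m = math.floor(lam**(1 / nu))
md_direct = sum(abs(j**nu - lam) * pj for j, pj in enumerate(p))
md_formula = 2 * math.exp((m + 1) * math.log(lam) - nu * math.lgamma(m + 1)) / z
print(md_direct, md_formula)
```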
No explicit formula is known for the median m of X, but the following asymptotic result holds. If X\sim\operatorname{CMP}(\lambda,\nu), then

m = \lambda^{1/\nu} + \mathcal{O}\left(\lambda^{1/(2\nu)}\right),

as \lambda\to\infty.
Let X\sim\operatorname{CMP}(\lambda,\nu), and let f:\mathbb{Z}^{+}\mapsto\mathbb{R} be such that \operatorname{E}|f(X+1)|<\infty and \operatorname{E}|X^{\nu}f(X)|<\infty. Then

\operatorname{E}[\lambda f(X+1) - X^{\nu}f(X)] = 0.

Conversely, suppose now that W is a real-valued random variable supported on \mathbb{Z}^{+} such that \operatorname{E}[\lambda f(W+1) - W^{\nu}f(W)] = 0 for all bounded f:\mathbb{Z}^{+}\mapsto\mathbb{R}. Then W\sim\operatorname{CMP}(\lambda,\nu).
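The Stein identity can be verified numerically for any particular test function (a sketch; the test functions and parameters are arbitrary choices):

```python
import math

def stein_gap(f, lam, nu, terms=120):
    # E[lambda f(X+1) - X^nu f(X)] under CMP(lam, nu); should be ~0
    w = [math.exp(j * math.log(lam) - nu * math.lgamma(j + 1)) for j in range(terms)]
    z = sum(w)
    return sum((lam * f(j + 1) - j**nu * f(j)) * wj for j, wj in enumerate(w)) / z

print(stein_gap(math.sin, 2.0, 1.5))                  # ~ 0
print(stein_gap(lambda x: 1.0 / (1 + x), 2.0, 1.5))   # ~ 0
```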
Let Y_n have the Conway–Maxwell–binomial distribution with parameters n, p=\lambda/n^{\nu} and \nu, where \lambda>0 and \nu>0 are fixed. Then Y_n converges in distribution to the \operatorname{CMP}(\lambda,\nu) distribution as n\to\infty; this generalizes the classical Poisson limit of the binomial distribution. The \operatorname{CMP}(\lambda,\nu) distribution also arises as the r\to+\infty limit of the COM-negative binomial distribution, whose probability mass function P(X=k) involves the ratio \Gamma(r+k)/(k!\,\Gamma(r)).
If X\sim\operatorname{CMP}(\lambda,1), then X follows the Poisson distribution with parameter \lambda. If \lambda<1 and X\sim\operatorname{CMP}(\lambda,0), then X follows the geometric distribution with mass function P(X=k)=\lambda^k(1-\lambda) for k\geq 0. If X_{\nu}\sim\operatorname{CMP}(\lambda,\nu), then X_{\nu} converges in distribution to the Bernoulli distribution with mean \lambda(1+\lambda)^{-1} as \nu\to\infty.
There are a few methods of estimating the parameters of the CMP distribution from the data. Two methods will be discussed: weighted least squares and maximum likelihood. The weighted least squares approach is simple and efficient but lacks precision. Maximum likelihood, on the other hand, is precise, but is more complex and computationally intensive.
The weighted least squares provides a simple, efficient method to derive rough estimates of the parameters of the CMP distribution and determine if the distribution would be an appropriate model. Following the use of this method, an alternative method should be employed to compute more accurate estimates of the parameters if the model is deemed appropriate.
This method uses the relationship of successive probabilities discussed above. Taking logarithms of both sides of that equation gives the linear relationship

\log\frac{p_{x-1}}{p_x} = -\log\lambda + \nu\log x,

where p_x denotes \Pr(X=x). When estimating the parameters, the probabilities can be replaced by the relative frequencies of x and x-1. To determine whether the CMP distribution is an appropriate model, one plots \log(\hat p_{x-1}/\hat p_x) against \log x for the observed counts; an approximately linear plot supports the model.
Once the appropriateness of the model is determined, the parameters can be estimated by fitting a regression of \log(\hat p_{x-1}/\hat p_x) on \log x. However, the variances of the estimated log-ratios are not constant, so a weighted least squares regression should be used, with weights based on

\operatorname{var}\left[\log\frac{\hat p_{x-1}}{\hat p_x}\right] \approx \frac{1}{np_x} + \frac{1}{np_{x-1}},

\operatorname{cov}\left(\log\frac{\hat p_{x-1}}{\hat p_x},\,\log\frac{\hat p_x}{\hat p_{x+1}}\right) \approx -\frac{1}{np_x}.
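The linear relation can be illustrated with exact CMP probabilities: an ordinary least-squares fit of \log(p_{x-1}/p_x) on \log x recovers ν as the slope and -\log\lambda as the intercept. A sketch (with real data one would use relative frequencies and the weighted regression described above):

```python
import math

lam, nu = 2.0, 1.5
terms = 40
w = [lam**j / math.factorial(j)**nu for j in range(terms)]
z = sum(w)
p = [wj / z for wj in w]

xs = list(range(1, 10))
ys = [math.log(p[x - 1] / p[x]) for x in xs]  # log(p_{x-1}/p_x)
lx = [math.log(x) for x in xs]                # log x

# ordinary least-squares slope and intercept
n = len(xs)
mx, my = sum(lx) / n, sum(ys) / n
slope = (sum((a - mx) * (b - my) for a, b in zip(lx, ys))
         / sum((a - mx)**2 for a in lx))
intercept = my - slope * mx
print(slope, math.exp(-intercept))  # recovers nu = 1.5 and lambda = 2.0
```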
The CMP likelihood function is

\mathcal{L}(\lambda,\nu\mid x_1,\ldots,x_n) = \lambda^{S_1}\exp(-\nu S_2)\,Z^{-n}(\lambda,\nu),

where S_1 = \sum_{i=1}^{n}x_i and S_2 = \sum_{i=1}^{n}\log x_i!. Maximizing the likelihood yields the following two equations

\operatorname{E}[X] = \bar{X}, \qquad \operatorname{E}[\log X!] = \overline{\log X!},
which do not have an analytic solution.
Instead, the maximum likelihood estimates are approximated numerically by the Newton–Raphson method. In each iteration, the expectations, variances, and covariance of X and \log X! are approximated by using the estimates of \lambda and \nu from the previous iteration in the expression

\operatorname{E}[f(X)] = \sum_{j=0}^{\infty}f(j)\,\frac{\lambda^j}{(j!)^{\nu}\,Z(\lambda,\nu)}.

This is continued until convergence of \hat\lambda and \hat\nu.
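A sketch of this scheme follows (function names, starting values, and the step safeguard are illustrative choices, not the authors' implementation). It exploits the exponential-family structure: with natural parameters (\log\lambda, -\nu) and sufficient statistics (X, \log X!), the Jacobian of the two expectations is built from their variances and covariance.

```python
import math

def cmp_moments(lam, nu, terms=150):
    # Means, variances, and covariance of X and log X! under CMP(lam, nu)
    w = [math.exp(j * math.log(lam) - nu * math.lgamma(j + 1)) for j in range(terms)]
    z = sum(w)
    p = [wj / z for wj in w]
    g = [math.lgamma(j + 1) for j in range(terms)]  # log j!
    EX = sum(j * pj for j, pj in enumerate(p))
    EG = sum(gj * pj for gj, pj in zip(g, p))
    VX = sum((j - EX)**2 * pj for j, pj in enumerate(p))
    VG = sum((gj - EG)**2 * pj for gj, pj in zip(g, p))
    C = sum((j - EX) * (g[j] - EG) * pj for j, pj in enumerate(p))
    return EX, EG, VX, VG, C

def fit_cmp(xbar, gbar, lam=1.0, nu=1.0, iters=40):
    # Newton-Raphson on the score equations E[X] = xbar, E[log X!] = gbar.
    # Exponential-family identities give the Jacobian w.r.t. (log lam, nu):
    #   dE[X]/dlog(lam) = Var X,       dE[X]/dnu = -Cov(X, log X!)
    #   dE[log X!]/dlog(lam) = Cov,    dE[log X!]/dnu = -Var(log X!)
    for _ in range(iters):
        EX, EG, VX, VG, C = cmp_moments(lam, nu)
        det = C * C - VX * VG          # determinant of the 2x2 Jacobian
        r1, r2 = xbar - EX, gbar - EG
        dloglam = (-VG * r1 + C * r2) / det
        dnu = (-C * r1 + VX * r2) / det
        scale = max(1.0, abs(dloglam), abs(dnu))  # damp large steps
        lam *= math.exp(dloglam / scale)
        nu = max(nu + dnu / scale, 0.05)          # keep nu positive
    return lam, nu

# recover the parameters from model-implied sufficient statistics
EX, EG, _, _, _ = cmp_moments(2.0, 1.5)
print(fit_cmp(EX, EG))  # ~ (2.0, 1.5)
```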
The basic CMP distribution discussed above has also been used as the basis for a generalized linear model (GLM) using a Bayesian formulation. A dual-link GLM based on the CMP distribution has been developed,[9] and this model has been used to evaluate traffic accident data.[10][11] The CMP GLM developed by Guikema and Coffelt (2008) is based on a reformulation of the CMP distribution above, replacing \lambda with \mu = \lambda^{1/\nu}. The integral part of \mu is then the mode of the distribution.
A classical GLM formulation for a CMP regression has been developed which generalizes Poisson regression and logistic regression.[12] This takes advantage of the exponential family properties of the CMP distribution to obtain elegant model estimation (via maximum likelihood), inference, diagnostics, and interpretation. This approach requires substantially less computational time than the Bayesian approach, at the cost of not allowing expert knowledge to be incorporated into the model.[12] In addition it yields standard errors for the regression parameters (via the Fisher Information matrix), in contrast to the full posterior distributions obtainable via the Bayesian formulation. It also provides a statistical test for the level of dispersion compared to a Poisson model. Code for fitting a CMP regression, testing for dispersion, and evaluating fit is available.[13]
The two GLM frameworks developed for the CMP distribution significantly extend the usefulness of this distribution for data analysis problems.