In statistics, the matrix t-distribution (or matrix variate t-distribution) is the generalization of the multivariate t-distribution from vectors to matrices.[1] [2]
The matrix t-distribution shares the same relationship with the multivariate t-distribution that the matrix normal distribution shares with the multivariate normal distribution: If the matrix has only one row, or only one column, the distributions become equivalent to the corresponding (vector-)multivariate distribution. The matrix t-distribution is the compound distribution that results from an infinite mixture of a matrix normal distribution with an inverse Wishart distribution placed over either of its covariance matrices, and the multivariate t-distribution can be generated in a similar way.
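The compound construction can be sketched in code. The following is a minimal illustration, not a reference implementation: the function name `sample_matrix_t` is hypothetical, and it assumes the inverse Wishart prior is placed over the row covariance with $\nu+n-1$ degrees of freedom and scale $\Sigma$ in SciPy's parameterisation, a choice made here so that the mixture matches the density with parameters $\nu$, $M$, $\Sigma$, $\Omega$ defined in this article.

```python
import numpy as np
from scipy.stats import invwishart, matrix_normal


def sample_matrix_t(nu, M, Sigma, Omega, size=1, seed=None):
    """Draw `size` matrices by mixing a matrix normal over an inverse Wishart.

    Assumption (not from the article): the row covariance S is drawn from an
    inverse Wishart with nu + n - 1 degrees of freedom and scale Sigma.
    """
    rng = np.random.default_rng(seed)
    M = np.asarray(M, dtype=float)
    n, p = M.shape
    draws = np.empty((size, n, p))
    for k in range(size):
        # Step 1: draw a random row covariance matrix.
        S = invwishart.rvs(df=nu + n - 1, scale=Sigma, random_state=rng)
        S = np.reshape(S, (n, n))  # SciPy may return a scalar when n == 1
        # Step 2: draw the matrix normal variate given that covariance.
        draws[k] = matrix_normal.rvs(mean=M, rowcov=S, colcov=Omega,
                                     random_state=rng)
    return draws
```

For example, `sample_matrix_t(5, np.zeros((2, 3)), np.eye(2), np.eye(3), size=4, seed=0)` returns an array of shape `(4, 2, 3)`.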
In a Bayesian analysis of a multivariate linear regression model based on the matrix normal distribution, the matrix t-distribution is the posterior predictive distribution.
For a matrix t-distribution, the probability density function at the point $X$ of an $n \times p$ space is
$$f(X;\nu,M,\boldsymbol\Sigma,\boldsymbol\Omega)=K \times \left|I_n+\boldsymbol\Sigma^{-1}(X-M)\boldsymbol\Omega^{-1}(X-M)^{\rm T}\right|^{-\frac{\nu+n+p-1}{2}},$$
where the constant of integration K is given by
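$$K=\frac{\Gamma_p\left(\frac{\nu+n+p-1}{2}\right)}{\pi^{\frac{np}{2}}\,\Gamma_p\left(\frac{\nu+p-1}{2}\right)}\,|\boldsymbol\Omega|^{-\frac{n}{2}}\,|\boldsymbol\Sigma|^{-\frac{p}{2}}.$$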
Here $\Gamma_p$ is the multivariate gamma function.
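The density is easiest to evaluate on the log scale. The sketch below is one possible way to do so with NumPy and SciPy; the function name `matrix_t_logpdf` is hypothetical, and the formula it implements assumes the exponent $-(\nu+n+p-1)/2$ and the normalising constant $K$ built from $\Gamma_p$ as in the definition above.

```python
import numpy as np
from scipy.special import multigammaln


def matrix_t_logpdf(X, nu, M, Sigma, Omega):
    """Log-density of an n-by-p matrix t variate (illustrative sketch)."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    M = np.atleast_2d(np.asarray(M, dtype=float))
    Sigma = np.atleast_2d(np.asarray(Sigma, dtype=float))
    Omega = np.atleast_2d(np.asarray(Omega, dtype=float))
    n, p = X.shape
    D = X - M
    a = 0.5 * (nu + n + p - 1)  # exponent of the determinant term

    _, logdet_sigma = np.linalg.slogdet(Sigma)
    _, logdet_omega = np.linalg.slogdet(Omega)

    # log K, using the log of the multivariate gamma function Gamma_p.
    log_k = (multigammaln(a, p) - multigammaln(0.5 * (nu + p - 1), p)
             - 0.5 * n * p * np.log(np.pi)
             - 0.5 * n * logdet_omega - 0.5 * p * logdet_sigma)

    # I_n + Sigma^{-1} (X - M) Omega^{-1} (X - M)^T, via linear solves.
    inner = np.eye(n) + np.linalg.solve(Sigma, D) @ np.linalg.solve(Omega, D.T)
    _, logdet_inner = np.linalg.slogdet(inner)
    return log_k - a * logdet_inner
```

With $n=p=1$ the result should agree with the log-density of a univariate Student's t with $\nu$ degrees of freedom, location $M$ and scale $\sqrt{\Sigma\Omega/\nu}$, which gives a convenient check.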
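If $X\sim\mathcal{T}_{n\times p}(\nu,M,\Sigma,\Omega)$, then the following properties hold.
The mean, or expected value, is, if $\nu>1$:
$$E[X]=M$$
and the second-order expectations are, if $\nu>2$:
$$E[(X-M)(X-M)^T]=\frac{\Sigma\operatorname{tr}(\Omega)}{\nu-2}$$
$$E[(X-M)^T(X-M)]=\frac{\Omega\operatorname{tr}(\Sigma)}{\nu-2}$$
where $\operatorname{tr}$ denotes the trace.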
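As a quick sanity check of the second-order formulas, consider the scalar case $n=p=1$, writing $\Sigma=\sigma$ and $\Omega=\omega$ (notation introduced here for illustration only). The density then reduces to a univariate Student's t density with $\nu$ degrees of freedom, location $M$ and squared scale $s^2=\sigma\omega/\nu$, so both formulas should return its variance:

$$E[(X-M)^2]=\frac{\nu}{\nu-2}\,s^2=\frac{\nu}{\nu-2}\cdot\frac{\sigma\omega}{\nu}=\frac{\sigma\omega}{\nu-2}=\frac{\Sigma\operatorname{tr}(\Omega)}{\nu-2}=\frac{\Omega\operatorname{tr}(\Sigma)}{\nu-2}.$$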
More generally, for appropriately dimensioned matrices A,B,C:
\begin{align}
E[(X-M)A(X-M)^T] &= \frac{\Sigma\operatorname{tr}(A^T\Omega)}{\nu-2}\\
E[(X-M)^TB(X-M)] &= \frac{\Omega\operatorname{tr}(B^T\Sigma)}{\nu-2}\\
E[(X-M)C(X-M)] &= \frac{\Sigma C^T\Omega}{\nu-2}
\end{align}
Transpose transform:
$$X^T\sim\mathcal{T}_{p\times n}(\nu,M^T,\Omega,\Sigma)$$
Linear transform: let $A$ ($r$-by-$n$) be of full rank $r \le n$ and $B$ ($p$-by-$s$) be of full rank $s \le p$; then
$$AXB\sim\mathcal{T}_{r\times s}(\nu,AMB,A\Sigma A^T,B^T\Omega B)$$
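The linear transform rule amounts to simple bookkeeping on the parameters. The helper below is a small sketch of that bookkeeping; the name `transform_params` and the rank check via `numpy.linalg.matrix_rank` are illustrative choices, not part of any library API for this distribution.

```python
import numpy as np


def transform_params(nu, M, Sigma, Omega, A, B):
    """Parameters of AXB when X is matrix t with (nu, M, Sigma, Omega)."""
    M, Sigma, Omega, A, B = (np.asarray(v, dtype=float)
                             for v in (M, Sigma, Omega, A, B))
    r, _ = A.shape
    _, s = B.shape
    # The rule stated above requires A to have full row rank r
    # and B to have full column rank s.
    if np.linalg.matrix_rank(A) != r or np.linalg.matrix_rank(B) != s:
        raise ValueError("A must have full row rank and B full column rank")
    # Degrees of freedom are unchanged; location and both scales transform.
    return nu, A @ M @ B, A @ Sigma @ A.T, B.T @ Omega @ B
```

Taking $A$ as a single row and $B$ as a single column, for instance, yields the parameters of a scalar contrast of the entries of $X$.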
The characteristic function and various other properties can be derived from the re-parameterised formulation (see below).
An alternative parameterisation of the matrix t-distribution uses two parameters $\alpha$ and $\beta$ in place of $\nu$.
This formulation reduces to the standard matrix t-distribution with $\beta=2$, $\alpha=\frac{\nu+p-1}{2}$.
This formulation of the matrix t-distribution can be derived as the compound distribution that results from an infinite mixture of a matrix normal distribution with an inverse multivariate gamma distribution placed over either of its covariance matrices.
If $X\sim{\rm T}_{n,p}(\alpha,\beta,M,\boldsymbol\Sigma,\boldsymbol\Omega)$ then
$$X^{\rm T}\sim{\rm T}_{p,n}(\alpha,\beta,M^{\rm T},\boldsymbol\Omega,\boldsymbol\Sigma).$$
The property above comes from Sylvester's determinant theorem:
$$\det\left(I_n+\frac{\beta}{2}\boldsymbol\Sigma^{-1}(X-M)\boldsymbol\Omega^{-1}(X-M)^{\rm T}\right)=\det\left(I_p+\frac{\beta}{2}\boldsymbol\Omega^{-1}(X^{\rm T}-M^{\rm T})\boldsymbol\Sigma^{-1}(X^{\rm T}-M^{\rm T})^{\rm T}\right).$$
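The determinant identity can be checked numerically. The following sketch evaluates both sides for arbitrary inputs; the function name `sylvester_sides` and the randomly generated test matrices are assumptions made purely for illustration.

```python
import numpy as np


def sylvester_sides(Sigma, Omega, D, beta=2.0):
    """Return both sides of the determinant identity, with D = X - M."""
    n, p = D.shape
    left_factor = np.linalg.solve(Sigma, D)     # Sigma^{-1} (X - M), n-by-p
    right_factor = np.linalg.solve(Omega, D.T)  # Omega^{-1} (X - M)^T, p-by-n
    lhs = np.linalg.det(np.eye(n) + 0.5 * beta * left_factor @ right_factor)
    rhs = np.linalg.det(np.eye(p) + 0.5 * beta * right_factor @ left_factor)
    return lhs, rhs


# Illustrative inputs: random symmetric positive definite scale matrices.
rng = np.random.default_rng(0)
G_n = rng.standard_normal((3, 3))
G_p = rng.standard_normal((4, 4))
Sigma_demo = G_n @ G_n.T + 3 * np.eye(3)
Omega_demo = G_p @ G_p.T + 4 * np.eye(4)
D_demo = rng.standard_normal((3, 4))
lhs_demo, rhs_demo = sylvester_sides(Sigma_demo, Omega_demo, D_demo)
```

The two returned values should agree up to floating-point error, although one is the determinant of an $n\times n$ matrix and the other of a $p\times p$ matrix.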
If $X\sim{\rm T}_{n,p}(\alpha,\beta,M,\boldsymbol\Sigma,\boldsymbol\Omega)$ and $A$ ($n\times n$) and $B$ ($p\times p$) are nonsingular matrices, then
$$AXB\sim{\rm T}_{n,p}(\alpha,\beta,AMB,A\boldsymbol\Sigma A^{\rm T},B^{\rm T}\boldsymbol\Omega B).$$
The characteristic function is[3]
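$$\phi_T(Z)=\frac{\exp({\rm tr}(iZ'M))\,|\boldsymbol\Omega|^{\alpha}}{\Gamma_p(\alpha)(2\beta)^{\alpha p}}\,|Z'\boldsymbol\Sigma Z|^{\alpha}\,B_\alpha\left(\frac{1}{2\beta}Z'\boldsymbol\Sigma Z\boldsymbol\Omega\right),$$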
where
$$B_\delta(WZ)=|W|^{-\delta}\int_{S>0}\exp\left({\rm tr}(-SW-S^{-1}Z)\right)|S|^{-\delta-\frac{1}{2}(p+1)}\,dS,$$
and where
$B_\delta$ is the type-two Bessel function of Herz of a matrix argument.