The Schur complement of a block matrix, encountered in linear algebra and the theory of matrices, is defined as follows.
Suppose p, q are nonnegative integers such that p + q > 0, and suppose A, B, C, D are respectively p × p, p × q, q × p, and q × q matrices of complex numbers. Let
M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}
so that M is a (p + q) × (p + q) matrix.
If D is invertible, then the Schur complement of the block D of the matrix M is the p × p matrix defined by
M/D = A - BD^{-1}C.
If A is invertible, the Schur complement of the block A of the matrix M is the q × q matrix defined by
M/A = D - CA^{-1}B.
In the case that A or D is singular, substituting a generalized inverse for the inverses in M/A and M/D yields the generalized Schur complement.
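As an illustration, the two definitions can be checked numerically. The following is a minimal sketch in plain Python with exact rational arithmetic; the block sizes (p = q = 2), the entries, and the helper functions are chosen for the example only:

```python
from fractions import Fraction as F

def matmul(X, Y):
    # Plain list-of-lists matrix product.
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matsub(X, Y):
    return [[X[i][j] - Y[i][j] for j in range(len(X[0]))] for i in range(len(X))]

def inv2(X):
    # Inverse of a 2x2 matrix; the determinant is assumed non-zero.
    (a, b), (c, d) = X
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Example blocks of M = [[A, B], [C, D]].
A = [[F(4), F(1)], [F(2), F(3)]]
B = [[F(1), F(0)], [F(0), F(1)]]
C = [[F(0), F(1)], [F(1), F(0)]]
D = [[F(2), F(0)], [F(0), F(2)]]

M_over_D = matsub(A, matmul(matmul(B, inv2(D)), C))  # M/D = A - B D^{-1} C
M_over_A = matsub(D, matmul(matmul(C, inv2(A)), B))  # M/A = D - C A^{-1} B
print(M_over_D)   # [[4, 1/2], [3/2, 3]]
print(M_over_A)   # [[11/5, -2/5], [-3/10, 21/10]]
```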
The Schur complement is named after Issai Schur,[1] who used it to prove Schur's lemma, although it had been used previously.[2] Emilie Virginia Haynsworth was the first to call it the Schur complement.[3] The Schur complement is a key tool in the fields of numerical analysis, statistics, and matrix analysis. It is sometimes referred to as the Feshbach map, after the physicist Herman Feshbach.[4]
The Schur complement arises when performing a block Gaussian elimination on the matrix M. In order to eliminate the elements below the block diagonal, one multiplies the matrix M by a block lower triangular matrix on the right as follows:
M \to \begin{bmatrix} A & B \\ C & D \end{bmatrix}\begin{bmatrix} I_p & 0 \\ -D^{-1}C & I_q \end{bmatrix} = \begin{bmatrix} A - BD^{-1}C & B \\ 0 & D \end{bmatrix},
where I_p denotes a p × p identity matrix. As a result, the Schur complement
M/D = A - BD^{-1}C
appears in the upper-left p × p block.
Continuing the elimination process beyond this point (i.e., performing a block Gauss–Jordan elimination),
\begin{bmatrix} A - BD^{-1}C & B \\ 0 & D \end{bmatrix} \to \begin{bmatrix} I_p & -BD^{-1} \\ 0 & I_q \end{bmatrix}\begin{bmatrix} A - BD^{-1}C & B \\ 0 & D \end{bmatrix} = \begin{bmatrix} A - BD^{-1}C & 0 \\ 0 & D \end{bmatrix},
leads to an LDU decomposition of M, which reads
M = \begin{bmatrix} I_p & BD^{-1} \\ 0 & I_q \end{bmatrix}\begin{bmatrix} A - BD^{-1}C & 0 \\ 0 & D \end{bmatrix}\begin{bmatrix} I_p & 0 \\ D^{-1}C & I_q \end{bmatrix}.
Thus, the inverse of M may be expressed in terms of D^{-1} and the inverse of the Schur complement, assuming it exists, as
M^{-1} = \begin{bmatrix} I_p & 0 \\ -D^{-1}C & I_q \end{bmatrix}\begin{bmatrix} \left(M/D\right)^{-1} & 0 \\ 0 & D^{-1} \end{bmatrix}\begin{bmatrix} I_p & -BD^{-1} \\ 0 & I_q \end{bmatrix} = \begin{bmatrix} \left(M/D\right)^{-1} & -\left(M/D\right)^{-1}BD^{-1} \\ -D^{-1}C\left(M/D\right)^{-1} & D^{-1} + D^{-1}C\left(M/D\right)^{-1}BD^{-1} \end{bmatrix}.
The above relationship comes from the elimination operations that involve D^{-1} and M/D. An equivalent derivation can be done with the roles of A and D interchanged. By equating the expressions for M^{-1} obtained in these two different ways, one can establish the matrix inversion lemma, which relates the two Schur complements of M, M/D and M/A (see the derivation from the LDU decomposition in the article on the Woodbury matrix identity).
If p and q are both 1 (i.e., A, B, C and D are all scalars), we get the familiar formula for the inverse of a 2 × 2 matrix:
M^{-1} = \frac{1}{AD - BC}\begin{bmatrix} D & -B \\ -C & A \end{bmatrix},
provided that AD − BC is non-zero.
In general, if A is invertible, then
\begin{align} M &= \begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} I_p & 0 \\ CA^{-1} & I_q \end{bmatrix}\begin{bmatrix} A & 0 \\ 0 & D - CA^{-1}B \end{bmatrix}\begin{bmatrix} I_p & A^{-1}B \\ 0 & I_q \end{bmatrix}, \\[4pt] M^{-1} &= \begin{bmatrix} A^{-1} + A^{-1}B\left(M/A\right)^{-1}CA^{-1} & -A^{-1}B\left(M/A\right)^{-1} \\ -\left(M/A\right)^{-1}CA^{-1} & \left(M/A\right)^{-1} \end{bmatrix}
\end{align}
whenever this inverse exists.
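As a sanity check, the block formula for M^{-1} can be verified numerically: assembling the four blocks of M^{-1} from A^{-1} and (M/A)^{-1} and multiplying back should give the identity. Below is a minimal sketch in plain Python with exact rationals; the particular 2 × 2 blocks are arbitrary example data:

```python
from fractions import Fraction as F

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matadd(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(len(X[0]))] for i in range(len(X))]

def matsub(X, Y):
    return [[X[i][j] - Y[i][j] for j in range(len(X[0]))] for i in range(len(X))]

def neg(X):
    return [[-v for v in row] for row in X]

def inv2(X):
    (a, b), (c, d) = X
    det = a * d - b * c  # assumed non-zero
    return [[d / det, -b / det], [-c / det, a / det]]

def blocks(TL, TR, BL, BR):
    # Assemble a 4x4 matrix from four 2x2 blocks.
    return [TL[i] + TR[i] for i in range(2)] + [BL[i] + BR[i] for i in range(2)]

A = [[F(4), F(1)], [F(2), F(3)]]
B = [[F(1), F(0)], [F(0), F(1)]]
C = [[F(0), F(1)], [F(1), F(0)]]
D = [[F(2), F(0)], [F(0), F(2)]]

Ainv = inv2(A)
S = matsub(D, matmul(matmul(C, Ainv), B))   # S = M/A = D - C A^{-1} B
Sinv = inv2(S)

# The four blocks of M^{-1} from the formula above.
TL = matadd(Ainv, matmul(matmul(matmul(Ainv, B), Sinv), matmul(C, Ainv)))
TR = neg(matmul(matmul(Ainv, B), Sinv))
BL = neg(matmul(Sinv, matmul(C, Ainv)))

M = blocks(A, B, C, D)
Minv = blocks(TL, TR, BL, Sinv)
I4 = [[F(int(i == j)) for j in range(4)] for i in range(4)]
print(matmul(M, Minv) == I4)   # True
```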
When A, respectively D, is invertible, the determinant of M is given by
\det(M) = \det(A)\det\left(D - CA^{-1}B\right),
respectively
\det(M) = \det(D)\det\left(A - BD^{-1}C\right),
which generalizes the determinant formula for 2 × 2 matrices.
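A quick numerical check of the determinant identity, sketched in plain Python with exact rationals (the blocks below are example data only):

```python
from fractions import Fraction as F

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matsub(X, Y):
    return [[X[i][j] - Y[i][j] for j in range(len(X[0]))] for i in range(len(X))]

def inv2(X):
    (a, b), (c, d) = X
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def det(X):
    # Laplace expansion along the first row (fine for small matrices).
    if len(X) == 1:
        return X[0][0]
    return sum((-1) ** j * X[0][j]
               * det([row[:j] + row[j + 1:] for row in X[1:]])
               for j in range(len(X)))

A = [[F(4), F(1)], [F(2), F(3)]]
B = [[F(1), F(0)], [F(0), F(1)]]
C = [[F(0), F(1)], [F(1), F(0)]]
D = [[F(2), F(0)], [F(0), F(2)]]
M = [A[0] + B[0], A[1] + B[1], C[0] + D[0], C[1] + D[1]]

lhs = det(M)
rhs = det(A) * det(matsub(D, matmul(matmul(C, inv2(A)), B)))  # det(A) det(D - C A^{-1} B)
print(lhs == rhs)   # True
```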
(Guttman rank additivity formula) If D is invertible, then the rank of M is given by
\operatorname{rank}(M) = \operatorname{rank}(D) + \operatorname{rank}\left(A - BD^{-1}C\right).
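The rank identity can likewise be demonstrated on a small example. The following plain-Python sketch computes ranks by Gauss–Jordan elimination over exact rationals; the blocks are deliberately chosen so that the Schur complement vanishes:

```python
from fractions import Fraction as F

def rank(X):
    # Row-reduce a copy of X over the rationals and count pivots.
    X = [row[:] for row in X]
    r = 0
    for c in range(len(X[0])):
        piv = next((i for i in range(r, len(X)) if X[i][c] != 0), None)
        if piv is None:
            continue
        X[r], X[piv] = X[piv], X[r]
        X[r] = [v / X[r][c] for v in X[r]]
        for i in range(len(X)):
            if i != r and X[i][c] != 0:
                X[i] = [a - X[i][c] * b for a, b in zip(X[i], X[r])]
        r += 1
    return r

one, zero = F(1), F(0)
# All four blocks are the 2x2 identity, so D is invertible and
# A - B D^{-1} C = 0; rank(M) should therefore equal rank(D) = 2.
A = [[one, zero], [zero, one]]
B = [[one, zero], [zero, one]]
C = [[one, zero], [zero, one]]
D = [[one, zero], [zero, one]]
M = [A[0] + B[0], A[1] + B[1], C[0] + D[0], C[1] + D[1]]

schur = [[A[i][j] - sum(B[i][k] * C[k][j] for k in range(2))
          for j in range(2)] for i in range(2)]  # A - B D^{-1} C, with D = I
print(rank(M) == rank(D) + rank(schur))   # True
```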
(Quotient identity) A/B = \left(\left(A/C\right)/\left(B/C\right)\right), whenever all the Schur complements involved are defined.
The Schur complement arises naturally in solving a system of linear equations such as
\begin{bmatrix} A & B \\ C & D \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} u \\ v \end{bmatrix},
where x, u are p-dimensional column vectors and y, v are q-dimensional column vectors.
Assuming that the submatrix A is invertible, we can eliminate x from the equations as follows:
x = A^{-1}(u - By).
Substituting this expression into the second equation yields
\left(D - CA^{-1}B\right)y = v - CA^{-1}u.
We refer to this as the reduced equation obtained by eliminating x from the original equation. The matrix appearing in the reduced equation is the Schur complement of the first block A in M:
S \overset{\mathrm{def}}{=} D - CA^{-1}B.
Solving the reduced equation, we obtain
y = S^{-1}\left(v - CA^{-1}u\right).
Substituting this into the first equation yields
x = \left(A^{-1} + A^{-1}BS^{-1}CA^{-1}\right)u - A^{-1}BS^{-1}v.
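The two-step solve above (reduce, solve for y, back-substitute for x) can be sketched in plain Python. The block data and right-hand side below are arbitrary examples, and exact rationals stand in for floating point:

```python
from fractions import Fraction as F

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matsub(X, Y):
    return [[X[i][j] - Y[i][j] for j in range(len(X[0]))] for i in range(len(X))]

def inv2(X):
    (a, b), (c, d) = X
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(X, v):
    return [sum(X[i][j] * v[j] for j in range(len(v))) for i in range(len(X))]

A = [[F(4), F(1)], [F(2), F(3)]]
B = [[F(1), F(0)], [F(0), F(1)]]
C = [[F(0), F(1)], [F(1), F(0)]]
D = [[F(2), F(0)], [F(0), F(2)]]
u = [F(1), F(2)]
v = [F(3), F(4)]

Ainv = inv2(A)
S = matsub(D, matmul(matmul(C, Ainv), B))   # Schur complement S = D - C A^{-1} B
# Reduced equation: S y = v - C A^{-1} u
y = matvec(inv2(S), [vi - wi for vi, wi in zip(v, matvec(matmul(C, Ainv), u))])
# Back-substitution: x = A^{-1} (u - B y)
x = matvec(Ainv, [ui - wi for ui, wi in zip(u, matvec(B, y))])

# Check: A x + B y = u and C x + D y = v
print([a + b for a, b in zip(matvec(A, x), matvec(B, y))] == u)   # True
print([c + d for c, d in zip(matvec(C, x), matvec(D, y))] == v)   # True
```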
We can express the above two equations as:
\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} A^{-1} + A^{-1}BS^{-1}CA^{-1} & -A^{-1}BS^{-1} \\ -S^{-1}CA^{-1} & S^{-1} \end{bmatrix}\begin{bmatrix} u \\ v \end{bmatrix}.
Therefore, a formulation for the inverse of a block matrix is:
\begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1} = \begin{bmatrix} A^{-1} + A^{-1}BS^{-1}CA^{-1} & -A^{-1}BS^{-1} \\ -S^{-1}CA^{-1} & S^{-1} \end{bmatrix} = \begin{bmatrix} I_p & -A^{-1}B \\ & I_q \end{bmatrix}\begin{bmatrix} A^{-1} & \\ & S^{-1} \end{bmatrix}\begin{bmatrix} I_p & \\ -CA^{-1} & I_q \end{bmatrix}.
In particular, we see that the Schur complement is the inverse of the 2,2 block entry of the inverse of M.
In practice, one needs A to be well-conditioned in order for this algorithm to be numerically accurate.
This method is useful in electrical engineering to reduce the dimension of a network's equations. It is especially useful when element(s) of the output vector are zero. For example, when u or v is zero, we can eliminate the associated rows of the coefficient matrix without any changes to the rest of the output vector. If v is null then the above equation for x reduces to
x = \left(A^{-1} + A^{-1}BS^{-1}CA^{-1}\right)u,
thus reducing the dimension of the coefficient matrix while leaving u unmodified. This is used to advantage in electrical engineering, where it is referred to as node elimination or Kron reduction.
Suppose the random column vectors X, Y live in \mathbb{R}^n and \mathbb{R}^m respectively, and the vector (X, Y) in \mathbb{R}^{n+m} has a multivariate normal distribution whose covariance is the symmetric positive-definite matrix
\Sigma = \begin{bmatrix} A & B \\ B^{\mathrm{T}} & C \end{bmatrix},
where A \in \mathbb{R}^{n \times n} is the covariance matrix of X, C \in \mathbb{R}^{m \times m} is the covariance matrix of Y and B \in \mathbb{R}^{n \times m} is the covariance matrix between X and Y.
Then the conditional covariance of X given Y is the Schur complement of C in \Sigma:[7]
\begin{align} \operatorname{Cov}(X \mid Y) &= A - BC^{-1}B^{\mathrm{T}} \\ \operatorname{E}(X \mid Y) &= \operatorname{E}(X) + BC^{-1}(Y - \operatorname{E}(Y)) \end{align}
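With scalar blocks (n = m = 1) these formulas reduce to the familiar regression identities Cov(X | Y) = a − b²/c and E(X | Y) = E(X) + (b/c)(Y − E(Y)). A minimal sketch, with made-up example values for the covariance entries:

```python
from fractions import Fraction as F

# Scalar covariance blocks of Sigma = [[a, b], [b, c]] (example values).
a = F(4)        # Var(X)
c = F(2)        # Var(Y)
b = F(6, 5)     # Cov(X, Y)

cond_var = a - b * b / c   # Schur complement of c in Sigma = Cov(X | Y)
beta = b / c               # regression coefficient of X on Y

# Conditioning never increases the variance.
print(cond_var)          # 82/25
print(cond_var <= a)     # True
```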
If we take the matrix \Sigma above to be, not a covariance of a random vector, but a sample covariance, then it may have a Wishart distribution. In that case, the Schur complement of C in \Sigma also has a Wishart distribution.
Let X be a symmetric matrix of real numbers given by
X = \begin{bmatrix} A & B \\ B^{\mathrm{T}} & C \end{bmatrix}.
Then:
- If A is invertible, then X is positive definite if and only if A and the complement X/A are both positive definite:
X \succ 0 \Leftrightarrow A \succ 0 \text{ and } X/A = C - B^{\mathrm{T}}A^{-1}B \succ 0.
- If C is invertible, then X is positive definite if and only if C and the complement X/C are both positive definite:
X \succ 0 \Leftrightarrow C \succ 0 \text{ and } X/C = A - BC^{-1}B^{\mathrm{T}} \succ 0.
- If A is positive definite, then X is positive semi-definite if and only if the complement X/A is positive semi-definite:
\text{if } A \succ 0, \text{ then } X \succeq 0 \Leftrightarrow X/A = C - B^{\mathrm{T}}A^{-1}B \succeq 0.
- If C is positive definite, then X is positive semi-definite if and only if the complement X/C is positive semi-definite:
\text{if } C \succ 0, \text{ then } X \succeq 0 \Leftrightarrow X/C = A - BC^{-1}B^{\mathrm{T}} \succeq 0.
The first and third statements can be derived[8] by considering the minimizer of the quantity
u^{\mathrm{T}}Au + 2v^{\mathrm{T}}B^{\mathrm{T}}u + v^{\mathrm{T}}Cv,
as a function of v (for fixed u).
Furthermore, since
\begin{bmatrix} A & B \\ B^{\mathrm{T}} & C \end{bmatrix} \succ 0 \Longleftrightarrow \begin{bmatrix} C & B^{\mathrm{T}} \\ B & A \end{bmatrix} \succ 0,
and similarly for positive semi-definite matrices, the second (respectively fourth) statement is immediate from the first (respectively third) statement.
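The first statement can be checked on a small example. The sketch below uses Sylvester's criterion (all leading principal minors positive) to test positive definiteness; the blocks A (2 × 2), B (2 × 1), and C (1 × 1) are example data, and A = 2I is used to simplify A^{-1} by hand:

```python
from fractions import Fraction as F

def det(X):
    # Laplace expansion along the first row.
    if len(X) == 1:
        return X[0][0]
    return sum((-1) ** j * X[0][j]
               * det([row[:j] + row[j + 1:] for row in X[1:]])
               for j in range(len(X)))

def is_pd(X):
    # Sylvester's criterion: a symmetric matrix is positive definite
    # iff all leading principal minors are positive.
    return all(det([row[:k] for row in X[:k]]) > 0 for k in range(1, len(X) + 1))

A = [[F(2), F(0)], [F(0), F(2)]]   # A = 2 I, so A^{-1} = I / 2
B = [[F(1)], [F(0)]]
C = [[F(1)]]
X = [[F(2), F(0), F(1)],
     [F(0), F(2), F(0)],
     [F(1), F(0), F(1)]]           # X = [[A, B], [B^T, C]]

# Schur complement X/A = C - B^T A^{-1} B (here a 1x1 matrix).
XA = [[C[0][0] - (B[0][0] * B[0][0] + B[1][0] * B[1][0]) / F(2)]]
print(is_pd(A), is_pd(XA), is_pd(X))   # True True True
```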
There is also a necessary and sufficient condition for the positive semi-definiteness of X in terms of a generalized Schur complement. Precisely,
X \succeq 0 \Leftrightarrow A \succeq 0,\; C - B^{\mathrm{T}}A^{g}B \succeq 0,\; \left(I - AA^{g}\right)B = 0
and
X \succeq 0 \Leftrightarrow C \succeq 0,\; A - BC^{g}B^{\mathrm{T}} \succeq 0,\; \left(I - CC^{g}\right)B^{\mathrm{T}} = 0,
where A^{g} denotes a generalized inverse of A (respectively C^{g} of C).