In mathematics, the Gershgorin circle theorem may be used to bound the spectrum of a square matrix. It was first published by the Soviet mathematician Semyon Aronovich Gershgorin in 1931. Gershgorin's name has been transliterated in several different ways, including Geršgorin, Gerschgorin, Gershgorin, Hershhorn, and Hirschhorn.
Let
A
n x n
aij
i\in\{1,...,n\}
Ri
i
Ri=\sumj ≠ {i
Let
D(aii,Ri)\subseteq\Complex
aii
Ri
Theorem. Every eigenvalue of
A
D(aii,Ri).
Proof. Let
λ
A
x=(xj)
xi
Ax=λx
\sumjaijxj=λxi.
Taking
aii
\sumjaijxj=(λ-aii)xi.
Therefore, applying the triangle inequality and recalling that
\left|xj\right| | |
\left|xi\right| |
\le1
\left|λ-aii\right| =\left|\sumj
aijxj | |
xi |
\right|\le\sumj\left|
aijxj | |
xi |
\right|=\sumj\left|aij\right|
\left|xj\right| | |
\left|xi\right| |
\le\sumj\left|aij\right|=Ri.
Corollary. The eigenvalues of A must also lie within the Gershgorin discs Cj corresponding to the columns of A.
Proof. Apply the Theorem to AT while recognizing that the eigenvalues of the transpose are the same as those of the original matrix.
Example. For a diagonal matrix, the Gershgorin discs coincide with the spectrum. Conversely, if the Gershgorin discs coincide with the spectrum, the matrix is diagonal.
One way to interpret this theorem is that if the off-diagonal entries of a square matrix over the complex numbers have small norms, the eigenvalues of the matrix cannot be "far from" the diagonal entries of the matrix. Therefore, by reducing the norms of off-diagonal entries one can attempt to approximate the eigenvalues of the matrix. Of course, diagonal entries may change in the process of minimizing off-diagonal entries.
The theorem does not claim that there is one disc for each eigenvalue; if anything, the discs rather correspond to the axes in
Cn
\begin{pmatrix}3&2&2\ 1&1&0\ 1&0&1\end{pmatrix} \begin{pmatrix}a&0&0\ 0&b&0\ 0&0&c\end{pmatrix} \begin{pmatrix}3&2&2\ 1&1&0\ 1&0&1\end{pmatrix}-1=\begin{pmatrix}-3a+2b+2c&6a-2b-4c&6a-4b-2c\ b-a&a+(a-b)&2(a-b)\ c-a&2(a-c)&a+(a-c)\end{pmatrix}
a
b
c
\left(\begin{smallmatrix}3\ 1\ 1\end{smallmatrix}\right)
\left(\begin{smallmatrix}2\ 1\ 0\end{smallmatrix}\right)
\left(\begin{smallmatrix}2\ 0\ 1\end{smallmatrix}\right)
a
b
a
c
If one of the discs is disjoint from the others then it contains exactly one eigenvalue. If however it meets another disc it is possible that it contains no eigenvalue (for example,
A=\left(\begin{smallmatrix}0&1\\4&0\end{smallmatrix}\right)
A=\left(\begin{smallmatrix}1&-2\\1&-1\end{smallmatrix}\right)
Theorem: If the union of k discs is disjoint from the union of the other n − k discs then the former union contains exactly k and the latter n − k eigenvalues of A, when the eigenvalues are counted with their algebraic multiplicities.
Proof: Let D be the diagonal matrix with entries equal to the diagonal entries of A and let
B(t)=(1-t)D+tA.
We will use the fact that the eigenvalues are continuous in
t
t
The statement is true for
D=B(0)
B(t)
B(t)
t\in[0,1]
d>0
B(t)
B(t)
λ(t)
B(t)
d(t)
d(0)\ged
λ(1)
d(1)=0
0<t0<1
0<d(t0)<d
λ(t0)
λ(1)
Remarks: It is necessary to count the eigenvalues with respect to their algebraic multiplicities. Here is a counter-example :
Consider the matrix,
The union of the first 3 disks does not intersect the last 2, but the matrix has only 2 eigenvectors, e1,e4, and therefore only 2 eigenvalues, demonstrating that theorem is false in its formulation. The demonstration of the shows only that eigenvalues are distinct, however any affirmation about number of them is something that does not fit, and this is a counterexample.
λ(t)
Cn
an\equiv1
λ(t)
B(t)
λ(t)
Cn
Added Remark:
The Gershgorin circle theorem is useful in solving matrix equations of the form Ax = b for x where b is a vector and A is a matrix with a large condition number.
In this kind of problem, the error in the final result is usually of the same order of magnitude as the error in the initial data multiplied by the condition number of A. For instance, if b is known to six decimal places and the condition number of A is 1000 then we can only be confident that x is accurate to three decimal places. For very high condition numbers, even very small errors due to rounding can be magnified to such an extent that the result is meaningless.
It would be good to reduce the condition number of A. This can be done by preconditioning: A matrix P such that P ≈ A−1 is constructed, and then the equation PAx = Pb is solved for x. Using the exact inverse of A would be nice but finding the inverse of a matrix is something we want to avoid because of the computational expense.
Now, since PA ≈ I where I is the identity matrix, the eigenvalues of PA should all be close to 1. By the Gershgorin circle theorem, every eigenvalue of PA lies within a known area and so we can form a rough estimate of how good our choice of P was.
Use the Gershgorin circle theorem to estimate the eigenvalues of:
A=\begin{bmatrix} 10&1&0&1\\ 0.2&8&0.2&0.2\\ 1&1&2&1\\ -1&-1&-1&-11\\ \end{bmatrix}.
Starting with row one, we take the element on the diagonal, aii as the center for the disc. We then take the remaining elements in the row and apply the formula
\sumj\ne|aij|=Ri
to obtain the following four discs:
D(10,2), D(8,0.6), D(2,3), and D(-11,3).
Note that we can improve the accuracy of the last two discs by applying the formula to the corresponding columns of the matrix, obtaining
D(2,1.2)
D(-11,2.2)
The eigenvalues are -10.870, 1.906, 10.046, 7.918. Note that this is a (column) diagonally dominant matrix: . This means that most of the matrix is in the diagonal, which explains why the eigenvalues are so close to the centers of the circles, and the estimates are very good. For a random matrix, we would expect the eigenvalues to be substantially further from the centers of the circles.