Kabsch algorithm explained

The Kabsch algorithm, also known as the Kabsch-Umeyama algorithm,[1] named after Wolfgang Kabsch and Shinji Umeyama, is a method for calculating the optimal rotation matrix that minimizes the RMSD (root mean squared deviation) between two paired sets of points. It is useful for point-set registration in computer graphics, and in cheminformatics and bioinformatics to compare molecular and protein structures (in particular, see root-mean-square deviation (bioinformatics)).

The algorithm only computes the rotation matrix, but it also requires the computation of a translation vector. When both the translation and rotation are actually performed, the algorithm is sometimes called partial Procrustes superimposition (see also orthogonal Procrustes problem).

Description

Let $P$ and $Q$ be two sets, each containing $N$ points in $\mathbb{R}^n$. We want to find the transformation from $Q$ to $P$. For simplicity, we will consider the three-dimensional case ($n = 3$). The sets $P$ and $Q$ can each be represented by $N \times 3$ matrices, with the first row containing the coordinates of the first point, the second row containing the coordinates of the second point, and so on, as shown in this matrix:

$$\begin{pmatrix} x_1 & y_1 & z_1 \\ x_2 & y_2 & z_2 \\ \vdots & \vdots & \vdots \\ x_N & y_N & z_N \end{pmatrix}$$

The algorithm works in three steps: a translation, the computation of a covariance matrix, and the computation of the optimal rotation matrix.

Translation

Both sets of coordinates must be translated first, so that the centroid of each set coincides with the origin of the coordinate system. This is done by subtracting from each point the coordinates of the centroid of the set it belongs to.
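The translation step can be sketched in a few lines of NumPy. This is a minimal illustration, assuming each point set is stored as an array with one point per row; the function name `center` and the sample coordinates are made up for the example and do not come from any library.

```python
import numpy as np

def center(points):
    """Translate a point set so that its centroid lies at the origin."""
    points = np.asarray(points, dtype=float)
    centroid = points.mean(axis=0)       # mean of each coordinate column
    return points - centroid, centroid   # broadcasting subtracts it from every row

# Illustrative coordinates only (three points in 3D).
P = np.array([[1.0, 2.0, 3.0],
              [3.0, 2.0, 1.0],
              [2.0, 5.0, 2.0]])
P_centered, P_centroid = center(P)
```

The centroid is returned alongside the centered points because it is needed later to recover the translation vector.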

Computation of the covariance matrix

The second step consists of calculating a matrix $H$. In matrix notation,

$$H = P^T Q$$

or, using summation notation,

$$H_{ij} = \sum_{k=1}^{N} P_{ki} Q_{kj},$$

which is a cross-covariance matrix when $P$ and $Q$ are seen as data matrices.
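As a small sketch of this step, the matrix product and the explicit summation can be compared directly in NumPy. The function name `cross_covariance` and the sample points are assumptions for illustration; both arrays are taken to be already centered, with paired points in matching rows.

```python
import numpy as np

def cross_covariance(P, Q):
    """Return H = P^T Q for two centered arrays whose rows are paired points."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape:
        raise ValueError("P and Q must have the same shape")
    return P.T @ Q

# Illustrative, already-centered point sets.
P = np.array([[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, -2.0, 0.0]])
Q = np.array([[0.0, 1.0, 0.0], [0.0, -1.0, 0.0], [-2.0, 0.0, 0.0], [2.0, 0.0, 0.0]])

H = cross_covariance(P, Q)
# Same quantity written as the sum over k of P[k, i] * Q[k, j].
H_sum = np.einsum("ki,kj->ij", P, Q)
```

For three-dimensional points the result is always a 3 × 3 matrix, regardless of how many points each set contains.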

Computation of the optimal rotation matrix

It is possible to calculate the optimal rotation $R$ based on the matrix formula

$$R = \left(H H^T\right)^{1/2} \left(H^T\right)^{-1},$$

but implementing a numerical solution to this formula becomes complicated when all special cases are accounted for (for example, the case of $H$ not having an inverse).

If singular value decomposition (SVD) routines are available, the optimal rotation, $R$, can be calculated using the following algorithm.

First, calculate the SVD of the covariance matrix $H$,

$$H = U \Sigma V^T,$$

where $U$ and $V$ are orthogonal and $\Sigma$ is diagonal. Next, record whether the orthogonal matrices contain a reflection,

$$d = \det\left(U V^T\right) = \det(U)\det(V).$$

Finally, calculate the optimal rotation matrix $R$ as

$$R = U \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & d \end{pmatrix} V^T.$$

This $R$ minimizes

$$\sum_{k=1}^{N} \left| R q_k - p_k \right|^2,$$

where $q_k$ and $p_k$ are the $k$-th rows of $Q$ and $P$ respectively, taken as column vectors.
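The three steps can be put together in a short NumPy sketch. It is an illustration under stated assumptions rather than a reference implementation: the names `kabsch_rotation` and `superimpose` are hypothetical, points are stored one per row, and the translation is recovered from the two centroids so that each point of Q is mapped by q → R q + t. The synthetic data is generated only to check that a known rotation is recovered.

```python
import numpy as np

def kabsch_rotation(P, Q):
    """Rotation R minimizing sum_k |R q_k - p_k|^2 for centered N x 3 arrays."""
    H = P.T @ Q                      # cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)      # H = U @ diag(singular values) @ Vt
    # d = det(U V^T) is -1 when the orthogonal factors contain a reflection.
    d = 1.0 if np.linalg.det(U @ Vt) >= 0 else -1.0
    return U @ np.diag([1.0, 1.0, d]) @ Vt

def superimpose(P, Q):
    """Return (R, t, rmsd) such that R @ q + t best matches the paired p."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    R = kabsch_rotation(P - cP, Q - cQ)
    t = cP - R @ cQ                  # translation that maps centroid to centroid
    Q_fit = Q @ R.T + t              # each row is R @ q_k + t
    rmsd = np.sqrt(np.mean(np.sum((Q_fit - P) ** 2, axis=1)))
    return R, t, rmsd

# Synthetic check: P is Q rotated by 0.7 rad about the z axis, then shifted.
rng = np.random.default_rng(0)
Q = rng.normal(size=(10, 3))
c, s = np.cos(0.7), np.sin(0.7)
R_true = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
t_true = np.array([1.0, -2.0, 0.5])
P = Q @ R_true.T + t_true
R, t, rmsd = superimpose(P, Q)
```

Forcing the last diagonal entry to d guarantees that the returned matrix has determinant +1, so a mirrored point set yields the best proper rotation instead of a reflection.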

Alternatively, the optimal rotation can also be evaluated directly as a quaternion.[2][3][4][5] This alternative description has been used in the development of a rigorous method for removing rigid-body motions from molecular dynamics trajectories of flexible molecules.[6] In 2002, a generalization for application to probability distributions (continuous or not) was also proposed.[7]

Generalizations

The algorithm was described for points in a three-dimensional space. The generalization to $n$ dimensions is immediate.
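One way to read this generalization, sketched in NumPy: the correction matrix becomes the identity of matching size with its last diagonal entry replaced by the determinant sign, which relies on the SVD routine returning singular values in descending order (as `numpy.linalg.svd` does). The function name `kabsch_rotation_nd` and the two-dimensional sample points are assumptions for illustration.

```python
import numpy as np

def kabsch_rotation_nd(P, Q):
    """Rotation minimizing sum_k |R q_k - p_k|^2 for centered N x n arrays."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    n = P.shape[1]
    H = P.T @ Q                      # n x n cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)      # singular values come back in descending order
    D = np.eye(n)
    # Flip the direction of the smallest singular value if a reflection is present.
    D[-1, -1] = 1.0 if np.linalg.det(U @ Vt) >= 0 else -1.0
    return U @ D @ Vt

# Two-dimensional example: P is Q rotated by 90 degrees counterclockwise.
Q2 = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 2.0], [0.0, -2.0]])
R90 = np.array([[0.0, -1.0], [1.0, 0.0]])
P2 = Q2 @ R90.T
R2 = kabsch_rotation_nd(P2, Q2)
```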

External links

This SVD algorithm is described in more detail at https://web.archive.org/web/20140225050055/http://cnx.org/content/m11608/latest/

A Matlab function is available at http://www.mathworks.com/matlabcentral/fileexchange/25746-kabsch-algorithm

A C++ implementation (and unit test) using Eigen

A Python script is available at https://github.com/charnley/rmsd. Another implementation can be found in SciPy.

A free PyMol plugin easily implementing Kabsch is https://www.pymolwiki.org/index.php/Kabsch. (This previously linked to CEalign https://wiki.pymol.org/index.php/Cealign, but this uses the Combinatorial Extension (CE) algorithm.) VMD uses the Kabsch algorithm for its alignment.

The FoldX modeling toolsuite incorporates the Kabsch algorithm to measure RMSD between Wild Type and Mutated protein structures.

Notes and References

  1. Lawrence, Jim; Bernal, Javier; Witzgall, Christoph (2019-10-09). "A Purely Algebraic Justification of the Kabsch-Umeyama Algorithm". Journal of Research of the National Institute of Standards and Technology. 124: 124028. doi:10.6028/jres.124.028. ISSN 2165-7254. PMC 7340555. PMID 34877177.
  2. Horn, Berthold K. P. (1987-04-01). "Closed-form solution of absolute orientation using unit quaternions". Journal of the Optical Society of America A. 4 (4): 629. doi:10.1364/josaa.4.000629. Bibcode:1987JOSAA...4..629H. ISSN 1520-8532. CiteSeerX 10.1.1.68.7320. S2CID 11038004.
  3. Kneller, Gerald R. (1991-05-01). "Superposition of Molecular Structures using Quaternions". Molecular Simulation. 7 (1–2): 113–119. doi:10.1080/08927029108022453. ISSN 0892-7022.
  4. Coutsias, E. A.; Seok, C.; Dill, K. A. (2004). "Using quaternions to calculate RMSD". J. Comput. Chem. 25 (15): 1849–1857. doi:10.1002/jcc.20110. PMID 15376254. S2CID 18224579.
  5. Petitjean, M. (1999). "On the root mean square quantitative chirality and quantitative symmetry measures". J. Math. Phys. 40 (9): 4587–4595. doi:10.1063/1.532988. Bibcode:1999JMP....40.4587P.
  6. Chevrot, Guillaume; Calligari, Paolo; Hinsen, Konrad; Kneller, Gerald R. (2011-08-24). "Least constraint approach to the extraction of internal motions from molecular dynamics trajectories of flexible macromolecules". J. Chem. Phys. 135 (8): 084110. doi:10.1063/1.3626275. PMID 21895162. ISSN 0021-9606. Bibcode:2011JChPh.135h4110C.
  7. Petitjean, M. (2002). "Chiral mixtures". J. Math. Phys. 43 (8): 4147–4157. doi:10.1063/1.1484559. Bibcode:2002JMP....43.4147P. S2CID 85454709.