Molecular replacement (MR)[1] is a method of solving the phase problem in X-ray crystallography. MR relies upon the existence of a previously solved protein structure which is similar to our unknown structure from which the diffraction data is derived. This could come from a homologous protein, or from the lower-resolution protein NMR structure of the same protein.[2]
The first goal of the crystallographer is to obtain an electron density map, density being related with diffracted wave as follows:
\rho(x,y,z)= | 1 |
V |
\sumh\sumk\sum\ell|Fhk\ell|\exp(2\pii(hx+ky+\ellz)+i\Phi(hk\ell)).
With usual detectors the intensity
I=F ⋅ F*
\Phi
We can derive a Patterson map for the intensities, which is an interatomic vector map created by squaring the structure factor amplitudes and setting all phases to zero. This vector map contains a peak for each atom related to every other atom, with a large peak at 0,0,0, where vectors relating atoms to themselves "pile up". Such a map is far too noisy to derive any high resolution structural information - however if we generate Patterson maps for the data derived from our unknown structure, and from the structure of a previously solved homologue, in the correct orientation and position within the unit cell, the two Patterson maps should be closely correlated. This principle lies at the heart of MR, and can allow us to infer information about the orientation and location of an unknown molecule with its unit cell.
Due to historic limitations in computing power, an MR search is typically divided into two steps: rotation and translation.
In the rotation function, our unknown Patterson map is compared to Patterson maps derived from our known homologue structure in different orientations. Historically r-factors and/or correlation coefficients were used to score the rotation function, however, modern programs use maximum likelihood-based algorithms. The highest correlation (and therefore scores) are obtained when the two structures (known and unknown) are in similar orientation(s) - these can then be output in Euler angles or spherical polar angles.
In the translation function, the now correctly oriented known model can be correctly positioned by translating it to the correct co-ordinates within the asymmetric unit. This is accomplished by moving the model, calculating a new Patterson map, and comparing it to the unknown-derived Patterson map. This brute-force search is computationally expensive and fast translation functions are now more commonly used. Positions with high correlations are output in Cartesian coordinates.
With the improvement of de novo protein structure prediction, many protocols including MR-Rosetta, QUARK, AWSEM-Suite and I-TASSER-MR can generate a lot of native-like decoy structures that are useful to solve the phase problem by molecular replacement.[3]
Following this, we should have correctly oriented and translated phasing models, from which we can derive phases which are (hopefully) accurate enough to derive electron density maps. These can be used to build and refine an atomic model of our unknown structure.