Synonyms: trinucleotide repeat expansion disorders, triplet repeat expansion disorders, or codon reiteration disorders
In genetics, trinucleotide repeat disorders, a subset of microsatellite expansion diseases (also known as repeat expansion disorders), are a set of over 30 genetic disorders caused by trinucleotide repeat expansion, a kind of mutation in which repeats of three nucleotides (trinucleotide repeats) increase in copy number until they cross a threshold above which they cause developmental, neurological or neuromuscular disorders.[1] [2] [3] In addition to the expansions of these trinucleotide repeats, expansions of one tetranucleotide (CCTG),[4] five pentanucleotide (ATTCT, TGGAA, TTTTA, TTTCA, and AAGGG), three hexanucleotide (GGCCTG, CCCTCT, and GGGGCC), and one dodecanucleotide (CCCCGCCCCGCG) repeat cause 13 other diseases.[5] Depending on its location, the unstable trinucleotide repeat may cause defects in a protein encoded by a gene, change the regulation of gene expression, produce a toxic RNA, or lead to production of a toxic protein. In general, the larger the expansion, the earlier the onset of disease and the more severe the disease becomes.
Trinucleotide repeats are a subset of a larger class of unstable microsatellite repeats that occur throughout all genomes.
The first trinucleotide repeat disease to be identified was fragile X syndrome, which has since been mapped to the long arm of the X chromosome. Patients carry from 230 to 4000 CGG repeats in the gene that causes fragile X syndrome, while unaffected individuals have up to 50 repeats and carriers of the disease have 60 to 230 repeats. The chromosomal instability resulting from this trinucleotide expansion presents clinically as intellectual disability, distinctive facial features, and macroorchidism in males. The second DNA-triplet repeat disease, fragile X-E syndrome, was also identified on the X chromosome, but was found to be the result of an expanded CCG repeat.[6] The discovery that trinucleotide repeats could expand during intergenerational transmission and could cause disease was the first evidence that not all disease-causing mutations are stably transmitted from parent to offspring.
Trinucleotide repeat disorders and the related microsatellite repeat disorders affect about 1 in 3,000 people worldwide. However, the frequency of occurrence of any one particular repeat sequence disorder varies greatly by ethnic group and geographic location.[7] Many regions of the genome (exons, introns, intergenic regions) normally contain trinucleotide sequences, or repeated sequences of one particular nucleotide, or sequences of 2, 4, 5 or 6 nucleotides. Such repetitive sequences occur at a low level that can be regarded as "normal".[8] Sometimes, a person may have more than the usual number of copies of a repeat sequence associated with a gene, but not enough to alter the function of that gene. These individuals are referred to as "premutation carriers". The frequency of carriers worldwide appears to be 1 in 340 individuals. Some carriers, during the formation of eggs or sperm, may give rise to higher levels of repetition of the repeat they carry. The higher level may then be at a "mutation" level and cause symptoms in their offspring.
Three categories of trinucleotide repeat disorders and related microsatellite disorders (with repeat units of 4, 5, or 6 nucleotides) are described by Boivin and Charlet-Berguerand.[2]
The first main category these authors discuss is repeat expansions located within the promoter region of a gene or located close to, but upstream of, a promoter region of a gene. These repeats are able to promote localized DNA epigenetic changes such as methylation of cytosines. Such epigenetic alterations can inhibit transcription,[9] causing reduced expression of the associated encoded protein. The epigenetic alterations and their effects are described more fully by Barbé and Finkbeiner.[10] These authors cite evidence that the age at which an individual begins to experience symptoms, as well as the severity of disease, is determined both by the size of the repeat and by the epigenetic state within and around the repeat. There is often increased methylation at CpG islands near the repeat region, resulting in a closed chromatin state and causing gene downregulation. This first category is designated as "loss of function".
The second main category of trinucleotide repeat disorders and related microsatellite disorders involves a toxic RNA gain of function mechanism. In this second type of disorder, large repeat expansions in DNA are transcribed into pathogenic RNAs that form nuclear RNA foci. These foci attract and alter the location and function of RNA binding proteins. This, in turn, causes multiple RNA processing defects that lead to the diverse clinical manifestations of these diseases.
The third main category of trinucleotide repeat disorders and related microsatellite disorders is due to the translation of repeat sequences into pathogenic proteins containing a stretch of repeated amino acids. This results in, variously, a toxic gain of function, a loss of function, a dominant negative effect and/or a mix of these mechanisms for the protein hosting the expansion. Translation of these repeat expansions occurs mostly through two mechanisms. First, there may be translation initiated at the usual AUG or a similar (CUG, GUG, UUG, or ACG) start codon. This results in expression of a pathogenic protein encoded by one particular coding frame. Second, a mechanism named "repeat-associated non-AUG (RAN) translation" uses translation initiation that starts directly within the repeat expansion. This potentially results in expression of three different proteins encoded by the three possible reading frames. Usually, one of the three proteins is more toxic than the other two. Typical of these RAN type expansions are those with the trinucleotide repeat CAG. These often are translated into polyglutamine-containing proteins that form inclusions and are toxic to neuronal cells. Examples of the disorders caused by this mechanism include Huntington's disease and Huntington disease-like 2, spinal-bulbar muscular atrophy, dentatorubral-pallidoluysian atrophy, and spinocerebellar ataxia 1–3, 6–8, and 17.
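The statement that one repeat can yield three different proteins, one per reading frame, can be illustrated with a toy translation of a CAG tract. This is only a sketch: the function name `translate_repeat_frames` is hypothetical, the codon table holds just the three codons that arise from a CAG repeat (taken from the standard genetic code), and real RAN translation does not start at fixed offsets as it does here.

```python
# Minimal codon table: only the codons produced when a CAG repeat
# is read in each of its three forward reading frames.
CODON_TO_AMINO_ACID = {
    "CAG": "Q",  # glutamine
    "AGC": "S",  # serine
    "GCA": "A",  # alanine
}


def translate_repeat_frames(unit: str, copies: int) -> dict[int, str]:
    """Translate a tandem repeat in frames 0, 1 and 2 (illustrative only)."""
    sequence = unit * copies
    products = {}
    for frame in range(3):
        # Collect complete codons only, starting at the frame offset.
        codons = [
            sequence[i:i + 3]
            for i in range(frame, len(sequence) - 2, 3)
        ]
        products[frame] = "".join(CODON_TO_AMINO_ACID[c] for c in codons)
    return products


frames = translate_repeat_frames("CAG", 5)
for frame, peptide in frames.items():
    print(frame, peptide)
```

In this sketch frame 0 gives a polyglutamine stretch, while the shifted frames give polyserine and polyalanine stretches, each one residue shorter because the shifted frames lose a partial codon at the end of the tract.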
The first main category, the loss of function type with epigenetic contributions, can have repeats located in a promoter, in 5' untranscribed regions upstream of promoters, or in introns. The second category, toxic RNAs, has repeats located in introns or in a 3' untranslated region beyond the stop codon. The third category, largely producing toxic proteins with polyalanines or polyglutamines, has trinucleotide repeats that occur in the exons of the affected genes.
Some of the problems in trinucleotide repeat syndromes result from alterations in the coding region of the gene, while others are caused by altered gene regulation.[1] In over half of these disorders, the repeated trinucleotide, or codon, is CAG. In a coding region, CAG codes for glutamine (Q), so CAG repeats result in an expanded polyglutamine tract. These diseases are commonly referred to as polyglutamine (or polyQ) diseases. The repeated codons in the remaining disorders do not code for glutamine, and these can be classified as non-polyQ or non-coding trinucleotide repeat disorders.
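The link between CAG copy number and polyglutamine tract length can be sketched by measuring the longest uninterrupted CAG run in a DNA string. The helper `longest_cag_tract` and the example fragment are hypothetical, and the search ignores reading frame and strand, so it is an illustration and not a genotyping method.

```python
import re


def longest_cag_tract(dna: str) -> int:
    """Return the number of units in the longest uninterrupted CAG run."""
    runs = re.findall(r"(?:CAG)+", dna.upper())
    return max((len(run) // 3 for run in runs), default=0)


# Hypothetical coding fragment: start codon, 4 CAG units, a CCG (proline)
# interruption, 2 more CAG units, then a stop codon.
fragment = "ATG" + "CAG" * 4 + "CCG" + "CAG" * 2 + "TAA"
tract_length = longest_cag_tract(fragment)

# Each in-frame CAG codon contributes one glutamine (Q) to the tract.
polyq_tract = "Q" * tract_length
print(tract_length, polyq_tract)
```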
Type | Gene | Normal polyQ repeats | Pathogenic polyQ repeats
---|---|---|---
DRPLA (Dentatorubropallidoluysian atrophy) | ATN1 or DRPLA | 6–35 | 49–88
HD (Huntington's disease) | | 6–35 | 36–250
SBMA (Spinal and bulbar muscular atrophy)[11] | | 4–34 | 35–72
SCA1 (Spinocerebellar ataxia Type 1) | | 6–35 | 49–88
SCA2 (Spinocerebellar ataxia Type 2) | | 14–32 | 33–77
SCA3 (Spinocerebellar ataxia Type 3 or Machado-Joseph disease) | | 12–40 | 55–86
SCA6 (Spinocerebellar ataxia Type 6) | | 4–18 | 21–30
SCA7 (Spinocerebellar ataxia Type 7) | | 7–17 | 38–120
SCA17 (Spinocerebellar ataxia Type 17) | TBP | 25–42 | 47–63
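The ranges in the polyQ table above can be encoded as a small lookup, which makes the gaps between the normal and pathogenic ranges explicit. This is a sketch under assumptions: the dictionary `POLYQ_RANGES` and the function `classify_polyq` are hypothetical names, the bounds are copied from the table and treated as inclusive, and counts that fall in neither range are reported as such without interpretation.

```python
# (normal range, pathogenic range), each as inclusive (low, high) bounds,
# copied from the polyQ table above.
POLYQ_RANGES = {
    "DRPLA": ((6, 35), (49, 88)),
    "HD": ((6, 35), (36, 250)),
    "SBMA": ((4, 34), (35, 72)),
    "SCA1": ((6, 35), (49, 88)),
    "SCA2": ((14, 32), (33, 77)),
    "SCA3": ((12, 40), (55, 86)),
    "SCA6": ((4, 18), (21, 30)),
    "SCA7": ((7, 17), (38, 120)),
    "SCA17": ((25, 42), (47, 63)),
}


def classify_polyq(disorder: str, repeats: int) -> str:
    """Place a repeat count in the table's normal or pathogenic range."""
    (normal_lo, normal_hi), (path_lo, path_hi) = POLYQ_RANGES[disorder]
    if normal_lo <= repeats <= normal_hi:
        return "normal"
    if path_lo <= repeats <= path_hi:
        return "pathogenic"
    # The table gives no label for counts between or beyond the two ranges.
    return "outside listed ranges"


print(classify_polyq("HD", 42))
print(classify_polyq("SCA3", 45))
```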
Type | Gene | Codon | Normal | Pathogenic | Mechanism
---|---|---|---|---|---
FRAXA (Fragile X syndrome) | | CGG (5' UTR) | 6–53 | 230+ | abnormal methylation
FXTAS (Fragile X-associated tremor/ataxia syndrome) | | CGG (5' UTR) | 6–53 | 55–200 | increased expression, and a novel polyglycine product[12]
FRAXE (Fragile XE mental retardation) | | CCG (5' UTR) | 6–35 | 200+ | abnormal methylation
Baratela-Scott syndrome[13] | XYLT1 | GGC (5' UTR) | 6–35 | 200+ | abnormal methylation
FRDA (Friedreich's ataxia) | | GAA (Intron) | 7–34 | 100+ | impaired transcription
DM1 (Myotonic dystrophy Type 1) | | CTG (3' UTR) | 5–34 | 50+ | RNA-based; unbalanced DMPK/ZNF9 expression levels
DM2 (Myotonic dystrophy Type 2) | CNBP | CCTG (3' UTR) | 11–26 | 75+ | RNA-based; nuclear RNA accumulation[14]
SCA8 (Spinocerebellar ataxia Type 8) | | CTG (RNA) | 16–37 | 110–250 | ? RNA
SCA12 (Spinocerebellar ataxia Type 12)[15] [16] | PPP2R2B | CAG (5' UTR) | 7–28 | 55–78 | effect on promoter function
Ten neurological and neuromuscular disorders have been reported to be caused by an increased number of CAG repeats.[17] Although these diseases share the same repeated codon (CAG) and some symptoms, the repeats are found in different, unrelated genes. Except for the CAG repeat expansion in the 5' UTR of PPP2R2B in SCA12, the expanded CAG repeats are translated into an uninterrupted sequence of glutamine residues, forming a polyQ tract, and the accumulation of polyQ proteins damages key cellular functions such as the ubiquitin-proteasome system. A common symptom of polyQ diseases is the progressive degeneration of nerve cells, usually affecting people later in life. However, different polyQ-containing proteins damage different subsets of neurons, leading to different symptoms.[18]
Unlike the polyQ diseases, the non-polyQ diseases or non-coding trinucleotide repeat disorders do not share any specific symptoms. In some of these diseases, such as fragile X syndrome, the pathology is caused by lack of the normal function of the protein encoded by the affected gene. In others, such as myotonic dystrophy type 1, the pathology is caused by a change in protein expression or function mediated through changes in the messenger RNA produced by the expression of the affected gene. In yet others, the pathology is caused by toxic assemblies of RNA in the nuclei of cells.[19]
Repeat count | Classification | Disease status
---|---|---
<28 | Normal | Unaffected
28–35 | Intermediate | Unaffected
36–40 | Reduced-penetrance | May be affected
>40 | Full-penetrance | Affected
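The thresholds in the table above translate directly into a small classifier. This is a minimal sketch, assuming the counts refer to CAG repeats in the HTT gene, as the discussion of Huntington's disease that follows suggests. The function name `classify_htt_cag` is hypothetical.

```python
def classify_htt_cag(repeats: int) -> tuple[str, str]:
    """Map a CAG repeat count to (classification, disease status)."""
    if repeats < 0:
        raise ValueError("repeat count cannot be negative")
    if repeats < 28:
        return ("Normal", "Unaffected")
    if repeats <= 35:
        return ("Intermediate", "Unaffected")
    if repeats <= 40:
        return ("Reduced-penetrance", "May be affected")
    return ("Full-penetrance", "Affected")


for count in (20, 30, 38, 45):
    print(count, classify_htt_cag(count))
```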
Huntington's disease very rarely occurs spontaneously; it is almost always the result of inheriting the defective gene from an affected parent. However, sporadic cases of Huntington's in individuals who have no history of the disease in their families do occur. Among these sporadic cases, there is a higher frequency of individuals with a parent who already has a significant number of CAG repeats in their HTT gene, especially those whose repeats approach the number (36) required for the disease to manifest. Each successive generation in a Huntington's-affected family may add CAG repeats, and the higher the number of repeats, the more severe the disease and the earlier its onset. As a result, families that have had Huntington's for many generations show an earlier age of disease onset and faster disease progression.
The majority of diseases caused by expansions of simple DNA repeats involve trinucleotide repeats, but disease-causing tetra-, penta-, hexa- and dodecanucleotide repeat expansions are also known. For any specific hereditary disorder, only one repeat expands in a particular gene.[21]
Triplet expansion is caused by slippage during DNA replication or during DNA repair synthesis.[22] Because the tandem repeats have identical sequence to one another, base pairing between two DNA strands can take place at multiple points along the sequence. This may lead to the formation of 'loop out' structures during DNA replication or DNA repair synthesis,[23] which in turn may cause part of the repeated sequence to be copied more than once, expanding the number of repeats. Additional mechanisms involving hybrid RNA:DNA intermediates have been proposed.[24] [25]
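The slippage idea can be sketched as a toy copying loop in which the growing strand loops out once and re-anneals a few repeat units upstream, so that those units are copied a second time. This is a deliberately simplified illustration, not a model of the polymerase: the function `replicate_with_loop_out` and its parameters are hypothetical, complementary base pairing is ignored, and only a single slippage event is simulated.

```python
def replicate_with_loop_out(
    template: str, unit: str, slip_after: int, loop_units: int
) -> str:
    """Copy a pure repeat tract, re-copying `loop_units` units once.

    After `slip_after` units have been copied, the new strand loops out
    and copying resumes `loop_units` units upstream on the template.
    """
    total_units = len(template) // len(unit)
    if template != unit * total_units:
        raise ValueError("template must be a pure tandem repeat of unit")
    if not 0 <= loop_units <= slip_after <= total_units:
        raise ValueError("need 0 <= loop_units <= slip_after <= total units")

    copied = []
    position = 0      # index of the next template unit to copy
    slipped = False   # allow only one slippage event in this toy
    while position < total_units:
        copied.append(unit)
        position += 1
        if not slipped and position == slip_after:
            # Loop-out: step back on the template so units get re-copied.
            position -= loop_units
            slipped = True
    return "".join(copied)


expanded = replicate_with_loop_out("CAG" * 10, "CAG", slip_after=6, loop_units=3)
print(len(expanded) // 3)
```

With a ten-unit template and a three-unit loop-out, the copied strand in this sketch ends up three units longer than the template.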