Glycoproteins are proteins which contain oligosaccharide (sugar) chains covalently attached to amino acid side-chains. The carbohydrate is attached to the protein in a cotranslational or posttranslational modification. This process is known as glycosylation. Secreted extracellular proteins are often glycosylated.
In proteins that have segments extending extracellularly, the extracellular segments are also often glycosylated. Glycoproteins are also often important integral membrane proteins, where they play a role in cell–cell interactions. It is important to distinguish endoplasmic reticulum-based glycosylation of the secretory system from reversible cytosolic-nuclear glycosylation. Glycoproteins of the cytosol and nucleus can be modified through the reversible addition of a single GlcNAc residue, a modification that is considered reciprocal to phosphorylation and is likely to act as an additional regulatory mechanism controlling phosphorylation-based signalling.[1] In contrast, classical secretory glycosylation can be structurally essential. For example, inhibition of asparagine-linked (N-linked) glycosylation can prevent proper glycoprotein folding, and full inhibition can be toxic to an individual cell. By comparison, perturbation of glycan processing (enzymatic removal or addition of carbohydrate residues on the glycan), which occurs in both the endoplasmic reticulum and Golgi apparatus, is dispensable for isolated cells (as evidenced by survival with glycosidase inhibitors) but can lead to human disease (congenital disorders of glycosylation) and can be lethal in animal models. It is therefore likely that the fine processing of glycans is important for endogenous functionality, such as cell trafficking, but that this is likely to have been secondary to its role in host-pathogen interactions. A famous example of this latter effect is the ABO blood group system.
Though there are different types of glycoproteins, the most common are N-linked and O-linked glycoproteins.[2] These two types of glycoproteins are distinguished by structural differences that give them their names. Glycoproteins vary greatly in composition, making many different compounds such as antibodies or hormones.[3] Due to the wide array of functions within the body, interest in glycoprotein synthesis for medical use has increased.[4] There are now several methods to synthesize glycoproteins, including recombination and glycosylation of proteins.
Glycosylation is also known to occur on nucleocytoplasmic proteins in the form of O-GlcNAc.[5]
There are several types of glycosylation, although N-linked and O-linked glycosylation are the most common.
Monosaccharides commonly found in eukaryotic glycoproteins include:[7]
Sugar | Type | Abbreviation |
---|---|---|
β-D-Glucose | Hexose | Glc |
β-D-Galactose | Hexose | Gal |
β-D-Mannose | Hexose | Man |
α-L-Fucose | Deoxyhexose | Fuc |
N-Acetylgalactosamine | Aminohexose | GalNAc |
N-Acetylglucosamine | Aminohexose | GlcNAc |
N-Acetylneuraminic acid | Aminononulosonic acid (Sialic acid) | NeuNAc |
Xylose | Pentose | Xyl |
The sugar group(s) can assist in protein folding, improve proteins' stability and are involved in cell signalling.
The critical structural element of all glycoproteins is the presence of oligosaccharides bonded covalently to a protein. There are 10 common monosaccharides in mammalian glycans: glucose (Glc), fucose (Fuc), xylose (Xyl), mannose (Man), galactose (Gal), N-acetylglucosamine (GlcNAc), glucuronic acid (GlcA), iduronic acid (IdoA), N-acetylgalactosamine (GalNAc), and the sialic acid 5-N-acetylneuraminic acid (Neu5Ac). These glycans are attached at specific sites in the amino acid chain of the protein.
The two most common linkages in glycoproteins are N-linked and O-linked. In an N-linked glycoprotein, the glycan is bonded to the side-chain nitrogen of an asparagine residue within the protein sequence. In an O-linked glycoprotein, the sugar is bonded to an oxygen atom of a serine or threonine residue in the protein.
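The two linkage types above can be illustrated with a short sequence scan. The sketch below is a minimal illustration, not a predictor: it assumes the widely used N-X-S/T consensus "sequon" for N-linked sites (X being any residue except proline), which is not stated in the text above, and it treats every serine or threonine as a merely possible O-linked site. The function names and the example sequence are hypothetical.

```python
import re
from typing import List

# Assumed consensus sequon for N-linked glycosylation: Asn-X-Ser/Thr, X != Pro.
# The lookahead makes the match zero-width so overlapping sequons are all found.
N_SEQUON = re.compile(r"(?=N[^P][ST])")


def find_n_linked_candidates(sequence: str) -> List[int]:
    """Return 1-based positions of Asn residues that sit in an N-X-S/T sequon."""
    seq = sequence.upper()
    return [match.start() + 1 for match in N_SEQUON.finditer(seq)]


def find_o_linked_candidates(sequence: str) -> List[int]:
    """Return 1-based positions of every Ser or Thr (possible O-linked sites)."""
    seq = sequence.upper()
    return [i + 1 for i, residue in enumerate(seq) if residue in "ST"]


if __name__ == "__main__":
    example = "MNGTANPSNNSS"  # hypothetical one-letter amino acid sequence
    print("N-linked candidates:", find_n_linked_candidates(example))
    print("O-linked candidates:", find_o_linked_candidates(example))
```

A matching sequon only marks a candidate site; whether it is actually glycosylated has to be established experimentally.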
Glycoprotein size and composition vary widely, with carbohydrate accounting for 1% to 70% of the total mass of the glycoprotein. Within the body, they appear in the blood, the extracellular matrix, or on the outer surface of the plasma membrane, and make up a large portion of the proteins secreted by eukaryotic cells. They are very broad in their applications and can function as a variety of molecules, from antibodies to hormones.
Glycomics is the study of the carbohydrate components of cells. Though not exclusive to glycoproteins, it can reveal more information about different glycoproteins and their structure. One of the purposes of this field of study is to determine which proteins are glycosylated and where in the amino acid sequence the glycosylation occurs. Historically, mass spectrometry has been used to identify the structure of glycoproteins and characterize the carbohydrate chains attached.[9]
The interactions of the oligosaccharide chains serve several functions. First, they aid in quality control by helping to identify misfolded proteins. The oligosaccharide chains also change the solubility and polarity of the proteins to which they are bonded. For example, if the oligosaccharide chains are negatively charged and sufficiently dense around the protein, they can repel proteolytic enzymes from the bonded protein. This diversity of interactions gives rise to different types of glycoproteins with different structures and functions.
One example of glycoproteins found in the body is mucins, which are secreted in the mucus of the respiratory and digestive tracts. The sugars when attached to mucins give them considerable water-holding capacity and also make them resistant to proteolysis by digestive enzymes.
Glycoproteins are important for white blood cell recognition. Examples of glycoproteins in the immune system are:
H antigen of the ABO blood compatibility antigens.
Other examples of glycoproteins include:
Soluble glycoproteins often show a high viscosity, for example, in egg white and blood plasma.
Variable surface glycoproteins allow the sleeping sickness Trypanosoma parasite to escape the immune response of the host.
The viral spike of the human immunodeficiency virus is heavily glycosylated.[11] Approximately half the mass of the spike consists of glycans, which limit antibody recognition because they are assembled by the host cell and so are largely 'self'. Over time, some patients can develop antibodies that recognise the HIV glycans, and almost all so-called 'broadly neutralising antibodies' (bnAbs) recognise some glycans. This is possible mainly because the unusually high density of glycans hinders normal glycan maturation, so they are trapped in the premature, high-mannose state.[12][13] This provides a window for immune recognition. In addition, as these glycans are much less variable than the underlying protein, they have emerged as promising targets for vaccine design.[14]
P-glycoprotein is important in antitumor research because of its ability to block the effects of antitumor drugs.[15] P-glycoprotein, or multidrug transporter (MDR1), is a type of ABC transporter that transports compounds out of cells. The exported compounds include drugs intended to act inside the cell, which decreases drug effectiveness. Inhibiting this export would therefore reduce P-glycoprotein interference in drug delivery, making it an important topic in drug discovery. For example, P-glycoprotein decreases anti-cancer drug accumulation within tumor cells, limiting the effectiveness of chemotherapies used to treat cancer.
Hormones that are glycoproteins include:
Quoting from recommendations for IUPAC:[16]
Function | Glycoproteins |
---|---|
Structural molecule | Collagens |
Lubricant and protective agent | Mucins |
Transport molecule | Transferrin, ceruloplasmin |
Immunologic molecule | Immunoglobulins, histocompatibility antigens |
Hormone | Human chorionic gonadotropin (HCG), thyroid-stimulating hormone (TSH) |
Enzyme | Various, e.g., alkaline phosphatase, patatin |
Cell attachment-recognition site | Various proteins involved in cell–cell (e.g., sperm–oocyte), virus–cell, bacterium–cell, and hormone–cell interactions |
Antifreeze protein | Certain plasma proteins of coldwater fish |
Interact with specific carbohydrates | Lectins, selectins (cell adhesion lectins), antibodies |
Receptor | Various proteins involved in hormone and drug action |
Affect folding of certain proteins | Calnexin, calreticulin |
Regulation of development | Notch and its analogs, key proteins in development |
Hemostasis (and thrombosis) | Specific glycoproteins on the surface membranes of platelets |
A variety of methods are used in the detection, purification, and structural analysis of glycoproteins:
Method | Use |
---|---|
Periodic acid-Schiff stain | Detects glycoproteins as pink bands after electrophoretic separation. |
Incubation of cultured cells with a radioactive sugar | Leads to detection of glycoproteins as radioactive bands after electrophoretic separation. |
Treatment with appropriate endo- or exoglycosidase or phospholipases | Resultant shifts in electrophoretic migration help distinguish among proteins with N-glycan, O-glycan, or GPI linkages and also between high mannose and complex N-glycans. |
Agarose-lectin column chromatography, lectin affinity chromatography | To purify glycoproteins or glycopeptides that bind the particular lectin used. |
Lectin affinity electrophoresis | Resultant shifts in electrophoretic migration help distinguish and characterize glycoforms, i.e. variants of a glycoprotein differing in carbohydrate. |
Compositional analysis following acid hydrolysis | Identifies sugars that the glycoprotein contains and their stoichiometry. |
Mass spectrometry | Provides information on molecular mass, composition, sequence, and sometimes branching of a glycan chain. It can also be used for site-specific glycosylation profiling. |
NMR spectroscopy | To identify specific sugars, their sequence, linkages, and the anomeric nature of the glycosidic linkages. |
Multi-angle light scattering | In conjunction with size-exclusion chromatography, UV/Vis absorption and differential refractometry, provides information on molecular mass, protein-carbohydrate ratio, aggregation state, size, and sometimes branching of a glycan chain. In conjunction with composition-gradient analysis, analyzes self- and hetero-association to determine binding affinity and stoichiometry with proteins or carbohydrates in solution without labeling. |
Dual Polarisation Interferometry | Measures the mechanisms underlying the biomolecular interactions, including reaction rates, affinities and associated conformational changes. |
Methylation (linkage) analysis | To determine linkage between sugars. |
Amino acid or cDNA sequencing | Determination of amino acid sequence. |
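The mass and composition information delivered by mass spectrometry and compositional analysis can be related with simple arithmetic. The sketch below is a minimal illustration under stated assumptions: it uses standard monoisotopic residue masses (each sugar minus one water), groups sugars into generic classes such as hexose, and assumes each attached glycan loses one further water when it bonds to the protein. The function names and the 10,000 Da protein are hypothetical.

```python
from typing import Dict, Iterable

# Monoisotopic residue masses in Da (free sugar minus one water).
RESIDUE_MASS: Dict[str, float] = {
    "Hex": 162.0528,     # hexoses: Glc, Gal, Man
    "HexNAc": 203.0794,  # N-acetylhexosamines: GlcNAc, GalNAc
    "dHex": 146.0579,    # deoxyhexose: Fuc
    "Pent": 132.0423,    # pentose: Xyl
    "Neu5Ac": 291.0954,  # N-acetylneuraminic acid
}
WATER = 18.0106  # monoisotopic mass of H2O in Da


def glycan_mass(composition: Dict[str, int]) -> float:
    """Mass of a free (released) glycan: sum of residues plus one water."""
    residues = sum(RESIDUE_MASS[name] * count for name, count in composition.items())
    return residues + WATER


def carbohydrate_percent(protein_mass: float, glycan_masses: Iterable[float]) -> float:
    """Carbohydrate share of total glycoprotein mass, in percent.

    Each attached glycan contributes its free mass minus one water,
    because forming the bond to the protein is a condensation.
    """
    attached = sum(mass - WATER for mass in glycan_masses)
    return 100.0 * attached / (protein_mass + attached)


if __name__ == "__main__":
    man5 = glycan_mass({"Hex": 5, "HexNAc": 2})  # high-mannose Man5GlcNAc2
    print(f"Man5GlcNAc2 mass: {man5:.4f} Da")
    print(f"Carbohydrate share: {carbohydrate_percent(10000.0, [man5]):.1f}%")
```

The same composition dictionary could be extended with further residue classes, such as the uronic acids, if their masses are added to the table.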
The glycosylation of proteins has an array of different applications, from influencing cell-to-cell communication to changing the thermal stability and the folding of proteins.[17] Because of these properties, glycoproteins can be used in many therapies. A better understanding of glycoproteins and their synthesis allows them to be made to treat cancer, Crohn's disease, high cholesterol, and more.
Glycosylation (the covalent attachment of a carbohydrate to a protein) is commonly described as a post-translational modification, meaning it happens after the protein is produced, although it can also occur cotranslationally. Roughly half of all human proteins undergo glycosylation, which heavily influences their properties and functions. Within the cell, glycosylation of secretory proteins takes place in the endoplasmic reticulum and the Golgi apparatus.
There are several techniques for the assembly of glycoproteins. One technique uses recombinant expression. The first consideration for this method is the choice of host, as many factors can influence the success of recombinant glycoprotein production, such as cost, the host environment, and the efficacy of the process. Examples of host cells include E. coli, yeast, plant cells, insect cells, and mammalian cells. Of these options, mammalian cells are the most common because they avoid challenges seen with other hosts, such as different glycan structures, a shorter half-life of the product, and potential unwanted immune responses in humans. Among mammalian cells, the most common cell line used for recombinant glycoprotein production is the Chinese hamster ovary line. However, as technologies develop, the most promising cell lines for recombinant glycoprotein production are human cell lines.
The formation of the link between the glycan and the protein is a key element of the synthesis of glycoproteins. The most common method of glycosylation for N-linked glycoproteins is the reaction between a protected glycan and a protected asparagine. Similarly, an O-linked glycoprotein can be formed through the reaction of a glycosyl donor with a protected serine or threonine. These two methods are examples of natural linkage. However, there are also methods that form unnatural linkages, including ligation and a reaction between a serine-derived sulfamidate and thiohexoses in water. Once this linkage is complete, the amino acid sequence can be extended using solid-phase peptide synthesis.