

(5) Expanding the Genetic Code: Incorporation of Unnatural Amino Acids into Proteins

The canonical genetic code, which comprises 64 codons encoding 20 amino acids and three stop signals, is conserved across the three domains of life. Proteins carry out most of the complex processes of life, although some of their functions depend on post-translational modifications. All 20 canonical amino acids are introduced into proteins by the standard translational machinery of the cell. However, Nature uses distinctive pathways to encode two additional amino acids, selenocysteine (Sec) and pyrrolysine (Pyl), in certain proteins. In selenoprotein synthesis, Sec-tRNA(Sec) is generated from a preloaded Ser-tRNA(Sec). Special mRNA elements and elongation factors are also required for the incorporation of Sec into proteins. A similar pathway may be used for the incorporation of Pyl.

Genetic code expansion for improving the delivery of proteins into cells

The transport of macromolecules across the cell plasma membrane is a major challenge in biomedical applications, including the development of drug candidates, and therefore, the biological applications of many exogenous proteins are restricted to extracellular targets. The inability of almost all macromolecules to spontaneously enter cells led to the development of strategies for their delivery into mammalian cells. These methods are generally based on cationic cell‐penetrating peptides (CPPs), antibodies, nanoparticles, receptor ligands, and virus‐like particles. Another approach involves the use of supercharged proteins (SCPs) for the delivery of functional macromolecules. Unfortunately, in most of these processes, large amounts of purified proteins are required for a reasonable cellular delivery. Although the attachment of cell‐permeable peptides enhances the cellular delivery of proteins, such modifications alter the protein function inside the cells.


Figure 5.1. A general method for genetically encoding unnatural amino acids into proteins. CREDIT: PRESTON HUEY/SCIENCE


The genetic code expansion (GCE) of an organism provides an interesting strategy for the genetic incorporation of unnatural amino acids (UAAs) into proteins. In contrast to post-translational modifications of proteins, the high efficiency and fidelity of GCE allow a site-selective modification at a single amino acid site. Genetically encoding UAAs by using an orthogonal tRNA and tRNA synthetase (Figure 5.1) makes it possible to introduce new chemical, physical, and biological properties into a variety of proteins. UAAs that are fluorescent, photo-responsive, bio-orthogonally reactive, or able to mimic various protein modifications have been successfully incorporated into proteins, and the stability, specificity, and catalytic activities of such proteins have been studied. The GCE approach has proved remarkably effective in adding a large number of novel amino acids to the genetic codes of both prokaryotic and eukaryotic organisms. By combining expertise in synthetic chemistry and synthetic biology, our group is interested in synthesizing proteins, including metalloproteins, for various biochemical and therapeutic applications. We are particularly interested in the genetic incorporation of UAAs that can improve the delivery of proteins into cells.


Recently, we reported that a single atom change, hydrogen to halogen (Cl, Br, or I), at one of the tyrosyl residues facilitates the delivery of a green fluorescent protein (GFP) into mammalian cells, and that the highest cellular uptake is observed for the GFP bearing an iodine atom. The halo variants of GFP (EmGFP) were synthesized by expanding the genetic code of E. coli with 3‐halo‐L‐tyrosine (3XY). The Methanococcus jannaschii tyrosyl‐tRNA synthetase/tRNA pair was used to incorporate 3XY into EmGFP in response to an amber codon (UAG) (Figure 5.2). The surface tyrosyl residue at the 39‐position of EmGFP was selected for the single atom modification to facilitate a favorable interaction between the halogen atom and the cell plasma membrane. In addition, the introduction of heavier halogen atoms at a surface position was not expected to alter the secondary structure of the protein. The recoded E. coli strain C321ΔA.exp was used for the protein expression.
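The amber suppression idea described above can be illustrated with a minimal Python sketch. It translates an mRNA with the standard codon table and, when a suppressor is supplied, reads UAG as an extra residue instead of a stop. The function name `translate`, the placeholder symbol `X` for the unnatural amino acid, and the short mRNA string are hypothetical and chosen only for illustration; the sketch ignores everything the real system depends on (synthetase specificity, release factors, suppression efficiency).

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code listed in UCAG order; '*' marks the three stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna, amber_suppressor=None):
    """Translate an mRNA read in frame from its first codon.

    If amber_suppressor is given, the amber codon UAG is decoded as that
    symbol (standing in for an unnatural amino acid) instead of ending
    translation. UAA and UGA always act as stop codons.
    """
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == "UAG" and amber_suppressor is not None:
            peptide.append(amber_suppressor)
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


# Hypothetical toy message: AUG GGC UAG AAA UAA
toy_mrna = "AUGGGCUAGAAAUAA"
print(translate(toy_mrna))                        # MG   (UAG read as stop)
print(translate(toy_mrna, amber_suppressor="X"))  # MGXK (UAG read as 'X')
```

Without a suppressor the toy message is truncated at UAG; with one, translation continues to the next stop codon, which mirrors how an in-frame amber codon yields full-length protein only when the orthogonal pair is present.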

Figure 5.2. A) The strategy for the site‐specific incorporation of 3‐halo‐L‐tyrosine at the 39‐position of EmGFP utilizing amber stop codon suppression by an engineered orthogonal aaRS/tRNA pair. The proteins are represented as 1TAG‐3ClY, 1TAG‐3BrY, or 1TAG‐3IY, respectively. B) Confocal images confirming the expression of 1TAG‐3ClY, ‐3BrY, or ‐3IY EmGFP in E. coli. C,D) Confirmation of purified proteins by SDS‐PAGE and mass spectrometry.

Ref: Angew. Chem. Int. Ed. 2019, 58, 7713. 
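A mass spectrometry check of this kind rests on the fact that replacing one hydrogen with a halogen shifts the protein mass by a predictable amount. The short Python sketch below estimates that shift from standard average atomic masses. The function name `halogen_mass_shift` is hypothetical, and the calculation is only a back-of-the-envelope illustration, not the analysis used in the study.

```python
# Standard average atomic masses in daltons (rounded).
ATOMIC_MASS = {"H": 1.008, "Cl": 35.45, "Br": 79.904, "I": 126.904}


def halogen_mass_shift(halogen):
    """Mass change (Da) when one hydrogen atom is replaced by the halogen."""
    return ATOMIC_MASS[halogen] - ATOMIC_MASS["H"]


for halogen in ("Cl", "Br", "I"):
    print(f"H -> {halogen}: +{halogen_mass_shift(halogen):.1f} Da")
# H -> Cl: +34.4 Da
# H -> Br: +78.9 Da
# H -> I: +125.9 Da
```

The three variants therefore differ from the wild type, and from each other, by tens of daltons, which is what makes a single-atom substitution distinguishable by intact-protein mass measurement.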

The absorption and fluorescence spectra of all four proteins indicated that there is no change in the spectral properties (Figure 5.3). The circular dichroism (CD) studies indicated that the heavier halogen atoms do not alter the secondary structures of the proteins. As the introduction of halogen atoms into GFP may increase the toxicity, the cell viability was determined by using HepG2 (human liver carcinoma) cells. The toxicity of all three halogenated proteins was found to be almost identical to that of the wild‐type protein, indicating that the replacement of a hydrogen atom with a heavier halogen does not lead to toxicity in mammalian cells. The interesting fluorescent properties of the EmGFPs prompted us to investigate the cellular uptake in mammalian cells. We studied the uptake of the wild‐type (WT) as well as the modified proteins by using laser scanning microscopy and fluorescence microplate reader techniques. The HepG2 cells treated with 1 μM EmGFP‐WT for 90 min showed a very low fluorescence (Figure 5.3D–F), indicating that the WT protein is not taken up readily by the cells. The cellular uptake was marginally increased for 1TAG‐3ClY, suggesting that the replacement of a hydrogen atom with a chlorine atom at the 39‐position increases the cellular uptake. A further increase in the uptake was observed for the bromo analogue (1TAG‐3BrY), and the amount of protein that entered the cells was found to be almost two times higher than that of the WT. Remarkably, a much higher uptake (almost sixfold increase with respect to WT) was observed for 1TAG‐3IY, indicating that the iodine atom facilitates the transport of the protein, as observed earlier for iodinated small molecules.
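Uptake comparisons such as "two times" or "sixfold" relative to WT are fold changes of fluorescence readings. A minimal Python sketch of that normalization follows. The function name `fold_change`, the background-subtraction step, and all numbers are assumptions made for illustration; the values are placeholders in arbitrary units, not measurements from the study.

```python
def fold_change(readings, background, reference="WT"):
    """Return background-corrected fluorescence relative to the reference sample.

    readings   -- dict mapping sample name to raw fluorescence (arbitrary units)
    background -- reading from untreated cells, subtracted from every sample
    reference  -- key of the sample that defines a fold change of 1.0
    """
    corrected = {name: value - background for name, value in readings.items()}
    ref = corrected[reference]
    if ref <= 0:
        raise ValueError("reference signal must exceed the background")
    return {name: value / ref for name, value in corrected.items()}


# Placeholder plate-reader values (hypothetical, arbitrary units).
example = {"WT": 150.0, "sample_A": 350.0, "sample_B": 550.0}
print(fold_change(example, background=50.0))
# {'WT': 1.0, 'sample_A': 3.0, 'sample_B': 5.0}
```

Expressing every sample relative to the WT reading from the same experiment keeps the comparison independent of instrument gain and cell number, provided those are matched across wells.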


Hydrophobic small molecules such as benzene and gaseous molecules such as O2 and CO2 can cross the cell membrane by simple diffusion. However, such simple diffusion through the plasma membrane is extremely difficult for macromolecules. Our studies reveal that the proteins are taken up by the cells via an energy‐dependent pathway involving caveolar endocytosis. The caveolae‐mediated internalization was further confirmed by carrying out the uptake experiments in the presence of genistein (GST), a tyrosine‐kinase inhibitor known not only to cause local disruption of the actin network at the site of endocytosis, but also to inhibit the recruitment of dynamin II. In the presence of GST, an almost complete inhibition of the cellular uptake was observed, confirming that all four proteins are taken up by the cells through ATP‐dependent, caveolae‐mediated endocytosis.

Figure 5.3. A,B) Normalized fluorescence intensity and CD spectra of the WT and modified proteins. C) Cell viability of HepG2 cells incubated with halo‐EmGFP (1TAG) for 90 min. D) The fluorescence measured by a plate reader, E) mean fluorescence as determined by flow cytometry, and F) confocal images after 90 min treatment of HepG2 cells with WT and modified EmGFP.

It is known that biomolecules, particularly proteins, that enter cells by endocytic pathways become entrapped in endosomes, are transported to lysosomes, and are subsequently degraded by specific enzymes. The time‐dependent cellular uptake experiments indicate that both 1TAG‐3IY and 2TAG‐3IY rapidly enter the cells through receptor‐mediated endocytosis, but a significant decrease in the fluorescence intensity was observed after 24 h (Figure 5.4D,E), probably due to degradation of the proteins in lysosomes (Figure 5.4F). To release the entrapped proteins from the endosomes before their degradation, we co‐treated the cells with the histidine‐rich 20‐mer peptide ppTG21 (pI 7.7, charge +1.3 at pH 7.4), which has been reported as a promising endosomolytic agent for plasmid‐based gene delivery. Interestingly, a remarkable enhancement in the fluorescence intensity was observed for both 1TAG‐3IY and 2TAG‐3IY in the presence of ppTG21 after 90 min, and no decrease in the fluorescence was observed even after 24 h (Figure 5.4E). We obtained similar results when the cells were treated with the peptide after 60 min of treatment with 1TAG‐3IY, indicating that ppTG21 acts intracellularly on the 1TAG‐3IY already taken up. These observations confirm that the iodinated proteins are taken up by the cells preferentially through receptor‐mediated endocytosis and that the proteins can be released effectively into the cytosol by using endosomolytic agents such as ppTG21.


In summary, we showed for the first time that proteins can be transported across the cell plasma membrane by a single atom substitution. The introduction of an iodine atom into green fluorescent protein remarkably enhances the cellular uptake, and halogen bonding may play a key role in the caveolae‐mediated endocytosis. The proteins can be released effectively into the cytosol by co‐treatment with the histidine‐rich 20‐mer peptide ppTG21, which can mediate the rupture of the endosomal membrane by altering the proton gradient in endosomes.

Figure 5.4. A,B) The fluorescence measured by a plate reader and the corresponding confocal images after 90 min of treatment of HepG2 cells with 1 μM proteins under various conditions. C) Amino acid sequence of ppTG21. D,E) The fluorescence measured by a plate reader and the corresponding confocal images in HepG2 cells after 90 min or 24 h of co‐treatment with 1 μM EmGFP and 30 μM ppTG21. F) A schematic representation of caveolae‐mediated cellular uptake and proposed endosomal escape route for EmGFP.

Ref: Angew. Chem. Int. Ed. 2019, 58, 7713. 
