Thursday, June 26, 2014

Adding new alphabets to the language of genes






Write down “FIFA World Cup Football” in Tamil. Now read. It reads: “Peepa voled cup putpall” or something similar. Makes no sense! Prior knowledge of pronunciation becomes relevant. Existing alphabets are modified or new ones added to accommodate new sounds and needs. Scholars have attempted to expand and modify the Tamil alphabet to include sounds such as “f”. Several European languages have modified the basic alphabet using diacritical marks such as the umlaut or the circumflex.
While this is easy with man-made languages, how can one do it with the language of biology, which arose and has persisted over billions of years of evolution? Yet if one attempts it, new forms might emerge with new properties and new uses. This has been the dream of ‘synthetic biologists,’ who want to create new proteins, new drugs and useful materials.
Actually, living organisms have a two-language policy: one language of DNA and RNA, the carriers and transmitters of genetic information, and a second of proteins, the molecules that carry out the orders written in the DNA and read out and translated by the RNA (which also helps in stringing together the protein chain).
Only four
The DNA/RNA language has an alphabet of only four letters: the chemical units (bases) designated A, G, C and T (or U in place of T in RNA). Even here the letters are arranged in a two-stranded manner, as in a ladder or double helix. The letter A in one strand always faces and pairs with the letter T in the other, while C pairs with G. This A-T and G-C pairing is the rule. Words are strung together using these four bases, and each word is exactly three bases long (e.g., AGC, CTA, TTG…). With only 4 letters and 3 to a word, the total number of possible words is just 4 × 4 × 4 = 64, and that includes punctuation marks like the comma, the full stop and so on. The RNA alphabet too has only 4 letters, A, G, C and U (instead of the T in DNA), again with base pairing of A-U and G-C. Sentences, i.e., genes and their RNA transcripts, can however run to many thousands of words in length.
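The counting and pairing rules above can be checked with a few lines of Python. This is only an illustrative sketch: the names `all_words` and `partner_strand` are made up for this example, and the pairing is done position by position, ignoring the fact that the two strands of real DNA run in opposite directions.

```python
from itertools import product

DNA_BASES = "ACGT"
# The pairing rule: A faces T, and C faces G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def all_words(alphabet, length=3):
    """List every possible word of the given length over the alphabet."""
    return ["".join(letters) for letters in product(alphabet, repeat=length)]

def partner_strand(strand):
    """Return the base that faces each letter on the opposite strand.

    Simplification: position-by-position pairing only; strand
    direction is not modelled here.
    """
    return "".join(PAIRS[base] for base in strand)

words = all_words(DNA_BASES)
print(len(words))             # 4 * 4 * 4 = 64
print(partner_strand("AGC"))  # TCG
```

Pairing the partner strand once more gives back the original sequence, which is what lets each strand serve as a template for the other.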
The alphabet of the worker molecules, the proteins, is richer, but still has only 20 letters. Each letter is an amino acid; a protein chain is thus a string of amino acids, drawn from these 20, arranged in sequence. Scientists like Dr Hargobind Khorana helped translate the 64 words of the DNA/RNA language into the 20-letter language of the proteins.
Just like Tamil and European linguists, many biologists too have wondered about expanding the DNA and protein alphabets in biology. After all, nature provides us with many more bases than the four used in DNA/RNA, and hundreds more amino acids. And in the chemical lab, we can make many more. Why then can we not coax DNA to expand its alphabet beyond the 4 letters, and if we can, why not ask it to code for and translate into unusual amino acids strung into novel proteins?
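Here is a minimal sketch of that translation step, written in Python. The dictionary holds only a handful of entries from the standard genetic code (the full table has all 64 words), and the function name `translate` and the sample sequence are invented for illustration. The real process inside a cell goes through RNA and the ribosome; this only mimics the dictionary lookup.

```python
# A small excerpt of the standard genetic code, written with DNA letters.
# The complete table has 64 entries; only the ones used below are listed.
CODON_TABLE = {
    "ATG": "Met", "AGC": "Ser", "CTA": "Leu", "TTG": "Leu",
    "GGC": "Gly", "AAA": "Lys", "TGG": "Trp",
    "TAA": "STOP", "TAG": "STOP", "TGA": "STOP",  # the "full stops"
}

def translate(dna):
    """Read a DNA sentence three letters at a time until a full stop."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein

print(translate("ATGAGCCTATTGTAA"))  # ['Met', 'Ser', 'Leu', 'Leu']
```

Note that two different words, CTA and TTG, both translate to the same amino acid (leucine). That redundancy is how 64 words fit into a 20-letter protein alphabet.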
What it involves

What does this involve? First, the DNA sequence needs to be altered to incorporate unnatural bases (call them X and Y), which pair up with each other just as A-T and G-C do. Such bases can be made in the lab. Next, X and Y must be inserted into an existing DNA double-helical sequence. This too can be done. The cell should then be able to grow, replicate its DNA and carry on its activity.
For this to happen, X and Y have to be supplied from outside and transported into the cell. That needs natural molecules that treat X and Y as ‘normal’ and carry them into the cell, so that they can be fitted into the DNA sequence. Next, this novel DNA must not be regarded as ‘undesirable’ and degraded by the DNA-repair enzymes, the cell's security guards. Once these two conditions are met, it should be possible to incorporate X and Y into the genes and have gene replication happen.
It has been possible to take care of both these demands. This remarkable achievement has been made by Dr. Floyd Romesberg and colleagues from the Scripps Research Institute at La Jolla, CA, U.S., using a live bacterial cell. The group made X and Y in the lab, and isolated proteins from an alga which transport X and Y into the cell. They then took a plasmid (a circular, bangle-like DNA molecule), cut it, inserted the sequence containing X and Y, and closed the circle. This semi-natural plasmid was inserted into E. coli and allowed to replicate. Behold, the X and Y are incorporated in the replicated DNA! This report on the making of such a “semi-synthetic organism with an expanded genetic alphabet” appears in the May 15, 2014 issue of Nature, showing that we can expand biological alphabets in vivo, in living systems. The time is not far off when we can make new types of proteins using unnatural genetic structures!
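To get a feel for what two extra letters could mean in numbers, here is a small Python sketch. It assumes, purely for illustration, that words stay three letters long after X and Y are added; the work described above shows that the new pair is replicated, not that such words are read or translated. The names `partner_strand` and `count_words` are made up for this example.

```python
from itertools import product

NATURAL_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}
# Hypothetical expanded rule: the unnatural bases X and Y pair with each other.
EXPANDED_PAIRS = {**NATURAL_PAIRS, "X": "Y", "Y": "X"}

def partner_strand(strand, pairs):
    """Return the base facing each letter, position by position."""
    return "".join(pairs[base] for base in strand)

def count_words(pairs, length=3):
    """Count the possible words of the given length over the alphabet."""
    return sum(1 for _ in product(pairs, repeat=length))

print(count_words(NATURAL_PAIRS))               # 64
print(count_words(EXPANDED_PAIRS))              # 6 * 6 * 6 = 216
print(partner_strand("AGXTC", EXPANDED_PAIRS))  # TCYAG
```

Under that assumption, a six-letter alphabet would offer 216 three-letter words instead of 64, which is the kind of spare vocabulary that could one day be assigned to unusual amino acids.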
