Genetic code | Definition, Characteristics, Table, & Facts

Initially, Dr. George Gamow (1904–1968), the well-known nuclear physicist, proposed that the genetic code consists of triplets of nitrogenous bases and that adjacent triplets overlap.
If the first codon is CAG, then the next must begin with AG and the third with G.
Later experiments did not support this hypothesis.
Further research showed that the codons are arranged in a linear, non-overlapping order. This explains why a change in one base alters only one amino acid and not three.
If Gamow’s hypothesis were correct, a change in one nitrogenous base would have affected three amino acids.
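The argument against an overlapping code can be checked with a small sketch. The sequence and the helper names below are hypothetical and chosen only for illustration; the code counts how many triplets change after a single base substitution under each reading scheme.

```python
def overlapping_triplets(seq):
    """Gamow-style reading: each triplet starts one base after the previous one."""
    return [seq[i:i + 3] for i in range(len(seq) - 2)]


def nonoverlapping_triplets(seq):
    """Linear reading: consecutive triplets share no bases."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


def changed_triplets(read, original, mutant):
    """Count the triplets that differ between the original and the mutant."""
    return sum(a != b for a, b in zip(read(original), read(mutant)))


original = "CAGUUCGGA"  # hypothetical sequence
mutant = "CAGUACGGA"    # one base changed (U -> A at index 4)

print(changed_triplets(overlapping_triplets, original, mutant))     # 3
print(changed_triplets(nonoverlapping_triplets, original, mutant))  # 1
```

Under the overlapping scheme a single base belongs to three triplets, while under the linear scheme it belongs to exactly one, which matches the experimental observation described above.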

Cracking the Genetic Code

The first codon was deciphered in 1961 by Marshall W. Nirenberg of the National Institutes of Health.
He determined the first match: UUU codes for the amino acid phenylalanine.
He created an artificial mRNA molecule composed entirely of uracil and added it to a test-tube mixture of amino acids, ribosomes, and the other components needed for protein synthesis.
This “poly(U)” was translated into a polypeptide containing a single amino acid, phenylalanine, in a long chain.
Other, more elaborate techniques were required to decode mixed triplets such as AUA and CGA.

Codons: Singlet & Doublet of Bases

The basic problem of the genetic code is to explain how information written in a four-letter language (the four nucleotides, or nitrogenous bases, of DNA) is translated into a twenty-letter language (the twenty amino acids of proteins).

If the genetic code consisted of a single nucleotide or even pairs of nucleotides per amino acid, there would not be enough combinations [4 (singlet) and 16 (doublet)] to code for all 20 amino acids.

Codons: Triplet of Bases

Triplets of nucleotide bases are the smallest units of uniform length that can code for all the amino acids.
In the triplet code, three consecutive bases specify an amino acid, creating 4³ (64) possible code words.
The genetic instructions for a polypeptide chain are written in DNA as a series of three-nucleotide words.
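The counting argument for singlet, doublet and triplet codes can be reproduced in a few lines. This is a minimal sketch; the function name `code_words` is an assumption introduced here for illustration.

```python
from itertools import product

BASES = "UCAG"    # the four RNA bases
AMINO_ACIDS = 20  # number of amino acids that must be encoded


def code_words(length):
    """Return every possible code word of the given length."""
    return ["".join(p) for p in product(BASES, repeat=length)]


for length in (1, 2, 3):
    n = len(code_words(length))
    # A word length is sufficient only if it offers at least 20 combinations.
    print(length, n, n >= AMINO_ACIDS)
```

Only the triplet length gives at least 20 combinations (4, 16 and 64 words for lengths 1, 2 and 3).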

The genetic code was deciphered experimentally through the combined efforts of many biochemists, notably Marshall Warren Nirenberg and Har Gobind Khorana, who were awarded the 1968 Nobel Prize in Physiology or Medicine for their work, along with Robert Holley, who was the first scientist to determine the nucleotide sequence of a tRNA (alanine tRNA).

Definition: The Genetic Code

The genetic code is a non-overlapping code in which each amino acid, as well as polypeptide initiation and termination, is specified by RNA codons composed of three nucleotides.
The genetic code for protein synthesis is contained in the base sequence of DNA.
The specific correspondence between a set of 3 bases and 1 of the 20 amino acids is called the genetic code.
Of the 64 possible triplets, 61 code for amino acids.
The codons AUG and GUG encode methionine and valine, respectively; AUG is the usual signal that initiates translation, and GUG can also serve as an initiator in E. coli.
Methionine (AUG) and tryptophan (UGG) are each encoded by a single codon; all other amino acids are represented by 2 to 6 triplets.
Three codons (UAA, UAG and UGA) do not specify amino acids but signal the termination of translation.
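The counts in this definition can be verified against the standard codon table. The sketch below builds the table from a compact 64-letter string of one-letter amino acid codes (with `*` marking a termination codon); the variable names are assumptions made for this example.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for the 64 codons in UCAG order; "*" = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
sense_codons = {c: aa for c, aa in CODON_TABLE.items() if aa != "*"}
codons_per_aa = Counter(sense_codons.values())
single_codon_aas = sorted(aa for aa, n in codons_per_aa.items() if n == 1)

print(len(CODON_TABLE), len(sense_codons), stop_codons)
print(single_codon_aas, max(codons_per_aa.values()))
```

The single-codon amino acids come out as M (methionine) and W (tryptophan), and no amino acid has more than 6 codons.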

Characteristics of the genetic code

Triplet nature
Degeneracy
Non-overlapping
Comma-free
Non-ambiguity
Universality
Polarity
Chain initiation codons
Chain termination codons

1. Triplet Nature

As pointed out above, a triplet code provides 64 codons for only 20 amino acids, an excess of (64 – 20) = 44 codons; therefore, more than one codon can specify the same amino acid. Because singlet and doublet codes provide too few combinations, the code must be at least a triplet code.

Certain patterns of the genetic code

Amino acids with similar structural properties tend to have related codons. Thus, aspartic acid codons (GAU, GAC) are similar to glutamic acid codons (GAA, GAG); the difference being exhibited only in the third base (toward 3′ end).
Similarly, the codons for the aromatic amino acids phenylalanine (UUU, UUC), tyrosine (UAU, UAC) and tryptophan (UGG) all begin with uracil (U).
The first two bases of all 4 codons assigned to each of 5 amino acids are identical: GC for alanine, GG for glycine, CC for proline, AC for threonine and GU for valine.
All codons with U in the second position specify hydrophobic amino acids (Ile, Leu, Met, Phe, Val).
All the charged amino acids except Arg (Asp, Glu, Lys) are specified by codons with A in the second position.
All the acidic (Asp, Glu) and basic (Arg, Lys) amino acids have A or G as the second base.
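Several of these patterns can be checked directly against the standard codon table. The sketch below uses one-letter amino acid codes (F = Phe, Y = Tyr, W = Trp, D = Asp, E = Glu, R = Arg, K = Lys, and so on); the variable names are assumptions introduced for illustration.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for the 64 codons in UCAG order; "*" = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Amino acids specified by codons with U in the second position.
second_u = {aa for c, aa in CODON_TABLE.items() if c[1] == "U"}

# First bases used by the aromatic amino acids Phe, Tyr and Trp.
aromatic_first = {c[0] for c, aa in CODON_TABLE.items() if aa in "FYW"}

# Second bases used by the acidic (Asp, Glu) and basic (Arg, Lys) amino acids.
charged_second = {c[1] for c, aa in CODON_TABLE.items() if aa in "DERK"}

# Two-base prefixes shared by all codons of Ala, Gly, Pro, Thr and Val.
prefixes = {
    target: {c[:2] for c, aa in CODON_TABLE.items() if aa == target}
    for target in "AGPTV"
}

print(sorted(second_u), aromatic_first, sorted(charged_second))
print(prefixes)
```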

2. Degeneracy

The code is degenerate, which means that the same amino acid can be coded by more than one base triplet (codon).

The code degeneracy is basically of 2 types:

Partial degeneracy: the first two nucleotides are identical but the third (i.e., 3′ base) nucleotide of the degenerate codon differs; for example, CUU and CUC code for leucine.

Complete degeneracy: occurs when any of the 4 bases can take third position and still code for the same amino acid; for example, UCU, UCC, UCA and UCG all code for serine.
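Complete degeneracy can be detected by fixing the first two bases and varying the third. The following sketch, which assumes the standard codon table and uses hypothetical helper names, lists every two-base prefix for which all four third bases give the same amino acid.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for the 64 codons in UCAG order; "*" = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def third_base_family(prefix):
    """Amino acids (or stop) reached by varying only the third base."""
    return {CODON_TABLE[prefix + base] for base in BASES}


def is_completely_degenerate(prefix):
    """True when any of the 4 bases in the third position gives the same amino acid."""
    return len(third_base_family(prefix)) == 1


all_prefixes = ["".join(p) for p in product(BASES, repeat=2)]
complete = sorted(p for p in all_prefixes if is_completely_degenerate(p))

print(complete)
print(third_base_family("UC"), third_base_family("GA"))
```

The prefix UC (serine) is completely degenerate, whereas GA is not, because its third base distinguishes aspartic acid (GAU, GAC) from glutamic acid (GAA, GAG).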

3. Non-overlapping

The genetic code is non-overlapping, i.e., adjacent codons do not overlap. A non-overlapping code means that the same letter is not used for two different codons. In other words, no single base can take part in the formation of more than one codon.
In practice, six bases code for no more than two amino acids.
For example, an end-to-end sequence of 5′ UUUCCC 3′ on mRNA will code only 2 amino acids, i.e., phenylalanine (UUU) and proline (CCC).
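The UUUCCC example can be reproduced with a short sketch that splits an mRNA string into consecutive, non-overlapping triplets and looks each one up in the standard codon table. The function names are assumptions made for this illustration; F and P are the one-letter codes for phenylalanine and proline.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for the 64 codons in UCAG order; "*" = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def split_codons(mrna):
    """Split an mRNA string (written 5' to 3') into non-overlapping triplets."""
    return [mrna[i:i + 3] for i in range(0, len(mrna) - 2, 3)]


def translate(mrna):
    """Look up each triplet in turn; every base is used in exactly one codon."""
    return [CODON_TABLE[codon] for codon in split_codons(mrna)]


print(split_codons("UUUCCC"))  # ['UUU', 'CCC']
print(translate("UUUCCC"))     # ['F', 'P']
```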

4. Comma-free

There is no signal to indicate the end of one codon and the beginning of the next. The genetic code is comma-free (or commaless).
A commaless code means that no codon is reserved for punctuation; the code has no spacers or space words.
There are no intermediary nucleotides (or commas) between the codons. In other words, we can say that after one amino acid is coded, the second amino acid will be automatically coded by the next three letters.

5. Non-ambiguity

There is no ambiguity about a particular codon. A particular codon always codes for the same amino acid.

While the same amino acid can be coded by more than one codon (degeneracy), the same codon does not code for two or more different amino acids (non-ambiguity).
In an ambiguous code, the same codon could code for two or more different amino acids. Such is not the case.
Exceptional case: Under certain conditions the code can be read ambiguously, that is, the same codon may specify more than one amino acid. For example, the UUU codon usually codes for phenylalanine but, in the presence of streptomycin, may also code for isoleucine, leucine or serine.

6. Universality

The genetic code applies to all modern organisms with only very minor exceptions. Although the code was worked out in the bacterium Escherichia coli, it is valid for other organisms. This important characteristic of the genetic code is called its universality. It means that the same sequences of three bases encode the same amino acids in nearly all life forms, from simple microorganisms to complex organisms.

In other words, the code is a conservative one, i.e., the code was fixed early in the course of evolution and has been maintained to the present day.

7. Polarity

The genetic code has polarity, that is, the code is always read in a fixed direction, i.e., the 5′ → 3′ direction. If the same message were read in the opposite direction (i.e., 3′ → 5′), it would specify a different protein, since the codons would have a reversed base sequence.
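The effect of reading direction can be illustrated with a short sketch. The mRNA below is a hypothetical sequence chosen for the example, and reading 3′ → 5′ is modelled simply by reversing the string before translating it with the standard codon table.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for the 64 codons in UCAG order; "*" = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(bases):
    """Translate consecutive triplets in the order given."""
    return "".join(
        CODON_TABLE[bases[i:i + 3]] for i in range(0, len(bases) - 2, 3)
    )


mrna = "AUGGCUUCA"  # hypothetical message, written 5' -> 3'

forward = translate(mrna)         # read 5' -> 3'
backward = translate(mrna[::-1])  # same bases read 3' -> 5'

print(forward, backward)  # MAS TSV
```

The same nine bases give Met-Ala-Ser in one direction and Thr-Ser-Val in the other, so a fixed reading direction is needed for a message to have a single meaning.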

8. Chain Initiation Codon

The triplets AUG and GUG play double roles in E. coli. When they occur between the two ends of a cistron (intermediate position), they code for the amino acids methionine and valine, respectively, at an intermediate position in the protein molecule.
But when they occur immediately after a terminator codon, they act as “chain initiation” (C.I.) signals or “starter codons” for the synthesis of a polypeptide chain.
It has also been shown that the initiating methionine is in the formylated state (N-formylmethionine).
This distinguishes the initiating methionine from methionine at internal positions, which is not formylated.

9. Chain Termination Codon

The 3 triplets UAA, UAG and UGA do not code for any amino acid. They were originally described as nonsense codons, in contrast to the remaining 61 sense codons.
The so-called non-sense codons have now been found to be of “special sense”. When any one of them occurs immediately before the triplet AUG or GUG, it causes the release of the polypeptide chain from the ribosome.
Hence, the use of the term ‘non-sense’ is not suitable.

UAA is also called ochre

UAG is also known as amber

UGA is also termed as opal

The start codon sets the reading frame for all remaining codons.
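How initiation and termination codons bracket a message can be sketched as follows. The mRNA is a hypothetical sequence, the function names are assumptions, and only AUG is treated as a start codon to keep the example simple.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for the 64 codons in UCAG order; "*" = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def read_frame(mrna, start):
    """Translate triplets from index `start` until a stop codon or the end."""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":  # UAA, UAG or UGA releases the chain
            break
        protein.append(aa)
    return "".join(protein)


def translate_from_start(mrna):
    """Locate the first AUG and read in the frame that it sets."""
    start = mrna.find("AUG")
    return "" if start == -1 else read_frame(mrna, start)


mrna = "GGAUGUUUCCCUAAGG"  # hypothetical message, 5' -> 3'

print(translate_from_start(mrna))  # MFP
print(read_frame(mrna, 0))         # GCFPK
```

Reading from the AUG gives Met-Phe-Pro followed by termination at UAA, whereas reading the same bases from the first position gives an unrelated sequence.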

Mutation and Genetic Code

Based on work done in this context, there are two kinds of mutations which have played a very significant role in the study of the genetic code.

These are:

(i) Frame-shift mutations

(ii) Base substitutions

Frame-shift Mutation

The reading frame of codons is disturbed whenever one or more bases are deleted or added (unless the number of bases is a multiple of three).

When such frame-shift mutants were intercrossed, certain combinations gave the wild type. It was concluded that one of the mutations was a deletion and the other an addition (insertion).
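The way an insertion and a nearby deletion compensate for each other can be sketched with a hypothetical message. The sequence and helper names below are assumptions for illustration; the double mutant here restores the reading frame downstream of the second mutation rather than reproducing the wild-type sequence exactly.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for the 64 codons in UCAG order; "*" = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate triplets from the first base until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def insert_base(seq, index, base):
    """Return seq with `base` inserted before position `index`."""
    return seq[:index] + base + seq[index:]


def delete_base(seq, index):
    """Return seq with the base at position `index` removed."""
    return seq[:index] + seq[index + 1:]


wild_type = "AUGUUUCCCGGGAAAUAA"             # hypothetical message
insertion = insert_base(wild_type, 3, "A")   # shifts every later codon
double_mutant = delete_base(insertion, 10)   # deletion restores the frame

print(translate(wild_type))      # MFPGK
print(translate(insertion))      # MISREI
print(translate(double_mutant))  # MISRK
```

With the insertion alone, every codon after the mutation is misread and the stop codon is lost. After the compensating deletion, the final lysine and the stop codon are read correctly again, and only the codons between the two mutations remain altered.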

Base Substitution or Amino Acid Replacement

When one base is replaced by another without any deletion or addition, the meaning of one codon containing this altered base will ordinarily change.
In place of a particular amino acid at a particular position in a polypeptide, another amino acid will be incorporated.

Examples:

tryptophan synthetase
hemoglobin (as in sickle cell anemia)
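A base substitution can be sketched with the codon change associated with sickle cell anemia, in which a GAG codon (glutamic acid) becomes GUG (valine). The short mRNA below is a hypothetical fragment and not the real hemoglobin sequence, and the helper names are assumptions.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for the 64 codons in UCAG order; "*" = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate triplets from the first base until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def substitute(seq, index, base):
    """Replace the base at `index` without changing the sequence length."""
    return seq[:index] + base + seq[index + 1:]


normal = "AUGGAGAAAUAA"             # hypothetical fragment containing GAG
mutant = substitute(normal, 4, "U")  # GAG -> GUG

before, after = translate(normal), translate(mutant)
differences = sum(a != b for a, b in zip(before, after))

print(before, after, differences)  # MEK MVK 1
```

Because no base is added or removed, the reading frame is untouched and only the amino acid specified by the altered codon changes (E, glutamic acid, to V, valine).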

Exceptions to the genetic code

Thank You

Vikas Kashyap

Read Agriculture
