7.9 The Genetic Code

There is a direct correspondence from gene sequence to protein sequence. The genetic code uses groups of three bases, called codons, to represent individual amino acids. The table below gives the code using DNA bases, with each codon followed by its amino acid and that amino acid's one-letter abbreviation. It's not important to memorize it, though that can be done.

First  | Second letter
letter | T                   | C               | A                   | G
T      | TTT phenylalanine F | TCT serine S    | TAT tyrosine Y      | TGT cysteine C
       | TTC phenylalanine F | TCC serine S    | TAC tyrosine Y      | TGC cysteine C
       | TTA leucine L       | TCA serine S    | TAA stop (Ochre)    | TGA stop (Opal)
       | TTG leucine L       | TCG serine S    | TAG stop (Amber)    | TGG tryptophan W
C      | CTT leucine L       | CCT proline P   | CAT histidine H     | CGT arginine R
       | CTC leucine L       | CCC proline P   | CAC histidine H     | CGC arginine R
       | CTA leucine L       | CCA proline P   | CAA glutamine Q     | CGA arginine R
       | CTG leucine L       | CCG proline P   | CAG glutamine Q     | CGG arginine R
A      | ATT isoleucine I    | ACT threonine T | AAT asparagine N    | AGT serine S
       | ATC isoleucine I    | ACC threonine T | AAC asparagine N    | AGC serine S
       | ATA isoleucine I    | ACA threonine T | AAA lysine K        | AGA arginine R
       | ATG methionine M    | ACG threonine T | AAG lysine K        | AGG arginine R
G      | GTT valine V        | GCT alanine A   | GAT aspartic acid D | GGT glycine G
       | GTC valine V        | GCC alanine A   | GAC aspartic acid D | GGC glycine G
       | GTA valine V        | GCA alanine A   | GAA glutamic acid E | GGA glycine G
       | GTG valine V        | GCG alanine A   | GAG glutamic acid E | GGG glycine G

(RNA codons are the same, except that every T is replaced with U.)
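
For readers who like to experiment, the table above can be held in a small lookup structure. The following is only a sketch: the names DNA_BASES, AMINO_ACIDS, CODON_TABLE, to_rna, and RNA_CODON_TABLE are made up for this illustration, and "*" is used here as a stand-in symbol for the three stop codons.

```python
from itertools import product

# Bases in the same order as the table: T, C, A, G.
DNA_BASES = "TCAG"

# One-letter amino acid codes, listed in table order
# (first letter, then second letter, then third letter).
# "*" is an assumed placeholder for a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # first letter T
    "LLLLPPPPHHQQRRRR"  # first letter C
    "IIIMTTTTNNKKSSRR"  # first letter A
    "VVVVAAAADDEEGGGG"  # first letter G
)

# Pair each of the 64 three-base combinations with its amino acid.
CODON_TABLE = {
    "".join(codon): amino
    for codon, amino in zip(product(DNA_BASES, repeat=3), AMINO_ACIDS)
}


def to_rna(sequence):
    """Rewrite a DNA sequence in RNA letters by swapping T for U."""
    return sequence.replace("T", "U")


# The RNA version of the table is the same mapping with U in place of T.
RNA_CODON_TABLE = {to_rna(codon): amino for codon, amino in CODON_TABLE.items()}

print(CODON_TABLE["ATG"])      # M
print(RNA_CODON_TABLE["AUG"])  # M
```

Looking up CODON_TABLE["TGG"] gives "W" for tryptophan, matching the table entry.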

Coincidentally, the all-guanine codon GGG codes for glycine (G), and G is the only letter for which this is true. But that is purely a quirk of the symbols we use and doesn't mean anything in terms of the molecules themselves.
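
The coincidence is easy to check by hand, or with a few lines of code. This sketch rebuilds the table in a compact form (the names CODON_TABLE and same_letter are invented for the example, and "*" is an assumed marker for stop codons) and asks which base, repeated three times, codes for the amino acid with the same letter.

```python
from itertools import product

# Compact form of the codon table, in T, C, A, G order.
# "*" is an assumed placeholder for a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino
    for codon, amino in zip(product("TCAG", repeat=3), AMINO_ACIDS)
}

# For each base, compare the tripled codon (TTT, CCC, AAA, GGG)
# with the base's own letter.
same_letter = [base for base in "TCAG" if CODON_TABLE[base * 3] == base]

print(same_letter)  # ['G']
```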

The genes in DNA are demarcated by patterns of DNA sequence called a promoter and a terminator. In order to translate a gene into a protein, the cellular machinery must first make an RNA copy of the gene. The promoter is a region of DNA that signals the enzymes to "start here"; the terminator tells them "okay, all done." Once the section of DNA is copied, the new RNA has the same sequence as the coding side of the DNA. The RNA is copied from the opposite DNA strand, so it is a complement of a complement and therefore identical to the original sequence, except of course with U instead of T.
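
The "complement of a complement" idea can be sketched in code. This is a simplified illustration, not a model of the real enzymes: it pairs bases position by position, ignores strand direction and the promoter and terminator, and uses invented names (DNA_PAIR, RNA_PAIR, opposite_strand, transcribe) and a made-up nine-base sequence.

```python
# Base-pairing rules. In RNA, U takes the place of T.
DNA_PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}
RNA_PAIR = {"A": "U", "T": "A", "C": "G", "G": "C"}


def opposite_strand(coding):
    """Return the DNA strand that pairs with the coding side."""
    return "".join(DNA_PAIR[base] for base in coding)


def transcribe(template):
    """Build the RNA that pairs with the given DNA strand."""
    return "".join(RNA_PAIR[base] for base in template)


coding = "ATGGCCTAA"                # made-up coding-side sequence
template = opposite_strand(coding)  # the strand the RNA is copied from
rna = transcribe(template)

print(template)  # TACCGGATT
print(rna)       # AUGGCCUAA

# The RNA matches the coding side, with U instead of T.
assert rna == coding.replace("T", "U")
```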

Protein synthesis always begins with a start codon, AUG, which is also the codon for methionine (M). (In fact, it is the only codon for methionine.) The protein ends when the sequence reaches a stop codon. There are three stop codons: UAA, UAG, and UGA, nicknamed ochre, amber, and opal. Some organisms, however, have co-opted one or another of the stop codons to code for nonstandard amino acids.
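
Reading from a start codon to a stop codon can be sketched as follows. This is a minimal illustration under some assumptions: the names RNA_CODON_TABLE and translate are invented here, "*" stands in for the three stop codons, the standard code is used (no co-opted stop codons), and the function simply starts at the first AUG it finds in the string.

```python
from itertools import product

# Standard code in RNA letters, in U, C, A, G order.
# "*" is an assumed placeholder for the stop codons UAA, UAG, and UGA.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
RNA_CODON_TABLE = {
    "".join(codon): amino
    for codon, amino in zip(product("UCAG", repeat=3), AMINO_ACIDS)
}


def translate(rna):
    """Translate from the first AUG up to, but not including, a stop codon."""
    start = rna.find("AUG")
    if start == -1:
        return ""  # no start codon, so no protein
    protein = []
    # Step through the sequence three bases at a time.
    for i in range(start, len(rna) - 2, 3):
        amino = RNA_CODON_TABLE[rna[i:i + 3]]
        if amino == "*":
            break  # stop codon reached
        protein.append(amino)
    # If no stop codon is found, the loop runs to the end of the sequence.
    return "".join(protein)


# Made-up example: AUG GCC UGG UAA, with extra bases on either side.
print(translate("GGAUGGCCUGGUAAGG"))  # MAW
```

The result begins with M because the start codon itself is translated as methionine.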
