7.3 Protein Structures

Each amino acid is assigned a one-letter code, and a protein's sequence can be written as the letters of its residues in order. Sequences are read from the N-terminus (the end with an unreacted alpha-amino group) to the C-terminus (the end with the exposed carboxylic acid). The sequence is the protein's primary structure.

        10         20         30         40         50         60         70         80         90        100 
MNGTEGPNFY VPFSNATGVV RSPFEYPQYY LAEPWQFSML AAYMFLLIVL GFPINFLTLY VTVQHKKLRT PLNYILLNLA VADLFMVLGG FTSTLYTSLH

       110        120        130        140        150        160        170        180        190        200 
GYFVFGPTGC NLEGFFATLG GEIALWSLVV LAIERYVVVC KPMSNFRFGE NHAIMGVAFT WVMALACAAP PLAGWSRYIP EGLQCSCGID YYTLKPEVNN

       210        220        230        240        250        260        270        280        290        300 
ESFVIYMFVV HFTIPMIIIF FCYGQLVFTV KEAAAQQQES ATTQKAEKEV TRMVIIMVIA FLICWVPYAS VAFYIFTHQG SNFGPIFMTI PAFFAKSAAI

       310        320        330        340
YNPVIYIMMN KQFRNCMLTT ICCGKNPLGD DEASATVSKT ETSQVAPA
An example of a protein sequence: human rhodopsin, the light-sensing protein of the rod cells in the eye. The residue numbers identify individual residues, so that we can talk about how each one functions as part of the protein.
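Residue numbering can be sketched in a few lines of Python. This is only a minimal illustration: the helper names `clean` and `residue_at` are made up for this example, and only the first 100 residues from the listing above are used.

```python
# First 100 residues of human rhodopsin, copied from the listing above.
# The spaces only group the residues in tens for readability.
RHODOPSIN_1_100 = (
    "MNGTEGPNFY VPFSNATGVV RSPFEYPQYY LAEPWQFSML AAYMFLLIVL "
    "GFPINFLTLY VTVQHKKLRT PLNYILLNLA VADLFMVLGG FTSTLYTSLH"
)


def clean(sequence: str) -> str:
    """Remove whitespace so that each character is one residue."""
    return "".join(sequence.split())


def residue_at(sequence: str, number: int) -> str:
    """Return the one-letter code at a 1-based residue number (hypothetical helper)."""
    seq = clean(sequence)
    if not 1 <= number <= len(seq):
        raise IndexError(f"residue number {number} is outside 1..{len(seq)}")
    return seq[number - 1]  # residue 1 is the N-terminus


print(len(clean(RHODOPSIN_1_100)))     # number of residues in this fragment
print(residue_at(RHODOPSIN_1_100, 1))  # the N-terminal residue
```

Residue numbers start at 1 at the N-terminus, while Python strings start at index 0, which is why the helper subtracts one.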

The backbone of a protein is flexible, but only in certain places. The peptide bond is rigid and nearly always occurs in the trans configuration. But the bond between the amino nitrogen and Cα atom of a residue can rotate, and the bond between the Cα atom and the carbonyl group can rotate. The rotations of these two bonds give rise to a great many possible secondary structures.
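These two rotations are conventionally called phi (about the N to Cα bond) and psi (about the Cα to carbonyl carbon bond). Each one can be measured as a dihedral angle defined by four consecutive backbone atoms. The sketch below shows one common way to compute such an angle from 3D coordinates. The function name `dihedral_degrees` and the toy coordinates are assumptions made for illustration, not data from a real structure.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_degrees(p0, p1, p2, p3):
    """Rotation about the p1-p2 bond, in degrees, from four 3D points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    length = math.sqrt(_dot(b1, b1))
    axis = tuple(x / length for x in b1)
    # Remove the components along the bond axis, leaving only the parts
    # that show how p0 and p3 are arranged around that bond.
    v = _sub(b0, tuple(_dot(b0, axis) * x for x in axis))
    w = _sub(b2, tuple(_dot(b2, axis) * x for x in axis))
    return math.degrees(math.atan2(_dot(_cross(axis, v), w), _dot(v, w)))


# Toy coordinates (made up): four points with the outer two on the same
# side of the central bond, then with the last point swung to the far side.
same_side = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
far_side = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
print(same_side, far_side)
```

For phi, the four points would be the carbonyl carbon of the previous residue, then N, Cα, and the carbonyl carbon of the residue itself; for psi, they would be N, Cα, the carbonyl carbon, and the N of the next residue.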

The most common secondary structures are the alpha helix, the beta sheet, and disordered loops.

In an alpha helix, the backbone is coiled, performing one revolution every 3.6 residues. The side chains stick out at an angle, like the branches of an evergreen tree. The helix is held together by hydrogen bonds between the backbone atoms.
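One revolution every 3.6 residues means that each residue is rotated about 100 degrees around the helix axis relative to the one before it. The short sketch below works out where successive side chains point when we look down the axis. The function name `wheel_angle` is made up for this illustration, and the helix is assumed to be perfectly regular.

```python
RESIDUES_PER_TURN = 3.6  # from the text: one revolution every 3.6 residues
DEGREES_PER_RESIDUE = 360.0 / RESIDUES_PER_TURN  # about 100 degrees


def wheel_angle(offset: int) -> float:
    """Angle around the helix axis, `offset` residues after a reference residue."""
    return (offset * DEGREES_PER_RESIDUE) % 360.0


# Print the direction of each side chain for the first two turns or so.
for offset in range(8):
    print(offset, round(wheel_angle(offset)))
```

Residues three and four places further along the chain land within about 60 degrees of the starting residue, so their side chains point out from roughly the same face of the helix.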

Each amino acid has an alpha helix penalty, as we saw in lesson 7.1. It is a measure of how likely that residue is to disrupt a helix when it occurs within one. Alanine is the most favorable amino acid for alpha helices, with a penalty defined as zero; the lower the penalty, the more likely an amino acid is to be found in an alpha helix. Glycine can disrupt helices by being too flexible; proline can disrupt them by introducing kinks. Serine and threonine can disrupt alpha helix formation by hydrogen bonding their side chains to the backbone.
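One simple way to use the penalties is to add them up over a short stretch of sequence: a stretch with a low total is a more plausible helix than one with a high total. The sketch below does this with a sliding window. It is an illustration only: the function name `window_penalties` is made up, and the numbers in `PLACEHOLDER_PENALTY` are invented stand-ins (only alanine's zero comes from the text), so substitute the real table from lesson 7.1 before drawing any conclusions.

```python
# PLACEHOLDER penalties, invented for this sketch. Only alanine = 0 comes
# from the text; the other numbers are NOT the values from lesson 7.1.
PLACEHOLDER_PENALTY = {"A": 0.0, "L": 0.2, "S": 0.5, "G": 1.0, "P": 3.0}


def window_penalties(sequence, penalties, window=4):
    """Sum the penalties over each run of `window` consecutive residues."""
    values = [penalties[res] for res in sequence]  # KeyError if a residue is missing
    return [
        sum(values[start:start + window])
        for start in range(len(values) - window + 1)
    ]


toy = "AALAGPSA"  # hypothetical peptide, not taken from rhodopsin
totals = window_penalties(toy, PLACEHOLDER_PENALTY)
for start, total in enumerate(totals, start=1):
    # Show the 1-based start position, the four residues, and their total.
    print(start, toy[start - 1:start + 3], round(total, 2))
```

Windows that include the glycine and proline get the highest totals, which matches the idea that those residues tend to break helices.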

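Beta sheets are made of strands lying side by side, running either parallel or antiparallel to one another, where the backbones are straightened out and lie almost flat. Each strand forms hydrogen bonds with its neighbors, and the side chains point out roughly perpendicular to the sheet, alternating between its two faces.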

A disordered loop or disordered region is just a part of the protein that isn't in any specific secondary structure. Oftentimes these are flexible and free to just wave around.

One of the most common shapes for proteins is the seven-helix GPCR, or G protein-coupled receptor. The G protein itself was originally a binder for guanine nucleotides (guanine is the G of the genetic code), but evolution, being as it is, stumbled upon a clever way to co-opt G proteins as a means of cellular signalling. The GPCR sits embedded in the membrane that encloses a cell. When a stimulus, such as a hormone, a taste compound, an odor compound, a cannabinoid, or (in the case of opsins) a photon, arrives at the GPCR, the receptor changes shape slightly, allowing the G protein to make contact and begin the signalling process. Humans have more than 800 different GPCRs, of which more than 160 are targets for medicines and drugs.


3D model of human rhodopsin, showing regions of alpha helix and disordered loops.