Since the 1960s we have known that a protein's three-dimensional structure is determined by its amino acid sequence (today we also know that help from "chaperone" proteins is sometimes needed during folding). Predicting the shape of a protein (its tertiary structure) from the amino acid sequence alone (its primary structure), however, has been a daunting task.

Proteinomics is the science of solving this problem. It is important because, once the problem is solved, we will be able to predict the shape of the protein encoded by a particular genetic sequence. That is one step away from predicting the chemical and biological activity of the protein, which in turn is one step away from predicting, entirely in silico, the results of a genetic modification. This would essentially (ignoring all the technical difficulties) make it possible to genetically engineer anything to a specification and to fix problems in any living organism, including ourselves.
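The first link in that chain, going from genetic code to amino acid sequence, is already routine. As a minimal sketch (the function name translate and the example DNA string are illustrative assumptions, and the sketch ignores introns, alternative genetic codes and post-translational modifications), the standard codon table can be applied like this:

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G order
# for each of the three codon positions ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding DNA string into one-letter amino acid codes.

    Reading starts at the first base and stops at the first stop codon.
    Trailing bases that do not form a full codon are ignored.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical example sequence: ATG (Met), GCC (Ala), TGA (stop).
print(translate("ATGGCCTGA"))
```

Everything after this step (sequence to shape, shape to activity) is the hard part that the rest of this article is about.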

Blue - simulation, red - experiment

The development of supercomputers (as well as distributed projects such as Folding@Home) is one important aspect of proteinomics, as the power of brute force should never be underestimated. In addition, scientists are working on better prediction methods, theories and algorithms. Some typical structures recur across proteins (helices, sheets, etc.), but we are still far from completely understanding how proteins work.
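A rough back-of-the-envelope calculation shows why brute force alone is not enough and better algorithms matter. The sketch below assumes, purely for illustration, three possible conformations per residue and a machine that checks 10^15 conformations per second; both numbers are assumptions, not measurements.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def conformations(n_residues: int, states_per_residue: int = 3) -> int:
    """Number of chain conformations if every residue picks a state independently."""
    return states_per_residue ** n_residues


def years_to_enumerate(n_residues: int,
                       states_per_residue: int = 3,
                       samples_per_second: float = 1e15) -> float:
    """Years needed to visit every conformation once at the assumed rate."""
    seconds = conformations(n_residues, states_per_residue) / samples_per_second
    return seconds / SECONDS_PER_YEAR


# A 100-residue chain is a fairly small protein.
print(f"{conformations(100):.2e} conformations")
print(f"{years_to_enumerate(100):.2e} years to enumerate them all")
```

For a 100-residue chain this gives roughly 5 x 10^47 conformations, far more than any exhaustive search could visit. Real proteins fold in fractions of a second to minutes, so they clearly do not search at random, and prediction methods cannot either.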

The state of the art (as described here) is to predict about one-third of relatively short proteins more or less correctly, getting the general shape right. In the illustration, blue shows the simulation result and red shows the experimental data. Pretty close, isn't it?
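"Pretty close" can be made quantitative. One common measure is the root-mean-square deviation (RMSD) between corresponding atoms of the predicted and the experimental structure; the source does not say which measure was used for the illustration, so this is only an assumption. The sketch below also assumes the two structures have already been superimposed (real comparisons first find the best rotation and translation), and the coordinates are made-up values.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z) points.

    Assumes the structures are already superimposed and that point i in
    coords_a corresponds to point i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates for three atoms, in angstroms.
predicted = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.5, 0.0)]
experimental = [(0.0, 0.0, 0.0), (1.5, 0.0, 1.0), (3.0, 0.5, 1.0)]
print(f"RMSD = {rmsd(predicted, experimental):.3f} angstroms")
```

A smaller value means the blue and red traces overlap more closely; identical structures give an RMSD of zero.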

As said above, the next step after solving the folding problem will be understanding how proteins work in general: predicting the interactions and characteristics of a protein from its 3D shape.


The Challenges

  • The 3D structure of a protein is not a fixed thing. Many proteins are transported across membranes (for example, into the ER), and to do so they unwind, pass through a narrow channel like a thread through the eye of a needle, and refold on the other side. Signal sequences at the C- or N-terminus are part of what determines the target compartment in the cell or export to an extracellular location.
  • At these scales and at normal functional temperature, the atoms in the molecules are in constant motion. Imagine the whole molecule vibrating, as if it were a model made of wire, weights and springs placed on a washing machine on spin cycle, and you get a somewhat better picture of reality. The ends of the chain may whip around in the environment, and this may be crucial to the function.
  • The amino acid sequence of a protein is only a part of its actual form. This is the part that can be deduced from the DNA sequence, not allowing for post-translational modifications. Many proteins have prosthetic groups, such as heme in hemoglobin, which are central to their function. Many proteins carry oligosaccharides on their surface, which radically change how they act.
  • We already know the 3D structure of many proteins from X-ray crystallography experiments. We also know what these well-studied proteins do. Understanding how they do it, in particular the details of how the reaction is catalysed, can still be a huge challenge. Proteins have 'channels' or 'pockets' that allow chosen molecules in and exclude others. How exactly the shape of a channel or pocket makes this possible, and how the electron interactions make the reactions so specific, are anything but easy to understand. It is a whole other ball game trying to come up with even small changes that achieve specific outcomes.
This is a factual article as opposed to fiction or scenario. It describes the current state of the field and explains expected future developments without speculation or fantasy.