Big Data’s Direct Coupling Analysis Reveals Clues About Molecular Protease Machines

By LabMedica International staff writers
Posted on 08 Apr 2014
Researchers have merged genetic and structural data in a Big Data attempt to solve one of the most fascinating mysteries in biology: how proteins perform the regulatory processes in cells upon which all life depends.

The daily life of one kind of molecular motor involves taking in damaged proteins and cutting them into harmless peptides ready for disposal. Without these garbage disposals, the Escherichia coli bacteria they serve would die. Biophysicists from Rice University (Houston, TX, USA) used a protease called an FtsH-AAA hexameric peptidase as a model to test calculations that combine genetic and structural data.

Image: Co-evolved mutations in genetic sequences that code proteins show researchers how a protein is likely to fold and what forms it may take as it carries out its function. Scientists from Rice University used the technique called direct coupling analysis in combination with structure-based models to find a previously hidden conformation of a molecular motor responsible for degrading misfolded proteins in bacteria (Photo courtesy of Faruck Morcos/Rice University).

Dr. José Onuchic, a biological physicist, and postdoctoral researchers Drs. Biman Jana and Faruck Morcos published their new findings in March 2014 in the Royal Society of Chemistry journal Physical Chemistry Chemical Physics. The study is the first successful attempt to use their computational technique to describe the complex activity of a large molecular machine formed by proteins. Ultimately, understanding these machines will help researchers design drugs to treat diseases including cancer, the focus of Rice’s Center for Theoretical Biological Physics.

“Structural techniques like X-ray crystallography and nuclear magnetic resonance have worked quite well to help us understand how smaller proteins function,” Dr. Onuchic stated. X-rays only take snapshots of constantly moving proteins, he said, “but functional proteins, big protein complexes and molecular machines have multiple conformations. Computational models are also useful, but to understand the full dynamics of these large proteins, where a lot of the interesting biology takes place, we have to supplement them with more information.”

That information comes from direct coupling analysis (DCA), a statistical tool developed by Drs. Morcos and Onuchic with colleagues at the University of California, San Diego (UCSD; USA), and the Pierre and Marie Curie University (Paris, France). DCA looks at the genetic roots of proteins to see how amino acids—the “beads” in the unfolded protein strands—co-evolved to influence the way a protein folds. Each bead carries an inherent energy that contributes to the strand’s unique energy topography, which decrees how it folds into its functional state.
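The idea of co-evolving positions can be illustrated with a much simpler statistic than DCA itself. The sketch below scores every pair of columns in a small, made-up sequence alignment by mutual information, which is high when two positions tend to change together. The alignment, the function name, and the choice of mutual information are assumptions for illustration only; DCA goes further by fitting a global statistical model to the whole alignment.

```python
from collections import Counter
from itertools import combinations
from math import log2


def column_mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j of an alignment."""
    n = len(alignment)
    count_i = Counter(seq[i] for seq in alignment)
    count_j = Counter(seq[j] for seq in alignment)
    count_ij = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (a, b), c in count_ij.items():
        p_ab = c / n
        mi += p_ab * log2(p_ab / ((count_i[a] / n) * (count_j[b] / n)))
    return mi


# Hypothetical toy alignment (not real FtsH sequences).
# Positions 0 and 3 always change together: A pairs with D, E pairs with K.
toy_alignment = [
    "AKLD",
    "AKVD",
    "ARLD",
    "EKLK",
    "ERVK",
    "EKVK",
]

# Score every pair of positions and pick the most strongly covarying one.
scores = {
    (i, j): column_mutual_information(toy_alignment, i, j)
    for i, j in combinations(range(4), 2)
}
top_pair = max(scores, key=scores.get)
print(top_pair, round(scores[top_pair], 3))
```

In this toy case the highest-scoring pair is the one built to covary. Real alignments contain thousands of sequences, and raw covariation scores of this kind also pick up indirect correlations.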

Proteins, even after they fold, are in constant motion, acting as catalysts for countless bodily functions. They can combine into larger molecular machines that grab other molecules, “walk” their payloads within a cell or cause muscles to contract. One such biomachine is FtsH (filamentous temperature-sensitive H), a membrane-bound molecule in E. coli made of six protein copies that form two connected hexagonal rings. The molecule attracts and degrades misfolded proteins and other cellular waste, pulling them in through one ring, which closes like a camera shutter and traps the proteins. They are sliced apart as they leave through the other ring.

Using molecular simulations built on structure-based models, together with probable couplings that DCA identified in the genetic sequences of the proteins, the researchers found evidence supporting the hypothesis of a “paddling” process in the molecule, which Dr. Morcos described as a collapse of the two rings once waste finds its way inside. “First the ring pore closes to grab the protein; then the molecule flattens,” he said. “Then when the motor is flat, the rings open to release the peptides and the molecule expands again to restart the cycle.”

Key to the success of DCA is the understanding that correlated amino acid mutations point to contacts that co-evolve for specific reasons. The contact maps generated by DCA can reveal previously unknown features that help model transitions between functional states, such as the paddling in FtsH, Dr. Onuchic said. “We can look at the evolutionary tree of these proteins and see which pairs of amino acids changed together. We then assume these are contacts,” he said. “Through DCA, Faruck uses a lot of physics to understand when two amino acids can act directly or indirectly, and separate the two. Then we predict how coupled they are, and the higher the probability, the more evidence that these are real contacts.”
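The distinction between direct and indirect coupling can be shown with a simple numerical analogy. In the sketch below, three hypothetical sites form a chain: site A influences site B, and site B influences site C, so A and C are correlated only through B. Partial correlation removes the part explained by B and leaves a value near zero for the A-C pair. This is a continuous-variable analogy under assumed synthetic data, not the researchers' DCA algorithm, which works on amino acid sequences with a different statistical model.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def partial_correlation(r_xz, r_xy, r_yz):
    """Correlation between x and z after removing what y explains of both."""
    return (r_xz - r_xy * r_yz) / sqrt((1 - r_xy ** 2) * (1 - r_yz ** 2))


# Synthetic data (an assumption for illustration): A -> B -> C chain.
rng = random.Random(0)
n = 5000
site_a = [rng.gauss(0, 1) for _ in range(n)]
site_b = [a + rng.gauss(0, 1) for a in site_a]  # B depends directly on A
site_c = [b + rng.gauss(0, 1) for b in site_b]  # C depends directly on B only

r_ab = pearson(site_a, site_b)
r_bc = pearson(site_b, site_c)
r_ac = pearson(site_a, site_c)  # sizable, but only through B

# After conditioning on B, the apparent A-C coupling largely disappears.
direct_ac = partial_correlation(r_ac, r_ab, r_bc)
print(f"raw A-C correlation: {r_ac:.2f}, direct A-C coupling: {direct_ac:.2f}")
```

The raw A-C correlation looks like evidence of a contact, while the conditioned value suggests there is none, which is the kind of separation Dr. Onuchic describes.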

DCA would do little without the deluge of data that has become available since scanning entire genomes became possible, and even routine, in recent years. Recent advances in the 100-year-old technique of crystallography are making better structure-based models available as well. “Even if the mathematical framework was ready and we had crystallographic data for this motor protein in the 1990s, there weren’t enough sequences available until the 2000s,” Dr. Morcos said. “Now we have all the pieces converging.”

Dr. Morcos noted that better characterizing essential motor proteins in bacteria will be important as researchers begin to apply DCA to improve human healthcare. “For us, the most exciting part is that we’re now able to tackle really big systems,” he said.



