Human Proteome Catalog Created for Speeding Research and Diagnostic Development

By LabMedica International staff writers
Posted on 30 Jun 2014
An international team of researchers recently created an initial catalog of the human “proteome,” in an effort to provide a protein equivalent of the Human Genome Project. Using 30 different human tissues in total, the scientists identified proteins encoded by 17,294 genes, which is approximately 84% of all of the genes in the human genome predicted to encode proteins.

The project was described on May 28, 2014, in the journal Nature. The scientists also reported the identification of 193 novel proteins that came from regions of the genome not expected to code for proteins, suggesting that the human genome is more complicated than previously believed. The cataloging project, led by researchers at the Johns Hopkins University (Baltimore, MD, USA) and the Institute of Bioinformatics (Bangalore, India; www.ibioinformatics.org), should provide a vital resource for biologic research and medical diagnostics, according to the team’s leaders.

“You can think of the human body as a huge library where each protein is a book,” said Akhilesh Pandey, MD, PhD, a professor at the McKusick-Nathans Institute of Genetic Medicine and of biological chemistry, pathology, and oncology at the Johns Hopkins University and the founder and director of the Institute of Bioinformatics. “The difficulty is that we don’t have a comprehensive catalog that gives us the titles of the available books and where to find them. We think we now have a good first draft of that comprehensive catalog.”

Whereas genes determine many of the characteristics of an organism, they do so by providing instructions for creating proteins, the building blocks and taskmasters of cells, and therefore of tissues and organs. For this reason, many investigators believe a catalog of human proteins—and their location within the body—to be even more informative and useful than the catalog of genes in the human genome.

Examining proteins is far more technically challenging than studying genes, Dr. Pandey noted, because the structures and functions of proteins are complex and varied. Furthermore, a mere list of existing proteins would not be very helpful without accompanying data about where in the body those proteins are found. For practical reasons, most protein studies to date have focused on individual tissues, often in the context of specific diseases, he added.

To achieve a more comprehensive survey of the proteome, the researchers started by taking samples of 30 tissues, extracting their proteins and using enzymes like chemical scissors to cut them into smaller pieces, called peptides. They then ran the peptides through a series of instruments designed to figure out their identity and measure their relative abundance. “By generating a comprehensive human protein dataset, we have made it easier for other researchers to identify the proteins in their experiments,” said Dr. Pandey. “We believe our data will become the gold standard in the field, especially because they were all generated using uniform methods and analysis, and state-of-the-art machines.”
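For readers who want a concrete picture of the "chemical scissors" step, the following is a minimal sketch of how a protein sequence can be cut into peptides in software. It is not the team's actual pipeline. The article does not name the enzyme, so the cutting rule used here (cut after lysine K or arginine R, except when the next residue is proline P, a common simplification of how trypsin behaves) is an assumption, and the function name, the minimum-length filter, and the toy sequence are all hypothetical.

```python
import re
from typing import List


def digest_in_silico(protein: str, min_length: int = 6) -> List[str]:
    """Cut a protein sequence into peptides using a simplified rule.

    Assumption: the enzyme behaves like trypsin, cutting after K or R
    unless the next residue is P. Real digestion is less tidy.
    """
    # Zero-width split: the cut falls after K/R, never before P.
    pieces = re.split(r"(?<=[KR])(?!P)", protein)
    # Drop empty strings and very short peptides (hypothetical filter).
    return [p for p in pieces if len(p) >= min_length]


# Toy sequence, not a real protein.
peptides = digest_in_silico("AAAAAAKCCCCCCRPDDDDKEEK", min_length=4)
print(peptides)
```

In this toy case the arginine followed by proline is left uncut, and the short trailing fragment is discarded by the length filter. In a real workflow, measured peptides would then be matched back to the proteins they came from.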

Among the proteins whose data patterns have been characterized for the first time are many that were never predicted to exist. The researchers’ most unexpected finding was that 193 of the proteins they identified could be traced back to regions of DNA that had been predicted to be noncoding. “This was the most exciting part of this study, finding further complexities in the genome,” remarked Dr. Pandey. “The fact that 193 of the proteins came from DNA sequences predicted to be noncoding means that we don’t fully understand how cells read DNA, because clearly those sequences do code for proteins.”

Dr. Pandey believes that the human proteome is so extensive and complex that any catalog of it will never be fully complete, but this research provides a solid foundation that others can build upon.

Related Links:

Johns Hopkins University
Institute of Bioinformatics


