Robust Method Developed for Microbiome Analysis
By LabMedica International staff writers | Posted on 16 Mar 2016

Image: Clostridium difficile colonies after 48 hours’ growth on blood agar plate. By competitive exclusion, healthy gut flora help limit growth of pathogenic yeasts and bacteria. Overgrowth of C. difficile in the gut can cause pseudomembranous colitis, and is the most frequently identified cause of antibiotic-associated diarrhea. (Image courtesy of Dr. Holdeman, Centers for Disease Control & Prevention (CDC; image ID #3647), and Wikimedia.)
Scientists have developed a technique for analyzing genome sequence data that identifies differences between the metagenomes of bacterial communities more efficiently and accurately, which could help in studying, diagnosing, and treating many human diseases. In a new study, the method was successfully tested on intestinal microbiota.
A team led by scientists from the Moscow Institute of Physics and Technology (Moscow, Russia) has proposed a new method for comparing metagenomes, the combined DNA sequences of all organisms in a biological sample. The method compares samples more effectively and can be easily embedded into a metagenome data-analysis workflow.
Bacterial cells in the human body, most of which are located in the gut, hold a special place in metagenomics and are a focus of projects such as the Human Microbiome Project. Microbiota composition is sensitive to processes occurring in the body. Thus, comparing samples from patients with samples from people with a healthy intestinal metagenome will likely lead to methods that can evaluate the risk of various diseases, including diabetes and inflammatory bowel disease.
The traditional approach to metagenome analysis is to compare samples on the basis of their taxonomic composition: the percentage of each microbial species found. To determine a sample's composition, its genetic sequences are compared with a “reference set” database of known bacterial genomes. However, this approach has several disadvantages. Firstly, the reference genomes are often inaccurate: assembling a reference genome is a computationally complex and time-consuming task, especially for difficult-to-cultivate species, and the genomes of strains isolated in the laboratory can carry genes significantly different from those of the same species living in a natural environment. Secondly, the reference set generally does not cover all organisms (e.g., viruses). As a result, the part of the sample's sequence data that does not match the reference set is simply ignored during the analysis, even though it can be quite significant.
The new method is based on a comparison of “k-mer” frequencies, which requires neither a reference set nor any prior information about the organisms being examined. All sequences in the sample are analyzed, so no part of the data is discarded. Each genomic sequence is represented as the set of all nucleotide "words" of a specified length k, called k-mers, that occur in it. As each genome sequence is unique, the sets of such "words" differ between individual organisms. Thus, the set of all k-mers for a metagenome can be viewed as a combination of the k-mer sets of its constituent organisms. This makes it possible to assess differences in bacterial composition when comparing samples.
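A minimal Python sketch of this idea is given here for illustration. The article does not specify the value of k, the normalization, or the distance measure used in the study, so the function names, the toy reads, and the choice of Bray-Curtis dissimilarity below are all assumptions rather than the authors' implementation.

```python
from collections import Counter


def kmer_frequencies(reads, k):
    """Return the normalized frequency of every k-mer found in the reads.

    Hypothetical helper: k-mers containing characters other than
    A, C, G, T are skipped (an assumption, not taken from the study).
    """
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def bray_curtis(p, q):
    """Bray-Curtis dissimilarity between two k-mer frequency profiles.

    0 means identical profiles, 1 means no k-mers in common.
    The choice of this measure is an assumption made for illustration.
    """
    keys = set(p) | set(q)
    num = sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in keys)
    den = sum(p.get(x, 0.0) + q.get(x, 0.0) for x in keys)
    return num / den if den else 0.0


# Toy "metagenomes": each is just a list of short reads (made-up data).
sample_a = ["ACGTACGTAC", "GGGTTTACGT"]
sample_b = ["ACGTACGTAC", "GGGTTTACGT"]
sample_c = ["TTTTTTTTTT", "CCCCCCCCCC"]

k = 3
d_ab = bray_curtis(kmer_frequencies(sample_a, k), kmer_frequencies(sample_b, k))
d_ac = bray_curtis(kmer_frequencies(sample_a, k), kmer_frequencies(sample_c, k))
print(f"A vs B: {d_ab:.3f}")
print(f"A vs C: {d_ac:.3f}")
```

No reference genomes appear anywhere in the computation: two samples are compared directly through the k-mers their reads contain, which is why sequences from unknown organisms still contribute to the result.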
To test the effectiveness of the k-mer technique compared to traditional approaches, two sets of metagenome data were used: a set of real data and a set of artificially generated data. Artificial data (created from genomes mixed in proportions known beforehand) is convenient for testing the method because the composition is precisely known and the result can be assessed by comparing it with an a priori correct value.
For the real data set, intestinal metagenomes from residents of the United States and China were used. Intestinal bacterial communities differ significantly between populations, and a good algorithm should find exactly those indicators that reveal the difference in composition. Therefore, the criterion for assessing the effectiveness of the new method was the extent to which the metagenomes could be distinguished, that is, how much the Chinese metagenomes differed overall from the American ones.
The k-mer comparison method performed better on both data types than traditional mapping to a reference set. In addition, on the real data, a mismatch between the results of the k-mer and traditional approaches allowed the researchers to detect another important component of the intestinal metagenome: the bacteriophage crAssphage, which had escaped the notice of researchers using the traditional method.
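To make the idea of artificial data concrete, the following Python sketch draws reads from a few genomes in proportions fixed beforehand and then checks the observed mix against the known one. It is only an illustration: the random "genomes", the species names, the read length, and the proportions are invented here, and the article does not describe how the study's simulated data were actually generated.

```python
import random


def random_genome(length, rng):
    """Generate a random nucleotide string as a stand-in for a real genome."""
    return "".join(rng.choice("ACGT") for _ in range(length))


def simulate_metagenome(genomes, proportions, n_reads, read_len, rng):
    """Sample reads from the genomes according to known read proportions.

    Returns the reads and, for each read, the name of its source genome,
    so the true composition is available for later comparison.
    """
    names = list(genomes)
    weights = [proportions[name] for name in names]
    reads, origins = [], []
    for _ in range(n_reads):
        name = rng.choices(names, weights=weights, k=1)[0]
        genome = genomes[name]
        start = rng.randrange(len(genome) - read_len + 1)
        reads.append(genome[start:start + read_len])
        origins.append(name)
    return reads, origins


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical two-species community with an 80/20 mix of reads.
genomes = {
    "species_A": random_genome(5000, rng),
    "species_B": random_genome(5000, rng),
}
proportions = {"species_A": 0.8, "species_B": 0.2}

reads, origins = simulate_metagenome(genomes, proportions, 2000, 100, rng)
observed = origins.count("species_A") / len(origins)
print(f"expected share of species_A: 0.80, observed: {observed:.2f}")
```

Because the source of every read is recorded, any estimate of composition or of between-sample distance computed from such reads can be checked against the known ground truth.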
"Interestingly, the genes can be viewed not only as segments of DNA with proteins encoded in them, but also as information in general. It is this information distinction that has allowed us to identify new segments of DNA not described in the catalog of known genes. It [will be] interesting to see how this approach will be used by other research groups," said coauthor Dmitry Alexeev.
The study, by Dubinkina VB et al., was published January 16, 2016, in the journal BMC Bioinformatics.
Related Links:
Moscow Institute of Physics and Technology