Robust Method Developed for Microbiome Analysis
By LabMedica International staff writers Posted on 16 Mar 2016

Image: Clostridium difficile colonies after 48 hours’ growth on blood agar plate. By competitive exclusion, healthy gut flora help limit growth of pathogenic yeasts and bacteria. Overgrowth of C. difficile in the gut can cause pseudomembranous colitis, and is the most frequently identified cause of antibiotic-associated diarrhea. (Image courtesy of Dr. Holdeman, Centers for Disease Control & Prevention (CDC; image ID #3647), and Wikimedia.)
Scientists have developed a technique for analyzing genome sequence data that identifies differences between the metagenomes of bacterial communities more efficiently and accurately, which could help researchers study, diagnose, and treat many human diseases. In a new study, the method was successfully tested on intestinal microbiota.
A team led by scientists from the Moscow Institute of Physics and Technology (Moscow, Russia) has proposed a new method for comparing metagenomes, the combined DNA sequences of all organisms in a biological sample. The method makes comparing samples more effective and can be easily embedded into a metagenome data-analysis process.
Bacterial cells in the human body, most of which are located in the gut, hold a special place in metagenomics, including in the "Human Microbiome Project." Microbiota composition is sensitive to processes occurring in the body. Thus, comparing samples from patients with samples from people with a healthy intestinal metagenome will likely lead to methods that can evaluate the risk of various diseases, including diabetes and inflammatory bowel disease.
The traditional approach to metagenome analysis is to compare samples on the basis of their taxonomic composition: the percentage of each microbial species found. To determine a sample's composition, its genetic sequences are compared with a “reference set” database of known bacterial genomes. However, this approach has several disadvantages. Firstly, the reference genomes are often inaccurate, since assembling a reference genome is a computationally complex and time-consuming task, especially for difficult-to-cultivate species, and the genomes of species isolated in the laboratory can carry genes significantly different from those of the same species living in a natural environment. Secondly, the reference set generally does not include all organisms (e.g., viruses). Therefore, the part of the sample's sequence data that does not match the reference set is simply not taken into account during the analysis, even though it can be quite significant.
The new method is based on a comparison of “k-mer” frequencies, which requires neither a reference set nor any prior information about the organisms being examined. All sequences in the sample are analyzed, so no part of the data is discarded. Each genomic sequence is represented as the set of all occurrences of nucleotide "words" of a specified length k, called k-mers. As each genome sequence is unique, the sets of such words differ between individual organisms. Thus, the set of all k-mers for a metagenome can be viewed as the combination of the k-mer sets of its constituent organisms. This enables assessment of the differences in bacterial composition when comparing samples.
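The idea of comparing samples by k-mer frequencies can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy reads, the choice of k = 3, and the use of Bray-Curtis dissimilarity are all assumptions made for illustration, since the article does not name the measure used to compare k-mer frequencies.

```python
from collections import Counter


def kmer_spectrum(reads, k):
    """Count every k-mer across all reads and return relative frequencies."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Skip words containing ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def bray_curtis(p, q):
    """Bray-Curtis dissimilarity between two frequency dictionaries.

    Returns 0.0 for identical spectra and 1.0 for spectra with no shared k-mers.
    This measure is an assumption of the sketch, not taken from the article.
    """
    keys = set(p) | set(q)
    num = sum(abs(p.get(key, 0.0) - q.get(key, 0.0)) for key in keys)
    den = sum(p.get(key, 0.0) + q.get(key, 0.0) for key in keys)
    return num / den if den else 0.0


# Toy "samples": short made-up reads, not real metagenome data.
sample_a = ["ACGTACGTAC", "GGGTTTACGT"]
sample_b = ["ACGTACGTAC", "GGGTTTACGA"]
sample_c = ["CCCCCCCCCC"]

spec_a = kmer_spectrum(sample_a, k=3)
spec_b = kmer_spectrum(sample_b, k=3)
spec_c = kmer_spectrum(sample_c, k=3)

print(bray_curtis(spec_a, spec_b))  # small: the samples share most k-mers
print(bray_curtis(spec_a, spec_c))  # maximal: the samples share no k-mers
```

No reference genomes appear anywhere in the sketch: every read contributes to the spectrum, which is the property the article highlights. Real metagenomes contain millions of reads, so a practical pipeline would use a dedicated k-mer counter rather than a Python dictionary.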
To test the effectiveness of the k-mer technique against traditional approaches, two sets of metagenome data were used: a set of real data and a set of artificially generated data. Artificial data (created from genomes mixed in proportions known beforehand) are convenient for testing the method because the composition is precisely known and the result can be checked against an a priori correct value.
The real-data set consisted of intestinal metagenomes from residents of the United States and China. Intestinal bacterial communities differ significantly between populations, and the algorithms are expected to find exactly those indicators that reveal the difference in composition. Therefore, the criterion for assessing the effectiveness of the new method was the extent to which the metagenomes could be distinguished, that is, how much the Chinese metagenomes differ overall from the American ones.
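One simple way to put a number on "how well two groups can be distinguished" is to compare dissimilarities between groups with those within groups. The sketch below is a hypothetical illustration only: the article does not say which statistic the researchers used, and the function name `group_separation` and the dissimilarity values are made up for the example.

```python
from itertools import combinations


def group_separation(dist, labels):
    """Mean between-group dissimilarity minus mean within-group dissimilarity.

    `dist` is a symmetric matrix of pairwise sample dissimilarities and
    `labels` gives the group of each sample. Larger values mean the groups
    are easier to tell apart. This score is an assumption of the sketch.
    """
    within, between = [], []
    for i, j in combinations(range(len(labels)), 2):
        (within if labels[i] == labels[j] else between).append(dist[i][j])
    if not within or not between:
        raise ValueError("need at least one within-group and one between-group pair")
    return sum(between) / len(between) - sum(within) / len(within)


# Made-up pairwise dissimilarities for four samples (illustrative numbers only).
dist = [
    [0.0, 0.2, 0.7, 0.8],
    [0.2, 0.0, 0.6, 0.7],
    [0.7, 0.6, 0.0, 0.1],
    [0.8, 0.7, 0.1, 0.0],
]
labels = ["US", "US", "China", "China"]

score = group_separation(dist, labels)
print(round(score, 2))  # 0.55 for these toy numbers
```

Under this toy criterion, a method whose dissimilarity matrix gives a higher score separates the two populations more clearly than one giving a lower score.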
The k-mer comparison method showed better results on both data types than traditional mapping against a reference set. In addition, on the real data, a mismatch between the results of the k-mer and traditional approaches allowed the researchers to detect another important component of the intestinal metagenome: the bacteriophage crAssphage, which had escaped the notice of researchers using the traditional method.
"Interestingly, the genes can be viewed not only as segments of DNA with proteins encoded in them, but also as information in general. It is this information distinction that has allowed us to identify new segments of DNA not described in the catalog of known genes. It [will be] interesting to see how this approach will be used by other research groups," said coauthor Dmitry Alexeev.
The study, by Dubinkina VB et al., was published January 16, 2016, in the journal BMC Bioinformatics.
Related Links:
Moscow Institute of Physics and Technology