New Tools Enable Rapid Analysis of Coronavirus Sequences and Tracking of SARS-CoV-2 Variants

By LabMedica International staff writers
Posted on 11 May 2021
A new tool allows researchers to quickly see how a new viral sequence is related to all other variants of SARS-CoV-2, crucial information for tracking transmission dynamics.

The sheer number of coronavirus genome sequences and their rapid accumulation make it hard to place new sequences on a “family tree” showing how they are all related. But researchers at the UC Santa Cruz Genomics Institute (Santa Cruz, CA, USA) have developed a new method that does this with unprecedented speed. Called Ultrafast Sample Placement on Existing Trees (UShER), this powerful tool identifies the relationships between a user’s newly sequenced viral genomes and all known SARS-CoV-2 virus genomes by adding them to an existing phylogenetic tree, a branching diagram like a family tree that shows how the virus has evolved in different lineages as it accumulates mutations.
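To make the idea of adding a sample to an existing tree concrete, here is a minimal sketch in Python. It is a toy illustration and not UShER's implementation. The tree, the node names, and the helper functions `genotype` and `place` are hypothetical, and the mutation labels are used only as example strings. The sketch assumes the new sample is attached at the node whose accumulated mutations differ least from its own, a simple parsimony-style criterion, and it ignores complications such as reversions and ties.

```python
from typing import Dict, FrozenSet, Optional, Tuple

# Hypothetical toy tree: node -> (parent, mutations on the branch leading to that node).
Tree = Dict[str, Tuple[Optional[str], FrozenSet[str]]]

TREE: Tree = {
    "root": (None, frozenset()),
    "A": ("root", frozenset({"C241T", "A23403G"})),
    "B": ("A", frozenset({"G28881A"})),
    "C": ("root", frozenset({"G11083T"})),
}


def genotype(node: Optional[str], tree: Tree) -> FrozenSet[str]:
    """Collect every mutation on the path from the root down to `node`."""
    muts = set()
    while node is not None:
        parent, branch = tree[node]
        muts |= branch
        node = parent
    return frozenset(muts)


def place(sample: FrozenSet[str], tree: Tree) -> Tuple[str, int]:
    """Return the node whose genotype differs least from the sample, plus that difference.

    The cost is the number of mutations present in one set but not the other,
    i.e. the extra changes needed if the sample were attached at that node.
    Ties are resolved in favor of the node listed first (an arbitrary choice).
    """
    best_node, best_cost = "", -1
    for node in tree:
        cost = len(sample ^ genotype(node, tree))
        if best_cost < 0 or cost < best_cost:
            best_node, best_cost = node, cost
    return best_node, best_cost


# A newly sequenced (made-up) sample: shares B's mutations and carries one extra.
new_sample = frozenset({"C241T", "A23403G", "G28881A", "C14408T"})
print(place(new_sample, TREE))  # ('B', 1)
```

In this toy case the sample lands next to node B with one additional mutation. A real tool must make the same kind of decision against a tree of more than a million sequences, which is where the speed of the placement method matters.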

Image: In this example of UShER results, displayed using Nextstrain, sequences representing a hypothetical outbreak are yellow, previously sampled sequences are blue, and branches are labeled by nucleotide mutations (Photo courtesy of UCSC Genomics Institute)

This kind of sequence analysis can be used to discover new strains of the virus as they emerge and track their evolution and transmission dynamics. It can also be used to identify links between individual cases of coronavirus infection and to trace chains of transmission, an approach known as genomic contact tracing. UShER and related data visualization tools are available to the research community through the UCSC SARS-CoV-2 Genome Browser, which also provides access to a wide range of data and results from ongoing scientific research on the virus, including new variants that are especially concerning.

Like all viruses, SARS-CoV-2 acquires mutations as it replicates and spreads. Most of these random variations in the genome sequence have no effect on the behavior of the virus, but researchers can still use them to identify different variants or strains of the virus, see how they are related, and determine if two samples are part of the same transmission chain. Viral genomics can reveal transmission chains not found through conventional contact tracing. This approach can help identify superspreader events, where one person transmitted the virus to many others, and it can also show that two cases from the same location are actually unrelated infections, not part of the same transmission chain, because the viral sequences differ too much.
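The comparison described above can be sketched in a few lines of Python. This is an illustration only: the sample data are invented, the function names `mutation_distance` and `possibly_linked` are hypothetical, and the cutoff of two differing mutations is an arbitrary assumption chosen for the example, not an epidemiological standard.

```python
from typing import Iterable


def mutation_distance(a: Iterable[str], b: Iterable[str]) -> int:
    """Count mutations found in one sample but not in the other."""
    return len(set(a) ^ set(b))


def possibly_linked(a: Iterable[str], b: Iterable[str], max_diff: int = 2) -> bool:
    """Flag two samples as candidates for the same transmission chain.

    `max_diff` is an illustrative threshold (an assumption), not a validated cutoff.
    """
    return mutation_distance(a, b) <= max_diff


# Invented example samples from the same location.
case_1 = {"C241T", "A23403G", "G28881A"}
case_2 = {"C241T", "A23403G", "G28881A", "C14408T"}  # one extra mutation
case_3 = {"G11083T", "C8782T", "T28144C"}            # shares nothing with case_1

print(mutation_distance(case_1, case_2), possibly_linked(case_1, case_2))  # 1 True
print(mutation_distance(case_1, case_3), possibly_linked(case_1, case_3))  # 6 False
```

Here the first two cases differ by a single mutation and are plausible links in one chain, while the third differs at six positions and looks like an unrelated introduction despite coming from the same place.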

“We are able to maintain a comprehensive phylogenetic tree of more than 1.2 million coronavirus sequences and update it with new sequences in real time. No other tool can handle trees of this size with a comparable efficiency,” said first author Yatish Turakhia, a postdoctoral scholar at the Genomics Institute. “This helps us keep track of all variants in circulation, including new variants that are emerging.”

“It’s an approach that is likely to be valuable moving forward, so we’re building the tools to enable people to do this in real time,” said Russ Corbett-Detig, assistant professor of biomolecular engineering at UC Santa Cruz. “If you want to know who transmitted the virus to whom, or where in the world a new sample may have come from, you need to take the samples from your community and project them onto the known phylogenetic tree of all the other SARS-CoV-2 genome sequences, and conventional phylogenetic methods just can’t do this in a reasonable amount of time.”


