
Cov2clusters: Novel method for producing stable genomic clusters of SARS-CoV-2 cases

In a recent study published on the preprint server medRxiv*, researchers present a novel method for producing stable genomic clustering of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) cases known as Cov2clusters.

This clustering tool utilizes sequence data collected over time to produce more stable clusters than other commonly used phylogenetic clustering methods. Moreover, their method is provided as an R package, thereby allowing for its use within research and public health community settings to investigate transmission dynamics of SARS-CoV-2.

Study: Cov2clusters: genomic clustering of SARS-CoV-2 sequences. Image Credit: Coffeemill / Shutterstock.com

Background

The rapid development of coronavirus disease 2019 (COVID-19) vaccines, in addition to the implementation of non-pharmaceutical/social distancing measures, has successfully alleviated the impact of the pandemic by reducing viral transmission, hospitalization, and mortality rates. Nevertheless, COVID-19 remains a worldwide concern due to the continued emergence of more transmissible and virulent SARS-CoV-2 variants of concern (VOCs), waning vaccine-induced antibodies, vaccine hesitancy, and unequal access to vaccines and therapeutics.

An increasing amount of SARS-CoV-2 whole genome sequence (WGS) data is being shared every day through global repositories, which allows almost real-time genomic comparison of the pathogen. These data can be utilized to develop novel and easy-to-implement tools that can identify clusters of linked cases aiding in the understanding of regional epidemiology and informing public health policies, such as implementing restrictions in certain settings with a high transmission risk.

The cumulative number (A) and lineage proportion (B) of SARS-CoV-2 sequences per week included in the study, coloured by lineage. Major lineages present in the data are annotated.

The utility of defining SARS-CoV-2 clusters

Genomically-linked cases with shared demography should be identified at a higher resolution than a shared lineage assignment or simply through contact tracing. Currently, the Pangolin system is used for assigning nomenclature to SARS-CoV-2 lineages; however, Pangolin has been dynamic through the pandemic and cannot provide sufficient resolution for epidemiological investigations.

Thus, the researchers of the current study recommend a system in which the clustering of sequences by genomic similarity is aided by epidemiological information. Such a system would provide the resolution and stability necessary for public health applications over the course of a dynamic pandemic.

To date, phylogenetic tree clustering methods have been applied to identify putative transmission clusters in SARS-CoV-2 based on genomic divergence. However, SARS-CoV-2 has spread rapidly while accumulating relatively little genetic diversity, and periods of lineage replacement by new VOCs have further reduced regional genetic diversity in the virus. As a result, clustering based solely on genetic variation may not be sufficient to identify meaningful clusters in SARS-CoV-2. Moreover, defining clusters using a fixed genetic distance threshold may cause sequences to change cluster designation over time as more sequences become available.
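One way a fixed distance threshold can reshuffle cluster designations is through "bridging" sequences. The following is a minimal Python sketch of that effect, not the procedure used by any specific tool: the toy sequences, the 1-SNP threshold, and the helper names `snp_distance` and `single_linkage` are all assumptions made for illustration.

```python
def snp_distance(a, b):
    """Count positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def single_linkage(seqs, max_snps):
    """Group sequences connected by chains of pairs within max_snps."""
    clusters = []
    for name in seqs:
        # Existing clusters containing at least one close-enough member.
        linked = [
            c for c in clusters
            if any(snp_distance(seqs[name], seqs[m]) <= max_snps for m in c)
        ]
        merged = {name}.union(*linked)
        clusters = [c for c in clusters if c not in linked] + [merged]
    return sorted(sorted(c) for c in clusters)


# Toy data (hypothetical): s1 and s2 differ by 2 SNPs.
week_1 = {"s1": "AAAA", "s2": "AATT"}
# A later sample sits 1 SNP from each and bridges the two clusters.
week_2 = {**week_1, "s3": "AAAT"}

print(single_linkage(week_1, max_snps=1))
print(single_linkage(week_2, max_snps=1))
```

In this toy example, s1 and s2 are reported as separate clusters in the first week and as a single cluster once s3 is added, even though neither original sequence changed.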

Improved resolution and sensitivity of Cov2clusters

To construct SARS-CoV-2 genomic clusters, the researchers estimate the pairwise probability that two cases cluster together using a logit regression model based on sequence divergence and sample collection dates. Cases are then linked when their pairwise probability meets a chosen threshold. The model is flexible enough to add further resolution to this clustering by incorporating epidemiological data, such as geography, contact data, and exposure events.
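The general idea of pairwise logit-based clustering can be sketched in Python as follows. This is an illustrative sketch only and does not reproduce the Cov2clusters R package: the coefficients, the toy sequences and dates, and the function names `link_probability` and `cluster` are hypothetical, and the sketch assumes that pairs are linked when their probability is at or above the threshold.

```python
import math
from datetime import date
from itertools import combinations

# Hypothetical coefficients for illustration; not the fitted study values.
INTERCEPT = 3.0
BETA_SNP = -0.8    # effect per SNP of pairwise genetic distance
BETA_DAYS = -0.1   # effect per day between collection dates


def link_probability(snp_distance, days_apart):
    """Inverse-logit probability that two cases belong together."""
    z = INTERCEPT + BETA_SNP * snp_distance + BETA_DAYS * days_apart
    return 1.0 / (1.0 + math.exp(-z))


def hamming(a, b):
    """Count positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def cluster(samples, threshold=0.8):
    """Link pairs with probability >= threshold; return connected groups."""
    parent = {s: s for s in samples}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(samples, 2):
        (seq_a, day_a), (seq_b, day_b) = samples[a], samples[b]
        p = link_probability(hamming(seq_a, seq_b), abs((day_a - day_b).days))
        if p >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for s in samples:
        groups.setdefault(find(s), set()).add(s)
    return sorted(sorted(g) for g in groups.values())


# Toy samples (hypothetical): id -> (aligned sequence, collection date).
SAMPLES = {
    "A": ("ACGTACGT", date(2021, 1, 1)),
    "B": ("ACGTACGA", date(2021, 1, 3)),
    "C": ("ACGTACGA", date(2021, 1, 20)),
    "D": ("TTGTACCA", date(2021, 1, 2)),
}

print(cluster(SAMPLES, threshold=0.8))
print(cluster(SAMPLES, threshold=0.7))
```

With these toy values, case C is genetically identical to B but was sampled 17 days later, so it joins the A-B cluster only at the looser 0.7 threshold. This shows how collection dates can separate cases that sequence data alone would group together.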

“In contrast to previous clustering approaches that often rely solely on phylogenetic inference (tree cluster reference), clustering isolates in this pairwise manner allows for greater cluster stability through time, as well as resolution by including epidemiological information without the need for time-consuming manual investigation.”

The team tested their novel method on SARS-CoV-2 sequence data collected during the first, second, and third waves of the COVID-19 pandemic in the British Columbia province of Canada from March 15, 2020, to August 13, 2021.

The results of the novel genomic clustering method were compared at three pairwise probability thresholds of 0.7, 0.8, and 0.9 for linking sequences to form clusters. To this end, the researchers found that their approach formed the most stable clusters at a probability threshold of 0.8 in the clinical data.

When compared to other phylogenetic clustering tools, the sensitivity of Cov2clusters at a 0.8 probability threshold was higher than that of both the TreeCluster ‘max_clade’ and ‘single_linkage’ methods. Furthermore, the clusters it produced were more stable as cases were added over time.

“This result has particular significance for the utility of this method in real-time public health surveillance, where sequencing datasets grow over time, and stability in cluster designations is beneficial for reporting and surveillance.”

*Important notice

medRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.

Journal reference:
  • Sobkowiak, B., Kamelian, K., Zlosnik, J. E. A., et al. (2022). Cov2clusters: genomic clustering of SARS-CoV-2 sequences. medRxiv. doi:10.1101/2022.03.10.22272213, https://www.medrxiv.org/content/10.1101/2022.03.10.22272213v2


Written by

Namita Mitra

After earning a bachelor’s degree in Veterinary Sciences and Animal Health (BVSc) in 2013, Namita went on to pursue a Master of Veterinary Microbiology from GADVASU, India. Her Master’s research on the molecular and histopathological diagnosis of avian oncogenic viruses in poultry brought her two national awards. In 2013, she was conferred a doctoral degree in Animal Biotechnology that concluded with her research findings on expression profiling of apoptosis-associated genes in canine mammary tumors. Right after her graduation, Namita worked as Assistant Professor of Animal Biotechnology and taught the courses of Animal Cell Culture, Animal Genetic Engineering, and Molecular Immunology.
