Update: Is the novel coronavirus evolving?

Genomic sequence analysis of SARS-CoV-2 is important for developing therapeutic strategies such as drugs and vaccines. Such studies help not only in containing the spread of the novel coronavirus, but also in identifying mutations that might be specific to a region or a country. The article ‘Is the novel coronavirus evolving?’ provided insights from studies carried out worldwide on genomic mutations of SARS-CoV-2. This piece discusses some key findings from genome sequencing studies done in India.

Virus genomes are usually analyzed using global resources such as GISAID (a primary source of genomic data for influenza viruses and SARS-CoV-2) and Nextstrain (an open-source project that compiles SARS-CoV-2 genome data and provides analyses and reports). To help track the global spread of the virus, a nomenclature of commonly agreed-upon labels is used to identify the different virus clades (groups of organisms that evolved from a common ancestor). This not only helps identify strains of the virus from different parts of the world, but also helps in tracing their lineage and ancestry.

A recent report based on 3636 genome sequences from 55 countries, drawn from GISAID, indicates that SARS-CoV-2 may have undergone 11 different mutations and can be categorized into the ancestral type O and 10 derived clades. Based on the nomenclature used by Nextstrain, the derived clades are identified as B, B1, B2, B4, A3, A6, A7, A1a, A2 and A2a. Among these, the A2a clade and its variant D614G have emerged as the dominant strain across the world.

In India, two institutes under the umbrella of the Council of Scientific and Industrial Research (CSIR), the Centre for Cellular and Molecular Biology (CCMB) and the Institute of Genomics and Integrative Biology (IGIB), are analyzing genomic sequences of the virus sourced from within the country. Their study analyzed 361 genome sequences of SARS-CoV-2 from the states of Gujarat, West Bengal, Maharashtra, Tamil Nadu, and Telangana (as of 25 May 2020). They reported a large number of dominant clusters (sequences grouped together based on resemblance) that did not fall under the existing 10 clades and were therefore categorized separately as Clade I/A3i, a strain unique to India (highest proportions in Tamil Nadu, Telangana, Maharashtra, and Delhi; lower proportions in Bihar, Karnataka, Uttar Pradesh, West Bengal, Gujarat and Madhya Pradesh). This indicates that SARS-CoV-2 may have mutated in India, with changes in the nucleocapsid protein and envelope genes, in contrast to the changes in the spike protein and membrane genes seen in the globally predominant A2a clade.

Another study, conducted by the National Centre for Disease Control (NCDC) and CSIR-IGIB, examined local transmission of the virus within the country. It revealed multiple introductions of SARS-CoV-2 from Europe, West Asia and East Asia. The study looked at 104 whole genome sequences from across the country and found a few novel variants of the virus, similar to those from East and South-East Asia. This indicates a common origin, with possible links to geo-climatic conditions. Interestingly, D614G, a globally prevalent mutation in the virus’s spike protein, appears in only half of the genomes sequenced.
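
For readers curious about what a label like D614G encodes: it means that the amino acid D (aspartate) at position 614 of the protein has been replaced by G (glycine). The short Python sketch below illustrates how such labels can be derived by comparing a sample against a reference, and how the share of samples carrying a given mutation can be counted. It is only an illustration under stated assumptions: the function names are hypothetical, the six-letter sequences are made-up toy data rather than real spike protein sequences, and the sequences are assumed to be already aligned and of equal length. It is not the method used in the studies described here.

```python
def substitutions(reference, sample):
    """List amino-acid substitutions as <reference residue><position><new residue>.

    Assumes both sequences are already aligned and of equal length
    (real pipelines perform an alignment step first).
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        f"{ref}{pos}{alt}"
        for pos, (ref, alt) in enumerate(zip(reference, sample), start=1)
        if ref != alt
    ]


def fraction_with(mutation, reference, samples):
    """Return the fraction of samples that carry the named substitution."""
    if not samples:
        return 0.0
    carriers = sum(1 for s in samples if mutation in substitutions(reference, s))
    return carriers / len(samples)


# Toy data (hypothetical): a 6-residue "reference" and four "samples".
toy_reference = "MKTDLV"
toy_samples = ["MKTGLV", "MKTDLV", "MKTGLV", "MKTDLV"]

print(substitutions(toy_reference, toy_samples[0]))
print(fraction_with("D4G", toy_reference, toy_samples))
```

In this toy case the substitution is named D4G because it sits at position 4 of the short sequence; in the real spike protein the same kind of change occurs at position 614, hence D614G.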

Studies suggest that the virus strains from India may be bifurcations of SARS-like coronaviruses derived from bats and pangolins, but distinct from MERS-CoV and other human coronaviruses known so far. A study carried out by the Indian Council of Medical Research-National Institute of Cholera and Enteric Diseases (ICMR-NICED) assessed 46 Indian SARS-CoV-2 genome sequences from a repository covering 10 states. The findings, based on observed variations in DNA homology (two segments of DNA with shared ancestry), indicate the emergence of novel co-evolving mutations, highlighting the rapid evolution of SARS-CoV-2.

Analyzing the pattern by which the Indian strains of SARS-CoV-2 have undergone novel co-mutations over time, with reference to the ancestral clade, can help us understand the resultant changes in the protein structure and function of the virus itself. The results of this study, along with other phylogenetic studies, have helped establish bats and pangolins as the proximal origin of SARS-CoV-2.

Studying specific mutations of SARS-CoV-2 gives us a better understanding of virus transmission and also helps in the development of drug treatments. In addition, genome sequences from specific geographical settings will enable epidemiologists to develop region-specific strategies.

Meena Kharatmal is a scientific officer at the Homi Bhabha Centre for Science Education (TIFR), and a PhD student working in Biology Education.