PSAS Bachelor Project Portal

Comparison of Human Isolates SARS-CoV-2 Sequences Across Different Geographical Regions

Hamoud Mohammed Al-Seaghi, Muna (2021) Comparison of Human Isolates SARS-CoV-2 Sequences Across Different Geographical Regions. [Project Paper] (Submitted)

Abstract

Background: Coronavirus disease 2019 (COVID-19) is a highly pathogenic disease caused by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2). As of 11 May 2021, 204 million cases and 4.31 million deaths had been reported worldwide. SARS-CoV-2 is an emerging virus that is constantly evolving, and new strains continue to be discovered. Whole-genome and spike protein sequence analyses are essential for understanding SARS-CoV-2 evolution, the genetic diversity of circulating strains, transmission patterns, and relationships between infected individuals. Therefore, phylogenies based on the whole genome and spike protein sequences were inferred.

Objective: This study aims to compare the SARS-CoV-2 whole genome and spike protein sequences of human isolates between and within geographical regions (Asia, Africa, Europe, North America, Oceania, and South America) and to identify their lineages using the Phylogenetic Assignment of Named Global Outbreak Lineages (PANGOLIN) tool.

Methodology: The SARS-CoV-2 whole genome and spike protein sequences were retrieved from the National Center for Biotechnology Information (NCBI) database. The retrieved sequences were aligned against a reference sequence for the whole genome (NC_045512.2) and the spike protein (YP_009724390) using the Multiple Alignment using Fast Fourier Transform (MAFFT) program. MAFFT was then used to construct phylogenetic trees from the resulting multiple alignments. Molecular Evolutionary Genetics Analysis X (MEGA X) was used to build the consensus sequence tree. Finally, the PANGOLIN software was used to assign lineages.

Results and Discussion: A total of 67,521 whole genomes and 11,198 spike protein sequences from 76 countries were used to construct the trees. The phylogenetic trees showed the divergence of SARS-CoV-2 into two monophyletic groups: Clade A represents the earliest divergence events, while Clade B comprises more recently diverged isolates. In the multifurcating consensus tree, consensus sequences from several geographical regions clustered and intermingled in a single clade irrespective of their geographic origin (whether from the same continent or neighbouring countries), indicating co-circulation of the same strains in different countries. Furthermore, the lineage assignments correlate with the number of cases on each continent: regions with a high percentage of lineage B.1.1.7 (Alpha variant), such as Europe and North America, have higher infection rates than Australia, whose isolates belong to lineage D.2.

Conclusion: Findings from this study provide better insight into the global transmission of SARS-CoV-2 and support the need for global monitoring of the virus's genetic diversity to predict its spread.
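The abstract reports lineage percentages by region but not how they were tallied. The sketch below shows one way such a summary could be computed once every sequence has a region label and an assigned lineage. The function name `lineage_shares`, the (region, lineage) input format, and the toy records are assumptions made for illustration; they are not taken from the study or from the PANGOLIN output format.

```python
from collections import Counter, defaultdict


def lineage_shares(records):
    """Return {region: {lineage: percentage}} from (region, lineage) pairs.

    Hypothetical helper: the input format is an assumption, not the
    study's actual data layout.
    """
    # Count how often each lineage occurs within each region.
    counts = defaultdict(Counter)
    for region, lineage in records:
        counts[region][lineage] += 1

    # Convert the raw counts to percentages of each region's total.
    shares = {}
    for region, tally in counts.items():
        total = sum(tally.values())
        shares[region] = {lin: 100.0 * n / total for lin, n in tally.items()}
    return shares


# Toy records invented for illustration only (NOT data from the study).
toy_records = [
    ("Europe", "B.1.1.7"),
    ("Europe", "B.1.1.7"),
    ("Europe", "B.1"),
    ("Europe", "B.1.1.7"),
    ("Oceania", "D.2"),
    ("Oceania", "D.2"),
]

shares = lineage_shares(toy_records)
print(shares["Europe"]["B.1.1.7"])  # 75.0
```

Percentages are computed within each region, so regions with very different sample sizes can still be compared on the same scale.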

Item Type: Project Paper
Faculty: Faculty of Medicine and Health Sciences
Depositing User: Ms. Nor Safa'aton Saidin
Date Deposited: 13 Feb 2023 00:49
Last Modified: 21 Jul 2023 01:33
URI: http://psaspb.upm.edu.my/id/eprint/747
