EuropePMC; 2022.
Preprint in English | EuropePMC | ID: ppcovidwho-336833


ABSTRACT HostSeq was launched in April 2020 as a national initiative to integrate whole genome sequencing data from 10,000 Canadians infected with SARS-CoV-2 with clinical information related to their disease experience. The mandate of HostSeq is to support the Canadian and international research communities in their efforts to understand the risk factors for disease and associated health outcomes, and to support the development of interventions such as vaccines and therapeutics. HostSeq is a collaboration among 13 independent epidemiological studies of SARS-CoV-2 across five provinces in Canada. Aggregated data collected by HostSeq are made available to the public through two data portals: a phenotype portal showing summaries of major variables and their distributions, and a variant search portal enabling queries in a genomic region. Individual-level data are available to the global research community through a Data Access Agreement and Data Access Compliance Office approval. Here we provide an overview of the collective project design along with summary-level information for HostSeq. We highlight several statistical considerations for researchers using the HostSeq platform regarding data aggregation, sampling mechanism, covariate adjustment, and X chromosome analysis. In addition to serving as a rich data source, the diversity of study designs, sample sizes, and research objectives among the participating studies provides unique opportunities for the research community.

Molecular Genetics and Metabolism ; 132:S258-S259, 2021.
Article in English | EMBASE | ID: covidwho-1735098


Background: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a novel virus that causes coronavirus disease 2019 (COVID-19). High-throughput sequencing technologies such as whole genome sequencing (WGS) and sequencing of viral genome DNA are being implemented to identify and report on genetic factors that may influence variability in symptom severity and immune response among patients infected by SARS-CoV-2. Genome sequencing has been useful for clinical diagnostic purposes, and can reveal other useful information such as disease risk factors that might lead to disease prevention or patient management strategies. Using WGS and bioinformatics software tools, we describe a novel pipeline for the analysis of medically relevant genetic results and other findings identified in COVID-19 positive individuals, and the generation of a genome report that can effectively communicate these results to patients and their physicians. Study design: Enrollment will include up to 1500 patients with a positive COVID-19 nasopharyngeal swab. Blood samples will be collected at baseline, 1 month, 6 months, and 1 year after diagnosis. Antibody isotype (IgG, IgA, and IgM), titers, and viral neutralization will be analyzed. DNA will be isolated from blood lymphocytes and host genomes will be sequenced. Whole genomes will be assessed using ACMG criteria for the interpretation of pathogenic sequence variation using in-house and third-party software tools, and publicly available disease and control databases. Comprehensive gene panels will be implemented to allow patients to receive clinically significant findings, including risk factor and carrier status, from multiple categories of potential genetic conditions including blood and immunology, endocrine, metabolic/mitochondrial, musculoskeletal, hearing loss, neurology, cardiology, ophthalmology, renal, skin, and gastrointestinal disorders.
Common disease risk will be assessed using polygenic risk scores calculated for 6 diseases (atrial fibrillation, coronary artery disease, type 2 diabetes, prostate cancer, colorectal cancer, breast cancer). Pharmacogenomic gene variants that alter metabolizer phenotype and drug response in individuals will be reported, in addition to patient HLA type. Genomic predictions of ABO and Rh blood types will be summarized and reported. Large-scale continental ancestry estimation will be performed using publicly available reference populations. Finally, using viral genome DNA sequencing, the SARS-CoV-2 viral lineage will be identified and reported. An appointment with the study genetic counsellor will be scheduled to discuss results identified in the genome report and manage appropriate clinical referrals if necessary. Serology results will be reported. Regression models will examine associations between antibody response (titer, antigen target, viral neutralization ability), physiological response (biochemical, hematological and clinical characteristics), patient outcomes, viral lineage and genomic results. Significance: This study will link clinically relevant genomic results, in addition to other biological and serological characteristics, to potential factors that contribute to variability in SARS-CoV-2 outcomes. Results will be shared with family physicians for clinical follow-up. This study will establish an efficient workflow using high-throughput genomic sequencing technology coupled with emerging bioinformatics platforms for the generation of comprehensive genome reports to aid in COVID-19 patient management and follow-up.
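
The abstract does not say how its polygenic risk scores are computed. As a rough illustration only, a polygenic risk score is commonly formed as a weighted sum of effect-allele dosages, and the sketch below assumes that additive form. The function name, variant identifiers, weights, and dosages are all hypothetical and are not taken from the study.

```python
from typing import Mapping


def polygenic_risk_score(
    dosages: Mapping[str, float],
    weights: Mapping[str, float],
) -> float:
    """Return the sum of weight * dosage over variants that have a weight.

    dosages: effect-allele count per variant for one person (0, 1, or 2,
             or a fractional value if imputed).
    weights: per-variant effect sizes, e.g. from a published score file.
    Variants with a weight but no observed dosage are skipped.
    """
    score = 0.0
    for variant_id, weight in weights.items():
        dosage = dosages.get(variant_id)
        if dosage is None:
            continue  # variant not genotyped for this person
        score += weight * dosage
    return score


# Hypothetical example values, chosen only to show the arithmetic.
example_weights = {"var_a": 0.5, "var_b": -0.25, "var_c": 1.0}
example_dosages = {"var_a": 2, "var_b": 1, "var_d": 2}

example_score = polygenic_risk_score(example_dosages, example_weights)
print(example_score)  # 0.5*2 + (-0.25)*1 = 0.75
```

In practice a raw score like this is usually compared against a reference distribution before being reported; that step is omitted here because the abstract gives no details on it.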