Search | VHL Regional Portal

1.

Exome copy number variant detection, analysis, and classification in a large cohort of families with undiagnosed rare genetic disease.

Lemire, Gabrielle; Sanchis-Juan, Alba; Russell, Kathryn; Baxter, Samantha; Chao, Katherine R; Singer-Berk, Moriel; Groopman, Emily; Wong, Isaac; England, Eleina; Goodrich, Julia; Pais, Lynn; Austin-Tse, Christina; DiTroia, Stephanie; O'Heir, Emily; Ganesh, Vijay S; Wojcik, Monica H; Evangelista, Emily; Snow, Hana; Osei-Owusu, Ikeoluwa; Fu, Jack; Singh, Mugdha; Mostovoy, Yulia; Huang, Steve; Garimella, Kiran; Kirkham, Samantha L; Neil, Jennifer E; Shao, Diane D; Walsh, Christopher A; Argilli, Emanuela; Le, Carolyn; Sherr, Elliott H; Gleeson, Joseph G; Shril, Shirlee; Schneider, Ronen; Hildebrandt, Friedhelm; Sankaran, Vijay G; Madden, Jill A; Genetti, Casie A; Beggs, Alan H; Agrawal, Pankaj B; Bujakowska, Kinga M; Place, Emily; Pierce, Eric A; Donkervoort, Sandra; Bönnemann, Carsten G; Gallacher, Lyndon; Stark, Zornitza; Tan, Tiong Yang; White, Susan M; Töpf, Ana.

Am J Hum Genet ; 111(5): 863-876, 2024 May 02.

Article in English | MEDLINE | ID: mdl-38565148

ABSTRACT

Copy number variants (CNVs) are significant contributors to the pathogenicity of rare genetic diseases and, with new methods, can now reliably be identified from exome sequencing. Challenges still remain in accurate classification of CNV pathogenicity. CNV calling using GATK-gCNV was performed on exomes from a cohort of 6,633 families (15,759 individuals) with heterogeneous phenotypes and variable prior genetic testing collected at the Broad Institute Center for Mendelian Genomics of the Genomics Research to Elucidate the Genetics of Rare Diseases consortium and analyzed using the seqr platform. The addition of CNV detection to exome analysis identified causal CNVs for 171 families (2.6%). The estimated sizes of CNVs ranged from 293 bp to 80 Mb. The causal CNVs consisted of 140 deletions, 15 duplications, 3 suspected complex structural variants (SVs), 3 insertions, and 10 complex SVs, the latter two groups being identified by orthogonal confirmation methods. To classify CNV pathogenicity, we used the 2020 American College of Medical Genetics and Genomics/ClinGen CNV interpretation standards and developed additional criteria to evaluate allelic and functional data as well as variants on the X chromosome to further advance the framework. We interpreted 151 CNVs as likely pathogenic/pathogenic and 20 CNVs as high-interest variants of uncertain significance. Calling CNVs from existing exome data increases the diagnostic yield for individuals undiagnosed after standard testing approaches, providing a higher-resolution alternative to arrays at a fraction of the cost of genome sequencing. Our improvements to the classification approach advance the systematic framework to assess the pathogenicity of CNVs.

Subject(s)

DNA Copy Number Variations , Exome Sequencing , Exome , Rare Diseases , Humans , DNA Copy Number Variations/genetics , Rare Diseases/genetics , Rare Diseases/diagnosis , Exome/genetics , Male , Female , Cohort Studies , Genetic Testing/methods

2.

Exome copy number variant detection, analysis and classification in a large cohort of families with undiagnosed rare genetic disease.

Lemire, Gabrielle; Sanchis-Juan, Alba; Russell, Kathryn; Baxter, Samantha; Chao, Katherine R; Singer-Berk, Moriel; Groopman, Emily; Wong, Isaac; England, Eleina; Goodrich, Julia; Pais, Lynn; Austin-Tse, Christina; DiTroia, Stephanie; O'Heir, Emily; Ganesh, Vijay S; Wojcik, Monica H; Evangelista, Emily; Snow, Hana; Osei-Owusu, Ikeoluwa; Fu, Jack; Singh, Mugdha; Mostovoy, Yulia; Huang, Steve; Garimella, Kiran; Kirkham, Samantha L; Neil, Jennifer E; Shao, Diane D; Walsh, Christopher A; Argili, Emanuela; Le, Carolyn; Sherr, Elliott H; Gleeson, Joseph; Shril, Shirlee; Schneider, Ronen; Hildebrandt, Friedhelm; Sankaran, Vijay G; Madden, Jill A; Genetti, Casie A; Beggs, Alan H; Agrawal, Pankaj B; Bujakowska, Kinga M; Place, Emily; Pierce, Eric A; Donkervoort, Sandra; Bönnemann, Carsten G; Gallacher, Lyndon; Stark, Zornitza; Tan, Tiong; White, Susan M; Töpf, Ana.

medRxiv ; 2023 Oct 05.

Article in English | MEDLINE | ID: mdl-37873196

ABSTRACT

Copy number variants (CNVs) are significant contributors to the pathogenicity of rare genetic diseases and, with new methods, can now reliably be identified from exome sequencing. Challenges still remain in accurate classification of CNV pathogenicity. CNV calling using GATK-gCNV was performed on exomes from a cohort of 6,633 families (15,759 individuals) with heterogeneous phenotypes and variable prior genetic testing collected at the Broad Institute Center for Mendelian Genomics of the GREGoR consortium. Each family's CNV data was analyzed using the seqr platform, and candidate CNVs were classified using the 2020 ACMG/ClinGen CNV interpretation standards. We developed additional evidence criteria to address situations not covered by the current standards. The addition of CNV calling to exome analysis identified causal CNVs for 173 families (2.6%). The estimated sizes of CNVs ranged from 293 bp to 80 Mb, with estimates that 44% would not have been detected by standard chromosomal microarrays. The causal CNVs consisted of 141 deletions, 15 duplications, 4 suspected complex structural variants (SVs), 3 insertions, and 10 complex SVs, the latter two groups being identified by orthogonal validation methods. We interpreted 153 CNVs as likely pathogenic/pathogenic and 20 CNVs as high-interest variants of uncertain significance. Calling CNVs from existing exome data increases the diagnostic yield for individuals undiagnosed after standard testing approaches, providing a higher-resolution alternative to arrays at a fraction of the cost of genome sequencing. Our improvements to the classification approach advance the systematic framework to assess the pathogenicity of CNVs.

3.

High level of complexity and global diversity of the 3q29 locus revealed by optical mapping and long-read sequencing.

Yilmaz, Feyza; Gurusamy, Umamaheswaran; Mosley, Trenell J; Hallast, Pille; Kim, Kwondo; Mostovoy, Yulia; Purcell, Ryan H; Shaikh, Tamim H; Zwick, Michael E; Kwok, Pui-Yan; Lee, Charles; Mulle, Jennifer G.

Genome Med ; 15(1): 35, 2023 05 10.

Article in English | MEDLINE | ID: mdl-37165454

ABSTRACT

BACKGROUND: High sequence identity between segmental duplications (SDs) can facilitate copy number variants (CNVs) via non-allelic homologous recombination (NAHR). These CNVs are one of the fundamental causes of genomic disorders such as the 3q29 deletion syndrome (del3q29S). There are 21 protein-coding genes lost or gained as a result of such recurrent 1.6-Mbp deletions or duplications, respectively, in the 3q29 locus. While NAHR plays a role in CNV occurrence, the factors that increase the risk of NAHR at this particular locus are not well understood. METHODS: We employed an optical genome mapping technique to characterize the 3q29 locus in 161 unaffected individuals, 16 probands with del3q29S and their parents, and 2 probands with the 3q29 duplication syndrome (dup3q29S). Long-read sequencing-based haplotype-resolved de novo assemblies from 44 unaffected individuals and 1 trio were used for orthogonal validation of haplotypes and deletion breakpoints. RESULTS: In total, we discovered 34 haplotypes, of which 19 were novel haplotypes. Among these 19 novel haplotypes, 18 were detected in unaffected individuals, while 1 novel haplotype was detected on the parent-of-origin chromosome of a proband with the del3q29S. Phased assemblies from 44 unaffected individuals enabled the orthogonal validation of 20 haplotypes. In 89% (16/18) of the probands, breakpoints were confined to paralogous copies of a 20-kbp segment within the 3q29 SDs. In one del3q29S proband, the breakpoint was confined to a 374-bp region using long-read sequencing. Furthermore, we categorized del3q29S cases into three classes and dup3q29S cases into two classes based on breakpoints. Finally, we found no evidence of inversions in parent-of-origin chromosomes. CONCLUSIONS: We have generated the most comprehensive haplotype map for the 3q29 locus using unaffected individuals, probands with del3q29S or dup3q29S, and available parents, and also determined the deletion breakpoint to be within a 374-bp region in one proband with del3q29S. These results should provide a better understanding of the underlying genetic architecture that contributes to the etiology of del3q29S and dup3q29S.

Subject(s)

Genomics , Segmental Duplications, Genomic , Humans , Chromosome Mapping , Syndrome , Haplotypes , DNA Copy Number Variations

4.

Genomic regions associated with microdeletion/microduplication syndromes exhibit extreme diversity of structural variation.

Mostovoy, Yulia; Yilmaz, Feyza; Chow, Stephen K; Chu, Catherine; Lin, Chin; Geiger, Elizabeth A; Meeks, Naomi J L; Chatfield, Kathryn C; Coughlin, Curtis R; Surti, Urvashi; Kwok, Pui-Yan; Shaikh, Tamim H.

Genetics ; 217(2), 2021 02 09.

Article in English | MEDLINE | ID: mdl-33724415

ABSTRACT

Segmental duplications (SDs) are a class of long, repetitive DNA elements whose paralogs share a high level of sequence similarity with each other. SDs mediate chromosomal rearrangements that lead to structural variation in the general population as well as genomic disorders associated with multiple congenital anomalies, including the 7q11.23 (Williams-Beuren Syndrome, WBS), 15q13.3, and 16p12.2 microdeletion syndromes. Population-level characterization of SDs has generally been lacking because most techniques used for analyzing these complex regions are both labor and cost intensive. In this study, we have used a high-throughput technique to genotype complex structural variation with a single molecule, long-range optical mapping approach. We characterized SDs and identified novel structural variants (SVs) at 7q11.23, 15q13.3, and 16p12.2 using optical mapping data from 154 phenotypically normal individuals from 26 populations comprising five super-populations. We detected several novel SVs for each locus, some of which had significantly different prevalence between populations. Additionally, we localized the microdeletion breakpoints to specific paralogous duplicons located within complex SDs in two patients with WBS, one patient with 15q13.3, and one patient with 16p12.2 microdeletion syndromes. The population-level data presented here highlights the extreme diversity of large and complex SVs within SD-containing regions. The approach we outline will greatly facilitate the investigation of the role of inter-SD structural variation as a driver of chromosomal rearrangements and genomic disorders.

Subject(s)

Chromosome Disorders/genetics , Craniofacial Abnormalities/genetics , Genomic Structural Variation , Heart Defects, Congenital/genetics , Intellectual Disability/genetics , Segmental Duplications, Genomic , Seizures/genetics , Williams Syndrome/genetics , Chromosome Breakpoints , Chromosome Deletion , Chromosomes, Human, Pair 15/genetics , Chromosomes, Human, Pair 16/genetics , Developmental Disabilities/genetics , Humans , Mental Disorders/genetics

5.

Mutations in Metabotropic Glutamate Receptor 1 Contribute to Natural Short Sleep Trait.

Shi, Guangsen; Yin, Chen; Fan, Zenghua; Xing, Lijuan; Mostovoy, Yulia; Kwok, Pui-Yan; Ashbrook, Liza H; Krystal, Andrew D; Ptácek, Louis J; Fu, Ying-Hui.

Curr Biol ; 31(1): 13-24.e4, 2021 01 11.

Article in English | MEDLINE | ID: mdl-33065013

ABSTRACT

Sufficient and efficient sleep is crucial for our health. Natural short sleepers can sleep significantly shorter than the average population without a desire for more sleep and without any obvious negative health consequences. In searching for genetic variants underlying the short sleep trait, we found two different mutations in the same gene (metabotropic glutamate receptor 1) from two independent natural short sleep families. In vitro, both of the mutations exhibited loss of function in receptor-mediated signaling. In vivo, the mice carrying the individual mutations both demonstrated short sleep behavior. In brain slices, both of the mutations changed the electrical properties and increased excitatory synaptic transmission. These results highlight the important role of metabotropic glutamate receptor 1 in modulating sleep duration.

Subject(s)

Receptors, Metabotropic Glutamate/genetics , Sleep/genetics , Animals , DNA Mutational Analysis , Excitatory Postsynaptic Potentials/physiology , Female , Hippocampus/physiology , Humans , Male , Mice , Mice, Transgenic , Models, Animal , Mutation , Neuronal Plasticity/physiology , Patch-Clamp Techniques , Pedigree , Polysomnography , Receptors, Metabotropic Glutamate/metabolism , Time Factors , Exome Sequencing

6.

The Driver of Extreme Human-Specific Olduvai Repeat Expansion Remains Highly Active in the Human Genome.

Heft, Ilea E; Mostovoy, Yulia; Levy-Sakin, Michal; Ma, Walfred; Stevens, Aaron J; Pastor, Steven; McCaffrey, Jennifer; Boffelli, Dario; Martin, David I; Xiao, Ming; Kennedy, Martin A; Kwok, Pui-Yan; Sikela, James M.

Genetics ; 214(1): 179-191, 2020 01.

Article in English | MEDLINE | ID: mdl-31754017

ABSTRACT

Sequences encoding Olduvai protein domains (formerly DUF1220) show the greatest human lineage-specific increase in copy number of any coding region in the genome and have been associated, in a dosage-dependent manner, with brain size, cognitive aptitude, autism, and schizophrenia. Tandem intragenic duplications of a three-domain block, termed the Olduvai triplet, in four NBPF genes in the chromosomal 1q21.1-21.2 region, are primarily responsible for the striking human-specific copy number increase. Interestingly, most of the Olduvai triplets are adjacent to, and transcriptionally coregulated with, three human-specific NOTCH2NL genes that have been shown to promote cortical neurogenesis. Until now, the underlying genomic events that drove the Olduvai hyperamplification in humans have remained unexplained. Here, we show that the presence or absence of an alternative first exon of the Olduvai triplet perfectly discriminates between amplified (58/58) and unamplified (0/12) triplets. We provide sequence and breakpoint analyses that suggest the alternative exon was produced by a nonallelic homologous recombination-based mechanism involving the duplicative transposition of an existing Olduvai exon found in the CON3 domain, which typically occurs at the C-terminal end of NBPF genes. We also provide suggestive in vitro evidence that the alternative exon may promote instability through a putative G-quadruplex (pG4)-based mechanism. Lastly, we use single-molecule optical mapping to characterize the intragenic structural variation observed in NBPF genes in 154 unrelated individuals and 52 related individuals from 16 families and show that the presence of pG4-containing Olduvai triplets is strongly correlated with high levels of Olduvai copy number variation. These results suggest that the same driver of genomic instability that allowed the evolutionarily recent, rapid, and extreme human-specific Olduvai expansion remains highly active in the human genome.

Subject(s)

Carrier Proteins/genetics , Genome, Human , Trinucleotide Repeat Expansion , Animals , Base Sequence , DNA Copy Number Variations , Evolution, Molecular , G-Quadruplexes , Gene Amplification , Gene Dosage , Genomic Instability , Homologous Recombination , Humans , Primates , Protein Domains , Sequence Homology

7.

Mutant neuropeptide S receptor reduces sleep duration with preserved memory consolidation.

Xing, Lijuan; Shi, Guangsen; Mostovoy, Yulia; Gentry, Nicholas W; Fan, Zenghua; McMahon, Thomas B; Kwok, Pui-Yan; Jones, Christopher R; Ptácek, Louis J; Fu, Ying-Hui.

Sci Transl Med ; 11(514), 2019 10 16.

Article in English | MEDLINE | ID: mdl-31619542

ABSTRACT

Sleep is a crucial physiological process for our survival and cognitive performance, yet the factors controlling human sleep regulation remain poorly understood. Here, we identified a missense mutation in a G protein-coupled neuropeptide S receptor 1 (NPSR1) that is associated with a natural short sleep phenotype in humans. Mice carrying the homologous mutation exhibited less sleep time despite increased sleep pressure. These animals were also resistant to contextual memory deficits associated with sleep deprivation. In vivo, the mutant receptors showed increased sensitivity to neuropeptide S exogenous activation. These results suggest that the NPS/NPSR1 pathway might play a critical role in regulating human sleep duration and in the link between sleep homeostasis and memory consolidation.

Subject(s)

Memory Consolidation/physiology , Receptors, G-Protein-Coupled/metabolism , Sleep/physiology , Animals , Female , Humans , Male , Mice , Mice, Inbred C57BL , Mice, Mutant Strains , Mutation/genetics , Polymerase Chain Reaction , Real-Time Polymerase Chain Reaction , Receptors, G-Protein-Coupled/genetics , Sleep/genetics

8.

The 22q11 low copy repeats are characterized by unprecedented size and structural variability.

Demaerel, Wolfram; Mostovoy, Yulia; Yilmaz, Feyza; Vervoort, Lisanne; Pastor, Steven; Hestand, Matthew S; Swillen, Ann; Vergaelen, Elfi; Geiger, Elizabeth A; Coughlin, Curtis R; Chow, Stephen K; McDonald-McGinn, Donna; Morrow, Bernice; Kwok, Pui-Yan; Xiao, Ming; Emanuel, Beverly S; Shaikh, Tamim H; Vermeesch, Joris R.

Genome Res ; 29(9): 1389-1401, 2019 09.

Article in English | MEDLINE | ID: mdl-31481461

ABSTRACT

Low copy repeats (LCRs) are recognized as a significant source of genomic instability, driving genome variability and evolution. The Chromosome 22 LCRs (LCR22s) mediate nonallelic homologous recombination (NAHR) leading to the 22q11 deletion syndrome (22q11DS). However, LCR22s are among the most complex regions in the genome, and their structure remains unresolved. The difficulty in generating accurate maps of LCR22s has also hindered localization of the deletion end points in 22q11DS patients. Using fiber FISH and Bionano optical mapping, we assembled LCR22 alleles in 187 cell lines. Our analysis uncovered an unprecedented level of variation in LCR22s, including LCR22A alleles ranging in size from 250 to 2000 kb. Further, the incidence of various LCR22 alleles varied within different populations. Additionally, the analysis of LCR22s in 22q11DS patients and their parents enabled further refinement of the rearrangement site within LCR22A and -D, which flank the 22q11 deletion. The NAHR site was localized to a 160-kb paralog shared between the LCR22A and -D in seven 22q11DS patients. Thus, we present the most comprehensive map of LCR22 variation to date. This will greatly facilitate the investigation of the role of LCR variation as a driver of 22q11 rearrangements and the phenotypic variability among 22q11DS patients.

Subject(s)

22q11 Deletion Syndrome/genetics , Chromosome Mapping/methods , Chromosomes, Human, Pair 22/genetics , Repetitive Sequences, Nucleic Acid , Animals , Cell Line , Chromosomal Instability , Evolution, Molecular , Humans , In Situ Hybridization, Fluorescence , Primates/genetics

9.

Evaluating the quality of the 1000 genomes project data.

Belsare, Saurabh; Levy-Sakin, Michal; Mostovoy, Yulia; Durinck, Steffen; Chaudhuri, Subhra; Xiao, Ming; Peterson, Andrew S; Kwok, Pui-Yan; Seshagiri, Somasekar; Wall, Jeffrey D.

BMC Genomics ; 20(1): 620, 2019 Aug 16.

Article in English | MEDLINE | ID: mdl-31416423

ABSTRACT

BACKGROUND: Data from the 1000 Genomes project is quite often used as a reference for human genomic analysis. However, its accuracy needs to be assessed to understand the quality of predictions made using this reference. We present here an assessment of the genotyping, phasing, and imputation accuracy data in the 1000 Genomes project. We compare the phased haplotype calls from the 1000 Genomes project to experimentally phased haplotypes for 28 of the same individuals sequenced using the 10X Genomics platform. RESULTS: We observe that phasing and imputation for rare variants are unreliable, which likely reflects the limited sample size of the 1000 Genomes project data. Further, it appears that using a population-specific reference panel does not improve the accuracy of imputation over using the entire 1000 Genomes data set as a reference panel. We also note that the error rates and trends depend on the choice of definition of error, and hence any error reporting needs to take these definitions into account. CONCLUSIONS: The quality of the 1000 Genomes data needs to be considered when using this database for further studies. This work presents an analysis that can be used for these assessments.

Subject(s)

Genome, Human/genetics , Haplotypes/genetics , Racial Groups/genetics , Gene Frequency/genetics , High-Throughput Nucleotide Sequencing , Human Genome Project , Humans , Polymorphism, Single Nucleotide , Racial Groups/ethnology , Scientific Experimental Error

10.

Genome of the Komodo dragon reveals adaptations in the cardiovascular and chemosensory systems of monitor lizards.

Lind, Abigail L; Lai, Yvonne Y Y; Mostovoy, Yulia; Holloway, Alisha K; Iannucci, Alessio; Mak, Angel C Y; Fondi, Marco; Orlandini, Valerio; Eckalbar, Walter L; Milan, Massimo; Rovatsos, Michail; Kichigin, Ilya G; Makunin, Alex I; Johnson Pokorná, Martina; Altmanová, Marie; Trifonov, Vladimir A; Schijlen, Elio; Kratochvíl, Lukás; Fani, Renato; Velenský, Petr; Rehák, Ivan; Patarnello, Tomaso; Jessop, Tim S; Hicks, James W; Ryder, Oliver A; Mendelson, Joseph R; Ciofi, Claudio; Kwok, Pui-Yan; Pollard, Katherine S; Bruneau, Benoit G.

Nat Ecol Evol ; 3(8): 1241-1252, 2019 08.

Article in English | MEDLINE | ID: mdl-31358948

ABSTRACT

Monitor lizards are unique among ectothermic reptiles in that they have high aerobic capacity and distinctive cardiovascular physiology resembling that of endothermic mammals. Here, we sequence the genome of the Komodo dragon Varanus komodoensis, the largest extant monitor lizard, and generate a high-resolution de novo chromosome-assigned genome assembly for V. komodoensis using a hybrid approach of long-range sequencing and single-molecule optical mapping. Comparing the genome of V. komodoensis with those of related species, we find evidence of positive selection in pathways related to energy metabolism, cardiovascular homoeostasis, and haemostasis. We also show species-specific expansions of a chemoreceptor gene family related to pheromone and kairomone sensing in V. komodoensis and other lizard lineages. Together, these evolutionary signatures of adaptation reveal the genetic underpinnings of the unique Komodo dragon sensory and cardiovascular systems, and suggest that selective pressure altered haemostasis genes to help Komodo dragons evade the anticoagulant effects of their own saliva. The Komodo dragon genome is an important resource for understanding the biology of monitor lizards and reptiles worldwide.

Subject(s)

Cardiovascular System , Lizards , Acclimatization , Animals , Chromosomes

11.

Genome maps across 26 human populations reveal population-specific patterns of structural variation.

Levy-Sakin, Michal; Pastor, Steven; Mostovoy, Yulia; Li, Le; Leung, Alden K Y; McCaffrey, Jennifer; Young, Eleanor; Lam, Ernest T; Hastie, Alex R; Wong, Karen H Y; Chung, Claire Y L; Ma, Walfred; Sibert, Justin; Rajagopalan, Ramakrishnan; Jin, Nana; Chow, Eugene Y C; Chu, Catherine; Poon, Annie; Lin, Chin; Naguib, Ahmed; Wang, Wei-Ping; Cao, Han; Chan, Ting-Fung; Yip, Kevin Y; Xiao, Ming; Kwok, Pui-Yan.

Nat Commun ; 10(1): 1025, 2019 03 04.

Article in English | MEDLINE | ID: mdl-30833565

ABSTRACT

Large structural variants (SVs) in the human genome are difficult to detect and study by conventional sequencing technologies. With long-range genome analysis platforms, such as optical mapping, one can identify large SVs (>2 kb) across the genome in one experiment. Analyzing optical genome maps of 154 individuals from the 26 populations sequenced in the 1000 Genomes Project, we find that phylogenetic population patterns of large SVs are similar to those of single nucleotide variations in 86% of the human genome, while ~2% of the genome has high structural complexity. We are able to characterize SVs in many intractable regions of the genome, including segmental duplications and subtelomeric, pericentromeric, and acrocentric areas. In addition, we discover ~60 Mb of non-redundant genome content missing in the reference genome sequence assembly. Our results highlight the need for a comprehensive set of alternate haplotypes from different populations to represent SV patterns in the genome.

Subject(s)

Chromosome Mapping , Genome, Human , Genomic Structural Variation , Algorithms , Base Sequence , Chromosome Mapping/methods , Chromosomes, Human, Y , Computational Biology , Female , Gene Dosage , Genetic Linkage , Genomics , Humans , Male , Mutation , Phylogeny , Segmental Duplications, Genomic/genetics , Sequence Analysis, DNA

12.

Comparative genome analysis of programmed DNA elimination in nematodes.

Wang, Jianbin; Gao, Shenghan; Mostovoy, Yulia; Kang, Yuanyuan; Zagoskin, Maxim; Sun, Yongqiao; Zhang, Bing; White, Laura K; Easton, Alice; Nutman, Thomas B; Kwok, Pui-Yan; Hu, Songnian; Nielsen, Martin K; Davis, Richard E.

Genome Res ; 27(12): 2001-2014, 2017 12.

Article in English | MEDLINE | ID: mdl-29118011

ABSTRACT

Programmed DNA elimination is a developmentally regulated process leading to the reproducible loss of specific genomic sequences. DNA elimination occurs in unicellular ciliates and a variety of metazoans, including invertebrates and vertebrates. In metazoa, DNA elimination typically occurs in somatic cells during early development, leaving the germline genome intact. Reference genomes for metazoa that undergo DNA elimination are not available. Here, we generated germline and somatic reference genome sequences of the DNA eliminating pig parasitic nematode Ascaris suum and the horse parasite Parascaris univalens. In addition, we carried out in-depth analyses of DNA elimination in the parasitic nematode of humans, Ascaris lumbricoides, and the parasitic nematode of dogs, Toxocara canis. Our analysis of nematode DNA elimination reveals that in all species, repetitive sequences (that differ among the genera) and germline-expressed genes (approximately 1000-2000 or 5%-10% of the genes) are eliminated. Thirty-five percent of these eliminated genes are conserved among these nematodes, defining a core set of eliminated genes that are preferentially expressed during spermatogenesis. Our analysis supports the view that DNA elimination in nematodes silences germline-expressed genes. Over half of the chromosome break sites are conserved between Ascaris and Parascaris, whereas only 10% are conserved in the more divergent T. canis. Analysis of the chromosomal breakage regions suggests a sequence-independent mechanism for DNA breakage followed by telomere healing, with the formation of more accessible chromatin in the break regions prior to DNA elimination. Our genome assemblies and annotations also provide comprehensive resources for analysis of DNA elimination, parasitology research, and comparative nematode genome and epigenome studies.

Subject(s)

DNA, Helminth , Nematoda/genetics , Alternative Splicing , Animals , Ascaridoidea/genetics , Ascaris suum/genetics , Chromosome Breakage , Chromosome Breakpoints , Evolution, Molecular , Female , Genome , Germ-Line Mutation , Male , Molecular Sequence Annotation , RNA, Helminth/biosynthesis , Repetitive Sequences, Nucleic Acid , Sequence Analysis, DNA , Sex Chromosomes , Telomere , Toxocara canis/genetics , Transcriptome

13.

The Role of Transcription Factors at Antisense-Expressing Gene Pairs in Yeast.

Mostovoy, Yulia; Thiemicke, Alexander; Hsu, Tiffany Y; Brem, Rachel B.

Genome Biol Evol ; 8(6): 1748-61, 2016 06 27.

Article in English | MEDLINE | ID: mdl-27190003

ABSTRACT

Genes encoded close to one another on the chromosome are often coexpressed, by a mechanism and regulatory logic that remain poorly understood. We surveyed the yeast genome for tandem gene pairs oriented tail-to-head at which expression antisense to the upstream gene was conserved across species. The intergenic region at most such tandem pairs is a bidirectional promoter, shared by the downstream gene mRNA and the upstream antisense transcript. Genomic analyses of these intergenic loci revealed distinctive patterns of transcription factor regulation. Mutation of a given transcription factor verified its role as a regulator in trans of tandem gene pair loci, including the proximally initiating upstream antisense transcript and downstream mRNA and the distally initiating upstream mRNA. To investigate cis-regulatory activity at such a locus, we focused on the stress-induced NAD(P)H dehydratase YKL151C and its downstream neighbor, the metabolic enzyme GPM1. Previous work has implicated the region between these genes in regulation of GPM1 expression; our mutation experiments established its function in rich medium as a repressor in cis of the distally initiating YKL151C sense RNA, and an activator of the proximally initiating YKL151C antisense RNA. Wild-type expression of all three transcripts required the transcription factor Gcr2. Thus, at this locus, the intergenic region serves as a focal point of regulatory input, driving antisense expression and mediating the coordinated regulation of YKL151C and GPM1. Together, our findings implicate transcription factors in the joint control of neighboring genes specialized to opposing conditions and the antisense transcripts expressed between them.

Subject(s)

RNA, Antisense/biosynthesis , SUMO-1 Protein/genetics , Saccharomyces cerevisiae Proteins/genetics , Transcription Factors/genetics , Transcription, Genetic , Gene Expression Regulation, Fungal , Genome, Fungal , Promoter Regions, Genetic , RNA, Antisense/genetics , RNA, Messenger/biosynthesis , RNA, Messenger/genetics , SUMO-1 Protein/metabolism , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae Proteins/metabolism , Transcription Factors/biosynthesis

14.

A hybrid approach for de novo human genome sequence assembly and phasing.

Mostovoy, Yulia; Levy-Sakin, Michal; Lam, Jessica; Lam, Ernest T; Hastie, Alex R; Marks, Patrick; Lee, Joyce; Chu, Catherine; Lin, Chin; Dzakula, Zeljko; Cao, Han; Schlebusch, Stephen A; Giorda, Kristina; Schnall-Levin, Michael; Wall, Jeffrey D; Kwok, Pui-Yan.

Nat Methods ; 13(7): 587-90, 2016 07.

Article in English | MEDLINE | ID: mdl-27159086

ABSTRACT

Despite tremendous progress in genome sequencing, the basic goal of producing a phased (haplotype-resolved) genome sequence with end-to-end contiguity for each chromosome at reasonable cost and effort is still unrealized. In this study, we describe an approach to performing de novo genome assembly and experimental phasing by integrating the data from Illumina short-read sequencing, 10X Genomics linked-read sequencing, and BioNano Genomics genome mapping to yield a high-quality, phased, de novo assembled human genome.

Subject(s)

Chromosome Mapping/methods , Genome, Human , Genomics/methods , Haplotypes/genetics , High-Throughput Nucleotide Sequencing/methods , Humans

15.

Genome-Wide Structural Variation Detection by Genome Mapping on Nanochannel Arrays.

Mak, Angel C Y; Lai, Yvonne Y Y; Lam, Ernest T; Kwok, Tsz-Piu; Leung, Alden K Y; Poon, Annie; Mostovoy, Yulia; Hastie, Alex R; Stedman, William; Anantharaman, Thomas; Andrews, Warren; Zhou, Xiang; Pang, Andy W C; Dai, Heng; Chu, Catherine; Lin, Chin; Wu, Jacob J K; Li, Catherine M L; Li, Jing-Woei; Yim, Aldrin K Y; Chan, Saki; Sibert, Justin; Dzakula, Zeljko; Cao, Han; Yiu, Siu-Ming; Chan, Ting-Fung; Yip, Kevin Y; Xiao, Ming; Kwok, Pui-Yan.

Genetics ; 202(1): 351-62, 2016 Jan.

Article in English | MEDLINE | ID: mdl-26510793

ABSTRACT

Comprehensive whole-genome structural variation detection is challenging with current approaches. With diploid cells as DNA source and the presence of numerous repetitive elements, short-read DNA sequencing cannot be used to detect structural variation efficiently. In this report, we show that genome mapping with long, fluorescently labeled DNA molecules imaged on nanochannel arrays can be used for whole-genome structural variation detection without sequencing. While whole-genome haplotyping is not achieved, local phasing (across >150-kb regions) is routine, as molecules from the parental chromosomes are examined separately. In one experiment, we generated genome maps from a trio from the 1000 Genomes Project, compared the maps against that derived from the reference human genome, and identified structural variations that are >5 kb in size. We find that these individuals have many more structural variants than those published, including some with the potential of disrupting gene function or regulation.

Subject(s)

Chromosome Mapping , Genomic Structural Variation , Microarray Analysis/methods , Cell Line , Genome, Human , Humans

16.

Divergence of iron metabolism in wild Malaysian yeast.

Lee, Hana N; Mostovoy, Yulia; Hsu, Tiffany Y; Chang, Amanda H; Brem, Rachel B.

G3 (Bethesda) ; 3(12): 2187-94, 2013 Dec 09.

Article in English | MEDLINE | ID: mdl-24142925

ABSTRACT

Comparative genomic studies have reported widespread variation in levels of gene expression within and between species. Using these data to infer organism-level trait divergence has proven to be a key challenge in the field. We have used a wild Malaysian population of S. cerevisiae as a test bed in the search to predict and validate trait differences based on observations of regulatory variation. Malaysian yeast, when cultured in standard medium, activated regulatory programs that protect cells from the toxic effects of high iron. Malaysian yeast also showed a hyperactive regulatory response during culture in the presence of excess iron and had a unique growth defect in conditions of high iron. Molecular validation experiments pinpointed the iron metabolism factors AFT1, CCC1, and YAP5 as contributors to these molecular and cellular phenotypes; in genome-scale sequence analyses, a suite of iron toxicity response genes showed evidence for rapid protein evolution in Malaysian yeast. Our findings support a model in which iron metabolism has diverged in Malaysian yeast as a consequence of a change in selective pressure, with Malaysian alleles shifting the dynamic range of iron response to low-iron concentrations and weakening resistance to extreme iron toxicity. By dissecting the iron scarcity specialist behavior of Malaysian yeast, our work highlights the power of expression divergence as a signpost for biologically and evolutionarily relevant variation at the organismal level. Interpreting the phenotypic relevance of gene expression variation is one of the primary challenges of modern genomics.

Subject(s)

Iron/metabolism , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Basic-Leucine Zipper Transcription Factors/genetics , Basic-Leucine Zipper Transcription Factors/metabolism , Biological Evolution , Cation Transport Proteins/genetics , Cation Transport Proteins/metabolism , Gene Expression Regulation, Fungal , Iron/pharmacology , Malaysia , Saccharomyces cerevisiae/drug effects , Saccharomyces cerevisiae/growth & development , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae Proteins/metabolism , Selection, Genetic , Transcription Factors/genetics , Transcription Factors/metabolism

17.

Inferring evolutionary histories of pathway regulation from transcriptional profiling data.

Schraiber, Joshua G; Mostovoy, Yulia; Hsu, Tiffany Y; Brem, Rachel B.

PLoS Comput Biol ; 9(10): e1003255, 2013.

Article in English | MEDLINE | ID: mdl-24130471

ABSTRACT

One of the outstanding challenges in comparative genomics is to interpret the evolutionary importance of regulatory variation between species. Rigorous molecular evolution-based methods to infer evidence for natural selection from expression data are at a premium in the field, and to date, phylogenetic approaches have not been well-suited to address the question in the small sets of taxa profiled in standard surveys of gene expression. We have developed a strategy to infer evolutionary histories from expression profiles by analyzing suites of genes of common function. In a manner conceptually similar to molecular evolution models in which the evolutionary rates of DNA sequence at multiple loci follow a gamma distribution, we modeled expression of the genes of an a priori-defined pathway with rates drawn from an inverse gamma distribution. We then developed a fitting strategy to infer the parameters of this distribution from expression measurements, and to identify gene groups whose expression patterns were consistent with evolutionary constraint or rapid evolution in particular species. Simulations confirmed the power and accuracy of our inference method. As an experimental testbed for our approach, we generated and analyzed transcriptional profiles of four Saccharomyces yeasts. The results revealed pathways with signatures of constrained and accelerated regulatory evolution in individual yeasts and across the phylogeny, highlighting the prevalence of pathway-level expression change during the divergence of yeast species. We anticipate that our pathway-based phylogenetic approach will be of broad utility in the search to understand the evolutionary relevance of regulatory change.

Subject(s)

Evolution, Molecular , Gene Expression Profiling/methods , Genomics/methods , Models, Genetic , Phylogeny , Saccharomyces/genetics

18.

Polygenic and directional regulatory evolution across pathways in Saccharomyces.

Bullard, James H; Mostovoy, Yulia; Dudoit, Sandrine; Brem, Rachel B.

Proc Natl Acad Sci U S A ; 107(11): 5058-63, 2010 Mar 16.

Article in English | MEDLINE | ID: mdl-20194736

ABSTRACT

The search to understand how genomes innovate in response to selection dominates the field of evolutionary biology. Powerful molecular evolution approaches have been developed to test individual loci for signatures of selection. In many cases, however, an organism's response to changes in selective pressure may be mediated by multiple genes, whose products function together in a cellular process or pathway. Here we assess the prevalence of polygenic evolution in pathways in the yeasts Saccharomyces cerevisiae and S. bayanus. We first established short-read sequencing methods to detect cis-regulatory variation in a diploid hybrid between the species. We then tested for the scenario in which selective pressure in one species to increase or decrease the activity of a pathway has driven the accumulation of cis-regulatory variants that act in the same direction on gene expression. Application of this test revealed a variety of yeast pathways with evidence for directional regulatory evolution. In parallel, we also used population genomic sequencing data to compare protein and cis-regulatory variation within and between species. We identified pathways with evidence for divergence within S. cerevisiae, and we detected signatures of positive selection between S. cerevisiae and S. bayanus. Our results point to polygenic, pathway-level change as a common evolutionary mechanism among yeasts. We suggest that pathway analyses, including our test for directional regulatory evolution, will prove to be a relevant and powerful strategy in many evolutionary genomic applications.

Subject(s)

Biological Evolution , Metabolic Networks and Pathways/genetics , Multifactorial Inheritance/genetics , Saccharomyces/genetics , Alleles , Base Sequence , Exosomes/metabolism , Gene Expression Regulation, Fungal , Genetic Variation , Hybridization, Genetic , RNA, Fungal/genetics , Regulatory Sequences, Nucleic Acid/genetics , Selection, Genetic , Species Specificity

