Search | VHL Regional Portal

1.

High-intensity sequencing reveals the sources of plasma circulating cell-free DNA variants.

Razavi, Pedram; Li, Bob T; Brown, David N; Jung, Byoungsok; Hubbell, Earl; Shen, Ronglai; Abida, Wassim; Juluru, Krishna; De Bruijn, Ino; Hou, Chenlu; Venn, Oliver; Lim, Raymond; Anand, Aseem; Maddala, Tara; Gnerre, Sante; Vijaya Satya, Ravi; Liu, Qinwen; Shen, Ling; Eattock, Nicholas; Yue, Jeanne; Blocker, Alexander W; Lee, Mark; Sehnert, Amy; Xu, Hui; Hall, Megan P; Santiago-Zayas, Angie; Novotny, William F; Isbell, James M; Rusch, Valerie W; Plitas, George; Heerdt, Alexandra S; Ladanyi, Marc; Hyman, David M; Jones, David R; Morrow, Monica; Riely, Gregory J; Scher, Howard I; Rudin, Charles M; Robson, Mark E; Diaz, Luis A; Solit, David B; Aravanis, Alexander M; Reis-Filho, Jorge S.

Nat Med ; 25(12): 1928-1937, 2019 Dec.

Article in English | MEDLINE | ID: mdl-31768066

ABSTRACT

Accurate identification of tumor-derived somatic variants in plasma circulating cell-free DNA (cfDNA) requires understanding of the various biological compartments contributing to the cfDNA pool. We sought to define the technical feasibility of a high-intensity sequencing assay of cfDNA and matched white blood cell DNA covering a large genomic region (508 genes; 2 megabases; >60,000× raw depth) in a prospective study of 124 patients with metastatic cancer, with contemporaneous matched tumor tissue biopsies, and 47 controls without cancer. The assay displayed high sensitivity and specificity, allowing for de novo detection of tumor-derived mutations and inference of tumor mutational burden, microsatellite instability, mutational signatures and sources of somatic mutations identified in cfDNA. The vast majority of cfDNA mutations (81.6% in controls and 53.2% in patients with cancer) had features consistent with clonal hematopoiesis. This cfDNA sequencing approach revealed that clonal hematopoiesis constitutes a pervasive biological phenomenon, emphasizing the importance of matched cfDNA-white blood cell sequencing for accurate variant interpretation.
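The matched cfDNA and white blood cell comparison described in this abstract can be illustrated with a minimal sketch. The abstract does not say how variants were classified, so this example assumes the simplest possible rule: a cfDNA variant is treated as "WBC-matched" when an identical variant appears in the matched white blood cell sample. The `Variant` type, the function names, and the coordinates in the usage note are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A variant identified by position and alleles (hypothetical record type)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def split_by_wbc_match(cfdna_variants, wbc_variants):
    """Split cfDNA variants into those also seen in matched WBC DNA and the rest.

    Assumption: exact match on (chrom, pos, ref, alt). A real pipeline would
    also weigh read support, error models, and allele fractions.
    """
    wbc = set(wbc_variants)
    wbc_matched = [v for v in cfdna_variants if v in wbc]
    cfdna_only = [v for v in cfdna_variants if v not in wbc]
    return wbc_matched, cfdna_only


def fraction_wbc_matched(cfdna_variants, wbc_variants):
    """Fraction of cfDNA variants that are also present in the WBC sample."""
    if not cfdna_variants:
        return 0.0
    matched, _ = split_by_wbc_match(cfdna_variants, wbc_variants)
    return len(matched) / len(cfdna_variants)
```

For example, with three made-up cfDNA variants of which one also appears in the white blood cell list, `fraction_wbc_matched` returns one third. Without the matched sample, that variant could be misread as tumor-derived.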

Subject(s)

Cell-Free Nucleic Acids/blood , Circulating Tumor DNA/blood , Genomics , Neoplasms/blood , Adult , Biomarkers, Tumor/blood , Circulating Tumor DNA/genetics , DNA Mutational Analysis , DNA, Neoplasm/blood , Female , Gene Expression Regulation, Neoplastic , High-Throughput Nucleotide Sequencing , Humans , Male , Microsatellite Instability , Middle Aged , Mutation , Neoplasms/genetics , Neoplasms/pathology

2.

Development and Validation of an Ultradeep Next-Generation Sequencing Assay for Testing of Plasma Cell-Free DNA from Patients with Advanced Cancer.

Janku, Filip; Zhang, Shile; Waters, Jill; Liu, Li; Huang, Helen J; Subbiah, Vivek; Hong, David S; Karp, Daniel D; Fu, Siqing; Cai, Xuyu; Ramzanali, Nishma M; Madwani, Kiran; Cabrilo, Goran; Andrews, Debra L; Zhao, Yue; Javle, Milind; Kopetz, E Scott; Luthra, Rajyalakshmi; Kim, Hyunsung J; Gnerre, Sante; Satya, Ravi Vijaya; Chuang, Han-Yu; Kruglyak, Kristina M; Toung, Jonathan; Zhao, Chen; Shen, Richard; Heymach, John V; Meric-Bernstam, Funda; Mills, Gordon B; Fan, Jian-Bing; Salathia, Neeraj S.

Clin Cancer Res ; 23(18): 5648-5656, 2017 Sep 15.

Article in English | MEDLINE | ID: mdl-28536309

ABSTRACT

Purpose: Tumor-derived cell-free DNA (cfDNA) in plasma can be used for molecular testing and provides an attractive alternative to tumor tissue. Commonly used PCR-based technologies can test for only a limited number of alterations at a time. Therefore, novel ultrasensitive technologies capable of testing for a broad spectrum of molecular alterations are needed to further personalized cancer therapy. Experimental Design: We developed a highly sensitive ultradeep next-generation sequencing (NGS) assay using reagents from TruSeqNano library preparation and NexteraRapid Capture target enrichment kits to generate plasma cfDNA sequencing libraries for mutational analysis in 61 cancer-related genes using common bioinformatics tools. The results were retrospectively compared with molecular testing of archival primary or metastatic tumor tissue obtained at different points of clinical care. Results: In a study of 55 patients with advanced cancer, the ultradeep NGS assay detected 82% (complete detection) to 87% (complete and partial detection) of the aberrations identified in discordantly collected corresponding archival tumor tissue. Patients with a low variant allele frequency (VAF) of mutant cfDNA survived longer than those with a high VAF did (P = 0.018). In patients undergoing systemic therapy, radiological response was positively associated with changes in cfDNA VAF (P = 0.02), and compared with unchanged/increased mutant cfDNA VAF, decreased cfDNA VAF was associated with longer time to treatment failure (TTF; P = 0.03). Conclusions: Ultradeep NGS assay has good sensitivity compared with conventional clinical mutation testing of archival specimens. A high VAF in mutant cfDNA corresponded with shorter survival. Changes in VAF of mutated cfDNA were associated with TTF. Clin Cancer Res; 23(18); 5648-56. ©2017 AACR.
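The variant allele frequency (VAF) in this abstract is, in its usual definition, the share of reads at a site that carry the variant allele. The sketch below shows that arithmetic and the two-way grouping the abstract reports ("decreased" versus "unchanged/increased"). The function names and the `tolerance` parameter are assumptions for illustration. The abstract does not say how a change in VAF was thresholded.

```python
def variant_allele_frequency(alt_reads, total_reads):
    """VAF = reads supporting the variant allele / all reads covering the site."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must lie between 0 and total_reads")
    return alt_reads / total_reads


def vaf_change(baseline_vaf, follow_up_vaf, tolerance=0.0):
    """Label a change in VAF using the two groups named in the abstract.

    Assumption: any drop larger than `tolerance` counts as "decreased";
    the cutoff used in the study is not stated.
    """
    if follow_up_vaf - baseline_vaf < -tolerance:
        return "decreased"
    return "unchanged/increased"
```

For instance, 5 variant reads out of 1,000 covering reads give a VAF of 0.005, and a fall from 0.10 at baseline to 0.02 at follow-up is labeled "decreased".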

Subject(s)

Biomarkers, Tumor , Circulating Tumor DNA , High-Throughput Nucleotide Sequencing , Neoplasms/diagnosis , Neoplasms/genetics , Adult , Aged , Aged, 80 and over , Female , Genetic Testing , High-Throughput Nucleotide Sequencing/methods , High-Throughput Nucleotide Sequencing/standards , Humans , Male , Middle Aged , Mutation , Neoplasms/mortality , Prognosis , Reproducibility of Results , Sensitivity and Specificity

3.

The genomic substrate for adaptive radiation in African cichlid fish.

Brawand, David; Wagner, Catherine E; Li, Yang I; Malinsky, Milan; Keller, Irene; Fan, Shaohua; Simakov, Oleg; Ng, Alvin Y; Lim, Zhi Wei; Bezault, Etienne; Turner-Maier, Jason; Johnson, Jeremy; Alcazar, Rosa; Noh, Hyun Ji; Russell, Pamela; Aken, Bronwen; Alföldi, Jessica; Amemiya, Chris; Azzouzi, Naoual; Baroiller, Jean-François; Barloy-Hubler, Frederique; Berlin, Aaron; Bloomquist, Ryan; Carleton, Karen L; Conte, Matthew A; D'Cotta, Helena; Eshel, Orly; Gaffney, Leslie; Galibert, Francis; Gante, Hugo F; Gnerre, Sante; Greuter, Lucie; Guyon, Richard; Haddad, Natalie S; Haerty, Wilfried; Harris, Rayna M; Hofmann, Hans A; Hourlier, Thibaut; Hulata, Gideon; Jaffe, David B; Lara, Marcia; Lee, Alison P; MacCallum, Iain; Mwaiko, Salome; Nikaido, Masato; Nishihara, Hidenori; Ozouf-Costaz, Catherine; Penman, David J; Przybylski, Dariusz; Rakotomanga, Michaelle.

Nature ; 513(7518): 375-381, 2014 Sep 18.

Article in English | MEDLINE | ID: mdl-25186727

ABSTRACT

Cichlid fishes are famous for large, diverse and replicated adaptive radiations in the Great Lakes of East Africa. To understand the molecular mechanisms underlying cichlid phenotypic diversity, we sequenced the genomes and transcriptomes of five lineages of African cichlids: the Nile tilapia (Oreochromis niloticus), an ancestral lineage with low diversity; and four members of the East African lineage: Neolamprologus brichardi/pulcher (older radiation, Lake Tanganyika), Metriaclima zebra (recent radiation, Lake Malawi), Pundamilia nyererei (very recent radiation, Lake Victoria), and Astatotilapia burtoni (riverine species around Lake Tanganyika). We found an excess of gene duplications in the East African lineage compared to tilapia and other teleosts, an abundance of non-coding element divergence, accelerated coding sequence evolution, expression divergence associated with transposable element insertions, and regulation by novel microRNAs. In addition, we analysed sequence data from sixty individuals representing six closely related species from Lake Victoria, and show genome-wide diversifying selection on coding and regulatory variants, some of which were recruited from ancient polymorphisms. We conclude that a number of molecular mechanisms shaped East African cichlid genomes, and that amassing of standing variation during periods of relaxed purifying selection may have been important in facilitating subsequent evolutionary diversification.

Subject(s)

Cichlids/classification , Cichlids/genetics , Evolution, Molecular , Genetic Speciation , Genome/genetics , Africa, Eastern , Animals , DNA Transposable Elements/genetics , Gene Duplication/genetics , Gene Expression Regulation/genetics , Genomics , Lakes , MicroRNAs/genetics , Phylogeny , Polymorphism, Genetic/genetics

4.

Gibbon genome and the fast karyotype evolution of small apes.

Carbone, Lucia; Harris, R Alan; Gnerre, Sante; Veeramah, Krishna R; Lorente-Galdos, Belen; Huddleston, John; Meyer, Thomas J; Herrero, Javier; Roos, Christian; Aken, Bronwen; Anaclerio, Fabio; Archidiacono, Nicoletta; Baker, Carl; Barrell, Daniel; Batzer, Mark A; Beal, Kathryn; Blancher, Antoine; Bohrson, Craig L; Brameier, Markus; Campbell, Michael S; Capozzi, Oronzo; Casola, Claudio; Chiatante, Giorgia; Cree, Andrew; Damert, Annette; de Jong, Pieter J; Dumas, Laura; Fernandez-Callejo, Marcos; Flicek, Paul; Fuchs, Nina V; Gut, Ivo; Gut, Marta; Hahn, Matthew W; Hernandez-Rodriguez, Jessica; Hillier, LaDeana W; Hubley, Robert; Ianc, Bianca; Izsvák, Zsuzsanna; Jablonski, Nina G; Johnstone, Laurel M; Karimpour-Fard, Anis; Konkel, Miriam K; Kostka, Dennis; Lazar, Nathan H; Lee, Sandra L; Lewis, Lora R; Liu, Yue; Locke, Devin P; Mallick, Swapan; Mendez, Fernando L.

Nature ; 513(7517): 195-201, 2014 Sep 11.

Article in English | MEDLINE | ID: mdl-25209798

ABSTRACT

Gibbons are small arboreal apes that display an accelerated rate of evolutionary chromosomal rearrangement and occupy a key node in the primate phylogeny between Old World monkeys and great apes. Here we present the assembly and analysis of a northern white-cheeked gibbon (Nomascus leucogenys) genome. We describe the propensity for a gibbon-specific retrotransposon (LAVA) to insert into chromosome segregation genes and alter transcription by providing a premature termination site, suggesting a possible molecular mechanism for the genome plasticity of the gibbon lineage. We further show that the gibbon genera (Nomascus, Hylobates, Hoolock and Symphalangus) experienced a near-instantaneous radiation ~5 million years ago, coincident with major geographical changes in southeast Asia that caused cycles of habitat compression and expansion. Finally, we identify signatures of positive selection in genes important for forelimb development (TBX5) and connective tissues (COL1A1) that may have been involved in the adaptation of gibbons to their arboreal habitat.

Subject(s)

Genome/genetics , Hylobates/classification , Hylobates/genetics , Karyotype , Phylogeny , Animals , Evolution, Molecular , Hominidae/classification , Hominidae/genetics , Humans , Molecular Sequence Data , Retroelements/genetics , Selection, Genetic , Transcription Termination, Genetic

5.

Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species.

Bradnam, Keith R; Fass, Joseph N; Alexandrov, Anton; Baranay, Paul; Bechner, Michael; Birol, Inanç; Boisvert, Sébastien; Chapman, Jarrod A; Chapuis, Guillaume; Chikhi, Rayan; Chitsaz, Hamidreza; Chou, Wen-Chi; Corbeil, Jacques; Del Fabbro, Cristian; Docking, T Roderick; Durbin, Richard; Earl, Dent; Emrich, Scott; Fedotov, Pavel; Fonseca, Nuno A; Ganapathy, Ganeshkumar; Gibbs, Richard A; Gnerre, Sante; Godzaridis, Elénie; Goldstein, Steve; Haimel, Matthias; Hall, Giles; Haussler, David; Hiatt, Joseph B; Ho, Isaac Y; Howard, Jason; Hunt, Martin; Jackman, Shaun D; Jaffe, David B; Jarvis, Erich D; Jiang, Huaiyang; Kazakov, Sergey; Kersey, Paul J; Kitzman, Jacob O; Knight, James R; Koren, Sergey; Lam, Tak-Wah; Lavenier, Dominique; Laviolette, François; Li, Yingrui; Li, Zhenyu; Liu, Binghang; Liu, Yue; Luo, Ruibang; Maccallum, Iain.

Gigascience ; 2(1): 10, 2013 Jul 22.

Article in English | MEDLINE | ID: mdl-23870653

ABSTRACT

BACKGROUND: The process of generating raw genome sequence data continues to become cheaper, faster, and more accurate. However, assembly of such data into high-quality, finished genome sequences remains challenging. Many genome assembly tools are available, but they differ greatly in terms of their performance (speed, scalability, hardware requirements, acceptance of newer read technologies) and in their final output (composition of assembled sequence). More importantly, it remains largely unclear how to best assess the quality of assembled genome sequences. The Assemblathon competitions are intended to assess current state-of-the-art methods in genome assembly. RESULTS: In Assemblathon 2, we provided a variety of sequence data to be assembled for three vertebrate species (a bird, a fish, and a snake). This resulted in a total of 43 submitted assemblies from 21 participating teams. We evaluated these assemblies using a combination of optical map data, Fosmid sequences, and several statistical methods. From over 100 different metrics, we chose ten key measures by which to assess the overall quality of the assemblies. CONCLUSIONS: Many current genome assemblers produced useful assemblies, containing a significant representation of their genes and overall genome structure. However, the high degree of variability between the entries suggests that there is still much room for improvement in the field of genome assembly and that approaches which work well in assembling the genome of one species may not necessarily work well for another.

6.

The African coelacanth genome provides insights into tetrapod evolution.

Amemiya, Chris T; Alföldi, Jessica; Lee, Alison P; Fan, Shaohua; Philippe, Hervé; Maccallum, Iain; Braasch, Ingo; Manousaki, Tereza; Schneider, Igor; Rohner, Nicolas; Organ, Chris; Chalopin, Domitille; Smith, Jeramiah J; Robinson, Mark; Dorrington, Rosemary A; Gerdol, Marco; Aken, Bronwen; Biscotti, Maria Assunta; Barucca, Marco; Baurain, Denis; Berlin, Aaron M; Blatch, Gregory L; Buonocore, Francesco; Burmester, Thorsten; Campbell, Michael S; Canapa, Adriana; Cannon, John P; Christoffels, Alan; De Moro, Gianluca; Edkins, Adrienne L; Fan, Lin; Fausto, Anna Maria; Feiner, Nathalie; Forconi, Mariko; Gamieldien, Junaid; Gnerre, Sante; Gnirke, Andreas; Goldstone, Jared V; Haerty, Wilfried; Hahn, Mark E; Hesse, Uljana; Hoffmann, Steve; Johnson, Jeremy; Karchner, Sibel I; Kuraku, Shigehiro; Lara, Marcia; Levin, Joshua Z; Litman, Gary W; Mauceli, Evan; Miyake, Tsutomu.

Nature ; 496(7445): 311-6, 2013 Apr 18.

Article in English | MEDLINE | ID: mdl-23598338

ABSTRACT

The discovery of a living coelacanth specimen in 1938 was remarkable, as this lineage of lobe-finned fish was thought to have become extinct 70 million years ago. The modern coelacanth looks remarkably similar to many of its ancient relatives, and its evolutionary proximity to our own fish ancestors provides a glimpse of the fish that first walked on land. Here we report the genome sequence of the African coelacanth, Latimeria chalumnae. Through a phylogenomic analysis, we conclude that the lungfish, and not the coelacanth, is the closest living relative of tetrapods. Coelacanth protein-coding genes are significantly more slowly evolving than those of tetrapods, unlike other genomic features. Analyses of changes in genes and regulatory elements during the vertebrate adaptation to land highlight genes involved in immunity, nitrogen excretion and the development of fins, tail, ear, eye, brain and olfaction. Functional assays of enhancers involved in the fin-to-limb transition and in the emergence of extra-embryonic tissues show the importance of the coelacanth genome as a blueprint for understanding tetrapod evolution.

Subject(s)

Biological Evolution , Fishes/classification , Fishes/genetics , Genome/genetics , Animals , Animals, Genetically Modified , Chick Embryo , Conserved Sequence/genetics , Enhancer Elements, Genetic/genetics , Evolution, Molecular , Extremities/anatomy & histology , Extremities/growth & development , Fishes/anatomy & histology , Fishes/physiology , Genes, Homeobox/genetics , Genomics , Immunoglobulin M/genetics , Mice , Molecular Sequence Annotation , Molecular Sequence Data , Phylogeny , Sequence Alignment , Sequence Analysis, DNA , Vertebrates/anatomy & histology , Vertebrates/genetics , Vertebrates/physiology

7.

De novo assembly of highly diverse viral populations.

Yang, Xiao; Charlebois, Patrick; Gnerre, Sante; Coole, Matthew G; Lennon, Niall J; Levin, Joshua Z; Qu, James; Ryan, Elizabeth M; Zody, Michael C; Henn, Matthew R.

BMC Genomics ; 13: 475, 2012 Sep 13.

Article in English | MEDLINE | ID: mdl-22974120

ABSTRACT

BACKGROUND: Extensive genetic diversity in viral populations within infected hosts and the divergence of variants from existing reference genomes impede the analysis of deep viral sequencing data. A de novo population consensus assembly is valuable both as a single linear representation of the population and as a backbone on which intra-host variants can be accurately mapped. The availability of consensus assemblies and robustly mapped variants are crucial to the genetic study of viral disease progression, transmission dynamics, and viral evolution. Existing de novo assembly techniques fail to robustly assemble ultra-deep sequence data from genetically heterogeneous populations such as viruses into full-length genomes due to the presence of extensive genetic variability, contaminants, and variable sequence coverage. RESULTS: We present VICUNA, a de novo assembly algorithm suitable for generating consensus assemblies from genetically heterogeneous populations. We demonstrate its effectiveness on Dengue, Human Immunodeficiency and West Nile viral populations, representing a range of intra-host diversity. Compared to state-of-the-art assemblers designed for haploid or diploid systems, VICUNA recovers full-length consensus and captures insertion/deletion polymorphisms in diverse samples. Final assemblies maintain a high base calling accuracy. VICUNA program is publicly available at: http://www.broadinstitute.org/scientific-community/science/projects/viral-genomics/viral-genomics-analysis-software. CONCLUSIONS: We developed VICUNA, a publicly available software tool, that enables consensus assembly of ultra-deep sequence derived from diverse viral populations. While VICUNA was developed for the analysis of viral populations, its application to other heterogeneous sequence data sets such as metagenomic or tumor cell population samples may prove beneficial in these fields of research.

Subject(s)

Genome, Viral/genetics , Software , Algorithms , Computational Biology

8.

A direct characterization of human mutation based on microsatellites.

Sun, James X; Helgason, Agnar; Masson, Gisli; Ebenesersdóttir, Sigríður Sunna; Li, Heng; Mallick, Swapan; Gnerre, Sante; Patterson, Nick; Kong, Augustine; Reich, David; Stefansson, Kari.

Nat Genet ; 44(10): 1161-5, 2012 Oct.

Article in English | MEDLINE | ID: mdl-22922873

ABSTRACT

Mutations are the raw material of evolution but have been difficult to study directly. We report the largest study of new mutations to date, comprising 2,058 germline changes discovered by analyzing 85,289 Icelanders at 2,477 microsatellites. The paternal-to-maternal mutation rate ratio is 3.3, and the rate in fathers doubles from age 20 to 58, whereas there is no association with age in mothers. Longer microsatellite alleles are more mutagenic and tend to decrease in length, whereas the opposite is seen for shorter alleles. We use these empirical observations to build a model that we apply to individuals for whom we have both genome sequence and microsatellite data, allowing us to estimate key parameters of evolution without calibration to the fossil record. We infer that the sequence mutation rate is 1.4-2.3×10⁻⁸ mutations per base pair per generation (90% credible interval) and that human-chimpanzee speciation occurred 3.7-6.6 million years ago.

Subject(s)

Genome, Human , Germ-Line Mutation , Microsatellite Repeats , Bayes Theorem , Evolution, Molecular , Female , Genetic Speciation , Humans , Male , Markov Chains , Models, Genetic , Monte Carlo Method , Pedigree

9.

Finished bacterial genomes from shotgun sequence data.

Ribeiro, Filipe J; Przybylski, Dariusz; Yin, Shuangye; Sharpe, Ted; Gnerre, Sante; Abouelleil, Amr; Berlin, Aaron M; Montmayeur, Anna; Shea, Terrance P; Walker, Bruce J; Young, Sarah K; Russ, Carsten; Nusbaum, Chad; MacCallum, Iain; Jaffe, David B.

Genome Res ; 22(11): 2270-7, 2012 Nov.

Article in English | MEDLINE | ID: mdl-22829535

ABSTRACT

Exceptionally accurate genome reference sequences have proven to be of great value to microbial researchers. Thus, to date, about 1800 bacterial genome assemblies have been "finished" at great expense with the aid of manual laboratory and computational processes that typically iterate over a period of months or even years. By applying a new laboratory design and new assembly algorithm to 16 samples, we demonstrate that assemblies exceeding finished quality can be obtained from whole-genome shotgun data and automated computation. Cost and time requirements are thus dramatically reduced.

Subject(s)

Bacteria/genetics , Genome, Bacterial , Genomic Library , Sequence Analysis, DNA/methods , Algorithms

10.

Whole genome deep sequencing of HIV-1 reveals the impact of early minor variants upon immune recognition during acute infection.

Henn, Matthew R; Boutwell, Christian L; Charlebois, Patrick; Lennon, Niall J; Power, Karen A; Macalalad, Alexander R; Berlin, Aaron M; Malboeuf, Christine M; Ryan, Elizabeth M; Gnerre, Sante; Zody, Michael C; Erlich, Rachel L; Green, Lisa M; Berical, Andrew; Wang, Yaoyu; Casali, Monica; Streeck, Hendrik; Bloom, Allyson K; Dudek, Tim; Tully, Damien; Newman, Ruchi; Axten, Karen L; Gladden, Adrianne D; Battis, Laura; Kemper, Michael; Zeng, Qiandong; Shea, Terrance P; Gujja, Sharvari; Zedlack, Carmen; Gasser, Olivier; Brander, Christian; Hess, Christoph; Günthard, Huldrych F; Brumme, Zabrina L; Brumme, Chanson J; Bazner, Suzane; Rychert, Jenna; Tinsley, Jake P; Mayer, Ken H; Rosenberg, Eric; Pereyra, Florencia; Levin, Joshua Z; Young, Sarah K; Jessen, Heiko; Altfeld, Marcus; Birren, Bruce W; Walker, Bruce D; Allen, Todd M.

PLoS Pathog ; 8(3): e1002529, 2012.

Article in English | MEDLINE | ID: mdl-22412369

ABSTRACT

Deep sequencing technologies have the potential to transform the study of highly variable viral pathogens by providing a rapid and cost-effective approach to sensitively characterize rapidly evolving viral quasispecies. Here, we report on a high-throughput whole HIV-1 genome deep sequencing platform that combines 454 pyrosequencing with novel assembly and variant detection algorithms. In one subject we combined these genetic data with detailed immunological analyses to comprehensively evaluate viral evolution and immune escape during the acute phase of HIV-1 infection. The majority of early, low frequency mutations represented viral adaptation to host CD8+ T cell responses, evidence of strong immune selection pressure occurring during the early decline from peak viremia. CD8+ T cell responses capable of recognizing these low frequency escape variants coincided with the selection and evolution of more effective secondary HLA-anchor escape mutations. Frequent, and in some cases rapid, reversion of transmitted mutations was also observed across the viral genome. When located within restricted CD8 epitopes these low frequency reverting mutations were sufficient to prime de novo responses to these epitopes, again illustrating the capacity of the immune response to recognize and respond to low frequency variants. More importantly, rapid viral escape from the most immunodominant CD8+ T cell responses coincided with plateauing of the initial viral load decline in this subject, suggestive of a potential link between maintenance of effective, dominant CD8 responses and the degree of early viremia reduction. We conclude that the early control of HIV-1 replication by immunodominant CD8+ T cell responses may be substantially influenced by rapid, low frequency viral adaptations not detected by conventional sequencing approaches, which warrants further investigation. 
These data support the critical need for vaccine-induced CD8+ T cell responses to target more highly constrained regions of the virus in order to ensure the maintenance of immunodominant CD8 responses and the sustained decline of early viremia.

Subject(s)

Genome, Viral/genetics , Genome-Wide Association Study , HIV Infections/virology , HIV-1/genetics , Immune Evasion/immunology , CD8-Positive T-Lymphocytes/immunology , Genetic Variation , Genomic Structural Variation , HIV Infections/immunology , HIV Infections/prevention & control , HIV-1/immunology , HIV-1/pathogenicity , Humans , Immune Evasion/genetics , Oligonucleotide Array Sequence Analysis , RNA, Viral/analysis , Sequence Analysis, RNA , Viral Vaccines/immunology

11.

A high-resolution map of human evolutionary constraint using 29 mammals.

Lindblad-Toh, Kerstin; Garber, Manuel; Zuk, Or; Lin, Michael F; Parker, Brian J; Washietl, Stefan; Kheradpour, Pouya; Ernst, Jason; Jordan, Gregory; Mauceli, Evan; Ward, Lucas D; Lowe, Craig B; Holloway, Alisha K; Clamp, Michele; Gnerre, Sante; Alföldi, Jessica; Beal, Kathryn; Chang, Jean; Clawson, Hiram; Cuff, James; Di Palma, Federica; Fitzgerald, Stephen; Flicek, Paul; Guttman, Mitchell; Hubisz, Melissa J; Jaffe, David B; Jungreis, Irwin; Kent, W James; Kostka, Dennis; Lara, Marcia; Martins, Andre L; Massingham, Tim; Moltke, Ida; Raney, Brian J; Rasmussen, Matthew D; Robinson, Jim; Stark, Alexander; Vilella, Albert J; Wen, Jiayu; Xie, Xiaohui; Zody, Michael C; Baldwin, Jen; Bloom, Toby; Chin, Chee Whye; Heiman, Dave; Nicol, Robert; Nusbaum, Chad; Young, Sarah; Wilkinson, Jane; Worley, Kim C.

Nature ; 478(7370): 476-82, 2011 Oct 12.

Article in English | MEDLINE | ID: mdl-21993624

ABSTRACT

The comparison of related genomes has emerged as a powerful lens for genome interpretation. Here we report the sequencing and comparative analysis of 29 eutherian genomes. We confirm that at least 5.5% of the human genome has undergone purifying selection, and locate constrained elements covering ~4.2% of the genome. We use evolutionary signatures and comparisons with experimental data sets to suggest candidate functions for ~60% of constrained bases. These elements reveal a small number of new coding exons, candidate stop codon readthrough events and over 10,000 regions of overlapping synonymous constraint within protein-coding exons. We find 220 candidate RNA structural families, and nearly a million elements overlapping potential promoter, enhancer and insulator regions. We report specific amino acid residues that have undergone positive selection, 280,000 non-coding elements exapted from mobile elements and more than 1,000 primate- and human-accelerated elements. Overlap with disease-associated variants indicates that our findings will be relevant for studies of human biology, health and disease.

Subject(s)

Evolution, Molecular , Genome, Human/genetics , Genome/genetics , Mammals/genetics , Animals , Disease , Exons/genetics , Genomics , Health , Humans , Molecular Sequence Annotation , Phylogeny , RNA/classification , RNA/genetics , Selection, Genetic/genetics , Sequence Alignment , Sequence Analysis, DNA

12.

High-quality draft assemblies of mammalian genomes from massively parallel sequence data.

Gnerre, Sante; Maccallum, Iain; Przybylski, Dariusz; Ribeiro, Filipe J; Burton, Joshua N; Walker, Bruce J; Sharpe, Ted; Hall, Giles; Shea, Terrance P; Sykes, Sean; Berlin, Aaron M; Aird, Daniel; Costello, Maura; Daza, Riza; Williams, Louise; Nicol, Robert; Gnirke, Andreas; Nusbaum, Chad; Lander, Eric S; Jaffe, David B.

Proc Natl Acad Sci U S A ; 108(4): 1513-8, 2011 Jan 25.

Article in English | MEDLINE | ID: mdl-21187386

ABSTRACT

Massively parallel DNA sequencing technologies are revolutionizing genomics by making it possible to generate billions of relatively short (~100-base) sequence reads at very low cost. Whereas such data can be readily used for a wide range of biomedical applications, it has proven difficult to use them to generate high-quality de novo genome assemblies of large, repeat-rich vertebrate genomes. To date, the genome assemblies generated from such data have fallen far short of those obtained with the older (but much more expensive) capillary-based sequencing approach. Here, we report the development of an algorithm for genome assembly, ALLPATHS-LG, and its application to massively parallel DNA sequence data from the human and mouse genomes, generated on the Illumina platform. The resulting draft genome assemblies have good accuracy, short-range contiguity, long-range connectivity, and coverage of the genome. In particular, the base accuracy is high (≥99.95%) and the scaffold sizes (N50 size = 11.5 Mb for human and 7.2 Mb for mouse) approach those obtained with capillary-based sequencing. The combination of improved sequencing technology and improved computational methods should now make it possible to increase dramatically the de novo sequencing of large genomes. The ALLPATHS-LG program is available at http://www.broadinstitute.org/science/programs/genome-biology/crd.
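The scaffold N50 figures quoted in this abstract use a standard assembly statistic: the length L such that scaffolds of length L or longer together contain at least half of the assembled bases. The sketch below shows that general definition only. It is not code from ALLPATHS-LG, and the function name `n50` is chosen here for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Sort lengths from longest to shortest and accumulate them; the N50 is
    the length at which the running total first reaches half of the sum.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
```

As a worked example, lengths 2 through 10 sum to 54. Accumulating from the longest gives 10 + 9 + 8 = 27, which reaches half of the total, so the N50 is 8.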

Subject(s)

Algorithms , Genomics/methods , Sequence Analysis, DNA/methods , Software , Animals , Genome/genetics , Humans , Internet , Mice , Reproducibility of Results

13.

ALLPATHS 2: small genomes assembled accurately and with high continuity from short paired reads.

Maccallum, Iain; Przybylski, Dariusz; Gnerre, Sante; Burton, Joshua; Shlyakhter, Ilya; Gnirke, Andreas; Malek, Joel; McKernan, Kevin; Ranade, Swati; Shea, Terrance P; Williams, Louise; Young, Sarah; Nusbaum, Chad; Jaffe, David B.

Genome Biol ; 10(10): R103, 2009.

Article in English | MEDLINE | ID: mdl-19796385

ABSTRACT

We demonstrate that genome sequences approaching finished quality can be generated from short paired reads. Using 36 base (fragment) and 26 base (jumping) reads from five microbial genomes of varied GC composition and sizes up to 40 Mb, ALLPATHS2 generated assemblies with long, accurate contigs and scaffolds. Velvet and EULER-SR were less accurate. For example, for Escherichia coli, the fraction of 10-kb stretches that were perfect was 99.8% (ALLPATHS2), 68.7% (Velvet), and 42.1% (EULER-SR).

Subject(s)

Bacteria/genetics , Fungi/genetics , Genome/genetics , Genomics/methods , Software , Base Pairing/genetics , Reproducibility of Results

14.

Genome sequence and analysis of the Irish potato famine pathogen Phytophthora infestans.

Haas, Brian J; Kamoun, Sophien; Zody, Michael C; Jiang, Rays H Y; Handsaker, Robert E; Cano, Liliana M; Grabherr, Manfred; Kodira, Chinnappa D; Raffaele, Sylvain; Torto-Alalibo, Trudy; Bozkurt, Tolga O; Ah-Fong, Audrey M V; Alvarado, Lucia; Anderson, Vicky L; Armstrong, Miles R; Avrova, Anna; Baxter, Laura; Beynon, Jim; Boevink, Petra C; Bollmann, Stephanie R; Bos, Jorunn I B; Bulone, Vincent; Cai, Guohong; Cakir, Cahid; Carrington, James C; Chawner, Megan; Conti, Lucio; Costanzo, Stefano; Ewan, Richard; Fahlgren, Noah; Fischbach, Michael A; Fugelstad, Johanna; Gilroy, Eleanor M; Gnerre, Sante; Green, Pamela J; Grenville-Briggs, Laura J; Griffith, John; Grünwald, Niklaus J; Horn, Karolyn; Horner, Neil R; Hu, Chia-Hui; Huitema, Edgar; Jeong, Dong-Hoon; Jones, Alexandra M E; Jones, Jonathan D G; Jones, Richard W; Karlsson, Elinor K; Kunjeti, Sridhara G; Lamour, Kurt; Liu, Zhenyu.

Nature ; 461(7262): 393-8, 2009 Sep 17.

Article in English | MEDLINE | ID: mdl-19741609

ABSTRACT

Phytophthora infestans is the most destructive pathogen of potato and a model organism for the oomycetes, a distinct lineage of fungus-like eukaryotes that are related to organisms such as brown algae and diatoms. As the agent of the Irish potato famine in the mid-nineteenth century, P. infestans has had a tremendous effect on human history, resulting in famine and population displacement. To this day, it affects world agriculture by causing the most destructive disease of potato, the fourth largest food crop and a critical alternative to the major cereal crops for feeding the world's population. Current annual worldwide potato crop losses due to late blight are conservatively estimated at $6.7 billion. Management of this devastating pathogen is challenged by its remarkable speed of adaptation to control strategies such as genetically resistant cultivars. Here we report the sequence of the P. infestans genome, which at approximately 240 megabases (Mb) is by far the largest and most complex genome sequenced so far in the chromalveolates. Its expansion results from a proliferation of repetitive DNA accounting for approximately 74% of the genome. Comparison with two other Phytophthora genomes showed rapid turnover and extensive expansion of specific families of secreted disease effector proteins, including many genes that are induced during infection or are predicted to have activities that alter host physiology. These fast-evolving effector genes are localized to highly dynamic and expanded regions of the P. infestans genome. This probably plays a crucial part in the rapid adaptability of the pathogen to host plants and underpins its evolutionary potential.

Subject(s)

Genome/genetics , Phytophthora infestans/genetics , Plant Diseases/microbiology , Solanum tuberosum/microbiology , Algal Proteins/genetics , DNA Transposable Elements/genetics , DNA, Intergenic/genetics , Evolution, Molecular , Host-Pathogen Interactions/genetics , Humans , Ireland , Molecular Sequence Data , Necrosis , Phenotype , Phytophthora infestans/pathogenicity , Plant Diseases/immunology , Solanum tuberosum/immunology , Starvation

15.

Assisted assembly: how to improve a de novo genome assembly by using related species.

Gnerre, Sante; Lander, Eric S; Lindblad-Toh, Kerstin; Jaffe, David B.

Genome Biol ; 10(8): R88, 2009.

Article in English | MEDLINE | ID: mdl-19712469

ABSTRACT

We describe a new assembly algorithm, where a genome assembly with low sequence coverage, either throughout the genome or locally, due to cloning bias, is considerably improved through an assisting process via a related genome. We show that the information provided by aligning the whole-genome shotgun reads of the target against a reference genome can be used to substantially improve the quality of the resulting assembly.

Subject(s)

Algorithms , Computational Biology/methods , Genome , Animals , Humans , Mammals/genetics , Plasmodium/genetics

16.

Closing gaps in the human genome using sequencing by synthesis.

Garber, Manuel; Zody, Michael C; Arachchi, Harindra M; Berlin, Aaron; Gnerre, Sante; Green, Lisa M; Lennon, Niall; Nusbaum, Chad.

Genome Biol ; 10(6): R60, 2009.

Article in English | MEDLINE | ID: mdl-19490611

ABSTRACT

The most recent release of the finished human genome contains 260 euchromatic gaps (excluding chromosome Y). Recent work has helped explain a large number of these unresolved regions as 'structural' in nature. Another class of gaps is likely to be refractory to clone-based approaches, and cannot be approached in ways previously described. We present an approach for closing these gaps using 454 sequencing. As a proof of principle, we closed all three remaining non-structural gaps in chromosome 15.

Subject(s)

Genome, Human , Sequence Analysis, DNA/methods , Base Sequence , Cell Line , Chromosomes, Human, Pair 15/genetics , Humans

17.

The difficulty of avoiding false positives in genome scans for natural selection.

Mallick, Swapan; Gnerre, Sante; Muller, Paul; Reich, David.

Genome Res ; 19(5): 922-33, 2009 May.

Article in English | MEDLINE | ID: mdl-19411606

ABSTRACT

Several studies have found evidence for more positive selection on the chimpanzee lineage compared with the human lineage since the two species split. A potential concern, however, is that these findings may simply reflect artifacts of the data: inaccuracies in the underlying chimpanzee genome sequence, which is of lower quality than human. To test this hypothesis, we generated de novo genome assemblies of chimpanzee and macaque and aligned them with human. We also implemented a novel bioinformatic procedure for producing alignments of closely related species that uses synteny information to remove misassembled and misaligned regions, and sequence quality scores to remove nucleotides that are less reliable. We applied this procedure to re-examine 59 genes recently identified as candidates for positive selection in chimpanzees. The great majority of these signals disappear after application of our new bioinformatic procedure. We also carried out laboratory-based resequencing of 10 of the regions in multiple chimpanzees and humans, and found that our alignments were correct wherever there was a conflict with the published results. These findings throw into question previous findings that there has been more positive selection in chimpanzees than in humans since the two species diverged. Our study also highlights the challenges of searching the extreme tails of distributions for signals of natural selection. Inaccuracies in the genome sequence at even a tiny fraction of genes can produce false-positive signals, which make it difficult to identify loci that have genuinely been targets of selection.

Subject(s)

Genome , Selection, Genetic , Sequence Analysis, DNA , Animals , Base Sequence , Evolution, Molecular , Genomics , Humans , Molecular Sequence Data , Pan troglodytes/genetics , Sequence Alignment , Synteny

18.

Analysis of chimpanzee history based on genome sequence alignments.

Caswell, Jennifer L; Mallick, Swapan; Richter, Daniel J; Neubauer, Julie; Schirmer, Christine; Gnerre, Sante; Reich, David.

PLoS Genet ; 4(4): e1000057, 2008 Apr 18.

Article in English | MEDLINE | ID: mdl-18421364

ABSTRACT

Population geneticists often study small numbers of carefully chosen loci, but it has become possible to obtain orders of magnitude more data from overlaps of genome sequences. Here, we generate tens of millions of base pairs of multiple sequence alignments from combinations of three western chimpanzees, three central chimpanzees, an eastern chimpanzee, a bonobo, a human, an orangutan, and a macaque. Analysis provides a more precise understanding of demographic history than was previously available. We show that bonobos and common chimpanzees were separated approximately 1,290,000 years ago, western and other common chimpanzees approximately 510,000 years ago, and eastern and central chimpanzees at least 50,000 years ago. We infer that the central chimpanzee population size increased by at least a factor of 4 since its separation from western chimpanzees, while the western chimpanzee effective population size decreased. Surprisingly, in about one percent of the genome, the genetic relationships between humans, chimpanzees, and bonobos appear to be different from the species relationships. We used PCR-based resequencing to confirm 11 regions where chimpanzees and bonobos are not most closely related. Study of such loci should provide information about the period of time 5-7 million years ago when the ancestors of humans separated from those of the chimpanzees.

Subject(s)

Evolution, Molecular , Genetics, Population , Genome , Pan troglodytes/genetics , Animals , Genetic Variation , Genome, Human , Genomics , Humans , Pan paniscus/genetics , Sequence Alignment

19.

Initial sequence and comparative analysis of the cat genome.

Pontius, Joan U; Mullikin, James C; Smith, Douglas R; Lindblad-Toh, Kerstin; Gnerre, Sante; Clamp, Michele; Chang, Jean; Stephens, Robert; Neelam, Beena; Volfovsky, Natalia; Schäffer, Alejandro A; Agarwala, Richa; Narfström, Kristina; Murphy, William J; Giger, Urs; Roca, Alfred L; Antunes, Agostinho; Menotti-Raymond, Marilyn; Yuhki, Naoya; Pecon-Slattery, Jill; Johnson, Warren E; Bourque, Guillaume; Tesler, Glenn; O'Brien, Stephen J.

Genome Res ; 17(11): 1675-89, 2007 Nov.

Article in English | MEDLINE | ID: mdl-17975172

ABSTRACT

The genome sequence (1.9-fold coverage) of an inbred Abyssinian domestic cat was assembled, mapped, and annotated with a comparative approach that involved cross-reference to annotated genome assemblies of six mammals (human, chimpanzee, mouse, rat, dog, and cow). The results resolved chromosomal positions for 663,480 contigs, 20,285 putative feline gene orthologs, and 133,499 conserved sequence blocks (CSBs). Additional annotated features include repetitive elements, endogenous retroviral sequences, nuclear mitochondrial (numt) sequences, micro-RNAs, and evolutionary breakpoints that suggest historic balancing of translocation and inversion incidences in distinct mammalian lineages. Large numbers of single nucleotide polymorphisms (SNPs), deletion insertion polymorphisms (DIPs), and short tandem repeats (STRs), suitable for linkage or association studies were characterized in the context of long stretches of chromosome homozygosity. In spite of the light coverage capturing approximately 65% of euchromatin sequence from the cat genome, these comparative insights shed new light on the tempo and mode of gene/genome evolution in mammals, promise several research applications for the cat, and also illustrate that a comparative approach using more deeply covered mammals provides an informative, preliminary annotation of a light (1.9-fold) coverage mammal genome sequence.

Subject(s)

Cats/genetics , Genome , Genomics , Animals , Dogs , Humans , Mice , MicroRNAs , Microsatellite Repeats , Models, Genetic , Polymorphism, Single Nucleotide , Rats , Repetitive Sequences, Nucleic Acid

20.

The Fusarium graminearum genome reveals a link between localized polymorphism and pathogen specialization.

Cuomo, Christina A; Güldener, Ulrich; Xu, Jin-Rong; Trail, Frances; Turgeon, B Gillian; Di Pietro, Antonio; Walton, Jonathan D; Ma, Li-Jun; Baker, Scott E; Rep, Martijn; Adam, Gerhard; Antoniw, John; Baldwin, Thomas; Calvo, Sarah; Chang, Yueh-Long; Decaprio, David; Gale, Liane R; Gnerre, Sante; Goswami, Rubella S; Hammond-Kosack, Kim; Harris, Linda J; Hilburn, Karen; Kennell, John C; Kroken, Scott; Magnuson, Jon K; Mannhaupt, Gertrud; Mauceli, Evan; Mewes, Hans-Werner; Mitterbauer, Rudolf; Muehlbauer, Gary; Münsterkötter, Martin; Nelson, David; O'Donnell, Kerry; Ouellet, Thérèse; Qi, Weihong; Quesneville, Hadi; Roncero, M Isabel G; Seong, Kye-Yong; Tetko, Igor V; Urban, Martin; Waalwijk, Cees; Ward, Todd J; Yao, Jiqiang; Birren, Bruce W; Kistler, H Corby.

Science ; 317(5843): 1400-2, 2007 Sep 07.

Article in English | MEDLINE | ID: mdl-17823352

ABSTRACT

We sequenced and annotated the genome of the filamentous fungus Fusarium graminearum, a major pathogen of cultivated cereals. Very few repetitive sequences were detected, and the process of repeat-induced point mutation, in which duplicated sequences are subject to extensive mutation, may partially account for the reduced repeat content and apparent low number of paralogous (ancestrally duplicated) genes. A second strain of F. graminearum contained more than 10,000 single-nucleotide polymorphisms, which were frequently located near telomeres and within other discrete chromosomal segments. Many highly polymorphic regions contained sets of genes implicated in plant-fungus interactions and were unusually divergent, with higher rates of recombination. These regions of genome innovation may result from selection due to interactions of F. graminearum with its plant hosts.

Subject(s)

Fusarium/genetics , Genome, Fungal , Polymorphism, Genetic , DNA, Fungal , Evolution, Molecular , Fusarium/physiology , Hordeum/microbiology , Molecular Sequence Data , Plant Diseases/microbiology , Point Mutation , Polymorphism, Single Nucleotide , Sequence Analysis, DNA
