2.
PLoS Biol ; 5(10): e254, 2007 Sep 04.
Article in English | MEDLINE | ID: mdl-17803354

ABSTRACT

Presented here is a genome sequence of an individual human. It was produced from approximately 32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison of alternate alleles within this individual diploid genome. Comparison of this genome and the National Center for Biotechnology Information human reference assembly revealed more than 4.1 million DNA variants, encompassing 12.3 Mb. These variants (of which 1,288,319 were novel) included 3,213,401 single nucleotide polymorphisms (SNPs), 53,823 block substitutions (2-206 bp), 292,102 heterozygous insertion/deletion events (indels) (1-571 bp), 559,473 homozygous indels (1-82,711 bp), 90 inversions, as well as numerous segmental duplications and copy number variation regions. Non-SNP DNA variation accounts for 22% of all events identified in the donor; however, these events involve 74% of all variant bases. This suggests an important role for non-SNP genetic alterations in defining the diploid genome structure. Moreover, 44% of genes were heterozygous for one or more variants. Using a novel haplotype assembly strategy, we were able to span 1.5 Gb of genome sequence in segments >200 kb, providing further precision to the diploid nature of the genome. These data depict a definitive molecular portrait of a diploid human genome that provides a starting point for future genome comparisons and enables an era of individualized genomic information.
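The event counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes the five itemized variant classes are the events behind the "more than 4.1 million" total and the 22% non-SNP figure (segmental duplications and copy number variants are not itemized in the abstract and are left out). The variable names are hypothetical and do not come from the paper.

```python
# Variant counts exactly as quoted in the abstract above.
# Assumption: only these five itemized classes are tallied.
variant_counts = {
    "SNPs": 3_213_401,
    "block substitutions": 53_823,
    "heterozygous indels": 292_102,
    "homozygous indels": 559_473,
    "inversions": 90,
}

# Total number of variant events across the itemized classes.
total_events = sum(variant_counts.values())

# Everything that is not a SNP.
non_snp_events = total_events - variant_counts["SNPs"]

# Fraction of events that are non-SNP.
non_snp_share = non_snp_events / total_events

print(f"total events:   {total_events:,}")
print(f"non-SNP events: {non_snp_events:,}")
print(f"non-SNP share:  {non_snp_share:.1%}")
```

Under that assumption the itemized classes sum to 4,118,889 events, of which 905,488 (about 22%) are non-SNP, which agrees with the figures stated in the abstract.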


Subjects
Chromosome Mapping; Diploidy; Genome, Human; Sequence Analysis, DNA; Base Sequence; Chromosome Mapping/instrumentation; Chromosome Mapping/methods; Chromosomes, Human; Chromosomes, Human, Y/genetics; Gene Dosage; Genotype; Haplotypes; Human Genome Project; Humans; INDEL Mutation; In Situ Hybridization, Fluorescence; Male; Microarray Analysis; Middle Aged; Molecular Sequence Data; Pedigree; Phenotype; Polymorphism, Single Nucleotide; Reproducibility of Results; Sequence Analysis, DNA/instrumentation; Sequence Analysis, DNA/methods
3.
Science ; 316(5822): 222-34, 2007 Apr 13.
Article in English | MEDLINE | ID: mdl-17431167

ABSTRACT

The rhesus macaque (Macaca mulatta) is an abundant primate species that diverged from the ancestors of Homo sapiens about 25 million years ago. Because they are genetically and physiologically similar to humans, rhesus monkeys are the most widely used nonhuman primate in basic and applied biomedical research. We determined the genome sequence of an Indian-origin Macaca mulatta female and compared the data with chimpanzees and humans to reveal the structure of ancestral primate genomes and to identify evidence for positive selection and lineage-specific expansions and contractions of gene families. A comparison of sequences from individual animals was used to investigate their underlying genetic diversity. The complete description of the macaque genome blueprint enhances the utility of this animal model for biomedical research and improves our understanding of the basic biology of the species.


Subjects
Evolution, Molecular; Genome; Macaca mulatta/genetics; Animals; Biomedical Research; Female; Gene Duplication; Gene Rearrangement; Genetic Diseases, Inborn; Genetic Variation; Humans; Male; Multigene Family; Mutation; Pan troglodytes/genetics; Sequence Analysis, DNA; Species Specificity
4.
PLoS Biol ; 5(3): e16, 2007 Mar.
Article in English | MEDLINE | ID: mdl-17355171

ABSTRACT

Metagenomics projects based on shotgun sequencing of populations of micro-organisms yield insight into protein families. We used sequence similarity clustering to explore proteins with a comprehensive dataset consisting of sequences from available databases together with 6.12 million proteins predicted from an assembly of 7.7 million Global Ocean Sampling (GOS) sequences. The GOS dataset covers nearly all known prokaryotic protein families. A total of 3,995 medium- and large-sized clusters consisting of only GOS sequences are identified, out of which 1,700 have no detectable homology to known families. The GOS-only clusters contain a higher than expected proportion of sequences of viral origin, thus reflecting a poor sampling of viral diversity until now. Protein domain distributions in the GOS dataset and current protein databases show distinct biases. Several protein domains that were previously categorized as kingdom specific are shown to have GOS examples in other kingdoms. About 6,000 sequences (ORFans) from the literature that heretofore lacked similarity to known proteins have matches in the GOS data. The GOS dataset is also used to improve remote homology detection. Overall, besides nearly doubling the number of current proteins, the predicted GOS proteins also add a great deal of diversity to known protein families and shed light on their evolution. These observations are illustrated using several protein families, including phosphatases, proteases, ultraviolet-irradiation DNA damage repair enzymes, glutamine synthetase, and RuBisCO. The diversity added by GOS data has implications for choosing targets for experimental structure characterization as part of structural genomics efforts. Our analysis indicates that new families are being discovered at a rate that is linear or almost linear with the addition of new sequences, implying that we are still far from discovering all protein families in nature.


Subjects
Proteins/chemistry; Expressed Sequence Tags; Oceans and Seas; Proteins/genetics; Water Microbiology
5.
PLoS Biol ; 5(3): e77, 2007 Mar.
Article in English | MEDLINE | ID: mdl-17355176

ABSTRACT

The world's oceans contain a complex mixture of micro-organisms that are, for the most part, uncharacterized both genetically and biochemically. We report here a metagenomic study of the marine planktonic microbiota in which surface (mostly marine) water samples were analyzed as part of the Sorcerer II Global Ocean Sampling expedition. These samples, collected across a several-thousand km transect from the North Atlantic through the Panama Canal and ending in the South Pacific, yielded an extensive dataset consisting of 7.7 million sequencing reads (6.3 billion bp). Though a few major microbial clades dominate the planktonic marine niche, the dataset contains great diversity with 85% of the assembled sequence and 57% of the unassembled data being unique at a 98% sequence identity cutoff. Using the metadata associated with each sample and sequencing library, we developed new comparative genomic and assembly methods. One comparative genomic method, termed "fragment recruitment," addressed questions of genome structure, evolution, and taxonomic or phylogenetic diversity, as well as the biochemical diversity of genes and gene families. A second method, termed "extreme assembly," made possible the assembly and reconstruction of large segments of abundant but clearly nonclonal organisms. Within all abundant populations analyzed, we found extensive intra-ribotype diversity in several forms: (1) extensive sequence variation within orthologous regions throughout a given genome; despite coverage of individual ribotypes approaching 500-fold, most individual sequencing reads are unique; (2) numerous changes in gene content, some with direct adaptive implications; and (3) hypervariable genomic islands that are too variable to assemble. The intra-ribotype diversity is organized into genetically isolated populations that have overlapping but independent distributions, implying distinct environmental preference. We present novel methods for measuring the genomic similarity between metagenomic samples and show how they may be grouped into several community types. Specific functional adaptations can be identified both within individual ribotypes and across the entire community, including proteorhodopsin spectral tuning and the presence or absence of the phosphate-binding gene PstS.


Subjects
Water Microbiology; Computational Biology; Food Chain; Oceans and Seas; Plankton; Species Specificity
8.
Science ; 304(5667): 66-74, 2004 Apr 02.
Article in English | MEDLINE | ID: mdl-15001713

ABSTRACT

We have applied "whole-genome shotgun sequencing" to microbial populations collected en masse on tangential flow and impact filters from seawater samples collected from the Sargasso Sea near Bermuda. A total of 1.045 billion base pairs of nonredundant sequence was generated, annotated, and analyzed to elucidate the gene content, diversity, and relative abundance of the organisms within these environmental samples. These data are estimated to derive from at least 1800 genomic species based on sequence relatedness, including 148 previously unknown bacterial phylotypes. We have identified over 1.2 million previously unknown genes represented in these samples, including more than 782 new rhodopsin-like photoreceptors. Variation in species present and stoichiometry suggests substantial oceanic microbial diversity.


Subjects
Archaea/genetics; Bacteria/genetics; Ecosystem; Genome, Bacterial; Genomics; Seawater/microbiology; Sequence Analysis, DNA; Atlantic Ocean; Bacteriophages/genetics; Biodiversity; Computational Biology; Cyanobacteria/genetics; Cyanobacteria/growth & development; Cyanobacteria/metabolism; Eukaryotic Cells; Genes, Archaeal; Genes, Bacterial; Genes, rRNA; Genome, Archaeal; Molecular Sequence Data; Photosynthesis; Phylogeny; Plasmids; Rhodopsin/genetics; Rhodopsins, Microbial; Water Microbiology
9.
Proc Natl Acad Sci U S A ; 101(7): 1916-21, 2004 Feb 17.
Article in English | MEDLINE | ID: mdl-14769938

ABSTRACT

We report a whole-genome shotgun assembly (called WGSA) of the human genome generated at Celera in 2001. The Celera-generated shotgun data set consisted of 27 million sequencing reads organized in pairs by virtue of end-sequencing 2-kbp, 10-kbp, and 50-kbp inserts from shotgun clone libraries. The quality-trimmed reads covered the genome 5.3 times, and the inserts from which pairs of reads were obtained covered the genome 39 times. With the nearly complete human DNA sequence [National Center for Biotechnology Information (NCBI) Build 34] now available, it is possible to directly assess the quality, accuracy, and completeness of WGSA and of the first reconstructions of the human genome reported in two landmark papers in February 2001 [Venter, J. C., Adams, M. D., Myers, E. W., Li, P. W., Mural, R. J., Sutton, G. G., Smith, H. O., Yandell, M., Evans, C. A., Holt, R. A., et al. (2001) Science 291, 1304-1351; International Human Genome Sequencing Consortium (2001) Nature 409, 860-921]. The analysis of WGSA shows 97% order and orientation agreement with NCBI Build 34, where most of the 3% of sequence out of order is due to scaffold placement problems as opposed to assembly errors within the scaffolds themselves. In addition, WGSA fills some of the remaining gaps in NCBI Build 34. The early genome sequences all covered about the same amount of the genome, but they did so in different ways. The Celera results provide more order and orientation, and the consortium sequence provides better coverage of exact and nearly exact repeats.


Subjects
Computational Biology; Genome, Human; Human Genome Project; Computational Biology/standards; Contig Mapping/standards; Humans; RNA, Messenger/analysis; Software
10.
Science ; 301(5641): 1898-903, 2003 Sep 26.
Article in English | MEDLINE | ID: mdl-14512627

ABSTRACT

A survey of the dog genome sequence (6.22 million sequence reads; 1.5x coverage) demonstrates the power of sample sequencing for comparative analysis of mammalian genomes and the generation of species-specific resources. More than 650 million base pairs (>25%) of dog sequence align uniquely to the human genome, including fragments of putative orthologs for 18,473 of 24,567 annotated human genes. Mutation rates, conserved synteny, repeat content, and phylogeny can be compared among human, mouse, and dog. A variety of polymorphic elements are identified that will be valuable for mapping the genetic basis of diseases and traits in the dog.


Subjects
Dogs/genetics; Genome; Sequence Analysis, DNA; Animals; Chromosomes, Mammalian/genetics; Computational Biology; Conserved Sequence; Contig Mapping; DNA, Intergenic; Genetic Variation; Genome, Human; Genomics; Humans; Long Interspersed Nucleotide Elements; Male; Mice/genetics; Molecular Sequence Data; Mutation; Phylogeny; Physical Chromosome Mapping; Polymorphism, Single Nucleotide; RNA, Messenger/genetics; Repetitive Sequences, Nucleic Acid; Sequence Alignment; Short Interspersed Nucleotide Elements; Synteny
11.
Nat Genet ; 33 Suppl: 219-27, 2003 Mar.
Article in English | MEDLINE | ID: mdl-12610531

ABSTRACT

In reviewing the past decade, it is clear that genomics was, and still is, driven by innovative technologies, perhaps more so than any other scientific area in recent memory. From the outset, computing, mathematics and new automated laboratory techniques have been key components in allowing the field to move forward rapidly. We highlight some key innovations that have come together to nurture the explosive growth that makes a new era of genomics a reality. We also document how these new approaches have fueled further innovations and discoveries.


Subjects
Genomics/trends; Animals; Databases, Genetic/history; Expressed Sequence Tags; Genomics/history; History, 20th Century; History, 21st Century; Humans; Sequence Analysis, DNA/history
12.
Science ; 296(5573): 1661-71, 2002 May 31.
Article in English | MEDLINE | ID: mdl-12040188

ABSTRACT

The high degree of similarity between the mouse and human genomes is demonstrated through analysis of the sequence of mouse chromosome 16 (Mmu 16), which was obtained as part of a whole-genome shotgun assembly of the mouse genome. The mouse genome is about 10% smaller than the human genome, owing to a lower repetitive DNA content. Comparison of the structure and protein-coding potential of Mmu 16 with that of the homologous segments of the human genome identifies regions of conserved synteny with human chromosomes (Hsa) 3, 8, 12, 16, 21, and 22. Gene content and order are highly conserved between Mmu 16 and the syntenic blocks of the human genome. Of the 731 predicted genes on Mmu 16, 509 align with orthologs on the corresponding portions of the human genome, 44 are likely paralogous to these genes, and 164 genes have homologs elsewhere in the human genome; there are 14 genes for which we could find no human counterpart.
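The gene breakdown quoted in this abstract can be checked for internal consistency. The short sketch below simply sums the four categories and compares them with the stated total of 731 predicted genes; the category labels and variable names are illustrative shorthand, not terms taken from the paper.

```python
# Gene categories for mouse chromosome 16 as quoted in the abstract above.
gene_categories = {
    "orthologs in syntenic human regions": 509,
    "likely paralogs of those genes": 44,
    "homologs elsewhere in the human genome": 164,
    "no human counterpart found": 14,
}

# Total number of predicted genes stated in the abstract.
predicted_total = 731

# Sum of the four quoted categories.
category_sum = sum(gene_categories.values())

# Share of predicted genes with an ortholog in the syntenic blocks.
ortholog_share = gene_categories["orthologs in syntenic human regions"] / predicted_total

print(f"sum of categories: {category_sum} (stated total: {predicted_total})")
print(f"ortholog share:    {ortholog_share:.1%}")
```

The four categories add up to the stated 731 genes, and roughly 70% of the predicted genes align with orthologs in the corresponding human regions.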


Subjects
Chromosomes/genetics; Genome, Human; Genome; Mice, Inbred Strains/genetics; Sequence Analysis, DNA; Synteny; Animals; Base Composition; Chromosomes, Human/genetics; Computational Biology; Conserved Sequence; Databases, Nucleic Acid; Evolution, Molecular; Genes; Genetic Markers; Genomics; Humans; Mice; Mice, Inbred A/genetics; Mice, Inbred DBA/genetics; Molecular Sequence Data; Physical Chromosome Mapping; Proteins/chemistry; Proteins/genetics; Sequence Alignment; Species Specificity