Results 1 - 9 of 9
1.
Science ; 291(5507): 1304-51, 2001 02 16.
Article in English | MEDLINE | ID: mdl-11181995

ABSTRACT

A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies (a whole-genome assembly and a regional chromosome assembly) were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional approximately 12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history.
Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.
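
The coverage figures quoted in this abstract can be checked with simple arithmetic. The sketch below is illustrative only: the variable and function names are assumptions, and the inputs are the rounded values given in the abstract. With those rounded values the computed coverage comes out near 5.09, slightly below the stated 5.11-fold; the abstract does not say which genome size was used as the denominator, so a small discrepancy is expected.

```python
# Rounded figures taken from the abstract; all names here are illustrative.
TOTAL_READ_BP = 14.8e9            # total bases in the sequence reads
READ_COUNT = 27_271_853           # number of high-quality reads
GENOME_BP = 2.91e9                # consensus sequence length
STATED_READ_COVERAGE = 5.11       # coverage stated for the reads
SHREDDED_PUBLIC_COVERAGE = 2.9    # coverage added by shredded public data
BP_PER_DIFFERENCE = 1250          # one difference per 1250 bp between haploid genomes


def fold_coverage(total_bases: float, genome_size: float) -> float:
    """Fold coverage is total sequenced bases divided by genome size."""
    return total_bases / genome_size


# Mean read length implied by the totals (roughly 543 bp).
mean_read_length = TOTAL_READ_BP / READ_COUNT

# Coverage recomputed from the rounded totals (roughly 5.09-fold).
read_coverage = fold_coverage(TOTAL_READ_BP, GENOME_BP)

# Stated read coverage plus shredded public data gives about eightfold.
combined_coverage = STATED_READ_COVERAGE + SHREDDED_PUBLIC_COVERAGE

# Expected number of differing sites between two random haploid genomes.
expected_pairwise_differences = GENOME_BP / BP_PER_DIFFERENCE

print(f"mean read length: {mean_read_length:.0f} bp")
print(f"read coverage: {read_coverage:.2f}-fold")
print(f"combined coverage: {combined_coverage:.2f}-fold")
print(f"expected pairwise differences: {expected_pairwise_differences:,.0f}")
```

The combined figure (5.11 + 2.9) matches the "eightfold" effective coverage reported for the assemblies.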


Subject(s)
Genome, Human , Human Genome Project , Sequence Analysis, DNA , Algorithms , Animals , Chromosome Banding , Chromosome Mapping , Chromosomes, Artificial, Bacterial , Computational Biology , Consensus Sequence , CpG Islands , DNA, Intergenic , Databases, Factual , Evolution, Molecular , Exons , Female , Gene Duplication , Genes , Genetic Variation , Humans , Introns , Male , Phenotype , Physical Chromosome Mapping , Polymorphism, Single Nucleotide , Proteins/genetics , Proteins/physiology , Pseudogenes , Repetitive Sequences, Nucleic Acid , Retroelements , Sequence Analysis, DNA/methods , Species Specificity
3.
Science ; 287(5461): 2204-15, 2000 Mar 24.
Article in English | MEDLINE | ID: mdl-10731134

ABSTRACT

A comparative analysis of the genomes of Drosophila melanogaster, Caenorhabditis elegans, and Saccharomyces cerevisiae, and of the proteins they are predicted to encode, was undertaken in the context of cellular, developmental, and evolutionary processes. The nonredundant protein sets of flies and worms are similar in size and are only twice that of yeast, but different gene families are expanded in each genome, and the multidomain proteins and signaling pathways of the fly and worm are far more complex than those of yeast. The fly has orthologs to 177 of the 289 human disease genes examined and provides the foundation for rapid analysis of some of the basic processes involved in human disease.


Subject(s)
Caenorhabditis elegans/genetics , Drosophila melanogaster/genetics , Genome , Proteome , Saccharomyces cerevisiae/genetics , Animals , Apoptosis/genetics , Biological Evolution , Caenorhabditis elegans/chemistry , Caenorhabditis elegans/physiology , Cell Adhesion/genetics , Cell Cycle/genetics , Drosophila melanogaster/chemistry , Drosophila melanogaster/physiology , Fungal Proteins/chemistry , Fungal Proteins/genetics , Genes, Duplicate , Genetic Diseases, Inborn/genetics , Genetics, Medical , Helminth Proteins/chemistry , Helminth Proteins/genetics , Humans , Immunity/genetics , Insect Proteins/chemistry , Insect Proteins/genetics , Multigene Family , Neoplasms/genetics , Protein Structure, Tertiary , Saccharomyces cerevisiae/chemistry , Saccharomyces cerevisiae/physiology , Signal Transduction/genetics
4.
Genetics ; 154(2): 623-33, 2000 Feb.
Article in English | MEDLINE | ID: mdl-10655216

ABSTRACT

Neurospora crassa and related heterothallic ascomycetes produce eight homokaryotic self-sterile ascospores per ascus. In contrast, asci of N. tetrasperma contain four self-fertile ascospores each with nuclei of both mating types (matA and mata). The self-fertile ascospores of N. tetrasperma result from first-division segregation of mating type and nuclear spindle overlap at the second meiotic division and at a subsequent mitotic division. Recently, Merino et al. presented population-genetic evidence that crossing over is suppressed on the mating-type chromosome of N. tetrasperma, thereby preventing second-division segregation of mating type and the formation of self-sterile ascospores. The present study experimentally confirmed suppressed crossing over for a large segment of the mating-type chromosome by examining segregation of markers in crosses of wild strains. Surprisingly, our study also revealed a region on the far left arm where recombination is obligatory. In cytological studies, we demonstrated that suppressed recombination correlates with an extensive unpaired region at pachytene. Taken together, these results suggest an unpaired region adjacent to one or more paired regions, analogous to the nonpairing and pseudoautosomal regions of animal sex chromosomes. The observed pairing and obligate crossover likely reflect mechanisms to ensure chromosome disjunction.


Subject(s)
Chromosomes, Fungal , Neurospora/genetics , Recombination, Genetic , Base Sequence , Crosses, Genetic , Crossing Over, Genetic , DNA Primers
5.
Nucleic Acids Res ; 28(1): 31-2, 2000 Jan 01.
Article in English | MEDLINE | ID: mdl-10592174

ABSTRACT

The Genome Sequence DataBase (GSDB) is a database of publicly available nucleotide sequences and their associated biological and bibliographic information. Several notable changes have occurred in the past year: GSDB stopped accepting data submissions from researchers; ownership of data submitted to GSDB was transferred to GenBank; sequence analysis capabilities were expanded to include Smith-Waterman and Frame Search; and Sequence Viewer became available to Mac users. The content of GSDB remains up-to-date because publicly available data is acquired from the International Nucleotide Sequence Database Collaboration databases (IC) on a nightly basis. This allows GSDB to continue providing researchers with the ability to analyze, query and retrieve nucleotide sequences in the database. GSDB and its related tools are freely accessible from the URL: http://www.ncgr.org
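
For readers unfamiliar with the Smith-Waterman method mentioned in this abstract, the following is a minimal sketch of local alignment by dynamic programming. It is not GSDB's implementation, which the abstract does not describe; the function name, the scoring values (match, mismatch, gap), and the use of a simple linear gap penalty are all assumptions chosen for illustration.

```python
def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment.

    Scoring values are illustrative assumptions; gaps use a linear penalty.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at zero is what makes the alignment local.
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


# Example: find the shared ACGT segment inside a longer sequence.
score, top, bottom = smith_waterman("TTTTACGTTTTT", "ACGT")
print(score, top, bottom)
```

Production search services typically use affine gap penalties and substitution matrices rather than the fixed scores assumed here.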


Subject(s)
Databases, Factual , Genome , Information Storage and Retrieval , Ownership , Sequence Analysis
6.
Nucleic Acids Res ; 27(1): 35-8, 1999 Jan 01.
Article in English | MEDLINE | ID: mdl-9847136

ABSTRACT

During 1998 the primary focus of the Genome Sequence DataBase (GSDB; http://www.ncgr.org/gsdb) located at the National Center for Genome Resources (NCGR) has been to improve data quality, improve data collections, and provide new methods and tools to access and analyze data. Data quality has been improved by extensive curation of certain data fields necessary for maintaining data collections and for using certain tools. Data quality has also been increased by improvements to the suite of programs that import data from the International Nucleotide Sequence Database Collaboration (IC). The Sequence Tag Alignment and Consensus Knowledgebase (STACK), a database of human expressed gene sequences developed by the South African National Bioinformatics Institute (SANBI), became available within the last year, allowing public access to this valuable resource of expressed sequences. Data access was improved by the addition of the Sequence Viewer, a platform-independent graphical viewer for GSDB sequence data. This tool has also been integrated with other searching and data retrieval tools. A BLAST homology search service was also made available, allowing researchers to search all of the data, including the unique data, that are available from GSDB. These improvements are designed to make GSDB more accessible to users, extend the rich searching capability already present in GSDB, and facilitate the transition to an integrated system containing many different types of biological data.


Subject(s)
Base Sequence , Databases, Factual , Genome , Information Storage and Retrieval , Animals , Computational Biology , Consensus Sequence , Gene Expression , Genome, Human , Humans , Sequence Alignment
7.
Nucleic Acids Res ; 26(1): 21-6, 1998 Jan 01.
Article in English | MEDLINE | ID: mdl-9399793

ABSTRACT

In 1997 the primary focus of the Genome Sequence DataBase (GSDB; www.ncgr.org/gsdb) located at the National Center for Genome Resources was to improve data quality and accessibility. Efforts to increase the quality of data within the database included two major projects: one to identify and remove all vector contamination from sequences in the database and one to create premier sequence sets (including both alignments and discontiguous sequences). Data accessibility was improved during the course of the last year in several ways. First, a graphical database sequence viewer was made available to researchers. Second, an update process was implemented for the web-based query tool, Maestro. Third, a web-based tool, Excerpt, was developed to retrieve selected regions of any sequence in the database. And lastly, a GSDB flatfile that contains annotation unique to GSDB (e.g., sequence analysis and alignment data) was developed. Additionally, the GSDB web site provides a tool for the detection of matrix attachment regions (MARs), which can be used to identify regions of high coding potential. The ultimate goal of this work is to make GSDB a more useful resource for genomic comparison studies and gene level studies by improving data quality and by providing data access capabilities that are consistent with the needs of both types of studies.


Subject(s)
Databases, Factual , Genome , Base Sequence , Computer Communication Networks , Forecasting , Information Storage and Retrieval
8.
Nucleic Acids Res ; 25(1): 18-23, 1997 Jan 01.
Article in English | MEDLINE | ID: mdl-9016496

ABSTRACT

The Genome Sequence DataBase (GSDB) has completed its conversion to an improved relational database. The new database, GSDB 1.0, is fully operational and publicly available. Data contributions, including both original sequence submissions and community annotation, are being accomplished through the use of a graphical client-server interface tool, the GSDB Annotator, and via GIO (GSDB Input/Output) files. Data retrieval services are being provided through a new Web Query Tool and direct SQL. All methods of data contribution and data retrieval fully support the new data types that have been incorporated into GSDB, including discontiguous sequences, multiple sequence alignments, and community annotation.


Subject(s)
Base Sequence , Databases, Factual , Animals , Humans , Private Sector , Software
9.
Fungal Genet Biol ; 21(1): 153-62, 1997 Feb.
Article in English | MEDLINE | ID: mdl-9126624

ABSTRACT

We examined the phylogenetic relationships among five heterothallic species of Neurospora using restriction fragment polymorphisms derived from cosmid probes and sequence data from the upstream regions of two genes, al-1 and frq. Distance, maximum likelihood, and parsimony trees derived from the data support the hypothesis that strains assigned to N. sitophila, N. discreta, and N. tetrasperma form respective monophyletic groups. Strains assigned to N. intermedia and N. crassa, however, did not form two respective monophyletic groups, consistent with a previous suggestion based on analysis of mitochondrial DNAs that N. crassa and N. intermedia may be incompletely resolved sister taxa. Trees derived from restriction fragments and the al-1 sequence position N. tetrasperma as the sister species of N. sitophila. None of the trees produced by our data supported a previous analysis of sequences in the region of the mating type idiomorph that grouped N. crassa and N. sitophila as sister taxa, as well as N. intermedia and N. tetrasperma as sister taxa. Moreover, sequences from al-1, frq, and the mating-type region produced different trees when analyzed separately. The lack of consensus obtained with different sequences could result from the sorting of ancestral polymorphism during speciation or gene flow across species boundaries, or both.


Subject(s)
Neurospora/genetics , Phylogeny , Genes, Fungal/genetics , Genes, Mating Type, Fungal , Molecular Sequence Data , Polymorphism, Restriction Fragment Length , Sequence Analysis, DNA