ABSTRACT
The ORFDB (http://orf.invitrogen.com/) represents an ongoing effort at Invitrogen Corporation to integrate relevant scientific data with an evolving collection of human and mouse Open Reading Frame (ORF) clones (Ultimate ORF Clones). The ORFDB serves as a central data warehouse that researchers can search through its web portal, ORFBrowser. Ultimate ORF clones can be found by BLAST, keyword, GenBank accession, gene symbol, clone ID, UniGene ID or LocusLink ID, or through functional relationships by browsing the collection with the Gene Ontology (GO) Browser. As of October 2003, the ORFDB contains 6200 human and 2870 mouse Ultimate ORF clones. All Ultimate ORF clones have been fully sequenced to high quality and matched to public reference protein sequences. In addition, the cloned ORFs have been extensively annotated across six categories (Gene, ORF, Clone Format, Protein, SNP and Genomic links), with the information assembled in a format termed the ORFCard. The ORFCard is an information repository that documents the sequence quality, the alignment with respect to public protein sequences, and the latest publicly available information associated with each human and mouse gene represented in the collection.
Subject(s)
Databases, Genetic , Open Reading Frames/genetics , Animals , Cloning, Molecular , Computational Biology , DNA, Complementary/genetics , Gene Library , Genomics , Humans , Information Storage and Retrieval , Internet , Mice , Polymorphism, Single Nucleotide/genetics , Proteins/genetics , Proteomics , Quality Control , Software , User-Computer Interface
ABSTRACT
Databases of experimentally generated and computationally derived transcript sequences are valuable resources for genome analysis and annotation. The utility of such databases is enhanced when the sequences they contain are integrated with such biological information as genomic location, gene function, gene expression and phenotypic variation. We present the analysis and results of a semi-automated process of connecting transcript assemblies with highly curated biological information for mouse genes that is available through the Mouse Genome Informatics (MGI) database.
Subject(s)
Databases, Nucleic Acid , Genome , Mice/genetics , Transcription, Genetic , Animals , Computational Biology/methods
ABSTRACT
TGICL is a pipeline for the analysis of large Expressed Sequence Tag (EST) and mRNA databases in which the sequences are first clustered based on pairwise sequence similarity and then assembled cluster by cluster (optionally with quality values) to produce longer, more complete consensus sequences. The system can run on multi-CPU architectures including SMP and PVM.
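The clustering step described above can be illustrated with a minimal sketch. This is not TGICL's implementation: it assumes that pairwise similarity hits have already been computed elsewhere, and it groups sequences by transitive closure (single linkage) with a union-find structure. The function name `cluster_by_similarity`, the `min_identity` threshold and the example identifiers are all hypothetical, and the per-cluster assembly step is not shown.

```python
from collections import defaultdict


def cluster_by_similarity(seq_ids, hits, min_identity=0.95):
    """Group sequence IDs by transitive closure over pairwise hits.

    hits is an iterable of (id_a, id_b, identity) tuples; the identity
    cutoff is a hypothetical parameter, not a TGICL default.
    """
    parent = {s: s for s in seq_ids}

    def find(x):
        # Follow parent links to the cluster representative,
        # shortening the path as we go (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, identity in hits:
        if identity >= min_identity:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a  # merge the two clusters

    clusters = defaultdict(list)
    for s in seq_ids:
        clusters[find(s)].append(s)
    # Sort for a deterministic result.
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical input: three hits, one of them below the cutoff.
example_ids = ["est1", "est2", "est3", "mrna1", "est4"]
example_hits = [
    ("est1", "est2", 0.98),
    ("est2", "mrna1", 0.97),
    ("est3", "est4", 0.80),  # below threshold, so no merge
]
clusters = cluster_by_similarity(example_ids, example_hits)
print(clusters)
```

In this sketch est1, est2 and mrna1 end up in one cluster because est2 links the other two, while est3 and est4 remain singletons. Each resulting cluster would then be handed to an assembler independently, which is what makes the per-cluster stage easy to distribute across CPUs.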
Subject(s)
Database Management Systems , Databases, Nucleic Acid , Expressed Sequence Tags , Gene Expression Profiling/methods , Information Storage and Retrieval/methods , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Cluster Analysis , Gene Expression Regulation/genetics , Sequence Homology , Software
ABSTRACT
The mosquito-borne malaria parasite Plasmodium falciparum kills an estimated 0.7 to 2.7 million people every year, primarily children in sub-Saharan Africa. Without effective interventions, a variety of factors, including the spread of parasites resistant to antimalarial drugs and the increasing insecticide resistance of mosquitoes, may cause the number of malaria cases to double over the next two decades. To stimulate basic research and facilitate the development of new drugs and vaccines, the genome of Plasmodium falciparum clone 3D7 has been sequenced using a chromosome-by-chromosome shotgun strategy. We report here the nucleotide sequences of chromosomes 10, 11 and 14, and a re-analysis of the chromosome 2 sequence. These chromosomes represent about 35% of the 23-megabase P. falciparum genome.
Subject(s)
DNA, Protozoan , Plasmodium falciparum/genetics , Animals , Chromosomes , Genome, Protozoan , Proteome , Protozoan Proteins/genetics , Sequence Analysis, DNA
ABSTRACT
Comparative genomics promises to rapidly accelerate the identification and functional classification of biologically important human genes. We developed the TIGR Orthologous Gene Alignment (TOGA;