Search | VHL Regional Portal

ORFDB: an information resource linking scientific content to a high-quality Open Reading Frame (ORF) collection.

Liang, Feng; Matrubutham, Udayakumar; Parvizi, Babak; Yen, Jessica; Duan, Daniel; Mirchandani, Jyotika; Hashima, Sandra; Nguyen, Uyen; Ubil, Eric; Loewenheim, Jake; Yu, Xin; Sipes, Sara; Williams, Wendy; Wang, Ling; Bennett, Robert; Carrino, John.

Nucleic Acids Res ; 32(Database issue): D595-9, 2004 Jan 01.

Article in English | MEDLINE | ID: mdl-14681490

ABSTRACT

The ORFDB (http://orf.invitrogen.com/) represents an ongoing effort at Invitrogen Corporation to integrate relevant scientific data with an evolving collection of human and mouse Open Reading Frame (ORF) clones (Ultimate ORF Clones). The ORFDB serves as a central data warehouse enabling researchers to search the ORF collection through its web portal ORFBrowser, allowing researchers to find the Ultimate ORF clones by BLAST, keyword, GenBank accession, gene symbol, clone ID, UniGene ID, LocusLink ID or through functional relationships by browsing the collection via the Gene Ontology (GO) Browser. As of October 2003, the ORFDB contains 6200 human and 2870 mouse Ultimate ORF clones. All Ultimate ORF clones have been fully sequenced with high quality, and are matched to public reference protein sequences. In addition, the cloned ORFs have been extensively annotated across six categories: Gene, ORF, Clone Format, Protein, SNP and Genomic links, with the information assembled in a format termed the ORFCard. The ORFCard represents an information repository that documents the sequence quality, alignment with respect to public protein sequences, and the latest publicly available information associated with each human and mouse gene represented in the collection.

Subject(s)

Databases, Genetic , Open Reading Frames/genetics , Animals , Cloning, Molecular , Computational Biology , DNA, Complementary/genetics , Gene Library , Genomics , Humans , Information Storage and Retrieval , Internet , Mice , Polymorphism, Single Nucleotide/genetics , Proteins/genetics , Proteomics , Quality Control , Software , User-Computer Interface

Integrating computationally assembled mouse transcript sequences with the Mouse Genome Informatics (MGI) database.

Zhu, Yunxia; King, Benjamin L; Parvizi, Babak; Brunk, Brian P; Stoeckert, Christian J; Quackenbush, John; Richardson, Joel; Bult, Carol J.

Genome Biol ; 4(2): R16, 2003.

Article in English | MEDLINE | ID: mdl-12620126

ABSTRACT

Databases of experimentally generated and computationally derived transcript sequences are valuable resources for genome analysis and annotation. The utility of such databases is enhanced when the sequences they contain are integrated with such biological information as genomic location, gene function, gene expression and phenotypic variation. We present the analysis and results of a semi-automated process of connecting transcript assemblies with highly curated biological information for mouse genes that is available through the Mouse Genome Informatics (MGI) database.

Subject(s)

Databases, Nucleic Acid , Genome , Mice/genetics , Transcription, Genetic , Animals , Computational Biology/methods

TIGR Gene Indices clustering tools (TGICL): a software system for fast clustering of large EST datasets.

Pertea, Geo; Huang, Xiaoqiu; Liang, Feng; Antonescu, Valentin; Sultana, Razvan; Karamycheva, Svetlana; Lee, Yuandan; White, Joseph; Cheung, Foo; Parvizi, Babak; Tsai, Jennifer; Quackenbush, John.

Bioinformatics ; 19(5): 651-2, 2003 Mar 22.

Article in English | MEDLINE | ID: mdl-12651724

ABSTRACT

TGICL is a pipeline for analysis of large Expressed Sequence Tags (EST) and mRNA databases in which the sequences are first clustered based on pairwise sequence similarity, and then assembled by individual clusters (optionally with quality values) to produce longer, more complete consensus sequences. The system can run on multi-CPU architectures including SMP and PVM.

Subject(s)

Database Management Systems , Databases, Nucleic Acid , Expressed Sequence Tags , Gene Expression Profiling/methods , Information Storage and Retrieval/methods , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Cluster Analysis , Gene Expression Regulation/genetics , Sequence Homology , Software

Sequence of Plasmodium falciparum chromosomes 2, 10, 11 and 14.

Gardner, Malcolm J; Shallom, Shamira J; Carlton, Jane M; Salzberg, Steven L; Nene, Vishvanath; Shoaibi, Azadeh; Ciecko, Anne; Lynn, Jeffery; Rizzo, Michael; Weaver, Bruce; Jarrahi, Behnam; Brenner, Michael; Parvizi, Babak; Tallon, Luke; Moazzez, Azita; Granger, David; Fujii, Claire; Hansen, Cheryl; Pederson, James; Feldblyum, Tamara; Peterson, Jeremy; Suh, Bernard; Angiuoli, Sam; Pertea, Mihaela; Allen, Jonathan; Selengut, Jeremy; White, Owen; Cummings, Leda M; Smith, Hamilton O; Adams, Mark D; Venter, J Craig; Carucci, Daniel J; Hoffman, Stephen L; Fraser, Claire M.

Nature ; 419(6906): 531-4, 2002 Oct 03.

Article in English | MEDLINE | ID: mdl-12368868

ABSTRACT

The mosquito-borne malaria parasite Plasmodium falciparum kills an estimated 0.7-2.7 million people every year, primarily children in sub-Saharan Africa. Without effective interventions, a variety of factors, including the spread of parasites resistant to antimalarial drugs and the increasing insecticide resistance of mosquitoes, may cause the number of malaria cases to double over the next two decades. To stimulate basic research and facilitate the development of new drugs and vaccines, the genome of Plasmodium falciparum clone 3D7 has been sequenced using a chromosome-by-chromosome shotgun strategy. We report here the nucleotide sequences of chromosomes 10, 11 and 14, and a re-analysis of the chromosome 2 sequence. These chromosomes represent about 35% of the 23-megabase P. falciparum genome.

Subject(s)

DNA, Protozoan , Plasmodium falciparum/genetics , Animals , Chromosomes , Genome, Protozoan , Proteome , Protozoan Proteins/genetics , Sequence Analysis, DNA

Cross-referencing eukaryotic genomes: TIGR Orthologous Gene Alignments (TOGA).

Lee, Yuandan; Sultana, Razvan; Pertea, Geo; Cho, Jennifer; Karamycheva, Svetlana; Tsai, Jennifer; Parvizi, Babak; Cheung, Foo; Antonescu, Valentin; White, Joseph; Holt, Ingeborg; Liang, Feng; Quackenbush, John.

Genome Res ; 12(3): 493-502, 2002 Mar.

Article in English | MEDLINE | ID: mdl-11875039

ABSTRACT

Comparative genomics promises to rapidly accelerate the identification and functional classification of biologically important human genes. We developed the TIGR Orthologous Gene Alignment (TOGA) database to provide a cross-reference between fully and partially sequenced eukaryotic transcribed sequences. Starting with the assembled expressed sequence tag (EST) and gene sequences that comprise the 28 TIGR Gene Indices, we used high-stringency pair-wise sequence searches and a reflexive, transitive closure process to associate sequence-specific best hits, generating 32,652 tentative ortholog groups (TOGs). This has allowed us to identify putative orthologs and paralogs for known genes, as well as those that exist only as uncharacterized ESTs and to provide links to additional information including genome sequence and mapping data. TOGA provides an important new resource for the analysis of gene function in eukaryotes. In addition, an analysis of the most widely represented sequences can begin to provide insight into eukaryotic biological processes.
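The "reflexive, transitive closure" step in the abstract can be illustrated with a minimal Python sketch. This is not the TOGA implementation: it assumes that "reflexive" means reciprocal best hits between two species and that "transitive closure" means merging linked sequences into connected groups. The function names, the `best_hit` mapping and the sequence identifiers are all hypothetical.

```python
from collections import defaultdict


def reciprocal_best_hits(best_hit, species_of):
    """Return unordered pairs (a, b) where each is the other's best hit.

    best_hit maps (query_id, target_species) -> best-hit sequence id.
    species_of maps sequence id -> species name.
    Both structures are assumptions made for this illustration.
    """
    pairs = set()
    for (query, _target_species), hit in best_hit.items():
        # Keep the pair only if the hit points back at the query.
        if best_hit.get((hit, species_of[query])) == query:
            pairs.add(tuple(sorted((query, hit))))
    return pairs


def transitive_groups(pairs):
    """Merge linked pairs into groups using a union-find structure."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(set)
    for member in list(parent):
        groups[find(member)].add(member)
    return sorted(sorted(group) for group in groups.values())


# Toy data: h1/m1 and m1/r1 are reciprocal; h2 -> m2 is one-directional.
species_of = {"h1": "human", "h2": "human", "m1": "mouse",
              "m2": "mouse", "r1": "rat"}
best_hit = {
    ("h1", "mouse"): "m1", ("m1", "human"): "h1",
    ("m1", "rat"): "r1", ("r1", "mouse"): "m1",
    ("h2", "mouse"): "m2", ("m2", "human"): "h1",
}
pairs = reciprocal_best_hits(best_hit, species_of)
groups = transitive_groups(pairs)
print(groups)
```

In this toy input, h1, m1 and r1 end up in one group because m1 links the human and rat sequences, while h2 and m2 are left out because their best-hit relationship is not reciprocal.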

Subject(s)

Eukaryotic Cells , Genes/genetics , Sequence Alignment/methods , Algorithms , Animals , Cattle , Computational Biology/methods , Consensus Sequence/genetics , Databases, Genetic , Eukaryotic Cells/chemistry , Eukaryotic Cells/metabolism , Genome, Human , Humans , Mice , Phylogeny , Rats , Sequence Homology, Nucleic Acid
