Search | VHL Regional Portal

1.

CDD: specific functional annotation with the Conserved Domain Database.

Marchler-Bauer, Aron; Anderson, John B; Chitsaz, Farideh; Derbyshire, Myra K; DeWeese-Scott, Carol; Fong, Jessica H; Geer, Lewis Y; Geer, Renata C; Gonzales, Noreen R; Gwadz, Marc; He, Siqian; Hurwitz, David I; Jackson, John D; Ke, Zhaoxi; Lanczycki, Christopher J; Liebert, Cynthia A; Liu, Chunlei; Lu, Fu; Lu, Shennan; Marchler, Gabriele H; Mullokandov, Mikhail; Song, James S; Tasneem, Asba; Thanki, Narmada; Yamashita, Roxanne A; Zhang, Dachuan; Zhang, Naigong; Bryant, Stephen H.

Nucleic Acids Res; 37(Database issue): D205-10, 2009 Jan.

Article in English | MEDLINE | ID: mdl-18984618

ABSTRACT

NCBI's Conserved Domain Database (CDD) is a collection of multiple sequence alignments and derived database search models, which represent protein domains conserved in molecular evolution. The collection can be accessed at http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml, and is also part of NCBI's Entrez query and retrieval system, cross-linked to numerous other resources. CDD provides annotation of domain footprints and conserved functional sites on protein sequences. Precalculated domain annotation can be retrieved for protein sequences tracked in NCBI's Entrez system, and CDD's collection of models can be queried with novel protein sequences via the CD-Search service at http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi. Starting with the latest version of CDD, v2.14, information from redundant and homologous domain models is summarized at a superfamily level, and domain annotation on proteins is flagged as either 'specific' (identifying molecular function with high confidence) or as 'non-specific' (identifying superfamily membership only).

Subject(s)

Databases, Protein; Protein Structure, Tertiary; Amino Acid Sequence; Conserved Sequence; Proteins/classification; Sequence Alignment; Sequence Analysis, Protein

2.

CDD: a conserved domain database for interactive domain family analysis.

Marchler-Bauer, Aron; Anderson, John B; Derbyshire, Myra K; DeWeese-Scott, Carol; Gonzales, Noreen R; Gwadz, Marc; Hao, Luning; He, Siqian; Hurwitz, David I; Jackson, John D; Ke, Zhaoxi; Krylov, Dmitri; Lanczycki, Christopher J; Liebert, Cynthia A; Liu, Chunlei; Lu, Fu; Lu, Shennan; Marchler, Gabriele H; Mullokandov, Mikhail; Song, James S; Thanki, Narmada; Yamashita, Roxanne A; Yin, Jodie J; Zhang, Dachuan; Bryant, Stephen H.

Nucleic Acids Res; 35(Database issue): D237-40, 2007 Jan.

Article in English | MEDLINE | ID: mdl-17135202

ABSTRACT

The Conserved Domain Database (CDD) is part of NCBI's Entrez database system and serves as a primary resource for the annotation of conserved domain footprints on protein sequences in Entrez. Entrez's global query interface can be accessed at http://www.ncbi.nlm.nih.gov/Entrez and will search CDD and many other databases. Domain annotation for proteins in Entrez has been pre-computed and is readily available in the form of 'Conserved Domain' links. Novel protein sequences can be scanned against CDD using the CD-Search service; this service searches databases of CDD-derived profile models with protein sequence queries using BLAST heuristics, at http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi. Protein query sequences submitted to NCBI's protein BLAST search service are scanned for conserved domain signatures by default. The CDD collection contains models imported from Pfam, SMART and COG, as well as domain models curated at NCBI. NCBI curated models are organized into hierarchies of domains related by common descent. Here we report on the status of the curation effort and present a novel helper application, CDTree, which enables users of the CDD resource to examine curated hierarchies. More importantly, CDD and CDTree, used in concert, serve as a powerful tool in protein classification, as they allow users to analyze protein sequences in the context of domain family hierarchies.

Subject(s)

Databases, Protein; Protein Structure, Tertiary; Amino Acid Sequence; Animals; Conserved Sequence; Internet; Phylogeny; Protein Structure, Tertiary/genetics; Proteins/classification; Sequence Analysis, Protein; User-Computer Interface

3.

CDD: a Conserved Domain Database for protein classification.

Marchler-Bauer, Aron; Anderson, John B; Cherukuri, Praveen F; DeWeese-Scott, Carol; Geer, Lewis Y; Gwadz, Marc; He, Siqian; Hurwitz, David I; Jackson, John D; Ke, Zhaoxi; Lanczycki, Christopher J; Liebert, Cynthia A; Liu, Chunlei; Lu, Fu; Marchler, Gabriele H; Mullokandov, Mikhail; Shoemaker, Benjamin A; Simonyan, Vahan; Song, James S; Thiessen, Paul A; Yamashita, Roxanne A; Yin, Jodie J; Zhang, Dachuan; Bryant, Stephen H.

Nucleic Acids Res; 33(Database issue): D192-6, 2005 Jan 01.

Article in English | MEDLINE | ID: mdl-15608175

ABSTRACT

The Conserved Domain Database (CDD) is the protein classification component of NCBI's Entrez query and retrieval system. CDD is linked to other Entrez databases such as Proteins, Taxonomy and PubMed, and can be accessed at http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=cdd. CD-Search, which is available at http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi, is a fast, interactive tool to identify conserved domains in new protein sequences. CD-Search results for protein sequences in Entrez are pre-computed to provide links between proteins and domain models, and computational annotation visible upon request. Protein-protein queries submitted to NCBI's BLAST search service at http://www.ncbi.nlm.nih.gov/BLAST are scanned for the presence of conserved domains by default. While CDD started out as essentially a mirror of publicly available domain alignment collections, such as SMART, Pfam and COG, we have continued an effort to update, and in some cases replace, these models with domain hierarchies curated at the NCBI. Here, we report on the progress of the curation effort and associated improvements in the functionality of the CDD information retrieval system.

Subject(s)

Databases, Protein; Protein Structure, Tertiary; Proteins/classification; Amino Acid Sequence; Conserved Sequence; Phylogeny; Sequence Alignment; Sequence Analysis, Protein; User-Computer Interface

4.

CDD: a curated Entrez database of conserved domain alignments.

Marchler-Bauer, Aron; Anderson, John B; DeWeese-Scott, Carol; Fedorova, Natalie D; Geer, Lewis Y; He, Siqian; Hurwitz, David I; Jackson, John D; Jacobs, Aviva R; Lanczycki, Christopher J; Liebert, Cynthia A; Liu, Chunlei; Madej, Thomas; Marchler, Gabriele H; Mazumder, Raja; Nikolskaya, Anastasia N; Panchenko, Anna R; Rao, Bachoti S; Shoemaker, Benjamin A; Simonyan, Vahan; Song, James S; Thiessen, Paul A; Vasudevan, Sona; Wang, Yanli; Yamashita, Roxanne A; Yin, Jodie J; Bryant, Stephen H.

Nucleic Acids Res; 31(1): 383-7, 2003 Jan 01.

Article in English | MEDLINE | ID: mdl-12520028

ABSTRACT

The Conserved Domain Database (CDD) is now indexed as a separate database within the Entrez system and linked to other Entrez databases such as MEDLINE(R). This allows users to search for domain types by name, for example, or to view the domain architecture of any protein in Entrez's sequence database. CDD can be accessed on the World Wide Web at http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=cdd. Users may also employ the CD-Search service to identify conserved domains in new sequences, at http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi. CD-Search results, and pre-computed links from Entrez's protein database, are calculated using the RPS-BLAST algorithm and Position Specific Score Matrices (PSSMs) derived from CDD alignments. CD-Searches are also run by default for protein-protein queries submitted to BLAST(R) at http://www.ncbi.nlm.nih.gov/BLAST. CDD mirrors the publicly available domain alignment collections SMART and PFAM, and now also contains alignment models curated at NCBI. Structure information is used to identify the core substructure likely to be present in all family members, and to produce sequence alignments consistent with structure conservation. This alignment model allows NCBI curators to annotate 'columns' corresponding to functional sites conserved among family members.

Subject(s)

Databases, Protein; Protein Structure, Tertiary; Amino Acid Sequence; Animals; Conserved Sequence; Information Storage and Retrieval; Models, Molecular; Sequence Alignment

5.

MMDB: Entrez's 3D-structure database.

Chen, Jie; Anderson, John B; DeWeese-Scott, Carol; Fedorova, Natalie D; Geer, Lewis Y; He, Siqian; Hurwitz, David I; Jackson, John D; Jacobs, Aviva R; Lanczycki, Christopher J; Liebert, Cynthia A; Liu, Chunlei; Madej, Thomas; Marchler-Bauer, Aron; Marchler, Gabriele H; Mazumder, Raja; Nikolskaya, Anastasia N; Rao, Bachoti S; Panchenko, Anna R; Shoemaker, Benjamin A; Simonyan, Vahan; Song, James S; Thiessen, Paul A; Vasudevan, Sona; Wang, Yanli; Yamashita, Roxanne A; Yin, Jodie J; Bryant, Stephen H.

Nucleic Acids Res; 31(1): 474-7, 2003 Jan 01.

Article in English | MEDLINE | ID: mdl-12520055

ABSTRACT

Three-dimensional structures are now known within most protein families and it is likely, when searching a sequence database, that one will identify a homolog of known structure. The goal of Entrez's 3D-structure database is to make structure information and the functional annotation it can provide easily accessible to molecular biologists. To this end, Entrez's search engine provides several powerful features: (i) links between databases, for example between a protein's sequence and structure; (ii) pre-computed sequence and structure neighbors; and (iii) structure and sequence/structure alignment visualization. Here, we focus on a new feature of Entrez's Molecular Modeling Database (MMDB): Graphical summaries of the biological annotation available for each 3D structure, based on the results of automated comparative analysis. MMDB is available at: http://www.ncbi.nlm.nih.gov/Entrez/structure.html.

Subject(s)

Databases, Protein; Models, Molecular; Structural Homology, Protein; Animals; Computer Graphics; Imaging, Three-Dimensional; Protein Structure, Tertiary; Proteins/chemistry

6.

MMDB: Entrez's 3D-structure database.

Wang, Yanli; Anderson, John B; Chen, Jie; Geer, Lewis Y; He, Siqian; Hurwitz, David I; Liebert, Cynthia A; Madej, Thomas; Marchler, Gabriele H; Marchler-Bauer, Aron; Panchenko, Anna R; Shoemaker, Benjamin A; Song, James S; Thiessen, Paul A; Yamashita, Roxanne A; Bryant, Stephen H.

Nucleic Acids Res; 30(1): 249-52, 2002 Jan 01.

Article in English | MEDLINE | ID: mdl-11752307

ABSTRACT

Three-dimensional structures are now known within many protein families and it is quite likely, in searching a sequence database, that one will encounter a homolog with known structure. The goal of Entrez's 3D-structure database is to make this information, and the functional annotation it can provide, easily accessible to molecular biologists. To this end Entrez's search engine provides three powerful features. (i) Sequence and structure neighbors; one may select all sequences similar to one of interest, for example, and link to any known 3D structures. (ii) Links between databases; one may search by term matching in MEDLINE, for example, and link to 3D structures reported in these articles. (iii) Sequence and structure visualization; identifying a homolog with known structure, one may view molecular-graphic and alignment displays, to infer approximate 3D structure. In this article we focus on two features of Entrez's Molecular Modeling Database (MMDB) not described previously: links from individual biopolymer chains within 3D structures to a systematic taxonomy of organisms represented in molecular databases, and links from individual chains (and compact 3D domains within them) to structure neighbors, other chains (and 3D domains) with similar 3D structure. MMDB may be accessed at http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Structure.

Subject(s)

Databases, Protein; Proteins/chemistry; Animals; Computer Graphics; Humans; Imaging, Three-Dimensional; Information Storage and Retrieval; Internet; National Library of Medicine (U.S.); Phylogeny; Protein Structure, Tertiary; Proteins/genetics; Sequence Alignment; Sequence Homology, Amino Acid; United States
