1.
Nucleic Acids Res ; 47(D1): D280-D284, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30398663

ABSTRACT

This article provides an update of the latest data and developments within the CATH protein structure classification database (http://www.cathdb.info). The resource provides two levels of release: CATH-B, a daily snapshot of the latest structural domain boundaries and superfamily assignments, and CATH+, which adds layers of derived data, such as predicted sequence domains, functional annotations and functional clustering (known as Functional Families or FunFams). The most recent CATH+ release (version 4.2) provides a huge update in the coverage of structural data. This release increases the number of fully classified domains by over 40% (from 308 999 to 434 857 structural domains), corresponding to an almost two-fold increase in sequence data (from 53 million to over 95 million predicted domains) organised into 6119 superfamilies. The coverage of high-resolution, protein PDB chains that contain at least one assigned CATH domain is now 90.2% (increased from 82.3% in the previous release). A number of highly requested features have also been implemented in our web pages: allowing the user to view an alignment between their query sequence and a representative FunFam structure and providing tools that make it easier to view the full structural context (multi-domain architecture) of domains and chains.


Subject(s)
Databases, Protein , Genome , Amino Acid Sequence , Animals , Conserved Sequence , Gene Ontology , Humans , Models, Molecular , Molecular Sequence Annotation , Multigene Family/genetics , Protein Conformation , Protein Domains/genetics , Sequence Alignment , Sequence Homology, Amino Acid , Structure-Activity Relationship
2.
Nucleic Acids Res ; 46(D1): D435-D439, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29112716

ABSTRACT

Gene3D (http://gene3d.biochem.ucl.ac.uk) is a database of globular domain annotations for millions of available protein sequences. Gene3D has previously featured in the Database issue of NAR and here we report a significant update to the Gene3D database. The current release, Gene3D v16, has significantly expanded its domain coverage over the previous version and now contains over 95 million domain assignments. We also report a new method for dealing with complex domain architectures that exist in Gene3D, arising from discontinuous domains. Amongst other updates, we have added visualization tools for exploring domain annotations in the context of other sequence features and in gene families. We also provide web-pages to visualize other domain families that co-occur with a given query domain family.


Subject(s)
Databases, Protein , Genome , Protein Domains , Proteins/chemistry , Software , Amino Acid Sequence , Animals , Computer Graphics , Humans , Internet , Molecular Sequence Annotation , Proteins/genetics , Proteins/metabolism , Sequence Analysis, Protein
4.
Nucleic Acids Res ; 45(D1): D289-D295, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899584

ABSTRACT

The latest version of the CATH-Gene3D protein structure classification database has recently been released (version 4.1, http://www.cathdb.info). The resource comprises over 300 000 domain structures and over 53 million protein domains classified into 2737 homologous superfamilies, doubling the number of predicted protein domains in the previous version. The daily-updated CATH-B, which contains our very latest domain assignment data, provides putative classifications for over 100 000 additional protein domains. This article describes developments to the CATH-Gene3D resource over the last two years since the publication in 2015, including: significant increases to our structural and sequence coverage; expansion of the functional families in CATH; building a support vector machine (SVM) to automatically assign domains to superfamilies; improved search facilities to return alignments of query sequences against multiple sequence alignments; the redesign of the web pages and download site.


Subject(s)
Computational Biology/methods , Databases, Protein , Models, Molecular , Proteins/chemistry , Proteins/metabolism , Software , Structure-Activity Relationship , Web Browser
5.
Nucleic Acids Res ; 43(Database issue): D382-6, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25348407

ABSTRACT

Genome3D (http://www.genome3d.eu) is a collaborative resource that provides predicted domain annotations and structural models for key sequences. Since introducing Genome3D in a previous NAR paper, we have substantially extended and improved the resource. We have annotated representatives from Pfam families to improve coverage of diverse sequences and added a fast sequence search to the website to allow users to find Genome3D-annotated sequences similar to their own. We have improved and extended the Genome3D data, enlarging the source data set from three model organisms to 10, and adding VIVACE, a resource new to Genome3D. We have analysed and updated Genome3D's SCOP/CATH mapping. Finally, we have improved the superposition tools, which now give users a more powerful interface for investigating similarities and differences between structural models.


Subject(s)
Databases, Protein , Molecular Sequence Annotation , Protein Structure, Tertiary , Algorithms , Genomics , Internet , Models, Molecular , Protein Structure, Tertiary/genetics , Sequence Analysis, Protein
6.
Nucleic Acids Res ; 43(Database issue): D376-81, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25348408

ABSTRACT

The latest version of the CATH-Gene3D protein structure classification database (4.0, http://www.cathdb.info) provides annotations for over 235,000 protein domain structures and includes 25 million domain predictions. This article provides an update on the major developments in the 2 years since the last publication in this journal including: significant improvements to the predictive power of our functional families (FunFams); the release of our 'current' putative domain assignments (CATH-B); a new, strictly non-redundant data set of CATH domains suitable for homology benchmarking experiments (CATH-40) and a number of improvements to the web pages.


Subject(s)
Databases, Protein , Molecular Sequence Annotation , Protein Structure, Tertiary , Genomics , Internet , Protein Structure, Tertiary/genetics , Proteins/classification
7.
Nucleic Acids Res ; 41(Database issue): D490-8, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23203873

ABSTRACT

CATH version 3.5 (Class, Architecture, Topology, Homology, available at http://www.cathdb.info/) contains 173 536 domains, 2626 homologous superfamilies and 1313 fold groups. When focusing on structural genomics (SG) structures, we observe that the number of new folds for CATH v3.5 is slightly less than for previous releases, and this observation suggests that we may now know the majority of folds that are easily accessible to structure determination. We have improved the accuracy of our functional family (FunFams) sub-classification method and the CATH sequence domain search facility has been extended to provide FunFam annotations for each domain. The CATH website has been redesigned. We have improved the display of functional data and of conserved sequence features associated with FunFams within each CATH superfamily.


Subject(s)
Databases, Protein , Protein Structure, Tertiary , Genomics , Internet , Molecular Sequence Annotation , Protein Folding , Proteins/chemistry , Proteins/classification , Proteins/genetics , Sequence Alignment , Sequence Analysis, Protein , Structural Homology, Protein
8.
Nucleic Acids Res ; 41(Database issue): D499-507, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23203986

ABSTRACT

Genome3D, available at http://www.genome3d.eu, is a new collaborative project that integrates UK-based structural resources to provide a unique perspective on sequence-structure-function relationships. Leading structure prediction resources (DomSerf, FUGUE, Gene3D, pDomTHREADER, Phyre and SUPERFAMILY) provide annotations for UniProt sequences to indicate the locations of structural domains (structural annotations) and their 3D structures (structural models). Structural annotations and 3D model predictions are currently available for three model genomes (Homo sapiens, E. coli and baker's yeast), and the project will extend to other genomes in the near future. As these resources exploit different strategies for predicting structures, the main aim of Genome3D is to enable comparisons between all the resources so that biologists can see where predictions agree and are therefore more trusted. Furthermore, as these methods differ in whether they build their predictions using CATH or SCOP, Genome3D also contains the first official mapping between these two databases. This has identified pairs of similar superfamilies from the two resources at various degrees of consensus (532 bronze pairs, 527 silver pairs and 370 gold pairs).


Subject(s)
Databases, Protein , Protein Structure, Tertiary , Genomics , Humans , Internet , Molecular Sequence Annotation , Proteins/chemistry , Proteins/classification , Proteins/genetics , Software
9.
Nucleic Acids Res ; 35(Database issue): D291-7, 2007 Jan.
Article in English | MEDLINE | ID: mdl-17135200

ABSTRACT

We report the latest release (version 3.0) of the CATH protein domain database (http://www.cathdb.info). There has been a 20% increase in the number of structural domains classified in CATH, up to 86 151 domains. Release 3.0 comprises 1110 fold groups and 2147 homologous superfamilies. To cope with the increases in diverse structural homologues being determined by the structural genomics initiatives, more sensitive methods have been developed for identifying boundaries in multi-domain proteins and for recognising homologues. The CATH classification update is now being driven by an integrated pipeline that links these automated procedures with validation steps that have been made easier by the provision of information-rich web pages summarising comparison scores and relevant links to external sites for each domain being classified. An analysis of the population of domains in the CATH hierarchy and several domain characteristics are presented for version 3.0. We also report an update of the CATH Dictionary of homologous structures (CATH-DHS) which now contains multiple structural alignments, consensus information and functional annotations for 1459 well-populated superfamilies in CATH. CATH is directly linked to the Gene3D database which is a projection of CATH structural data onto approximately 2 million sequences in completed genomes and UniProt.


Subject(s)
Databases, Protein , Protein Structure, Tertiary , Classification/methods , Evolution, Molecular , Internet , Protein Folding , Protein Structure, Tertiary/genetics , Proteins/classification , Sequence Homology, Amino Acid , Structural Homology, Protein , User-Computer Interface