Results 1 - 14 of 14
1.
Nucleic Acids Res ; 50(D1): D648-D653, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34761267

ABSTRACT

The IntAct molecular interaction database (https://www.ebi.ac.uk/intact) is a curated resource of molecular interactions, derived from the scientific literature and from direct data depositions. As of August 2021, IntAct provides more than one million binary interactions, curated by twelve global partners of the International Molecular Exchange consortium, for which the IntAct database provides a shared curation and dissemination platform. The IMEx curation policy has always emphasised a fine-grained data and curation model, aiming to capture the relevant experimental detail essential for the interpretation of the provided molecular interaction data. Here, we present recent curation focus and progress, as well as a completely redeveloped website which presents IntAct data in a much more user-friendly and detailed way.


Subject(s)
Databases, Protein , Protein Interaction Maps/genetics , Software , Humans , Protein Interaction Mapping/methods
2.
Nucleic Acids Res ; 50(D1): D578-D586, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34718729

ABSTRACT

The Complex Portal (www.ebi.ac.uk/complexportal) is a manually curated, encyclopaedic database of macromolecular complexes with known function from a range of model organisms. It summarizes complex composition, topology and function along with links to a large range of domain-specific resources (e.g. wwPDB, EMDB and Reactome). Since the last update in 2019, we have produced a first draft complexome for Escherichia coli, maintained and updated that of Saccharomyces cerevisiae, added over 40 coronavirus complexes and increased the human complexome to over 1100 complexes that include approximately 200 complexes that act as targets for viral proteins or are part of the immune system. The display of protein features in ComplexViewer has been improved and the participant table is now colour-coordinated with the nodes in ComplexViewer. Community collaboration has expanded, for example by contributing to an analysis of putative transcription cofactors and providing data accessible to semantic web tools through Wikidata which is now populated with manually curated Complex Portal content through a new bot. Our data license is now CC0 to encourage data reuse. Users are encouraged to get in touch, provide us with feedback and send curation requests through the 'Support' link.


Subject(s)
Data Curation/methods , Databases, Protein , Multiprotein Complexes/chemistry , Coronavirus/chemistry , Data Visualization , Databases, Chemical , Enzymes/chemistry , Enzymes/metabolism , Escherichia coli/chemistry , Humans , International Cooperation , Molecular Sequence Annotation , Multiprotein Complexes/metabolism , User-Computer Interface
3.
Biochim Biophys Acta Gene Regul Mech ; 1864(10): 194749, 2021 10.
Article in English | MEDLINE | ID: mdl-34425241

ABSTRACT

The domain of transcription regulation has been notoriously difficult to annotate in the Gene Ontology, partly because of the intricacies of gene regulation which involve molecular interactions with DNA as well as amongst protein complexes. The molecular function 'transcription coregulator activity' is a part of the biological process 'regulation of transcription, DNA-templated' that occurs in the cellular component 'chromatin'. It can mechanistically link sequence-specific DNA-binding transcription factor (dbTF) regulatory DNA target sites to coactivator and corepressor target sites through the molecular function 'cis-regulatory region sequence-specific DNA binding'. Many questions arise about transcription coregulators (coTF). Here, we asked how many unannotated, putative coregulators can be identified in protein complexes. Therefore, we mined the CORUM and hu.MAP protein complex databases with known and strongly presumed human transcription coregulators. In addition, we trawled the BioGRID and IntAct molecular interaction databases for interactors of the known 1457 human dbTFs annotated by the GREEKC and GO consortia. This yielded 1093 putative transcription factor coregulator complex subunits, of which 954 interact directly with a dbTF. This substantially expands the set of coTFs that could be annotated to 'transcription coregulator activity' and sets the stage for renewed annotation and wet-lab research efforts. To this end, we devised a prioritisation score based on existing GO annotations of already curated transcription coregulators as well as interactome representation. Since all the proteins that we mined are parts of protein complexes, we propose to concomitantly engage in annotation of the putative transcription coregulator-containing complexes in the Complex Portal database.


Subject(s)
DNA-Binding Proteins/metabolism , Transcription Factors/metabolism , Base Sequence , DNA/chemistry , Data Mining , Databases, Genetic , Gene Expression Regulation , Humans , Protein Interaction Mapping , Protein Subunits/metabolism , Transcription, Genetic
4.
Nucleic Acids Res ; 49(6): 3156-3167, 2021 04 06.
Article in English | MEDLINE | ID: mdl-33677561

ABSTRACT

The EMBL-EBI Complex Portal is a knowledgebase of macromolecular complexes providing persistent stable identifiers. Entries are linked to literature evidence and provide details of complex membership, function, structure and complex-specific Gene Ontology annotations. Data are freely available and downloadable in HUPO-PSI community standards and missing entries can be requested for curation. In collaboration with Saccharomyces Genome Database and UniProt, the yeast complexome, a compendium of all known heteromeric assemblies from the model organism Saccharomyces cerevisiae, was curated. This expansion of knowledge and scope has led to a 50% increase in curated complexes compared to the previously published dataset, CYC2008. The yeast complexome is used as a reference resource for the analysis of complexes from large-scale experiments. Our analysis showed that genes coding for proteins in complexes tend to have more genetic interactions, are co-expressed with more genes, are more multifunctional, localize more often in the nucleus, and are more often involved in nucleic acid-related metabolic processes and processes where large machineries are the predominant functional drivers. A comparison to genetic interactions showed that about 40% of expanded co-complex pairs also have genetic interactions, suggesting strong functional links between complex members.


Subject(s)
Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae/metabolism , Datasets as Topic , Gene Ontology , Knowledge Bases , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae Proteins/genetics
5.
Nat Commun ; 11(1): 6144, 2020 12 01.
Article in English | MEDLINE | ID: mdl-33262342

ABSTRACT

The International Molecular Exchange (IMEx) Consortium provides scientists with a single body of experimentally verified protein interactions curated in rich contextual detail to an internationally agreed standard. In this update to the work of the IMEx Consortium, we discuss how this initiative has been working in practice, how it has ensured database sustainability, and how it is meeting emerging annotation challenges through the introduction of new interactor types and data formats. Additionally, we provide examples of how IMEx data are being used by biomedical researchers and integrated in other bioinformatic tools and resources.


Subject(s)
Access to Information , Databases, Genetic , Humans , Information Dissemination , International Cooperation
6.
Database (Oxford) ; 2019, 2019 01 01.
Article in English | MEDLINE | ID: mdl-30715277

ABSTRACT

Proteins seldom function individually. Instead, they interact with other proteins or nucleic acids to form stable macromolecular complexes that play key roles in important cellular processes and pathways. One of the goals of Saccharomyces Genome Database (SGD; www.yeastgenome.org) is to provide a complete picture of budding yeast biological processes. To this end, we have collaborated with the Molecular Interactions team that provides the Complex Portal database at EMBL-EBI to manually curate the complete yeast complexome. These data, from a total of 589 complexes, were previously available only in SGD's YeastMine data warehouse (yeastmine.yeastgenome.org) and the Complex Portal (www.ebi.ac.uk/complexportal). We have now incorporated these macromolecular complex data into the SGD core database and designed complex-specific reports to make these data easily available to researchers. These web pages contain referenced summaries focused on the composition and function of individual complexes. In addition, detailed information about how subunits interact within the complex, their stoichiometry and the physical structure are displayed when such information is available. Finally, we generate network diagrams displaying subunits and Gene Ontology annotations that are shared between complexes. Information on macromolecular complexes will continue to be updated in collaboration with the Complex Portal team and curated as more data become available.


Subject(s)
DNA, Fungal , Databases, Genetic , Fungal Proteins , Genome, Fungal/genetics , Saccharomyces/genetics , DNA, Fungal/chemistry , DNA, Fungal/genetics , DNA, Fungal/metabolism , Fungal Proteins/chemistry , Fungal Proteins/genetics , Fungal Proteins/metabolism , Genomics
7.
Nucleic Acids Res ; 47(D1): D550-D558, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30357405

ABSTRACT

The Complex Portal (www.ebi.ac.uk/complexportal) is a manually curated, encyclopaedic database that collates and summarizes information on stable, macromolecular complexes of known function. It captures complex composition, topology and function and links out to a large range of domain-specific resources that hold more detailed data, such as PDB or Reactome. We have made several significant improvements since our last update, including improving compliance with the FAIR data principles by providing complex-specific, stable identifiers that include versioning. Protein complexes are now available from 20 species for download in standards-compliant formats such as PSI-XML, MI-JSON and ComplexTAB or can be accessed via an improved REST API. A component-based JS front-end framework has been implemented to drive a new website and this has allowed the use of APIs from linked services to import and visualize information such as the 3D structure of protein complexes, their roles in reactions and pathways and the co-expression of complex components in the tissues of multi-cellular organisms. A first draft of the complete complexome of Saccharomyces cerevisiae is now available to browse and download.


Subject(s)
Databases, Protein , Multiprotein Complexes/chemistry , Animals , Computer Graphics , Humans , Macromolecular Substances/chemistry , Mice , Multiprotein Complexes/metabolism , Nucleic Acids/chemistry , Protein Conformation
8.
Genes (Basel) ; 9(12), 2018 Nov 29.
Article in English | MEDLINE | ID: mdl-30501127

ABSTRACT

The analysis and interpretation of high-throughput datasets rely on access to high-quality bioinformatics resources, as well as processing pipelines and analysis tools. Gene Ontology (GO, geneontology.org) is a major resource for gene enrichment analysis. The aim of this project, funded by the Alzheimer's Research United Kingdom (ARUK) foundation and led by the University College London (UCL) biocuration team, was to enhance the GO resource by developing new neurological GO terms, and use GO terms to annotate gene products associated with dementia. Specifically, proteins and protein complexes relevant to processes involving amyloid-beta and tau have been annotated and the resulting annotations are denoted in GO databases as 'ARUK-UCL'. Biological knowledge presented in the scientific literature was captured through the association of GO terms with dementia-relevant protein records; GO itself was revised, and new GO terms were added. This literature biocuration increased the number of Alzheimer's-relevant gene products that were being associated with neurological GO terms, such as 'amyloid-beta clearance' or 'learning or memory', as well as neuronal structures and their compartments. Of the total 2055 annotations that we contributed for the prioritised gene products, 526 have associated proteins and complexes with neurological GO terms. To ensure that these descriptive annotations could be provided for Alzheimer's-relevant gene products, over 70 new GO terms were created. Here, we describe how the improvements in ontology development and biocuration resulting from this initiative can benefit the scientific community and enhance the interpretation of dementia data.

9.
Methods Mol Biol ; 1764: 377-390, 2018.
Article in English | MEDLINE | ID: mdl-29605928

ABSTRACT

The Complex Portal (www.ebi.ac.uk/complexportal) is an encyclopedia of macromolecular complexes. Complexes are assigned unique, stable IDs, are species specific, and list all participating members with links to an appropriate reference database (UniProtKB, ChEBI, RNAcentral). Each complex is annotated extensively with its functions, properties, structure, stoichiometry, tissue expression profile, and subcellular location. Links to domain-specific databases allow the user to access additional information and enable data searching and filtering. Complexes can be saved and downloaded in PSI-MI XML, MI-JSON, and tab-delimited formats.


Subject(s)
Data Mining/methods , Databases, Protein , Macromolecular Substances/chemistry , Macromolecular Substances/metabolism , Proteins/chemistry , Search Engine , Humans , Proteins/metabolism
10.
Bioinformatics ; 33(22): 3673-3675, 2017 Nov 15.
Article in English | MEDLINE | ID: mdl-29036573

ABSTRACT

SUMMARY: Proteins frequently function as parts of complexes, assemblages of multiple proteins and other biomolecules, yet network visualizations usually only show proteins as parts of binary interactions. ComplexViewer visualizes interactions with more than two participants and thereby avoids the need to first expand these into multiple binary interactions. Furthermore, if binding regions between molecules are known then these can be displayed in the context of the larger complex. AVAILABILITY AND IMPLEMENTATION: freely available under Apache version 2 license; EMBL-EBI Complex Portal: http://www.ebi.ac.uk/complexportal; Source code: https://github.com/MICommunity/ComplexViewer; Package: https://www.npmjs.com/package/complexviewer; http://biojs.io/d/complexviewer. Language: JavaScript; Web technology: Scalable Vector Graphics; Libraries: D3.js. CONTACT: colin.combe@ed.ac.uk or juri.rappsilber@ed.ac.uk.


Subject(s)
Computational Biology/methods , Models, Biological , Protein Interaction Domains and Motifs , Protein Interaction Maps , Software , Macromolecular Substances/metabolism , Protein Binding
11.
Nucleic Acids Res ; 43(Database issue): D479-84, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25313161

ABSTRACT

The IntAct molecular interaction database has created a new, free, open-source, manually curated resource, the Complex Portal (www.ebi.ac.uk/intact/complex), through which protein complexes from major model organisms are being collated and made available for search, viewing and download. It has been built in close collaboration with other bioinformatics services and populated with data from ChEMBL, MatrixDB, PDBe, Reactome and UniProtKB. Each entry contains information about the participating molecules (including small molecules and nucleic acids), their stoichiometry, topology and structural assembly. Complexes are annotated with details about their function, properties and complex-specific Gene Ontology (GO) terms. Consistent nomenclature is used throughout the resource with systematic names, recommended names and a list of synonyms all provided. The use of the Evidence Code Ontology allows us to indicate for which entries direct experimental evidence is available or if the complex has been inferred based on homology or orthology. The data are searchable using standard identifiers, such as UniProt, ChEBI and GO IDs, protein, gene and complex names or synonyms. This reference resource will be maintained and grow to encompass an increasing number of organisms. Input from groups and individuals with specific areas of expertise is welcome.


Subject(s)
Databases, Protein , Proteins/chemistry , Animals , Binding Sites , Humans , Internet , Macromolecular Substances/chemistry , Mice , Protein Binding , Proteins/genetics , Proteins/metabolism
12.
Nucleic Acids Res ; 42(Database issue): D358-63, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24234451

ABSTRACT

IntAct (freely available at http://www.ebi.ac.uk/intact) is an open-source, open data molecular interaction database populated by data either curated from the literature or from direct data depositions. IntAct has developed a sophisticated web-based curation tool, capable of supporting both IMEx- and MIMIx-level curation. This tool is now utilized by multiple additional curation teams, all of whom annotate data directly into the IntAct database. Members of the IntAct team supply appropriate levels of training, perform quality control on entries and take responsibility for long-term data maintenance. Recently, the MINT and IntAct databases decided to merge their separate efforts to make optimal use of limited developer resources and maximize the curation output. All data manually curated by the MINT curators have been moved into the IntAct database at EMBL-EBI and are merged with the existing IntAct dataset. Both IntAct and MINT are active contributors to the IMEx consortium (http://www.imexconsortium.org).


Subject(s)
Databases, Protein , Protein Interaction Mapping , Internet , Software
13.
J Gen Virol ; 90(Pt 7): 1622-1628, 2009 Jul.
Article in English | MEDLINE | ID: mdl-19339480

ABSTRACT

Tunisia is a medium-level epidemic country for hepatitis B virus (HBV). This study characterizes, for the first time, full genome HBV strains from Tunisia. Viral load quantification and phylogenetic analyses of full genome or pre-S/S sequences were performed on 196 hepatitis B surface antigen (HBsAg)-positive plasma samples from Tunisian blood donors. The median viral load was 64.65 IU ml⁻¹ (range <5 to 7.7 × 10⁸ IU ml⁻¹) and 89% of samples had viral loads below 10,000 IU ml⁻¹. Fifty-nine strains formed a novel subgenotype D7, 41 strains clustered in subgenotype D1, seven strains in subgenotype A2 and one strain in genotype C. The novel subgenotype D7 was defined by maximum Bayesian posterior probability, a genetic divergence from other HBV/D subgenotypes by >4% and a stronger HBV/E signal in the X to core genes than subgenotype D1. In conclusion, HBV/D is dominant in asymptomatic Tunisian HBsAg carriers and a novel subgenotype, D7, was the most common subgenotype found in this population.


Subject(s)
DNA, Viral/genetics , Genome, Viral , Hepatitis B virus/classification , Hepatitis B virus/isolation & purification , Hepatitis B/virology , Sequence Analysis, DNA , Blood Donors , Cluster Analysis , DNA, Viral/chemistry , Genotype , Hepatitis B Surface Antigens/genetics , Hepatitis B virus/genetics , Humans , Molecular Sequence Data , Phylogeny , Sequence Homology , Tunisia , Viral Load
14.
Mol Phylogenet Evol ; 42(3): 622-36, 2007 Mar.
Article in English | MEDLINE | ID: mdl-17084644

ABSTRACT

Phylogenetic reconstructions of relations within the phylum Nematoda are inherently difficult but have been advanced with the introduction of large-scale molecular-based techniques. However, the most recent revisions were heavily biased towards terrestrial and parasitic species and greater representation of clades containing marine species (e.g. Araeolaimida, Chromadorida, Desmodorida, Desmoscolecida, Enoplida, and Monhysterida) is needed for accurate coverage of known taxonomic diversity. We now add small subunit ribosomal DNA (SSU rDNA) sequences for 100 previously unsequenced species of nematodes, including 46 marine taxa. SSU rDNA sequences for >200 taxa have been analysed based on Bayesian inference and LogDet-transformed distances. The resulting phylogenies provide support for (i) the re-classification of the Secernentea as the order Rhabditida that derived from a common ancestor of chromadorean orders Araeolaimida, Chromadorida, Desmodorida, Desmoscolecida, and Monhysterida and (ii) the position of Bunonema close to the Diplogasteroidea in the Rhabditina. Other, previously controversial relationships can now be resolved more clearly: (a) Alaimus, Campydora, and Trischistoma belong in the Enoplida, (b) Isolaimium is placed basally to a big clade containing the Axonolaimidae, Plectidae, and Rhabditida, (c) Xyzzors belongs in the Desmodoridae, (d) Comesomatidae and Cyartonema belong in the Monhysterida, (e) Globodera belongs in the Hoplolaimidae and (f) Paratylenchus dianeae belongs in the Criconematoidea. However, the SSU gene did not provide significant support for the class Chromadoria or clear evidence for the relationship between the three classes, Enoplia, Dorylaimia, and Chromadoria. Furthermore, across the whole phylum, the phylogenetically informative characters of the SSU gene are not informative in a parsimony analysis, highlighting the shortcomings of the parsimony method for large-scale phylogenetic modelling.


Subject(s)
Campanulaceae/genetics , Evolution, Molecular , Nematoda/genetics , Phylogeny , Animals , Bayes Theorem , Ecosystem , Models, Biological , Nematoda/classification