Your browser doesn't support javascript.
loading
Show: 20 | 50 | 100
Results 1 - 20 de 32
Filter
Add more filters










Publication year range
1.
bioRxiv ; 2023 Dec 20.
Article in English | MEDLINE | ID: mdl-38076838

ABSTRACT

The lack of interoperable data standards among reference genome data-sharing platforms inhibits cross-platform analysis while increasing the risk of data provenance loss. Here, we describe the FAIR-bioHeaders Reference genome (FHR), a metadata standard guided by the principles of Findability, Accessibility, Interoperability, and Reuse (FAIR) in addition to the principles of Transparency, Responsibility, User focus, Sustainability, and Technology (TRUST). The objective of FHR is to provide an extensive set of data serialisation methods and minimum data field requirements while still maintaining extensibility, flexibility, and expressivity in an increasingly decentralised genomic data ecosystem. The effort needed to implement FHR is low; FHR's design philosophy ensures easy implementation while retaining the benefits gained from recording both machine and human-readable provenance.

3.
mSystems ; 6(1)2021 02 23.
Article in English | MEDLINE | ID: mdl-33622857

ABSTRACT

Microbiome samples are inherently defined by the environment in which they are found. Therefore, data that provide context and enable interpretation of measurements produced from biological samples, often referred to as metadata, are critical. Important contributions have been made in the development of community-driven metadata standards; however, these standards have not been uniformly embraced by the microbiome research community. To understand how these standards are being adopted, or the barriers to adoption, across research domains, institutions, and funding agencies, the National Microbiome Data Collaborative (NMDC) hosted a workshop in October 2019. This report provides a summary of discussions that took place throughout the workshop, as well as outcomes of the working groups initiated at the workshop.

5.
Nucleic Acids Res ; 47(D1): D1186-D1194, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30407590

ABSTRACT

The Evidence and Conclusion Ontology (ECO) contains terms (classes) that describe types of evidence and assertion methods. ECO terms are used in the process of biocuration to capture the evidence that supports biological assertions (e.g. gene product X has function Y as supported by evidence Z). Capture of this information allows tracking of annotation provenance, establishment of quality control measures and query of evidence. ECO contains over 1500 terms and is in use by many leading biological resources including the Gene Ontology, UniProt and several model organism databases. ECO is continually being expanded and revised based on the needs of the biocuration community. The ontology is freely available for download from GitHub (https://github.com/evidenceontology/) or the project's website (http://evidenceontology.org/). Users can request new terms or changes to existing terms through the project's GitHub site. ECO is released into the public domain under CC0 1.0 Universal.


Subject(s)
Computational Biology/methods , Databases, Genetic , Gene Ontology , Proteins/genetics , Animals , Humans , Information Storage and Retrieval/methods , Internet , Proteins/metabolism , Sequence Analysis, Protein , User-Computer Interface
7.
J Biomed Semantics ; 9(1): 6, 2018 01 18.
Article in English | MEDLINE | ID: mdl-29347969

ABSTRACT

BACKGROUND: Creation and use of ontologies has become a mainstream activity in many disciplines, in particular, the biomedical domain. Ontology developers often disseminate information about these ontologies in peer-reviewed ontology description reports. There appears to be, however, a high degree of variability in the content of these reports. Often, important details are omitted such that it is difficult to gain a sufficient understanding of the ontology, its content and method of creation. RESULTS: We propose the Minimum Information for Reporting an Ontology (MIRO) guidelines as a means to facilitate a higher degree of completeness and consistency between ontology documentation, including published papers, and ultimately a higher standard of report quality. A draft of the MIRO guidelines was circulated for public comment in the form of a questionnaire, and we subsequently collected 110 responses from ontology authors, developers, users and reviewers. We report on the feedback of this consultation, including comments on each guideline, and present our analysis on the relative importance of each MIRO information item. These results were used to update the MIRO guidelines, mainly by providing more detailed operational definitions of the individual items and assigning degrees of importance. Based on our revised version of MIRO, we conducted a review of 15 recently published ontology description reports from three important journals in the Semantic Web and Biomedical domain and analysed them for compliance with the MIRO guidelines. We found that only 41.38% of the information items were covered by the majority of the papers (and deemed important by the survey respondents) and a large number of important items are not covered at all, like those related to testing and versioning policies. CONCLUSIONS: We believe that the community-reviewed MIRO guidelines can contribute to improving significantly the quality of ontology description reports and other documentation, in particular by increasing consistent reporting of important ontology features that are otherwise often neglected.


Subject(s)
Biological Ontologies , Guidelines as Topic , Research Design , Surveys and Questionnaires
8.
Nucleic Acids Res ; 46(D1): D1168-D1180, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29186578

ABSTRACT

The Planteome project (http://www.planteome.org) provides a suite of reference and species-specific ontologies for plants and annotations to genes and phenotypes. Ontologies serve as common standards for semantic integration of a large and growing corpus of plant genomics, phenomics and genetics data. The reference ontologies include the Plant Ontology, Plant Trait Ontology and the Plant Experimental Conditions Ontology developed by the Planteome project, along with the Gene Ontology, Chemical Entities of Biological Interest, Phenotype and Attribute Ontology, and others. The project also provides access to species-specific Crop Ontologies developed by various plant breeding and research communities from around the world. We provide integrated data on plant traits, phenotypes, and gene function and expression from 95 plant taxa, annotated with reference ontology terms. The Planteome project is developing a plant gene annotation platform; Planteome Noctua, to facilitate community engagement. All the Planteome ontologies are publicly available and are maintained at the Planteome GitHub site (https://github.com/Planteome) for sharing, tracking revisions and new requests. The annotated data are freely accessible from the ontology browser (http://browser.planteome.org/amigo) and our data repository.


Subject(s)
Databases, Genetic , Genome, Plant , Plants/genetics , Crops, Agricultural/genetics , Data Curation , Gene Expression Regulation, Plant , Gene Ontology , Molecular Sequence Annotation , Phenotype , Software , User-Computer Interface
9.
Nucleic Acids Res ; 45(D1): D347-D352, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27733503

ABSTRACT

Linked Data (LD) aims to achieve interconnected data by representing entities using Unified Resource Identifiers (URIs), and sharing information using Resource Description Frameworks (RDFs) and HTTP. Ontologies, which logically represent entities and relations in specific domains, are the basis of LD. Ontobee (http://www.ontobee.org/) is a linked ontology data server that stores ontology information using RDF triple store technology and supports query, visualization and linkage of ontology terms. Ontobee is also the default linked data server for publishing and browsing biomedical ontologies in the Open Biological Ontology (OBO) Foundry (http://obofoundry.org) library. Ontobee currently hosts more than 180 ontologies (including 131 OBO Foundry Library ontologies) with over four million terms. Ontobee provides a user-friendly web interface for querying and visualizing the details and hierarchy of a specific ontology term. Using the eXtensible Stylesheet Language Transformation (XSLT) technology, Ontobee is able to dereference a single ontology term URI, and then output RDF/eXtensible Markup Language (XML) for computer processing or display the HTML information on a web browser for human users. Statistics and detailed information are generated and displayed for each ontology listed in Ontobee. In addition, a SPARQL web interface is provided for custom advanced SPARQL queries of one or multiple ontologies.


Subject(s)
Biological Ontologies , Databases, Factual , Software , Web Browser
10.
Nucleic Acids Res ; 45(D1): D737-D743, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27794045

ABSTRACT

Upon the first publication of the fifth iteration of the Functional Annotation of Mammalian Genomes collaborative project, FANTOM5, we gathered a series of primary data and database systems into the FANTOM web resource (http://fantom.gsc.riken.jp) to facilitate researchers to explore transcriptional regulation and cellular states. In the course of the collaboration, primary data and analysis results have been expanded, and functionalities of the database systems enhanced. We believe that our data and web systems are invaluable resources, and we think the scientific community will benefit for this recent update to deepen their understanding of mammalian cellular organization. We introduce the contents of FANTOM5 here, report recent updates in the web resource and provide future perspectives.


Subject(s)
Databases, Genetic , Gene Expression Profiling/methods , Genomics/methods , Mammals/genetics , Software , Web Browser , Animals , Computational Biology , Humans , Search Engine
11.
Hum Mutat ; 36(10): 922-7, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26255989

ABSTRACT

Despite the increasing prevalence of clinical sequencing, the difficulty of identifying additional affected families is a key obstacle to solving many rare diseases. There may only be a handful of similar patients worldwide, and their data may be stored in diverse clinical and research databases. Computational methods are necessary to enable finding similar patients across the growing number of patient repositories and registries. We present the Matchmaker Exchange Application Programming Interface (MME API), a protocol and data format for exchanging phenotype and genotype profiles to enable matchmaking among patient databases, facilitate the identification of additional cohorts, and increase the rate with which rare diseases can be researched and diagnosed. We designed the API to be straightforward and flexible in order to simplify its adoption on a large number of data types and workflows. We also provide a public test data set, curated from the literature, to facilitate implementation of the API and development of new matching algorithms. The initial version of the API has been successfully implemented by three members of the Matchmaker Exchange and was immediately able to reproduce previously identified matches and generate several new leads currently being validated. The API is available at https://github.com/ga4gh/mme-apis.


Subject(s)
Computational Biology/methods , Information Dissemination/methods , Rare Diseases/genetics , Algorithms , Databases, Genetic , Genetic Predisposition to Disease , Genotype , Humans , Phenotype , Rare Diseases/pathology , Web Browser
12.
J Biomed Semantics ; 5(Suppl 1 Proceedings of the Bio-Ontologies Spec Interest G): S4, 2014.
Article in English | MEDLINE | ID: mdl-25093073

ABSTRACT

BACKGROUND: The molecular etiology is still to be identified for about half of the currently described Mendelian diseases in humans, thereby hindering efforts to find treatments or preventive measures. Advances, such as new sequencing technologies, have led to increasing amounts of data becoming available with which to address the problem of identifying disease genes. Therefore, automated methods are needed that reliably predict disease gene candidates based on available data. We have recently developed Exomiser as a tool for identifying causative variants from exome analysis results by filtering and prioritising using a number of criteria including the phenotype similarity between the disease and mouse mutants involving the gene candidates. Initial investigations revealed a variation in performance for different medical categories of disease, due in part to a varying contribution of the phenotype scoring component. RESULTS: In this study, we further analyse the performance of our cross-species phenotype matching algorithm, and examine in more detail the reasons why disease gene filtering based on phenotype data works better for certain disease categories than others. We found that in addition to misleading phenotype alignments between species, some disease categories are still more amenable to automated predictions than others, and that this often ties in with community perceptions on how well the organism works as model. CONCLUSIONS: In conclusion, our automated disease gene candidate predictions are highly dependent on the organism used for the predictions and the disease category being studied. Future work on computational disease gene prediction using phenotype data would benefit from methods that take into account the disease category and the source of model organism data.

13.
J Biomed Semantics ; 4(1): 2, 2013 Jan 04.
Article in English | MEDLINE | ID: mdl-23286517

ABSTRACT

BACKGROUND: Biomedical ontologies are key elements for building up the Life Sciences Semantic Web. Reusing and building biomedical ontologies requires flexible and versatile tools to manipulate them efficiently, in particular for enriching their axiomatic content. The Ontology Pre Processor Language (OPPL) is an OWL-based language for automating the changes to be performed in an ontology. OPPL augments the ontologists' toolbox by providing a more efficient, and less error-prone, mechanism for enriching a biomedical ontology than that obtained by a manual treatment. RESULTS: We present OPPL-Galaxy, a wrapper for using OPPL within Galaxy. The functionality delivered by OPPL (i.e. automated ontology manipulation) can be combined with the tools and workflows devised within the Galaxy framework, resulting in an enhancement of OPPL. Use cases are provided in order to demonstrate OPPL-Galaxy's capability for enriching, modifying and querying biomedical ontologies. CONCLUSIONS: Coupling OPPL-Galaxy with other bioinformatics tools of the Galaxy framework results in a system that is more than the sum of its parts. OPPL-Galaxy opens a new dimension of analyses and exploitation of biomedical ontologies, including automated reasoning, paving the way towards advanced biological data analyses.

14.
BMC Bioinformatics ; 12: 418, 2011 Oct 27.
Article in English | MEDLINE | ID: mdl-22032770

ABSTRACT

BACKGROUND: Ontologies are widely used to represent knowledge in biomedicine. Systematic approaches for detecting errors and disagreements are needed for large ontologies with hundreds or thousands of terms and semantic relationships. A recent approach of defining terms using logical definitions is now increasingly being adopted as a method for quality control as well as for facilitating interoperability and data integration. RESULTS: We show how automated reasoning over logical definitions of ontology terms can be used to improve ontology structure. We provide the Java software package GULO (Getting an Understanding of LOgical definitions), which allows fast and easy evaluation for any kind of logically decomposed ontology by generating a composite OWL ontology from appropriate subsets of the referenced ontologies and comparing the inferred relationships with the relationships asserted in the target ontology. As a case study we show how to use GULO to evaluate the logical definitions that have been developed for the Mammalian Phenotype Ontology (MPO). CONCLUSIONS: Logical definitions of terms from biomedical ontologies represent an important resource for error and disagreement detection. GULO gives ontology curators a fast and simple tool for validation of their work.


Subject(s)
Semantics , Software , Vocabulary, Controlled , Animals , Humans , Knowledge , Logic , Phenotype
15.
J Biomed Semantics ; 2 Suppl 1: S3, 2011 Mar 07.
Article in English | MEDLINE | ID: mdl-21388572

ABSTRACT

BACKGROUND: Ontologies are commonly used in biomedicine to organize concepts to describe domains such as anatomies, environments, experiment, taxonomies etc. NCBO BioPortal currently hosts about 180 different biomedical ontologies. These ontologies have been mainly expressed in either the Open Biomedical Ontology (OBO) format or the Web Ontology Language (OWL). OBO emerged from the Gene Ontology, and supports most of the biomedical ontology content. In comparison, OWL is a Semantic Web language, and is supported by the World Wide Web consortium together with integral query languages, rule languages and distributed infrastructure for information interchange. These features are highly desirable for the OBO content as well. A convenient method for leveraging these features for OBO ontologies is by transforming OBO ontologies to OWL. RESULTS: We have developed a methodology for translating OBO ontologies to OWL using the organization of the Semantic Web itself to guide the work. The approach reveals that the constructs of OBO can be grouped together to form a similar layer cake. Thus we were able to decompose the problem into two parts. Most OBO constructs have easy and obvious equivalence to a construct in OWL. A small subset of OBO constructs requires deeper consideration. We have defined transformations for all constructs in an effort to foster a standard common mapping between OBO and OWL. Our mapping produces OWL-DL, a Description Logics based subset of OWL with desirable computational properties for efficiency and correctness. Our Java implementation of the mapping is part of the official Gene Ontology project source. CONCLUSIONS: Our transformation system provides a lossless roundtrip mapping for OBO ontologies, i.e. an OBO ontology may be translated to OWL and back without loss of knowledge. In addition, it provides a roadmap for bridging the gap between the two ontology languages in order to enable the use of ontology content in a language independent manner.

17.
BMC Bioinformatics ; 11 Suppl 12: S8, 2010 Dec 21.
Article in English | MEDLINE | ID: mdl-21210987

ABSTRACT

BACKGROUND: The biosciences increasingly face the challenge of integrating a wide variety of available data, information and knowledge in order to gain an understanding of biological systems. Data integration is supported by a diverse series of tools, but the lack of a consistent terminology to label these data still presents significant hurdles. As a consequence, much of the available biological data remains disconnected or worse: becomes misconnected. The need to address this terminology problem has spawned the building of a large number of bio-ontologies. OBOF, RDF and OWL are among the most used ontology formats to capture terms and relationships in the Life Sciences, opening the potential to use the Semantic Web to support data integration and further exploitation of integrated resources via automated retrieval and reasoning procedures. METHODS: We extended the Perl suite ONTO-PERL and functionally integrated it into the Galaxy platform. The resulting ONTO-ToolKit supports the analysis and handling of OBO-formatted ontologies via the Galaxy interface, and we demonstrated its functionality in different use cases that illustrate the flexibility to obtain sets of ontology terms that match specific search criteria. RESULTS: ONTO-ToolKit is available as a tool suite for Galaxy. Galaxy not only provides a user friendly interface allowing the interested biologist to manipulate OBO ontologies, it also opens up the possibility to perform further biological (and ontological) analyses by using other tools available within the Galaxy environment. Moreover, it provides tools to translate OBO-formatted ontologies into Semantic Web formats such as RDF and OWL. CONCLUSIONS: ONTO-ToolKit reaches out to researchers in the biosciences, by providing a user-friendly way to analyse and manipulate ontologies. This type of functionality will become increasingly important given the wealth of information that is becoming available based on ontologies.


Subject(s)
Software , Vocabulary, Controlled , Biological Science Disciplines , Gene Expression , Internet , Proteins/classification , Schizosaccharomyces/genetics , Schizosaccharomyces/metabolism , Semantics
18.
Hum Mol Genet ; 19(4): 707-19, 2010 Feb 15.
Article in English | MEDLINE | ID: mdl-19933168

ABSTRACT

We describe a novel approach to genetic association analyses with proteins sub-divided into biologically relevant smaller sequence features (SFs), and their variant types (VTs). SFVT analyses are particularly informative for study of highly polymorphic proteins such as the human leukocyte antigen (HLA), given the nature of its genetic variation: the high level of polymorphism, the pattern of amino acid variability, and that most HLA variation occurs at functionally important sites, as well as its known role in organ transplant rejection, autoimmune disease development and response to infection. Further, combinations of variable amino acid sites shared by several HLA alleles (shared epitopes) are most likely better descriptors of the actual causative genetic variants. In a cohort of systemic sclerosis patients/controls, SFVT analysis shows that a combination of SFs implicating specific amino acid residues in peptide binding pockets 4 and 7 of HLA-DRB1 explains much of the molecular determinant of risk.


Subject(s)
Genetic Variation , HLA Antigens/genetics , Scleroderma, Systemic/genetics , HLA Antigens/chemistry , HLA-DR Antigens/chemistry , HLA-DR Antigens/genetics , HLA-DRB1 Chains , Humans , Molecular Conformation
19.
Article in English | MEDLINE | ID: mdl-19964203

ABSTRACT

This paper describes an approach to providing computer-interpretable logical definitions for the terms of the Human Phenotype Ontology (HPO) using PATO, the ontology of phenotypic qualities, to link terms of the HPO to the anatomic and other entities that are affected by abnormal phenotypic qualities. This approach will allow improved computerized reasoning as well as a facility to compare phenotypes between different species. The PATO mapping will also provide direct links from phenotypic abnormalities and underlying anatomic structures encoded using the Foundational Model of Anatomy, which will be a valuable resource for computational investigations of the links between anatomical components and concepts representing diseases with abnormal phenotypes and associated genes.


Subject(s)
Models, Anatomic , Phenotype , Animals , Biomedical Engineering , Computational Biology , Humans , Marfan Syndrome/pathology , Mice , Species Specificity , Vocabulary, Controlled
20.
Mamm Genome ; 20(8): 457-61, 2009 Aug.
Article in English | MEDLINE | ID: mdl-19649761

ABSTRACT

Now that the laboratory mouse genome is sequenced and the annotation of its gene content is improving, the next major challenge is the annotation of the phenotypic associations of mouse genes. This requires the development of systematic phenotyping pipelines that use standardized phenotyping procedures which allow comparison across laboratories. It also requires the development of a sophisticated informatics infrastructure for the description and interchange of phenotype data. Here we focus on the current state of the art in the description of data produced by systematic phenotyping approaches using ontologies, in particular, the EQ (Entity-Quality) approach, and what developments are required to facilitate the linking of phenotypic descriptions of mutant mice to human diseases.


Subject(s)
Disease Models, Animal , Disease/genetics , Mice/genetics , Animals , Genome , Humans , Mice, Transgenic , Phenotype
SELECTION OF CITATIONS
SEARCH DETAIL
...