Results 1 - 14 of 14
1.
bioRxiv ; 2024 Jun 30.
Article in English | MEDLINE | ID: mdl-38979389

ABSTRACT

The Data Coordinating Center (DCC) of the Human Tumor Atlas Network (HTAN) has played a crucial role in enabling the broad sharing and effective utilization of HTAN data within the scientific community. Data from the first phase of HTAN are now available publicly. We describe the diverse datasets and modalities shared, multiple access routes to HTAN assay data and metadata, data standards, technical infrastructure and governance approaches, as well as our approach to sustained community engagement. HTAN data can be accessed via the HTAN Portal, explored in visualization tools (including CellxGene, Minerva, and cBioPortal), and analyzed in the cloud through the NCI Cancer Research Data Commons nodes. We have developed a streamlined infrastructure to ingest and disseminate data by leveraging the Synapse platform. Taken together, the HTAN DCC's approach demonstrates a successful model for coordinating, standardizing, and disseminating complex cancer research data via multiple resources in the cancer data ecosystem, offering valuable insights for similar consortia and researchers looking to leverage HTAN data.

2.
Elife ; 9, 2020 03 17.
Article in English | MEDLINE | ID: mdl-32180547

ABSTRACT

Wikidata is a community-maintained knowledge base that has been assembled from repositories in the fields of genomics, proteomics, genetic variants, pathways, chemical compounds, and diseases, and that adheres to the FAIR principles of findability, accessibility, interoperability and reusability. Here we describe the breadth and depth of the biomedical knowledge contained within Wikidata, and discuss the open-source tools we have built to add information to Wikidata and to synchronize it with source databases. We also demonstrate several use cases for Wikidata, including the crowdsourced curation of biomedical ontologies, phenotype-based diagnosis of disease, and drug repurposing.


Subject(s)
Biological Science Disciplines , Computational Biology , Databases, Factual , Genomics , Proteomics , Humans , Pattern Recognition, Automated
3.
Nucleic Acids Res ; 47(D1): D955-D962, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30407550

ABSTRACT

The Human Disease Ontology (DO) (http://www.disease-ontology.org) database has undergone significant expansion in the past three years. The DO disease classification includes specific formal semantic rules to express meaningful disease models and has expanded from a single asserted classification to include multiple-inferred mechanistic disease classifications, thus providing novel perspectives on related diseases. Expansion of disease terms, alternative anatomy, cell type and genetic disease classifications and workflow automation highlight the updates for the DO since 2015. The enhanced breadth and depth of the DO's knowledgebase has expanded the DO's utility for exploring the multi-etiology of human disease, thus improving the capture and communication of health-related data across biomedical databases, bioinformatics tools, genomic and cancer resources, as demonstrated by a 6.6× growth in DO's user community since 2015. The DO's continual integration of human disease knowledge, evidenced by the more than 200 SVN/GitHub releases/revisions since previously reported in our DO 2015 NAR paper, includes the addition of 2650 new disease terms, a 30% increase of textual definitions, and an expanding suite of disease classification hierarchies constructed through defined logical axioms.


Subject(s)
Biological Ontologies , Databases, Factual , Disease , Disease/classification , Disease/etiology , Humans , Workflow
4.
Nucleic Acids Res ; 47(D1): D1186-D1194, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30407590

ABSTRACT

The Evidence and Conclusion Ontology (ECO) contains terms (classes) that describe types of evidence and assertion methods. ECO terms are used in the process of biocuration to capture the evidence that supports biological assertions (e.g. gene product X has function Y as supported by evidence Z). Capture of this information allows tracking of annotation provenance, establishment of quality control measures and query of evidence. ECO contains over 1500 terms and is in use by many leading biological resources including the Gene Ontology, UniProt and several model organism databases. ECO is continually being expanded and revised based on the needs of the biocuration community. The ontology is freely available for download from GitHub (https://github.com/evidenceontology/) or the project's website (http://evidenceontology.org/). Users can request new terms or changes to existing terms through the project's GitHub site. ECO is released into the public domain under CC0 1.0 Universal.
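
The assertion pattern mentioned in this abstract (gene product X has function Y as supported by evidence Z) could be represented roughly as follows. This is only a minimal sketch: the `Annotation` class, the `filter_by_evidence` helper, and the identifier strings such as `ECO:EXAMPLE1` are hypothetical placeholders and are not taken from ECO or from any curation tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One biological assertion plus the evidence type that supports it."""
    gene_product: str  # X
    function: str      # Y
    evidence: str      # Z, an evidence-term identifier (placeholder values here)


def filter_by_evidence(annotations, evidence):
    """Return the annotations supported by the given evidence term."""
    return [a for a in annotations if a.evidence == evidence]


# Placeholder records; the identifiers are illustrative, not real ECO terms.
annotations = [
    Annotation("X", "Y", "ECO:EXAMPLE1"),
    Annotation("A", "B", "ECO:EXAMPLE2"),
]
```

Storing the evidence term next to each assertion is what makes the provenance tracking and evidence queries described in the abstract possible, for example `filter_by_evidence(annotations, "ECO:EXAMPLE1")`.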


Subject(s)
Computational Biology/methods , Databases, Genetic , Gene Ontology , Proteins/genetics , Animals , Humans , Information Storage and Retrieval/methods , Internet , Proteins/metabolism , Sequence Analysis, Protein , User-Computer Interface
5.
Dis Model Mech ; 11(3), 2018 03 12.
Article in English | MEDLINE | ID: mdl-29590633

ABSTRACT

Model organisms are vital to uncovering the mechanisms of human disease and developing new therapeutic tools. Researchers collecting and integrating relevant model organism and/or human data often apply disparate terminologies (vocabularies and ontologies), making comparisons and inferences difficult. A unified disease ontology is required that connects data annotated using diverse disease terminologies, and in which the terminology relationships are continuously maintained. The Mouse Genome Database (MGD, http://www.informatics.jax.org), Rat Genome Database (RGD, http://rgd.mcw.edu) and Disease Ontology (DO, http://www.disease-ontology.org) projects are collaborating to augment DO, aligning and incorporating disease terms used by MGD and RGD, and improving DO as a tool for unifying disease annotations across species. Coordinated assessment of MGD's and RGD's disease term annotations identified new terms that enhance DO's representation of human diseases. Expansion of DO term content and cross-references to clinical vocabularies (e.g. OMIM, ORDO, MeSH) has enriched the DO's domain coverage and utility for annotating many types of data generated from experimental and clinical investigations. The extension of anatomy-based DO classification structure of disease improves accessibility of terms and facilitates application of DO for computational research. A consistent representation of disease associations across data types from cellular to whole organism, generated from clinical and model organism studies, will promote the integration, mining and comparative analysis of these data. The coordinated enrichment of the DO and adoption of DO by MGD and RGD demonstrates DO's usability across human data, MGD, RGD and the rest of the model organism database community.


Subject(s)
Disease/genetics , Gene Ontology , Molecular Sequence Annotation , Animals , Databases, Genetic , Mice , Rats , Species Specificity
6.
Article in English | MEDLINE | ID: mdl-26989148

ABSTRACT

Open biological data are distributed over many resources making them challenging to integrate, to update and to disseminate quickly. Wikidata is a growing, open community database which can serve this purpose and also provides tight integration with Wikipedia. In order to improve the state of biological data, facilitate data management and dissemination, we imported all human and mouse genes, and all human and mouse proteins into Wikidata. In total, 59,721 human genes and 73,355 mouse genes have been imported from NCBI and 27,306 human proteins and 16,728 mouse proteins have been imported from the Swissprot subset of UniProt. As Wikidata is open and can be edited by anybody, our corpus of imported data serves as the starting point for integration of further data by scientists, the Wikidata community and citizen scientists alike. The first use case for these data is to populate Wikipedia Gene Wiki infoboxes directly from Wikidata with the data integrated above. This enables immediate updates of the Gene Wiki infoboxes as soon as the data in Wikidata are modified. Although Gene Wiki pages are currently only on the English language version of Wikipedia, the multilingual nature of Wikidata allows for usage of the data we imported in all 280 different language Wikipedias. Apart from the Gene Wiki infobox use case, a SPARQL endpoint and exporting functionality to several standard formats (e.g. JSON, XML) enable use of the data by scientists. In summary, we created a fully open and extensible data resource for human and mouse molecular biology and biochemistry data. This resource enriches all the Wikipedias with structured information and serves as a new linking hub for the biological semantic web. Database URL: https://www.wikidata.org/.


Subject(s)
Databases, Nucleic Acid , Semantics , Animals , Humans , Mice , Models, Theoretical , Search Engine
7.
Mamm Genome ; 26(9-10): 584-9, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26093607

ABSTRACT

The Disease Ontology (DO) enables cross-domain data integration through a common standard of human disease terms and their etiological descriptions. Standardized disease descriptors that are integrated across mammalian genomic resources provide a human-readable, machine-interpretable, community-driven disease corpus that unifies the representation of human common and rare diseases. The DO is populated by consensus-driven disease data descriptors that incorporate disease terms utilized by genomic and genetic projects and resources engaged in studies to understand the genetics of human disease through the study of model organisms. The DO project serves multiple roles for the model organism community by providing: (1) a structured "backbone" of disease concepts represented among the model organism databases; (2) authoritative disease curation services to researchers and resource providers; and (3) development of subsets of the DO representative of human diseases annotated to animal models curated within the model organism databases.


Subject(s)
Databases, Genetic , Disease Models, Animal , Genetic Diseases, Inborn/classification , Animals , Genetic Diseases, Inborn/genetics , Genome , Humans , Phenotype
8.
Database (Oxford) ; 2015: bav032, 2015.
Article in English | MEDLINE | ID: mdl-25841438

ABSTRACT

Bio-ontologies provide terminologies for the scientific community to describe biomedical entities in a standardized manner. There are multiple initiatives that are developing biomedical terminologies for the purpose of providing better annotation, data integration and mining capabilities. Terminology resources devised for multiple purposes inherently diverge in content and structure. A major issue of biomedical data integration is the development of overlapping terms, ambiguous classifications and inconsistencies represented across databases and publications. The Disease Ontology (DO) was developed over the past decade to address data integration, standardization and annotation issues for human disease data. We have established a DO cancer project to be a focused view of cancer terms within the DO. The DO cancer project mapped 386 cancer terms from the Catalogue of Somatic Mutations in Cancer (COSMIC), The Cancer Genome Atlas (TCGA), International Cancer Genome Consortium, Therapeutically Applicable Research to Generate Effective Treatments, Integrative Oncogenomics and the Early Detection Research Network into a cohesive set of 187 DO terms represented by 63 top-level DO cancer terms. For example, the COSMIC term 'kidney, NS, carcinoma, clear_cell_renal_cell_carcinoma' and TCGA term 'Kidney renal clear cell carcinoma' were both grouped to the term 'Disease Ontology Identification (DOID):4467 / renal clear cell carcinoma' which was mapped to the TopNodes_DOcancerslim term 'DOID:263 / kidney cancer'. Mapping of diverse cancer terms to DO and the use of top level terms (DO slims) will enable pan-cancer analysis across datasets generated from any of the cancer term sources, where pan-cancer means including or relating to all or multiple types of cancer. The terms can be browsed from the DO web site (http://www.disease-ontology.org) and downloaded from the DO's Apache Subversion or GitHub repositories. Database URL: http://www.disease-ontology.org
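
The two-level mapping in the kidney example above (source term to DO term, then DO term to top-level slim term) can be sketched as a pair of lookup tables. This is an illustrative sketch only: the table names and the functions `to_slim` and `group_by_slim` are hypothetical, and only the single example mapping quoted in the abstract is included, not the project's actual mapping files.

```python
# Source-specific cancer term -> DO term (only the example from the abstract).
SOURCE_TO_DO = {
    ("COSMIC", "kidney, NS, carcinoma, clear_cell_renal_cell_carcinoma"): "DOID:4467",
    ("TCGA", "Kidney renal clear cell carcinoma"): "DOID:4467",
}

# DO term -> top-level slim term (TopNodes_DOcancerslim in the abstract).
DO_TO_SLIM = {"DOID:4467": "DOID:263"}

# Human-readable labels for the identifiers used above.
DO_LABELS = {
    "DOID:4467": "renal clear cell carcinoma",
    "DOID:263": "kidney cancer",
}


def to_slim(source, term):
    """Resolve a source term to its top-level slim DOID, or None if unmapped."""
    do_id = SOURCE_TO_DO.get((source, term))
    if do_id is None:
        return None
    return DO_TO_SLIM.get(do_id)


def group_by_slim(records):
    """Group (source, term) records by slim DOID for pan-cancer comparison."""
    groups = {}
    for source, term in records:
        groups.setdefault(to_slim(source, term), []).append((source, term))
    return groups
```

Under these assumptions, records from COSMIC and TCGA that use different wording for the same disease land in the same `DOID:263` group, which is the property that makes cross-dataset analysis possible.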


Subject(s)
Biological Ontologies , Data Mining , Databases, Factual , Neoplasms , Animals , Humans
9.
PLoS Negl Trop Dis ; 9(2): e0003479, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25646954

ABSTRACT

BACKGROUND: Ontologies represent powerful tools in information technology because they enhance interoperability and facilitate, among other things, the construction of optimized search engines. To address the need to expand the toolbox available for the control and prevention of vector-borne diseases we embarked on the construction of specific ontologies. We present here IDODEN, an ontology that describes dengue fever, one of the globally most important diseases that are transmitted by mosquitoes. METHODOLOGY/PRINCIPAL FINDINGS: We constructed IDODEN using open source software, and modeled it on IDOMAL, the malaria ontology developed previously. IDODEN covers all aspects of dengue fever, such as disease biology, epidemiology and clinical features. Moreover, it covers all facets of dengue entomology. IDODEN, which is freely available, can now be used for the annotation of dengue-related data and, in addition to its use for modeling, it can be utilized for the construction of other dedicated IT tools such as decision support systems. CONCLUSIONS/SIGNIFICANCE: The availability of the dengue ontology will enable databases hosting dengue-associated data and decision-support systems for that disease to perform most efficiently and to link their own data to those stored in other independent repositories, in an architecture- and software-independent manner.


Subject(s)
Biological Ontologies , Dengue/transmission , Software , Animals , Databases, Factual , Humans
10.
Nucleic Acids Res ; 43(Database issue): D1071-8, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25348409

ABSTRACT

The current version of the Human Disease Ontology (DO) (http://www.disease-ontology.org) database expands the utility of the ontology for the examination and comparison of genetic variation, phenotype, protein, drug and epitope data through the lens of human disease. DO is a biomedical resource of standardized common and rare disease concepts with stable identifiers organized by disease etiology. The content of DO has had 192 revisions since 2012, including the addition of 760 terms. Thirty-two percent of all terms now include definitions. DO has expanded the number and diversity of research communities and community members by 50+ during the past two years. These community members actively submit term requests, coordinate biomedical resource disease representation and provide expert curation guidance. Since the DO 2012 NAR paper, there have been hundreds of term requests and a steady increase in the number of DO listserv members, Twitter followers and DO website usage. DO is moving to a multi-editor model utilizing Protégé to curate DO in Web Ontology Language (OWL). This will enable closer collaboration with the Human Phenotype Ontology, EBI's Ontology Working Group, Mouse Genome Informatics and the Monarch Initiative among others, and enhance DO's current asserted view and multiple inferred views through reasoning.


Subject(s)
Biological Ontologies , Databases, Factual , Disease , Genetic Diseases, Inborn , Humans , Internet , Rare Diseases/genetics
11.
Pathog Glob Health ; 107(6): 305-11, 2013 Sep.
Article in English | MEDLINE | ID: mdl-24091152

ABSTRACT

Arthropod borne diseases cause significant human morbidity and mortality and, therefore, efficient measures to control transmission of the disease agents would have great impact on human health. One strategy to achieve this goal is based on the manipulation of bacterial symbionts of vectors. Bacteria of the Gram-negative, acetic acid bacterium genus Asaia have been found to be stably associated with larvae and adults of the Southeast Asian malaria vector Anopheles stephensi, dominating the microbiota of the mosquito. We show here that after the infection of Anopheles gambiae larvae with Asaia the bacteria were stably associated with the mosquitoes, becoming part of the microflora of the midgut and remaining there for the duration of the life cycle. Moreover they were passed on to the next generation through vertical transmission. Additionally, we show that there is an increase in the developmental rate when additional bacteria are introduced into the organism which leads us to the conclusion that Asaia plays a yet undetermined crucial role during the larval stages. Our microarray analysis showed that the larval genes that are mostly affected are involved in cuticle formation, and include mainly members of the CPR gene family.


Subject(s)
Acetobacteraceae/physiology , Anopheles/growth & development , Anopheles/microbiology , Symbiosis , Acetobacteraceae/growth & development , Animals , Anopheles/genetics , Gastrointestinal Tract/microbiology , Gene Expression Profiling , Larva/genetics , Larva/growth & development , Larva/microbiology , Microarray Analysis
12.
J Biomed Semantics ; 4(1): 16, 2013 Sep 13.
Article in English | MEDLINE | ID: mdl-24034841

ABSTRACT

BACKGROUND: With about half a billion cases, of which nearly one million are fatal, malaria constitutes one of the major infectious diseases worldwide. A recently revived effort to eliminate the disease also focuses on IT resources for its efficient control, which prominently includes the control of the mosquito vectors that transmit the Plasmodium pathogens. As part of this effort, IDOMAL has been developed and it is continually being updated. FINDINGS: In addition to the improvement of IDOMAL's structure and the correction of some inaccuracies, there were some major subdomain additions such as a section on natural products and remedies, and the import, from other, higher order ontologies, of several terms, which were merged with IDOMAL terms. Effort was put into rendering IDOMAL fully compatible as an extension of IDO, the Infectious Disease Ontology. The difficulties in fully reaching that target stem from the inherent differences between vector-borne diseases and "classical" infectious diseases, which make it necessary to specifically adjust the ontology's architecture in order to comprise vectors and their populations. CONCLUSIONS: In addition to a higher coverage of domain-specific terms and optimizing its usage by databases and decision-support systems, the new version of IDOMAL described here allows for more cross-talk between it and other ontologies, and in particular IDO. The malaria ontology is available for downloading at the OBO Foundry (http://www.obofoundry.org/cgi-bin/detail.cgi?id=malaria_ontology) and the NCBO BioPortal (http://bioportal.bioontology.org/ontologies/1311).

13.
J Biomed Inform ; 44(1): 42-7, 2011 Feb.
Article in English | MEDLINE | ID: mdl-20363364

ABSTRACT

We are developing a set of ontologies dealing with vector-borne diseases as well as the arthropod vectors that transmit them. After building ontologies for mosquito and tick anatomy we continued this project with an ontology of insecticide resistance followed by a series of ontologies that describe malaria as well as physiological processes of mosquitoes that are relevant to, and involved in, disease transmission. These will later be expanded to encompass other vector-borne diseases as well as non-mosquito vectors. The aim of the whole undertaking, which is worked out in the frame of the international IDO (Infectious Disease Ontology) project, is to provide the community with a set of ontological tools that can be used both in the development of specific databases and, most importantly, in the construction of decision support systems (DSS) to control these diseases.


Subject(s)
Arthropod Vectors , Disease Transmission, Infectious , Medical Informatics , Vocabulary, Controlled , Animals , Database Management Systems , Decision Making, Computer-Assisted , Malaria/parasitology
14.
Malar J ; 9: 230, 2010 Aug 10.
Article in English | MEDLINE | ID: mdl-20698959

ABSTRACT

BACKGROUND: Ontologies are rapidly becoming a necessity for the design of efficient information technology tools, especially databases, because they permit the organization of stored data using logical rules and defined terms that are understood by both humans and machines. This results in both enhanced usage and interoperability of databases and related resources. It is hoped that IDOMAL, the ontology of malaria, will prove a valuable instrument when implemented in both malaria research and control measures. METHODS: The OBOEdit2 software was used for the construction of the ontology. IDOMAL is based on the Basic Formal Ontology (BFO) and follows the rules set by the OBO Foundry consortium. RESULTS: The first version of the malaria ontology covers both clinical and epidemiological aspects of the disease, as well as disease and vector biology. IDOMAL is meant to later become the nucleation site for a much larger ontology of vector borne diseases, which will itself be an extension of a large ontology of infectious diseases (IDO). The latter is currently being developed in the frame of a large international collaborative effort. CONCLUSIONS: IDOMAL, already freely available in its first version, will form part of a suite of ontologies that will be used to drive IT tools and databases specifically constructed to help control malaria and, later, other vector-borne diseases. This suite already consists of the ontology described here as well as the one on insecticide resistance that has been available for some time. Additional components are being developed and introduced into IDOMAL.


Subject(s)
Computational Biology/methods , Disease Vectors , Information Storage and Retrieval/methods , Malaria , Animals , Databases, Factual , Humans , Insecticide Resistance , Malaria/diagnosis , Malaria/epidemiology , Malaria/therapy , Malaria/transmission , Software , Vocabulary, Controlled