Results 1 - 13 of 13
1.
J Biomed Inform ; 60: 177-86, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26873780

ABSTRACT

Modern biomedical research relies on the semantic integration of heterogeneous data sources to find data correlations. Researchers access multiple datasets of disparate origin and identify elements (e.g. genes, compounds, pathways) that lead to interesting correlations. Normally, they must refer to additional public databases (e.g. scientific literature, published clinical trial results) in order to enrich the information about the identified entities. While semantic integration techniques have traditionally focused on providing homogeneous access to private datasets, thus helping automate the first part of the research, and there exist different solutions for browsing public data, there is still a need for tools that facilitate merging public repositories with private datasets. This paper presents a framework that automatically locates public data of interest to the researcher and semantically integrates it with existing private datasets. The framework has been designed as an extension of traditional data integration systems, and has been validated with an existing data integration platform from a European research project by integrating a private biological dataset with data from the National Center for Biotechnology Information (NCBI).


Subject(s)
Information Storage and Retrieval/methods , Semantics , Software , Systems Integration , Biomedical Research , Computational Biology/methods , Databases, Factual , Humans , MicroRNAs/genetics , User-Computer Interface , Wilms Tumor/genetics
2.
PLoS One ; 9(10): e110331, 2014.
Article in English | MEDLINE | ID: mdl-25347075

ABSTRACT

BACKGROUND: Clinical Trials (CTs) are essential for bridging the gap between experimental research on new drugs and their clinical application. Just as CTs for traditional drugs and biologics have helped accelerate the translation of biomedical findings into medical practice, CTs for nanodrugs and nanodevices could advance novel nanomaterials as agents for diagnosis and therapy. Although there is publicly available information about nanomedicine-related CTs, the online archiving of this information is carried out without adhering to criteria that discriminate between studies involving nanomaterials or nanotechnology-based processes (nano), and CTs that do not involve nanotechnology (non-nano). Finding out whether nanodrugs and nanodevices were involved in a study from CT summaries alone is a challenging task. At the time of writing, CTs archived in the well-known online registry ClinicalTrials.gov cannot easily be told apart as nano or non-nano, even by domain experts, because of the lack of both a common definition of nanotechnology and standards for reporting nanomedical experiments and results. METHODS: We propose a supervised learning approach for classifying CT summaries from ClinicalTrials.gov according to whether they fall into the nano or the non-nano categories. Our method involves several stages: i) extraction and manual annotation of CTs as nano vs. non-nano, ii) pre-processing and automatic classification, and iii) performance evaluation using several state-of-the-art classifiers under different transformations of the original dataset. RESULTS AND CONCLUSIONS: The performance of the best automated classifier closely matches that of experts (AUC over 0.95), suggesting that it is feasible to automatically detect the presence of nanotechnology products in CT summaries with a high degree of accuracy. This can significantly speed up the process of finding whether reports on ClinicalTrials.gov might be relevant to a particular nanoparticle or nanodevice, which is essential to discover any precedents for nanotoxicity events or advantages for targeted drug therapy.
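
To make the reported metric concrete, the following is a minimal sketch of how an area under the ROC curve (AUC) can be computed from classifier scores, using the rank-based formulation (the probability that a randomly chosen positive example is scored above a randomly chosen negative one). The function name `roc_auc` and the toy labels and scores are assumptions made for illustration; they are not data or code from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: sequence of 1 (nano) or 0 (non-nano).
    scores: classifier outputs, higher meaning "more likely nano".
    A tie between a positive and a negative counts as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Invented toy data (hypothetical): 1 = nano, 0 = non-nano.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
```

On this toy data, eight of the nine positive/negative pairs are ordered correctly, so the AUC is 8/9. The pairwise loop is quadratic and is meant only to expose the definition; a sort-based computation would be preferable for large datasets.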


Subject(s)
Artificial Intelligence , Clinical Trials as Topic , Medical Informatics , Nanomedicine , Nanotechnology , Web Browser , Humans , ROC Curve , Registries , Reproducibility of Results , Research Design
3.
Ecancermedicalscience ; 8: 399, 2014.
Article in English | MEDLINE | ID: mdl-24567756

ABSTRACT

Usability testing methods are nowadays integrated into the design and development of health-care software, and the need for usability in health-care information technology (IT) is widely accepted by clinicians and researchers. Usability assessment starts with the identification of specific objectives that need to be tested and continues with the definition of evaluation criteria and monitoring procedures before usability tests are performed to assess the quality of all services and tasks. Such a process is implemented in the p-medicine environment and gives feedback iteratively to all software developers in the project. GCP (good clinical practice) criteria require additional usability testing of the software. For the p-medicine project (www.p-medicine.eu), an extended usability concept (EUC) was developed. The EUC covers topics like ease of use, likeability, and usefulness, usability in trial centres characterised by a mixed care and research environment and by extreme time constraints, confidentiality, use of source documents, standard operating procedures (SOPs), and quality control during data handling to ensure that all data are reliable and have been processed correctly in terms of accuracy, completeness, legibility, consistency, and timeliness. Here, we describe the p-medicine EUC, focusing on two of the many key tools: ObTiMA and the Ontology Annotator (OA).

4.
Article in English | MEDLINE | ID: mdl-24110412

ABSTRACT

Clinical decision support (CDS) systems promise to improve the quality of clinical care by helping physicians to make better, more informed decisions efficiently. However, the design and testing of CDS systems for practical medical use is cumbersome. It has been recognized that this may easily lead to a problematic mismatch between the developers' idea of the system and requirements from clinical practice. In this paper, we present an approach to reduce the complexity of constructing a CDS system. The approach is based on an ontological annotation of data resources, which improves standardization and the semantic processing of data. This, in turn, allows data mining tools to be used to automatically create hypotheses for CDS models, which reduces the manual workload in the creation of a new model. The approach is implemented in the context of the EU research project p-medicine. A proof-of-concept implementation on data from an existing leukemia study is presented.


Subject(s)
Decision Support Systems, Clinical , Algorithms , Data Mining , Decision Support Techniques , Humans
5.
Article in English | MEDLINE | ID: mdl-23920744

ABSTRACT

RDF has established itself in recent years as the language for describing, publishing and sharing biomedical resources. Following this trend, a great number of RDF-based data sources, as well as ontologies, have appeared. Using a common language such as RDF has provided a unified syntax for sharing resources, but semantics remain the main cause of heterogeneity, hampering data integration and homogenization efforts. To overcome this issue, solutions based on ontology alignment have typically been used. However, alignment information is usually codified using ad-hoc formats. In this paper, we present a general-purpose ontology mapping format, totally independent from the homogenization approach to be applied. The format is accompanied by a Java API that offers mapping construction and parsing features, as well as some basic algorithms for applying it to data translation solutions.


Subject(s)
Biological Ontologies , Electronic Health Records , Medical Record Linkage/methods , Natural Language Processing , Programming Languages , Terminology as Topic , Systems Integration
6.
Biomed Res Int ; 2013: 983805, 2013.
Article in English | MEDLINE | ID: mdl-23984425

ABSTRACT

RDF has become the standard technology for enabling interoperability among heterogeneous biomedical databases. The NCBI provides access to a large set of life sciences databases through a common interface called Entrez. However, the latter does not provide RDF-based access to such databases, and, therefore, they cannot be integrated with other RDF-compliant databases and accessed via SPARQL query interfaces. This paper presents the NCBI2RDF system, aimed at providing RDF-based access to the complete NCBI data repository. This API creates a virtual endpoint for servicing SPARQL queries over different NCBI repositories and presenting the query results to users in SPARQL results format, thus enabling this data to be integrated and/or stored with other RDF-compliant repositories. SPARQL queries are dynamically resolved, decomposed, and forwarded to the NCBI-provided E-utilities programmatic interface to access the NCBI data. Furthermore, we show how our approach increases the expressiveness of the native NCBI querying system, allowing several databases to be accessed simultaneously. This feature significantly boosts productivity when working with complex queries and saves biomedical researchers time and effort. Our approach has been validated with a large number of SPARQL queries, thus proving its reliability and enhanced capabilities in biomedical environments.
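
The forwarding step described above can be pictured with a small sketch that turns per-database search constraints into one E-utilities ESearch request URL each. This is an illustration under stated assumptions and not the NCBI2RDF implementation: the helper names `build_esearch_url` and `fan_out`, and the example constraints, are hypothetical. Only the ESearch endpoint and its `db`, `term`, `retmax` and `retmode` parameters are taken from the public E-utilities interface. No network request is made.

```python
from urllib.parse import urlencode

# Public base address of the NCBI E-utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db, term, retmax=20):
    """Build (but do not send) an ESearch request for one NCBI database."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def fan_out(constraints):
    """Hypothetical decomposition step: one ESearch URL per database.

    constraints maps an NCBI database name to the search term that a
    higher-level query would impose on that database.
    """
    return {db: build_esearch_url(db, term) for db, term in constraints.items()}


# Hypothetical example: one logical query touching two databases.
urls = fan_out({"pubmed": "Wilms tumor", "gene": "WT1"})
```

A real SPARQL-to-Entrez translation would also need to parse the query, map predicates to Entrez search fields, and join the per-database results; the sketch covers only the final fan-out into requests.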


Subject(s)
Access to Information , Databases, Genetic , Software , Search Engine
7.
Comput Methods Programs Biomed ; 111(1): 220-7, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23669178

ABSTRACT

This paper presents RDFBuilder, a tool that enables RDF-based access to MAGE-ML-compliant microarray databases. We have developed a system that automatically transforms the MAGE-OM model and microarray data stored in the ArrayExpress database into RDF format. Additionally, the system automatically enables a SPARQL endpoint. This allows users to execute SPARQL queries for retrieving microarray data, either from specific experiments or from more than one experiment at a time. Our system optimizes response times by caching and reusing information from previous queries. In this paper, we describe our methods for achieving this transformation. We show that our approach is complementary to other existing initiatives, such as Bio2RDF, for accessing and retrieving data from the ArrayExpress database.
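
The abstract does not say how RDFBuilder caches query information, so the following is only a generic sketch of the idea of reusing results from previous queries. The class `CachedEndpoint`, the whitespace-based normalization, and the stand-in backend are all assumptions made for illustration.

```python
class CachedEndpoint:
    """Wrap a query-executing callable and reuse results for repeated queries."""

    def __init__(self, execute):
        self._execute = execute   # callable: query string -> result
        self._cache = {}          # normalized query -> cached result
        self.backend_calls = 0    # how often the backend was actually hit

    @staticmethod
    def _normalize(query):
        # Collapse whitespace so trivially reformatted queries share a cache entry.
        return " ".join(query.split())

    def query(self, sparql):
        key = self._normalize(sparql)
        if key not in self._cache:
            self.backend_calls += 1
            self._cache[key] = self._execute(key)
        return self._cache[key]


def fake_backend(query):
    """Stand-in for an expensive transformation and query step (hypothetical)."""
    return {"query": query, "rows": []}


endpoint = CachedEndpoint(fake_backend)
first = endpoint.query("SELECT ?s WHERE { ?s ?p ?o }")
second = endpoint.query("SELECT ?s   WHERE {  ?s ?p ?o  }")
```

Both calls resolve to the same normalized query, so the backend runs only once. A production cache would also need an eviction policy and invalidation when the underlying data changes.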


Subject(s)
Databases, Genetic/statistics & numerical data , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Software , Computational Biology , Humans , Information Storage and Retrieval/statistics & numerical data , Programming Languages , User-Computer Interface
8.
Stud Health Technol Inform ; 169: 734-8, 2011.
Article in English | MEDLINE | ID: mdl-21893844

ABSTRACT

The challenges regarding seamless integration of distributed, heterogeneous and multilevel data arising in the context of contemporary, post-genomic clinical trials cannot be effectively addressed with current methodologies. An urgent need exists to access data in a uniform manner, to share information among different clinical and research centers, and to store data in secure repositories assuring the privacy of patients. Advancing Clinico-Genomic Trials (ACGT) was a European Commission funded Integrated Project that aimed at providing tools and methods to enhance the efficiency of clinical trials in the -omics era. The project, now completed after four years of work, involved the development of a set of methodological approaches as well as tools and services, and their testing in the context of real-world clinico-genomic scenarios. This paper describes the main experiences using the ACGT platform and its tools within one such scenario and highlights the very promising results obtained.


Subject(s)
Computational Biology/organization & administration , Medical Informatics/organization & administration , Biomedical Research , Clinical Trials as Topic , Computer Systems , Computers , Europe , Genomics , Humans , Neoplasms/genetics , Program Development , User-Computer Interface , Workflow
9.
J Biomed Inform ; 44(1): 8-25, 2011 Feb.
Article in English | MEDLINE | ID: mdl-20438862

ABSTRACT

OBJECTIVE: This paper introduces the objectives, methods and results of ontology development in the EU co-funded project Advancing Clinico-genomic Trials on Cancer-Open Grid Services for Improving Medical Knowledge Discovery (ACGT). While the available data in the life sciences has recently grown both in amount and quality, the full exploitation of it is being hindered by the use of different underlying technologies, coding systems, category schemes and reporting methods on the part of different research groups. The goal of the ACGT project is to contribute to the resolution of these problems by developing an ontology-driven, semantic grid services infrastructure that will enable efficient execution of discovery-driven scientific workflows in the context of multi-centric, post-genomic clinical trials. The focus of the present paper is the ACGT Master Ontology (MO). METHODS: ACGT project researchers undertook a systematic review of existing domain and upper-level ontologies, as well as of existing ontology design software, implementation methods, and end-user interfaces. This included the careful study of best practices, design principles and evaluation methods for ontology design, maintenance, implementation, and versioning, as well as for use on the part of domain experts and clinicians. 
RESULTS: To date, the results of the ACGT project include (i) the development of a master ontology (the ACGT-MO) based on clearly defined principles of ontology development and evaluation; (ii) the development of a technical infrastructure (the ACGT Platform) that implements the ACGT-MO utilizing independent tools, components and resources that have been developed based on open architectural standards, and which includes an application updating and evolving the ontology efficiently in response to end-user needs; and (iii) the development of an Ontology-based Trial Management Application (ObTiMA) that integrates the ACGT-MO into the design process of clinical trials in order to guarantee automatic semantic integration without the need to perform a separate mapping process.


Subject(s)
Computational Biology , Database Management Systems , Medical Informatics , Medical Oncology , Neoplasms , Animals , Databases, Factual , Humans , Vocabulary, Controlled
10.
Bioinformatics ; 26(21): 2801-2, 2010 Nov 01.
Article in English | MEDLINE | ID: mdl-20829445

ABSTRACT

SUMMARY: PubDNA Finder is an online repository that we have created to link PubMed Central manuscripts to the sequences of nucleic acids appearing in them. It extends the search capabilities provided by PubMed Central by enabling researchers to perform advanced searches involving sequences of nucleic acids. This includes, among other features (i) searching for papers mentioning one or more specific sequences of nucleic acids and (ii) retrieving the genetic sequences appearing in different articles. These additional query capabilities are provided by a searchable index that we created by using the full text of the 176 672 papers available at PubMed Central at the time of writing and the sequences of nucleic acids appearing in them. To automatically extract the genetic sequences occurring in each paper, we used an original method we have developed. The database is updated monthly by automatically connecting to the PubMed Central FTP site to retrieve and index new manuscripts. Users can query the database via the web interface provided. AVAILABILITY: PubDNA Finder can be freely accessed at http://servet.dia.fi.upm.es:8080/pubdnafinder
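
The authors' extraction method is not described in this abstract. As a rough illustration of the general task only, the sketch below uses a regular expression to pull candidate nucleic acid sequences out of free text. The function name `find_sequences`, the minimum length of 12 bases, and the sample sentence are assumptions, and this simple heuristic should not be read as the PubDNA Finder algorithm.

```python
import re


def find_sequences(text, min_len=12):
    """Return runs of at least `min_len` uppercase bases (A, C, G, T, U).

    The lookarounds reject runs embedded in longer words. Mixed T/U runs
    are accepted, which is a simplification.
    """
    pattern = re.compile(r"(?<![A-Za-z])[ACGTU]{%d,}(?![A-Za-z])" % min_len)
    return [match.group(0) for match in pattern.finditer(text)]


# Invented sample sentence (hypothetical).
sample = "The forward primer 5'-ACGTGGCTAGCTAGGCTA-3' was used with ACGT controls."
found = find_sequences(sample)
```

The short token `ACGT` is ignored because it falls below the length threshold, which is what keeps abbreviations and ordinary capitalized words from being indexed as sequences.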


Subject(s)
Base Sequence , Computational Biology/methods , Databases, Genetic , Internet , Nucleic Acids/chemistry , Software , PubMed
11.
Article in English | MEDLINE | ID: mdl-18487699

ABSTRACT

The increasing amount of information available for biomedical research has led to issues related to knowledge discovery in large collections of data. Moreover, Information Retrieval techniques must consider heterogeneities present in databases that initially belong to different domains (e.g. clinical and genetic data). One of the goals of the ACGT European project is to provide seamless and homogeneous access to integrated databases. In this work, we describe an approach to overcome heterogeneities in identifiers inside queries. We present an ontology classifying the most common identifier semantic heterogeneities, and a service that makes use of it to cope with the problem using the described approach. Finally, we illustrate the solution by analysing a set of real queries.


Subject(s)
Database Management Systems , Databases, Genetic , Genome , Information Systems , Integrated Advanced Information Management Systems , Medical Informatics Applications , Multicenter Studies as Topic , Neoplasms/genetics , Algorithms , Humans , Medical Records Systems, Computerized , Neoplasms/therapy , Semantics , Software , Vocabulary, Controlled
12.
AMIA Annu Symp Proc ; : 1042, 2007 Oct 11.
Article in English | MEDLINE | ID: mdl-18694140

ABSTRACT

ACGT is an IST-FP6 Integrated Project, funded by the European Commission, for the development of services to support clinico-genomic trials on cancer in a grid-based environment. In these trials, physicians and researchers need to access heterogeneous and disparate data sources. Semantic access to these data and the possibility of integrating them seamlessly are issues that ACGT aims to solve.


Subject(s)
Database Management Systems , Neoplasms/genetics , Europe , Genomics/methods , Humans , Semantics , Systems Integration
13.
AMIA Annu Symp Proc ; : 1074, 2007 Oct 11.
Article in English | MEDLINE | ID: mdl-18694172

ABSTRACT

Knowledge discovery approaches in modern biomedical research usually require access to heterogeneous and remote data sources in a distributed environment. Traditional KDD models assumed a central repository and lacked mechanisms to access decentralized databases. In such a distributed environment, ontologies can be used in all the KDD phases. We present here a new ontology-based KDD model to improve data preprocessing from heterogeneous sources.


Subject(s)
Artificial Intelligence , Information Management , Vocabulary, Controlled , Biomedical Research , Data Collection/methods , Information Storage and Retrieval