Results 1 - 7 of 7
1.
JAMIA Open ; 4(3): ooab079, 2021 Jul.
Article in English | MEDLINE | ID: mdl-34541463

ABSTRACT

OBJECTIVES: We sought to cluster biological phenotypes using semantic similarity and create an easy-to-install, stable, and reproducible tool. MATERIALS AND METHODS: We generated Phenotype Clustering (PhenClust), a novel application of semantic similarity for interpreting biological phenotype associations, using the Unified Medical Language System (UMLS) metathesaurus; demonstrated the tool's application; and developed Docker containers with stable installations of two UMLS versions. RESULTS: PhenClust identified disease clusters for drug network-associated phenotypes and a meta-analysis of drug target candidates. The Dockerized containers eliminated the requirement that the user install the UMLS metathesaurus. DISCUSSION: Clustering phenotypes summarized all phenotypes associated with a drug network and two drug candidates. Docker containers can support dissemination and reproducibility of tools that are otherwise limited due to insufficient software support. CONCLUSION: PhenClust can improve interpretation of high-throughput biological analyses where many phenotypes are associated with a query, and the Dockerized PhenClust achieved our objective of decreasing installation complexity.

2.
J Biomed Inform ; 117: 103732, 2021 05.
Article in English | MEDLINE | ID: mdl-33737208

ABSTRACT

BACKGROUND: Understanding the relationships between genes, drugs, and disease states is at the core of pharmacogenomics. Two leading approaches for identifying these relationships in medical literature are human expert-led manual curation efforts and modern data mining-based automated approaches. The former generates small amounts of high-quality data, and the latter offers large volumes of mixed-quality data. The algorithmically extracted relationships are often accompanied by supporting evidence, such as confidence scores, source articles, and surrounding contexts (excerpts) from the articles, that can be used as data quality indicators. Tools that can leverage these quality indicators to help the user gain access to larger and high-quality data are needed. APPROACH: We introduce GeneDive, a web application for pharmacogenomics researchers and precision medicine practitioners that makes gene, disease, and drug interactions data easily accessible and usable. GeneDive is designed to meet three key objectives: (1) provide functionality to manage the information-overload problem and facilitate easy assimilation of supporting evidence, (2) support longitudinal and exploratory research investigations, and (3) offer integration of user-provided interactions data without requiring data sharing. RESULTS: GeneDive offers multiple search modalities, visualizations, and other features that guide the user efficiently to the information of their interest. To facilitate exploratory research, GeneDive makes the supporting evidence and context for each interaction readily available and allows the data quality threshold to be controlled by the user as per their risk tolerance level. The interactive search-visualization loop enables relationship discoveries between diseases, genes, and drugs that might not be explicitly described in literature but are emergent from the source medical corpus and deductive reasoning. The ability to utilize the user's data either in combination with the GeneDive native datasets or in isolation promotes richer data-driven exploration and discovery. These functionalities, along with GeneDive's applicability for precision medicine (bringing the knowledge contained in biomedical literature to bear on particular clinical situations and improving patient care), are illustrated through detailed use cases. CONCLUSION: GeneDive is a comprehensive, broad-use biological interactions browser. The GeneDive application and information about its underlying system architecture are available at http://www.genedive.net. The GeneDive Docker image is also available for download at this URL, allowing users to (1) import their own interaction data securely and privately; and (2) generate and test hypotheses across their own and other datasets.


Subject(s)
Pharmaceutical Preparations , Precision Medicine , Data Mining , Humans , Pharmacogenetics , Software
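
The user-controlled data quality threshold described in the abstract above can be illustrated with a minimal Python sketch. This is not GeneDive's code or API: the `Interaction` record, its field names, the `filter_by_confidence` helper, and the sample records are all hypothetical and exist only to show the idea of keeping the interactions whose confidence score meets a chosen cutoff while retaining their supporting excerpts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    """One extracted relationship plus its supporting evidence (hypothetical schema)."""
    source: str
    target: str
    confidence: float  # extraction confidence in [0, 1]
    excerpt: str       # surrounding context from the source article


def filter_by_confidence(interactions, threshold):
    """Keep interactions at or above the threshold, highest confidence first."""
    kept = [i for i in interactions if i.confidence >= threshold]
    return sorted(kept, key=lambda i: i.confidence, reverse=True)


# Placeholder records for illustration only; not real GeneDive data.
sample = [
    Interaction("GENE_A", "DRUG_X", 0.95, "placeholder excerpt 1"),
    Interaction("GENE_A", "DISEASE_Y", 0.60, "placeholder excerpt 2"),
    Interaction("GENE_B", "DRUG_X", 0.82, "placeholder excerpt 3"),
]

for hit in filter_by_confidence(sample, threshold=0.8):
    print(hit.source, hit.target, hit.confidence)
```

Raising the threshold trades volume for quality: a risk-averse user sees fewer, better-supported interactions, while an exploratory user can lower it to surface weaker candidates.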
3.
Bioinformatics ; 35(21): 4504-4506, 2019 11 01.
Article in English | MEDLINE | ID: mdl-31114840

ABSTRACT

SUMMARY: Limited efficacy and intolerable safety limit therapeutic development, and identifying potential liabilities earlier in development could significantly improve this process. Computational approaches that aggregate data from multiple sources and consider a drug's pathway effects could help identify these liabilities earlier. Such computational methods must be accessible to a variety of users beyond computational scientists, especially regulators and industry scientists, in order to impact the therapeutic development process. We have previously developed and published PathFX, an algorithm for identifying drug networks and phenotypes for understanding drug associations to safety and efficacy. Here we present a streamlined and easy-to-use PathFX web application that allows users to search for drug networks and associated phenotypes. We have also added visualization and phenotype clustering to improve the functionality and interpretability of PathFXweb. AVAILABILITY AND IMPLEMENTATION: https://www.pathfxweb.net/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Software , Algorithms , Computational Biology , Phenotype
4.
Pac Symp Biocomput ; 23: 204-215, 2018.
Article in English | MEDLINE | ID: mdl-29218882

ABSTRACT

Machine Learning (ML) methods are now influencing major decisions about patient care, new medical methods, and drug development, and their use and importance are rapidly increasing in all areas. However, these ML methods are inherently complex and often difficult to understand and explain, resulting in barriers to their adoption and validation. Our work (RFEX) focuses on enhancing Random Forest (RF) classifier explainability by developing easy-to-interpret explainability summary reports from trained RF classifiers as a way to improve explainability for (often non-expert) users. RFEX is implemented and extensively tested on Stanford FEATURE data, where RF is tasked with predicting functional sites in 3D molecules based on their electrochemical signatures (features). In developing the RFEX method we apply a user-centered approach driven by explainability questions and requirements collected through discussions with interested practitioners. We performed formal usability testing with 13 expert and non-expert users to verify RFEX usefulness. Analysis of the RFEX explainability report and user feedback indicates its usefulness in significantly increasing explainability and user confidence in RF classification on FEATURE data. Notably, RFEX summary reports easily reveal that one needs very few (2 to 6, depending on the model) top-ranked features to achieve 90% or more of the accuracy obtained when all 480 features are used.


Subject(s)
Supervised Machine Learning/statistics & numerical data , User-Computer Interface , Algorithms , Classification/methods , Computational Biology/methods , Databases, Factual/statistics & numerical data , Humans , Models, Statistical
5.
Pac Symp Biocomput ; 23: 590-601, 2018.
Article in English | MEDLINE | ID: mdl-29218917

ABSTRACT

Obtaining relevant information about gene interactions is critical for understanding disease processes and treatment. With the rise in text mining approaches, the volume of such biomedical data is rapidly increasing, thereby creating a new problem for the users of this data: information overload. A tool for efficient querying and visualization of biomedical data that helps researchers understand the underlying biological mechanisms for diseases and drug responses, and ultimately helps patients, is sorely needed. To this end we have developed GeneDive, a web-based information retrieval, filtering, and visualization tool for large volumes of gene interaction data. GeneDive offers various features and modalities that guide the user through the search process to efficiently reach the information of their interest. GeneDive currently processes over three million gene-gene interactions with response times within a few seconds. For over half of the curated gene sets sourced from four prominent databases, more than 80% of the gene set members are recovered by GeneDive. In the near future, GeneDive will seamlessly accommodate other interaction types, such as gene-drug and gene-disease interactions, thus enabling full exploration of topics such as precision medicine. The GeneDive application and information about its underlying system architecture are available at http://www.genedive.net.


Subject(s)
Epistasis, Genetic , Precision Medicine/statistics & numerical data , Software , Computational Biology/methods , Computer Graphics/statistics & numerical data , Data Mining/statistics & numerical data , Databases, Genetic/statistics & numerical data , Gene Regulatory Networks , Humans , Information Storage and Retrieval/statistics & numerical data , Internet , User-Computer Interface
6.
Pac Symp Biocomput ; 23: 623-627, 2018.
Article in English | MEDLINE | ID: mdl-29218921

ABSTRACT

The goals of this workshop are to discuss challenges in explainability of current Machine Learning and Deep Analytics (MLDA) used in biocomputing and to start the discussion on ways to improve it. We define explainability in MLDA as easy-to-use information explaining why and how the MLDA approach made its decisions. We believe that much greater effort is needed to address the issue of MLDA explainability because of: 1) the ever-increasing use of and dependence on MLDA in biocomputing, including the need for increased adoption by non-MLDA experts; 2) the diversity, complexity and scale of biocomputing data and MLDA algorithms; 3) the emerging importance of MLDA-based decisions in patient care, in daily research, as well as in the development of new costly medical procedures and drugs. This workshop aims to: a) analyze and challenge the current level of explainability of MLDA methods and practices in biocomputing; b) explore benefits of improvements in this area; and c) provide useful and practical guidance to the biocomputing community on how to address these challenges and how to develop improvements. The workshop format is designed to encourage a lively discussion with panelists to first motivate and understand the problem and then to define next steps and solutions needed to improve MLDA explainability.


Subject(s)
Computational Biology/methods , Machine Learning/statistics & numerical data , Algorithms , Artificial Intelligence , Cluster Analysis , Decision Support Techniques , Gene Expression Profiling/statistics & numerical data , Humans , Neural Networks, Computer , Single-Cell Analysis/statistics & numerical data
7.
PLoS One ; 9(3): e91240, 2014.
Article in English | MEDLINE | ID: mdl-24632601

ABSTRACT

We address the problem of assigning biological function to solved protein structures. Computational tools play a critical role in identifying potential active sites and informing screening decisions for further lab analysis. A critical parameter in the practical application of computational methods is the precision, or positive predictive value. Precision measures the level of confidence the user should have in a particular computed functional assignment. Low precision annotations lead to futile laboratory investigations and waste scarce research resources. In this paper we describe an advanced version of the protein function annotation system FEATURE, which achieved 99% precision and average recall of 95% across 20 representative functional sites. The system uses a Support Vector Machine classifier operating on the microenvironment of physicochemical features around an amino acid. We also compared performance of our method with state-of-the-art sequence-level annotator Pfam in terms of precision, recall and localization. To our knowledge, no other functional site annotator has been rigorously evaluated against these key criteria. The software and predictive models are incorporated into the WebFEATURE service at http://feature.stanford.edu/wf4.0-beta.


Subject(s)
Proteins/chemistry , Software , Computational Biology/methods , Databases, Protein , Protein Conformation
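
The precision (positive predictive value) and recall measures named in the abstract above can be computed as in the following minimal Python sketch. It is not the FEATURE evaluation code: the `precision_recall` helper and the residue identifiers are hypothetical, chosen only to show how the two ratios are formed from predicted and true positive sets.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two collections of positive labels."""
    predicted, actual = set(predicted), set(actual)
    true_pos = len(predicted & actual)  # predictions that are real sites
    # Precision: fraction of predictions that are correct.
    precision = true_pos / len(predicted) if predicted else 0.0
    # Recall: fraction of real sites that were predicted.
    recall = true_pos / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical residue identifiers for illustration; not data from the paper.
predicted_sites = {"A12", "A47", "B03", "B19"}
true_sites = {"A12", "A47", "B03", "C08", "C21"}

p, r = precision_recall(predicted_sites, true_sites)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.75 recall=0.60
```

In this toy case one of four predictions is a false positive, so a quarter of the follow-up laboratory work would be wasted, which is the cost that low-precision annotation imposes.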