Results 1 - 11 of 11
1.
BMC Bioinformatics ; 23(1): 42, 2022 Jan 15.
Article in English | MEDLINE | ID: mdl-35033007

ABSTRACT

BACKGROUND: There has been a simultaneous increase in demand and accessibility across genomics, transcriptomics, proteomics and metabolomics data, known as omics data. This has encouraged widespread application of omics data in life sciences, from personalized medicine to the discovery of underlying pathophysiology of diseases. Causal analysis of omics data may provide important insight into the underlying biological mechanisms. Existing causal analysis methods yield promising results when identifying potential general causes of an observed outcome based on omics data. However, they may fail to discover the causes specific to a particular stratum of individuals and missing from others. METHODS: To fill this gap, we introduce the problem of stratified causal discovery and propose a method, Aristotle, for solving it. Aristotle addresses the two challenges intrinsic to omics data: high dimensionality and hidden stratification. It employs existing biological knowledge and a state-of-the-art patient stratification method to tackle the above challenges and applies a quasi-experimental design method to each stratum to find stratum-specific potential causes. RESULTS: Evaluation based on synthetic data shows better performance for Aristotle in discovering true causes under different conditions compared to existing causal discovery methods. Experiments on a real dataset on Anthracycline Cardiotoxicity indicate that Aristotle's predictions are consistent with the existing literature. Moreover, Aristotle makes additional predictions that suggest further investigations.


Subject(s)
Genomics , Proteomics , Humans , Metabolomics , Precision Medicine , Transcriptome
2.
Bioinformatics ; 37(12): 1691-1698, 2021 Jul 19.
Article in English | MEDLINE | ID: mdl-33325506

ABSTRACT

MOTIVATION: Identification of differentially expressed genes is necessary for unraveling disease pathogenesis. This task is complicated by the fact that many diseases are heterogeneous at the molecular level and samples representing distinct disease subtypes may demonstrate different patterns of dysregulation. Biclustering methods are capable of identifying genes that follow a similar expression pattern only in a subset of samples and hence can consider disease heterogeneity. However, identifying biologically significant and reproducible sets of genes and samples remains challenging for the existing tools. Many recent studies have shown that the integration of gene expression and protein interaction data improves the robustness of prediction and classification and advances biomarker discovery. RESULTS: Here, we present DESMOND, a new method for identification of Differentially ExpreSsed gene MOdules iN Diseases. DESMOND performs network-constrained biclustering on gene expression data and identifies gene modules, that is, connected sets of genes up- or down-regulated in subsets of samples. We applied DESMOND to expression profiles of samples from two large breast cancer cohorts and showed that its capability to incorporate protein interactions allows it to identify biologically meaningful gene and sample subsets and improves the reproducibility of the results. AVAILABILITY AND IMPLEMENTATION: https://github.com/ozolotareva/DESMOND. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

3.
Contemp Clin Trials ; 101: 106239, 2021 02.
Article in English | MEDLINE | ID: mdl-33279656

ABSTRACT

BACKGROUND: The novel coronavirus 2019 (COVID-19) pandemic has mobilized global research at an unprecedented scale. While challenges associated with the COVID-19 trial landscape have been discussed previously, no comprehensive reviews have been conducted to assess the reporting, design, and data sharing practices of randomized controlled trials (RCTs). PURPOSE: The purpose of this review was to gain insight into the current landscape of reporting, methodological design, and data sharing practices for COVID-19 RCTs. DATA SOURCES: We conducted three searches to identify registered clinical trials, peer-reviewed publications, and pre-print publications. STUDY SELECTION: After screening eight major trial registries and 7844 records, we identified 178 registered trials and 38 publications describing 35 trials, including 25 peer-reviewed publications and 13 pre-prints. DATA EXTRACTION: Trial ID, registry, location, population, intervention, control, study design, recruitment target, actual recruitment, outcomes, data sharing statement, and time of data sharing were extracted. DATA SYNTHESIS: Of 178 registered trials, 112 (62.92%) were in hospital settings, median planned recruitment was 100 participants (IQR: 60, 168), and the majority (n = 166, 93.26%) did not report results in their respective registries. Of 35 published trials, 31 (88.57%) were in hospital settings, median actual recruitment was 86 participants (IQR: 55.5, 218), 10 (28.57%) did not reach recruitment targets, and 27 trials (77.14%) reported plans to share data. CONCLUSIONS: The findings of our study highlight limitations in the design and reporting practices of COVID-19 RCTs and provide guidance towards more efficient reporting of trial results, greater diversity in patient settings, and more robust data sharing.
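
The percentages in the DATA SYNTHESIS section follow directly from the reported counts. The short check below recomputes them from the numbers quoted in the abstract; the labels and variable names are illustrative and do not come from the paper.

```python
# Counts quoted in the abstract: (label, numerator, denominator).
# The labels are illustrative shorthand, not terminology from the paper.
reported_counts = [
    ("registered: hospital setting", 112, 178),
    ("registered: no results in registry", 166, 178),
    ("published: hospital setting", 31, 35),
    ("published: recruitment target not reached", 10, 35),
    ("published: plan to share data", 27, 35),
]


def percentage(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to two decimals."""
    return round(100 * numerator / denominator, 2)


percentages = {label: percentage(n, d) for label, n, d in reported_counts}

for label, value in percentages.items():
    print(f"{label}: {value:.2f}%")
```

All five recomputed values agree with the percentages stated in the abstract (62.92%, 93.26%, 88.57%, 28.57%, 77.14%).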


Subject(s)
COVID-19 , Randomized Controlled Trials as Topic , COVID-19/epidemiology , COVID-19/prevention & control , COVID-19/therapy , Data Management/organization & administration , Data Management/standards , Humans , Quality Improvement , Randomized Controlled Trials as Topic/methods , Randomized Controlled Trials as Topic/standards , Randomized Controlled Trials as Topic/statistics & numerical data , Research Design/standards , Research Design/statistics & numerical data , SARS-CoV-2
4.
PLoS Comput Biol ; 15(11): e1007451, 2019 11.
Article in English | MEDLINE | ID: mdl-31710622

ABSTRACT

Cancer is driven by genetic mutations that dysregulate pathways important for proper cell function. Therefore, discovering these cancer pathways and their dysregulation order is key to understanding and treating cancer. However, the heterogeneity of mutations between different individuals makes this challenging and requires that cancer progression is studied in a subtype-specific way. To address this challenge, we provide a mathematical model, called Subtype-specific Pathway Linear Progression Model (SPM), that simultaneously captures cancer subtypes and pathways and order of dysregulation of the pathways within each subtype. Experiments with synthetic data indicate the robustness of SPM to problem specifics including noise compared to an existing method. Moreover, experimental results on glioblastoma multiforme and colorectal adenocarcinoma show the consistency of SPM's results with the existing knowledge and its superiority to an existing method in certain cases. The implementation of our method is available at https://github.com/Dalton386/SPM.


Subject(s)
Computational Biology/methods , Metabolic Networks and Pathways/genetics , Neoplasms/genetics , Algorithms , Colorectal Neoplasms/genetics , Disease Progression , Glioblastoma/genetics , Humans , Linear Models , Models, Theoretical , Mutation , Neoplasms/metabolism , Signal Transduction/genetics
5.
Bioinformatics ; 35(14): i379-i388, 2019 07 15.
Article in English | MEDLINE | ID: mdl-31510674

ABSTRACT

MOTIVATION: Despite the remarkable advances in sequencing and computational techniques, noise in the data and the complexity of the underlying biological mechanisms make it difficult to deconvolve the phylogenetic relationships between cancer mutations. Moreover, the majority of the existing datasets consist of bulk sequencing data from a single tumor sample per individual. Accurate inference of the phylogenetic order of mutations is particularly challenging in these cases, and the existing methods face several theoretical limitations. To overcome these limitations, new methods are required for integrating and harnessing the full potential of the existing data. RESULTS: We introduce a method called Hintra for intra-tumor heterogeneity detection. Hintra integrates sequencing data for a cohort of tumors and infers tumor phylogeny for each individual based on the evolutionary information shared between different tumors. Through an iterative process, Hintra learns the repeating evolutionary patterns and uses this information for resolving the phylogenetic ambiguities of individual tumors. The results of synthetic experiments show an improved performance compared to two state-of-the-art methods. The experimental results with a recent Breast Cancer dataset are consistent with the existing knowledge and provide potentially interesting findings. AVAILABILITY AND IMPLEMENTATION: The source code for Hintra is available at https://github.com/sahandk/HINTRA.


Subject(s)
Neoplasms , Software , Humans , Mutation , Phylogeny , Sequence Analysis
6.
Bioinformatics ; 35(18): 3263-3272, 2019 09 15.
Article in English | MEDLINE | ID: mdl-30768166

ABSTRACT

MOTIVATION: Patient stratification methods are key to the vision of precision medicine. Here, we consider transcriptional data to segment the patient population into subsets relevant to a given phenotype. Whereas most existing patient stratification methods focus either on predictive performance or interpretable features, we developed a method striking a balance between these two important goals. RESULTS: We introduce a Bayesian method called SUBSTRA that uses regularized biclustering to identify patient subtypes and interpretable subtype-specific transcript clusters. The method iteratively re-weights feature importance to optimize phenotype prediction performance by producing more phenotype-relevant patient subtypes. We investigate the performance of SUBSTRA in finding relevant features using simulated data and successfully benchmark it against state-of-the-art unsupervised stratification methods and supervised alternatives. Moreover, SUBSTRA achieves predictive performance competitive with the supervised benchmark methods and provides interpretable transcriptional features in diverse biological settings, such as drug response prediction, cancer diagnosis, or kidney transplant rejection. AVAILABILITY AND IMPLEMENTATION: The R code of SUBSTRA is available at https://github.com/sahandk/SUBSTRA. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Software , Bayes Theorem , Phenotype , Precision Medicine
7.
Pac Symp Biocomput ; 21: 345-56, 2016.
Article in English | MEDLINE | ID: mdl-26776199

ABSTRACT

The move from Empirical Medicine towards Personalized Medicine has attracted attention to Stratified Medicine (SM). Some methods have been proposed in the literature for patient stratification, which is the central task of SM; however, significant open issues remain. First, it is still unclear whether integrating different datatypes helps in detecting disease subtypes more accurately and, if not, which datatype(s) are most useful for this task. Second, it is not clear how different methods of patient stratification can be compared. Third, as most of the proposed stratification methods are deterministic, there is a need to investigate the potential benefits of applying probabilistic methods. To address these issues, we introduce a novel integrative Bayesian biclustering method, called B2PS, for patient stratification and propose methods for evaluating the results. Our experimental results demonstrate the superiority of B2PS over a popular state-of-the-art method and the benefits of Bayesian approaches. Our results agree with the intuition that transcriptomic data forms a better basis for patient stratification than genomic data.


Subject(s)
Algorithms , Bayes Theorem , Cluster Analysis , Precision Medicine/methods , Brain Neoplasms/classification , Brain Neoplasms/genetics , Breast Neoplasms/classification , Breast Neoplasms/genetics , Computational Biology/methods , Computational Biology/statistics & numerical data , Databases, Genetic/statistics & numerical data , Female , Gene Expression , Glioblastoma/classification , Glioblastoma/genetics , Humans , Machine Learning , Models, Statistical , Precision Medicine/statistics & numerical data
8.
Article in English | MEDLINE | ID: mdl-26592807

ABSTRACT

The labor-intensive and expensive experimental process of identifying drug-target interactions has motivated many researchers to focus on in silico prediction, which provides helpful information to support the experimental interaction data. Accordingly, several computational approaches have been proposed for discovering new drug-target interactions. A growing number of learning-based methods have been developed, which can be categorized into two main groups: similarity-based and feature-based. In this paper, we first use the bi-gram features extracted from the Position Specific Scoring Matrix (PSSM) of proteins in predicting drug-target interactions. Our results demonstrate the high-confidence prediction ability of the Bigram-PSSM model in terms of several performance indicators, specifically for enzymes and ion channels. Moreover, we investigate the impact of the negative selection strategy on prediction performance, which is not widely taken into account in other relevant studies. This is important, as the number of non-interacting drug-target pairs is usually extremely large in comparison with the number of interacting ones in existing drug-target interaction data. An interesting observation is that the four datasets show different levels of performance reduction when the sampling method is changed from random sampling to balanced sampling.
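
The abstract does not spell out how the bi-gram features are computed. A minimal sketch, assuming the commonly used definition in which entry (i, j) sums the products of column i at position k and column j at position k+1 over the protein sequence, could look as follows. The function name `bigram_pssm` and the absence of any normalization are assumptions made for illustration, not details taken from the paper.

```python
def bigram_pssm(pssm):
    """Compute bi-gram features from a PSSM given as a list of rows.

    Assumed definition (not stated in the abstract):
        B[i][j] = sum over k of pssm[k][i] * pssm[k + 1][j]
    A PSSM with 20 columns therefore yields 20 * 20 = 400 features.
    The result is returned as a flat list in row-major order.
    """
    if len(pssm) < 2:
        raise ValueError("need at least two sequence positions")
    n_cols = len(pssm[0])
    if any(len(row) != n_cols for row in pssm):
        raise ValueError("all PSSM rows must have the same length")

    features = [[0.0] * n_cols for _ in range(n_cols)]
    # Pair every position k with its successor k + 1.
    for current_row, next_row in zip(pssm, pssm[1:]):
        for i, a in enumerate(current_row):
            for j, b in enumerate(next_row):
                features[i][j] += a * b
    return [value for row in features for value in row]


# Toy example with a 2-column "PSSM" so the result is easy to inspect.
toy = [[1, 0], [0, 1], [1, 0]]
print(bigram_pssm(toy))
```

The feature vector has a fixed length regardless of protein length, which is what makes it usable as input to a standard feature-based classifier.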


Subject(s)
Databases, Factual/standards , Drug Interactions , Pharmaceutical Preparations/metabolism , Position-Specific Scoring Matrices , Proteins/metabolism , Forecasting , Humans , Pharmaceutical Preparations/chemistry , Proteins/chemistry , Random Allocation
9.
PLoS One ; 8(7): e68073, 2013.
Article in English | MEDLINE | ID: mdl-23874498

ABSTRACT

Finding motifs in biological, social, technological, and other types of networks has become a widespread method to gain more knowledge about these networks' structure and function. However, this task is very computationally demanding, because it is closely tied to the graph isomorphism problem, which is in NP (and is not yet known to be in P or to be NP-complete). Accordingly, this research seeks to reduce the number of calls to the NAUTY isomorphism detection method, which is the most time-consuming step in many existing algorithms. The work provides an extremely fast motif detection algorithm called QuateXelero, which is built around a quaternary tree data structure. The proposed algorithm is based on the well-known ESU (FANMOD) motif detection algorithm. The results of experiments on some standard model networks confirm the overall superiority of QuateXelero compared with two of the fastest existing algorithms, G-Tries and Kavosh. QuateXelero is especially fast at constructing the central data structure of the algorithm from scratch based on the input network.
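
For orientation, the ESU enumeration step that the abstract names as its basis can be sketched as below. This is a plain ESU sketch for undirected graphs, assuming integer vertex labels and an adjacency-set representation; it does not include QuateXelero's quaternary tree or any isomorphism classification, and the function name is illustrative.

```python
def enumerate_connected_subgraphs(adjacency, k):
    """Enumerate every connected vertex set of size k exactly once (ESU).

    adjacency: dict mapping each vertex (comparable, e.g. int) to the set
    of its neighbours in an undirected graph.
    Returns a list of frozensets of vertices.
    """
    found = []

    def extend(subgraph, extension, root):
        if len(subgraph) == k:
            found.append(frozenset(subgraph))
            return
        extension = set(extension)
        while extension:
            w = extension.pop()
            # Vertices already in the subgraph or adjacent to it.
            closed = set(subgraph)
            for u in subgraph:
                closed |= adjacency[u]
            # Exclusive neighbours of w: larger than the root and not
            # yet reachable from the current subgraph.
            exclusive = {u for u in adjacency[w] if u > root and u not in closed}
            extend(subgraph | {w}, extension | exclusive, root)

    for v in adjacency:
        extend({v}, {u for u in adjacency[v] if u > v}, v)
    return found


# Path graph 0-1-2-3: the connected 3-vertex subgraphs are {0,1,2} and {1,2,3}.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(sorted(sorted(s) for s in enumerate_connected_subgraphs(path, 3)))
```

In a full motif finder, each enumerated subgraph would then be assigned to an isomorphism class, which is the step where calls to a tool such as NAUTY become the bottleneck.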


Subject(s)
Algorithms , Models, Theoretical , Pattern Recognition, Automated/methods , Classification/methods , Computer Simulation , Time Factors
10.
Comput Methods Programs Biomed ; 111(2): 512-8, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23727299

ABSTRACT

This paper presents GelClust, new software designed for processing gel electrophoresis images and generating the corresponding phylogenetic trees. Unlike most related commercial and non-commercial software, we found that GelClust is very user-friendly and guides the user from image to dendrogram through seven simple steps. Furthermore, the software, which is implemented in the C# programming language under the Windows operating system, is more accurate than similar software regarding image processing and is the only software able to detect and correct gel 'smile' effects completely automatically. These claims are supported with experiments.


Subject(s)
Electrophoresis, Agar Gel/methods , Electrophoresis, Polyacrylamide Gel/methods , Image Processing, Computer-Assisted/methods , Software , Algorithms , Cluster Analysis , DNA/analysis , Electronic Data Processing , Phylogeny , Programming Languages , Reproducibility of Results
11.
PLoS One ; 7(8): e43287, 2012.
Article in English | MEDLINE | ID: mdl-22952659

ABSTRACT

Network motifs are small connected sub-graphs that have recently attracted much attention as a means of discovering the structural behavior of large and complex networks. Finding motifs of any size is one of the most important problems in complex and large networks. Fast and reliable algorithms and tools are needed for this purpose. CytoKavosh is one of the best choices for finding motifs of any given size in any complex network. It relies on a fast algorithm, Kavosh, which makes it faster than other existing tools. The Kavosh algorithm applies some well-known algorithmic features and includes some subtle techniques, which make it an efficient algorithm in this field. CytoKavosh is a Cytoscape plug-in that supports finding motifs of a given size in a network (directed or undirected) that has previously been loaded into the Cytoscape workspace. The high performance of CytoKavosh is achieved by dynamically linking highly optimized functions of Kavosh's C++ implementation to the Cytoscape Java program, which makes this plug-in suitable for analyzing large biological networks. Significant attributes of CytoKavosh are its efficiency in time and memory usage and the absence of any implementation-related limitation on motif size. CytoKavosh is implemented within Cytoscape, a visual environment in which users can conveniently interact and create visual options to analyze the structural behavior of a network. The plug-in can work on any given network, is very simple to use, and generates graphical results of the discovered motifs with any required details. No existing Cytoscape plug-in is dedicated to finding network motifs based on the original concept. We therefore introduce CytoKavosh as the first such plug-in, and we hope that it can be improved to cover other options and become the best motif-analyzing tool.


Subject(s)
Computational Biology/methods , Protein Interaction Mapping/methods , Software , Algorithms , Amino Acid Motifs , Computer Simulation , Models, Biological , Programming Languages , Protein Structure, Tertiary , Systems Biology