Results 1 - 20 of 28
1.
Artif Intell Med ; 150: 102820, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38553160

ABSTRACT

Due to the constant increase in cancer rates, the disease has become a leading cause of death worldwide, increasing the need for its detection and treatment. In the era of personalized medicine, the main goal is to incorporate individual variability in order to choose more precisely which therapy and prevention strategies suit each person. However, predicting the sensitivity of tumors to anticancer treatments remains a challenge. In this work, we propose two deep neural network models to predict the impact of anticancer drugs in tumors through the half-maximal inhibitory concentration (IC50). These models join biological and chemical data to capture relevant features of the genetic profile and the drug compounds, respectively. In order to predict the drug response in cancer cell lines, this study employed different deep learning (DL) methods, resorting to Recurrent Neural Networks (RNNs) and Convolutional Neural Networks (CNNs). In the first stage, two autoencoders were pre-trained with high-dimensional gene expression and mutation data of tumors. Afterward, this genetic background is transferred to the prediction models that return the IC50 value that portrays the potency of a substance in inhibiting a cancer cell line. When comparing RSEM expected counts and TPM as methods for displaying gene expression data, RSEM has been shown to perform better in deep models, and CNN models can obtain better insight into these types of data. Moreover, the obtained results reflect the effectiveness of the extracted deep representations in the prediction of the IC50 value, achieving a mean squared error of 1.06 and surpassing previous state-of-the-art models.


Subject(s)
Genetic Profile , Neoplasms , Humans , Neural Networks, Computer , Neoplasms/drug therapy , Neoplasms/genetics , Cell Line , Genomics
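The abstract above reports model quality as a mean squared error between predicted and reference IC50 values. As a minimal sketch of that metric only (the function name and the numbers are illustrative assumptions, not data or code from the study):

```python
from typing import Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of squared differences between reference and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up values for illustration; NOT measurements from the paper.
measured = [1.0, 2.0, 3.0]
predicted = [1.5, 2.0, 2.0]
mse = mean_squared_error(measured, predicted)
```

Because the errors are squared, a few badly mispredicted cell line/drug pairs weigh more heavily on the score than many small deviations.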
2.
Brief Bioinform ; 24(6)2023 09 22.
Article in English | MEDLINE | ID: mdl-37903414

ABSTRACT

The drug discovery process can be significantly improved by applying deep reinforcement learning (RL) methods that learn to generate compounds with desired pharmacological properties. Nevertheless, RL-based methods typically condense the evaluation of sampled compounds into a single scalar value, making it difficult for the generative agent to learn the optimal policy. This work combines self-attention mechanisms and RL to generate promising molecules. The idea is to evaluate the relative significance of each atom and functional group in their interaction with the target, and to utilize this information for optimizing the Generator. Therefore, the framework for de novo drug design is composed of a Generator that samples new compounds combined with a Transformer-encoder and a biological affinity Predictor that evaluate the generated structures. Moreover, it takes advantage of the knowledge encapsulated in the Transformer's attention weights to evaluate each token individually. We compared the performance of two output prediction strategies for the Transformer: standard and masked language model (MLM). The results show that the MLM Transformer is more effective in optimizing the Generator than state-of-the-art works. Additionally, the evaluation models identified the most important regions of each molecule for the biological interaction with the target. As a case study, we generated synthesizable hit compounds that can be putative inhibitors of the enzyme ubiquitin-specific protein 7 (USP7).


Subject(s)
Drug Design , Learning , Drug Discovery
3.
J Comput Aided Mol Des ; 37(12): 791-806, 2023 12.
Article in English | MEDLINE | ID: mdl-37847342

ABSTRACT

In this work, we develop a method for generating targeted hit compounds by applying deep reinforcement learning and attention mechanisms to predict binding affinity against a biological target while considering stereochemical information. The novelty of this work is a deep model Predictor that can establish the relationship between chemical structures and their corresponding [Formula: see text] values. We thoroughly study the effect of different molecular descriptors such as ECFP4, ECFP6, SMILES and RDKFingerprint. We also demonstrate the importance of attention mechanisms to capture long-range dependencies in molecular sequences. Due to the importance of stereochemical information for the binding mechanism, this information was employed both in the prediction and generation processes. To identify the most promising hits, we apply the self-adaptive multi-objective optimization strategy. Moreover, to ensure the existence of stereochemical information, we consider all the possible enumerated stereoisomers to provide the most appropriate 3D structures. We evaluated this approach against the Ubiquitin-Specific Protease 7 (USP7) by generating putative inhibitors for this target. The predictor with SMILES notations as descriptor plus a bidirectional recurrent neural network using an attention mechanism has the best performance. Additionally, our methodology identifies the regions of the generated molecules that are important for the interaction with the receptor's active site. The obtained results also demonstrate that it is possible to discover synthesizable molecules with high biological affinity for the target, containing the indication of their optimal stereochemical conformation.


Subject(s)
Artificial Intelligence , Drug Design , Neural Networks, Computer , Molecular Structure
4.
Comput Biol Med ; 164: 107285, 2023 09.
Article in English | MEDLINE | ID: mdl-37557054

ABSTRACT

The design of compounds that target specific biological functions with relevant selectivity is critical in the context of drug discovery, especially due to the polypharmacological nature of most existing drug molecules. In recent years, in silico-based methods combined with deep learning have shown promising results in the de novo drug design challenge, leading to potential leads for biologically interesting targets. However, several of these methods overlook the importance of certain properties, such as validity rate and target selectivity, or simplify the generative process by neglecting the multi-objective nature of the pharmacological space. In this study, we propose a multi-objective Transformer-based architecture to generate drug candidates with desired molecular properties and increased selectivity toward a specific biological target. The framework consists of a Transformer-Decoder Generator that generates novel and valid compounds in the SMILES format notation, a Transformer-Encoder Predictor that estimates the binding affinity toward the biological target, and a feedback loop combined with a multi-objective optimization strategy to rank the generated molecules and condition the generating distribution around the targeted properties. The results demonstrate that the proposed architecture can generate novel and synthesizable small compounds with desired pharmacological properties toward a biologically relevant target. The unbiased Transformer-based Generator achieved superior performance in the novelty rate (97.38%) and comparable performance in terms of internal diversity, uniqueness, and validity against state-of-the-art baselines. The optimization of the unbiased Transformer-based Generator resulted in the generation of molecules exhibiting high binding affinity toward the Adenosine A2A Receptor (AA2AR) and possessing desirable physicochemical properties, where 99.36% of the generated molecules follow Lipinski's rule of five. 
Furthermore, the implementation of a feedback strategy, in conjunction with a multi-objective algorithm, effectively shifted the distribution of the generated molecules toward optimal values of molecular weight, molecular lipophilicity, topological polar surface area, synthetic accessibility score, and quantitative estimate of drug-likeness, without the necessity of prior training sets comprising molecules endowed with pharmacological properties of interest. Overall, this research study validates the applicability of a Transformer-based architecture in the context of drug design, capable of exploring the vast chemical representation space to generate novel molecules with improved pharmacological properties and target selectivity. The data and source code used in this study are available at: https://github.com/larngroup/FSM-DDTR.


Subject(s)
Drug Design , Drug Discovery , Feedback , Algorithms , Software
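The abstract above states that 99.36% of the generated molecules follow Lipinski's rule of five. A minimal sketch of such a check is given below, assuming the four descriptors have already been computed by a cheminformatics toolkit. The `Descriptors` class, the function names, and the example values are hypothetical, and the strict "no violations" criterion is an assumption (some studies tolerate one violation).

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container)."""
    mol_weight: float      # molecular weight in daltons
    log_p: float           # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five limits are exceeded."""
    return sum([
        d.mol_weight > 500,
        d.log_p > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ])


def follows_rule_of_five(d: Descriptors, max_violations: int = 0) -> bool:
    # max_violations=0 is the strict reading; it is an assumption here.
    return lipinski_violations(d) <= max_violations


# Illustrative descriptor values, not taken from the study.
small = Descriptors(mol_weight=180.2, log_p=1.2, h_bond_donors=1, h_bond_acceptors=4)
bulky = Descriptors(mol_weight=720.0, log_p=6.3, h_bond_donors=2, h_bond_acceptors=8)
```

The fraction of a generated set that passes is then simply the mean of `follows_rule_of_five` over all molecules.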
6.
Brief Bioinform ; 23(4)2022 07 18.
Article in English | MEDLINE | ID: mdl-35789255

ABSTRACT

The generation of candidate hit molecules with the potential to be used in cancer treatment is a challenging task. In this context, computational methods based on deep learning have been employed to improve in silico drug design methodologies. Nonetheless, the applied strategies have focused solely on the chemical aspect of the generation of compounds, disregarding the likely biological consequences for the organism's dynamics. Herein, we propose a method to implement targeted molecular generation that employs biological information, namely, disease-associated gene expression data, to conduct the process of identifying interesting hits. When applied to the generation of USP7 putative inhibitors, the framework managed to generate promising compounds, with more than 90% of them containing drug-like properties and essential active groups for the interaction with the target. Hence, this work provides a novel and reliable method for generating new promising compounds focused on the biological context of the disease.


Subject(s)
Drug Design , Transcriptome , Ubiquitin-Specific Peptidase 7
7.
Comput Biol Med ; 147: 105772, 2022 08.
Article in English | MEDLINE | ID: mdl-35777085

ABSTRACT

The accurate identification of Drug-Target Interactions (DTIs) remains a critical turning point in drug discovery and understanding of the binding process. Despite recent advances in computational solutions to overcome the challenges of in vitro and in vivo experiments, most of the proposed in silico-based methods still focus on binary classification, overlooking the importance of characterizing DTIs with unbiased binding strength values to properly distinguish primary interactions from those with off-targets. Moreover, several of these methods usually simplify the entire interaction mechanism, neglecting the joint contribution of the individual units of each binding component and the interacting substructures involved, and have yet to focus on more explainable and interpretable architectures. In this study, we propose an end-to-end Transformer-based architecture for predicting drug-target binding affinity (DTA) using 1D raw sequential and structural data to represent the proteins and compounds. This architecture exploits self-attention layers to capture the biological and chemical context of the proteins and compounds, respectively, and cross-attention layers to exchange information and capture the pharmacological context of the DTIs. The results show that the proposed architecture is effective in predicting DTA, achieving superior performance in both correctly predicting the value of interaction strength and being able to correctly discriminate the rank order of binding strength compared to state-of-the-art baselines. The combination of multiple Transformer-Encoders was found to result in robust and discriminative aggregate representations of the proteins and compounds for binding affinity prediction, in which the addition of a Cross-Attention Transformer-Encoder was identified as an important block for improving the discriminative power of these representations. 
Overall, this research study validates the applicability of an end-to-end Transformer-based architecture in the context of drug discovery, capable of self-providing different levels of potential DTI and prediction understanding due to the nature of the attention blocks. The data and source code used in this study are available at: https://github.com/larngroup/DTITR.


Subject(s)
Proteins , Software , Drug Development , Drug Discovery/methods , Proteins/chemistry
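The abstract above evaluates both the predicted affinity values and the ability to discriminate the rank order of binding strength. The abstract does not name the ranking metric, so the sketch below uses the concordance index, a common choice in binding affinity work, as an assumption. The function name and the example values are illustrative.

```python
from typing import Sequence


def concordance_index(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Fraction of comparable pairs whose predicted order matches the true order.

    Pairs with equal true values are skipped; tied predictions count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue  # no defined order for this pair
            comparable += 1
            true_diff = y_true[i] - y_true[j]
            pred_diff = y_pred[i] - y_pred[j]
            if pred_diff == 0:
                concordant += 0.5
            elif (true_diff > 0) == (pred_diff > 0):
                concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Made-up affinities for illustration only.
ci = concordance_index([5.0, 6.0, 7.0], [5.1, 6.5, 6.2])
```

A value of 1.0 means every pair is ranked correctly, while 0.5 corresponds to random ordering.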
8.
J Cheminform ; 14(1): 40, 2022 Jun 26.
Article in English | MEDLINE | ID: mdl-35754029

ABSTRACT

Drug design is an important area of study for pharmaceutical businesses. However, low efficacy, off-target delivery, time consumption, and high cost are challenges and can create barriers that impact this process. Deep Learning models are emerging as a promising solution to perform de novo drug design, i.e., to generate drug-like molecules tailored to specific needs. However, stereochemistry, which is indispensable in target-oriented molecules, was not explicitly considered in the generated molecules. This paper proposes a framework based on a Feedback Generative Adversarial Network (GAN) that includes an optimization strategy by incorporating Encoder-Decoder, GAN, and Predictor deep models interconnected with a feedback loop. The Encoder-Decoder converts the string notations of molecules into latent space vectors, effectively creating a new type of molecular representation. At the same time, the GAN can learn and replicate the training data distribution and, therefore, generate new compounds. The feedback loop is designed to incorporate and evaluate the generated molecules according to the multiobjective desired property at every epoch of training to ensure a steady shift of the generated distribution towards the space of the targeted properties. Moreover, to develop a more precise set of molecules, we also incorporate a multiobjective optimization selection technique based on a non-dominated sorting genetic algorithm. The results demonstrate that the proposed framework can generate realistic, novel molecules that span the chemical space. The proposed Encoder-Decoder model correctly reconstructs 99% of the datasets, including stereochemical information. The model's ability to find uncharted regions of the chemical space was successfully shown by optimizing the unbiased GAN to generate molecules with a high binding affinity to the Kappa Opioid and Adenosine [Formula: see text] receptor. 
Furthermore, the generated compounds exhibit high internal and external diversity levels (0.88 and 0.94, respectively) as well as high uniqueness.
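The selection step in the abstract above relies on non-dominated sorting. The sketch below shows only the core idea, extracting the first non-dominated (Pareto) front from a set of objective vectors. It assumes every objective is to be maximized and omits the later fronts and crowding-distance steps of a full non-dominated sorting genetic algorithm. The function names and scores are illustrative, not taken from the paper.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated_front(points: List[Sequence[float]]) -> List[int]:
    """Indices of points that no other point dominates (assumes maximization)."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical (affinity score, drug-likeness score) pairs for four molecules.
scores = [(0.9, 0.2), (0.5, 0.5), (0.4, 0.4), (0.2, 0.9)]
front = non_dominated_front(scores)
```

Here the third molecule is dropped because the second one is better on both objectives, while the remaining three represent different trade-offs.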

9.
BMC Bioinformatics ; 23(1): 237, 2022 Jun 17.
Article in English | MEDLINE | ID: mdl-35715734

ABSTRACT

BACKGROUND: Several computational advances have been achieved in the drug discovery field, promoting the identification of novel drug-target interactions and new leads. However, most of these methodologies have been overlooking the importance of providing explanations to the decision-making process of deep learning architectures. In this research study, we explore the reliability of convolutional neural networks (CNNs) at identifying relevant regions for binding, specifically binding sites and motifs, and the significance of the deep representations extracted by providing explanations to the model's decisions based on the identification of the input regions that contributed the most to the prediction. We make use of an end-to-end deep learning architecture to predict binding affinity, where CNNs are exploited in their capacity to automatically identify and extract discriminating deep representations from 1D sequential and structural data. RESULTS: The results demonstrate the effectiveness of the deep representations extracted from CNNs in the prediction of drug-target interactions. CNNs were found to identify and extract features from regions relevant for the interaction, where the weight associated with these spots was in the range of those with the highest positive influence given by the CNNs in the prediction. The end-to-end deep learning model achieved the highest performance both in the prediction of the binding affinity and on the ability to correctly distinguish the interaction strength rank order when compared to baseline approaches. CONCLUSIONS: This research study validates the potential applicability of an end-to-end deep learning architecture in the context of drug discovery beyond the confined space of proteins and ligands with determined 3D structure. Furthermore, it shows the reliability of the deep representations extracted from the CNNs by providing explainability to the decision-making process.


Subject(s)
Neural Networks, Computer , Proteins , Binding Sites , Plant Extracts , Proteins/chemistry , Reproducibility of Results
10.
Biomedicines ; 10(2)2022 Jan 29.
Article in English | MEDLINE | ID: mdl-35203524

ABSTRACT

Dementia remains an extremely prevalent syndrome among older people and represents a major cause of disability and dependency. Alzheimer's disease (AD) accounts for the majority of dementia cases and stands as the most common neurodegenerative disease. Since age is the major risk factor for AD, the increase in lifespan not only represents a rise in the prevalence but also adds complexity to the diagnosis. Moreover, the lack of disease-modifying therapies highlights another constraint. A shift from a curative to a preventive approach is imminent and we are moving towards the application of personalized medicine where we can shape the best clinical intervention for an individual patient at a given point. This new step in medicine requires the most recent tools and analysis of enormous amounts of data where the application of artificial intelligence (AI) plays a critical role in the depiction of disease-patient dynamics, crucial in reaching early/optimal diagnosis, monitoring and intervention. Predictive models and algorithms are the key elements in this innovative field. In this review, we present an overview of relevant topics regarding the application of AI in AD, detailing the algorithms and their applications in the fields of drug discovery and biomarkers.

11.
J Cheminform ; 13(1): 21, 2021 Mar 09.
Article in English | MEDLINE | ID: mdl-33750461

ABSTRACT

In this work, we explore the potential of deep learning to streamline the process of identifying new potential drugs through the computational generation of molecules with interesting biological properties. Two deep neural networks compose our targeted generation framework: the Generator, which is trained to learn the building rules of valid molecules employing SMILES string notation, and the Predictor, which evaluates the newly generated compounds by predicting their affinity for the desired target. Then, the Generator is optimized through Reinforcement Learning to produce molecules with bespoke properties. The innovation of this approach is the exploratory strategy applied during the reinforcement training process that seeks to add novelty to the generated compounds. This training strategy employs two Generators interchangeably to sample new SMILES: the initially trained model that will remain fixed and a copy of the previous one that will be updated during the training to uncover the most promising molecules. The evolution of the reward assigned by the Predictor determines how often each one is employed to select the next token of the molecule. This strategy establishes a compromise between the need to acquire more information about the chemical space and the need to sample new molecules with the experience gained so far. To demonstrate the effectiveness of the method, the Generator is trained to design molecules with an optimized partition coefficient and also high inhibitory power against the Adenosine [Formula: see text] and [Formula: see text] opioid receptors. The results reveal that the model can effectively adjust the newly generated molecules towards the wanted direction. More importantly, it was possible to find promising sets of unique and diverse molecules, which was the main purpose of the newly implemented strategy.
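The token-level alternation between a fixed and an updated Generator described above can be sketched as follows. This is a toy illustration under stated assumptions: the two samplers are stand-in functions rather than neural networks, `p_updated` is a hypothetical probability of consulting the updated Generator, and the rule that derives it from the reward evolution is not given in the abstract and is therefore left as an input.

```python
import random
from typing import Callable, List

# A sampler maps the tokens generated so far to the next token (hypothetical type).
Sampler = Callable[[List[str]], str]


def sample_sequence(
    fixed: Sampler,
    updated: Sampler,
    p_updated: float,
    max_len: int,
    rng: random.Random,
) -> List[str]:
    """Build a token sequence, choosing per token which Generator to consult."""
    tokens: List[str] = []
    for _ in range(max_len):
        # With probability p_updated use the Generator being trained,
        # otherwise fall back to the frozen initial Generator.
        model = updated if rng.random() < p_updated else fixed
        token = model(tokens)
        if token == "<end>":
            break
        tokens.append(token)
    return tokens


def toy_fixed(prefix: List[str]) -> str:
    """Stand-in for the frozen Generator: always emits a carbon token."""
    return "C"


def toy_updated(prefix: List[str]) -> str:
    """Stand-in for the trained Generator: always emits a nitrogen token."""
    return "N"


only_fixed = sample_sequence(toy_fixed, toy_updated, 0.0, 4, random.Random(0))
only_updated = sample_sequence(toy_fixed, toy_updated, 1.0, 4, random.Random(0))
```

Intermediate values of `p_updated` mix tokens from both models, which is one way to balance exploring the chemical space against exploiting what the updated Generator has learned.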

12.
IEEE/ACM Trans Comput Biol Bioinform ; 18(6): 2364-2374, 2021.
Article in English | MEDLINE | ID: mdl-32142454

ABSTRACT

The discovery of potential Drug-Target Interactions (DTIs) is a determining step in the drug discovery and repositioning process, as the effectiveness of the currently available antibiotic treatment is declining. Despite the efforts put into traditional in vivo and in vitro methods, pharmaceutical financial investment has been reduced over the years. Therefore, establishing effective computational methods is decisive to find new leads in a reasonable amount of time. Successful approaches have been presented to solve this problem, but protein sequences and structured data are seldom used together. In this paper, we present a deep learning architecture model, which exploits the particular ability of Convolutional Neural Networks (CNNs) to obtain 1D representations from protein sequences (amino acid sequence) and compound SMILES (Simplified Molecular Input Line Entry System) strings. These representations can be interpreted as features that express local dependencies or patterns that can then be used in a Fully Connected Neural Network (FCNN), acting as a binary classifier. The results achieved demonstrate that using CNNs to obtain representations of the data, instead of the traditional descriptors, leads to improved performance. The proposed end-to-end deep learning method outperformed traditional machine learning approaches in the correct classification of both positive and negative interactions.


Subject(s)
Computational Biology/methods , Deep Learning , Drug Discovery/methods , Drug Repositioning/methods , Algorithms , Amino Acid Sequence , Humans , Machine Learning , Neural Networks, Computer , Pharmaceutical Preparations/chemistry , Pharmaceutical Preparations/metabolism , Proteins/chemistry , Proteins/metabolism
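Before a 1D CNN can process protein sequences or SMILES strings as described above, the text has to be turned into fixed-length numeric input. A minimal sketch of one common preprocessing route follows, assuming character-level tokenization with zero padding. The function names are hypothetical, and character-level splitting would break multi-character atoms such as "Cl", which a real tokenizer would need to handle.

```python
from typing import Dict, Iterable, List


def build_vocab(strings: Iterable[str]) -> Dict[str, int]:
    """Map each character to a positive integer; 0 is reserved for padding."""
    chars = sorted({ch for s in strings for ch in s})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def encode(s: str, vocab: Dict[str, int], max_len: int) -> List[int]:
    """Integer-encode a string, truncating or zero-padding to max_len."""
    ids = [vocab[ch] for ch in s[:max_len]]
    return ids + [0] * (max_len - len(ids))


# Two toy SMILES strings (ethanol and methylamine) for illustration.
vocab = build_vocab(["CCO", "CN"])
encoded = encode("CCO", vocab, max_len=5)
```

The resulting integer vectors are typically passed through an embedding or one-hot layer before the convolutional filters, which then pick up local patterns along the sequence.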
13.
Bioinformatics ; 36(4): 1298-1299, 2020 02 15.
Article in English | MEDLINE | ID: mdl-31504214

ABSTRACT

SUMMARY: CroP is a data visualization application that focuses on the analysis of relational data that changes over time. While it was specifically designed for addressing the preeminent need to interpret large-scale time series from gene expression studies, CroP is prepared to analyze datasets from multiple contexts. Multiple datasets can be uploaded simultaneously and viewed through dynamic visualization models, which are contained within flexible panels that allow users to adapt the workspace to their data. Through clustering and the time curve visualization it is possible to quickly identify groups of data points with similar properties or behaviors, as well as temporal patterns across all points, such as periodic waves of expression. Additionally, it integrates a public biomedical database for gene annotation. CroP will be of major interest to biologists who seek to extract relations from complex sets of data. AVAILABILITY AND IMPLEMENTATION: CroP is freely available for download as an executable jar at https://cdv.dei.uc.pt/crop/.


Subject(s)
Software , Cluster Analysis , Databases, Factual , Gene Expression , Molecular Sequence Annotation
14.
Biomed Res Int ; 2019: 8984248, 2019.
Article in English | MEDLINE | ID: mdl-31828144

ABSTRACT

Protein-protein interactions (PPIs) can be conveniently represented as networks, allowing the use of graph theory for their study. Network topology studies may reveal patterns associated with specific organisms. Here, we propose a new methodology to denoise PPI networks and predict missing links solely based on the network topology, the organization measurement (OM) method. The OM methodology was applied in the denoising of the PPI networks of two Saccharomyces cerevisiae datasets (Yeast and CS2007) and one Homo sapiens dataset (Human). To evaluate the denoising capabilities of the OM methodology, two strategies were applied. The first strategy compared its application in random networks and in the reference set networks, while the second strategy perturbed the networks with the gradual random addition and removal of edges. The application of the OM methodology to the Yeast and Human reference sets achieved AUCs of 0.95 and 0.87, respectively. The random removal of 80% of the Yeast and Human reference set interactions resulted in AUCs of 0.71 and 0.62, whereas the random addition of 80% interactions resulted in AUCs of 0.75 and 0.72, respectively. Applying the OM methodology to the CS2007 dataset yields an AUC of 0.99. We also perturbed the network of the CS2007 dataset by randomly inserting and removing edges in the same proportions previously described. The false positives identified and removed from the network varied from 97%, when inserting 20% more edges, to 89%, when 80% more edges were inserted. The true positives identified and inserted in the network varied from 95%, when removing 20% of the edges, to 40%, after the random deletion of 80% of the edges. The OM methodology is sensitive to the topological structure of the biological networks. The obtained results suggest that the present approach can efficiently be used to denoise PPI networks.


Subject(s)
Computational Biology/methods , Protein Interaction Mapping/methods , Protein Interaction Maps , Area Under Curve , Databases, Protein , Humans , Saccharomyces cerevisiae Proteins
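The denoising results above are summarized with AUC values. The sketch below shows how such a value can be computed from edge scores, using the rank interpretation of the area under the ROC curve: the probability that a true edge scores higher than a false one, with ties counted as one half. It does not implement the OM method itself, and the function name, labels, and scores are illustrative assumptions.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = true interaction, 0 = spurious edge.
edge_labels = [1, 1, 0, 0]
edge_scores = [0.9, 0.4, 0.5, 0.1]
auc = roc_auc(edge_labels, edge_scores)
```

This pairwise form is quadratic in the number of edges, so large networks would normally use a sorting-based implementation from a statistics library instead.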
15.
Brief Bioinform ; 20(4): 1513-1523, 2019 07 19.
Article in English | MEDLINE | ID: mdl-29590305

ABSTRACT

The field of computational biology has become largely dependent on data visualization tools to analyze the increasing quantities of data gathered through the use of new and growing technologies. Aside from the volume, which often results in large amounts of noise and complex relationships with no clear structure, the visualization of biological data sets is hindered by their heterogeneity, as data are obtained from different sources and contain a wide variety of attributes, including spatial and temporal information. This requires visualization approaches that are able to not only represent various data structures simultaneously but also provide exploratory methods that allow the identification of meaningful relationships that would not be perceptible through data analysis algorithms alone. In this article, we present a survey of visualization approaches applied to the analysis of biological data. We focus on graph-based visualizations and tools that use coordinated multiple views to represent high-dimensional multivariate data, in particular time series gene expression, protein-protein interaction networks and biological pathways. We then discuss how these methods can be used to help solve the current challenges surrounding the visualization of complex biological data sets.


Subject(s)
Computational Biology/methods , Data Analysis , Algorithms , Animals , Computer Graphics/statistics & numerical data , Data Interpretation, Statistical , Gene Expression Profiling/statistics & numerical data , Humans , Models, Biological , Multivariate Analysis , Protein Interaction Maps , User-Computer Interface
16.
J Proteomics ; 171: 81-86, 2018 01 16.
Article in English | MEDLINE | ID: mdl-28843534

ABSTRACT

The value of the molecular information obtained from saliva is dependent on the use of in vitro and in silico techniques. The main proteins of saliva, when separated by capillary electrophoresis, enable the establishment of individual profiles with characteristic patterns reflecting each individual phenotype. Different physiological or pathological conditions may be identified by specific protein profiles. The association of each profile to the particular protein composition provides clues as to which biological processes are compromised in each situation. Patient stratification according to different phenotypes, often within a particular disease spectrum, is especially important for the management of individuals carrying multiple diseases and requiring personalized interventions. In this work we present the SalivaPRINT Toolkit, which enables the analysis of protein profile patterns and patient phenotyping. Additionally, the SalivaPRINT Toolkit allows the identification of molecular weight ranges altered in a particular condition and therefore potentially involved in the underlying dysregulated mechanisms. This tutorial introduces the use of the SalivaPRINT Toolkit command line interface (https://github.com/salivatec/SalivaPRINT) as an independent tool for electrophoretic protein profile evaluation. It provides a detailed overview of its functionalities, illustrated by the application to the analysis of profiles obtained from a healthy population versus a population affected with inflammatory conditions. BIOLOGICAL SIGNIFICANCE: We present SalivaPRINT, which serves as a patient characterization tool to identify molecular weights related to particular conditions and, from there, find proteins that may be involved in the underlying dysregulated cellular mechanisms. The proposed analysis strategy has the potential to boost personalized diagnosis. 
To our knowledge this is the first independent tool for electrophoretic protein profile evaluation and is crucial when a large number of complex electrophoretic profiles needs to be compared and classified.


Subject(s)
Computational Biology/methods , Proteome/metabolism , Saliva/metabolism , Salivary Proteins and Peptides/metabolism , Software , Celiac Disease/metabolism , Databases, Protein , Humans , Inflammation/metabolism , Machine Learning , Molecular Weight , Phenotype , Proteome/classification
17.
Biomed Res Int ; 2017: 1734151, 2017.
Article in English | MEDLINE | ID: mdl-29379794

ABSTRACT

Identifying ZIKV factors interfering with human host pathways represents a major challenge in understanding ZIKV tropism and pathogenesis. The integration of proteomic, gene expression and Protein-Protein Interactions (PPIs) established between ZIKV and human host proteins predicted by the OralInt algorithm identified 1898 interactions with medium or high score (≥0.7). Targets implicated in vesicular traffic and docking were identified. New receptors involved in endocytosis pathways as ZIKV entry targets, using both clathrin-dependent (17 receptors) and independent (10 receptors) pathways, are described. New targets used by the ZIKV to undermine the host's antiviral immune response are proposed based on predicted interactions established between the virus and host cell receptors and/or proteins with an effector or signaling role in the immune response such as IFN receptors and TLR. Complement and cytokines are proposed as extracellular potential interacting partners of the secreted form of NS1 ZIKV protein. Altogether, in this article, 18 new human targets for structural and nonstructural ZIKV proteins are proposed. These results are of great relevance for the understanding of viral pathogenesis and consequently the development of preventive (vaccines) and therapeutic targets for ZIKV infection management.


Subject(s)
Computational Biology , Models, Immunological , Viral Proteins/immunology , Zika Virus Infection/immunology , Zika Virus/immunology , Female , Humans , Male , Viral Vaccines/immunology , Zika Virus Infection/pathology , Zika Virus Infection/prevention & control
18.
PLoS Comput Biol ; 12(11): e1005219, 2016 Nov.
Article in English | MEDLINE | ID: mdl-27893735

ABSTRACT

De novo experimental drug discovery is an expensive and time-consuming task. It requires the identification of drug-target interactions (DTIs) towards targets of biological interest, either to inhibit or enhance a specific molecular function. Dedicated computational models for protein simulation and DTI prediction are crucial to speed up DTI identification and reduce its associated costs. In this paper we present a computational pipeline that enables the discovery of putative leads for drug repositioning that can be applied to any microbial proteome, as long as the interactome of interest is at least partially known. Network metrics calculated for the interactome of the bacterial organism of interest were used to identify putative drug targets. Then, a random forest classification model for DTI prediction was constructed using known DTI data from publicly available databases, resulting in an area under the ROC curve of 0.91 for classification of out-of-sample data. A drug-target network was created by combining 3,081 unique ligands and the expected ten best drug targets. This network was used to predict new DTIs and to calculate the probability of the positive class, allowing the scoring of the predicted instances. Molecular docking experiments were performed on the best scoring DTI pairs and the results were compared with those of the same ligands with their original targets. The results obtained suggest that the proposed pipeline can be used in the identification of new leads for drug repositioning. The proposed classification model is available at http://bioinformatics.ua.pt/software/dtipred/.


Subject(s)
Anti-Bacterial Agents/chemistry , Bacterial Proteins/chemistry , Drug Discovery/methods , Drug Repositioning/methods , Models, Chemical , Protein Interaction Mapping/methods , Computer Simulation , Drug Evaluation, Preclinical/methods
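The pipeline above uses network metrics of the interactome to shortlist putative drug targets, but the abstract does not say which metrics. As an illustrative assumption, the sketch below computes degree centrality, one of the simplest such metrics, on a toy undirected interaction network. The function name and the protein labels are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def degree_centrality(edges: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    neighbors: Dict[str, Set[str]] = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-interactions
        neighbors[u].add(v)
        neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbors.items()}


# Toy interactome with hypothetical protein names.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
centrality = degree_centrality(toy_edges)
hub = max(centrality, key=centrality.get)
```

Highly connected proteins such as the hub in this toy network are the kind of candidates a topology-based ranking would surface, although a real pipeline would likely combine several metrics.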
19.
J Bioinform Comput Biol ; 13(5): 1550023, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26388143

ABSTRACT

Microbial communities thrive in close association among themselves and with the host, establishing protein-protein interactions (PPIs) with the latter, and thus being able to benefit (positively impact) or disturb (negatively impact) biological events in the host. Despite major collaborative efforts to sequence the Human microbiome, there is still a great lack of understanding of its impact. We propose a computational methodology to predict the impact of microbial proteins in human biological events, taking into account the abundance of each microbial protein and its relation to all other microbial and human proteins. This alternative methodology is centered on an improved impact estimation algorithm that integrates PPIs between human and microbial proteins with Reactome pathway data. This methodology was applied to study the impact of 24 microbial phyla over different cellular events, within 10 different human microbiomes. The results obtained confirm findings already described in the literature and explore new ones. We believe the Human microbiome can no longer be ignored, as there is enough evidence correlating not only microbiome alterations with disease states, but also the return to healthy states once these alterations are reversed.


Subject(s)
Algorithms , Computational Biology/methods , Microbiota , Protein Interaction Mapping/statistics & numerical data , Computing Methodologies , Databases, Protein , Female , Genetic Variation , Host-Pathogen Interactions , Humans , Male , Metagenomics/statistics & numerical data , Organ Specificity , Phylogeny
20.
Article in English | MEDLINE | ID: mdl-26736986

ABSTRACT

Microbial species thrive within human hosts by establishing complex associations between themselves and the host. Even though species diversity can be measured (alpha- and beta-diversity), a methodology to estimate the impact of microorganisms in human pathways is still lacking. In this work we propose a computational approach to estimate which human pathways are targeted the most by microorganisms, while also identifying which microorganisms are prominent in this targeting. Our results were consistent with literature evidence, and thus we propose this methodology as a new prospective approach to be used for screening potentially impacted pathways.


Subject(s)
Algorithms , Bacteria/metabolism , Host-Pathogen Interactions , Microbiota , Humans