Search | VHL Regional Portal (test)

1.

Detecting outliers in case-control cohorts for improving deep learning networks on Schizophrenia prediction.

Martins, Daniel; Abbasi, Maryam; Egas, Conceição; Arrais, Joel P.

J Integr Bioinform ; 2024 Jul 15.

Article in English | MEDLINE | ID: mdl-39004922

ABSTRACT

This study delves into the intricate genetic and clinical aspects of Schizophrenia, a complex mental disorder with uncertain etiology. Deep Learning (DL) holds promise for analyzing large genomic datasets to uncover new risk factors. However, based on reports of non-negligible misdiagnosis rates for SCZ, case-control cohorts may contain outlying genetic profiles, hindering compelling performances of classification models. The research employed a case-control dataset sourced from the Swedish populace. A gene-annotation-based DL architecture was developed and employed in two stages. First, the model was trained on the entire dataset to highlight differences between cases and controls. Then, samples likely to be misclassified were excluded, and the model was retrained on the refined dataset for performance evaluation. The results indicate that SCZ prevalence and misdiagnosis rates can affect case-control cohorts, potentially compromising future studies reliant on such datasets. However, by detecting and filtering outliers, the study demonstrates the feasibility of adapting DL methodologies to large-scale biological problems, producing results more aligned with existing heritability estimates for SCZ. This approach not only advances the comprehension of the genetic background of SCZ but also opens doors for adapting DL techniques in complex research for precision medicine in mental health.
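The two-stage procedure described above (train on everything, drop samples likely to be misclassified, retrain) can be illustrated with a minimal sketch. The abstract does not state the exclusion criterion, so the rule used here (drop a sample when the first-stage model gives its recorded label a probability below a `margin`) and the names `filter_likely_misclassified` and `margin` are assumptions made for illustration only.

```python
def filter_likely_misclassified(labels, case_probabilities, margin=0.3):
    """Return indices of samples kept for the second training stage.

    labels: 1 for case, 0 for control.
    case_probabilities: first-stage model output, P(sample is a case).
    margin: hypothetical threshold; a sample is dropped when the model
            assigns less than this probability to its recorded label.
    """
    kept = []
    for index, (label, prob) in enumerate(zip(labels, case_probabilities)):
        # Probability the first-stage model assigns to the recorded label.
        prob_of_label = prob if label == 1 else 1.0 - prob
        if prob_of_label >= margin:
            kept.append(index)
    return kept


# Toy data: samples 1 and 3 strongly contradict their labels.
labels = [1, 1, 0, 0, 1]
case_probabilities = [0.9, 0.1, 0.2, 0.95, 0.6]
print(filter_likely_misclassified(labels, case_probabilities))
```

The retained indices would then select the refined dataset on which the model is retrained and evaluated.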

2.

Predicting drug activity against cancer through genomic profiles and SMILES.

Abbasi, Maryam; Carvalho, Filipa G; Ribeiro, Bernardete; Arrais, Joel P.

Artif Intell Med ; 150: 102820, 2024 Apr.

Article in English | MEDLINE | ID: mdl-38553160

ABSTRACT

Due to the constant increase in cancer rates, the disease has become a leading cause of death worldwide, increasing the need for its detection and treatment. In the era of personalized medicine, the main goal is to incorporate individual variability in order to choose more precisely which therapy and prevention strategies suit each person. However, predicting the sensitivity of tumors to anticancer treatments remains a challenge. In this work, we propose two deep neural network models to predict the impact of anticancer drugs in tumors through the half-maximal inhibitory concentration (IC50). These models join biological and chemical data to capture relevant features of the genetic profile and the drug compounds, respectively. In order to predict the drug response in cancer cell lines, this study employed different DL methods, resorting to Recurrent Neural Networks (RNNs) and Convolutional Neural Networks (CNNs). In the first stage, two autoencoders were pre-trained with high-dimensional gene expression and mutation data of tumors. Afterward, this genetic background is transferred to the prediction models that return the IC50 value that portrays the potency of a substance in inhibiting a cancer cell line. When comparing RSEM expected counts and TPM as methods for representing gene expression data, RSEM was shown to perform better in deep models, and CNN models can obtain better insight into these types of data. Moreover, the obtained results reflect the effectiveness of the extracted deep representations in the prediction of the IC50 value, achieving a mean squared error of 1.06 and surpassing previous state-of-the-art models.

Subjects

Genetic Profile , Neoplasms , Humans , Neural Networks, Computer , Neoplasms/drug therapy , Neoplasms/genetics , Cell Line , Genomics

3.

Enhancing reinforcement learning for de novo molecular design applying self-attention mechanisms.

Pereira, Tiago O; Abbasi, Maryam; Arrais, Joel P.

Brief Bioinform ; 24(6)2023 09 22.

Article in English | MEDLINE | ID: mdl-37903414

ABSTRACT

The drug discovery process can be significantly improved by applying deep reinforcement learning (RL) methods that learn to generate compounds with desired pharmacological properties. Nevertheless, RL-based methods typically condense the evaluation of sampled compounds into a single scalar value, making it difficult for the generative agent to learn the optimal policy. This work combines self-attention mechanisms and RL to generate promising molecules. The idea is to evaluate the relative significance of each atom and functional group in their interaction with the target, and to utilize this information for optimizing the Generator. Therefore, the framework for de novo drug design is composed of a Generator that samples new compounds combined with a Transformer-encoder and a biological affinity Predictor that evaluate the generated structures. Moreover, it takes the advantage of the knowledge encapsulated in the Transformer's attention weights to evaluate each token individually. We compared the performance of two output prediction strategies for the Transformer: standard and masked language model (MLM). The results show that the MLM Transformer is more effective in optimizing the Generator compared with the state-of-the-art works. Additionally, the evaluation models identified the most important regions of each molecule for the biological interaction with the target. As a case study, we generated synthesizable hit compounds that can be putative inhibitors of the enzyme ubiquitin-specific protein 7 (USP7).

Subjects

Drug Design , Learning , Drug Discovery

4.

Artificial intelligence for prediction of biological activities and generation of molecular hits using stereochemical information.

Pereira, Tiago O; Abbasi, Maryam; Oliveira, Rita I; Guedes, Romina A; Salvador, Jorge A R; Arrais, Joel P.

J Comput Aided Mol Des ; 37(12): 791-806, 2023 12.

Article in English | MEDLINE | ID: mdl-37847342

ABSTRACT

In this work, we develop a method for generating targeted hit compounds by applying deep reinforcement learning and attention mechanisms to predict binding affinity against a biological target while considering stereochemical information. The novelty of this work is a deep model Predictor that can establish the relationship between chemical structures and their corresponding [Formula: see text] values. We thoroughly study the effect of different molecular descriptors such as ECFP4, ECFP6, SMILES and RDKFingerprint. We also demonstrate the importance of attention mechanisms for capturing long-range dependencies in molecular sequences. Due to the importance of stereochemical information for the binding mechanism, this information was employed in both the prediction and generation processes. To identify the most promising hits, we apply a self-adaptive multi-objective optimization strategy. Moreover, to ensure the existence of stereochemical information, we consider all the possible enumerated stereoisomers to provide the most appropriate 3D structures. We evaluated this approach against the Ubiquitin-Specific Protease 7 (USP7) by generating putative inhibitors for this target. The predictor that uses SMILES notation as the descriptor, combined with a bidirectional recurrent neural network with an attention mechanism, has the best performance. Additionally, our methodology identifies the regions of the generated molecules that are important for the interaction with the receptor's active site. The obtained results also demonstrate that it is possible to discover synthesizable molecules with high biological affinity for the target, together with an indication of their optimal stereochemical conformation.

Subjects

Artificial Intelligence , Drug Design , Neural Networks, Computer , Molecular Structure

5.

FSM-DDTR: End-to-end feedback strategy for multi-objective De Novo drug design using transformers.

Monteiro, Nelson R C; Pereira, Tiago O; Machado, Ana Catarina D; Oliveira, José L; Abbasi, Maryam; Arrais, Joel P.

Comput Biol Med ; 164: 107285, 2023 09.

Article in English | MEDLINE | ID: mdl-37557054

ABSTRACT

The design of compounds that target specific biological functions with relevant selectivity is critical in the context of drug discovery, especially due to the polypharmacological nature of most existing drug molecules. In recent years, in silico-based methods combined with deep learning have shown promising results in the de novo drug design challenge, leading to potential leads for biologically interesting targets. However, several of these methods overlook the importance of certain properties, such as validity rate and target selectivity, or simplify the generative process by neglecting the multi-objective nature of the pharmacological space. In this study, we propose a multi-objective Transformer-based architecture to generate drug candidates with desired molecular properties and increased selectivity toward a specific biological target. The framework consists of a Transformer-Decoder Generator that generates novel and valid compounds in the SMILES format notation, a Transformer-Encoder Predictor that estimates the binding affinity toward the biological target, and a feedback loop combined with a multi-objective optimization strategy to rank the generated molecules and condition the generating distribution around the targeted properties. The results demonstrate that the proposed architecture can generate novel and synthesizable small compounds with desired pharmacological properties toward a biologically relevant target. The unbiased Transformer-based Generator achieved superior performance in the novelty rate (97.38%) and comparable performance in terms of internal diversity, uniqueness, and validity against state-of-the-art baselines. The optimization of the unbiased Transformer-based Generator resulted in the generation of molecules exhibiting high binding affinity toward the Adenosine A2A Receptor (AA2AR) and possessing desirable physicochemical properties, where 99.36% of the generated molecules follow Lipinski's rule of five. 
Furthermore, the implementation of a feedback strategy, in conjunction with a multi-objective algorithm, effectively shifted the distribution of the generated molecules toward optimal values of molecular weight, molecular lipophilicity, topological polar surface area, synthetic accessibility score, and quantitative estimate of drug-likeness, without the necessity of prior training sets comprising molecules endowed with pharmacological properties of interest. Overall, this research study validates the applicability of a Transformer-based architecture in the context of drug design, capable of exploring the vast chemical representation space to generate novel molecules with improved pharmacological properties and target selectivity. The data and source code used in this study are available at: https://github.com/larngroup/FSM-DDTR.
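The abstract reports the share of generated molecules that follow Lipinski's rule of five. As a hedged illustration of how such a share could be computed, the sketch below counts rule violations from precomputed descriptors (molecular weight above 500 Da, logP above 5, more than 5 hydrogen-bond donors, more than 10 hydrogen-bond acceptors). The `Descriptors` class, the function names, the `max_violations` parameter, and the example values are hypothetical; the study's own descriptor pipeline and pass criterion are not given in the abstract.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed properties of one molecule (illustrative container)."""
    molecular_weight: float  # daltons
    log_p: float             # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five thresholds are exceeded."""
    checks = [
        d.molecular_weight > 500,
        d.log_p > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ]
    return sum(checks)


def fraction_passing(molecules, max_violations=0):
    """Share of molecules with at most `max_violations` violations."""
    if not molecules:
        return 0.0
    passing = sum(1 for m in molecules if lipinski_violations(m) <= max_violations)
    return passing / len(molecules)


# Made-up descriptor values, for illustration only.
batch = [
    Descriptors(350.0, 2.5, 2, 5),
    Descriptors(650.0, 6.1, 2, 5),
]
print(fraction_passing(batch))
```

Some authors tolerate a single violation; passing `max_violations=1` covers that convention.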

Subjects

Drug Design , Drug Discovery , Feedback , Algorithms , Software

6.

Correction to: Designing optimized drug candidates with Generative Adversarial Network.

Abbasi, Maryam; Santos, Beatriz P; Pereira, Tiago C; Sofia, Raul; Monteiro, Nelson R C; Simões, Carlos J V; Brito, Rui M M; Ribeiro, Bernardete; Oliveira, José L; Arrais, Joel P.

J Cheminform ; 14(1): 53, 2022 Aug 11.

Article in English | MEDLINE | ID: mdl-35953869

7.

Deep generative model for therapeutic targets using transcriptomic disease-associated data-USP7 case study.

Pereira, Tiago; Abbasi, Maryam; Oliveira, Rita I; Guedes, Romina A; Salvador, Jorge A R; Arrais, Joel P.

Brief Bioinform ; 23(4)2022 07 18.

Article in English | MEDLINE | ID: mdl-35789255

ABSTRACT

The generation of candidate hit molecules with the potential to be used in cancer treatment is a challenging task. In this context, computational methods based on deep learning have been employed to improve in silico drug design methodologies. Nonetheless, the applied strategies have focused solely on the chemical aspect of the generation of compounds, disregarding the likely biological consequences for the organism's dynamics. Herein, we propose a method to implement targeted molecular generation that employs biological information, namely, disease-associated gene expression data, to conduct the process of identifying interesting hits. When applied to the generation of USP7 putative inhibitors, the framework managed to generate promising compounds, with more than 90% of them containing drug-like properties and essential active groups for the interaction with the target. Hence, this work provides a novel and reliable method for generating new promising compounds focused on the biological context of the disease.

Subjects

Drug Design , Transcriptome , Ubiquitin-Specific Peptidase 7

8.

DTITR: End-to-end drug-target binding affinity prediction with transformers.

Monteiro, Nelson R C; Oliveira, José L; Arrais, Joel P.

Comput Biol Med ; 147: 105772, 2022 08.

Article in English | MEDLINE | ID: mdl-35777085

ABSTRACT

The accurate identification of Drug-Target Interactions (DTIs) remains a critical turning point in drug discovery and understanding of the binding process. Despite recent advances in computational solutions to overcome the challenges of in vitro and in vivo experiments, most of the proposed in silico-based methods still focus on binary classification, overlooking the importance of characterizing DTIs with unbiased binding strength values to properly distinguish primary interactions from those with off-targets. Moreover, several of these methods usually simplify the entire interaction mechanism, neglecting the joint contribution of the individual units of each binding component and the interacting substructures involved, and have yet to focus on more explainable and interpretable architectures. In this study, we propose an end-to-end Transformer-based architecture for predicting drug-target binding affinity (DTA) using 1D raw sequential and structural data to represent the proteins and compounds. This architecture exploits self-attention layers to capture the biological and chemical context of the proteins and compounds, respectively, and cross-attention layers to exchange information and capture the pharmacological context of the DTIs. The results show that the proposed architecture is effective in predicting DTA, achieving superior performance in both correctly predicting the value of interaction strength and being able to correctly discriminate the rank order of binding strength compared to state-of-the-art baselines. The combination of multiple Transformer-Encoders was found to result in robust and discriminative aggregate representations of the proteins and compounds for binding affinity prediction, in which the addition of a Cross-Attention Transformer-Encoder was identified as an important block for improving the discriminative power of these representations. 
Overall, this research study validates the applicability of an end-to-end Transformer-based architecture in the context of drug discovery, capable of self-providing different levels of potential DTI and prediction understanding due to the nature of the attention blocks. The data and source code used in this study are available at: https://github.com/larngroup/DTITR.
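The cross-attention layers mentioned above let one sequence (for example, compound tokens) query another (protein tokens). The sketch below shows only the core scaled dot-product step in plain Python, as an illustration under simplifying assumptions: it omits the learned projections, multiple heads, and residual layers that a Transformer uses, and it is not the DTITR implementation (see the repository linked in the abstract for that). The function names and toy vectors are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention.

    Each query row (e.g. a compound token) attends over all key/value
    rows (e.g. protein tokens). Returns (outputs, attention_weights).
    """
    scale = math.sqrt(len(keys[0]))
    outputs, weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the value rows, one output column at a time.
        outputs.append(
            [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        )
    return outputs, weights


# Toy example: one compound token attending over two identical protein keys.
outputs, weights = cross_attention(
    queries=[[1.0, 0.0]],
    keys=[[1.0, 0.0], [1.0, 0.0]],
    values=[[2.0, 0.0], [4.0, 0.0]],
)
print(weights, outputs)
```

The returned weights are the quantities that attention-based models can expose for interpretation, since each row shows how strongly a query token attends to each key token.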

Subjects

Proteins , Software , Drug Development , Drug Discovery/methods , Proteins/chemistry

9.

Designing optimized drug candidates with Generative Adversarial Network.

Abbasi, Maryam; Santos, Beatriz P; Pereira, Tiago C; Sofia, Raul; Monteiro, Nelson R C; Simões, Carlos J V; Brito, Rui M M; Ribeiro, Bernardete; Oliveira, José L; Arrais, Joel P.

J Cheminform ; 14(1): 40, 2022 Jun 26.

Article in English | MEDLINE | ID: mdl-35754029

ABSTRACT

Drug design is an important area of study for pharmaceutical businesses. However, low efficacy, off-target delivery, time consumption, and high cost are challenges that can create barriers to this process. Deep Learning models are emerging as a promising solution to perform de novo drug design, i.e., to generate drug-like molecules tailored to specific needs. However, stereochemistry was not explicitly considered in the generated molecules, which is inevitable in target-oriented molecules. This paper proposes a framework based on a Feedback Generative Adversarial Network (GAN) that includes an optimization strategy by incorporating Encoder-Decoder, GAN, and Predictor deep models interconnected with a feedback loop. The Encoder-Decoder converts the string notations of molecules into latent space vectors, effectively creating a new type of molecular representation. At the same time, the GAN can learn and replicate the training data distribution and, therefore, generate new compounds. The feedback loop is designed to incorporate and evaluate the generated molecules according to the multiobjective desired property at every epoch of training to ensure a steady shift of the generated distribution towards the space of the targeted properties. Moreover, to develop a more precise set of molecules, we also incorporate a multiobjective optimization selection technique based on a non-dominated sorting genetic algorithm. The results demonstrate that the proposed framework can generate realistic, novel molecules that span the chemical space. The proposed Encoder-Decoder model correctly reconstructs 99% of the datasets, including stereochemical information. The model's ability to find uncharted regions of the chemical space was successfully shown by optimizing the unbiased GAN to generate molecules with a high binding affinity to the Kappa Opioid and Adenosine [Formula: see text] receptor. Furthermore, the generated compounds exhibit high internal and external diversity (0.88 and 0.94, respectively) as well as uniqueness.
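The selection technique mentioned above rests on the idea of non-domination: a candidate is kept when no other candidate is at least as good on every objective and strictly better on one. The sketch below extracts only the first non-dominated front for objectives that are all maximized. It is a simplified illustration, not the full non-dominated sorting genetic algorithm used in the study; the function names and the toy objective values are assumptions.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` on every objective and strictly
    better on at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated_front(points):
    """Indices of points that no other point dominates."""
    front = []
    for i, p in enumerate(points):
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i):
            front.append(i)
    return front


# Toy objective pairs, e.g. (predicted affinity score, drug-likeness score).
candidates = [(0.9, 0.2), (0.5, 0.5), (0.4, 0.4), (0.2, 0.9), (0.9, 0.1)]
print(non_dominated_front(candidates))
```

Objectives that should be minimized (for example, a synthetic accessibility score where lower is better) can be negated before calling the function.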

10.

Explainable deep drug-target representations for binding affinity prediction.

Monteiro, Nelson R C; Simões, Carlos J V; Ávila, Henrique V; Abbasi, Maryam; Oliveira, José L; Arrais, Joel P.

BMC Bioinformatics ; 23(1): 237, 2022 Jun 17.

Article in English | MEDLINE | ID: mdl-35715734

ABSTRACT

BACKGROUND: Several computational advances have been achieved in the drug discovery field, promoting the identification of novel drug-target interactions and new leads. However, most of these methodologies have been overlooking the importance of providing explanations to the decision-making process of deep learning architectures. In this research study, we explore the reliability of convolutional neural networks (CNNs) at identifying relevant regions for binding, specifically binding sites and motifs, and the significance of the deep representations extracted by providing explanations to the model's decisions based on the identification of the input regions that contributed the most to the prediction. We make use of an end-to-end deep learning architecture to predict binding affinity, where CNNs are exploited in their capacity to automatically identify and extract discriminating deep representations from 1D sequential and structural data. RESULTS: The results demonstrate the effectiveness of the deep representations extracted from CNNs in the prediction of drug-target interactions. CNNs were found to identify and extract features from regions relevant for the interaction, where the weight associated with these spots was in the range of those with the highest positive influence given by the CNNs in the prediction. The end-to-end deep learning model achieved the highest performance both in the prediction of the binding affinity and on the ability to correctly distinguish the interaction strength rank order when compared to baseline approaches. CONCLUSIONS: This research study validates the potential applicability of an end-to-end deep learning architecture in the context of drug discovery beyond the confined space of proteins and ligands with determined 3D structure. Furthermore, it shows the reliability of the deep representations extracted from the CNNs by providing explainability to the decision-making process.

Subjects

Neural Networks, Computer , Proteins , Binding Sites , Plant Extracts , Proteins/chemistry , Reproducibility of Results

11.

The Road to Personalized Medicine in Alzheimer's Disease: The Use of Artificial Intelligence.

Silva-Spínola, Anuschka; Baldeiras, Inês; Arrais, Joel P; Santana, Isabel.

Biomedicines ; 10(2)2022 Jan 29.

Article in English | MEDLINE | ID: mdl-35203524

ABSTRACT

Dementia remains an extremely prevalent syndrome among older people and represents a major cause of disability and dependency. Alzheimer's disease (AD) accounts for the majority of dementia cases and stands as the most common neurodegenerative disease. Since age is the major risk factor for AD, the increase in lifespan not only represents a rise in the prevalence but also adds complexity to the diagnosis. Moreover, the lack of disease-modifying therapies highlights another constraint. A shift from a curative to a preventive approach is imminent and we are moving towards the application of personalized medicine where we can shape the best clinical intervention for an individual patient at a given point. This new step in medicine requires the most recent tools and analysis of enormous amounts of data where the application of artificial intelligence (AI) plays a critical role on the depiction of disease-patient dynamics, crucial in reaching early/optimal diagnosis, monitoring and intervention. Predictive models and algorithms are the key elements in this innovative field. In this review, we present an overview of relevant topics regarding the application of AI in AD, detailing the algorithms and their applications in the fields of drug discovery, and biomarkers.

12.

Diversity oriented Deep Reinforcement Learning for targeted molecule generation.

Pereira, Tiago; Abbasi, Maryam; Ribeiro, Bernardete; Arrais, Joel P.

J Cheminform ; 13(1): 21, 2021 Mar 09.

Article in English | MEDLINE | ID: mdl-33750461

ABSTRACT

In this work, we explore the potential of deep learning to streamline the process of identifying new potential drugs through the computational generation of molecules with interesting biological properties. Two deep neural networks compose our targeted generation framework: the Generator, which is trained to learn the building rules of valid molecules employing SMILES string notation, and the Predictor, which evaluates the newly generated compounds by predicting their affinity for the desired target. Then, the Generator is optimized through Reinforcement Learning to produce molecules with bespoke properties. The innovation of this approach is the exploratory strategy applied during the reinforcement training process that seeks to add novelty to the generated compounds. This training strategy employs two Generators interchangeably to sample new SMILES: the initially trained model, which remains fixed, and a copy of it that is updated during training to uncover the most promising molecules. The evolution of the reward assigned by the Predictor determines how often each one is employed to select the next token of the molecule. This strategy establishes a compromise between the need to acquire more information about the chemical space and the need to sample new molecules with the experience gained so far. To demonstrate the effectiveness of the method, the Generator is trained to design molecules with an optimized partition coefficient and also high inhibitory power against the Adenosine [Formula: see text] and [Formula: see text] opioid receptors. The results reveal that the model can effectively steer the newly generated molecules in the desired direction. More importantly, it was possible to find promising sets of unique and diverse molecules, which was the main purpose of the newly implemented strategy.

13.

Drug-Target Interaction Prediction: End-to-End Deep Learning Approach.

Monteiro, Nelson R C; Ribeiro, Bernardete; Arrais, Joel P.

IEEE/ACM Trans Comput Biol Bioinform ; 18(6): 2364-2374, 2021.

Article in English | MEDLINE | ID: mdl-32142454

ABSTRACT

The discovery of potential Drug-Target Interactions (DTIs) is a determining step in the drug discovery and repositioning process, as the effectiveness of the currently available antibiotic treatment is declining. Despite the effort put into traditional in vivo and in vitro methods, pharmaceutical financial investment has been reduced over the years. Therefore, establishing effective computational methods is decisive to find new leads in a reasonable amount of time. Successful approaches have been presented to solve this problem, but protein sequences and structured data are seldom used together. In this paper, we present a deep learning architecture model, which exploits the particular ability of Convolutional Neural Networks (CNNs) to obtain 1D representations from protein sequences (amino acid sequence) and compound SMILES (Simplified Molecular Input Line Entry System) strings. These representations can be interpreted as features that express local dependencies or patterns that can then be used in a Fully Connected Neural Network (FCNN), acting as a binary classifier. The results achieved demonstrate that using CNNs to obtain representations of the data, instead of the traditional descriptors, leads to improved performance. The proposed end-to-end deep learning method outperformed traditional machine learning approaches in the correct classification of both positive and negative interactions.
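Before a 1D CNN can process protein sequences or SMILES strings, each string has to be turned into a fixed-length numeric vector. The sketch below shows one common way to do that, character-level integer encoding with zero padding, as an assumption-laden illustration: the abstract does not describe the paper's actual preprocessing, and the function names, the padding index, and the `max_length` parameter are hypothetical. Splitting by character is also a simplification, since it breaks two-letter SMILES atoms such as `Cl` into two symbols.

```python
def build_vocabulary(sequences):
    """Map every distinct character to a positive integer id.

    Index 0 is reserved for padding (and, in this sketch, also used
    for characters never seen when the vocabulary was built).
    """
    symbols = sorted({ch for seq in sequences for ch in seq})
    return {symbol: index + 1 for index, symbol in enumerate(symbols)}


def encode(sequence, vocabulary, max_length):
    """Truncate or zero-pad a string to `max_length` integer ids."""
    ids = [vocabulary.get(ch, 0) for ch in sequence[:max_length]]
    return ids + [0] * (max_length - len(ids))


smiles = ["CCO", "C=O"]
vocabulary = build_vocabulary(smiles)
print(vocabulary)
print(encode("CCO", vocabulary, max_length=5))
```

The resulting integer vectors would typically feed an embedding layer followed by 1D convolutions; the same two functions apply to amino acid sequences.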

Subjects

Computational Biology/methods , Deep Learning , Drug Discovery/methods , Drug Repositioning/methods , Algorithms , Amino Acid Sequence , Humans , Machine Learning , Neural Networks, Computer , Pharmaceutical Preparations/chemistry , Pharmaceutical Preparations/metabolism , Proteins/chemistry , Proteins/metabolism

14.

CroP-Coordinated Panel visualization for biological networks analysis.

Cruz, António; Machado, Penousal; Arrais, Joel P.

Bioinformatics ; 36(4): 1298-1299, 2020 02 15.

Article in English | MEDLINE | ID: mdl-31504214

ABSTRACT

SUMMARY: CroP is a data visualization application that focuses on the analysis of relational data that changes over time. While it was specifically designed for addressing the preeminent need to interpret large scale time series from gene expression studies, CroP is prepared to analyze datasets from multiple contexts. Multiple datasets can be uploaded simultaneously and viewed through dynamic visualization models, which are contained within flexible panels that allow users to adapt the workspace to their data. Through clustering and the time curve visualization it is possible to quickly identify groups of data points with similar properties or behaviors, as well as temporal patterns across all points, such as periodic waves of expression. Additionally, it integrates a public biomedical database for gene annotation. CroP will be of major interest to biologists who seek to extract relations from complex sets of data. AVAILABILITY AND IMPLEMENTATION: CroP is freely available for download as an executable jar at https://cdv.dei.uc.pt/crop/.

Subjects

Software , Cluster Analysis , Databases, Factual , Gene Expression , Molecular Sequence Annotation

15.

Handling Noise in Protein Interaction Networks.

Correia, Fernanda B; Coelho, Edgar D; Oliveira, José L; Arrais, Joel P.

Biomed Res Int ; 2019: 8984248, 2019.

Article in English | MEDLINE | ID: mdl-31828144

ABSTRACT

Protein-protein interactions (PPIs) can be conveniently represented as networks, allowing the use of graph theory for their study. Network topology studies may reveal patterns associated with specific organisms. Here, we propose a new methodology to denoise PPI networks and predict missing links solely based on the network topology, the organization measurement (OM) method. The OM methodology was applied in the denoising of the PPI networks of two Saccharomyces cerevisiae datasets (Yeast and CS2007) and one Homo sapiens dataset (Human). To evaluate the denoising capabilities of the OM methodology, two strategies were applied. The first strategy compared its application in random networks and in the reference set networks, while the second strategy perturbed the networks with the gradual random addition and removal of edges. The application of the OM methodology to the Yeast and Human reference sets achieved an AUC of 0.95 and 0.87, in Yeast and Human networks, respectively. The random removal of 80% of the Yeast and Human reference set interactions resulted in an AUC of 0.71 and 0.62, whereas the random addition of 80% interactions resulted in an AUC of 0.75 and 0.72, respectively. Applying the OM methodology to the CS2007 dataset yields an AUC of 0.99. We also perturbed the network of the CS2007 dataset by randomly inserting and removing edges in the same proportions previously described. The false positives identified and removed from the network varied from 97%, when inserting 20% more edges, to 89%, when 80% more edges were inserted. The true positives identified and inserted in the network varied from 95%, when removing 20% of the edges, to 40%, after the random deletion of 80% edges. The OM methodology is sensitive to the topological structure of the biological networks. The obtained results suggest that the present approach can efficiently be used to denoise PPI networks.
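The AUC values reported above summarize how well a scoring method separates true interactions from spurious ones. As a hedged illustration of the metric itself (not of the OM method, whose scoring is not described in the abstract), the sketch below computes AUC as the probability that a randomly chosen true edge receives a higher score than a randomly chosen false edge, with ties counted as one half. The function name and the toy scores are hypothetical.

```python
def auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a random positive (true edge) outscores
    a random negative (false edge); ties contribute 0.5.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Toy scores: three reference-set edges and two randomly added edges.
true_edge_scores = [0.9, 0.8, 0.4]
false_edge_scores = [0.5, 0.3]
print(auc(true_edge_scores, false_edge_scores))
```

The pairwise form is quadratic in the number of edges, which is fine for small illustrations; a rank-based computation would be preferable for networks with many edges.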

Subjects

Computational Biology/methods , Protein Interaction Mapping/methods , Protein Interaction Maps , Area Under Curve , Databases, Protein , Humans , Saccharomyces cerevisiae Proteins

16.

Interactive and coordinated visualization approaches for biological data analysis.

Cruz, António; Arrais, Joel P; Machado, Penousal.

Brief Bioinform ; 20(4): 1513-1523, 2019 07 19.

Article in English | MEDLINE | ID: mdl-29590305

ABSTRACT

The field of computational biology has become largely dependent on data visualization tools to analyze the increasing quantities of data gathered through the use of new and growing technologies. Aside from the volume, which often results in large amounts of noise and complex relationships with no clear structure, the visualization of biological data sets is hindered by their heterogeneity, as data are obtained from different sources and contain a wide variety of attributes, including spatial and temporal information. This requires visualization approaches that are able to not only represent various data structures simultaneously but also provide exploratory methods that allow the identification of meaningful relationships that would not be perceptible through data analysis algorithms alone. In this article, we present a survey of visualization approaches applied to the analysis of biological data. We focus on graph-based visualizations and tools that use coordinated multiple views to represent high-dimensional multivariate data, in particular time series gene expression, protein-protein interaction networks and biological pathways. We then discuss how these methods can be used to help solve the current challenges surrounding the visualization of complex biological data sets.

Subjects

Computational Biology/methods , Data Analysis , Algorithms , Animals , Computer Graphics/statistics & numerical data , Data Interpretation, Statistical , Gene Expression Profiling/statistics & numerical data , Humans , Models, Biological , Multivariate Analysis , Protein Interaction Maps , User-Computer Interface

17.

SalivaPRINT Toolkit - Protein profile evaluation and phenotype stratification.

Cruz, Igor; Esteves, Eduardo; Fernandes, Mónica; Rosa, Nuno; Correia, Maria José; Arrais, Joel P; Barros, Marlene.

J Proteomics ; 171: 81-86, 2018 01 16.

Article in English | MEDLINE | ID: mdl-28843534

ABSTRACT

The value of the molecular information obtained from saliva is dependent on the use of in vitro and in silico techniques. The main proteins of saliva when separated by capillary electrophoresis enable the establishment of individual profiles with characteristic patterns reflecting each individual phenotype. Different physiological or pathological conditions may be identified by specific protein profiles. The association of each profile to the particular protein composition provides clues as to which biological processes are compromised in each situation. Patient stratification according to different phenotypes often within a particular disease spectrum is especially important for the management of individuals carrying multiple diseases and requiring personalized interventions. In this work we present the SalivaPRINT Toolkit, which enables the analysis of protein profile patterns and patient phenotyping. Additionally, the SalivaPRINT Toolkit allows the identification of molecular weight ranges altered in a particular condition and therefore potentially involved in the underlying dysregulated mechanisms. This tutorial introduces the use of the SalivaPRINT Toolkit command line interface (https://github.com/salivatec/SalivaPRINT) as an independent tool for electrophoretic protein profile evaluation. It provides a detailed overview of its functionalities, illustrated by the application to the analysis of profiles obtained from a healthy population versus a population affected with inflammatory conditions. BIOLOGICAL SIGNIFICANCE: We present SalivaPRINT, which serves as a patient characterization tool to identify molecular weights related with particular conditions and, from there, find proteins, which may be involved in the underlying dysregulated cellular mechanisms. The proposed analysis strategy has the potential to boost personalized diagnosis. To our knowledge this is the first independent tool for electrophoretic protein profile evaluation and is crucial when a large number of complex electrophoretic profiles needs to be compared and classified.

Subjects

Computational Biology/methods, Proteome/metabolism, Saliva/metabolism, Salivary Proteins and Peptides/metabolism, Software, Celiac Disease/metabolism, Protein Databases, Humans, Inflammation/metabolism, Machine Learning, Molecular Weight, Phenotype, Proteome/classification

18.

New Targets for Zika Virus Determined by Human-Viral Interactomic: A Bioinformatics Approach.

Esteves, Eduardo; Rosa, Nuno; Correia, Maria José; Arrais, Joel P; Barros, Marlene.

Biomed Res Int ; 2017: 1734151, 2017.

Article in English | MEDLINE | ID: mdl-29379794

ABSTRACT

Identifying ZIKV factors interfering with human host pathways represents a major challenge in understanding ZIKV tropism and pathogenesis. The integration of proteomic, gene expression and Protein-Protein Interactions (PPIs) established between ZIKV and human host proteins predicted by the OralInt algorithm identified 1898 interactions with medium or high score (≥0.7). Targets implicated in vesicular traffic and docking were identified. New receptors involved in endocytosis pathways as ZIKV entry targets, using both clathrin-dependent (17 receptors) and independent (10 receptors) pathways, are described. New targets used by the ZIKV to undermine the host's antiviral immune response are proposed based on predicted interactions established between the virus and host cell receptors and/or proteins with an effector or signaling role in the immune response such as IFN receptors and TLR. Complement and cytokines are proposed as extracellular potential interacting partners of the secreted form of NS1 ZIKV protein. Altogether, in this article, 18 new human targets for structural and nonstructural ZIKV proteins are proposed. These results are of great relevance for the understanding of viral pathogenesis and consequently the development of preventive (vaccines) and therapeutic targets for ZIKV infection management.
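The score cutoff mentioned in this abstract (interactions retained at ≥0.7) can be illustrated with a minimal sketch. It is not the OralInt algorithm, and the record type, protein names, and scores below are hypothetical placeholders; only the 0.7 threshold comes from the abstract.

```python
from typing import Iterable, List, NamedTuple


class PredictedPPI(NamedTuple):
    """One predicted virus-host interaction (hypothetical record layout)."""
    viral_protein: str
    human_protein: str
    score: float


# Cutoff taken from the abstract: medium or high score means >= 0.7.
SCORE_CUTOFF = 0.7


def keep_confident(ppis: Iterable[PredictedPPI],
                   cutoff: float = SCORE_CUTOFF) -> List[PredictedPPI]:
    """Return interactions scoring at or above the cutoff, best first."""
    kept = [p for p in ppis if p.score >= cutoff]
    return sorted(kept, key=lambda p: p.score, reverse=True)


# Placeholder predictions, not data from the study.
example = [
    PredictedPPI("VIRAL_1", "HUMAN_A", 0.92),
    PredictedPPI("VIRAL_2", "HUMAN_B", 0.41),
    PredictedPPI("VIRAL_3", "HUMAN_C", 0.70),
]

confident = keep_confident(example)
```

The comparison is inclusive, so an interaction scoring exactly 0.7 is retained, matching the "≥0.7" wording.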

Subjects

Computational Biology, Immunological Models, Viral Proteins/immunology, Zika Virus Infection/immunology, Zika Virus/immunology, Female, Humans, Male, Viral Vaccines/immunology, Zika Virus Infection/pathology, Zika Virus Infection/prevention & control

19.

Computational Discovery of Putative Leads for Drug Repositioning through Drug-Target Interaction Prediction.

Coelho, Edgar D; Arrais, Joel P; Oliveira, José Luís.

PLoS Comput Biol ; 12(11): e1005219, 2016 Nov.

Article in English | MEDLINE | ID: mdl-27893735

ABSTRACT

De novo experimental drug discovery is an expensive and time-consuming task. It requires the identification of drug-target interactions (DTIs) towards targets of biological interest, either to inhibit or enhance a specific molecular function. Dedicated computational models for protein simulation and DTI prediction are crucial to speed up DTI identification and reduce its associated costs. In this paper we present a computational pipeline that enables the discovery of putative leads for drug repositioning that can be applied to any microbial proteome, as long as the interactome of interest is at least partially known. Network metrics calculated for the interactome of the bacterial organism of interest were used to identify putative drug-targets. Then, a random forest classification model for DTI prediction was constructed using known DTI data from publicly available databases, resulting in an area under the ROC curve of 0.91 for classification of out-of-sample data. A drug-target network was created by combining 3,081 unique ligands and the expected ten best drug targets. This network was used to predict new DTIs and to calculate the probability of the positive class, allowing the scoring of the predicted instances. Molecular docking experiments were performed on the best scoring DTI pairs and the results were compared with those of the same ligands with their original targets. The results obtained suggest that the proposed pipeline can be used in the identification of new leads for drug repositioning. The proposed classification model is available at http://bioinformatics.ua.pt/software/dtipred/.
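To make the reported metric concrete, the sketch below computes an area under the ROC curve from positive-class probabilities in plain Python. It uses the rank interpretation of AUC (the probability that a randomly chosen true interaction is scored above a randomly chosen non-interaction). The function name, labels, and scores are illustrative assumptions and do not reproduce the paper's model or data.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical held-out drug-target pairs: 1 = known interaction, 0 = none.
labels = [1, 1, 1, 0, 0, 0]
# Hypothetical positive-class probabilities from some classifier.
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]

auc = roc_auc(labels, scores)
```

Here 8 of the 9 positive-negative pairs are ordered correctly, so the AUC is 8/9. An AUC of 0.91, as reported in the abstract, means roughly 91% of such pairs would be ordered correctly on held-out data. The pairwise loop is quadratic and is meant for illustration only.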

Subjects

Anti-Bacterial Agents/chemistry, Bacterial Proteins/chemistry, Drug Discovery/methods, Drug Repositioning/methods, Chemical Models, Protein Interaction Mapping/methods, Computer Simulation, Preclinical Drug Evaluation/methods

20.

Computational methodology for predicting the landscape of the human-microbial interactome region level influence.

Coelho, Edgar D; Santiago, André M; Arrais, Joel P; Oliveira, José Luís.

J Bioinform Comput Biol ; 13(5): 1550023, 2015 Oct.

Article in English | MEDLINE | ID: mdl-26388143

ABSTRACT

Microbial communities thrive in close association among themselves and with the host, establishing protein-protein interactions (PPIs) with the latter, and thus being able to benefit (positively impact) or disturb (negatively impact) biological events in the host. Despite major collaborative efforts to sequence the Human microbiome, their impact is still poorly understood. We propose a computational methodology to predict the impact of microbial proteins on human biological events, taking into account the abundance of each microbial protein and its relation to all other microbial and human proteins. This alternative methodology is centered on an improved impact estimation algorithm that integrates PPIs between human and microbial proteins with Reactome pathway data. This methodology was applied to study the impact of 24 microbial phyla on different cellular events, within 10 different human microbiomes. The results obtained confirm findings already described in the literature and explore new ones. We believe the Human microbiome can no longer be ignored as not only is there enough evidence correlating microbiome alterations and disease states, but also the return to healthy states once these alterations are reversed.

Subjects

Algorithms, Computational Biology/methods, Microbiota, Protein Interaction Mapping/statistics & numerical data, Computing Methodologies, Protein Databases, Female, Genetic Variation, Host-Pathogen Interactions, Humans, Male, Metagenomics/statistics & numerical data, Organ Specificity, Phylogeny
