Search | BVS Regional Portal (test)

On the feasibility of deep learning applications using raw mass spectrometry data.

Cadow, Joris; Manica, Matteo; Mathis, Roland; Guo, Tiannan; Aebersold, Ruedi; Rodríguez Martínez, María.

Bioinformatics ; 37(Suppl_1): i245-i253, 2021 Jul 12.

Article in English | MEDLINE | ID: mdl-34252933

ABSTRACT

SUMMARY: In recent years, SWATH-MS has become the proteomic method of choice for data-independent acquisition, as it enables high proteome coverage, accuracy and reproducibility. However, data analysis is convoluted and requires prior information and expert curation. Furthermore, as quantification is limited to a small set of peptides, potentially important biological information may be discarded. Here we demonstrate that deep learning can be used to learn discriminative features directly from raw MS data, thereby eliminating the need for elaborate data processing pipelines. Using transfer learning to overcome sample sparsity, we exploit a collection of publicly available deep learning models already trained for the task of natural image classification. These models are used to produce feature vectors from each mass spectrometry (MS) raw image, which are later used as input for a classifier trained to distinguish tumor from normal prostate biopsies. Although the deep learning models were originally trained for a completely different classification task and no additional fine-tuning is performed on them, we achieve a classification performance of 0.876 AUC. We investigate different types of image preprocessing and encoding. We also investigate whether the inclusion of the secondary MS2 spectra improves the classification performance. Across all tested models, we use standard protein expression vectors as gold standards. Even with our naïve implementation, our results suggest that the application of deep learning and transfer learning techniques might pave the way to the broader usage of raw mass spectrometry data in real-time diagnosis. AVAILABILITY AND IMPLEMENTATION: The open source code used to generate the results from MS images is available on GitHub: https://ibm.biz/mstransc. The raw MS data underlying this article cannot be shared publicly, to protect the privacy of the individuals who participated in the study. Processed data including the MS images, their encodings, classification labels and results can be accessed at the following link: https://ibm.box.com/v/mstc-supplementary. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
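The abstract above reports performance as an AUC. As a reading aid, the following minimal sketch shows one standard way to compute that metric: the fraction of (tumor, normal) pairs in which the tumor sample receives the higher classifier score. The function name `roc_auc` and the toy labels and scores are assumptions made for illustration; they are not taken from the paper or its code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for the positive class (e.g. tumor), 0 for the negative class.
    scores: classifier outputs, higher meaning "more likely positive".
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: three tumor and three normal samples.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

In this toy case 8 of the 9 tumor/normal pairs are ordered correctly, so the AUC is 8/9.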

Subjects

Deep Learning; Feasibility Studies; Humans; Male; Mass Spectrometry; Neural Networks, Computer; Proteomics; Reproducibility of Results

PaccMann^RL: De novo generation of hit-like anticancer molecules from transcriptomic data via reinforcement learning.

Born, Jannis; Manica, Matteo; Oskooei, Ali; Cadow, Joris; Markert, Greta; Rodríguez Martínez, María.

iScience ; 24(4): 102269, 2021 Apr 23.

Article in English | MEDLINE | ID: mdl-33851095

ABSTRACT

With the advent of deep generative models in computational chemistry, in-silico drug design is undergoing an unprecedented transformation. Although deep learning approaches have shown potential in generating compounds with desired chemical properties, they disregard the cellular environment of target diseases. Bridging systems biology and drug design, we present a reinforcement learning method for de novo molecular design from gene expression profiles. We construct a hybrid Variational Autoencoder that tailors molecules to target-specific transcriptomic profiles, using an anticancer drug sensitivity prediction model (PaccMann) as reward function. Without incorporating information about anticancer drugs, the molecule generation is biased toward compounds with high predicted efficacy against cell lines or cancer types. The generation can be further refined by subsidiary constraints such as toxicity. Our cancer-type-specific candidate drugs are similar to cancer drugs in drug-likeness, synthesizability, and solubility and frequently exhibit the highest structural similarity to compounds with known efficacy against these cancer types.

COSIFER: a Python package for the consensus inference of molecular interaction networks.

Manica, Matteo; Bunne, Charlotte; Mathis, Roland; Cadow, Joris; Ahsen, Mehmet Eren; Stolovitzky, Gustavo A; Martínez, María Rodríguez.

Bioinformatics ; 37(14): 2070-2072, 2021 Aug 04.

Article in English | MEDLINE | ID: mdl-33241320

ABSTRACT

SUMMARY: The advent of high-throughput technologies has provided researchers with measurements of thousands of molecular entities and enabled the investigation of the internal regulatory apparatus of the cell. However, network inference from high-throughput data is far from being a solved problem. While a plethora of different inference methods have been proposed, they often lead to non-overlapping predictions, and many of them lack user-friendly implementations to enable their broad utilization. Here, we present Consensus Interaction Network Inference Service (COSIFER), a package and a companion web-based platform to infer molecular networks from expression data using state-of-the-art consensus approaches. COSIFER includes a selection of state-of-the-art methodologies for network inference and different consensus strategies to integrate the predictions of individual methods and generate robust networks. AVAILABILITY AND IMPLEMENTATION: COSIFER Python source code is available at https://github.com/PhosphorylatedRabbits/cosifer. The web service is accessible at https://ibm.biz/cosifer-aas. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
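To make the idea of a consensus strategy concrete, the sketch below averages the rank each inference method assigns to every candidate edge and orders edges by that mean rank. This is a generic rank-averaging illustration and does not use the COSIFER API; the function names, the three "methods" and their scores are all hypothetical, and the sketch assumes every method scores the same set of edges.

```python
def rank_edges(scores):
    """Map each edge to its rank under one method (1 = highest score)."""
    ordered = sorted(scores, key=lambda edge: -scores[edge])
    return {edge: position + 1 for position, edge in enumerate(ordered)}


def consensus_by_mean_rank(method_scores):
    """Combine several edge scorings into one ranking by averaging ranks.

    method_scores: list of dicts, each mapping an edge (a pair of node
    names) to that method's confidence score. All dicts are assumed to
    cover the same edges.
    Returns a list of (edge, mean_rank), best edge first.
    """
    per_method_ranks = [rank_edges(scores) for scores in method_scores]
    edges = per_method_ranks[0].keys()
    mean_rank = {
        edge: sum(ranks[edge] for ranks in per_method_ranks) / len(per_method_ranks)
        for edge in edges
    }
    return sorted(mean_rank.items(), key=lambda item: item[1])


# Hypothetical outputs of three inference methods on a three-gene network.
method_a = {("A", "B"): 0.9, ("A", "C"): 0.2, ("B", "C"): 0.5}
method_b = {("A", "B"): 0.7, ("A", "C"): 0.6, ("B", "C"): 0.1}
method_c = {("A", "B"): 0.8, ("A", "C"): 0.1, ("B", "C"): 0.4}

consensus = consensus_by_mean_rank([method_a, method_b, method_c])
for edge, rank in consensus:
    print(edge, round(rank, 2))
```

Working on ranks rather than raw scores sidesteps the fact that different methods report confidence on different scales, which is one reason rank aggregation is a common choice for this kind of integration.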

Subjects

Software; Consensus

PaccMann: a web service for interpretable anticancer compound sensitivity prediction.

Cadow, Joris; Born, Jannis; Manica, Matteo; Oskooei, Ali; Rodríguez Martínez, María.

Nucleic Acids Res ; 48(W1): W502-W508, 2020 Jul 02.

Article in English | MEDLINE | ID: mdl-32402082

ABSTRACT

The identification of new targeted and personalized therapies for cancer requires the fast and accurate assessment of the drug efficacy of potential compounds against a particular biomolecular sample. It has been suggested that the integration of complementary sources of information might strengthen the accuracy of a drug efficacy prediction model. Here, we present a web-based platform for the Prediction of AntiCancer Compound sensitivity with Multimodal Attention-based Neural Networks (PaccMann). PaccMann is trained on public transcriptomic cell line profiles, compound structure information and drug sensitivity screenings, and outperforms state-of-the-art methods on anticancer drug sensitivity prediction. On the open-access web service (https://ibm.biz/paccmann-aas), users can select a known drug compound or design their own compound structure in an interactive editor, perform in-silico drug testing and investigate compound efficacy on publicly available or user-provided transcriptomic profiles. PaccMann leverages methods for model interpretability and outputs confidence scores as well as attention heatmaps that highlight the genes and chemical sub-structures that were most important for a prediction, thereby facilitating the understanding of the model's decision making and the biochemical processes involved. We hope to serve the community with a toolbox for fast and efficient validation in drug repositioning or lead compound identification regimes.

Subjects

Antineoplastic Agents/pharmacology; Drug Repositioning; Software; Antineoplastic Agents/chemistry; Computer Simulation; Gene Expression Profiling; Internet; Neural Networks, Computer; Sirolimus/analogs & derivatives; Sirolimus/pharmacology

PIMKL: Pathway-Induced Multiple Kernel Learning.

Manica, Matteo; Cadow, Joris; Mathis, Roland; Rodríguez Martínez, María.

NPJ Syst Biol Appl ; 5: 8, 2019.

Article in English | MEDLINE | ID: mdl-30854223

ABSTRACT

Reliable identification of molecular biomarkers is essential for accurate patient stratification. While state-of-the-art machine learning approaches for sample classification continue to push boundaries in terms of performance, most of these methods are not able to integrate different data types and lack generalization power, limiting their application in a clinical setting. Furthermore, many methods behave as black boxes, and we have very little understanding about the mechanisms that lead to the prediction. While opaqueness concerning machine behavior might not be a problem in deterministic domains, in health care, providing explanations about the molecular factors and phenotypes that are driving the classification is crucial to build trust in the performance of the predictive system. We propose Pathway-Induced Multiple Kernel Learning (PIMKL), a methodology to reliably classify samples that can also help gain insights into the molecular mechanisms that underlie the classification. PIMKL exploits prior knowledge in the form of a molecular interaction network and annotated gene sets, by optimizing a mixture of pathway-induced kernels using a Multiple Kernel Learning (MKL) algorithm, an approach that has demonstrated excellent performance in different machine learning applications. After optimizing the combination of kernels to predict a specific phenotype, the model provides a stable molecular signature that can be interpreted in the light of the ingested prior knowledge and that can be used in transfer learning tasks.
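The core operation described above, a weighted mixture of kernels in which each kernel is tied to one pathway, can be sketched as follows. This is a deliberately simplified illustration and not the PIMKL implementation: each kernel here is a plain linear kernel restricted to the genes of one gene set (no interaction network is used), the mixture weights are fixed by hand rather than optimized by an MKL algorithm, and all names and numbers are hypothetical.

```python
def linear_kernel_on_genes(samples, gene_indices):
    """Gram matrix of dot products using only the selected gene columns."""
    sub = [[sample[i] for i in gene_indices] for sample in samples]
    return [[sum(a * b for a, b in zip(u, v)) for v in sub] for u in sub]


def combine_kernels(kernels, weights):
    """Weighted sum of equally sized kernel matrices."""
    n = len(kernels[0])
    return [
        [sum(w * k[i][j] for w, k in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical expression data: 2 samples x 3 genes.
samples = [
    [1.0, 0.0, 2.0],
    [0.0, 1.0, 1.0],
]
# Hypothetical gene sets, given as column indices into the samples.
pathways = {"pathway_1": [0, 1], "pathway_2": [2]}
# Hand-picked weights; an MKL algorithm would learn these from labels.
weights = [0.25, 0.75]

kernels = [linear_kernel_on_genes(samples, idx) for idx in pathways.values()]
mixture = combine_kernels(kernels, weights)
print(mixture)
```

Because the learned weights attach to named pathways, inspecting them after training is what would make a model of this kind interpretable as a molecular signature.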

Subjects

Biomarkers, Tumor/classification; Computational Biology/methods; Algorithms; Humans; Machine Learning; Pattern Recognition, Automated/methods; Software
