1.
Mol Inform ; 42(3): e2200232, 2023 03.
Article in English | MEDLINE | ID: mdl-36529710

ABSTRACT

Maximum common substructures (MCS) have received a lot of attention in the chemoinformatics community. They are typically used as a similarity measure between molecules, showing high predictive performance when used in classification tasks, while being easily explainable substructures. In the present work, we applied the Pairwise Maximum Common Subgraph Feature Generation (PMCSFG) algorithm to automatically detect toxicophores (structural alerts) and to compute fingerprints based on MCS. We present a comparison between our MCS-based fingerprints and 12 well-known chemical fingerprints when used as features in machine learning models. We provide an experimental evaluation and discuss the usefulness of the different methods on mutagenicity data. The features generated by the MCS method have a state-of-the-art performance when predicting mutagenicity, while they are more interpretable than the traditional chemical fingerprints.


Subjects
Algorithms , Mutagens , Mutagens/chemistry , Mutagenesis , Machine Learning
2.
Bioinformatics ; 37(10): 1360-1366, 2021 Jun 16.
Article in English | MEDLINE | ID: mdl-33444437

ABSTRACT

MOTIVATION: Population-level genetic variation enables competitiveness and niche specialization in microbial communities. Despite the difficulty in culturing many microbes from an environment, we can still study these communities by isolating and sequencing DNA directly from an environment (metagenomics). Recovering the genomic sequences of all isoforms of a given gene across all organisms in a metagenomic sample would aid evolutionary and ecological insights into microbial ecosystems with potential benefits for medicine and biotechnology. A significant obstacle to this goal arises from the lack of a computationally tractable solution that can recover these sequences from sequenced read fragments. This poses a problem analogous to reconstructing the two sequences that make up the genome of a diploid organism (i.e. haplotypes) but for an unknown number of individuals and haplotypes. RESULTS: The problem of single individual haplotyping was first formalized by Lancia et al. in 2001. Now, nearly two decades later, we discuss the complexity of 'haplotyping' metagenomic samples, with a new formalization of Lancia et al.'s data structure that allows us to effectively extend the single individual haplotype problem to microbial communities. This work describes and formalizes the problem of recovering genes (and other genomic subsequences) from all individuals within a complex community sample, which we term the metagenomic individual haplotyping problem. We also provide software implementations for a pairwise single nucleotide variant (SNV) co-occurrence matrix and greedy graph traversal algorithm. AVAILABILITY AND IMPLEMENTATION: Our reference implementation of the described pairwise SNV matrix (Hansel) and greedy haplotype path traversal algorithm (Gretel) is open source, MIT licensed and freely available online at github.com/samstudio8/hansel and github.com/samstudio8/gretel, respectively.
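
The two components named in this abstract, a pairwise SNV co-occurrence matrix and a greedy path traversal, can be illustrated with a minimal Python sketch. This is not the Hansel or Gretel API: the function names, the dictionary representation of reads, and the tie-breaking rule are assumptions made purely for illustration, and the sketch recovers only a single path.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(reads):
    """Count how often allele a at site i and allele b at site j share a read.

    Each read is a dict {site_index: allele} (a hypothetical representation).
    """
    counts = Counter()
    for read in reads:
        # sorted() guarantees i < j for every pair taken from one read.
        for (i, a), (j, b) in combinations(sorted(read.items()), 2):
            counts[(i, a, j, b)] += 1
    return counts


def greedy_haplotype(reads, n_sites):
    """Walk the SNV sites left to right, keeping the best-supported allele."""
    counts = build_cooccurrence(reads)
    path = []
    for site in range(n_sites):
        alleles = sorted({r[site] for r in reads if site in r})
        if not alleles:
            raise ValueError(f"no read covers site {site}")

        def support(allele):
            if not path:
                # First site: fall back to plain allele frequency.
                return sum(1 for r in reads if r.get(site) == allele)
            # Later sites: evidence shared with every allele chosen so far.
            return sum(counts[(i, a, site, allele)] for i, a in enumerate(path))

        # Ties go to the alphabetically first allele (an arbitrary choice).
        path.append(max(alleles, key=support))
    return "".join(path)


reads = [
    {0: "A", 1: "C"},
    {0: "A", 1: "C"},
    {1: "C", 2: "G"},
    {0: "T", 1: "G"},
    {1: "G", 2: "T"},
]
print(greedy_haplotype(reads, 3))  # ACG
```

The toy reads mix two haplotypes (ACG and TGT); the greedy walk follows the better-supported one because each choice is scored only against alleles already on the path.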

3.
Science ; 355(6327): 820-826, 2017 02 24.
Article in English | MEDLINE | ID: mdl-28219971

ABSTRACT

It is still not possible to predict whether a given molecule will have a perceived odor or what olfactory percept it will produce. We therefore organized the crowd-sourced DREAM Olfaction Prediction Challenge. Using a large olfactory psychophysical data set, teams developed machine-learning algorithms to predict sensory attributes of molecules based on their chemoinformatic features. The resulting models accurately predicted odor intensity and pleasantness and also successfully predicted 8 among 19 rated semantic descriptors ("garlic," "fish," "sweet," "fruit," "burnt," "spices," "flower," and "sour"). Regularized linear models performed nearly as well as random forest-based ones, with a predictive accuracy that closely approaches a key theoretical limit. These models help to predict the perceptual qualities of virtually any molecule with high accuracy and also reverse-engineer the smell of a molecule.


Subjects
Odorants , Olfactory Perception , Smell , Adult , Datasets as Topic , Humans , Male , Biological Models
4.
Expert Rev Proteomics ; 13(5): 495-511, 2016 05.
Article in English | MEDLINE | ID: mdl-27031651

ABSTRACT

Now that the technical capabilities for mass spectrometry-based biomedical proteomics experiments have expanded, a stronger focus on experimental design is crucial. Because ignoring good design leads to an unprecedented rate of false discoveries that would poison our results, more and more tools are being developed to help researchers design proteomic experiments. In this review, we apply statistical thinking to the entire proteomics workflow for biomarker discovery and validation, and relate the considerations that should be made at the level of hypothesis building, technology selection, experimental design and the optimization of the experimental parameters.


Subjects
Mass Spectrometry/methods , Proteomics/methods , Research Design , Humans , Proteomics/statistics & numerical data , Proteomics/trends
5.
J R Soc Interface ; 12(104): 20141289, 2015 Mar 06.
Article in English | MEDLINE | ID: mdl-25652463

ABSTRACT

There is an urgent need to make drug discovery cheaper and faster. This will enable the development of treatments for diseases currently neglected for economic reasons, such as tropical and orphan diseases, and generally increase the supply of new drugs. Here, we report the Robot Scientist 'Eve' designed to make drug discovery more economical. A Robot Scientist is a laboratory automation system that uses artificial intelligence (AI) techniques to discover scientific knowledge through cycles of experimentation. Eve integrates and automates library-screening, hit-confirmation, and lead generation through cycles of quantitative structure activity relationship learning and testing. Using econometric modelling we demonstrate that the use of AI to select compounds economically outperforms standard drug screening. For further efficiency Eve uses a standardized form of assay to compute Boolean functions of compound properties. These assays can be quickly and cheaply engineered using synthetic biology, enabling more targets to be assayed for a given budget. Eve has repositioned several drugs against specific targets in parasites that cause tropical diseases. One validated discovery is that the anti-cancer compound TNP-470 is a potent inhibitor of dihydrofolate reductase from the malaria-causing parasite Plasmodium vivax.


Subjects
Drug Design , Drug Repositioning , Rare Diseases/drug therapy , Pharmaceutical Technology/trends , Algorithms , Antineoplastic Agents/therapeutic use , Automation , Preclinical Drug Evaluation , Humans , Vivax Malaria/drug therapy , Statistical Models , Plasmodium vivax/drug effects , Quantitative Structure-Activity Relationship , Regression Analysis , Reproducibility of Results , Software , Tropical Medicine
6.
Proteomics ; 14(4-5): 353-66, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24323524

ABSTRACT

Machine learning is a subdiscipline within artificial intelligence that focuses on algorithms that allow computers to learn to solve a (complex) problem from existing data. This ability can be used to generate a solution to a particularly intractable problem, provided that enough data are available to train and subsequently evaluate an algorithm. Since MS-based proteomics has no shortage of complex problems, and since public data are becoming available in ever-growing amounts, machine learning is fast becoming a very popular tool in the field. We therefore present an overview of the different applications of machine learning in proteomics that together cover nearly the entire wet- and dry-lab workflow, and that address key bottlenecks in experiment planning and design, as well as in data processing and analysis.


Subjects
Artificial Intelligence , Computational Biology , Proteomics/methods , Reference Standards , Research Design
7.
J Biomed Semantics ; 4 Suppl 1: S7, 2013 Apr 15.
Article in English | MEDLINE | ID: mdl-23734675

ABSTRACT

The theory of probability is widely used in biomedical research for data analysis and modelling. In previous work the probabilities of the research hypotheses have been recorded as experimental metadata. The ontology HELO is designed to support probabilistic reasoning, and provides semantic descriptors for reporting on research that involves operations with probabilities. HELO explicitly links research statements such as hypotheses, models, laws, conclusions, etc. to the associated probabilities of these statements being true. HELO enables the explicit semantic representation and accurate recording of probabilities in hypotheses, as well as the inference methods used to generate and update those hypotheses. We demonstrate the utility of HELO on three worked examples: changes in the probability of the hypothesis that sirtuins regulate human life span; changes in the probability of hypotheses about gene functions in the S. cerevisiae aromatic amino acid pathway; and the use of active learning in drug design (quantitative structure activity relation learning), where a strategy for the selection of compounds with the highest probability of improving on the best known compound was used. HELO is open source and available at https://github.com/larisa-soldatova/HELO.
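
The idea of a hypothesis probability that changes as evidence arrives can be made concrete with a small Python sketch. The abstract does not say which inference methods HELO records, so the use of Bayes' rule here is an assumption, as are the function name and every number in the example.

```python
def bayes_update(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Return P(H | E) from P(H), P(E | H) and P(E | not H) via Bayes' rule."""
    if not 0.0 <= prior <= 1.0:
        raise ValueError("prior must lie in [0, 1]")
    numerator = p_evidence_given_h * prior
    evidence = numerator + p_evidence_given_not_h * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("evidence has zero probability under both hypotheses")
    return numerator / evidence


# Hypothetical numbers: a hypothesis starts at 0.5 and is updated twice with
# observations that are four times likelier if the hypothesis is true.
p = 0.5
for _ in range(2):
    p = bayes_update(p, p_evidence_given_h=0.8, p_evidence_given_not_h=0.2)
    print(round(p, 4))
```

Recording the prior, the likelihoods, and the posterior for each step is the kind of metadata the abstract describes linking to a research statement.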

8.
Bioinformatics ; 29(15): 1913-4, 2013 Aug 01.
Article in English | MEDLINE | ID: mdl-23709496

ABSTRACT

SUMMARY: We present PIUS, a tool that identifies peptides from tandem mass spectrometry data by analyzing the six-frame translation of a complete genome. It differs from earlier studies that have performed such a genomic search in two ways: (i) it considers a larger search space and (ii) it is designed for natural peptide identification rather than proteomics. Unlike other peptidomics tools designed for genome-wide searches, PIUS does not limit the analysis to a set of sequences that match a list of de novo reconstructions. AVAILABILITY: Source code, executables and a detailed technical report are freely available at http://dtai.cs.kuleuven.be/ml/systems/pius. CONTACT: eduardo.costa@cs.kuleuven.be SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Peptides/chemistry , Software , Tandem Mass Spectrometry , Algorithms , Animals , Cell Line , Protein Databases , Genome , Genomics , Mice , Peptides/analysis , Proteomics/methods , Protein Sequence Analysis
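
The six-frame translation that record 8 builds on can be sketched in a few lines of Python. This is a generic illustration using the standard genetic code, not code from PIUS; the function names and the "X" placeholder for codons containing non-ACGT characters are assumptions.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Translate three reading frames on each strand ('*' marks a stop)."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


print(six_frame_translation("ATGGCCTAA"))
```

A genome-wide peptide search would then match spectra against peptides drawn from all six frames, which is why the search space is larger than that of a protein database.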