1.
BMC Genomics ; 17: 205, 2016 Mar 08.
Article in English | MEDLINE | ID: mdl-26956490

ABSTRACT

BACKGROUND: Chemical bioavailability is an important dose metric in environmental risk assessment. Although many approaches have been used to evaluate bioavailability, none is free from limitations. Previously, we developed a new genomics-based approach that integrated microarray technology and regression modeling for predicting bioavailability (tissue residue) of explosive compounds in exposed earthworms. In the present study, we further compared 18 different regression models and performed variable selection simultaneously with parameter estimation. RESULTS: This refined approach was applied to both previously collected and newly acquired earthworm microarray gene expression datasets for three explosive compounds. Our results demonstrate that a prediction accuracy of R² = 0.71-0.82 was achievable at a relatively low model complexity with as few as 3-10 predictor genes per model. These results are much more encouraging than our previous ones. CONCLUSION: This study has demonstrated that our approach is promising for bioavailability measurement, which warrants further studies of mixed contamination scenarios in field settings.
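The abstract does not name the 18 regression models it compares. As a minimal sketch of what "variable selection simultaneously with parameter estimation" can look like, the block below assumes an L1-penalized (lasso) linear regression solved by cyclic coordinate descent; that choice, the toy data, and every function name are assumptions made for illustration and are not taken from the paper. The penalty drives some coefficients to exactly zero, so the surviving predictors are the selected variables, and R² is then computed on the fitted values.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within the threshold becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, penalty, sweeps=100):
    """Minimize (1/2n) * sum((y - Xw)^2) + penalty * sum(|w|).

    X is a list of sample rows and y a list of targets. Both are assumed
    to be centered, so no intercept is fitted.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            rho = 0.0   # correlation of predictor j with the residual that ignores j
            norm = 0.0  # squared length of predictor j
            for i in range(n):
                partial = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                norm += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, penalty) / (norm / n) if norm > 0 else 0.0
    return w


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - q) ** 2 for t, q in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy, centered data (made up): 6 samples, 3 candidate predictors.
X = [
    [-2.0, 1.0, 1.0],
    [-1.0, -1.0, 0.0],
    [0.0, 1.0, -1.0],
    [0.0, -1.0, -1.0],
    [1.0, 1.0, 0.0],
    [2.0, -1.0, 1.0],
]
y = [3.0 * row[0] for row in X]  # only the first predictor carries signal

weights = lasso_coordinate_descent(X, y, penalty=0.5)
selected = [j for j, w in enumerate(weights) if w != 0.0]
predictions = [sum(w * x for w, x in zip(weights, row)) for row in X]
fit_quality = r_squared(y, predictions)
print(selected, weights, fit_quality)
```

On this toy data the fit keeps only the first predictor and shrinks its coefficient slightly below the true value of 3, which is the usual trade-off of an L1 penalty: fewer predictors at the cost of some bias.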


Subjects
Explosive Agents/pharmacokinetics, Gene Expression Profiling/methods, Oligochaeta/genetics, Soil Pollutants/pharmacokinetics, Animals, Azocines/pharmacokinetics, Biological Availability, Oligochaeta/metabolism, Oligonucleotide Array Sequence Analysis, Regression Analysis, Triazines/pharmacokinetics, Trinitrotoluene/pharmacokinetics
2.
Mol Inform ; 33(9): 627-40, 2014 Sep.
Article in English | MEDLINE | ID: mdl-27486081

ABSTRACT

Glycogen synthase kinase-3 (GSK-3) is a multifunctional serine/threonine protein kinase which regulates a wide range of cellular processes, involving various signalling pathways. GSK-3β has emerged as an important therapeutic target for diabetes and Alzheimer's disease. To identify structurally novel GSK-3β inhibitors, we performed virtual screening by implementing a combined ligand-based/structure-based approach, which included quantitative structure-activity relationship (QSAR) analysis and docking prediction. To integrate and analyze complex data sets from multiple experimental sources, we drafted and validated a hierarchical QSAR method, which adopts a two-level structure to take data heterogeneity into account. A collection of 728 GSK-3 inhibitors with diverse structural scaffolds was obtained from published papers that used different experimental assay protocols. Support vector machines and random forests were implemented with wrapper-based feature selection algorithms to construct predictive learning models. The best models for each single group of compounds were then used to build the final hierarchical QSAR model, with an overall R² of 0.752 for the 141 compounds in the test set. The compounds obtained from the virtual screening experiment were tested for GSK-3β inhibition. The bioassay results confirmed that 2 hit compounds are indeed GSK-3β inhibitors exhibiting sub-micromolar inhibitory activity, and therefore validated our combined ligand-based/structure-based approach as effective for virtual screening experiments.
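The abstract gives no detail on how the two levels are combined, so the following is only one plausible reading, sketched under stated assumptions: level one fits a separate model per group of compounds (here grouped by a hypothetical assay label), and level two routes each test compound to its group's model and scores the pooled predictions with a single overall R². The one-descriptor least-squares models, the group names, and the numbers are all invented for illustration; the paper itself used support vector machines and random forests.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept on one descriptor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - q) ** 2 for t, q in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# (group, descriptor, activity) triples; values are made up.
train = [
    ("assay_A", 0.0, 1.0), ("assay_A", 1.0, 3.0), ("assay_A", 2.0, 5.0),
    ("assay_B", 0.0, 5.0), ("assay_B", 1.0, 4.0), ("assay_B", 2.0, 3.0),
]
test = [("assay_A", 3.0, 7.0), ("assay_B", 3.0, 2.0)]

# Level 1: one model per group, so each assay protocol gets its own fit.
by_group = defaultdict(list)
for group, x, y in train:
    by_group[group].append((x, y))
group_models = {
    group: fit_line([x for x, _ in rows], [y for _, y in rows])
    for group, rows in by_group.items()
}

# Level 2: route each test compound to its group's model, then pool the predictions.
observed = [y for _, _, y in test]
hierarchical_pred = [
    group_models[g][0] * x + group_models[g][1] for g, x, _ in test
]
hierarchical_r2 = r_squared(observed, hierarchical_pred)

# Baseline for comparison: a single model that ignores the grouping.
slope, intercept = fit_line([x for _, x, _ in train], [y for _, _, y in train])
pooled_r2 = r_squared(observed, [slope * x + intercept for _, x, _ in test])
print(hierarchical_r2, pooled_r2)
```

The toy groups have opposite trends on purpose, so the single pooled model explains none of the test variance while the per-group models recover it; real heterogeneity between assay protocols would be far less clean.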

3.
BMC Bioinformatics ; 14 Suppl 14: S16, 2013.
Article in English | MEDLINE | ID: mdl-24267824

ABSTRACT

BACKGROUND: In drug discovery and development, it is crucial to determine which conformers (instances) of a given molecule are responsible for its observed biological activity and at the same time to recognize the most representative subset of features (molecular descriptors). Due to experimental difficulty in obtaining the bioactive conformers, computational approaches such as machine learning techniques are much needed. Multiple Instance Learning (MIL) is a machine learning method capable of tackling this type of problem. In the MIL framework, each instance is represented as a feature vector, which usually resides in a high-dimensional feature space. The high dimensionality may provide significant information for learning tasks, but at the same time it may also include a large number of irrelevant or redundant features that might negatively affect learning performance. Reducing the dimensionality of data will hence facilitate the classification task and improve the interpretability of the model. RESULTS: In this work we propose a novel approach, named multiple instance learning via joint instance and feature selection. The iterative joint instance and feature selection is achieved using an instance-based feature mapping and 1-norm regularized optimization. The proposed approach was tested on four biological activity datasets. CONCLUSIONS: The empirical results demonstrate that the selected instances (prototype conformers) and features (pharmacophore fingerprints) have competitive discriminative power and the convergence of the selection process is also fast.
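The abstract mentions an instance-based feature mapping without giving its formula. A minimal sketch, assuming a mapping in which every training instance acts as a prototype and a bag is represented by its best similarity to each prototype, is shown below. The Gaussian similarity, the `sigma` parameter, and the two-dimensional toy descriptors are assumptions for illustration. After such a mapping, a 1-norm regularized classifier over the embedded vector would zero out most coordinates, which amounts to selecting a few prototype instances.

```python
import math


def similarity(instance, prototype, sigma=1.0):
    """Gaussian similarity between two descriptor vectors (assumed form)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(instance, prototype))
    return math.exp(-squared_distance / sigma ** 2)


def embed_bag(bag, prototypes, sigma=1.0):
    """Map a bag of instances to one fixed-length vector.

    Coordinate k is the similarity of the bag's closest instance to prototype k.
    """
    return [
        max(similarity(instance, prototype, sigma) for instance in bag)
        for prototype in prototypes
    ]


# Toy bags: each molecule is a bag of conformers, each conformer a 2-D descriptor.
bags = {
    "mol_1": [(0.0, 0.0), (1.0, 0.0)],
    "mol_2": [(3.0, 4.0)],
}

# Every training instance serves as a candidate prototype.
prototypes = [instance for bag in bags.values() for instance in bag]

embedded = {name: embed_bag(bag, prototypes) for name, bag in bags.items()}
for name, vector in embedded.items():
    print(name, vector)
```

Because each bag contains some of the prototypes, the matching coordinates equal 1, and bags of different sizes end up as vectors of the same length that any standard classifier can consume.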


Subjects
Drug Discovery, Algorithms, Artificial Intelligence, Humans, Three-Dimensional Imaging, Ligands, Molecular Models, Molecular Conformation
4.
BMC Bioinformatics ; 13 Suppl 15: S3, 2012.
Article in English | MEDLINE | ID: mdl-23046442

ABSTRACT

BACKGROUND: In the context of drug discovery and development, much effort has been exerted to determine which conformers of a given molecule are responsible for the observed biological activity. In this work we aimed to predict bioactive conformers using a variant of supervised learning, named multiple-instance learning. A single molecule, treated as a bag of conformers, is biologically active if and only if at least one of its conformers, treated as an instance, is responsible for the observed bioactivity; and a molecule is inactive if none of its conformers is responsible for the observed bioactivity. The implementation requires instance-based embedding, and joint feature selection and classification. The goal of the present project is to implement multiple-instance learning in drug activity prediction, and subsequently to identify the bioactive conformers for each molecule. METHODS: We encoded the 3-dimensional structures using pharmacophore fingerprints which are binary strings, and accomplished instance-based embedding using calculated dissimilarity distances. Four dissimilarity measures were employed and their performances were compared. 1-norm SVM was used for joint feature selection and classification. The approach was applied to four data sets, and the best proposed model for each data set was determined by using the dissimilarity measure yielding the smallest number of selected features. RESULTS: The predictive abilities of the proposed approach were compared with three classical predictive models without instance-based embedding. The proposed approach produced the best predictive models for one data set and second best predictive models for the rest of the data sets, based on the external validations. To validate the ability of the proposed approach to find bioactive conformers, 12 small molecules with co-crystallized structures were seeded in one data set. Ten of the 12 co-crystallized structures were indeed identified as significant conformers using the proposed approach. CONCLUSIONS: The proposed approach was proven not to suffer from overfitting and to be highly competitive with classical predictive models, so it is very powerful for drug activity prediction. The approach was also validated as a useful method for pursuit of bioactive conformers.
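Two ingredients of this abstract are easy to make concrete: the bag-labeling rule (a molecule is active if and only if at least one conformer is) and a dissimilarity between binary fingerprint strings. The abstract does not name its four dissimilarity measures, so the sketch below assumes the Tanimoto (Jaccard) dissimilarity as one common choice for binary fingerprints; the fingerprints and function names are invented for illustration.

```python
def tanimoto_dissimilarity(fp_a, fp_b):
    """1 - (bits set in both) / (bits set in either) for equal-length binary strings."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for a, b in zip(fp_a, fp_b) if a == "1" and b == "1")
    either = sum(1 for a, b in zip(fp_a, fp_b) if a == "1" or b == "1")
    # Two all-zero fingerprints are treated as identical.
    return 0.0 if either == 0 else 1.0 - both / either


def bag_is_active(conformer_flags):
    """Multiple-instance rule: the bag is active iff at least one instance is."""
    return any(conformer_flags)


# Made-up 4-bit pharmacophore fingerprints for three conformers of one molecule.
conformers = ["1100", "1010", "0011"]
reference = "1100"  # hypothetical prototype conformer

distances = [tanimoto_dissimilarity(fp, reference) for fp in conformers]
closest = min(range(len(conformers)), key=lambda i: distances[i])
print(distances, closest)

print(bag_is_active([False, True, False]))   # one responsible conformer
print(bag_is_active([False, False, False]))  # no responsible conformer
```

The per-conformer distances are what an instance-based embedding would aggregate, and the index of the closest conformer illustrates how a selected prototype can point back to a candidate bioactive conformer.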


Subjects
Artificial Intelligence, Computational Biology/methods, Drug Discovery, Theoretical Models, Molecular Conformation, Quantitative Structure-Activity Relationship
5.
IEEE Trans Nanobioscience ; 11(3): 228-36, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22987128

ABSTRACT

A vast number of biology-related research problems combine multiple sources of data to achieve a better understanding of the underlying problems. It is important to select and interpret the most important information from these sources. It would therefore be beneficial to have a good algorithm that simultaneously extracts rules and selects features for better interpretation of the predictive model. We propose an efficient algorithm, Combined Rule Extraction and Feature Elimination (CRF), based on 1-norm regularized random forests. CRF simultaneously extracts a small number of rules generated by random forests and selects important features. We applied CRF to several drug activity prediction and microarray data sets. CRF is capable of producing performance comparable with state-of-the-art prediction algorithms using a small number of decision rules. Some of the decision rules are biologically significant.
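The abstract does not spell out CRF's internals. The sketch below illustrates only the general idea it describes, under stated assumptions: each rule (a conjunction of threshold tests, such as a root-to-leaf path in a tree) becomes a binary feature, a sparse linear model assigns one weight per rule, and the input features that appear only in zero-weight rules are eliminated. The rules, the feature names, and the weights are hand-written stand-ins; in particular the weights are hypothetical values standing in for the output of a 1-norm regularized fit, which is not performed here.

```python
def rule_fires(rule, sample):
    """A rule fires when every (feature, operator, threshold) condition holds."""
    for feature, operator, threshold in rule["conditions"]:
        value = sample[feature]
        if operator == ">" and not value > threshold:
            return False
        if operator == "<=" and not value <= threshold:
            return False
    return True


def rule_features(sample, rules):
    """Encode a sample as a 0/1 vector with one entry per rule."""
    return [1 if rule_fires(rule, sample) else 0 for rule in rules]


# Hand-written rules standing in for paths extracted from a random forest.
rules = [
    {"name": "r1", "conditions": [("gene_a", ">", 0.5), ("gene_b", "<=", 1.0)]},
    {"name": "r2", "conditions": [("gene_c", ">", 2.0)]},
    {"name": "r3", "conditions": [("gene_b", ">", 1.0)]},
]

# Hypothetical sparse weights, as a 1-norm regularized fit might return them.
rule_weights = [1.2, 0.0, -0.8]

kept_rules = [rule["name"] for rule, w in zip(rules, rule_weights) if w != 0.0]
kept_features = sorted(
    {
        feature
        for rule, w in zip(rules, rule_weights)
        if w != 0.0
        for feature, _, _ in rule["conditions"]
    }
)

sample = {"gene_a": 0.9, "gene_b": 0.4, "gene_c": 3.0}
encoded = rule_features(sample, rules)
score = sum(w * bit for w, bit in zip(rule_weights, encoded))
print(encoded, kept_rules, kept_features, score)
```

Here the second rule receives zero weight, so `gene_c` drops out of the model entirely even though the rule fires for the sample; rule extraction and feature elimination fall out of the same sparse weight vector.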


Subjects
Algorithms, Artificial Intelligence, Computational Biology/methods, Decision Trees, ATP-Binding Cassette Subfamily B Member 1/genetics, Factual Databases, Humans, Theoretical Models, Neoplasms/genetics, Oligonucleotide Array Sequence Analysis, Cannabinoid Receptors/genetics, Reproducibility of Results