Search | VHL Regional Portal (test)

Correction: Triclustering method for finding biomarkers in human immunodeficiency virus-1 gene expression data.

Siswantining, Titin; Bustamam, Alhadi; Sarwinda, Devvi; Soemartojo, Saskya Mary; Latief, Moh Abdul; Octaria, Elke Annisa; Siregar, Anggrainy Togi Marito; Septa, Oon; Al-Ash, Herley Shaori; Saputra, Noval.

Math Biosci Eng ; 20(4): 7298-7301, 2023 Feb 14.

Article in English | MEDLINE | ID: mdl-37161152

Triclustering method for finding biomarkers in human immunodeficiency virus-1 gene expression data.

Math Biosci Eng ; 19(7): 6743-6763, 2022 May 05.

Article in English | MEDLINE | ID: mdl-35730281

ABSTRACT

HIV-1 is a virus that destroys CD4+ cells in the body's immune system, causing a drastic decline in immune system performance. Analysis of HIV-1 gene expression data is urgently needed. Microarray technology is used to analyze gene expression data by measuring the expression of thousands of genes in various conditions. The gene expression series data, which are formed in three dimensions, are analyzed using triclustering. Triclustering is an analysis technique for 3D data that aims to group data simultaneously into rows and columns across different times/conditions. The result of this technique is called a tricluster. A tricluster is a subspace in the form of a subset of rows, columns, and times/conditions. In this study, we used the δ-Trimax, THD Tricluster, and MOEA methods by applying different measures, namely, transposed virtual error, the New Residue Score, and the Multi Slope Measure. The gene expression data consisted of 22,283 probe gene IDs, 40 observations, and four conditions: normal, acute, chronic, and non-progressor. Tricluster evaluation was carried out based on intertemporal homogeneity. An analysis of the probe ID gene that affects AIDS was carried out through this triclustering process. Based on this analysis, HLA-C, a gene symbol that is a biomarker associated with AIDS due to HIV-1, was found in every condition for normal, acute, chronic, and non-progressive HIV-1 patients.
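The idea of a tricluster as a subset of rows, columns, and conditions can be illustrated with a minimal sketch. The abstract does not give formulas for its measures, so the code below substitutes a three-dimensional mean squared residue, a coherence score often associated with δ-Trimax-style methods, purely as an assumption. The function name `tricluster_msr` and the nested-list layout `data[gene][sample][condition]` are hypothetical.

```python
from itertools import product


def tricluster_msr(data, genes, samples, conditions):
    """Mean squared residue of one tricluster (illustrative assumption).

    data       -- nested lists indexed as data[gene][sample][condition]
    genes      -- selected row indices
    samples    -- selected column indices
    conditions -- selected time/condition indices
    """
    def mean(values):
        values = list(values)
        return sum(values) / len(values)

    # The tricluster is the sub-cube picked out by the three index subsets.
    sub = {(i, j, k): data[i][j][k]
           for i, j, k in product(genes, samples, conditions)}

    overall = mean(sub.values())
    gene_mean = {i: mean(sub[i, j, k] for j, k in product(samples, conditions))
                 for i in genes}
    sample_mean = {j: mean(sub[i, j, k] for i, k in product(genes, conditions))
                   for j in samples}
    cond_mean = {k: mean(sub[i, j, k] for i, j in product(genes, samples))
                 for k in conditions}

    # Residue of each cell relative to its three marginal means.
    return mean(
        (value - gene_mean[i] - sample_mean[j] - cond_mean[k] + 2 * overall) ** 2
        for (i, j, k), value in sub.items()
    )
```

Under this score, a sub-cube whose values are a pure sum of a gene effect, a sample effect, and a condition effect has a residue of zero, and lower values indicate a more coherent tricluster.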

Subjects

Acquired Immunodeficiency Syndrome; HIV-1; Algorithms; Biomarkers/analysis; Cluster Analysis; Gene Expression; Gene Expression Profiling/methods; HIV-1/genetics; Humans; Oligonucleotide Array Sequence Analysis/methods

Virtual screening of dipeptidyl peptidase-4 inhibitors using quantitative structure-activity relationship-based artificial intelligence and molecular docking of hit compounds.

Hermansyah, Oky; Bustamam, Alhadi; Yanuar, Arry.

Comput Biol Chem ; 95: 107597, 2021 Dec.

Article in English | MEDLINE | ID: mdl-34800858

ABSTRACT

Dipeptidyl peptidase-4 (DPP-4) inhibitors are becoming essential drugs in the treatment of type 2 diabetes mellitus; however, some classes of these drugs exert side effects, including joint pain and pancreatitis. Studies suggest that these side effects might be related to secondary inhibition of DPP-8 and DPP-9. In this study, we identified DPP-4-inhibitor hit compounds selective against DPP-8 and DPP-9. We built a virtual screening workflow using a quantitative structure-activity relationship (QSAR) strategy based on artificial intelligence to allow faster screening of millions of molecules for the DPP-4 target relative to other screening methods. Five regression machine learning algorithms and four classification machine learning algorithms were applied to build virtual screening workflows, with the QSAR model applied using support vector regression (R2pred 0.78) and the classification QSAR model using the random forest algorithm with 92.2% accuracy. Virtual screening of >10 million molecules yielded 2,716 hit compounds with a pIC50 value of >7.5. Additionally, molecular docking results of several potential hit compounds for DPP-4, DPP-8, and DPP-9 identified CH0002 as showing high inhibitory potential against DPP-4 and low inhibitory potential for DPP-8 and DPP-9 enzymes. These results demonstrated the effectiveness of this technique for identifying DPP-4-inhibitor hit compounds selective for DPP-4 and against DPP-8 and DPP-9 and suggest its potential efficacy for applications to discover hit compounds of other targets.
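The pIC50 cutoff mentioned in the abstract can be made concrete with a small sketch. It relies only on the standard definition pIC50 = -log10(IC50 in mol/L); the function names, the molecule labels, and the predicted values are hypothetical and are not taken from the study.

```python
import math


def pic50_from_ic50_nm(ic50_nm):
    """Convert an IC50 given in nanomolar to pIC50.

    pIC50 = -log10(IC50 in mol/L), and 1 nM = 1e-9 mol/L,
    so pIC50 = 9 - log10(IC50 in nM).
    """
    return 9.0 - math.log10(ic50_nm)


def select_hits(predicted_pic50, threshold=7.5):
    """Keep molecules whose predicted pIC50 is strictly above the threshold.

    predicted_pic50 -- mapping of molecule label to predicted pIC50
                       (hypothetical model output)
    """
    return {name: value
            for name, value in predicted_pic50.items()
            if value > threshold}


# Hypothetical predictions for three made-up molecules.
example_predictions = {"mol_a": 8.1, "mol_b": 7.5, "mol_c": 6.9}
hits = select_hits(example_predictions)
```

On this scale, a threshold of 7.5 corresponds to an IC50 of roughly 32 nM, so only compounds predicted to be more potent than that pass the filter.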

Subjects

Artificial Intelligence; Dipeptidyl Peptidase 4/metabolism; Dipeptidyl-Peptidase IV Inhibitors/pharmacology; Molecular Docking Simulation; Quantitative Structure-Activity Relationship; Dipeptidyl-Peptidase IV Inhibitors/chemistry; Drug Evaluation, Preclinical; Humans

Performance of rotation forest ensemble classifier and feature extractor in predicting protein interactions using amino acid sequences.

Bustamam, Alhadi; Musti, Mohamad I S; Hartomo, Susilo; Aprilia, Shirley; Tampubolon, Patuan P; Lestari, Dian.

BMC Genomics ; 20(Suppl 9): 950, 2019 Dec 24.

Article in English | MEDLINE | ID: mdl-31874636

ABSTRACT

BACKGROUND: There are two significant problems associated with predicting protein-protein interactions using the sequences of amino acids. The first problem is representing each sequence as a feature vector, and the second is designing a model that can identify the protein interactions. Thus, effective feature extraction methods can lead to improved model performance. In this study, we used two types of feature extraction methods, global encoding and pseudo-substitution matrix representation (PseudoSMR), to represent the sequences of amino acids in human proteins and Human Immunodeficiency Virus type 1 (HIV-1) to address the classification problem of predicting protein-protein interactions. We also compared principal component analysis (PCA) with independent principal component analysis (IPCA) as transformation methods for Rotation Forest. RESULTS: The results show that using global encoding and PseudoSMR as a feature extraction method successfully represents the amino acid sequence for the Rotation Forest classifier with PCA or with IPCA. This can be seen from the comparison of the results of evaluation metrics, which were >73% across the six different parameters. The accuracy of both methods was >74%. The results for the other model performance criteria, such as sensitivity, specificity, precision, and F1-score, were all >73%. The data used in this study can be accessed using the following link: https://www.dsc.ui.ac.id/research/amino-acid-pred/. CONCLUSIONS: Both global encoding and PseudoSMR can successfully represent the sequences of amino acids. Rotation Forest (PCA) performed better than Rotation Forest (IPCA) in terms of predicting protein-protein interactions between HIV-1 and human proteins. Both the Rotation Forest (PCA) classifier and the Rotation Forest (IPCA) classifier performed better than other classifiers, such as Gradient Boosting, K-Nearest Neighbor, Logistic Regression, Random Forest, and Support Vector Machine (SVM). Rotation Forest (PCA) and Rotation Forest (IPCA) have accuracy, sensitivity, specificity, precision, and F1-score values >70% while the other classifiers have values <70%.
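The evaluation criteria named in the abstract (accuracy, sensitivity, specificity, precision, and F1-score) all derive from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the function name `classification_metrics` and the example counts are hypothetical and do not reproduce the study's results.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts.

    tp -- interacting pairs predicted as interacting (true positives)
    fp -- non-interacting pairs predicted as interacting (false positives)
    tn -- non-interacting pairs predicted as non-interacting (true negatives)
    fn -- interacting pairs predicted as non-interacting (false negatives)
    """
    sensitivity = tp / (tp + fn)   # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Made-up counts for illustration only.
example = classification_metrics(tp=80, fp=30, tn=70, fn=20)
```

Reporting all five values together, as the abstract does, guards against a classifier that scores well on accuracy alone while missing most of one class.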

Subjects

Protein Interaction Mapping/methods; Sequence Analysis, Protein/methods; HIV-1; Human Immunodeficiency Virus Proteins/chemistry; Humans; Principal Component Analysis; Support Vector Machine

Fast parallel Markov clustering in bioinformatics using massively parallel computing on GPU with CUDA and ELLPACK-R sparse format.

Bustamam, Alhadi; Burrage, Kevin; Hamilton, Nicholas A.

IEEE/ACM Trans Comput Biol Bioinform ; 9(3): 679-92, 2012.

Article in English | MEDLINE | ID: mdl-21483031

ABSTRACT

Markov clustering (MCL) is becoming a key algorithm within bioinformatics for determining clusters in networks. However, with the increasingly vast amount of data on biological networks, performance and scalability issues are becoming a critical limiting factor in applications. Meanwhile, GPU computing, which uses the CUDA tool for implementing a massively parallel computing environment on the GPU card, is becoming a very powerful, efficient, and low-cost option to achieve substantial performance gains over CPU approaches. The use of on-chip memory on the GPU efficiently lowers the latency time, thus circumventing a major issue in other parallel computing environments, such as MPI. We introduce a very fast Markov clustering algorithm using CUDA (CUDA-MCL) to perform parallel sparse matrix-matrix computations and parallel sparse Markov matrix normalizations, which are at the heart of MCL. We utilized the ELLPACK-R sparse format to allow effective, fine-grained, massively parallel processing that copes with the sparse nature of interaction network data sets in bioinformatics applications. As the results show, CUDA-MCL is significantly faster than the original MCL running on CPU. Thus, large-scale parallel computation on off-the-shelf desktop machines, which was previously only possible on supercomputing architectures, can significantly change the way bioinformaticians and biologists deal with their data.
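The ELLPACK-R layout mentioned in the abstract can be sketched in a few lines. This is a sequential Python illustration of the storage idea only, not the CUDA-MCL implementation: it assumes the usual description of ELLPACK-R as padded per-row value and column arrays plus an array of true row lengths. The function names and the matrix-vector product used to exercise the format are illustrative choices.

```python
def to_ellpack_r(dense):
    """Convert a dense matrix (list of rows) to an ELLPACK-R style layout.

    Returns (values, columns, row_lengths), where every row of `values`
    and `columns` is padded to the length of the longest row, and
    `row_lengths` records how many entries of each row are real.
    """
    row_lengths = [sum(1 for v in row if v != 0) for row in dense]
    width = max(row_lengths, default=0)
    values, columns = [], []
    for row in dense:
        nonzeros = [(j, v) for j, v in enumerate(row) if v != 0]
        padding = width - len(nonzeros)
        columns.append([j for j, _ in nonzeros] + [0] * padding)
        values.append([v for _, v in nonzeros] + [0.0] * padding)
    return values, columns, row_lengths


def spmv_ellpack_r(values, columns, row_lengths, x):
    """Multiply the stored sparse matrix by the dense vector x.

    Each row loop stops at its recorded length, so padding is never read.
    On a GPU, one thread would typically handle one row.
    """
    result = []
    for row_values, row_columns, length in zip(values, columns, row_lengths):
        result.append(sum(row_values[s] * x[row_columns[s]]
                          for s in range(length)))
    return result
```

The row-length array is what distinguishes this layout from plain ELLPACK: rows with few nonzeros finish early instead of iterating over padding, which suits networks whose node degrees vary widely.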

Subjects

Computer Graphics; Cluster Analysis; Computational Biology/methods; Computer Simulation; Markov Chains; Oligonucleotide Array Sequence Analysis
