2.
Math Biosci Eng ; 19(7): 6743-6763, 2022 05 05.
Article in English | MEDLINE | ID: mdl-35730281

ABSTRACT

HIV-1 is a virus that destroys CD4+ cells in the body's immune system, causing a drastic decline in immune function. Analysis of HIV-1 gene expression data is urgently needed. Microarray technology is used to analyze gene expression data by measuring the expression of thousands of genes under various conditions. Gene expression series data, which are three-dimensional, are analyzed using triclustering. Triclustering is an analysis technique for 3D data that groups rows and columns simultaneously across different times or conditions. The result of this technique is called a tricluster: a subspace consisting of a subset of rows, columns, and times/conditions. In this study, we used the δ-Trimax, THD Tricluster, and MOEA methods with different measures, namely transposed virtual error, the New Residue Score, and the Multi Slope Measure. The gene expression data consisted of 22,283 probe gene IDs, 40 observations, and four conditions: normal, acute, chronic, and non-progressor. Tricluster evaluation was based on intertemporal homogeneity. The probe gene IDs that affect AIDS were analyzed through this triclustering process. Based on this analysis, one gene symbol, HLA-C, a biomarker associated with AIDS due to HIV-1, was found in every condition: normal, acute, chronic, and non-progressor.
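As a rough illustration of what a tricluster is, the sketch below cuts a sub-cube (subset of genes, observations, and conditions) out of a small 3D array and scores its coherence. The score is a three-dimensional analogue of the mean squared residue, chosen here as an assumption for illustration; it is not one of the three measures named in the abstract, and the function names and toy data are hypothetical.

```python
def extract_tricluster(data, genes, observations, conditions):
    """Return the sub-cube of data[gene][observation][condition] for the chosen indices."""
    return [[[data[g][o][c] for c in conditions] for o in observations] for g in genes]


def _mean(values):
    values = list(values)
    return sum(values) / len(values)


def mean_squared_residue(cube):
    """3D mean squared residue: 0 for a perfectly additive (coherent) sub-cube."""
    n_g, n_o, n_c = len(cube), len(cube[0]), len(cube[0][0])
    cells = [(g, o, c) for g in range(n_g) for o in range(n_o) for c in range(n_c)]
    overall = _mean(cube[g][o][c] for g, o, c in cells)
    # Mean of each gene slab, observation slab, and condition slab.
    gene_mean = [_mean(cube[g][o][c] for o in range(n_o) for c in range(n_c)) for g in range(n_g)]
    obs_mean = [_mean(cube[g][o][c] for g in range(n_g) for c in range(n_c)) for o in range(n_o)]
    cond_mean = [_mean(cube[g][o][c] for g in range(n_g) for o in range(n_o)) for c in range(n_c)]
    residues = [
        cube[g][o][c] - gene_mean[g] - obs_mean[o] - cond_mean[c] + 2 * overall
        for g, o, c in cells
    ]
    return _mean(r * r for r in residues)


# Toy data (hypothetical): 4 genes x 3 observations x 2 conditions, purely additive.
toy = [[[g + 2 * o + 3 * c for c in range(2)] for o in range(3)] for g in range(4)]
sub = extract_tricluster(toy, genes=[0, 2], observations=[0, 1], conditions=[0, 1])
print(mean_squared_residue(sub))
```

A lower score means the selected genes behave more consistently across the selected observations and conditions; a search method would try many index subsets and keep those scoring below a chosen threshold.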


Subjects
Acquired Immunodeficiency Syndrome; HIV-1; Algorithms; Biomarkers/analysis; Cluster Analysis; Gene Expression; Gene Expression Profiling/methods; HIV-1/genetics; Humans; Oligonucleotide Array Sequence Analysis/methods
3.
Comput Biol Chem ; 95: 107597, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34800858

ABSTRACT

Dipeptidyl peptidase-4 (DPP-4) inhibitors are becoming essential drugs in the treatment of type 2 diabetes mellitus; however, some classes of these drugs cause side effects, including joint pain and pancreatitis. Studies suggest that these side effects might be related to secondary inhibition of DPP-8 and DPP-9. In this study, we identified DPP-4-inhibitor hit compounds that are selective against DPP-8 and DPP-9. We built a virtual screening workflow using a quantitative structure-activity relationship (QSAR) strategy based on artificial intelligence to screen millions of molecules for the DPP-4 target faster than other screening methods. Five regression machine learning algorithms and four classification machine learning algorithms were applied to build virtual screening workflows, with the regression QSAR model using support vector regression (R²pred 0.78) and the classification QSAR model using the random forest algorithm with 92.2% accuracy. Virtual screening of >10 million molecules yielded 2,716 hit compounds with a pIC50 value of >7.5. Additionally, molecular docking of several potential hit compounds against DPP-4, DPP-8, and DPP-9 identified CH0002 as showing high inhibitory potential against DPP-4 and low inhibitory potential against the DPP-8 and DPP-9 enzymes. These results demonstrate the effectiveness of this technique for identifying DPP-4-inhibitor hit compounds that are selective for DPP-4 over DPP-8 and DPP-9, and suggest its potential for discovering hit compounds for other targets.
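The hit criterion above (pIC50 > 7.5) can be made concrete with a small sketch. pIC50 is the negative base-10 logarithm of the IC50 in molar units, so 7.5 corresponds to an IC50 of roughly 32 nM. The helper names, compound labels, and predicted values below are hypothetical and only illustrate the thresholding step, not the study's QSAR models.

```python
import math


def pic50_from_nanomolar(ic50_nm):
    """Convert an IC50 given in nanomolar to pIC50 = -log10(IC50 in molar)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nm)


def select_hits(predicted_pic50, threshold=7.5):
    """Return the names of compounds whose predicted pIC50 exceeds the threshold."""
    return sorted(name for name, value in predicted_pic50.items() if value > threshold)


# Hypothetical model predictions for three made-up compounds.
predictions = {"mol_a": 8.1, "mol_b": 7.5, "mol_c": 6.0}
print(select_hits(predictions))
```

The comparison is strict, so a compound predicted at exactly 7.5 is not counted as a hit in this sketch.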


Subjects
Artificial Intelligence; Dipeptidyl Peptidase 4/metabolism; Dipeptidyl-Peptidase IV Inhibitors/pharmacology; Molecular Docking Simulation; Quantitative Structure-Activity Relationship; Dipeptidyl-Peptidase IV Inhibitors/chemistry; Drug Evaluation, Preclinical; Humans
4.
BMC Genomics ; 20(Suppl 9): 950, 2019 Dec 24.
Article in English | MEDLINE | ID: mdl-31874636

ABSTRACT

BACKGROUND: There are two significant problems associated with predicting protein-protein interactions from amino acid sequences. The first is representing each sequence as a feature vector, and the second is designing a model that can identify the protein interactions. Thus, effective feature extraction methods can lead to improved model performance. In this study, we used two types of feature extraction methods, global encoding and pseudo-substitution matrix representation (PseudoSMR), to represent the amino acid sequences of human and Human Immunodeficiency Virus type 1 (HIV-1) proteins and to address the classification problem of predicting protein-protein interactions. We also compared principal component analysis (PCA) with independent principal component analysis (IPCA) as transformation methods within Rotation Forest. RESULTS: The results show that using global encoding and PseudoSMR as feature extraction methods successfully represents the amino acid sequences for the Rotation Forest classifier with PCA or with IPCA. This can be seen from the comparison of the evaluation metrics, which were >73% across the six different parameters. The accuracy of both methods was >74%. The results for the other model performance criteria, such as sensitivity, specificity, precision, and F1-score, were all >73%. The data used in this study can be accessed using the following link: https://www.dsc.ui.ac.id/research/amino-acid-pred/. CONCLUSIONS: Both global encoding and PseudoSMR can successfully represent amino acid sequences. Rotation Forest (PCA) performed better than Rotation Forest (IPCA) in predicting protein-protein interactions between HIV-1 and human proteins. Both the Rotation Forest (PCA) classifier and the Rotation Forest (IPCA) classifier performed better than other classifiers, such as Gradient Boosting, K-Nearest Neighbor, Logistic Regression, Random Forest, and Support Vector Machine (SVM). Rotation Forest (PCA) and Rotation Forest (IPCA) have accuracy, sensitivity, specificity, precision, and F1-score values >70%, while the other classifiers have values <70%.
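The evaluation criteria named above (accuracy, sensitivity, specificity, precision, and F1-score) all follow from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name and the counts in the usage line are hypothetical and are not results from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)     # positive predictive value
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts: interacting pairs are the positive class.
for name, value in classification_metrics(tp=90, fp=10, fn=30, tn=70).items():
    print(f"{name}: {value:.3f}")
```

Reporting all five together matters for interaction prediction because a classifier can reach high accuracy on imbalanced data while still missing most true interactions.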


Subjects
Protein Interaction Mapping/methods; Sequence Analysis, Protein/methods; HIV-1; Human Immunodeficiency Virus Proteins/chemistry; Humans; Principal Component Analysis; Support Vector Machine
5.
Article in English | MEDLINE | ID: mdl-21483031

ABSTRACT

Markov clustering (MCL) is becoming a key algorithm within bioinformatics for determining clusters in networks. However, with the increasingly vast amount of data on biological networks, performance and scalability issues are becoming a critical limiting factor in applications. Meanwhile, GPU computing, which uses the CUDA tool to implement a massively parallel computing environment on the GPU card, is becoming a very powerful, efficient, and low-cost option for achieving substantial performance gains over CPU approaches. The use of on-chip memory on the GPU efficiently lowers latency, thus circumventing a major issue in other parallel computing environments, such as MPI. We introduce a very fast Markov clustering algorithm using CUDA (CUDA-MCL) to perform parallel sparse matrix-matrix computations and parallel sparse Markov matrix normalizations, which are at the heart of MCL. We used the ELLPACK-R sparse format to allow effective, fine-grained, massively parallel processing that copes with the sparse nature of interaction network data sets in bioinformatics applications. As the results show, CUDA-MCL is significantly faster than the original MCL running on a CPU. Thus, large-scale parallel computation on off-the-shelf desktop machines, previously possible only on supercomputing architectures, can significantly change the way bioinformaticians and biologists deal with their data.
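The two operations the abstract places at the heart of MCL, matrix-matrix multiplication and Markov (column) normalization, can be sketched on a small dense matrix. This is a minimal serial sketch in plain Python, assuming the usual MCL loop of expansion followed by inflation; it does not reproduce the sparse ELLPACK-R storage or the CUDA parallelization of CUDA-MCL, and the function names, default parameters, and toy graph are illustrative assumptions.

```python
def normalize_columns(m):
    """Rescale every column to sum to 1, giving a column-stochastic Markov matrix."""
    n = len(m)
    col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / col_sums[j] for j in range(n)] for i in range(n)]


def expand(m):
    """Expansion step: square the matrix (one matrix-matrix multiplication)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def inflate(m, power):
    """Inflation step: raise entries to a power, then renormalize the columns."""
    return normalize_columns([[value ** power for value in row] for row in m])


def markov_cluster(adjacency, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster an undirected graph given as a dense 0/1 adjacency matrix."""
    n = len(adjacency)
    # Self-loops keep every column sum positive.
    with_loops = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
                  for i in range(n)]
    m = normalize_columns(with_loops)
    for _ in range(max_iter):
        nxt = inflate(expand(m), inflation)
        change = max(abs(nxt[i][j] - m[i][j]) for i in range(n) for j in range(n))
        m = nxt
        if change < tol:
            break
    # Each row that still carries mass marks one cluster: the columns it covers.
    clusters = {tuple(j for j in range(n) if row[j] > 1e-6) for row in m}
    clusters.discard(())
    return sorted(clusters)


# Toy graph (hypothetical): two disconnected triangles.
triangle_pair = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
print(markov_cluster(triangle_pair))
```

The dense triple loop in `expand` is cubic in the number of nodes, which is the cost that sparse storage and GPU parallelism are meant to reduce on large interaction networks.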


Subjects
Computer Graphics; Cluster Analysis; Computational Biology/methods; Computer Simulation; Markov Chains; Oligonucleotide Array Sequence Analysis