Results 1 - 20 of 24
1.
J Comput Biol ; 31(6): 576-588, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38758925

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) technology provides a means for studying biology from a cellular perspective. The fundamental goal of scRNA-seq data analysis is to discriminate single-cell types using unsupervised clustering. Few single-cell clustering algorithms have taken into account both deep and surface information, despite the recent slew of suggestions. Consequently, this article constructs a fusion learning framework based on deep learning, namely scGASI. For learning a clustering similarity matrix, scGASI integrates data affinity recovery and deep feature embedding in a unified scheme based on various top feature sets. Next, scGASI learns the low-dimensional latent representation underlying the data using a graph autoencoder to mine the hidden information residing in the data. To efficiently merge the surface information from raw area and the deeper potential information from underlying area, we then construct a fusion learning model based on self-expression. scGASI uses this fusion learning model to learn the similarity matrix of an individual feature set as well as the clustering similarity matrix of all feature sets. Lastly, gene marker identification, visualization, and clustering are accomplished using the clustering similarity matrix. Extensive verification on actual data sets demonstrates that scGASI outperforms many widely used clustering techniques in terms of clustering accuracy.


Subjects
Algorithms , Deep Learning , RNA Sequence Analysis , Single-Cell Analysis , Single-Cell Analysis/methods , Cluster Analysis , RNA Sequence Analysis/methods , Humans , Computational Biology/methods
2.
Acta Pharmacol Sin ; 45(2): 391-404, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37803139

ABSTRACT

Hepatocellular carcinoma (HCC) is one of the most common and deadly cancers in the world. The therapeutic outlook for HCC patients has significantly improved with the advent and development of systematic and targeted therapies such as sorafenib and lenvatinib; however, the rise of drug resistance and the high mortality rate necessitate the continuous discovery of effective targeting agents. To discover novel anti-HCC compounds, we first constructed a deep learning-based chemical representation model to screen more than 6 million compounds in the ZINC15 drug-like library. We successfully identified LGOd1 as a novel anticancer agent with a characteristic levoglucosenone (LGO) scaffold. The mechanistic studies revealed that LGOd1 treatment leads to HCC cell death by interfering with cellular copper homeostasis, which is similar to a recently reported copper-dependent cell death named cuproptosis. While the prototypical cuproptosis is brought on by copper ionophore-induced copper overload, mechanistic studies indicated that LGOd1 does not act as a copper ionophore, but most likely by interacting with the copper chaperone protein CCS, thus LGOd1 represents a potentially new class of compounds with unique cuproptosis-inducing property. In summary, our findings highlight the critical role of bioavailable copper in the regulation of cell death and represent a novel route of cuproptosis induction.


Subjects
Hepatocellular Carcinoma , Deep Learning , Liver Neoplasms , Humans , Hepatocellular Carcinoma/drug therapy , Copper , Liver Neoplasms/drug therapy , Ionophores , Apoptosis
3.
Sci Bull (Beijing) ; 68(22): 2729-2733, 2023 Nov 30.
Article in English | MEDLINE | ID: mdl-37833190

ABSTRACT

The electromagnetic form factors of the proton and the neutron in the timelike region are investigated. Electron-positron annihilation into antinucleon-nucleon (N̄N) pairs is treated in distorted wave Born approximation, including the final-state interaction in the N̄N system. The latter is obtained by a Lippmann-Schwinger equation for N̄N potentials derived within SU(3) chiral effective field theory. By fitting to the phase shifts and (differential) cross section data, a high quality description is achieved. With these amplitudes, the oscillations of the electromagnetic form factors of the proton and the neutron are studied. It is found that each of them can be described by two fractional oscillators. One is characterized as "overdamped" and dominates near the threshold, while the other is "underdamped" and plays an important role in the high-energy region. These two oscillators are essential to understand the distributions of polarized electric charges induced by hard photons for the nucleons.

4.
J Comput Biol ; 30(8): 848-860, 2023 08.
Article in English | MEDLINE | ID: mdl-37471220

ABSTRACT

The development of single-cell transcriptome sequencing technologies has opened new ways to study biological phenomena at the cellular level. A key application of such technologies involves the employment of single-cell RNA sequencing (scRNA-seq) data to identify distinct cell types through clustering, which in turn provides evidence for revealing heterogeneity. Despite the promise of this approach, the inherent characteristics of scRNA-seq data, such as higher noise levels and lower coverage, pose major challenges to existing clustering methods and compromise their accuracy. In this study, we propose a method called Adjusted Random walk Graph regularization Sparse Low-Rank Representation (ARGLRR), a practical sparse subspace clustering method, to identify cell types. The fundamental low-rank representation (LRR) model is concerned with the global structure of data. To address the limited ability of the LRR method to capture local structure, we introduced adjusted random walk graph regularization in its framework. ARGLRR allows for the capture of both local and global structures in scRNA-seq data. Additionally, the imposition of similarity constraints into the LRR framework further improves the ability of the proposed model to estimate cell-to-cell similarity and capture global structural relationships between cells. ARGLRR surpasses other advanced comparison approaches on nine known scRNA-seq data sets judging by the results. In the normalized mutual information and Adjusted Rand Index metrics on the scRNA-seq data sets clustering experiments, ARGLRR outperforms the best-performing comparative method by 6.99% and 5.85%, respectively. In addition, we visualize the result using Uniform Manifold Approximation and Projection. Visualization results show that the usage of ARGLRR enhances the separation of different cell types within the similarity matrix.


Subjects
Algorithms , RNA , Cluster Analysis , Single-Cell Analysis/methods , RNA Sequence Analysis , Gene Expression Profiling
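Entry 4 reports clustering quality as normalized mutual information and Adjusted Rand Index (ARI). As a minimal sketch of how ARI can be computed from two label vectors, and not the authors' evaluation code, the function below applies the standard pair-counting formula. The function name `adjusted_rand_index` and the toy labels are illustrative assumptions.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index from pair counts (hypothetical helper, for illustration)."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label vectors must have the same length")
    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two samples: the partitions trivially agree

    # Contingency table entries and its row/column marginals.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / total_pairs  # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions have a single cluster
    return (sum_cells - expected) / (max_index - expected)


# Toy example: the same partition under permuted cluster names scores 1.0.
same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

ARI is invariant to how clusters are named, which is why it suits unsupervised cell-type clustering where predicted cluster IDs carry no intrinsic meaning.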
5.
Mil Med Res ; 10(1): 7, 2023 02 22.
Article in English | MEDLINE | ID: mdl-36814339

ABSTRACT

BACKGROUND: Triclosan [5-chloro-2-(2,4-dichlorophenoxy) phenol, TCS], a common antimicrobial additive in many personal care and health care products, is frequently detected in human blood and urine. Therefore, it has been considered an emerging and potentially toxic pollutant in recent years. Long-term exposure to TCS has been suggested to exert endocrine disruption effects, and promote liver fibrogenesis and tumorigenesis. This study was aimed at clarifying the underlying cellular and molecular mechanisms of hepatotoxicity effect of TCS at the initiation stage. METHODS: C57BL/6 mice were exposed to different dosages of TCS for 2 weeks and the organ toxicity was evaluated by various measurements including complete blood count, histological analysis and TCS quantification. Single cell RNA sequencing (scRNA-seq) was then carried out on TCS- or mock-treated mouse livers to delineate the TCS-induced hepatotoxicity. The acquired single-cell transcriptomic data were analyzed from different aspects including differential gene expression, transcription factor (TF) regulatory network, pseudotime trajectory, and cellular communication, to systematically dissect the molecular and cellular events after TCS exposure. To verify the TCS-induced liver fibrosis, the expression levels of key fibrogenic proteins were examined by Western blotting, immunofluorescence, Masson's trichrome and Sirius red staining. In addition, normal hepatocyte cell MIHA and hepatic stellate cell LX-2 were used as in vitro cell models to experimentally validate the effects of TCS by immunological, proteomic and metabolomic technologies. RESULTS: We established a relatively short term TCS exposure murine model and found the TCS mainly accumulated in the liver. The scRNA-seq performed on the livers of the TCS-treated and control group profiled the gene expressions of > 76,000 cells belonging to 13 major cell types. 
Among these types, hepatocytes and hepatic stellate cells (HSCs) were significantly increased in TCS-treated group. We found that TCS promoted fibrosis-associated proliferation of hepatocytes, in which Gata2 and Mef2c are the key driving TFs. Our data also suggested that TCS induced the proliferation and activation of HSCs, which was experimentally verified in both liver tissue and cell model. In addition, other changes including the dysfunction and capillarization of endothelial cells, an increase of fibrotic characteristics in B plasma cells, and M2 phenotype-skewing of macrophage cells, were also deduced from the scRNA-seq analysis, and these changes are likely to contribute to the progression of liver fibrosis. Lastly, the key differential ligand-receptor pairs involved in cellular communications were identified and we confirmed the role of GAS6_AXL interaction-mediated cellular communication in promoting liver fibrosis. CONCLUSIONS: TCS modulates the cellular activities and fates of several specific cell types (including hepatocytes, HSCs, endothelial cells, B cells, Kupffer cells and liver capsular macrophages) in the liver, and regulates the ligand-receptor interactions between these cells, thereby promoting the proliferation and activation of HSCs, leading to liver fibrosis. Overall, we provide the first comprehensive single-cell atlas of mouse livers in response to TCS and delineate the key cellular and molecular processes involved in TCS-induced hepatotoxicity and fibrosis.


Subjects
Chemical and Drug Induced Liver Injury , Triclosan , Humans , Mice , Animals , Transcriptome , Endothelial Cells/metabolism , Endothelial Cells/pathology , Ligands , Proteomics , Inbred C57BL Mice , Liver Cirrhosis/metabolism , Liver Cirrhosis/pathology , Fibrosis , Chemical and Drug Induced Liver Injury/pathology
6.
Article in English | MEDLINE | ID: mdl-36063515

ABSTRACT

Automatic seizure detection systems can serve as a meaningful clinical tool for the treatment and analysis of epilepsy using electroencephalogram (EEG) recordings, and they have developed rapidly. This work proposes an automatic epileptic seizure detection method based on kernel-based robust probabilistic collaborative representation (ProCRC) combined with graph-regularized non-negative matrix factorization (GNMF). The raw EEG signals are pre-processed with the wavelet transform to obtain their time-frequency distribution as preliminary feature information, and GNMF is then employed for dimension reduction, retaining and enhancing the informative features of the EEG signals. Next, the test sample is represented using robust ProCRC, which decides the class of the test sample (seizure or non-seizure) by jointly maximizing the likelihood that it belongs to each class. In addition, the kernel trick is applied to improve the separability of non-linear, high-dimensional EEG signals in robust ProCRC. Finally, post-processing techniques are introduced to generate more accurate and reliable results. Evaluated on the public Freiburg EEG database, the method achieves an average epoch-based sensitivity of 96.48%, an event-based sensitivity of 93.65%, and a specificity of 98.55%.


Subjects
Epilepsy , Seizures , Algorithms , Electroencephalography/methods , Epilepsy/diagnosis , Humans , Seizures/diagnosis , Computer-Assisted Signal Processing , Wavelet Analysis
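Entry 6 reports epoch-based sensitivity and specificity. The sketch below shows how these two figures are commonly derived from per-epoch labels; it is an illustration only, not the evaluation protocol of the cited work. It assumes each EEG epoch carries a binary label (1 = seizure, 0 = non-seizure), and the function name `epoch_metrics` is hypothetical.

```python
def epoch_metrics(y_true, y_pred):
    """Return (sensitivity, specificity) for binary per-epoch labels.

    Assumption for this sketch: 1 marks a seizure epoch, 0 a non-seizure epoch.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("label vectors must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # seizure detected
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # seizure missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # quiet epoch kept quiet
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarm

    # NaN signals that a metric is undefined because one class is absent.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy labels: three seizure epochs and three non-seizure epochs.
sens, spec = epoch_metrics([1, 1, 0, 0, 0, 1], [1, 0, 0, 0, 1, 1])
```

Event-based sensitivity differs in that a whole seizure event counts as detected if any of its epochs is flagged, so it needs event boundaries in addition to epoch labels.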
7.
Mil Med Res ; 9(1): 30, 2022 06 14.
Article in English | MEDLINE | ID: mdl-35698214

ABSTRACT

BACKGROUND: Malaria is a devastating infectious disease that disproportionally threatens hundreds of millions of people in developing countries. In the history of anti-malaria campaign, chloroquine (CQ) has played an indispensable role, however, its mechanism of action (MoA) is not fully understood. METHODS: We used the principle of photo-affinity labeling and click chemistry-based functionalization in the design of a CQ probe and developed a combined deconvolution strategy of activity-based protein profiling (ABPP) and mass spectrometry-coupled cellular thermal shift assay (MS-CETSA) that identified the protein targets of CQ in an unbiased manner in this study. The interactions between CQ and these identified potential protein hits were confirmed by biophysical and enzymatic assays. RESULTS: We developed a novel clickable, photo-affinity chloroquine analog probe (CQP) which retains the antimalarial activity in the nanomole range, and identified a total of 40 proteins that specifically interacted and photo-crosslinked with CQP which was inhibited in the presence of excess CQ. Using MS-CETSA, we identified 83 candidate interacting proteins out of a total of 3375 measured parasite proteins. At the same time, we identified 8 proteins as the most potential hits which were commonly identified by both methods. CONCLUSIONS: We found that CQ could disrupt glycolysis and energy metabolism of malarial parasites through direct binding with some of the key enzymes, a new mechanism that is different from its well-known inhibitory effect of hemozoin formation. This is the first report of identifying CQ antimalarial targets by a parallel usage of labeled (ABPP) and label-free (MS-CETSA) methods.


Subjects
Antimalarials , Malaria , Antimalarials/pharmacology , Antimalarials/therapeutic use , Chloroquine/pharmacology , Chloroquine/therapeutic use , Humans , Malaria/drug therapy , Mass Spectrometry
8.
Mil Med Res ; 9(1): 22, 2022 05 20.
Article in English | MEDLINE | ID: mdl-35596191

ABSTRACT

BACKGROUND: Sepsis involves life-threatening organ dysfunction and is caused by a dysregulated host response to infection. No specific therapies against sepsis have been reported. Celastrol (Cel) is a natural anti-inflammatory compound that shows potential against systemic inflammatory diseases. This study aimed to investigate the pharmacological activity and molecular mechanism of Cel in models of endotoxemia and sepsis. METHODS: We evaluated the anti-inflammatory efficacy of Cel against endotoxemia and sepsis in mice and macrophage cultures treated with lipopolysaccharide (LPS). We screened for potential protein targets of Cel using activity-based protein profiling (ABPP). Potential targets were validated using biophysical methods such as cellular thermal shift assays (CETSA) and surface plasmon resonance (SPR). Residues involved in Cel binding to target proteins were identified through point mutagenesis, and the functional effects of such binding were explored through gene knockdown. RESULTS: Cel protected mice from lethal endotoxemia and improved their survival with sepsis, and it significantly decreased the levels of pro-inflammatory cytokines in mice and macrophages treated with LPS (P < 0.05). Cel bound to Cys424 of pyruvate kinase M2 (PKM2), inhibiting the enzyme and thereby suppressing aerobic glycolysis (Warburg effect). Cel also bound to Cys106 in high mobility group box 1 (HMGB1) protein, reducing the secretion of inflammatory cytokine interleukin (IL)-1β. Cel bound to the Cys residues in lactate dehydrogenase A (LDHA). CONCLUSION: Cel inhibits inflammation and the Warburg effect in sepsis via targeting PKM2 and HMGB1 protein.


Subjects
Endotoxemia , HMGB1 Protein , Sepsis , Animals , Anti-Inflammatory Agents/therapeutic use , Cytokines/therapeutic use , Endotoxemia/drug therapy , HMGB1 Protein/metabolism , HMGB1 Protein/therapeutic use , Humans , Inflammation/drug therapy , Inflammation/metabolism , Lipopolysaccharides/therapeutic use , Mice , Pentacyclic Triterpenes , Pyruvate Kinase/genetics , Pyruvate Kinase/metabolism , Pyruvate Kinase/therapeutic use , Sepsis/drug therapy
9.
BMC Med Inform Decis Mak ; 22(1): 69, 2022 03 19.
Article in English | MEDLINE | ID: mdl-35305630

ABSTRACT

BACKGROUND: MiRNAs are a class of non-coding single-stranded RNA molecules of approximately 22 nucleotides, encoded by endogenous genes, that can regulate the expression of other genes. Predicting the associations between miRNAs and diseases is therefore very important. Earlier work developed a new prediction method for drug-disease associations that achieved good results. METHODS: In this paper, we introduce the LAGCN method to identify potential miRNA-disease associations. First, we integrate three kinds of associations, namely the known miRNA-disease associations, miRNA-miRNA similarities, and disease-disease similarities, into a heterogeneous network. Next, we apply a graph convolutional network to learn the embeddings of miRNAs and diseases. We use an attention mechanism to combine the embeddings from multiple convolution layers. Unobserved miRNA-disease associations are scored based on the integrated embedding. RESULTS: Under fivefold cross-validation, the AUC reaches 0.9091, which is higher than that of other prediction methods and baseline methods. CONCLUSIONS: LAGCN achieves good performance in predicting miRNA-disease associations and is superior to other association prediction methods and baseline methods.


Subjects
MicroRNAs , Algorithms , Computational Biology/methods , Humans , MicroRNAs/genetics
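Entry 9 summarizes performance as an AUC value. As a hedged illustration of what that number measures, the sketch below computes the area under the ROC curve through its rank interpretation: the probability that a randomly chosen known association is scored above a randomly chosen non-association, with ties counted as one half. The function name `roc_auc` and the toy scores are assumptions; this is not the evaluation code of the cited work.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    labels: 1 for a known association, 0 otherwise (assumed encoding).
    scores: predicted association scores, higher meaning more likely.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive ranked above negative
            elif p == q:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy example: one positive is ranked below one negative, so AUC < 1.
auc = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1])
```

The quadratic loop is chosen for clarity; for large test folds a sort-based rank computation gives the same value more efficiently.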
10.
BMC Bioinformatics ; 22(Suppl 12): 334, 2022 Jan 20.
Article in English | MEDLINE | ID: mdl-35057729

ABSTRACT

BACKGROUND: The identification of cancer types is of great significance for early diagnosis and clinical treatment of cancer. Clustering cancer samples is an important means to identify cancer types, which has been paid much attention in the field of bioinformatics. The purpose of cancer clustering is to find expression patterns of different cancer types, so that the samples with similar expression patterns can be gathered into the same type. In order to improve the accuracy and reliability of cancer clustering, many clustering methods begin to focus on the integration analysis of cancer multi-omics data. Obviously, the methods based on multi-omics data have more advantages than those using single omics data. However, the high heterogeneity and noise of cancer multi-omics data pose a great challenge to the multi-omics analysis method. RESULTS: In this study, in order to extract more complementary information from cancer multi-omics data for cancer clustering, we propose a low-rank subspace clustering method called multi-view manifold regularized compact low-rank representation (MmCLRR). In MmCLRR, each omics data are regarded as a view, and it learns a consistent subspace representation by imposing a consistence constraint on the low-rank affinity matrix of each view to balance the agreement between different views. Moreover, the manifold regularization and concept factorization are introduced into our method. Relying on the concept factorization, the dictionary can be updated in the learning, which greatly improves the subspace learning ability of low-rank representation. We adopt linearized alternating direction method with adaptive penalty to solve the optimization problem of MmCLRR method. CONCLUSIONS: Finally, we apply MmCLRR into the clustering of cancer samples based on multi-omics data, and the clustering results show that our method outperforms the existing multi-view methods.


Subjects
Algorithms , Neoplasms , Cluster Analysis , Computational Biology , Humans , Neoplasms/genetics , Reproducibility of Results
11.
Rep Prog Phys ; 84(7)2021 Jun 09.
Article in English | MEDLINE | ID: mdl-33882459

ABSTRACT

The description of strong interaction physics of low-lying resonances is out of the valid range of perturbative QCD. Chiral effective field theories (EFTs) have been developed to tackle the issue. Partial wave dynamics is the systematic tool to decode the underlying physics and reveal the properties of those resonances. It is extremely powerful and helpful for our understanding of the non-perturbative regime, especially when dispersion techniques are utilized simultaneously. Recently, plenty of exotic/ordinary hadrons have been reported by experimental collaborations such as LHCb, Belle, and BESIII. In this review, we summarize the recent progress on the applications of partial wave dynamics combined with chiral EFTs and dispersion relations, on related topics, with emphasis on ππ, πK, πN and K̄N scatterings.

12.
BMC Bioinformatics ; 22(1): 175, 2021 Apr 01.
Article in English | MEDLINE | ID: mdl-33794766

ABSTRACT

BACKGROUND: Identifying lncRNA-disease associations not only helps to better comprehend the underlying mechanisms of various human diseases at the lncRNA level but also speeds up the identification of potential biomarkers for disease diagnoses, treatments, prognoses, and drug response predictions. However, as the amount of archived biological data continues to grow, it has become increasingly difficult to detect potential human lncRNA-disease associations from these enormous biological datasets using traditional biological experimental methods. Consequently, developing new and effective computational methods to predict potential human lncRNA diseases is essential. RESULTS: Using a combination of incremental principal component analysis (IPCA) and random forest (RF) algorithms and by integrating multiple similarity matrices, we propose a new algorithm (IPCARF) based on integrated machine learning technology for predicting lncRNA-disease associations. First, we used two different models to compute a semantic similarity matrix of diseases from a directed acyclic graph of diseases. Second, a characteristic vector for each lncRNA-disease pair is obtained by integrating disease similarity, lncRNA similarity, and Gaussian nuclear similarity. Then, the best feature subspace is obtained by applying IPCA to decrease the dimension of the original feature set. Finally, we train an RF model to predict potential lncRNA-disease associations. The experimental results show that the IPCARF algorithm effectively improves the AUC metric when predicting potential lncRNA-disease associations. Before the parameter optimization procedure, the AUC value predicted by the IPCARF algorithm under 10-fold cross-validation reached 0.8529; after selecting the optimal parameters using the grid search algorithm, the predicted AUC of the IPCARF algorithm reached 0.8611. 
CONCLUSIONS: We compared IPCARF with the existing LRLSLDA, LRLSLDA-LNCSIM, TPGLDA, NPCMF, and ncPred prediction methods, which have shown excellent performance in predicting lncRNA-disease associations. The compared results of 10-fold cross-validation procedures show that the predictions of the IPCARF method are better than those of the other compared methods.


Subjects
Computational Biology , Machine Learning , Long Noncoding RNA , Algorithms , Humans , Principal Component Analysis , Long Noncoding RNA/genetics
13.
Front Genet ; 12: 621317, 2021.
Article in English | MEDLINE | ID: mdl-33708239

ABSTRACT

Dimensionality reduction methods combined with different norm constraints play an important role in mining useful information from large-scale gene expression data. In this article, a novel method named Lp-norm and L2,1-norm constrained graph Laplacian principal component analysis (PL21GPCA), based on traditional principal component analysis (PCA), is proposed for robust tumor sample clustering and gene network module discovery. Three aspects are highlighted in the PL21GPCA method. First, to reduce the high sensitivity to outliers and noise, the non-convex proximal Lp-norm (0 < p < 1) constraint is applied to the loss function. Second, to enhance the sparsity of gene expression in cancer samples, the L2,1-norm constraint is used on one of the regularization terms. Third, to retain the geometric structure of the data, we introduce a graph Laplacian regularization term into the PL21GPCA optimization model. Extensive experiments on five gene expression datasets, including one benchmark dataset, two single-cancer datasets from The Cancer Genome Atlas (TCGA), and two integrated datasets of multiple cancers from TCGA, are performed to validate the effectiveness of our method. The experimental results demonstrate that the PL21GPCA method performs better than many other methods in terms of tumor sample clustering. Additionally, this method is used to discover gene network modules for the purpose of finding key genes that may be associated with some cancers.
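The L2,1-norm named in entry 13 can be illustrated with a short sketch. Under the common row-wise convention (an assumption here, since some papers sum over columns instead), it is the sum of the Euclidean norms of the rows of a matrix, so minimizing it pushes entire rows toward zero. The function name `l21_norm` and the toy matrix are illustrative and are not taken from the cited work.

```python
import math


def l21_norm(matrix):
    """Sum of the Euclidean (L2) norms of the rows of a matrix.

    Assumed convention: rows index the units (e.g. genes) to be sparsified.
    """
    return sum(math.sqrt(sum(v * v for v in row)) for row in matrix)


# Toy 3 x 2 matrix: the all-zero middle row contributes nothing to the norm,
# which is why penalizing this norm encourages row-wise sparsity.
M = [
    [3.0, 4.0],   # row norm 5
    [0.0, 0.0],   # row norm 0
    [5.0, 12.0],  # row norm 13
]
value = l21_norm(M)
```

Compared with the squared Frobenius norm, each row enters without being squared, so a single outlying row inflates the penalty less.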

14.
J Bioinform Comput Biol ; 19(1): 2050047, 2021 02.
Article in English | MEDLINE | ID: mdl-33410727

ABSTRACT

Non-negative Matrix Factorization (NMF) has become a popular dimensionality reduction method in recent years. The traditional NMF method is highly sensitive to data noise. In this paper, we propose a model called Sparse Robust Graph-regularized Non-negative Matrix Factorization based on Correntropy (SGNMFC). Maximizing correntropy replaces the traditional minimization of Euclidean distance to improve the robustness of the algorithm. Through the kernel function, correntropy gives less weight to outliers and noise in the data and greater weight to meaningful data. Meanwhile, the geometric structure of the high-dimensional data is preserved in the low-dimensional manifold through graph regularization. Feature selection and sample clustering are commonly used methods for analyzing genes. Sparse constraints are applied to the loss function to reduce matrix complexity and analysis difficulty. Compared with five similar methods, the effectiveness of the SGNMFC model is demonstrated by differentially expressed gene selection and sample clustering experiments on three datasets from The Cancer Genome Atlas (TCGA).


Subjects
Algorithms , Computational Biology/methods , Gene Expression , Neoplasms/genetics , Cluster Analysis , Computer Graphics , Statistical Data Interpretation , Genetic Databases , Neoplastic Gene Expression Regulation , Humans
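Entry 14 notes that correntropy, through its kernel function, gives less weight to outliers. The sketch below illustrates that effect with a Gaussian kernel applied to residuals, as is typical in correntropy-based objectives. It is a generic illustration under stated assumptions, not the SGNMFC update rule: the kernel width `sigma`, the function names, and the toy residuals are all hypothetical.

```python
import math


def correntropy_weights(residuals, sigma=1.0):
    """Gaussian-kernel weights exp(-r^2 / (2 sigma^2)), one per residual."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return [math.exp(-(r * r) / (2.0 * sigma * sigma)) for r in residuals]


def weighted_squared_error(residuals, weights):
    """Squared error in which each residual is scaled by its weight."""
    return sum(w * r * r for r, w in zip(residuals, weights))


# Toy residuals: two small reconstruction errors and one gross outlier.
residuals = [0.0, 0.2, 5.0]
weights = correntropy_weights(residuals, sigma=1.0)

plain = sum(r * r for r in residuals)                  # dominated by the outlier
robust = weighted_squared_error(residuals, weights)    # outlier nearly ignored
```

With a plain Euclidean loss the outlier accounts for almost all of the error, whereas under the kernel weighting its contribution becomes negligible. The choice of `sigma` controls how quickly large residuals are discounted.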
15.
Hum Hered ; 84(1): 21-33, 2019.
Article in English | MEDLINE | ID: mdl-31466058

ABSTRACT

Selecting differentially expressed genes has become both a hot topic and a challenge in molecular biology. Low-rank representation (LRR) combined with graph Laplacian regularization has achieved good results in this field. However, graph regularization cannot capture the co-expression information of the data well. Therefore, a novel low-rank representation method regularized by a dual-hypergraph Laplacian, called dual-hypergraph Laplacian regularized LRR (DHLRR), is proposed to reveal the intrinsic geometrical structures hidden in the sample and gene directions simultaneously. DHLRR recovers a low-rank matrix and a sparse perturbation matrix from genomic data. Because differentially expressed genes are sparse, the sparse perturbation matrix can be used to extract them. In our experiments, two gene analysis tools are used to discuss the experimental results. The results on two real genomic datasets and an integrated dataset show that DHLRR is efficient and effective in finding differentially expressed genes.


Subjects
Neoplastic Gene Expression Regulation , Genomics/methods , Pancreatic Neoplasms/genetics , Squamous Cell Carcinoma of Head and Neck/genetics , Humans
16.
BMC Bioinformatics ; 20(Suppl 8): 287, 2019 Jun 10.
Article in English | MEDLINE | ID: mdl-31182006

ABSTRACT

BACKGROUND: Identifying drug-target interactions experimentally is time-consuming and expensive, so the accuracy of computational prediction methods is important. Many algorithms predict interactions globally, and some of them use drug-target networks for prediction (i.e., a bipartite graph of drugs and targets known to interact). Although these algorithms can predict some drug-target interactions, they have little effect for new drugs or targets that have no known interactions. RESULTS: Since the datasets usually lie on or near low-dimensional nonlinear manifolds, we propose an improved GRMF (graph regularized matrix factorization) method that learns these manifolds in combination with the previous matrix-decomposition method. In addition, we use one of the previously proposed pre-processing steps to improve the accuracy of the prediction. CONCLUSIONS: Cross-validation is used to evaluate our method, and simulation experiments are used to predict new interactions. In most cases, our method is superior to other methods. Finally, some examples of new drugs and new targets are predicted by performing simulation experiments, and the improved GRMF method can better predict the remaining drug-target interactions.


Subjects
Algorithms , Drug Interactions , Databases as Topic , Humans , Reproducibility of Results
17.
BMC Bioinformatics ; 20(1): 16, 2019 Jan 09.
Article in English | MEDLINE | ID: mdl-30626319

ABSTRACT

BACKGROUND: Long non-coding RNAs (lncRNAs) play an important role in the development, invasion, and metastasis of tumors. Analyzing and screening the differential expression of lncRNAs in cancer and corresponding paracancerous tissues provides new clues for finding new cancer diagnostic indicators and improving treatment. Predicting lncRNA-protein interactions is very important in the analysis of lncRNAs. This article proposes an Ant-Colony-Clustering-Based Bipartite Network (ACCBN) method, which combines ant colony clustering and bipartite network inference to predict lncRNA-protein interactions. RESULTS: A five-fold cross-validation method was used in the experimental test. Comparing the predictive ability of ACCBN with that of the RWR, ProCF, LPIHN, and LPBNI methods shows that the evaluation indicators of ACCBN on the test set are significantly better. CONCLUSIONS: With the continuous development of biology, besides research on cellular processes, research on the interactions between proteins has become a new key topic. Studies on protein-protein interactions have important implications for bioinformatics, clinical medicine, and pharmacology. However, there are many kinds of proteins, and their interactions are complicated. Moreover, experimental methods are time-consuming, and their outcomes are difficult to estimate. Therefore, a viable solution is to predict protein-protein interactions efficiently with computers. The ACCBN method performs well in the prediction of protein-protein interactions in terms of sensitivity, precision, accuracy, and F1-score.


Subjects
Computational Biology/methods , Long Noncoding RNA/genetics , Algorithms , Humans
18.
BMC Bioinformatics ; 20(1): 5, 2019 Jan 05.
Article in English | MEDLINE | ID: mdl-30611214

ABSTRACT

BACKGROUND: Predicting drug-disease interactions (DDIs) is time-consuming and expensive. Improving the accuracy of prediction results is necessary, and it is crucial to develop a novel computing technology to predict new DDIs. Most existing methods construct heterogeneous networks to predict new DDIs. However, the number of known interacting drug-disease pairs is small, so such a heterogeneous network contains many errors that interfere with the final results. RESULTS: A novel method, known as the dual-network L2,1-collaborative matrix factorization, is proposed to predict novel DDIs. Gaussian interaction profile kernels and the L2,1-norm are introduced in our method to achieve better results than other advanced methods. The method combines the network similarities of drugs and diseases with their chemical and semantic similarities. CONCLUSIONS: Cross-validation is used to evaluate our method, and simulation experiments are used to predict new interactions using two different datasets. The prediction accuracy of our method is higher than that of other existing methods, which indicates that our method is feasible and effective.


Subjects
Algorithms , Computational Biology/methods , Disease , Drug Interactions , Area Under Curve , Databases as Topic , Humans , Reproducibility of Results , Semantics
19.
BMC Bioinformatics ; 20(Suppl 22): 716, 2019 Dec 30.
Article in English | MEDLINE | ID: mdl-31888433

ABSTRACT

BACKGROUND: In recent years, identification of differentially expressed genes and sample clustering have become hot topics in bioinformatics. Principal Component Analysis (PCA) is a widely used method for gene expression data. However, it has two limitations. First, the geometric structure hidden in the data, e.g., the pair-wise distance between data points, is not explored, although this information can facilitate sample clustering. Second, the Principal Components (PCs) determined by PCA are dense, which makes them hard to interpret, whereas only a few genes are related to the cancer. Identifying a handful of differentially expressed genes and finding new cancer biomarkers is of great significance for the early diagnosis and treatment of cancer. RESULTS: In this study, a new method, gLSPCA, is proposed to integrate both a graph Laplacian and a sparse constraint into PCA. On the one hand, gLSPCA improves clustering accuracy by exploring the internal geometric structure of the data; on the other hand, it identifies differentially expressed genes by imposing a sparsity constraint on the PCs. CONCLUSIONS: Experiments comparing gLSPCA with existing methods, including Z-SPCA, GPower, PathSPCA, SPCArt, and gLPCA, are performed on real datasets of both pancreatic cancer (PAAD) and head & neck squamous carcinoma (HNSC). The results demonstrate that gLSPCA is effective in identifying differentially expressed genes and in sample clustering. In addition, the applications of gLSPCA to these datasets provide several new clues for the exploration of causative factors of PAAD and HNSC.
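
To make the "geometric structure" idea concrete, the sketch below builds an unnormalized graph Laplacian L = D - W from pair-wise distances between samples. It is a generic illustration only: the k-nearest-neighbour graph with binary weights, the function name `knn_graph_laplacian`, and the toy points are assumptions, and the abstract does not state how gLSPCA constructs its graph.

```python
def knn_graph_laplacian(points, k=2):
    """Unnormalized graph Laplacian L = D - W of a symmetric k-NN graph.

    points: list of samples, each a list of feature values.
    Two samples are connected (weight 1) if either one is among the
    k nearest neighbours of the other (an assumed construction).
    """
    n = len(points)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Binary, symmetric adjacency matrix W.
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = [j for j in range(n) if j != i]
        nearest = sorted(others, key=lambda j: sq_dist(points[i], points[j]))[:k]
        for j in nearest:
            W[i][j] = 1.0
            W[j][i] = 1.0

    # Degree of each node, then L = D - W.
    degree = [sum(row) for row in W]
    return [
        [(degree[i] if i == j else 0.0) - W[i][j] for j in range(n)]
        for i in range(n)
    ]


# Two well-separated pairs of toy samples.
L = knn_graph_laplacian([[0, 0], [0, 1], [5, 5], [5, 6]], k=1)
```

Every row of such a Laplacian sums to zero, and a graph regularization term built from it penalizes low-dimensional representations that place neighbouring samples far apart.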


Subjects
Algorithms , Databases, Genetic , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Principal Component Analysis , Cluster Analysis , Gene Expression , Humans , Neoplasms/genetics , Protein Interaction Maps
20.
BMC Bioinformatics ; 20(Suppl 22): 718, 2019 Dec 30.
Article in English | MEDLINE | ID: mdl-31888442

ABSTRACT

BACKGROUND: Identifying different types of cancer based on gene expression data has become a hotspot in bioinformatics research. Clustering gene expression data from multiple cancers into their own classes is a significant solution. However, the high dimensionality, small sample size, and noise of gene expression data make data mining and research difficult. Although there are many effective and feasible methods to deal with this problem, the possibility remains that these methods are flawed. RESULTS: In this paper, we propose the graph regularized low-rank representation under symmetric and sparse constraints (sgLRR) method, in which we introduce graph regularization based on manifold learning and symmetric sparse constraints into the traditional low-rank representation (LRR). In the sgLRR method, the symmetric and sparse constraints alleviate the effect of raw data noise on the low-rank representation. Further, the sgLRR method preserves the important intrinsic local geometrical structures of the raw data by introducing graph regularization. We apply this method to cluster multi-cancer samples based on gene expression data, which improves the clustering quality. First, the gene expression data are decomposed by the sgLRR method, yielding a lowest-rank representation matrix that is symmetric and sparse. Then, an affinity matrix is constructed to perform the multi-cancer sample clustering by using a spectral clustering algorithm, i.e., normalized cuts (Ncuts). Finally, the multi-cancer sample clustering is completed. CONCLUSIONS: A series of comparative experiments demonstrates that the sgLRR method, which is based on low-rank representation, has a great advantage and remarkable performance in the clustering of multi-cancer samples.
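
The step from a representation matrix to an affinity matrix can be sketched as follows. The abstract does not say how sgLRR builds its affinity matrix, so this example assumes a construction that is common in low-rank-representation clustering, W = (|Z| + |Z|^T) / 2, followed by the symmetric normalization D^(-1/2) W D^(-1/2) that normalized-cut style spectral clustering operates on. The function names and the toy matrix `Z` are hypothetical.

```python
import math


def affinity_from_representation(Z):
    """Symmetric, non-negative affinity W = (|Z| + |Z|^T) / 2.

    Z is a square sample-by-sample representation matrix (assumed
    to come from an LRR-style decomposition).
    """
    n = len(Z)
    return [
        [(abs(Z[i][j]) + abs(Z[j][i])) / 2.0 for j in range(n)]
        for i in range(n)
    ]


def normalized_affinity(W):
    """Symmetric normalization D^(-1/2) W D^(-1/2) of an affinity matrix."""
    n = len(W)
    degree = [sum(row) for row in W]
    # Isolated nodes (degree 0) keep all-zero rows and columns.
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in degree]
    return [
        [inv_sqrt[i] * W[i][j] * inv_sqrt[j] for j in range(n)]
        for i in range(n)
    ]


# Toy 3 x 3 representation matrix (not from the paper).
Z = [[0.0, 0.8, 0.0], [0.6, 0.0, -0.2], [0.0, 0.2, 0.0]]
W = affinity_from_representation(Z)
N = normalized_affinity(W)
```

A spectral clustering routine would then take the leading eigenvectors of the normalized matrix and group the samples, for example with k-means; that step is omitted here to keep the sketch free of external dependencies.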


Subjects
Algorithms , Neoplasms/genetics , Cluster Analysis , Data Mining , Databases, Genetic , Humans , Multifactor Dimensionality Reduction , Oncogenes
...