1.
IEEE J Biomed Health Inform ; 27(11): 5665-5674, 2023 11.
Article in English | MEDLINE | ID: mdl-37656653

ABSTRACT

It is critical to correctly assemble high-dimensional single-cell RNA sequencing (scRNA-seq) datasets and reduce their dimensionality for downstream analysis. However, given the complex relationships between cells, it remains a challenge to simultaneously eliminate batch effects between datasets and maintain the topology between cells within each dataset. Here, we propose scGAMNN, a deep learning model based on a graph autoencoder, to simultaneously achieve batch correction and topology-preserving dimensionality reduction. The low-dimensional integrated data obtained by scGAMNN can be used for visualization, clustering and trajectory inference. In comparisons with five other methods across multiple tasks, scGAMNN consistently shows comparable data integration performance in clustering and trajectory conservation.


Subject(s)
Algorithms , Single-Cell Analysis , Humans , Cluster Analysis , Sequence Analysis, RNA , Gene Expression Profiling
2.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37406190

ABSTRACT

Studies have confirmed that the occurrence of many complex human diseases is closely related to the microbial community, and that microbes can affect tumorigenesis and metastasis by regulating the tumor microenvironment. However, there are still large gaps in the clinical observation of the microbiota in disease. Although biological experiments are accurate in identifying disease-associated microbes, they are also time-consuming and expensive. Computational models that effectively identify disease-related microbes can shorten this process and reduce capital and time costs. In this paper, a model named DSAE_RF is presented to predict latent microbe-disease associations by combining multi-source features and deep learning. DSAE_RF calculates four similarities between microbes and diseases, which are then used as feature vectors for the disease-microbe pairs. Reliable negative samples are then screened by k-means clustering, and a deep sparse autoencoder neural network is used to extract effective features of the disease-microbe pairs. On this foundation, a random forest classifier is used to predict the associations between microbes and diseases. To assess the performance of the model, 10-fold cross-validation is implemented on the same dataset; the resulting AUC and AUPR are 0.9448 and 0.9431, respectively. Furthermore, we conduct a variety of experiments, including a comparison of negative sample selection methods, comparisons with different models and classifiers, the Kolmogorov-Smirnov test and t-test, ablation experiments, robustness analysis, and case studies on COVID-19 and colorectal cancer. The results demonstrate the reliability and availability of our model.


Subject(s)
COVID-19 , Deep Learning , Microbiota , Humans , Reproducibility of Results , Algorithms , Computational Biology/methods
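The evaluation protocol named in the DSAE_RF abstract above (10-fold cross-validation scored by AUC) can be sketched in plain Python. This is a minimal illustration, not the authors' code: the function names `k_fold_indices` and `roc_auc` are hypothetical, the folds are contiguous (a real run would normally shuffle the samples first), and the AUC is computed by the standard pairwise-ranking definition.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists for k contiguous, near-equal folds.

    Assumption: samples are already in random order; shuffle beforehand if not.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy check: three of the four positive/negative pairs are ordered correctly.
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```

In a full pipeline, a classifier would be fitted on each `train` split and `roc_auc` applied to its scores on the matching `test` split, with the fold scores then averaged.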
3.
Brief Bioinform ; 24(3)2023 05 19.
Article in English | MEDLINE | ID: mdl-37170676

ABSTRACT

Although many proposed single-cell computational methods use gene expression as input, recent studies show that replacing 'unstable' gene expression with 'stable' gene-gene associations can greatly improve the performance of downstream analysis. To obtain accurate gene-gene associations, the conditional cell-specific network method (c-CSN) filters out the indirect associations of the cell-specific network method (CSN) based on the conditional independence of statistics. However, when there are strong connections in networks, the c-CSN suffers from a false-negative problem in network construction. To overcome this problem, a new partial cell-specific network method (p-CSN) based on the partial independence of statistics is proposed in this paper, which eliminates the singularity of the c-CSN by implicitly including direct associations among estimated variables. Based on the p-CSN, single-cell network entropy (scNEntropy) is further proposed to quantify cell state. The advantages of our method are verified on several datasets. (i) Unlike traditional gene regulatory network construction methods, the p-CSN constructs partial cell-specific networks, namely, one network per cell. (ii) When there are strong connections in networks, the p-CSN reduces the false-negative probability of the c-CSN. (iii) The input of more accurate gene-gene associations further improves the performance of downstream analyses. (iv) The scNEntropy effectively quantifies cell state and reconstructs cell pseudo-time.


Subject(s)
Gene Regulatory Networks , Sequence Analysis, RNA
4.
Comput Biol Med ; 151(Pt A): 106249, 2022 12.
Article in English | MEDLINE | ID: mdl-36335815

ABSTRACT

The deterioration and metastasis of cancer involve various genomic changes, including genomic DNA changes, epigenetic modifications, gene expression, and other complex interactions. Therefore, integrating single-cell multi-omics data to construct gene regulatory networks containing more omics information is of great significance for understanding the pathogenesis of cancer. In this article, we propose scBPGRN, an algorithm that integrates single-cell RNA sequencing data and DNA methylation data to construct a gene regulatory network based on the back-propagation (BP) neural network. The algorithm uses biweight extreme correlation coefficients to measure the correlation between factors and uses neural networks to calculate generalized weights to construct gene regulatory networks. Finally, the node strength is calculated to identify the genes associated with cancer. We apply the scBPGRN algorithm to hepatocellular carcinoma (HCC) data. We construct a regulatory network and identify top-ranked genes, such as MYCBP, KLHL35, PRKCZ, and SERPINA6, as the key HCC-related genes. We analyze the top 100 genes, and the HCC-related genes are concentrated in the top 20. In addition, the single-cell data are found to consist of two subpopulations. We also apply scBPGRN to the two subpopulations. We analyze the top 50 genes in each, and the HCC-related genes are concentrated in the top 20. The results of functional enrichment analysis indicate that the gene regulatory network we have constructed is valid. Our results are supported by several published studies. This study provides a reference for the integration of single-cell multi-omics data to construct gene regulatory networks.


Subject(s)
Carcinoma, Hepatocellular , Liver Neoplasms , MicroRNAs , Humans , Gene Regulatory Networks/genetics , Carcinoma, Hepatocellular/genetics , Gene Expression Regulation, Neoplastic/genetics , MicroRNAs/genetics , Liver Neoplasms/genetics , Neural Networks, Computer
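The final step of the scBPGRN abstract above, ranking genes by node strength, can be illustrated with a short sketch. It assumes the common definition of node strength (the sum of absolute weights of the edges touching a node) on an undirected weighted network; the abstract does not say whether scBPGRN uses exactly this form. The function names, the placeholder gene labels and the edge weights are all hypothetical and are not data from the paper.

```python
def node_strength(edges):
    """Return {node: sum of |weight| over incident edges}.

    Assumption: `edges` is an iterable of (node_u, node_v, weight) tuples
    describing an undirected weighted network.
    """
    strength = {}
    for u, v, w in edges:
        strength[u] = strength.get(u, 0.0) + abs(w)
        strength[v] = strength.get(v, 0.0) + abs(w)
    return strength


def rank_nodes(edges, top=None):
    """Sort nodes by decreasing strength (ties broken by name); keep `top` if given."""
    ranked = sorted(node_strength(edges).items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top]


# Toy network with made-up weights; the labels are placeholders, not real genes.
toy_edges = [
    ("geneA", "geneB", 0.5),
    ("geneA", "geneC", -0.25),
    ("geneB", "geneC", 0.125),
    ("geneC", "geneD", 1.0),
]
print(rank_nodes(toy_edges, top=2))  # [('geneC', 1.375), ('geneD', 1.0)]
```

Taking absolute values means strong repression counts as much as strong activation. A directed variant would track in-strength and out-strength separately.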
5.
Brief Bioinform ; 23(5)2022 09 20.
Article in English | MEDLINE | ID: mdl-35514181

ABSTRACT

With the development of high-throughput technologies, the accumulation of large amounts of multidimensional genomic data provides an excellent opportunity to study the multilevel biological regulatory relationships in cancer. According to the competitive endogenous RNA (ceRNA) network hypothesis, long noncoding RNAs (lncRNAs) can relieve the inhibition of microRNAs (miRNAs) on their target genes by binding to intracellular miRNA sites, thereby increasing the expression level of these target genes. However, previous studies on cancer expression mechanisms are mostly based on individual or two-dimensional data and lack integration and analysis of various RNA-seq data, making it difficult to verify the complex biological relationships involved. To explore RNA expression patterns and potential molecular mechanisms of cancer, a network-regularized sparse orthogonal-regularized joint non-negative matrix factorization (NSOJNMF) algorithm is proposed. It incorporates the interaction relations among RNA-seq data through network regularization and effectively prevents multicollinearity through sparse constraints and orthogonal regularization constraints, generating good modular sparse solutions. The NSOJNMF algorithm is applied to liver cancer and colon cancer datasets, and their ceRNA co-modules are identified. Enrichment analysis of these modules shows that >90% of them are closely related to the occurrence and development of cancer. In addition, the ceRNA networks constructed from the ceRNA co-modules not only accurately recover the known correlations of the three RNA molecules but also reveal potential biological associations among them, which may contribute to the exploration of the competitive relationships among multiple RNAs and the molecular mechanisms affecting tumor development.


Subject(s)
Colonic Neoplasms , MicroRNAs , RNA, Long Noncoding , Colonic Neoplasms/genetics , Gene Expression Regulation, Neoplastic , Gene Regulatory Networks , Genomics , Humans , MicroRNAs/genetics , MicroRNAs/metabolism , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism , RNA, Messenger/genetics
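For readers unfamiliar with the factorization underlying the NSOJNMF abstract above, the sketch below shows plain non-negative matrix factorization with the standard multiplicative updates for the squared Frobenius error. It is deliberately not NSOJNMF: the joint factorization across several RNA matrices, the network regularization, and the sparsity and orthogonality constraints described in the abstract are all omitted. The function names and the toy matrix are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(x, w, h):
    """Squared Frobenius norm of X - W H."""
    wh = matmul(w, h)
    return sum((xi - yi) ** 2 for rx, ry in zip(x, wh) for xi, yi in zip(rx, ry))


def nmf(x, rank, n_iter=200, seed=0, eps=1e-9):
    """Approximate non-negative X (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T X) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, x)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(x, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


# Toy non-negative matrix (made-up values, rows could stand for samples).
x = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
w, h = nmf(x, rank=2)
print(frobenius_error(x, w, h))
```

Because every update multiplies a non-negative entry by a non-negative ratio, W and H stay non-negative throughout. In module-discovery settings, the large entries in each column of W or row of H are typically read as module memberships.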
6.
Interdiscip Sci ; 14(2): 394-408, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35028910

ABSTRACT

Cell type determination based on transcriptome profiles is a key application of single-cell RNA sequencing (scRNA-seq). It is usually achieved through unsupervised clustering. Good feature selection can improve clustering accuracy and is a crucial component of single-cell clustering pipelines. However, most current single-cell feature selection methods are univariable filter methods that ignore gene dependency. Even the multivariable filter methods developed in recent years consider only "one-to-many" relationships between genes. In this paper, a novel single-cell feature selection method based on convex analysis of mixtures (FSCAM) is proposed, which takes "many-to-many" relationships into account. Compared with previous "one-to-many" methods, FSCAM selects genes with a combination of relevancy, redundancy and completeness. Benchmarking on real datasets is conducted to validate the superiority of FSCAM. By plugging FSCAM into the framework of partition around medoids (PAM) clustering, a single-cell clustering algorithm (SCC_FSCAM) is further developed. Compared with existing advanced clustering algorithms, our algorithm shows advantages in both internal criteria (clustering number) and external criteria (adjusted Rand index) and has good stability.


Subject(s)
Algorithms , Single-Cell Analysis , Cluster Analysis , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Transcriptome
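The external criterion named in the SCC_FSCAM abstract above, the adjusted Rand index (ARI), can be computed from a contingency table of the two labelings. The sketch below follows the standard pair-counting formula. The function name is hypothetical, and returning 1.0 for degenerate inputs (for example, when both labelings put all cells in a single cluster) is a convention chosen here, not something stated in the abstract.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Chance-corrected agreement between two clusterings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: treated as perfect agreement
    # Pairs of items placed together in both clusterings.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    # Pairs placed together in each clustering separately.
    together_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    together_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = together_true * together_pred / total_pairs
    maximum = (together_true + together_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings are one big cluster
    return (together_both - expected) / (maximum - expected)


# Cluster names do not matter, only the grouping: this prints 1.0.
print(adjusted_rand_index([0, 0, 1, 1], ["b", "b", "a", "a"]))
```

An ARI of 1 means the predicted clusters match the reference cell types exactly, values near 0 mean agreement no better than chance, and negative values mean worse than chance.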