Results 1 - 9 of 9
1.
BMC Genomics ; 25(1): 574, 2024 Jun 07.
Article in English | MEDLINE | ID: mdl-38849762

ABSTRACT

BACKGROUND: The Qinghai Tibetan sheep, a local breed renowned for its long hair, has experienced significant deterioration in wool characteristics due to the absence of systematic breeding practices. Therefore, it is imperative to investigate the molecular mechanisms underlying follicle development in order to genetically enhance wool-related traits and safeguard the sustainable utilization of valuable germplasm resources. However, the regulatory roles played by coding and non-coding RNAs in hair follicle development remain largely unknown. RESULTS: A total of 20,874 mRNAs, 25,831 circRNAs, 4,087 lncRNAs, and 794 miRNAs were annotated. Among them, we identified 58 differentially expressed (DE) lncRNAs, 325 DE circRNAs, 924 DE mRNAs, and 228 DE miRNAs during medullary primary hair follicle development. GO and KEGG functional enrichment analyses revealed that the JAK-STAT, TGF-β, Hedgehog, PPAR, and cGMP-PKG signaling pathways play crucial roles in regulating fibroblast and epithelial development during skin and hair follicle induction. Furthermore, interactive network analysis identified several crucial mRNA, circRNA, and lncRNA molecules associated with primary hair follicle development. Finally, by investigating the role of DE miRNAs in the ceRNA regulatory network, we identified 113 circRNA-miRNA pairs and 14 miRNA-mRNA pairs, including IGF2BP1-miR-23-x-novel-circ-01998-MSTRG.7111.3, DPT-miR-370-y-novel-circ-005802-MSTRG.14857.1 and TSPEAR-oar-miR-370-3p-novel-circ-005802-MSTRG.10527.1. CONCLUSIONS: Our study offers novel insights into the distinct expression patterns of various transcript types during hair follicle morphogenesis, establishing a solid foundation for unraveling the molecular mechanisms that drive hair development and providing a scientific basis for selectively breeding desirable wool-related traits in this breed.


Subject(s)
Gene Regulatory Networks , Hair Follicle , MicroRNAs , RNA, Circular , RNA, Long Noncoding , RNA, Messenger , Animals , Hair Follicle/metabolism , Hair Follicle/growth & development , RNA, Circular/genetics , RNA, Circular/metabolism , MicroRNAs/genetics , MicroRNAs/metabolism , Sheep/genetics , RNA, Messenger/genetics , RNA, Messenger/metabolism , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism , Gene Expression Profiling , Skin/metabolism , Transcriptome , Fetus/metabolism
2.
Article in English | MEDLINE | ID: mdl-38787664

ABSTRACT

The advent of single-cell RNA sequencing (scRNA-seq) has brought fresh perspectives on intricate biological processes, revealing the nuances and divergences among distinct cells. Accurate single-cell analysis is a crucial prerequisite for in-depth investigation into the underlying mechanisms of heterogeneity. Owing to various sources of technical noise, such as dropout events, scRNA-seq data remain challenging to interpret. In this work, we propose an unsupervised learning framework for scRNA-seq data analysis, named Sc-GNNMF. Exploiting the non-negativity and sparsity of scRNA-seq data, we employ a graph-regularized non-negative matrix factorization (GNNMF) algorithm, which estimates cell-cell similarity and gene-gene similarity through Laplacian kernels and p-nearest neighbor graphs (p-NNG). By assuming intrinsic geometric local invariance, we use the weighted p-nearest known neighbors (p-NKN) of cell-cell interactions to guide the matrix decomposition process, promoting the closeness of cells of similar types in the cell-gene data space and determining a more suitable embedding space for clustering. Sc-GNNMF demonstrates superior performance compared to other methods and maintains satisfactory compatibility and robustness, as evidenced by experiments on 11 real scRNA-seq datasets. Furthermore, Sc-GNNMF yields excellent results in clustering tasks, extracting useful gene markers, and pseudo-temporal analysis.
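
The cell-cell similarity step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a Laplacian kernel of the form exp(-||x - y||_1 / sigma) and a symmetrized p-nearest-neighbor graph; the function names, the bandwidth `sigma`, and the toy expression matrix are hypothetical and are not taken from the Sc-GNNMF paper, which may define these quantities differently.

```python
import math

def laplacian_kernel(x, y, sigma=1.0):
    """Laplacian kernel exp(-||x - y||_1 / sigma) (assumed form)."""
    l1 = sum(abs(a - b) for a, b in zip(x, y))
    return math.exp(-l1 / sigma)

def similarity_matrix(cells, sigma=1.0):
    """Pairwise kernel similarity between all cells (rows of `cells`)."""
    n = len(cells)
    return [[laplacian_kernel(cells[i], cells[j], sigma) for j in range(n)]
            for i in range(n)]

def p_nearest_neighbor_graph(sim, p):
    """Keep each cell's p most similar other cells, then symmetrize."""
    n = len(sim)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: sim[i][j], reverse=True)
        for j in others[:p]:
            adj[i][j] = sim[i][j]
            adj[j][i] = sim[i][j]
    return adj

# Hypothetical toy data: 4 cells x 3 genes, forming two obvious groups.
cells = [[0.0, 1.0, 0.0], [0.0, 2.0, 0.0], [5.0, 0.0, 3.0], [6.0, 0.0, 3.0]]
sim = similarity_matrix(cells, sigma=2.0)
graph = p_nearest_neighbor_graph(sim, p=1)
```

In this toy case each cell is linked only to the other cell of its own group, which is the kind of local structure a graph regularizer would then try to preserve during factorization.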

3.
BMC Bioinformatics ; 24(1): 417, 2023 Nov 07.
Article in English | MEDLINE | ID: mdl-37932672

ABSTRACT

MOTIVATION: Categorizing cells into distinct types can shed light on biological tissue functions and interactions, and uncover specific mechanisms under pathological conditions. Since gene expression throughout a population of cells is averaged out by conventional sequencing techniques, it is challenging to distinguish between different cell types. The accumulation of single-cell RNA sequencing (scRNA-seq) data provides the foundation for a more precise classification of cell types. It is crucial to build a high-accuracy clustering approach to categorize cell types, since the imbalance of cell types and differences in the distribution of scRNA-seq data affect single-cell clustering and visualization outcomes. RESULT: To achieve single-cell type detection, we propose a meta-learning-based single-cell clustering model called ScLSTM. Specifically, ScLSTM transforms the single-cell type detection problem into a hierarchical classification problem based on feature extraction by a siamese long short-term memory (LSTM) network. The similarity matrix derived from the improved sigmoid kernel is mapped to the siamese LSTM feature space to analyze the differences between cells. ScLSTM demonstrated superior classification performance on 8 scRNA-seq data sets from different platforms, species, and tissues. Further quantitative analysis and visualization of the human breast cancer data set validated the superiority and capability of ScLSTM in recognizing cell types.


Subject(s)
Gene Expression Profiling , Single-Cell Analysis , Humans , Gene Expression Profiling/methods , Single-Cell Analysis/methods , Sequence Analysis, RNA/methods , Cluster Analysis , Algorithms
4.
Brief Bioinform ; 23(3), 2022 05 13.
Article in English | MEDLINE | ID: mdl-35419595

ABSTRACT

The limitations of bulk sequencing techniques in analyzing cell heterogeneity and diversity have been addressed by the development of single-cell RNA-sequencing (scRNA-seq). Detecting clusters of cells is a key step in the analysis of scRNA-seq data. However, the high dimensionality of scRNA-seq data and imbalances in the number of cells of different types are ubiquitous in real scRNA-seq data sets, which pose a huge challenge to single-cell-type detection. We propose a meta-learning-based model, SiaClust, which combines a Siamese Convolutional Neural Network (CNN) with improved spectral clustering, to achieve scRNA-seq cell type detection. Specifically, with the help of the constrained Sigmoid kernel, the raw high-dimensional data are mapped to a low-dimensional space, and the Siamese CNN learns the differences between the cell types in the low-dimensional feature space. The similarity matrix learned by the Siamese CNN is used in combination with improved spectral clustering and t-distributed Stochastic Neighbor Embedding (t-SNE) for visualization. SiaClust highlights the differences between cell types by comparing the similarity of samples while blurring the differences within cell types, which makes it better suited to processing high-dimensional and imbalanced data. In an extensive evaluation on data from nine different species and tissues generated with different scRNA-seq protocols, SiaClust significantly improves clustering accuracy relative to state-of-the-art single-cell clustering models. More importantly, SiaClust accurately locates the exact sites of dropout genes, and is more flexible with respect to data size and cell type.


Subject(s)
Algorithms , Single-Cell Analysis , Cluster Analysis , Gene Expression Profiling , RNA-Seq , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods
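
As a rough illustration of the kernel mapping mentioned in this abstract, the sketch below computes a pairwise similarity matrix with the standard sigmoid kernel tanh(alpha * <x, y> + c). This is an assumption: the abstract does not specify the "constrained" variant used by SiaClust, and the parameter values, function names, and toy data here are hypothetical.

```python
import math

def sigmoid_kernel(x, y, alpha=0.01, c=0.0):
    """Standard sigmoid kernel tanh(alpha * <x, y> + c).

    Not the constrained variant from the paper, whose details
    are not given in the abstract.
    """
    dot = sum(a * b for a, b in zip(x, y))
    return math.tanh(alpha * dot + c)

def kernel_matrix(samples, alpha=0.01, c=0.0):
    """Pairwise sigmoid-kernel similarities between samples (cells)."""
    return [[sigmoid_kernel(u, v, alpha, c) for v in samples] for u in samples]

# Hypothetical toy data: cells 0 and 1 share expressed genes, cell 2 does not.
cells = [[3.0, 0.0, 1.0], [2.0, 0.0, 1.0], [0.0, 4.0, 0.0]]
K = kernel_matrix(cells, alpha=0.1)
```

Cells with overlapping expression profiles receive a positive similarity, while cells with no shared expressed genes score zero when `c = 0`.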
5.
BMC Bioinformatics ; 23(1): 9, 2022 Jan 04.
Article in English | MEDLINE | ID: mdl-34983364

ABSTRACT

BACKGROUND: Drug-disease associations (DDAs) can provide important information for exploring the potential efficacy of drugs. However, few DDAs have so far been verified by experiments. Previous evidence indicates that combining information from multiple sources is conducive to the discovery of new DDAs. How to integrate different biological data sources and identify the most effective drugs for a certain disease based on drug-disease coupled mechanisms is still a challenging problem. RESULTS: In this paper, we propose a novel computational model for DDA prediction based on graph representation learning over a multi-biomolecular network (GRLMN). More specifically, we first constructed a large-scale molecular association network (MAN) by integrating the associations among drugs, diseases, proteins, miRNAs, and lncRNAs. Then, a graph embedding model was used to learn vector representations for all drugs and diseases in the MAN. Finally, the combined features were fed to a random forest (RF) model to predict new DDAs. The proposed model was evaluated on the SCMFDD-S data set using five-fold cross-validation. Experimental results showed that the GRLMN model achieved an area under the ROC curve (AUC) of 87.9%, outperforming all previous works in terms of both accuracy and AUC on the benchmark data set. To further verify the performance of GRLMN, we carried out case studies for two common diseases. In the ranking of drugs predicted to be related to these diseases (kidney disease and fever), 15 of the top 20 drugs have been experimentally confirmed. CONCLUSIONS: The experimental results show that our model performs well in the prediction of DDAs. GRLMN is an effective prioritization tool for screening reliable DDAs for follow-up studies concerning their role in drug repositioning.


Subject(s)
MicroRNAs , Pharmaceutical Preparations , RNA, Long Noncoding , Area Under Curve , Humans , MicroRNAs/metabolism , Proteins , RNA, Long Noncoding/metabolism
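
Several abstracts in this list report the area under the ROC curve (AUC). As a minimal sketch of what that number measures, the AUC equals the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half. The labels and scores below are hypothetical and are not results from any of the listed papers.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative.

    Quadratic-time pairwise version, for illustration only.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical predictions for six drug-disease pairs (1 = known association).
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
auc = roc_auc(labels, scores)  # 8 of 9 positive/negative pairs are ordered correctly
```

Under k-fold cross-validation, a value like this is typically computed on each held-out fold and then averaged.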
6.
J Transl Med ; 18(1): 347, 2020 09 07.
Article in English | MEDLINE | ID: mdl-32894154

ABSTRACT

BACKGROUND: The prediction of potential drug-target interactions (DTIs) not only provides a better comprehension of biological processes but also is critical for identifying new drugs. However, because traditional experiments are expensive and highly time-consuming, only a small fraction of the interactions between drugs and targets in the database have been verified experimentally. Therefore, it is meaningful and important to develop new computational methods with good performance for DTI prediction. At present, many existing computational methods utilize only a single type of interaction between drugs and proteins without paying attention to the associations and influences with other types of molecules. METHODS: In this work, we developed a novel network embedding-based heterogeneous information integration model to predict potential drug-target interactions. Firstly, a heterogeneous multi-molecular information network is built by combining the known associations among proteins, drugs, lncRNAs, diseases, and miRNAs. Secondly, the Large-scale Information Network Embedding (LINE) model is used to learn behavior information (associations with other nodes) of drugs and proteins in the network. Hence, the known drug-protein interaction pairs can be represented as a combination of their attribute information (e.g. protein sequence information and drug molecular fingerprints) and their behavior information. Thirdly, the Random Forest classifier is used for training and prediction. RESULTS: Under five-fold cross validation, our method obtained 85.83% prediction accuracy with 80.47% sensitivity and an AUC of 92.33%. Moreover, in the case studies of three common drugs, 8 (Caffeine), 7 (Clozapine) and 6 (Pioglitazone) of the top 10 candidate targets were verified to be associated with the corresponding drugs. CONCLUSIONS: In short, these results indicate that our method can be a powerful tool for predicting potential drug-target interactions and finding unknown targets for certain drugs or unknown drugs for certain targets.


Subject(s)
MicroRNAs , Pharmaceutical Preparations , RNA, Long Noncoding , Algorithms , Amino Acid Sequence , Proteins
7.
Sci Rep ; 10(1): 4972, 2020 03 18.
Article in English | MEDLINE | ID: mdl-32188871

ABSTRACT

Drug-disease association is an important piece of information that is used in all stages of drug repositioning. Although the number of drug-disease associations identified by high-throughput technologies is increasing, the experimental methods are time-consuming and expensive. As a supplement to them, many computational methods have been developed for accurate in silico prediction of new drug-disease associations. In this work, we present a novel computational model combining sparse auto-encoder and rotation forest (SAEROF) to predict drug-disease associations. Gaussian interaction profile kernel similarity, drug structure similarity and disease semantic similarity were extracted for exploring the associations among drugs and diseases. On this basis, a rotation forest classifier based on a sparse auto-encoder is proposed to predict the associations between drugs and diseases. In order to evaluate the performance of the proposed model, we applied 10-fold cross validation on two gold-standard datasets, Fdataset and Cdataset. The proposed model achieved AUCs (Area Under the ROC Curve) of 0.9092 and 0.9323 on Fdataset and Cdataset, respectively. For performance evaluation, we compared SAEROF with the state-of-the-art support vector machine (SVM) classifier and some existing computational models. Three human diseases (Obesity, Stomach Neoplasms and Lung Neoplasms) were explored in case studies. As a result, more than half of the top 20 drugs predicted were successfully confirmed by the Comparative Toxicogenomics Database (CTD). This model is a feasible and effective method to predict drug-disease associations, and its performance is significantly improved compared with existing methods.


Subject(s)
Algorithms , Anti-Obesity Agents/pharmacology , Antineoplastic Agents/pharmacology , Computational Biology/methods , Lung Neoplasms/drug therapy , Neural Networks, Computer , Obesity/drug therapy , Stomach Neoplasms/drug therapy , Area Under Curve , Computer Simulation , Databases, Factual , Humans , Machine Learning , ROC Curve , Support Vector Machine
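
The Gaussian interaction profile (GIP) kernel similarity named in this abstract can be sketched as follows, assuming the commonly used definition K(i, j) = exp(-gamma * ||IP(i) - IP(j)||^2), where IP(i) is row i of the binary association matrix and gamma is a bandwidth `gamma_prime` divided by the mean squared norm of the profiles. The function name, the default `gamma_prime = 1.0`, and the toy association matrix are assumptions, not details from the SAEROF paper.

```python
import math

def gip_kernel_matrix(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between rows of `profiles`.

    profiles: list of binary interaction profiles, e.g. one row per drug
              and one column per disease (1 = known association).
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm  # normalize bandwidth by profile size
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2
                          for a, b in zip(profiles[i], profiles[j]))
            K[i][j] = math.exp(-gamma * sq_dist)
    return K

# Hypothetical association matrix: 3 drugs x 3 diseases.
assoc = [[1, 0, 1],
         [1, 0, 1],
         [0, 1, 0]]
K = gip_kernel_matrix(assoc)
```

Drugs 0 and 1 share the same disease profile and so have similarity 1, while drug 2, which shares no associations with them, receives a much lower value. Disease-disease similarity follows from the same function applied to the transposed matrix.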
8.
J Transl Med ; 17(1): 382, 2019 11 20.
Article in English | MEDLINE | ID: mdl-31747915

ABSTRACT

BACKGROUND: In the process of drug development, computational drug repositioning is effective and resource-saving because of its important role in identifying new drug-disease associations. Recent years have witnessed great progress in the field of data mining with the advent of deep learning. An increasing number of deep learning-based techniques have been proposed to develop computational tools in bioinformatics. METHODS: Along this promising direction, we here propose a computational drug repositioning method combining the techniques of Sigmoid Kernel and Convolutional Neural Network (SKCNN), which is able to learn new features effectively representing drug-disease associations via its hidden layers. Specifically, we first construct a similarity metric for drugs using drug sigmoid similarity and drug structural similarity, and one for diseases using disease sigmoid similarity and disease semantic similarity. Based on the combined similarities of drugs and diseases, we then use SKCNN to learn hidden representations for each drug-disease pair, whose labels are finally predicted by a classifier based on random forest. RESULTS: A series of experiments were conducted for performance evaluation, and their results show that the proposed SKCNN improves the prediction accuracy compared with other state-of-the-art approaches. Case studies of two selected diseases were also conducted, which demonstrate the superior performance of our method in terms of the actual discovery of potential drug indications. CONCLUSION: The aim of this study was to establish an effective predictive model for finding new drug-disease associations. These experimental results show that SKCNN can effectively predict the association between drugs and diseases.


Subject(s)
Algorithms , Disease/genetics , Drug Repositioning , Genetic Association Studies , Area Under Curve , Asthma/genetics , Databases as Topic , Humans , Neural Networks, Computer , Obesity/genetics , ROC Curve , Reproducibility of Results , Support Vector Machine
9.
Biomed Res Int ; 2019: 2426958, 2019.
Article in English | MEDLINE | ID: mdl-31534955

ABSTRACT

Computational drug repositioning, designed to identify new indications for existing drugs, significantly reduces the cost and time involved in drug development. Prediction of drug-disease associations is promising for drug repositioning. Recent years have witnessed an increasing number of machine learning-based methods for computational drug repositioning. In this paper, a novel feature learning method based on Gaussian interaction profile kernel and autoencoder (GIPAE) is proposed for drug-disease association prediction. In order to further reduce the computation cost, a batch normalization layer and a fully connected layer are introduced to reduce training complexity. The experimental results of 10-fold cross validation indicate that the proposed method achieves superior performance on Fdataset and Cdataset with AUCs of 93.30% and 96.03%, respectively, which were higher than those of many previous computational models. To further assess the accuracy of GIPAE, we conducted case studies on two complex human diseases. Among the top 20 predicted drugs, 14 obesity-related drugs and 11 drugs related to Alzheimer's disease were validated in the CTD database. The results of cross validation and case studies indicated that GIPAE is a reliable model for predicting drug-disease associations.


Subject(s)
Alzheimer Disease/drug therapy , Computer Simulation , Databases, Factual , Drug Repositioning , Machine Learning , Computational Biology , Humans