Results 1 - 20 of 46,345
1.
Sci Rep ; 14(1): 12786, 2024 Jun 04.
Article in English | MEDLINE | ID: mdl-38834626

ABSTRACT

Rheumatoid arthritis (RA) is a chronic systemic autoimmune disease marked by inflammatory cell infiltration and joint damage. The Chinese government has approved the prescription medication sinomenine (SIN), an effective anti-inflammatory drug, for treating RA. This study evaluated the possible anti-inflammatory actions of SIN in RA based on bioinformatics analysis and experiments. Six microarray datasets were acquired from the Gene Expression Omnibus (GEO) database. We used R software to identify differentially expressed genes (DEGs) and perform functional evaluations. CIBERSORT was used to calculate the abundance of 22 infiltrating immune cell types. Weighted gene co-expression network analysis (WGCNA) was used to discover genes associated with M1 macrophages. Four public datasets were used to predict the target genes of SIN. Following that, function enrichment analysis for hub genes was performed. cytoHubba and the least absolute shrinkage and selection operator (LASSO) were employed to select hub genes, and their diagnostic effectiveness was assessed using the receiver operating characteristic (ROC) curve. Molecular docking was undertaken to confirm the affinity between SIN and the hub gene. Furthermore, the therapeutic efficacy of SIN was validated in the LPS-induced RAW264.7 cell line using Western blot and enzyme-linked immunosorbent assay (ELISA). Matrix metalloproteinase 9 (MMP9) was identified as the hub M1 macrophage-related biomarker in RA using bioinformatic analysis and molecular docking. Our study indicated that MMP9 took part in the IL-17 and TNF signaling pathways. Furthermore, we found that SIN suppresses MMP9 protein overexpression and pro-inflammatory cytokines, including tumor necrosis factor-α (TNF-α) and interleukin-6 (IL-6), in the LPS-induced RAW264.7 cell line. In conclusion, our work sheds new light on the pathophysiology of RA and identifies MMP9 as a possible RA key gene. These findings also suggest that SIN might be a potential cost-effective anti-inflammatory medication for treating RA.


Subject(s)
Arthritis, Rheumatoid , Computational Biology , Cytokines , Matrix Metalloproteinase 9 , Morphinans , Morphinans/pharmacology , Arthritis, Rheumatoid/drug therapy , Arthritis, Rheumatoid/genetics , Arthritis, Rheumatoid/metabolism , Matrix Metalloproteinase 9/metabolism , Matrix Metalloproteinase 9/genetics , Mice , Animals , RAW 264.7 Cells , Computational Biology/methods , Cytokines/metabolism , Humans , Molecular Docking Simulation , Gene Expression Regulation/drug effects , Macrophages/metabolism , Macrophages/drug effects , Anti-Inflammatory Agents/pharmacology
2.
Sci Rep ; 14(1): 12761, 2024 Jun 04.
Article in English | MEDLINE | ID: mdl-38834687

ABSTRACT

Abundant research has consistently illustrated the crucial role of microRNAs (miRNAs) in a wide array of essential biological processes. Furthermore, miRNAs have been validated as promising therapeutic targets for addressing complex diseases. Given the costly and time-consuming nature of traditional biological experimental validation methods, it is imperative to develop computational methods. In this work, we developed a novel approach named efficient matrix completion (EMCMDA) for predicting miRNA-disease associations. First, we calculated the similarities across multiple sources for miRNA/disease pairs and combined this information to create a holistic miRNA/disease similarity measure. Second, we utilized this biological information to create a heterogeneous network and established a target matrix derived from this network. Lastly, we framed miRNA-disease association prediction as a low-rank matrix completion problem, which we solved by minimizing the truncated Schatten p-norm of the matrix. Notably, we improved the conventional singular value contraction algorithm by using a weighted singular value contraction technique. This technique dynamically adjusts the degree of contraction based on the significance of each singular value, ensuring that the physical meaning of these singular values is fully considered. We evaluated the performance of EMCMDA by applying two distinct cross-validation experiments on two diverse databases, and the outcomes were statistically significant. In addition, we executed comprehensive case studies on two prevalent human diseases, namely lung cancer and breast cancer. Following prediction and multiple validations, it was evident that EMCMDA proficiently predicts previously unreported disease-related miRNAs. These results underscore the robustness and efficacy of EMCMDA in miRNA-disease association prediction.


Subject(s)
Algorithms , Computational Biology , Genetic Predisposition to Disease , MicroRNAs , MicroRNAs/genetics , Humans , Computational Biology/methods , Breast Neoplasms/genetics
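The weighted singular value contraction mentioned in the EMCMDA abstract above can be illustrated with a minimal sketch. The abstract does not state the weighting rule, so the inverse-magnitude weight used here, along with the names `uniform_shrink`, `weighted_shrink`, `tau`, and `eps`, is an assumption for illustration and not the authors' implementation. The sketch operates on a list of singular values that is assumed to have been computed already.

```python
def uniform_shrink(singular_values, tau):
    """Conventional soft-thresholding: every singular value shrinks by tau."""
    return [max(sigma - tau, 0.0) for sigma in singular_values]


def weighted_shrink(singular_values, tau, eps=1e-8):
    """Weighted soft-thresholding (illustrative assumption).

    Each singular value shrinks by tau * w, where w = 1 / (sigma + eps),
    so large (more significant) singular values are shrunk less and
    small ones are shrunk more.
    """
    shrunk = []
    for sigma in singular_values:
        weight = 1.0 / (sigma + eps)  # hypothetical weighting rule
        shrunk.append(max(sigma - tau * weight, 0.0))
    return shrunk


sigmas = [10.0, 2.0, 0.5]
print(uniform_shrink(sigmas, 1.0))
print(weighted_shrink(sigmas, 1.0))
```

With the uniform rule every singular value loses the same amount, whereas the weighted rule leaves the dominant value almost untouched and removes the smallest one entirely, which is the behaviour the abstract describes qualitatively.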
3.
Commun Biol ; 7(1): 684, 2024 Jun 04.
Article in English | MEDLINE | ID: mdl-38834836

ABSTRACT

Identifying interactions between T-cell receptors (TCRs) and immunogenic peptides holds profound implications across diverse research domains and clinical scenarios. Unsupervised clustering models (UCMs) cannot predict peptide-TCR binding directly, while supervised predictive models (SPMs) often face challenges in identifying antigens previously unencountered by the immune system or possessing limited TCR binding repertoires. Therefore, we propose HeteroTCR, an SPM based on Heterogeneous Graph Neural Network (GNN), to accurately predict peptide-TCR binding probabilities. HeteroTCR captures within-type (TCR-TCR or peptide-peptide) similarity information and between-type (peptide-TCR) interaction insights for predictions on unseen peptides and TCRs, surpassing limitations of existing SPMs. Our evaluation shows HeteroTCR outperforms state-of-the-art models on independent datasets. Ablation studies and visual interpretation underscore the Heterogeneous GNN module's critical role in enhancing HeteroTCR's performance by capturing pivotal binding process features. We further demonstrate the robustness and reliability of HeteroTCR through validation using single-cell datasets, aligning with the expectation that pMHC-TCR complexes with higher predicted binding probabilities correspond to increased binding fractions.


Subject(s)
Neural Networks, Computer , Peptides , Receptors, Antigen, T-Cell , Receptors, Antigen, T-Cell/metabolism , Receptors, Antigen, T-Cell/immunology , Receptors, Antigen, T-Cell/chemistry , Peptides/chemistry , Peptides/metabolism , Peptides/immunology , Protein Binding , Humans , Computational Biology/methods
4.
BMC Cancer ; 24(1): 681, 2024 Jun 04.
Article in English | MEDLINE | ID: mdl-38834966

ABSTRACT

BACKGROUND: Our previous studies have indicated that mRNA and protein levels of PPIH are significantly upregulated in Hepatocellular Carcinoma (LIHC) and could act as predictive biomarkers for patients with LIHC. Nonetheless, the expression and implications of PPIH in the etiology and progression of common solid tumors have yet to be explored, including its potential as a serum tumor marker. METHODS: We employed bioinformatics analyses, augmented with clinical sample evaluations, to investigate the mRNA and protein expression and gene regulation networks of PPIH in various solid tumors. We also assessed the association between PPIH expression and overall survival (OS) in cancer patients using Kaplan-Meier analysis with TCGA database information. Furthermore, we evaluated the feasibility and diagnostic efficacy of PPIH as a serum marker by integrating serological studies with established clinical tumor markers. RESULTS: Through pan-cancer analysis, we found that the expression levels of PPIH mRNA in multiple tumors were significantly different from those in normal tissues. This study is the first to report that PPIH mRNA and protein levels are markedly elevated in LIHC, Colon adenocarcinoma (COAD), and Breast cancer (BC), and are associated with a worse prognosis in these cancer patients. Conversely, serum PPIH levels are decreased in patients with these tumors (LIHC, COAD, BC, gastric cancer), and when combined with traditional tumor markers, offer enhanced sensitivity and specificity for diagnosis. CONCLUSION: Our findings propose that PPIH may serve as a valuable predictive biomarker in tumor patients, and its secreted protein could be a potential serum marker, providing insights into the role of PPIH in cancer development and progression.


Subject(s)
Biomarkers, Tumor , Humans , Biomarkers, Tumor/blood , Biomarkers, Tumor/genetics , Prognosis , Female , Liver Neoplasms/genetics , Liver Neoplasms/blood , Liver Neoplasms/mortality , Gene Expression Regulation, Neoplastic , Carcinoma, Hepatocellular/genetics , Carcinoma, Hepatocellular/blood , Carcinoma, Hepatocellular/mortality , Carcinoma, Hepatocellular/pathology , Carcinoma, Hepatocellular/diagnosis , Neoplasms/genetics , Neoplasms/blood , Neoplasms/mortality , Neoplasms/diagnosis , Male , Computational Biology/methods , RNA, Messenger/genetics , RNA, Messenger/metabolism , Kaplan-Meier Estimate , Breast Neoplasms/genetics , Breast Neoplasms/blood , Breast Neoplasms/mortality , Breast Neoplasms/diagnosis , Breast Neoplasms/pathology , Stomach Neoplasms/genetics , Stomach Neoplasms/blood , Stomach Neoplasms/diagnosis , Stomach Neoplasms/mortality , Stomach Neoplasms/pathology , Colonic Neoplasms/genetics , Colonic Neoplasms/blood , Colonic Neoplasms/diagnosis , Colonic Neoplasms/pathology , Colonic Neoplasms/mortality , Gene Regulatory Networks
5.
BMC Bioinformatics ; 25(1): 205, 2024 Jun 04.
Article in English | MEDLINE | ID: mdl-38834962

ABSTRACT

BACKGROUND: Although RNA-seq data are traditionally used for quantifying gene expression levels, the same data could be useful in an integrated approach to compute genetic distances as well. Challenges to using mRNA sequences for computing genetic distances include the relatively high conservation of coding sequences and the presence of paralogous and, in some species, homeologous genes. RESULTS: We developed a new computational method, RNA-clique, for calculating genetic distances using assembled RNA-seq data and assessed the efficacy of the method using biological and simulated data. The method employs reciprocal BLASTn followed by graph-based filtering to ensure that only orthologous genes are compared. Each vertex in the graph constructed for filtering represents a gene in a specific sample under comparison, and an edge connects a pair of vertices if the genes they represent are best matches for each other in their respective samples. The distance computation is a function of the BLAST alignment statistics and the constructed graph and incorporates only those genes that are present in some complete connected component of this graph. As a biological testbed we used RNA-seq data of tall fescue (Lolium arundinaceum), an allohexaploid plant (2n = 14 Gb), and bluehead wrasse (Thalassoma bifasciatum), a teleost fish. RNA-clique reliably distinguished individual tall fescue plants by genotype and distinguished bluehead wrasse RNA-seq samples by individual. In tests with simulated RNA-seq data, the ground truth phylogeny was accurately recovered from the computed distances. Moreover, tests of the algorithm parameters indicated that, even with stringent filtering for orthologs, sufficient sequence data were retained for the distance computations. Although comparisons with an alternative method revealed that RNA-clique has relatively high time and memory requirements, the comparisons also showed that RNA-clique's results were at least as reliable as the alternative's for tall fescue data and were much more reliable for the bluehead wrasse data. CONCLUSION: Results of this work indicate that RNA-clique works well as a way of deriving genetic distances from RNA-seq data, thus providing a methodological integration of functional and genetic diversity studies.


Subject(s)
RNA-Seq , RNA-Seq/methods , Sequence Analysis, RNA/methods , Computational Biology/methods , Algorithms
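The graph-based filtering described in the RNA-clique abstract above can be sketched as follows. This is an illustrative reading and not the published tool: it assumes best BLASTn hits are already available as a dictionary, and it interprets a "complete connected component" as a component that contains exactly one gene per sample with every pair of genes joined by an edge. The names `best_hit`, `reciprocal_edges`, and `complete_components` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def reciprocal_edges(best_hit):
    """Return edges between genes that are each other's best match.

    best_hit maps (sample_a, gene_a, sample_b) -> best-matching gene in sample_b.
    """
    edges = set()
    for (sample_a, gene_a, sample_b), gene_b in best_hit.items():
        if best_hit.get((sample_b, gene_b, sample_a)) == gene_a:
            edges.add(frozenset({(sample_a, gene_a), (sample_b, gene_b)}))
    return edges


def complete_components(edges, samples):
    """Keep connected components with one gene per sample that form a clique."""
    adjacency = defaultdict(set)
    for edge in edges:
        u, v = tuple(edge)
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen, kept = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Depth-first search to collect the component containing `start`.
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component

        one_per_sample = sorted(s for s, _ in component) == sorted(samples)
        is_clique = all(v in adjacency[u] for u, v in combinations(component, 2))
        if one_per_sample and is_clique:
            kept.append(component)
    return kept


# Toy data: gene family 1 is reciprocal across all three samples;
# gene family 2 has no reciprocal partner in sample C.
best_hit = {
    ("A", "a1", "B"): "b1", ("B", "b1", "A"): "a1",
    ("A", "a1", "C"): "c1", ("C", "c1", "A"): "a1",
    ("B", "b1", "C"): "c1", ("C", "c1", "B"): "b1",
    ("A", "a2", "B"): "b2", ("B", "b2", "A"): "a2",
    ("A", "a2", "C"): "c1",  # not reciprocal: c1's best hit in A is a1
}
components = complete_components(reciprocal_edges(best_hit), ["A", "B", "C"])
print(components)
```

In this toy input only the first gene family survives the filter; the second is dropped because it lacks a reciprocal best match in sample C. The distance computation itself, which the abstract says depends on BLAST alignment statistics, is not reproduced here.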
6.
BMC Bioinformatics ; 25(1): 204, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38824535

ABSTRACT

BACKGROUND: Protein solubility is a critically important physicochemical property closely related to protein expression. For example, it is one of the main factors to be considered in the design and production of antibody drugs and a prerequisite for realizing various protein functions. Although several solubility prediction models have emerged in recent years, many of these models are limited to capturing information embedded in one-dimensional amino acid sequences, resulting in unsatisfactory predictive performance. RESULTS: In this study, we introduce a novel Graph Attention network-based protein Solubility model, GATSol, which represents the 3D structure of proteins as a protein graph. In addition to the node features of amino acids extracted by a state-of-the-art protein large language model, GATSol utilizes amino acid distance maps generated using the latest AlphaFold technology. Rigorous testing on the independent eSOL and Saccharomyces cerevisiae test datasets has shown that GATSol outperforms most recently introduced models, especially with respect to the coefficient of determination R2, which reaches 0.517 and 0.424, respectively. It outperforms the current state-of-the-art GraphSol by 18.4% on the S. cerevisiae_test set. CONCLUSIONS: GATSol captures 3D features of proteins by building protein graphs, which significantly improves the accuracy of protein solubility prediction. Recent advances in protein structure modeling allow our method to incorporate spatial structure features extracted from predicted structures into the model by relying only on the input of protein sequences, which simplifies the entire graph neural network prediction process, making it more user-friendly and efficient. As a result, GATSol may help prioritize highly soluble proteins, ultimately reducing the cost and effort of experimental work. The source code and data of the GATSol model are freely available at https://github.com/binbinbinv/GATSol .


Subject(s)
Proteins , Solubility , Proteins/chemistry , Proteins/metabolism , Protein Conformation , Databases, Protein , Computational Biology/methods , Software , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae/chemistry , Algorithms , Models, Molecular , Amino Acid Sequence
7.
HLA ; 103(6): e15543, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38837862

ABSTRACT

The MHC class I region contains crucial genes for the innate and adaptive immune response, playing a key role in susceptibility to many autoimmune and infectious diseases. Genome-wide association studies have identified numerous disease-associated SNPs within this region. However, these associations do not fully capture the immune-biological relevance of specific HLA alleles. HLA imputation techniques may leverage available SNP arrays by predicting allele genotypes based on the linkage disequilibrium between SNPs and specific HLA alleles. Successful imputation requires diverse and large reference panels, especially for admixed populations. This study employed a bioinformatics approach to call SNPs and HLA alleles in multi-ethnic samples from the 1000 Genomes (1KG) dataset and admixed individuals from Brazil (SABE), utilising 30X whole-genome sequencing data. Using HIBAG, we created three reference panels: 1KG (n = 2504), SABE (n = 1171), and the full model (n = 3675) encompassing all samples. In extensive cross-validation of these reference panels, the multi-ethnic 1KG reference performed better overall than the reference with only Brazilian samples. However, the best results were achieved with the full model. Additionally, we expanded the scope of imputation by developing reference panels for non-classical, MICA, MICB and HLA-H genes, previously unavailable for multi-ethnic populations. Validation in an independent Brazilian dataset showcased the superiority of our reference panels over the Michigan Imputation Server, particularly in predicting HLA-B alleles among Brazilians. Our investigations underscored the need to enhance or adapt reference panels to encompass the target population's genetic diversity, emphasising the significance of multi-ethnic references for accurate imputation across different populations.


Subject(s)
Alleles , Ethnicity , Gene Frequency , Polymorphism, Single Nucleotide , Humans , Brazil , Ethnicity/genetics , HLA Antigens/genetics , Linkage Disequilibrium , Genome-Wide Association Study/methods , Genotype , Genetics, Population/methods , Histocompatibility Antigens Class I/genetics , Computational Biology/methods
8.
Eur J Med Res ; 29(1): 307, 2024 Jun 02.
Article in English | MEDLINE | ID: mdl-38825674

ABSTRACT

BACKGROUND: Tumor necrosis factor receptor-associated factors family genes play a pivotal role in tumorigenesis and metastasis, functioning as adapters or E3 ubiquitin ligases across various signaling pathways. To date, limited research has explored the association between tumor necrosis factor receptor-associated factors family genes and the clinicopathological characteristics of tumors, immunity, and the tumor microenvironment (TME). This comprehensive study investigates the relationship between tumor necrosis factor receptor-associated factors family and prognosis, TME, immune response, and drug sensitivity in a pan-cancer context. METHODS: Utilizing current public databases, this study examines the expression levels and prognostic significance of tumor necrosis factor receptor-associated factors family genes in a pan-cancer context through bioinformatic analysis. In addition, it investigates the correlation between tumor necrosis factor receptor-associated factors expression and various factors, including the TME, immune subtypes, stemness scores, and drug sensitivity in pan-cancer. RESULTS: Elevated expression levels of tumor necrosis factor receptor-associated factor 2, 3, 4, and 7 were observed across various cancer types. Patients exhibiting high expression of these genes generally faced a worse prognosis. Furthermore, a significant correlation was noted between the expression of tumor necrosis factor receptor-associated factors family genes and multiple dimensions of the TME, immune subtypes, and drug sensitivity.


Subject(s)
Neoplasms , Tumor Microenvironment , Humans , Prognosis , Neoplasms/genetics , Neoplasms/drug therapy , Tumor Microenvironment/genetics , Tumor Microenvironment/immunology , Tumor Necrosis Factor Receptor-Associated Peptides and Proteins/genetics , Gene Expression Regulation, Neoplastic , Computational Biology/methods , Drug Resistance, Neoplasm/genetics , Biomarkers, Tumor/genetics
9.
Oncol Res ; 32(6): 1011-1019, 2024.
Article in English | MEDLINE | ID: mdl-38827323

ABSTRACT

This review aimed to describe the involvement of microRNAs (miRNAs) in thyroid cancer (TC) and its subtypes, mainly medullary thyroid carcinoma (MTC), and to outline web-based tools and databases for bioinformatics analysis of miRNAs in TC. Additionally, the capacity of miRNAs to serve as therapeutic targets and biomarkers in TC management is discussed. This review is based on a literature search of relevant articles on the role of miRNAs in TC and its subtypes, mainly MTC. Additionally, web-based tools and databases for bioinformatics analysis of miRNAs in TC were identified and described. MiRNAs can act as oncomiRs or anti-oncogenes, depending on the target mRNAs they regulate. MiRNA replacement therapy using miRNA mimics, or antimiRs that aim to suppress the function of certain miRNAs, can be applied to correct miRNAs aberrantly expressed in diseases, particularly in cancer. MiRNAs are involved in the modulation of fundamental pathways related to cancer, such as cell cycle checkpoints and DNA repair pathways. MiRNAs are also rather stable and can reliably be detected in different types of biological materials, rendering them favorable diagnostic and prognostic biomarkers as well. MiRNAs have emerged as promising tools for evaluating medical outcomes in TC and as possible therapeutic targets. The contribution of miRNAs in thyroid cancer, particularly MTC, is an active area of research, and the utility of web applications and databases for the biological data analysis of miRNAs in TC is becoming increasingly important.


Subject(s)
Biomarkers, Tumor , Carcinoma, Neuroendocrine , Computational Biology , MicroRNAs , Thyroid Neoplasms , Humans , Thyroid Neoplasms/genetics , Thyroid Neoplasms/diagnosis , Thyroid Neoplasms/therapy , Thyroid Neoplasms/pathology , MicroRNAs/genetics , Biomarkers, Tumor/genetics , Carcinoma, Neuroendocrine/genetics , Carcinoma, Neuroendocrine/pathology , Carcinoma, Neuroendocrine/diagnosis , Prognosis , Computational Biology/methods , Gene Expression Regulation, Neoplastic , Internet , Molecular Targeted Therapy
10.
BMC Cancer ; 24(1): 683, 2024 Jun 05.
Article in English | MEDLINE | ID: mdl-38840078

ABSTRACT

BACKGROUND: MicroRNAs (miRNAs) are found in various organisms, ranging from viruses to humans, and play crucial regulatory roles within cells, participating in a variety of biological processes. In numerous prediction methods for miRNA-disease associations, the issue of over-dependence on both similarity measurement data and the association matrix has not yet been resolved. In this paper, a miRNA-disease association prediction model (called TP-MDA) based on tree path global feature extraction and a fully connected artificial neural network (FANN) with a multi-head self-attention mechanism is proposed. The TP-MDA model utilizes an association tree structure to represent the data relationships, a multi-head self-attention mechanism for extracting feature vectors, and a fully connected artificial neural network with 5-fold cross-validation for model training. RESULTS: The experimental results indicate that the TP-MDA model outperforms the other comparative models, with an AUC of 0.9714. In the case studies of miRNAs associated with colorectal cancer and lung cancer, among the top 15 miRNAs predicted by the model, 12 in colorectal cancer and 15 in lung cancer, respectively, were validated; the accuracy is as high as 0.9227. CONCLUSIONS: The model proposed in this paper can accurately predict miRNA-disease associations and can serve as a valuable reference for data mining and association prediction in the fields of life sciences, biology, and disease genetics, among others.


Subject(s)
MicroRNAs , Neural Networks, Computer , Humans , MicroRNAs/genetics , Genetic Predisposition to Disease , Computational Biology/methods , Colorectal Neoplasms/genetics , Lung Neoplasms/genetics , Algorithms
11.
J Cell Mol Med ; 28(11): e18405, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38842134

ABSTRACT

Prostate cancer (PCa), a prevalent malignancy among elderly males, exhibits a notable rate of progression, even when subjected to conventional androgen deprivation therapy or chemotherapy. An effective progression prediction model would prove invaluable in identifying patients with a higher progression risk. Using bioinformatics strategies, we integrated diverse data sets of PCa to construct a novel risk model based on gene expression and progression-free survival (PFS). The accuracy of the model was assessed through validation using an independent data set. Eight genes were identified as independent prognostic factors and included in the prediction model. Patients assigned to the high-risk cohort demonstrated a shorter PFS, and the areas under the curve of our model in the validation set for 1-year, 3-year, and 5-year PFS were 0.9325, 0.9041 and 0.9070, respectively. Additionally, through the application of single-cell RNA sequencing to two castration-resistant prostate cancer (CRPC) samples and two hormone-sensitive prostate cancer (HSPC) samples, we discovered that luminal cells within CRPC exhibited an elevated risk score. Subsequent molecular biology experiments corroborated our findings, illustrating heightened SYK expression levels within tumour tissues and its contribution to cancer cell migration. We found that the knockdown of SYK could inhibit migration in PCa cells. Our progression-related risk model demonstrated the potential prognostic value of SYK and indicated its potential as a target for future diagnosis and treatment strategies in PCa management.


Subject(s)
Computational Biology , Disease Progression , Gene Expression Regulation, Neoplastic , Prostatic Neoplasms , Male , Humans , Computational Biology/methods , Prognosis , Prostatic Neoplasms/genetics , Prostatic Neoplasms/pathology , Prostatic Neoplasms/diagnosis , Gene Expression Profiling , Biomarkers, Tumor/genetics , Risk Factors , Cell Line, Tumor
12.
J Cell Mol Med ; 28(11): e18442, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38842135

ABSTRACT

Epithelial-mesenchymal transition (EMT) and its reversal process are important potential mechanisms in the development of hepatocellular carcinoma (HCC). Selaginella doederleinii Hieron is widely used in Traditional Chinese Medicine for the treatment of various tumours, and Amentoflavone is its main active ingredient. This study investigates the mechanism of action of Amentoflavone on EMT in HCC from the perspective of bioinformatics and network pharmacology. Bioinformatics was used to screen Amentoflavone-regulated EMT genes that are closely related to the prognosis of HCC, and a molecular prediction model was established to assess the prognosis of HCC. Network pharmacology was used to predict the pathway axis regulated by Amentoflavone. Molecular docking of Amentoflavone with corresponding targets was performed. The effects of Amentoflavone on cell proliferation, migration, invasion and apoptosis were detected and evaluated by CCK-8 kit, wound healing assay, Transwell assay and Annexin V-FITC/propidium iodide (PI) staining. Eventually three core genes were screened, including NR1I2, CDK1 and CHEK1. A total of 590 GO enrichment entries were obtained, and five enrichment results were obtained by KEGG pathway analysis. Genes were mainly enriched in the p53 signalling pathway. The outcomes of both the wound healing assay and the Transwell assay demonstrated significant inhibition of migration and invasion in HCC cells upon exposure to different concentrations of Amentoflavone. The results of the Annexin V-FITC/PI staining assay showed that different concentrations of Amentoflavone induce apoptosis in HCC cells. This study revealed that Amentoflavone reverses EMT in HCC, possibly by inhibiting the expression of core genes and blocking the p53 signalling pathway axis to inhibit the migration and invasion of HCC cells.


Subject(s)
Apoptosis , Biflavonoids , Carcinoma, Hepatocellular , Cell Movement , Cell Proliferation , Epithelial-Mesenchymal Transition , Gene Expression Regulation, Neoplastic , Liver Neoplasms , Signal Transduction , Tumor Suppressor Protein p53 , Carcinoma, Hepatocellular/metabolism , Carcinoma, Hepatocellular/pathology , Carcinoma, Hepatocellular/drug therapy , Carcinoma, Hepatocellular/genetics , Epithelial-Mesenchymal Transition/drug effects , Humans , Liver Neoplasms/metabolism , Liver Neoplasms/pathology , Liver Neoplasms/drug therapy , Liver Neoplasms/genetics , Biflavonoids/pharmacology , Tumor Suppressor Protein p53/metabolism , Tumor Suppressor Protein p53/genetics , Signal Transduction/drug effects , Cell Movement/drug effects , Cell Proliferation/drug effects , Apoptosis/drug effects , Gene Expression Regulation, Neoplastic/drug effects , Cell Line, Tumor , Molecular Docking Simulation , Computational Biology/methods
13.
Brief Bioinform ; 25(4), 2024 May 23.
Article in English | MEDLINE | ID: mdl-38842509

ABSTRACT

Peptide- and protein-based therapeutics are becoming a promising treatment regimen for myriad diseases. Toxicity of proteins is the primary hurdle for protein-based therapies. Thus, there is an urgent need for accurate in silico methods for determining toxic proteins to filter the pool of potential candidates. At the same time, it is imperative to precisely identify non-toxic proteins to expand the possibilities for protein-based biologics. To address this challenge, we proposed an ensemble framework, called VISH-Pred, comprising models built by fine-tuning ESM2 transformer models on a large, experimentally validated, curated dataset of protein and peptide toxicities. The primary steps in the VISH-Pred framework are to efficiently estimate protein toxicities taking just the protein sequence as input, employing an undersampling technique to handle the severe class imbalance in the data and learning representations from fine-tuned ESM2 protein language models, which are then fed to machine learning techniques such as LightGBM and XGBoost. The VISH-Pred framework is able to correctly identify both peptides/proteins with potential toxicity and non-toxic proteins, achieving a Matthews correlation coefficient of 0.737, 0.716 and 0.322 and F1-score of 0.759, 0.696 and 0.713 on three non-redundant blind tests, respectively, outperforming other methods by over 10% on these quality metrics. Moreover, VISH-Pred achieved the best accuracy and area under the receiver operating characteristic curve scores on these independent test sets, highlighting the robustness and generalization capability of the framework. By making VISH-Pred available as an easy-to-use web server, we expect it to serve as a valuable asset for future endeavors aimed at discerning the toxicity of peptides and enabling efficient protein-based therapeutics.


Subject(s)
Proteins , Proteins/metabolism , Proteins/chemistry , Machine Learning , Databases, Protein , Computational Biology/methods , Humans , Peptides/toxicity , Peptides/chemistry , Computer Simulation , Algorithms , Software
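The two quality metrics reported in the VISH-Pred abstract above, the Matthews correlation coefficient (MCC) and the F1-score, can both be computed from a binary confusion matrix. The sketch below uses the standard definitions; the counts in the example are invented for illustration and are not taken from the paper.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC from confusion-matrix counts; returns 0.0 when the denominator is zero."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when there are no positives at all."""
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        return 0.0
    return 2 * tp / denominator


# Hypothetical counts for a toxic (positive) vs non-toxic (negative) classifier.
tp, tn, fp, fn = 40, 45, 5, 10
print(round(matthews_corrcoef(tp, tn, fp, fn), 4))
print(round(f1_score(tp, fp, fn), 4))
```

MCC uses all four cells of the confusion matrix, which is one reason it is often preferred over F1 on imbalanced data such as the toxicity datasets the abstract describes.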
14.
Brief Bioinform ; 25(4), 2024 May 23.
Article in English | MEDLINE | ID: mdl-38842510

ABSTRACT

Accurate and comprehensive annotation of microprotein-coding small open reading frames (smORFs) is critical to our understanding of normal physiology and disease. Empirical identification of translated smORFs is carried out primarily using ribosome profiling (Ribo-seq). While effective, published Ribo-seq datasets can vary drastically in quality and different analysis tools are frequently employed. Here, we examine the impact of these factors on identifying translated smORFs. We compared five commonly used software tools that assess open reading frame translation from Ribo-seq (RibORFv0.1, RibORFv1.0, RiboCode, ORFquant, and Ribo-TISH) and found surprisingly low agreement across all tools. Only ~2% of smORFs were called translated by all five tools, and ~15% by three or more tools when assessing the same high-resolution Ribo-seq dataset. For larger annotated genes, the same analysis showed ~74% agreement across all five tools. We also found that some tools are strongly biased against low-resolution Ribo-seq data, while others are more tolerant. Analyzing Ribo-seq coverage revealed that smORFs detected by more than one tool tend to have higher translation levels and higher fractions of in-frame reads, consistent with what was observed for annotated genes. Together these results support employing multiple tools to identify the most confident microprotein-coding smORFs and choosing the tools based on the quality of the dataset and the planned downstream characterization experiments of the predicted smORFs.


Subject(s)
Open Reading Frames , Software , Ribosomes/metabolism , Ribosomes/genetics , Molecular Sequence Annotation/methods , Humans , Protein Biosynthesis , Computational Biology/methods , Ribosome Profiling
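The cross-tool agreement figures in the smORF abstract above (the share of smORFs called by all five tools, and by three or more) amount to counting how many tools support each call. A minimal sketch follows, assuming each tool's output has already been reduced to a set of smORF identifiers and that the denominator is the union of smORFs called by any tool; that denominator, the function name `agreement_fractions`, and the toy identifiers are assumptions and are not taken from the paper.

```python
from collections import Counter


def agreement_fractions(calls, min_tools=3):
    """Return (fraction called by all tools, fraction called by >= min_tools tools).

    calls maps tool name -> set of smORF identifiers called translated.
    Fractions are relative to the union of all called smORFs.
    """
    support = Counter()
    for called in calls.values():
        support.update(called)  # each tool contributes one vote per smORF

    total = len(support)
    if total == 0:
        return 0.0, 0.0
    n_tools = len(calls)
    by_all = sum(1 for votes in support.values() if votes == n_tools)
    by_min = sum(1 for votes in support.values() if votes >= min_tools)
    return by_all / total, by_min / total


# Toy calls keyed by the tool names listed in the abstract; identifiers are made up.
calls = {
    "RibORFv0.1": {"s1", "s2", "s3"},
    "RibORFv1.0": {"s1", "s2", "s4"},
    "RiboCode": {"s1", "s2"},
    "ORFquant": {"s1", "s5"},
    "Ribo-TISH": {"s1"},
}
print(agreement_fractions(calls))
```

Ranking smORFs by the number of supporting tools, as the `support` counter does, is one simple way to act on the abstract's recommendation to combine multiple tools.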
16.
Front Immunol ; 15: 1302909, 2024.
Article in English | MEDLINE | ID: mdl-38846934

ABSTRACT

Background: Membranous nephropathy (MN) is an autoimmune disease and represents the most prevalent type of renal pathology in adult patients afflicted with nephrotic syndrome. Despite substantial evidence suggesting a possible link between MN and cancer, the precise underlying mechanisms remain elusive. Methods: In this study, we acquired and integrated two MN datasets (a single-cell dataset and a bulk RNA-seq dataset) from the Gene Expression Omnibus database for differentially expressed gene (DEG) analysis. Hub genes were obtained with the LASSO and random forest algorithms, their diagnostic ability was assessed using ROC curves, and the degree of immune cell infiltration was evaluated using the ssGSEA function. Concurrently, we gathered pan-cancer-related genes from the TCGA and GTEx databases to analyze the expression, mutation status, drug sensitivity, and prognosis of hub genes in pan-cancer. Results: Intersecting the set of 318 senescence-related genes with the 366 DEGs identified 13 senescence-related DEGs. We then analyzed these genes using the LASSO and random forest algorithms, and intersecting the two results yielded six hub genes (PIK3R1, CCND1, TERF2IP, SLC25A4, CAPN2, and TXN). ROC curves suggest that these hub genes discriminate MN well. After performing correlation analysis, examining immune infiltration, and conducting a comprehensive pan-cancer investigation, we validated these six hub genes through immunohistochemical analysis using human renal biopsy tissues. The pan-cancer analysis highlights the robust association between these hub genes and the prognoses of individuals with diverse cancer types, further underscoring the importance of mutations within these hub genes across various cancers.
Conclusion: This evidence indicates that these genes could potentially play a pivotal role as a critical link connecting MN and cancer. As a result, they may hold promise as valuable targets for intervention in cases of both MN and cancer.
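
The hub-gene step described in this abstract (keeping only genes selected by both LASSO and random forest) amounts to a set intersection. The sketch below shows that step alone, assuming each method has already produced a list of selected genes; the helper name `intersect_selected` and the gene labels are hypothetical placeholders, not the study's gene lists.

```python
def intersect_selected(*selections):
    """Return genes picked by every feature-selection method, sorted for stable output."""
    sets = [set(s) for s in selections]
    return sorted(set.intersection(*sets)) if sets else []


# Hypothetical placeholder outputs of two feature-selection methods.
lasso_selected = ["G1", "G2", "G3", "G5"]
rf_selected = ["G2", "G3", "G4", "G5"]
hub = intersect_selected(lasso_selected, rf_selected)
```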


Subject(s)
Glomerulonephritis, Membranous , Humans , Glomerulonephritis, Membranous/genetics , Glomerulonephritis, Membranous/immunology , Glomerulonephritis, Membranous/diagnosis , Glomerulonephritis, Membranous/metabolism , Gene Expression Profiling , Neoplasms/genetics , Neoplasms/immunology , Neoplasms/metabolism , Computational Biology/methods , Prognosis , Biomarkers, Tumor/genetics , Transcriptome , Gene Regulatory Networks , Biomarkers , Databases, Genetic
17.
PLoS One ; 19(6): e0303628, 2024.
Article in English | MEDLINE | ID: mdl-38843230

ABSTRACT

Genes strictly regulate the development of teeth and their surrounding oral structures. Alteration of gene regulation leads to tooth disorders and developmental anomalies in tooth, oral, and facial regions. With the advancement of gene sequencing technology, genomic data is rapidly increasing. However, the large sets of genomic and proteomic data related to tooth development and dental disorders are currently dispersed across many primary databases and the literature, making them difficult for users to navigate, extract, study, or analyze. We have curated the scattered genetic data on tooth development and created a knowledgebase called 'Bioinformatics for Dentistry' (https://dentalbioinformatics.com/). This database compiles genomic and proteomic data on human tooth development and developmental anomalies and organizes them according to their roles in different stages of tooth development. The database was built by systematically curating relevant data from the National Center for Biotechnology Information (NCBI) GenBank, OMIM: Online Mendelian Inheritance in Man, AlphaFold Protein Structure Database, Reactome pathway knowledgebase, Wiki Pathways, and PubMed. The accuracy of the included data was verified against supporting primary literature. Upon data curation and validation, a simple, easy-to-navigate browser interface was created on WordPress version 6.3.2, with PHP version 8.0. The website is hosted on a cloud hosting service to provide fast and reliable data transfer rates. Plugins are used to ensure the browser's compatibility across different devices. Bioinformatics for Dentistry contains four embedded filters for complex and specific searches and free-text search options for quick and simple searching through the datasets. Bioinformatics for Dentistry is made freely available worldwide, with the hope that this knowledgebase will improve our understanding of the complex genetic regulation of tooth development and will open doors to research initiatives and discoveries.
This database will be expanded in the future by incorporating resources and built-in sequence analysis tools, and it will be maintained and updated annually.


Subject(s)
Computational Biology , Databases, Genetic , Tooth , Humans , Computational Biology/methods , Tooth/growth & development , Odontogenesis/genetics , Dentistry , Proteomics/methods , Genomics/methods
18.
BMC Med Inform Decis Mak ; 24(1): 159, 2024 Jun 06.
Article in English | MEDLINE | ID: mdl-38844961

ABSTRACT

BACKGROUND: Compared with time-consuming and labor-intensive biological validation in vitro or in vivo, computational models can rapidly provide high-quality, targeted candidates. Existing computational models face limitations in effectively utilizing sparse local structural information for accurate predictions in circRNA-disease associations. This study addresses this challenge with a proposed method, CDA-DGRL (Prediction of CircRNA-Disease Association based on Double-line Graph Representation Learning), which employs a deep learning framework leveraging graph networks and a dual-line representation model integrating graph node features. METHOD: CDA-DGRL comprises several key steps. First, diverse biological information is integrated to compute integrated similarities among circRNAs and diseases, leading to the construction of a heterogeneous network specific to circRNA-disease associations. Second, circRNA and disease node features are derived using sparse autoencoders. Third, a graph convolutional neural network is employed to capture the local graph network structure, taking the circRNA-disease heterogeneous network and the node features as input. Fourth, node2vec performs depth-first sampling of the circRNA-disease heterogeneous network to capture the global graph network structure, addressing issues associated with sparse raw data. Finally, the fused local and global graph network structures are fed into an extra trees classifier to identify potential circRNA-disease associations. RESULTS: The results, obtained through five-fold cross-validation on the circR2Disease dataset, demonstrate the superiority of CDA-DGRL, with an AUC value of 0.9866 and an AUPR value of 0.9897, compared to existing state-of-the-art models. Notably, the extra trees classifier employed in this model outperforms other machine learning classifiers.
CONCLUSION: Thus, CDA-DGRL stands as a promising methodology for reliably identifying circRNA-disease associations, offering potential avenues to alleviate the necessity for extensive traditional biological experiments. The source code and data for this study are available at https://github.com/zywait/CDA-DGRL .
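
For readers unfamiliar with the AUC metric reported above, the following is a minimal sketch of how an AUC can be computed from predicted association scores: it equals the probability that a randomly chosen positive pair is scored above a randomly chosen negative pair, with ties counted as one half. This is a generic illustration and not code from the CDA-DGRL repository; the function name `roc_auc`, the labels, and the scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as P(score of a random positive > score of a random negative).

    Ties contribute 0.5. This pairwise form is quadratic in the number of
    samples, which is fine for a small illustration.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = known association) and predicted scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

In a five-fold cross-validation setting, a value like this would be computed on each held-out fold and then averaged.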


Subject(s)
Biomarkers, Tumor , Neoplasms , RNA, Circular , Humans , RNA, Circular/genetics , Neoplasms/genetics , Biomarkers, Tumor/genetics , Deep Learning , Computational Biology/methods , Neural Networks, Computer
19.
BMC Complement Med Ther ; 24(Suppl 2): 218, 2024 Jun 06.
Article in English | MEDLINE | ID: mdl-38845010

ABSTRACT

BACKGROUND: Natural herbs are frequently used to treat diseases or to relieve symptoms in many countries. Moreover, because their safety has been established through long use, they are considered major sources for new drug development. However, in many cases, herbs are still prescribed on the basis of ancient records and/or traditional practices without scientific evidence. More importantly, the medicinal efficacy of herbs has to be evaluated from the perspective of MCMT (multi-compound multi-target) effects, but most efforts focus on identifying and analyzing a single compound experimentally. To overcome these hurdles, computational approaches that are based on scientific evidence and are able to handle MCMT effects are needed to predict herb-disease associations. RESULTS: In this study, we proposed a network-based in silico method to predict herb-disease associations. To this end, we devised a new network-based measure, WACP (weighted average closest path length), which not only quantifies proximity between herb-related genes and disease-related genes but also considers the compound composition of each herb. As a result, we confirmed that our method successfully predicts herb-disease associations in the human protein interactome (AUROC = 0.777). In addition, we observed that our method is superior to other simple network-based proximity measures (e.g. average shortest and closest path length). Additionally, we analyzed the associations between Brassica oleracea var. italica and its known associated diseases more specifically as case studies. Finally, based on the prediction results of the WACP, we suggested novel herb-disease pairs that are expected to have potential relations, together with supporting evidence from the literature.
CONCLUSIONS: This method could be a promising solution to modernize the use of natural herbs by providing scientific evidence about the molecular associations between the herb-related genes targeted by multiple compounds and the disease-related genes in the human protein interactome.
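
The abstract does not give the WACP formula, so the sketch below is only an illustration of the general idea under stated assumptions. It computes the plain closest path length (the average, over source genes, of the shortest-path distance to the nearest target gene) on an unweighted network, and then combines per-compound values with a weighted average. The weighting scheme, the function names, the toy network, and the compound weights are all hypothetical and are not taken from the paper.

```python
from collections import deque


def bfs_distances(graph, source):
    """Shortest-path lengths from `source` in an unweighted graph (adjacency dict)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def closest_path_length(graph, source_genes, target_genes):
    """Average over source genes of the distance to the nearest reachable target gene."""
    total, count = 0, 0
    for gene in source_genes:
        dist = bfs_distances(graph, gene)
        reachable = [dist[t] for t in target_genes if t in dist]
        if reachable:
            total += min(reachable)
            count += 1
    return total / count if count else float("inf")


def weighted_closest_path_length(graph, compound_targets, weights, disease_genes):
    """Hypothetical weighting: average per-compound closest distances by compound weight."""
    numerator = denominator = 0.0
    for compound, genes in compound_targets.items():
        d = closest_path_length(graph, genes, disease_genes)
        if d != float("inf"):
            numerator += weights[compound] * d
            denominator += weights[compound]
    return numerator / denominator if denominator else float("inf")


# Toy path network A - B - C - D - E (hypothetical, not a real interactome).
toy_network = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"]}
compound_targets = {"c1": {"A"}, "c2": {"C", "D"}}
weights = {"c1": 1.0, "c2": 3.0}  # e.g. relative abundance; an assumption
disease_genes = {"E"}
score = weighted_closest_path_length(toy_network, compound_targets, weights, disease_genes)
```

A smaller score means the herb's compound targets sit closer to the disease genes in the network; ranking herb-disease pairs by such a score is what an AUROC evaluation would then assess.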


Subject(s)
Protein Interaction Maps , Humans , Computer Simulation , Computational Biology/methods
20.
Front Immunol ; 15: 1394593, 2024.
Article in English | MEDLINE | ID: mdl-38835776

ABSTRACT

Background: Microsatellite instability (MSI) secondary to mismatch repair (MMR) deficiency is characterized by insertions and deletions (indels) in short DNA sequences across the genome. These indels can generate neoantigens, which are ideal targets for precision immune interception. However, current neoantigen databases lack information on neoantigens arising from coding microsatellites. To address this gap, we introduce The MicrOsatellite Neoantigen Discovery Tool (MONET). Method: MONET identifies potential mutated tumor-specific neoantigens (neoAgs) by predicting frameshift mutations in coding microsatellite sequences of the human genome. Then MONET annotates these neoAgs with key features such as binding affinity, stability, expression, frequency, and potential pathogenicity using established algorithms, tools, and public databases. A user-friendly web interface (https://monet.mdanderson.org/) facilitates access to these predictions. Results: MONET predicts over 4 million and 15 million Class I and Class II potential frameshift neoAgs, respectively. Compared to existing databases, MONET demonstrates superior coverage (>85% vs. <25%) using a set of experimentally validated neoAgs. Conclusion: MONET is a freely available, user-friendly web tool that leverages publicly available resources to identify neoAgs derived from microsatellite loci. This systems biology approach empowers researchers in the field of precision immune interception.
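
To make the idea of a frameshift neoantigen concrete, the sketch below deletes one base from the longest homopolymer run of a toy coding sequence and translates both versions with the standard genetic code. It is a simplified illustration, not MONET's implementation: the function names, the toy sequence, and the restriction to a single 1-bp deletion in a mononucleotide repeat are all assumptions.

```python
from itertools import groupby, product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a coding sequence until the first stop codon or the last full codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def longest_homopolymer(seq):
    """Return (start, length) of the longest single-base run in `seq`."""
    best_start, best_len, pos = 0, 0, 0
    for _, group in groupby(seq):
        n = len(list(group))
        if n > best_len:
            best_start, best_len = pos, n
        pos += n
    return best_start, best_len


def delete_one_base_in_longest_run(seq):
    """Simulate a 1-bp deletion inside the longest homopolymer run."""
    start, _ = longest_homopolymer(seq)
    return seq[:start] + seq[start + 1:]


# Hypothetical toy coding sequence containing an A8 mononucleotide repeat.
wild_type = "ATGGCAAAAAAAAGCTGGTTCTAA"
mutant = delete_one_base_in_longest_run(wild_type)
wt_protein = translate(wild_type)
mut_protein = translate(mutant)
```

The two proteins share the residues upstream of the repeat and diverge after it; the novel downstream residues are the kind of tumor-specific sequence that a neoantigen tool would go on to score for binding affinity and other features.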


Subject(s)
Antigens, Neoplasm , Databases, Genetic , Microsatellite Repeats , Humans , Microsatellite Repeats/genetics , Antigens, Neoplasm/genetics , Antigens, Neoplasm/immunology , Microsatellite Instability , Frameshift Mutation , Software , Computational Biology/methods , Neoplasms/genetics , Neoplasms/immunology