1.
PeerJ ; 12: e17396, 2024.
Article in English | MEDLINE | ID: mdl-38799058

ABSTRACT

Deciphering the targets of microRNAs (miRNAs) in plants is crucial for comprehending their function and the variation in phenotype that they cause. Because miRNA regulation is highly cell-specific, recent computational approaches usually utilize expression data to identify the most physiologically relevant targets. Although these methods are effective, they typically require a large sample size and high-depth sequencing to detect potential miRNA-target pairs, thereby limiting their applicability in improving plant breeding. In this study, we propose a novel miRNA-target prediction framework named kmerPMTF (k-mer-based prediction framework for plant miRNA-target). Our framework effectively extracts the latent semantic embeddings of sequences by utilizing k-mer splitting and a deep self-supervised neural network. We construct multiple similarity networks based on k-mer embeddings and employ graph convolutional networks to derive deep representations of miRNAs and targets and calculate the probabilities of potential associations. We evaluated the performance of kmerPMTF on four typical plant datasets: Arabidopsis thaliana, Oryza sativa, Solanum lycopersicum, and Prunus persica. The results demonstrate its ability to achieve AUPRC values of 84.9%, 91.0%, 80.1%, and 82.1%, respectively, in 5-fold cross-validation. Compared with several state-of-the-art methods, our framework achieves better performance on threshold-independent evaluation metrics. Overall, our study provides an efficient and simplified methodology for identifying plant miRNA-target associations, which will contribute to a deeper comprehension of miRNA regulatory mechanisms in plants.


Subject(s)
MicroRNAs , Neural Networks, Computer , MicroRNAs/genetics , MicroRNAs/metabolism , RNA, Plant/genetics , RNA, Plant/metabolism , Computational Biology/methods , Gene Expression Regulation, Plant
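
The k-mer splitting step named in the kmerPMTF abstract can be illustrated with a short sketch. The abstract does not state the value of k or the stride, so the overlapping stride-1 windows below are an assumption, and the function name `split_kmers` and the example sequence are hypothetical.

```python
def split_kmers(sequence: str, k: int) -> list[str]:
    """Return all overlapping k-mers of `sequence` (stride 1, an assumption)."""
    if k <= 0:
        raise ValueError("k must be positive")
    # A sequence shorter than k yields an empty list.
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


# Hypothetical toy RNA fragment, not taken from the paper.
tokens = split_kmers("UGACAGAAG", 3)
print(tokens)
```

Each k-mer then plays the role of a "word" that an embedding model can consume; how kmerPMTF embeds these tokens is not detailed in the abstract.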
2.
BMC Genomics ; 25(1): 175, 2024 Feb 14.
Article in English | MEDLINE | ID: mdl-38350848

ABSTRACT

BACKGROUND: Brain diseases pose a significant threat to human health, and various network-based methods have been proposed for identifying gene biomarkers associated with these diseases. However, the brain is a complex system, and extracting topological semantics from different brain networks is necessary yet challenging to identify pathogenic genes for brain diseases. RESULTS: In this study, we present a multi-network representation learning framework called M-GBBD for the identification of gene biomarkers in brain diseases. Specifically, we collected multi-omics data to construct eleven networks from different perspectives. M-GBBD extracts the spatial distributions of features from these networks and iteratively optimizes them using Kullback-Leibler divergence to fuse the networks into a common semantic space that represents the gene network for the brain. Subsequently, a graph consisting of both gene and large-scale disease proximity networks learns representations through graph convolution techniques and predicts whether a gene is associated with brain diseases while providing association scores. Experimental results demonstrate that M-GBBD outperforms several baseline methods. Furthermore, our analysis supported by bioinformatics revealed CAMP, identified by M-GBBD, as a gene significantly associated with Alzheimer's disease. CONCLUSION: Collectively, M-GBBD provides valuable insights into identifying gene biomarkers for brain diseases and serves as a promising framework for brain network representation learning.


Subject(s)
Alzheimer Disease , Semantics , Humans , Brain/diagnostic imaging , Alzheimer Disease/genetics , Genetic Markers , Learning
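
The M-GBBD abstract mentions Kullback-Leibler divergence as the quantity being optimized during network fusion. As a reminder of what that quantity measures, here is a minimal sketch for two discrete distributions. It shows only the textbook definition; how M-GBBD builds its distributions and applies the divergence is not given in the abstract, and the function name is hypothetical.

```python
import math


def kl_divergence(p: list[float], q: list[float]) -> float:
    """D_KL(P || Q) = sum_i p_i * log(p_i / q_i), in nats."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # 0 * log(0 / q) is taken as 0 by convention
        if qi == 0.0:
            return math.inf  # P puts mass where Q has none
        total += pi * math.log(pi / qi)
    return total


# Toy distributions, chosen only for illustration.
print(kl_divergence([0.5, 0.5], [0.25, 0.75]))
```

The divergence is zero only when the two distributions coincide, and it is not symmetric in its arguments.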
3.
Interdiscip Sci ; 2024 Feb 04.
Article in English | MEDLINE | ID: mdl-38310628

ABSTRACT

MicroRNA (miRNA) serves as a pivotal regulator of numerous cellular processes, and the identification of miRNA-disease associations (MDAs) is crucial for comprehending complex diseases. Recently, graph neural networks (GNN) have made significant advancements in MDA prediction. However, these methods tend to learn one type of node representation from a single heterogeneous network, ignoring the importance of multiple network topologies and node attributes. Here, we propose SMDAP (Sequence hierarchical modeling-based Mirna-Disease Association Prediction framework), a novel GNN-based framework that incorporates multiple network topologies and various node attributes including miRNA seed and full-length sequences to predict potential MDAs. Specifically, SMDAP consists of two types of MDA representation: following a heterogeneous pattern, we construct a transfer learning-like synchronous mutual learning network to learn the first MDA representation in conjunction with the miRNA seed sequence. Meanwhile, following a homogeneous pattern, we design a subgraph-inspired asynchronous multi-scale embedding network to obtain the second MDA representation based on the miRNA full-length sequence. Subsequently, an adaptive fusion approach is designed to combine the two branches such that we can score the MDAs by the downstream classifier and infer novel MDAs. Comprehensive experiments demonstrate that SMDAP integrates the advantages of multiple network topologies and node attributes into two branch representations. Moreover, the area under the receiver operating characteristic curve is 0.9622 on DB1, which is a 5.06% increase from the baselines. The area under the precision-recall curve is 0.9777, which is a 7.33% increase from the baselines. In addition, case studies on three human cancers validated the predictive performance of SMDAP. Overall, SMDAP represents a powerful tool for MDA prediction.

4.
Biology (Basel) ; 12(9), 2023 Sep 03.
Article in English | MEDLINE | ID: mdl-37759602

ABSTRACT

The recently emerging high-throughput Pore-C (HiPore-C) can identify whole-genome high-order chromatin multi-way interactions with an ultra-high output, contributing to deciphering three-dimensional (3D) genome organization. However, it also brings new challenges to relevant data analysis. To alleviate this problem, we proposed the EpiMCI, a model for multi-way chromatin interaction prediction based on a hypergraph neural network with epigenomic signals as the input. The EpiMCI integrated separate hyperedge representations with coupling hyperedge information and obtained AUCs of 0.981 and 0.984 in the GM12878 and K562 datasets, respectively, which outperformed the currently available method. Moreover, the EpiMCI can be applied to denoise the HiPore-C data and improve the data quality efficiently. Furthermore, the vertex embeddings extracted from the EpiMCI reflected the global chromatin architecture accurately. The principal component analysis suggested that they were well aligned with the activities of genomic regions at the chromatin compartment level. Taken together, the EpiMCI can accurately predict multi-way chromatin interactions and can be applied to studies relying on chromatin architecture.
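
A hypergraph differs from an ordinary graph in that one hyperedge can join more than two vertices, which is what makes it a natural fit for multi-way chromatin contacts. The sketch below builds the standard vertex-by-hyperedge incidence matrix for a toy example. The bin indices and the function name are hypothetical, and the abstract does not say which representation EpiMCI uses internally.

```python
def incidence_matrix(num_vertices: int, hyperedges: list[tuple[int, ...]]) -> list[list[int]]:
    """Return H where H[v][e] == 1 if vertex v belongs to hyperedge e."""
    matrix = [[0] * len(hyperedges) for _ in range(num_vertices)]
    for edge_index, members in enumerate(hyperedges):
        for vertex in members:
            matrix[vertex][edge_index] = 1
    return matrix


# Toy example: 4 genomic bins, one 3-way contact and one pairwise contact.
H = incidence_matrix(4, [(0, 1, 2), (1, 3)])
vertex_degrees = [sum(row) for row in H]  # number of hyperedges per bin
print(H, vertex_degrees)
```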

5.
ACS Omega ; 8(30): 27386-27397, 2023 Aug 01.
Article in English | MEDLINE | ID: mdl-37546619

ABSTRACT

Computationally identifying noncoding RNA (ncRNA)-drug resistance associations would have a marked effect on understanding ncRNA molecular function and drug target mechanisms and on reducing the screening cost of the corresponding wet-lab experiments. Although graph neural network-based methods have been developed and have facilitated the detection of ncRNAs related to drug resistance, it remains a challenge to build a highly trustworthy ncRNA-drug resistance association prediction framework, due to inevitable noise edges originating from the batch effect and experimental errors. Herein, we proposed a framework, referred to as RDRGSE (RDR association prediction by using graph skeleton extraction and attentional feature fusion), for detecting ncRNA-drug resistance associations. Specifically, starting with the construction of the original ncRNA-drug resistance associations as a bipartite graph, RDRGSE took advantage of a bi-view skeleton extraction strategy to obtain two types of skeleton views, followed by a graph neural network-based estimator for iteratively optimizing the skeleton views, aimed at jointly learning high-quality ncRNA-drug resistance edge embeddings and the optimal graph skeleton structure. Then, RDRGSE adopted adaptive attentional feature fusion to obtain the final edge embedding and identified potential ncRNA-drug resistance associations (RDRAs) in an end-to-end manner. Comprehensive experiments were conducted, and experimental results indicated the significant advantage of a skeleton structure for ncRNA-drug resistance association discovery. Compared with state-of-the-art approaches, RDRGSE improved the prediction performance by 6.7% in terms of AUC and 6.1% in terms of AUPR. Also, ablation-like analysis and independent case studies corroborated the generalization ability and robustness of RDRGSE. Overall, RDRGSE provides a powerful computational method for ncRNA-drug resistance association prediction, which can also serve as a screening tool for drug resistance biomarkers.

6.
Front Genet ; 14: 1084482, 2023.
Article in English | MEDLINE | ID: mdl-37274787

ABSTRACT

Identification of long non-coding RNAs (lncRNAs) associated with common diseases is crucial for patient self-diagnosis and monitoring of health conditions using artificial intelligence (AI) technology at home. LncRNAs have gained significant attention due to their crucial roles in the pathogenesis of complex human diseases and identifying their associations with diseases can aid in developing diagnostic biomarkers at the molecular level. Computational methods for predicting lncRNA-disease associations (LDAs) have become necessary due to the time-consuming and labor-intensive nature of wet biological experiments in hospitals, enabling patients to access LDAs through their AI terminal devices at any time. Here, we have developed a predictive tool, LDAGRL, for identifying potential LDAs using a bridge heterogeneous information network (BHnet) constructed via Structural Deep Network Embedding (SDNE). The BHnet consists of three types of molecules as bridge nodes to implicitly link the lncRNA with disease nodes and the SDNE is used to learn high-quality node representations and make LDA predictions in a unified graph space. To assess the feasibility and performance of LDAGRL, extensive experiments, including 5-fold cross-validation, comparison with state-of-the-art methods, comparison on different classifiers and comparison of different node feature combinations, were conducted, and the results showed that LDAGRL achieved satisfactory prediction performance, indicating its potential as an effective LDAs prediction tool for family medicine and primary care.

7.
Brief Bioinform ; 24(4), 2023 Jul 20.
Article in English | MEDLINE | ID: mdl-37381618

ABSTRACT

Although sequencing-based high-throughput chromatin interaction data are widely used to uncover genome-wide three-dimensional chromatin architecture, their sparseness and low signal-to-noise ratio greatly restrict the precision of the obtained structural elements. To improve data quality, we here present iEnhance (chromatin interaction data resolution enhancement), a multi-scale spatial projection and encoding network, to predict high-resolution chromatin interaction matrices from low-resolution and noisy input data. Specifically, iEnhance projects the input data into matrix spaces to extract multi-scale global and local feature sets, then hierarchically fuses these features with an attention mechanism. After that, dense channel encoding and residual channel decoding are used to effectively infer robust chromatin interaction maps. iEnhance outperforms state-of-the-art Hi-C resolution enhancement tools in both visual and quantitative evaluation. Comprehensive analysis shows that unlike other tools, iEnhance can recover both short-range structural elements and long-range interaction patterns precisely. More importantly, iEnhance can be transferred to data enhancement of other tissues or cell lines of unknown resolution. Furthermore, iEnhance performs robustly in enhancement of diverse chromatin interaction data including those from single-cell Hi-C and Micro-C experiments.


Subject(s)
Chromatin , Chromosomes , Chromatin/genetics , Genome , Cell Line
8.
BMC Genomics ; 24(1): 256, 2023 May 11.
Article in English | MEDLINE | ID: mdl-37170226

ABSTRACT

BACKGROUND: As an important source of genetic variation, copy number variation (CNV) can alter the dosage of DNA segments, which in turn may affect gene expression level and phenotype. However, our knowledge of CNV in apple is still limited. Here, we obtained high-confidence CNVs and investigated their functional impact based on genome resequencing data of two apple populations, cultivars and wild relatives. RESULTS: In this study, we identified 914,610 CNVs comprising 14,839 CNV regions (CNVRs) from 346 apple accessions, including 289 cultivars and 57 wild relatives. CNVRs summed to 71.19 Mb, accounting for 10.03% of the apple genome. Being in low linkage disequilibrium (LD) with nearby SNPs, they could also accurately reflect the population structure of apple independently of SNPs. Furthermore, a total of 3,621 genes were covered by CNVRs and functionally involved in biological processes such as defense response, reproduction and metabolic processes. In addition, the population differentiation index ([Formula: see text]) analysis between cultivars and wild relatives revealed 127 CN-differentiated genes, which may contribute to trait differences in these two populations. CONCLUSIONS: This study was based on identification of CNVs from 346 diverse apple accessions, which to our knowledge was the largest dataset for CNV analysis in apple. Our work presented the first comprehensive CNV map and provided valuable resources for understanding genomic variations in apple.


Subject(s)
DNA Copy Number Variations , Malus , Malus/genetics , Genetics, Population , Genome , Phenotype , Polymorphism, Single Nucleotide
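
The figures quoted in the apple CNV abstract can be cross-checked with simple arithmetic. The sketch below derives the accession total, the genome size implied by "71.19 Mb is 10.03%", and the mean CNVR length. The derived values are not reported in the abstract; they are only consistency checks, and the conversion assumes 1 Mb = 1000 kb.

```python
# Figures quoted in the abstract.
cnvr_total_mb = 71.19        # summed CNVR length, in Mb
genome_fraction = 0.1003     # 10.03% of the apple genome
num_cnvrs = 14_839
cultivars, wild_relatives = 289, 57

# Derived values (simple arithmetic only, not reported in the abstract).
accessions = cultivars + wild_relatives
implied_genome_mb = cnvr_total_mb / genome_fraction
mean_cnvr_kb = cnvr_total_mb * 1000 / num_cnvrs

print(accessions)
print(round(implied_genome_mb, 1))
print(round(mean_cnvr_kb, 2))
```

The accession counts add up to the stated 346, and the implied reference genome size is roughly 710 Mb.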
9.
BMC Bioinformatics ; 24(1): 18, 2023 Jan 17.
Article in English | MEDLINE | ID: mdl-36650439

ABSTRACT

BACKGROUND: Emerging evidence shows that Piwi-interacting RNAs (piRNAs) play a pivotal role in numerous complex human diseases. Identifying potential piRNA-disease associations (PDAs) is crucial for understanding disease pathogenesis at the molecular level. Compared with wet-lab experiments, computational methods provide a cost-effective strategy. However, few computational methods have been developed so far. RESULTS: Here, we proposed an end-to-end model, referred to as PDA-PRGCN (PDA prediction using subgraph Projection and Residual scaling-based feature augmentation through Graph Convolutional Network). Specifically, starting with the known piRNA-disease associations represented as a graph, we applied subgraph projection to construct piRNA-piRNA and disease-disease subgraphs for the first time, followed by a residual scaling-based feature augmentation algorithm for node initial representation. Then, we adopted a graph convolutional network (GCN) to learn and identify potential PDAs as a link prediction task on the constructed heterogeneous graph. Comprehensive experiments, including the performance comparison of individual components in PDA-PRGCN, indicated the significant improvement of integrating subgraph projection, node feature augmentation and dual-loss mechanism into GCN for PDA prediction. Compared with state-of-the-art approaches, PDA-PRGCN gave more accurate and robust predictions. Finally, the case studies further corroborated that PDA-PRGCN can reliably detect PDAs. CONCLUSION: PDA-PRGCN provides a powerful method for PDA prediction, which can also serve as a screening tool for studies of complex diseases.


Subject(s)
Algorithms , Piwi-Interacting RNA , Humans
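
The subgraph projection mentioned in the PDA-PRGCN abstract can be pictured as a one-mode projection of a bipartite graph: two piRNAs become neighbours when they share at least one disease. The sketch below uses shared-neighbour counts as edge weights, which is an assumption; the abstract does not specify the weighting, and all identifiers and the toy pairs are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def project(associations: list[tuple[str, str]]) -> dict[tuple[str, str], int]:
    """Project (left, right) pairs onto left-left edges.

    The weight of an edge is the number of right-side nodes both ends share.
    """
    members_by_right = defaultdict(set)
    for left, right in associations:
        members_by_right[right].add(left)

    weights: dict[tuple[str, str], int] = defaultdict(int)
    for members in members_by_right.values():
        # Sorting makes each undirected edge appear under one canonical key.
        for a, b in combinations(sorted(members), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Toy piRNA-disease pairs, invented for illustration.
pairs = [("p1", "d1"), ("p2", "d1"), ("p2", "d2"), ("p3", "d2"), ("p1", "d2")]
pirna_graph = project(pairs)
disease_graph = project([(d, p) for p, d in pairs])  # swap sides to project diseases
print(pirna_graph, disease_graph)
```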
10.
Inorg Chem ; 61(23): 8879-8886, 2022 Jun 13.
Article in English | MEDLINE | ID: mdl-35649271

ABSTRACT

The iminoboryl o-carboranes (Me3Si)-Cb-B≡N-R (Cb = B10C2H10, 3a, R = SiMe3; 3b, R = tBu) have been successfully synthesized by tetrahydrofuran (THF)-promoted isomerization from the corresponding o-carborane-fused aminoboriranes Cb{BN(SiMe3)R} (2). The synthetic protocol of the previously reported borirane 2a was optimized. The borirane Cb{BN(SiMe3)tBu} (2b) and the iminoboranes 3a and 3b were fully characterized by NMR, IR, and single-crystal X-ray diffraction analyses. The borirane 2a isomerizes more readily than 2b. The kinetics study revealed a bimolecular mechanism between borirane and THF, which is in good agreement with the computationally proposed reaction pathway. The title compounds are thermally robust, but compound 3a dimerized in the presence of a catalytic amount of tBuNC to give the cyclodimer 4. Quick equilibrium between 4 and the isonitrile adduct 4·tBuNC was observed in solution.
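
The bimolecular mechanism reported in the abstract above corresponds, for a single elementary step between borirane 2 and THF, to a second-order rate law. The expression below is the textbook form only; the abstract gives no rate constants, and the pseudo-first-order simplification assumes THF is present in large excess, which is not stated in the source.

```latex
% Second-order rate law for an elementary bimolecular step (textbook form).
\[
  -\frac{d[\mathbf{2}]}{dt} = k\,[\mathbf{2}]\,[\mathrm{THF}]
\]
% Assumption: THF in large excess, so [THF] is effectively constant.
\[
  -\frac{d[\mathbf{2}]}{dt} \approx k_{\mathrm{obs}}\,[\mathbf{2}],
  \qquad k_{\mathrm{obs}} = k\,[\mathrm{THF}]
\]
```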

11.
Brief Bioinform ; 23(2), 2022 Mar 10.
Article in English | MEDLINE | ID: mdl-35037024

ABSTRACT

Predicting drug-target interactions (DTIs) is a convenient strategy for drug discovery. Although various computational methods have been put forward in recent years, DTI prediction is still a challenging task. In this paper, based on indirect prior information (which we term mediators), we proposed a new model, called Bridging-BPs (bridging paths), for DTI prediction. Specifically, we regarded the linkage process between mediators and DTs (drugs and proteins) as 'bridging' and source (drug)-mediators-destination (protein) as bridging paths. By integrating various bridging paths, we constructed a bridging heterogeneous graph for DTIs. After that, an improved graph-embedding algorithm, BPs2vec, was designed to capture deep topological features underlying the bridging graph, thereby obtaining the low-dimensional node vector representations. Then, the vector representations were fed into a Random Forest classifier to train and score the probability, outputting the final classification results for potential DTIs. Under 5-fold cross-validation, our method obtained an AUPR of 88.97% and an AUC of 88.63%, suggesting that Bridging-BPs could effectively mine the link relationships hidden in indirect prior information and that it significantly improved the accuracy and robustness of DTI prediction without direct prior information. Finally, we confirmed the practical prediction ability of Bridging-BPs by case studies.


Subject(s)
Drug Development , Proteins , Algorithms , Drug Development/methods , Drug Discovery/methods , Drug Interactions , Proteins/metabolism
12.
Front Genet ; 12: 781277, 2021.
Article in English | MEDLINE | ID: mdl-34966413

ABSTRACT

Pseudogenes were originally regarded as non-functional components scattered in the genome during evolution. Recent studies have shown that pseudogenes can be transcribed into long non-coding RNA and play a key role at multiple functional levels in different physiological and pathological processes. microRNAs (miRNAs) are a type of non-coding RNA that plays important regulatory roles in cells. Numerous studies have shown that pseudogenes and miRNAs interact and form a ceRNA network with mRNA to regulate biological processes and are involved in diseases. Exploring the associations of pseudogenes and miRNAs will facilitate the clinical diagnosis of some diseases. Here, we propose a prediction model PMGAE (Pseudogene-MiRNA association prediction based on the Graph Auto-Encoder), which incorporates feature fusion, graph auto-encoder (GAE), and eXtreme Gradient Boosting (XGBoost). First, we calculated three types of similarities including Jaccard similarity, cosine similarity, and Pearson similarity between nodes based on the biological characteristics of pseudogenes and miRNAs. Subsequently, we fused the above similarities to construct a similarity profile as the initial representation features for nodes. Then, we aggregated the similarity profiles and associations of nodes to obtain the low-dimensional representation vector of nodes through a GAE. In the last step, we fed these representation vectors into an XGBoost classifier to predict new pseudogene-miRNA associations (PMAs). The results of five-fold cross-validation show that PMGAE achieves a mean AUC of 0.8634 and mean AUPR of 0.8966. Case studies further substantiated the reliability of PMGAE for mining PMAs and the study of endogenous RNA networks in relation to diseases.
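
The three similarity measures named in the PMGAE abstract have standard definitions, sketched here for two equal-length feature vectors. The binary toy profiles and function names are hypothetical, and the abstract does not say which features the authors compare or how the three scores are fused, so no fusion step is shown.

```python
import math


def jaccard(a: list[int], b: list[int]) -> float:
    """Jaccard similarity of two binary vectors: |A and B| / |A or B|."""
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return intersection / union if union else 0.0


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pearson(a: list[float], b: list[float]) -> float:
    """Pearson correlation, computed as the cosine of the mean-centred vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine([x - mean_a for x in a], [y - mean_b for y in b])


# Toy association profiles, invented for illustration.
u = [1, 1, 0, 0]
v = [1, 0, 1, 0]
print(jaccard(u, v), cosine(u, v), pearson(u, v))
```

On the same pair of profiles the three measures can disagree noticeably, which is one motivation for combining them rather than relying on a single score.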

13.
J Am Chem Soc ; 142(41): 17243-17249, 2020 Oct 14.
Article in English | MEDLINE | ID: mdl-32941023

ABSTRACT

The base-free benzoborirene 1,2-BR-1,2-C6H4 (7) and its three-dimensional inorganic analogue 1,2-BR-1,2-C2B10H10 (13) have been successfully synthesized by Cp2ZrBr2 and LiCl elimination, respectively. The Cl analogue of the key intermediate for the formation of benzoborirene 7 has been isolated and structurally characterized, thus suggesting the reaction pathway via benzyne Zr complex formation, B-Br/Cbenzyne-Zr σ-bond metathesis, and a Cp2ZrBr2 elimination/ring-closing process. The rationality of the reaction pathway has been confirmed by DFT calculations. In addition, the title compounds shared the same reactivity pattern (i.e., 1,3-silyl migration) toward MeIiPr (8), thus allowing for the synthetic approach to the first carborane-substituted iminoborane 14.

14.
Cell Metab ; 27(1): 151-166.e6, 2018 Jan 09.
Article in English | MEDLINE | ID: mdl-29198988

ABSTRACT

Amino acids are known regulators of cellular signaling and physiology, but how they are sensed intracellularly is not fully understood. Herein, we report that each aminoacyl-tRNA synthetase (ARS) senses its cognate amino acid sufficiency through catalyzing the formation of lysine aminoacylation (K-AA) on its specific substrate proteins. At physiologic levels, amino acids promote the binding of ARSs to their substrates and the formation of K-AAs on the ɛ-amine of lysines in their substrates by producing reactive aminoacyl adenylates. The K-AA marks can be removed by deacetylases, such as SIRT1 and SIRT3, employing the same mechanism as that involved in deacetylation. These dynamically regulated K-AAs transduce signals of their respective amino acids. Reversible leucylation on ras-related GTP-binding protein A/B regulates activity of the mammalian target of rapamycin complex 1. Glutaminylation on apoptosis signal-regulating kinase 1 suppresses apoptosis. We discovered non-canonical functions of ARSs and revealed systematic and functional amino acid sensing and signal transduction networks.


Subject(s)
Aminoacylation , Intracellular Space/metabolism , Lysine/metabolism , Signal Transduction , Amino Acyl-tRNA Synthetases/metabolism , Apoptosis , Biocatalysis , HEK293 Cells , Humans , Mechanistic Target of Rapamycin Complex 1/metabolism , Substrate Specificity
15.
J Biol Chem ; 290(43): 26314-27, 2015 Oct 23.
Article in English | MEDLINE | ID: mdl-26324710

ABSTRACT

Nine aminoacyl-tRNA synthetases (aaRSs) and three scaffold proteins form a super multiple aminoacyl-tRNA synthetase complex (MSC) in the human cytoplasm. Domains that have been added progressively to MSC components during evolution are linked by unstructured flexible peptides, producing an elongated and multiarmed MSC structure that is easily attacked by proteases in vivo. A yeast two-hybrid screen for proteins interacting with LeuRS, a representative MSC member, identified calpain 2, a calcium-activated neutral cysteine protease. Calpain 2 and calpain 1 could partially hydrolyze most MSC components to generate specific fragments that resembled those reported previously. The cleavage sites of calpain in ArgRS, GlnRS, and p43 were precisely mapped. After cleavage, their N-terminal regions were removed. Sixty-three amino acid residues were removed from the N terminus of ArgRS to form ArgRSΔN63; GlnRS formed GlnRSΔN198, and p43 formed p43ΔN106. GlnRSΔN198 had a much weaker affinity for its substrates, tRNA(Gln) and glutamine. p43ΔN106 was the same as the previously reported p43-derived apoptosis-released factor. The formation of p43ΔN106 by calpain depended on Ca(2+) and could be specifically inhibited by calpeptin and by RNAi of the regulatory subunit of calpain in vivo. These results showed, for the first time, that calpain plays an essential role in dissociating the MSC and might regulate the canonical and non-canonical functions of certain components of the MSC.


Subject(s)
Amino Acyl-tRNA Synthetases/metabolism , Calpain/metabolism , Amino Acid Sequence , Amino Acyl-tRNA Synthetases/chemistry , Humans , Molecular Sequence Data , Proteolysis , Recombinant Proteins/metabolism , Sequence Homology, Amino Acid