Results 1 - 8 of 8
1.
Article in English | MEDLINE | ID: mdl-38709606

ABSTRACT

RNA-protein interactions (RPIs) play an important role in several fundamental cellular physiological processes, including cell motility, chromosome replication, transcription and translation, and signaling. Predicting RPIs can guide the exploration of cellular biological functions, disease intervention, and drug design. This study therefore proposes the RPI-gated graph convolutional network (RPI-GGCN) method, which predicts RPIs based on a gated graph convolutional neural network (GGCN) and a co-regularized variational autoencoder (Co-VAE). First, different types of feature information are extracted from RNA and protein sequences by nine feature extraction methods. Second, Co-VAEs are used to eliminate the redundancy of the fused features and generate optimal features. Finally, gated cyclic units are introduced into graph convolutional networks (GCNs) to construct a model for RPI prediction, which efficiently extracts topological information and improves the model's interpretable feature learning and expression capabilities. In the fivefold cross-validation test, the RPI-GGCN method achieved prediction accuracies of 97.27%, 97.32%, 96.54%, 95.76%, and 94.98% on the RPI369, RPI488, RPI1446, RPI1807, and RPI2241 datasets, respectively. To test the generalization performance of the model, we used the model trained on RPI369 to predict the independent NPInter v3.0 dataset and achieved excellent performance in all six independent validation sets. By visualizing the RPI network graph based on the prediction results, we aim to provide a new perspective and reference for studying RPI mechanisms and exploring new RPIs. Extensive experimental results demonstrate that RPI-GGCN is an efficient, accurate, and stable RPI prediction method.
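
The fivefold cross-validation protocol mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `k_fold_indices`, the shuffling seed, and the sample count in the usage note are assumptions made for illustration only. Each sample lands in the test fold exactly once, and the model would be trained on the other four folds.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Hypothetical helper: the abstract only states that fivefold
    cross-validation was used, not how the folds were drawn.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducible folds
    # Deal the shuffled indices round-robin into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

For example, `list(k_fold_indices(10))` gives five splits, each holding out two samples; the reported accuracy in such a protocol is typically the mean over the five held-out folds.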

2.
Comput Biol Med ; 170: 107944, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38215617

ABSTRACT

The prediction of multi-label protein subcellular localization (SCL) is a pivotal area in bioinformatics research. Recent advancements in protein structure research have facilitated the application of graph neural networks. This paper introduces a novel approach termed ML-FGAT. The approach begins by extracting node information of proteins from sequence data, physical-chemical properties, evolutionary insights, and structural details. Subsequently, various evolutionary techniques are integrated to consolidate multi-view information. A linear discriminant analysis framework, grounded on entropy weight, is then employed to reduce the dimensionality of the merged features. To enhance the robustness of the model, the training dataset is augmented using feature-generative adversarial networks. For the primary prediction step, graph attention networks are employed to determine multi-label protein SCL, leveraging both node and neighboring information. The interpretability is enhanced by analyzing the attention weight parameters. The training is based on the Gram-positive bacteria dataset, while validation employs newly constructed datasets: human, virus, Gram-negative bacteria, plant, and SARS-CoV-2. Following a leave-one-out cross-validation procedure, ML-FGAT demonstrates noteworthy superiority in this domain.


Subject(s)
Computational Biology , Neural Networks, Computer , Humans , Discriminant Analysis , Entropy , Physical Examination
3.
Brief Bioinform ; 24(4), 2023 Jul 20.
Article in English | MEDLINE | ID: mdl-37328639

ABSTRACT

Precise targeting of transcription factor binding sites (TFBSs) is essential to comprehending transcriptional regulatory processes and investigating cellular function. Although several deep learning algorithms have been created to predict TFBSs, the models' intrinsic mechanisms and prediction results are difficult to explain, and there is still room for improvement in prediction performance. We present DeepSTF, a unique deep-learning architecture for predicting TFBSs by integrating DNA sequence and shape profiles. This is the first use of the improved transformer encoder structure for TFBS prediction. DeepSTF extracts higher-order DNA sequence features using stacked convolutional neural networks (CNNs), whereas rich DNA shape profiles are extracted by combining the improved transformer encoder structure with bidirectional long short-term memory (Bi-LSTM). Finally, the derived higher-order sequence features and representative shape profiles are integrated along the channel dimension to achieve accurate TFBS prediction. Experiments on 165 ENCODE chromatin immunoprecipitation sequencing (ChIP-seq) datasets show that DeepSTF considerably outperforms several state-of-the-art algorithms in predicting TFBSs, and we explain the usefulness of the transformer encoder structure and of the combined strategy using sequence features and shape profiles in capturing multiple dependencies and learning essential features. In addition, this paper examines the significance of DNA shape features in predicting TFBSs. The source code of DeepSTF is available at https://github.com/YuBinLab-QUST/DeepSTF/.


Subject(s)
DNA , Neural Networks, Computer , Binding Sites , Protein Binding , DNA/genetics , DNA/chemistry , Transcription Factors/genetics , Transcription Factors/chemistry
4.
Brief Bioinform ; 24(2), 2023 Mar 19.
Article in English | MEDLINE | ID: mdl-36748992

ABSTRACT

Interactions between DNA and transcription factors (TFs) play an essential role in understanding transcriptional regulation mechanisms and gene expression. Owing to the large accumulation of training data and their low cost, deep learning methods have shown great potential in determining the specificity of TF-DNA interactions. Convolutional network-based and self-attention network-based methods have been proposed for transcription factor binding site (TFBS) prediction. Convolutional operations extract local features efficiently but tend to ignore global information, while self-attention mechanisms capture long-distance dependencies well but struggle to attend to local feature details. To discover features for a given sequence as comprehensively as possible, we propose a Dual-branch model combining Self-Attention and Convolution, dubbed DSAC, which fuses local features and global representations in an interactive way. In terms of features, convolution and self-attention contribute to feature extraction collaboratively, enhancing representation learning. In terms of structure, a lightweight but efficient network architecture is designed for the prediction; in particular, the dual-branch structure allows both the convolution and the self-attention mechanism to be fully utilized to improve the predictive ability of the model. Experimental results on 165 ChIP-seq datasets show that DSAC clearly outperforms five other deep learning-based methods and demonstrate that the model can effectively predict TFBSs based on sequence features alone. The source code of DSAC is available at https://github.com/YuBinLab-QUST/DSAC/.
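
The local-versus-global contrast described in this abstract can be sketched in plain Python. This is an illustration of the general idea only, not the DSAC architecture: the one-hot encoding, the single hand-written motif kernel, the parameter-free attention, and the pooling and concatenation step are all assumptions, and every function name is hypothetical. The authors' actual implementation is in the repository linked in the abstract.

```python
import math

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of 4-dimensional one-hot vectors."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq]


def conv_branch(x, kernel):
    """Local branch: slide a (width x 4) kernel along the sequence, then ReLU."""
    width = len(kernel)
    out = []
    for start in range(len(x) - width + 1):
        score = sum(
            x[start + i][c] * kernel[i][c]
            for i in range(width)
            for c in range(4)
        )
        out.append(max(score, 0.0))
    return out


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention_branch(x):
    """Global branch: scaled dot-product self-attention with no learned weights."""
    dim = len(x[0])
    out = []
    for query in x:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim) for key in x
        ]
        weights = softmax(scores)
        # Each output position mixes information from every input position.
        out.append(
            [sum(w * value[c] for w, value in zip(weights, x)) for c in range(dim)]
        )
    return out


def dual_branch_features(seq, kernel):
    """Concatenate a max-pooled local feature with mean-pooled global features."""
    x = one_hot(seq)
    local = conv_branch(x, kernel)
    attended = attention_branch(x)
    pooled_global = [
        sum(row[c] for row in attended) / len(attended) for c in range(4)
    ]
    return [max(local)] + pooled_global
```

Calling `dual_branch_features("TTACGTT", one_hot("ACG"))` uses the one-hot encoding of `ACG` as a motif detector for the convolution branch, while the attention branch summarizes base composition across the whole sequence. A trained model would learn many kernels and attention projections rather than using fixed ones.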


Subject(s)
DNA , Neural Networks, Computer , Protein Binding , Binding Sites , Transcription Factors/genetics
5.
J Agric Food Chem ; 67(10): 2946-2953, 2019 Mar 13.
Article in English | MEDLINE | ID: mdl-30807132

ABSTRACT

Phenylglyoxylic acid (PGA) is a key building block widely used to synthesize pharmaceutical intermediates and food additives. However, the existing synthetic methods for PGA generally involve toxic cyanide and complex processes. To explore an alternative method for PGA biosynthesis, we envisaged cascade biocatalysis for the one-pot synthesis of PGA from racemic mandelic acid. A novel mandelate racemase named ArMR, showing a higher expression level (216.9 U·mL⁻¹ fermentation liquor), was cloned from Agrobacterium radiobacter and identified, and six recombinant Escherichia coli strains were engineered to coexpress three enzymes (mandelate racemase, d-mandelate dehydrogenase, and l-lactate dehydrogenase) and transform racemic mandelic acid to PGA. Among them, the recombinant E. coli TCD 04, engineered to coexpress ArMR, LhDMDH, and LhLDH, can transform racemic mandelic acid (100 mM) to PGA with 98% conversion. Taken together, we provide a green approach for the one-pot biosynthesis of PGA from racemic mandelic acid.


Subject(s)
Escherichia coli/metabolism , Glyoxylates/metabolism , Mandelic Acids/metabolism , Agrobacterium tumefaciens/enzymology , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Escherichia coli/genetics , Kinetics , L-Lactate Dehydrogenase/genetics , L-Lactate Dehydrogenase/metabolism , Lactobacillus helveticus/enzymology , Lactobacillus helveticus/genetics , Mandelic Acids/chemistry , Metabolic Engineering , Racemases and Epimerases/genetics , Racemases and Epimerases/metabolism
6.
J Agric Food Chem ; 66(11): 2805-2811, 2018 Mar 21.
Article in English | MEDLINE | ID: mdl-29460618

ABSTRACT

d-Mandelate dehydrogenase (DMDH) has the potential to convert d-mandelic acid to phenylglyoxylic acid (PGA), which is a key building block in the field of chemical synthesis and is widely used to synthesize pharmaceutical intermediates or food additives. A novel NAD⁺-dependent d-mandelate dehydrogenase was cloned from Lactobacillus harbinensis (LhDMDH) by genome mining and expressed in Escherichia coli BL21. After purification to homogeneity, LhDMDH showed an oxidation activity toward d-mandelic acid of approximately 1200 U·mg⁻¹, close to four times the activity of the probe. Meanwhile, the kcat/Km value of LhDMDH was 28.80 s⁻¹·mM⁻¹, which was distinctly higher than that of the probe. By coculturing two E. coli strains expressing LhDMDH and LcLDH, we developed a system for the efficient synthesis of PGA, achieving a 60% theoretical yield and 99% purity without adding coenzyme or cosubstrate. Our data support the implementation of a promising strategy for the chiral resolution of racemic mandelic acid and the biosynthesis of PGA.


Subject(s)
Alcohol Oxidoreductases/metabolism , Bacterial Proteins/metabolism , Glyoxylates/metabolism , Lactobacillus/enzymology , Mandelic Acids/metabolism , Alcohol Oxidoreductases/chemistry , Alcohol Oxidoreductases/genetics , Bacterial Proteins/chemistry , Bacterial Proteins/genetics , Biocatalysis , Kinetics , Lactobacillus/chemistry , Lactobacillus/genetics
7.
Chin Med J (Engl) ; 126(21): 4066-71, 2013 Nov.
Article in English | MEDLINE | ID: mdl-24229675

ABSTRACT

BACKGROUND: Migration has become one of the risk factors for the high burden of tuberculosis in China. This study explored the influence of mass migration on the dynamics of Mycobacterium (M.) tuberculosis in Beijing, the capital and an urban area of China. METHODS: Three hundred and thirty-six M. tuberculosis strains from the Changping district, where the problem of urban migrants was more pronounced than in other Beijing regions, were genotyped by spoligotyping, large sequence polymorphisms (LSPs 105 and 181), and variable number tandem repeat (VNTR) typing. Based on the genotype data, the phylogeny of the isolates was studied. RESULTS: In Changping district, the proportion of Beijing lineage M. tuberculosis isolates amounted to 89.0% (299/336), among which 86.6% (252) belonged to the modern lineage. The frequency of modern Beijing lineage strains is so high (around 75% (252/336)) that associated risk factors affecting the tuberculosis epidemic cannot be determined. The time to the most recent common ancestor (TMRCA) of the Beijing lineage strains was estimated to be 5073 (95% CI: 4000-6200) years. There was no significant difference in the genetic variation of Beijing isolates from urban migrants and local residents. CONCLUSIONS: The clone of modern Beijing lineage M. tuberculosis, which is dominant in the Beijing area, most likely started to expand with the five-thousand-year-old Chinese civilization. In the future, with urbanization across the whole of China, modern Beijing lineage M. tuberculosis may spread over a larger geographical area.


Subject(s)
Mycobacterium tuberculosis/genetics , China , Genetics, Population , Genotype , Humans , Mycobacterium tuberculosis/classification , Phylogeny , Transients and Migrants
8.
Zhonghua Liu Xing Bing Xue Za Zhi ; 34(4): 374-8, 2013 Apr.
Article in Chinese | MEDLINE | ID: mdl-23937844

ABSTRACT

OBJECTIVE: To use molecular genetic methods to explore the origin, phylogeny, and gene flow of Mycobacterium tuberculosis (MTB) Beijing lineage in five provinces of northern China: Heilongjiang, Jilin, Liaoning, Neimenggu and Ningxia. METHODS: 234 MTB Beijing lineage strains were genotyped at 24 variable number tandem repeat (VNTR) loci, and the allelic diversity (h) of each VNTR locus was calculated. At the individual level, a Neighbor-Joining (N-J) tree and a minimum spanning tree (MST) were constructed. A phylogenetic tree was built at the population level, and the time to the most recent common ancestor (TMRCA) was estimated with a Bayesian model. Analysis of molecular variance (AMOVA) was used to assess gene flow among strains from the five provinces. RESULTS: Allelic diversities of the 24 VNTR loci were low (h: 0.000 - 0.744). The 234 MTB Beijing lineage strains were dispersed across individual branches of the N-J tree, with 62.0% (145) of them grouped into the same clonal complex in the MST. At the population level, the 234 strains were evolutionarily closest to the Beijing lineage from the MIRU-VNTRplus database, with a bootstrap value of 100. The TMRCA was 5308 (95% CI: 4263 - 6470) years. Differences in pairwise Fst values obtained by AMOVA between Jilin and Heilongjiang, Liaoning, Neimenggu and Ningxia were not statistically significant (P > 0.05). CONCLUSION: The genetic similarity of Beijing lineage MTB from the five provinces of northern China was high. The phylogenetic branches showed no province-specific distribution. It is speculated that these strains evolved from a single clone of MTB Beijing lineage about 5000 years ago. Gene flow was taking place between neighboring regions.
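
The per-locus allelic diversity h reported in this abstract can be illustrated with a short sketch. The abstract does not state which estimator was used; the version below assumes the common form h = 1 - Σ xᵢ², where xᵢ is the frequency of allele i at the locus, without a small-sample correction. The function name `allelic_diversity` and the example repeat counts are hypothetical.

```python
from collections import Counter


def allelic_diversity(alleles):
    """Return h = 1 - sum(x_i ** 2) for the alleles observed at one locus.

    Assumed estimator: the abstract reports h values but not the formula.
    `alleles` holds one repeat count per strain for a single VNTR locus.
    """
    n = len(alleles)
    if n == 0:
        raise ValueError("at least one allele is required")
    counts = Counter(alleles)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())
```

A locus at which every strain carries the same repeat count gives h = 0, matching the lower end of the reported range, while a locus with many equally frequent alleles approaches 1.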


Subject(s)
DNA, Bacterial/isolation & purification , Mycobacterium tuberculosis/genetics , Phylogeny , Alleles , China/epidemiology , Genes, Bacterial , Genetic Variation , Genotype , Mycobacterium tuberculosis/isolation & purification