1.
Brief Bioinform ; 22(2): 2126-2140, 2021 03 22.
Article in English | MEDLINE | ID: mdl-32363397

ABSTRACT

Promoters are short consensus sequences of DNA, which are responsible for transcription activation or the repression of all genes. There are many types of promoters in bacteria with important roles in initiating gene transcription. Therefore, solving promoter-identification problems has important implications for improving the understanding of their functions. To this end, computational methods targeting promoter classification have been established; however, their performance remains unsatisfactory. In this study, we present a novel stacked-ensemble approach (termed SELECTOR) for identifying promoters and classifying them by type. SELECTOR combines the composition of k-spaced nucleic acid pairs, parallel correlation pseudo-dinucleotide composition, position-specific trinucleotide propensity based on single-strand, and DNA strand features, and uses five popular tree-based ensemble learning algorithms to build a stacked model. Both 5-fold cross-validation tests using benchmark datasets and independent tests using the newly collected independent test dataset showed that SELECTOR outperformed state-of-the-art methods in both general and specific types of promoter prediction in Escherichia coli. Furthermore, this novel framework provides essential interpretations that aid understanding of model success by leveraging the powerful Shapley Additive exPlanation algorithm, thereby highlighting the most important features relevant for predicting both general and specific types of promoters and overcoming the limitations of existing 'black-box' approaches that are unable to reveal causal relationships from large amounts of initially encoded features.


Subject(s)
Escherichia coli/genetics , Machine Learning , Promoter Regions, Genetic , Datasets as Topic , Genes, Bacterial , Reproducibility of Results
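
The abstract above names the composition of k-spaced nucleic acid pairs as one of SELECTOR's feature encodings but does not define it. The following is a minimal sketch of that encoding as it is commonly defined (the frequency of each of the 16 nucleotide pairs separated by a gap of 0 to k positions), not the authors' implementation. The function name `cksnap` and the default `max_gap=2` are assumptions chosen for illustration.

```python
from itertools import product

NUCLEOTIDES = "ACGT"
# All 16 ordered nucleotide pairs: AA, AC, AG, AT, CA, ...
PAIRS = ["".join(p) for p in product(NUCLEOTIDES, repeat=2)]


def cksnap(seq, max_gap=2):
    """Return k-spaced nucleotide pair frequencies for gaps 0..max_gap.

    Hypothetical helper: the output holds 16 values per gap, in PAIRS order.
    Pairs containing a character other than A, C, G, or T are skipped.
    """
    seq = seq.upper()
    features = []
    for gap in range(max_gap + 1):
        counts = dict.fromkeys(PAIRS, 0)
        total = 0
        # Pair the base at position i with the base gap + 1 positions later.
        for i in range(len(seq) - gap - 1):
            pair = seq[i] + seq[i + gap + 1]
            if pair in counts:
                counts[pair] += 1
                total += 1
        # Normalise counts to frequencies; use 0.0 if no valid pair exists.
        features.extend(counts[p] / total if total else 0.0 for p in PAIRS)
    return features
```

Calling `cksnap("ACGT", max_gap=1)` yields a 32-element vector: 16 frequencies for adjacent pairs followed by 16 for pairs separated by one base. A vector of this kind would be concatenated with the other encodings before being passed to the tree-based learners.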
2.
Toxicol Res (Camb) ; 8(5): 754-766, 2019 Sep 01.
Article in English | MEDLINE | ID: mdl-31588352

ABSTRACT

This study sought novel ionizing radiation-response (IR-response) genes in Caenorhabditis elegans (C. elegans). C. elegans was divided into three groups and exposed to different high doses of IR: 0 gray (Gy), 200 Gy, and 400 Gy. Total RNA was extracted from each group and sequenced. When the transcriptomes were compared among these groups, many genes were shown to be differentially expressed, and these genes were significantly enriched in IR-related biological processes and pathways, including Gene Ontology (GO) terms related to cellular behaviours, cellular growth and purine metabolism and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways related to ATP binding, GTPase regulator activity, and RNA degradation. Quantitative reverse-transcription PCR (qRT-PCR) confirmed that these genes displayed differential expression across the treatments. Further gene network analysis showed a cluster of novel gene families, such as the guanylate cyclase (GCY), Sm-like protein (LSM), diacylglycerol kinase (DGK), skp1-related protein (SKR), and glutathione S-transferase (GST) gene families, which were upregulated. Thus, these genes likely play important roles in IR response. Meanwhile, some important genes that are well known to be involved in key signalling pathways, such as phosphoinositide-specific phospholipase C-3 (PLC-3), phosphatidylinositol 3-kinase age-1 (AGE-1), Raf homolog serine/threonine-protein kinase (LIN-45) and protein cbp-1 (CBP-1), also showed differential expression during IR response, suggesting that IR response might perturb these key signalling pathways. Our study revealed a series of novel IR-response genes in Caenorhabditis elegans that might act as regulators of IR response and represent promising markers of IR exposure.
