Search | VHL Regional Portal

The Limitations of Existing Approaches in Improving MicroRNA Target Prediction Accuracy.

Loganantharaj, Rasiah; Randall, Thomas A.

Methods Mol Biol ; 1617: 133-158, 2017.

Article in English | MEDLINE | ID: mdl-28540682

ABSTRACT

MicroRNAs (miRNAs) are small (18-24 nt) endogenous RNAs, found across diverse phyla, that are involved in posttranscriptional regulation, primarily downregulation of mRNAs. Experimentally determining miRNA-mRNA interactions can be expensive and time-consuming, making the accurate computational prediction of miRNA targets a high priority. Since miRNA-mRNA base pairing in mammals is not perfectly complementary and only a fraction of the identified motifs are real binding sites, accurately predicting miRNA targets remains challenging. The limitations and bottlenecks of existing algorithms and approaches are discussed in this chapter. A new miRNA-mRNA interaction algorithm was implemented in Python (TargetFind) to capture three different modes of association and to maximize detection sensitivity to around 95% for mouse (mm9) and human (hg19) reference data. For human (hg19) data, the prediction accuracy with any one feature among evolutionarily conserved score, multiple targets in a UTR, or changes in free energy varied within a close range from 63.5% to 66%. When the results of these features are combined with majority voting, the expected prediction accuracy increases to 69.5%. When all three features are used together, the average best prediction accuracies with tenfold cross validation from the classifiers naïve Bayes, support vector machine, artificial neural network, and decision tree were, respectively, 66.5%, 67.1%, 69%, and 68.4%. The results reveal the advantages and limitations of these approaches. When comparing different sets of features on their strength in predicting true hg19 targets, evolutionarily conserved score slightly outperformed all other features based on thermostability and target multiplicity. The sophisticated supervised learning algorithms did not improve the prediction accuracy significantly compared to a simple threshold-based approach on conservation score or to combining the results of each feature by majority agreement.
The targets from randomly generated UTRs behaved similarly to those of noninteracting pairs with respect to changes in free energy. Availability of additional experimental data describing noninteracting pairs will advance our understanding of the characteristics and the factors positively and negatively influencing these interactions.
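
The majority-voting combination described in this abstract can be illustrated with a minimal Python sketch. The cutoff values, function names, and parameter names below are hypothetical and chosen only for illustration; they are not taken from TargetFind or from the chapter.

```python
from typing import Sequence


def majority_vote(votes: Sequence[bool]) -> bool:
    """Return True when more than half of the per-feature votes are True."""
    return sum(votes) > len(votes) / 2


def predict_target(
    conservation: float,
    site_count: int,
    delta_g: float,
    cons_cutoff: float = 0.5,   # hypothetical conservation-score threshold
    count_cutoff: int = 2,      # hypothetical minimum number of sites in the UTR
    dg_cutoff: float = -15.0,   # hypothetical free-energy threshold (kcal/mol)
) -> bool:
    """Threshold each feature independently, then combine by majority vote."""
    votes = [
        conservation >= cons_cutoff,  # conserved enough
        site_count >= count_cutoff,   # multiple target sites in the UTR
        delta_g <= dg_cutoff,         # sufficiently favorable change in free energy
    ]
    return majority_vote(votes)
```

With these illustrative cutoffs, a candidate that passes the conservation and free-energy thresholds but has a single site is still called a target, because two of the three votes agree.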

Subject(s)

Gene Expression Regulation , Genomics/methods , MicroRNAs/genetics , RNA, Messenger/genetics , Algorithms , Animals , Evolution, Molecular , Humans , Machine Learning , MicroRNAs/chemistry , RNA Stability , RNA, Messenger/chemistry , Software , Thermodynamics , Untranslated Regions

The effects of shared information on semantic calculations in the gene ontology.

Bible, Paul W; Sun, Hong-Wei; Morasso, Maria I; Loganantharaj, Rasiah; Wei, Lai.

Comput Struct Biotechnol J ; 15: 195-211, 2017.

Article in English | MEDLINE | ID: mdl-28217262

ABSTRACT

The structured vocabulary that describes gene function, the gene ontology (GO), serves as a powerful tool in biological research. One application of GO in computational biology calculates semantic similarity between two concepts to make inferences about the functional similarity of genes. A class of term similarity algorithms explicitly calculates the shared information (SI) between concepts then substitutes this calculation into traditional term similarity measures such as Resnik, Lin, and Jiang-Conrath. Alternative SI approaches, when combined with ontology choice and term similarity type, lead to many gene-to-gene similarity measures. No thorough investigation has been made into the behavior, complexity, and performance of semantic methods derived from distinct SI approaches. We apply bootstrapping to compare the generalized performance of 57 gene-to-gene semantic measures across six benchmarks. Considering the number of measures, we additionally evaluate whether these methods can be leveraged through ensemble machine learning to improve prediction performance. Results showed that the choice of ontology type most strongly influenced performance across all evaluations. Combining measures into an ensemble classifier reduces cross-validation error beyond any individual measure for protein interaction prediction. This improvement resulted from information gained through the combination of ontology types as ensemble methods within each GO type offered no improvement. These results demonstrate that multiple SI measures can be leveraged for machine learning tasks such as automated gene function prediction by incorporating methods from across the ontologies. To facilitate future research in this area, we developed the GO Graph Tool Kit (GGTK), an open source C++ library with Python interface (github.com/paulbible/ggtk).
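
As a rough illustration of how a shared information (SI) value is substituted into the traditional term similarity measures named above, the following Python sketch uses the standard textbook forms of Resnik, Lin, and Jiang-Conrath. It does not use GGTK; the function names are assumptions, and the SI value is simply passed in so that any SI approach could supply it.

```python
import math


def information_content(probability: float) -> float:
    """Information content of a term from its annotation probability."""
    return -math.log(probability)


def resnik(shared_information: float) -> float:
    """Resnik similarity is the shared information itself."""
    return shared_information


def lin(ic_a: float, ic_b: float, shared_information: float) -> float:
    """Lin similarity: shared information normalized by the terms' own IC."""
    denominator = ic_a + ic_b
    if denominator == 0:
        return 0.0  # arbitrary convention for this sketch (both terms have zero IC)
    return 2 * shared_information / denominator


def jiang_conrath_distance(ic_a: float, ic_b: float, shared_information: float) -> float:
    """Jiang-Conrath distance: IC not accounted for by the shared information."""
    return ic_a + ic_b - 2 * shared_information
```

Because the SI value is an argument rather than being computed inside each measure, swapping one SI approach for another changes all three measures at once, which is the combinatorial effect the abstract describes.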

PAPST, a User Friendly and Powerful Java Platform for ChIP-Seq Peak Co-Localization Analysis and Beyond.

Bible, Paul W; Kanno, Yuka; Wei, Lai; Brooks, Stephen R; O'Shea, John J; Morasso, Maria I; Loganantharaj, Rasiah; Sun, Hong-Wei.

PLoS One ; 10(5): e0127285, 2015.

Article in English | MEDLINE | ID: mdl-25970601

ABSTRACT

Comparative co-localization analysis of transcription factors (TFs) and epigenetic marks (EMs) in specific biological contexts is one of the most critical areas of ChIP-Seq data analysis beyond peak calling. Yet there is a significant lack of user-friendly and powerful tools geared towards exploratory research based on co-localization analysis. Most tools currently used for co-localization analysis are command line only and require extensive installation procedures and Linux expertise. Online tools partially address the usability issues of command line tools, but slow response times and few customization features make them unsuitable for rapid data-driven interactive exploratory research. We have developed PAPST: Peak Assignment and Profile Search Tool, a user-friendly yet powerful platform with a unique design, which integrates both gene-centric and peak-centric co-localization analysis into a single package. Most of PAPST's functions can be completed in less than five seconds, allowing quick cycles of data-driven hypothesis generation and testing. With PAPST, a researcher with or without computational expertise can perform sophisticated co-localization pattern analysis of multiple TFs and EMs, either against all known genes or a set of genomic regions obtained from public repositories or prior analysis. PAPST is a versatile, efficient, and customizable tool for genome-wide data-driven exploratory research. Creatively used, PAPST can be quickly applied to any genomic data analysis that involves a comparison of two or more sets of genomic coordinate intervals, making it a powerful tool for a wide range of exploratory genomic research. We first present PAPST's general-purpose features, then apply it to several public ChIP-Seq data sets to demonstrate its rapid execution and potential for cutting-edge research with a case study in enhancer analysis.
To our knowledge, PAPST is the first software of its kind to provide efficient and sophisticated post peak-calling ChIP-Seq data analysis as an easy-to-use interactive application. PAPST is available at https://github.com/paulbible/papst and is a public domain work.
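
The core operation mentioned above, comparing two sets of genomic coordinate intervals, can be sketched in Python as follows. This is a naive quadratic-time illustration, not PAPST's algorithm; the half-open coordinate convention, the type alias, and the function names are assumptions made for the example.

```python
from typing import List, Tuple

# (chromosome, start, end) with half-open coordinates: start <= position < end
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True when two intervals are on the same chromosome and share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def count_overlapping(query: List[Interval], reference: List[Interval]) -> int:
    """Count query intervals that overlap at least one reference interval."""
    return sum(any(overlaps(q, r) for r in reference) for q in query)
```

A production tool would typically sort the intervals or index them per chromosome to avoid comparing every pair; the nested scan here is only meant to make the comparison explicit.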

Subject(s)

Sequence Analysis, DNA , Software , Animals , Binding Sites , Chromatin Immunoprecipitation , Enhancer Elements, Genetic , Mice , Molecular Sequence Annotation , Mouse Embryonic Stem Cells/physiology , Programming Languages , Transcription Factors/physiology

PAVIS: a tool for Peak Annotation and Visualization.

Huang, Weichun; Loganantharaj, Rasiah; Schroeder, Bryce; Fargo, David; Li, Leping.

Bioinformatics ; 29(23): 3097-9, 2013 Dec 01.

Article in English | MEDLINE | ID: mdl-24008416

ABSTRACT

We introduce a web-based tool, Peak Annotation and Visualization (PAVIS), for annotating and visualizing ChIP-seq peak data. PAVIS is designed with non-bioinformaticians in mind and presents a straightforward user interface to facilitate biological interpretation of ChIP-seq peak or other genomic enrichment data. PAVIS, through association with annotation, provides relevant genomic context for each peak, such as peak location relative to genomic features including transcription start site, intron, exon or 5'/3'-untranslated region. PAVIS reports the relative enrichment P-values of peaks in these functionally distinct categories, and provides a summary plot of the relative proportion of peaks in each category. PAVIS, unlike many other resources, provides a peak-oriented annotation and visualization system, allowing simultaneous dynamic visualization of tens to hundreds of loci from one or more ChIP-seq experiments. PAVIS enables rapid and easy examination and cross-comparison of the genomic context and potential functions of the underlying genomic elements, thus supporting downstream hypothesis generation.

Subject(s)

Chromatin Immunoprecipitation , Genomics , High-Throughput Nucleotide Sequencing/methods , Oligonucleotide Array Sequence Analysis/methods , Sequence Analysis, DNA/methods , Software , Chromatin Assembly and Disassembly , Gene Expression Regulation , Humans , Internet

Exploration and Exploitation of Data in Bioinformatics.

Loganantharaj, Rasiah.

Int J Bioinform Res Appl ; 8(1-2): 1-3, 2012.

Article in English | MEDLINE | ID: mdl-22586749

Subject(s)

Computational Biology/methods , Databases, Factual , Computational Biology/trends

A new algorithm for quantifying binding site pattern similarity with applications for Next Generation Sequencing.

Bible, Paul W; Loganantharaj, Rasiah.

Int J Bioinform Res Appl ; 8(1-2): 4-17, 2012.

Article in English | MEDLINE | ID: mdl-22450267

ABSTRACT

New sources of regulatory data, such as transcription factor ChIP-seq experiments, can yield important insights into biological function through downstream analysis of motifs. Position Frequency Matrices (PFMs) are a standard format for representing transcription factor binding patterns. Comparison measures between these binding patterns are necessary to allow more sophisticated detection and classification of regulatory sequences. In this work we have developed a novel algorithm for gapped alignment of PFMs called PfmSim. We compare our measure with a standard measure, Sandelin and Wasserman, on similarity and classification tasks. Our measure gives better similarity values as evaluated by multiple tests.
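
To make the idea of a comparison measure between PFMs concrete, here is a minimal Python sketch of an ungapped, column-by-column similarity for two matrices of equal width. It is neither PfmSim nor the Sandelin and Wasserman measure; the scoring rule (one minus half the sum of squared frequency differences) and all names are assumptions chosen for simplicity.

```python
from typing import List

# One inner list per motif position, with counts ordered A, C, G, T
Matrix = List[List[float]]


def to_frequencies(pfm: Matrix) -> Matrix:
    """Convert per-position counts to per-position base frequencies."""
    frequencies = []
    for column in pfm:
        total = sum(column)
        frequencies.append([count / total for count in column])
    return frequencies


def column_similarity(p: List[float], q: List[float]) -> float:
    """Score two frequency columns in [0, 1]; 1 means identical."""
    squared_difference = sum((x - y) ** 2 for x, y in zip(p, q))
    return 1.0 - squared_difference / 2.0


def pfm_similarity(a: Matrix, b: Matrix) -> float:
    """Average column similarity of two equal-width PFMs (no gaps, no shifts)."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares PFMs of equal width")
    freq_a, freq_b = to_frequencies(a), to_frequencies(b)
    return sum(column_similarity(p, q) for p, q in zip(freq_a, freq_b)) / len(freq_a)
```

A gapped alignment, as developed in the paper, would additionally allow columns to be skipped or shifted so that motifs of different widths can be compared.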

Subject(s)

Algorithms , Sequence Analysis, DNA , Base Sequence , Binding Sites , Position-Specific Scoring Matrices , Protein Structure, Tertiary , Sequence Alignment

Identification of the B-Raf/Mek/Erk MAP kinase pathway as a target for all-trans retinoic acid during skin cancer promotion.

Cheepala, Satish B; Yin, Weihong; Syed, Zanobia; Gill, Jennifer N; McMillian, Alaina; Kleiner, Heather E; Lynch, Mark; Loganantharaj, Rasiah; Trutschl, Marjan; Cvek, Urska; Clifford, John L.

Mol Cancer ; 8: 27, 2009 May 11.

Article in English | MEDLINE | ID: mdl-19432991

ABSTRACT

BACKGROUND: Retinoids have been studied extensively for their potential as therapeutic and chemopreventive agents for a variety of cancers, including nonmelanoma skin cancer (NMSC). Despite their use for many years, the mechanism of action of retinoids in the prevention of NMSC is still unclear. In this study we have attempted to understand the chemopreventive mechanism of all-trans retinoic acid (ATRA), a primary biologically active retinoid, in order to more efficiently utilize retinoids in the clinic. RESULTS: We have used the 2-stage dimethylbenzanthracene (DMBA)/12-O-tetradecanoylphorbol-13-acetate (TPA) mouse skin carcinogenesis model to investigate the chemopreventive effects of ATRA. We have compared the gene expression profiles of control skin to skin subjected to the 2-stage protocol, with or without ATRA, using Affymetrix 430 2.0 DNA microarrays. Approximately 49% of the genes showing altered expression with TPA treatment are conversely affected when ATRA is co-administered. The activity of these genes, which we refer to as 'counter-regulated', may contribute to chemoprevention by ATRA. The counter-regulated genes have been clustered into functional categories and bioinformatic analysis has identified the B-Raf/Mek/Erk branch of the MAP kinase pathway as one containing several genes whose upregulation by TPA is blocked by ATRA. We also show that ATRA blocks signaling through this pathway, as revealed by immunohistochemistry and Western blotting. Finally, we found that blocking the B-Raf/Mek/Erk pathway with a pharmacological inhibitor, Sorafenib (BAY43-9006), induces squamous differentiation of existing skin SCCs formed in the 2-stage model. CONCLUSION: These results indicate that ATRA targets the B-Raf/Mek/Erk signaling pathway in the 2-stage mouse skin carcinogenesis model and this activity coincides with its chemopreventive action. 
This demonstrates the potential for targeting the B-Raf/Mek/Erk pathway for chemoprevention and therapy of skin SCC in humans. In addition our DNA microarray results provide the first expression signature for the chemopreventive effect of ATRA in a mouse skin cancer model. This is a potential source for novel targets for ATRA and other chemopreventive and therapeutic agents that can eventually be tested in the clinic.
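
The notion of 'counter-regulated' genes used in this abstract can be expressed as a simple filter over expression changes. The Python sketch below assumes log2 fold changes are already available for two comparisons (TPA versus control, and TPA plus ATRA versus TPA alone); the cutoff, the function name, and the input layout are hypothetical and do not reproduce the study's actual analysis.

```python
from typing import Dict, List


def counter_regulated(
    tpa_vs_control: Dict[str, float],
    atra_vs_tpa: Dict[str, float],
    cutoff: float = 1.0,  # hypothetical minimum absolute log2 fold change
) -> List[str]:
    """Return genes altered by TPA whose change is reversed when ATRA is added."""
    hits = []
    for gene, change_tpa in tpa_vs_control.items():
        change_atra = atra_vs_tpa.get(gene)
        if change_atra is None:
            continue  # gene not measured in the second comparison
        altered = abs(change_tpa) >= cutoff
        reversed_by_atra = abs(change_atra) >= cutoff and change_tpa * change_atra < 0
        if altered and reversed_by_atra:
            hits.append(gene)
    return sorted(hits)
```

Dividing the number of genes returned by the number of genes altered by TPA gives the kind of proportion reported in the abstract.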

Subject(s)

Antineoplastic Agents/pharmacology , Extracellular Signal-Regulated MAP Kinases/drug effects , MAP Kinase Kinase Kinases/drug effects , Proto-Oncogene Proteins B-raf/drug effects , Skin Neoplasms/prevention & control , Tretinoin/pharmacology , 9,10-Dimethyl-1,2-benzanthracene/toxicity , Animals , Blotting, Western , Carcinogens/toxicity , Cell Transformation, Neoplastic/drug effects , Cell Transformation, Neoplastic/genetics , Extracellular Signal-Regulated MAP Kinases/metabolism , Female , Gene Expression/drug effects , Immunohistochemistry , MAP Kinase Kinase Kinases/metabolism , Mice , Oligonucleotide Array Sequence Analysis , Proto-Oncogene Proteins B-raf/metabolism , Signal Transduction/drug effects , Signal Transduction/physiology , Skin Neoplasms/enzymology , Skin Neoplasms/genetics , Tetradecanoylphorbol Acetate/toxicity
