Search | BVS Regional Portal (test)

Detection and identification of cis-regulatory elements using change-point and classification algorithms.

Maderazo, Dominic; Flegg, Jennifer A; Algama, Manjula; Ramialison, Mirana; Keith, Jonathan.

BMC Genomics ; 23(1): 78, 2022 Jan 25.

Article in English | MEDLINE | ID: mdl-35078412

ABSTRACT

BACKGROUND: Transcriptional regulation is primarily mediated by the binding of factors to non-coding regions in DNA. Identification of these binding regions enhances understanding of tissue formation and potentially facilitates the development of gene therapies. However, successful identification of binding regions is made difficult by the lack of a universal biological code for their characterisation. RESULTS: We extend an alignment-based method, changept, and identify clusters of biological significance, through ontology and de novo motif analysis. Further, we apply a Bayesian method to estimate and combine binary classifiers on the clusters we identify to produce a better performing composite. CONCLUSIONS: The analysis we describe provides a computational method for identification of conserved binding sites in the human genome and facilitates an alternative interrogation of combinations of existing data sets with alignment data.

Subjects

Algorithms , Nucleic Acid Regulatory Sequences , Bayes Theorem , Binding Sites , Human Genome , Humans , Nucleic Acid Regulatory Sequences/genetics

Genome-wide identification of conserved intronic non-coding sequences using a Bayesian segmentation approach.

Algama, Manjula; Tasker, Edward; Williams, Caitlin; Parslow, Adam C; Bryson-Richardson, Robert J; Keith, Jonathan M.

BMC Genomics ; 18(1): 259, 2017 Mar 27.

Article in English | MEDLINE | ID: mdl-28347272

ABSTRACT

BACKGROUND: Computational identification of non-coding RNAs (ncRNAs) is a challenging problem. We describe a genome-wide analysis using Bayesian segmentation to identify intronic elements highly conserved between three evolutionarily distant vertebrate species: human, mouse and zebrafish. We investigate the extent to which these elements include ncRNAs (or conserved domains of ncRNAs) and regulatory sequences. RESULTS: We identified 655 deeply conserved intronic sequences in a genome-wide analysis. We also performed a pathway-focussed analysis on genes involved in muscle development, detecting 27 intronic elements, of which 22 were not detected in the genome-wide analysis. At least 87% of the genome-wide and 70% of the pathway-focussed elements have existing annotations indicative of conserved RNA secondary structure. The expression of 26 of the pathway-focussed elements was examined using RT-PCR, providing confirmation that they include expressed ncRNAs. Consistent with previous studies, these elements are significantly over-represented in the introns of transcription factors. CONCLUSIONS: This study demonstrates a novel, highly effective, Bayesian approach to identifying conserved non-coding sequences. Our results complement previous findings that these sequences are enriched in transcription factors. However, in contrast to previous studies which suggest the majority of conserved sequences are regulatory factor binding sites, the majority of conserved sequences identified using our approach contain evidence of conserved RNA secondary structures, and our laboratory results suggest most are expressed. Functional roles at DNA and RNA levels are not mutually exclusive, and many of our elements possess evidence of both. Moreover, ncRNAs play roles in transcriptional and post-transcriptional regulation, and this may contribute to the over-representation of these elements in introns of transcription factors. We attribute the higher sensitivity of the pathway-focussed analysis compared to the genome-wide analysis to improved alignment quality, suggesting that enhanced genomic alignments may reveal many more conserved intronic sequences.

Subjects

Genome , Untranslated RNA/metabolism , Animals , Bayes Theorem , Binding Sites , Conserved Sequence , Humans , Introns , Mice , Muscle Development/genetics , Nucleic Acid Conformation , Untranslated RNA/chemistry , Untranslated RNA/genetics , Transcription Factors/genetics , Transcription Factors/metabolism , User-Computer Interface , Zebrafish/genetics

Discovery of putative small non-coding RNAs from the obligate intracellular bacterium Wolbachia pipientis.

Woolfit, Megan; Algama, Manjula; Keith, Jonathan M; McGraw, Elizabeth A; Popovici, Jean.

PLoS One ; 10(3): e0118595, 2015.

Article in English | MEDLINE | ID: mdl-25739023

ABSTRACT

Wolbachia pipientis is an endosymbiotic bacterium that induces a wide range of effects in its insect hosts, including manipulation of reproduction and protection against pathogens. Little is known of the molecular mechanisms underlying the insect-Wolbachia interaction, though it is likely to be mediated via the secretion of proteins or other factors. There is an increasing amount of evidence that bacteria regulate many cellular processes, including secretion of virulence factors, using small non-coding RNAs (sRNAs), but sRNAs have not previously been described from Wolbachia. We have used two independent approaches, one based on comparative genomics and the other using RNA-Seq data generated for gene expression studies, to identify candidate sRNAs in Wolbachia. We experimentally characterized the expression of one of these candidates in four Wolbachia strains, and showed that it is differentially regulated in different host tissues and sexes. Given the roles played by sRNAs in other host-associated bacteria, the conservation of the candidate sRNAs between different Wolbachia strains, and the sex- and tissue-specific differential regulation we have identified, we hypothesise that sRNAs may play a significant role in the biology of Wolbachia, and in particular in its interactions with its host.

Subjects

Intracellular Space/microbiology , Small Untranslated RNA/genetics , Wolbachia/genetics , Wolbachia/physiology , Animals , Computational Biology , Conserved Sequence , Drosophila melanogaster/microbiology , Female , Host Specificity , Male , Organ Specificity , Messenger RNA/genetics , Messenger RNA/metabolism , RNA Sequence Analysis , Genetic Transcription

Investigating genomic structure using changept: A Bayesian segmentation model.

Algama, Manjula; Keith, Jonathan M.

Comput Struct Biotechnol J ; 10(17): 107-15, 2014 Jul.

Article in English | MEDLINE | ID: mdl-25349679

ABSTRACT

Genomes are composed of a wide variety of elements with distinct roles and characteristics. Some of these elements are well-characterised functional components such as protein-coding exons. Other elements play regulatory or structural roles, encode functional non-protein-coding RNAs, or perform some other function yet to be characterised. Still others may have no functional importance, though they may nevertheless be of interest to biologists. One technique for investigating the composition of genomes is to segment sequences into compositionally homogeneous blocks. This technique, known as 'sequence segmentation' or 'change-point analysis', is used to identify patterns of variation across genomes such as GC-rich and GC-poor regions, coding and non-coding regions, slowly evolving and rapidly evolving regions and many other types of variation. In this mini-review we outline many of the genome segmentation methods currently available and then focus on a Bayesian DNA segmentation algorithm, with examples of its various applications.
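
As a rough illustration of what segmenting a sequence into compositionally homogeneous blocks can mean, the sketch below finds a single change-point between a GC-poor and a GC-rich block by maximising a Bernoulli likelihood. This is a deliberately simplified maximum-likelihood stand-in, not the Bayesian changept algorithm the review describes (which handles many change-points and segment classes); the function names and the toy sequence are assumptions made for this example.

```python
import math

def gc_indicator(seq):
    """Map a DNA string to 1 (G or C) / 0 (any other character)."""
    return [1 if base in "GCgc" else 0 for base in seq]

def bernoulli_loglik(ones, n):
    """Maximised Bernoulli log-likelihood of a block with `ones` successes in `n` sites."""
    if n == 0 or ones == 0 or ones == n:
        return 0.0  # degenerate blocks have likelihood 1 at the MLE
    p = ones / n
    return ones * math.log(p) + (n - ones) * math.log(1 - p)

def best_single_change_point(seq):
    """Return (position, log-likelihood gain) for the best split into two blocks.

    Position k means the blocks are seq[:k] and seq[k:].
    Returns (None, 0.0) if no split improves on a single block.
    """
    x = gc_indicator(seq)
    n, total = len(x), sum(x)
    base = bernoulli_loglik(total, n)  # one homogeneous block
    best_pos, best_gain = None, 0.0
    left = 0
    for k in range(1, n):
        left += x[k - 1]
        gain = (bernoulli_loglik(left, k)
                + bernoulli_loglik(total - left, n - k)
                - base)
        if gain > best_gain:
            best_pos, best_gain = k, gain
    return best_pos, best_gain

# Hypothetical toy sequence: 10 AT-only bases followed by 10 GC-only bases.
seq = "ATATATTAAT" + "GCGCGGCCGC"
pos, gain = best_single_change_point(seq)
print(pos, round(gain, 2))  # prints 10 13.86
```

A Bayesian treatment would instead place priors on the number and positions of change-points and sample them, which is what allows uncertainty about segment boundaries to be reported.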

Drosophila 3' UTRs are more complex than protein-coding sequences.

Algama, Manjula; Oldmeadow, Christopher; Tasker, Edward; Mengersen, Kerrie; Keith, Jonathan M.

PLoS One ; 9(5): e97336, 2014.

Article in English | MEDLINE | ID: mdl-24824035

ABSTRACT

The 3' UTRs of eukaryotic genes participate in a variety of post-transcriptional (and some transcriptional) regulatory interactions. Some of these interactions are well characterised, but an undetermined number remain to be discovered. While some regulatory sequences in 3' UTRs may be conserved over long evolutionary time scales, others may have only ephemeral functional significance as regulatory profiles respond to changing selective pressures. Here we propose a sensitive segmentation methodology for investigating patterns of composition and conservation in 3' UTRs based on comparison of closely related species. We describe encodings of pairwise and three-way alignments integrating information about conservation, GC content and transition/transversion ratios and apply the method to three closely related Drosophila species: D. melanogaster, D. simulans and D. yakuba. Incorporating multiple data types greatly increased the number of segment classes identified compared to similar methods based on conservation or GC content alone. We propose that the number of segments and number of types of segment identified by the method can be used as proxies for functional complexity. Our main finding is that the number of segments and segment classes identified in 3' UTRs is greater than in the same length of protein-coding sequence, suggesting greater functional complexity in 3' UTRs. There is thus a need for sustained and extensive efforts by bioinformaticians to delineate functional elements in this important genomic fraction. C code, data and results are available upon request.
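
To make the idea of an alignment encoding concrete, the sketch below turns a pairwise alignment into a string over a small alphabet that records conservation, GC content and transition versus transversion at each column. The abstract does not specify the authors' actual alphabet, so the five symbols used here (`w`, `s`, `i`, `v`, `g`) and the function names are purely hypothetical choices for illustration.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def encode_column(a, b):
    """Encode one pairwise alignment column as a single symbol (hypothetical alphabet).

    'w' = conserved A/T (weak), 's' = conserved G/C (strong),
    'i' = transition, 'v' = transversion, 'g' = gap or ambiguous base.
    """
    a, b = a.upper(), b.upper()
    if a not in "ACGT" or b not in "ACGT":
        return "g"
    if a == b:
        return "s" if a in "GC" else "w"
    pair = {a, b}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "i"  # purine<->purine or pyrimidine<->pyrimidine
    return "v"      # purine<->pyrimidine

def encode_alignment(x, y):
    """Encode two aligned, equal-length sequences column by column."""
    if len(x) != len(y):
        raise ValueError("aligned sequences must have equal length")
    return "".join(encode_column(a, b) for a, b in zip(x, y))

# Hypothetical aligned fragments, not taken from the paper's data.
print(encode_alignment("ACGT-A", "ACAT-C"))  # prints wsiwgv
```

A segmentation method can then be run on the encoded string, so that segments differ in their mix of conserved, GC-rich and substituted columns rather than in a single property.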

Subjects

3' Untranslated Regions/genetics , Drosophila/genetics , Genetic Variation , Genetic Models , Animals , Base Sequence , Computational Biology , Molecular Sequence Data , Open Reading Frames/genetics , Species Specificity
