Results 1 - 14 of 14
1.
BMC Genomics ; 18(1): 216, 2017 Feb 28.
Article in English | MEDLINE | ID: mdl-28245801

ABSTRACT

BACKGROUND: While NGS allows rapid global detection of transcripts, it remains difficult to distinguish ncRNAs from short mRNAs. To detect potentially translated RNAs, we developed an improved protocol for bacterial ribosomal footprinting (RIBOseq). This allowed us to distinguish ncRNAs from mRNAs in EHEC. A high ratio of ribosomal footprints per transcript (ribosomal coverage value, RCV) is expected to indicate a translated RNA, while a low RCV should point to a non-translated RNA. RESULTS: Based on their low RCV, 150 novel non-translated EHEC transcripts were identified as putative ncRNAs, representing both antisense and intergenic transcripts, 74 of which had expressed homologs in E. coli MG1655. Bioinformatics analysis predicted statistically significant target regulons for 15 of the intergenic transcripts; experimental analysis revealed 4-fold or higher differential expression of 46 novel ncRNAs in different growth media. Out of 329 annotated EHEC ncRNAs, 52 showed an RCV similar to protein-coding genes; of those, 16 had RIBOseq patterns matching annotated genes in other Enterobacteriaceae, and 11 seem to possess a Shine-Dalgarno sequence, suggesting that such ncRNAs may encode small proteins instead of being solely non-coding. To support that the RIBOseq signals reflect translation, we tested the ribosomal-footprint-covered ORF of ryhB and found a phenotype for the encoded peptide under iron-limiting conditions. CONCLUSION: Determination of the RCV is a useful approach for a rapid first-step differentiation between bacterial ncRNAs and small mRNAs. Further, many known ncRNAs may encode proteins as well.
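The RCV described in this abstract is a ratio of footprint signal to transcript signal, which can be sketched as below. This is only an illustration under stated assumptions, not the authors' pipeline: the function names, the example read counts and the cutoff of 0.3 are hypothetical, and a real analysis would presumably normalize both counts for library size and transcript length first.

```python
def ribosomal_coverage_value(footprint_reads, transcript_reads):
    """Ratio of RIBOseq footprint signal to RNA-seq signal for one transcript.

    Both arguments are assumed to be already normalized read counts.
    """
    if transcript_reads <= 0:
        raise ValueError("transcript_reads must be positive")
    return footprint_reads / transcript_reads


def classify_transcript(rcv, threshold):
    """Label a transcript by comparing its RCV with a chosen cutoff.

    The cutoff is a hypothetical parameter; the abstract gives no value.
    """
    return "putatively translated" if rcv >= threshold else "putative ncRNA"


# Hypothetical example values, chosen only to exercise both branches.
HYPOTHETICAL_THRESHOLD = 0.3
high = ribosomal_coverage_value(footprint_reads=80.0, transcript_reads=100.0)
low = ribosomal_coverage_value(footprint_reads=2.0, transcript_reads=100.0)
labels = [classify_transcript(v, HYPOTHETICAL_THRESHOLD) for v in (high, low)]
```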


Subjects
Escherichia coli O157/genetics , Peptides/genetics , RNA, Small Untranslated/genetics , Ribosomes/genetics , Sequence Analysis, RNA , Base Sequence , Gene Expression Profiling , Phenotype
2.
Z Naturforsch C J Biosci ; 71(9-10): 335-345, 2016 Sep 01.
Article in English | MEDLINE | ID: mdl-27583467

ABSTRACT

During a 1-year longitudinal study, water, sediment and water plants from two creeks and one pond were sampled monthly and analyzed for the presence of Listeria species. A total of 90 % of 30 sediment samples, 84 % of 31 water plant samples and 67 % of 36 water samples tested positive. Generally, most probable number counts ranged between 1 and 40 g⁻¹; only occasionally were >110 cfu g⁻¹ detected. Species differentiation based on FT-IR spectroscopy and multiplex PCR of a total of 1220 isolates revealed L. innocua (46 %), L. seeligeri (27 %), L. monocytogenes (25 %) and L. ivanovii (2 %). Titers and species compositions were similar during all seasons. While the species distributions in sediments and associated Ranunculus fluitans plants appeared to be similar in both creeks, RAPD typing did not provide conclusive evidence that the populations of these environments were connected. It is concluded that (i) the fresh-water sediments and water plants are populated by Listeria year-round, (ii) no clear preference for growth in habitats as different as sediments and water plants was found and (iii) the RAPD-based intraspecific biodiversity is high compared to the low population density.


Subjects
Aquatic Organisms/microbiology , Fresh Water/microbiology , Geologic Sediments/microbiology , Listeria/physiology , Plants/microbiology , Bacterial Load , Biodiversity , Ecosystem , Genetic Variation , Host-Pathogen Interactions , Listeria/classification , Listeria/genetics , Molecular Typing/methods , Ponds , Population Density , Random Amplified Polymorphic DNA Technique/methods , Ranunculus/microbiology , Rivers , Seasons , Species Specificity , Time Factors
3.
BMC Evol Biol ; 15: 283, 2015 Dec 18.
Article in English | MEDLINE | ID: mdl-26677845

ABSTRACT

BACKGROUND: Gene duplication is believed to be the classical way to form novel genes, but overprinting may be an important alternative. Overprinting allows entirely novel proteins to evolve de novo, i.e., formerly non-coding open reading frames within functional genes become expressed. Only three cases have been described for Escherichia coli. Here, a fourth example is presented. RESULTS: RNA sequencing revealed an open reading frame weakly transcribed in cow dung, coding for 101 residues and embedded completely in the -2 reading frame of citC in enterohemorrhagic E. coli. This gene is designated novel overlapping gene, nog1. The promoter region fused to gfp exhibits specific activities and 5' rapid amplification of cDNA ends indicated the transcriptional start 40-bp upstream of the start codon. nog1 was strand-specifically arrested in translation by a nonsense mutation silent in citC. This Nog1-mutant showed a phenotype in competitive growth against wild type in the presence of MgCl2. Small differences in metabolite concentrations were also found. Bioinformatic analyses propose Nog1 to be inner membrane-bound and to possess at least one membrane-spanning domain. A phylogenetic analysis suggests that the orphan gene nog1 arose by overprinting after Escherichia/Shigella separated from the other γ-proteobacteria. CONCLUSIONS: Since nog1 is of recent origin, non-essential, short, weakly expressed and only marginally involved in E. coli's central metabolism, we propose that this gene is in an initial stage of evolution. While we present specific experimental evidence for the existence of a fourth overlapping gene in enterohemorrhagic E. coli, we believe that this may be an initial finding only and overlapping genes in bacteria may be more common than is currently assumed by microbiologists.


Subjects
Enteropathogenic Escherichia coli/genetics , Evolution, Molecular , Animals , Bacterial Proteins/genetics , Base Sequence , Cattle , Codon, Initiator , Computational Biology , Enteropathogenic Escherichia coli/classification , Enteropathogenic Escherichia coli/growth & development , Feces/microbiology , Genes, Overlapping , Molecular Sequence Data , Open Reading Frames , Operon , Phylogeny , Promoter Regions, Genetic , Shigella/genetics
4.
BMC Bioinformatics ; 16: 50, 2015 Feb 18.
Article in English | MEDLINE | ID: mdl-25887410

ABSTRACT

BACKGROUND: Barcode multiplexing is a key strategy for sharing the rising capacity of next-generation sequencing devices: Synthetic DNA tags, called barcodes, are attached to natural DNA fragments within the library preparation procedure. Different libraries can individually be labeled with barcodes for a joint sequencing procedure. A post-processing step is needed to sort the sequencing data according to their origin, utilizing these DNA labels. The final separation step is called demultiplexing and is mainly determined by the characteristics of the DNA code words used as labels. Currently, we are facing two different strategies for barcoding: One is based on the Hamming distance, the other uses the edit metric to measure distances of code words. The theory of channel coding provides well-known code constructions for the Hamming metric. They provide a large number of code words with variable lengths and maximal correction capability regarding substitution errors. However, some sequencing platforms are known to have exceptionally high numbers of insertion or deletion errors. Barcodes based on the edit distance can take insertion and deletion errors into account in the decoding process. Unfortunately, there is no explicit code construction known that gives optimal codes for the edit metric. RESULTS: In the present work we focus on an entirely different perspective to obtain DNA barcodes. We consider a concatenated code construction, producing so-called watermark codes, which were first proposed by Davey and MacKay, to communicate via binary channels with synchronization errors. We adapt and extend the concepts of watermark codes to use them for DNA sequencing. Moreover, we provide an exemplary set of barcodes that are experimentally compatible with common next-generation sequencing platforms. Finally, a realistic simulation scenario is used to evaluate the proposed codes to show that the watermark concept is suitable for DNA sequencing applications. CONCLUSION: Our adaptation of watermark codes enables the construction of barcodes that are capable of correcting substitution, insertion and deletion errors. The presented approach has the advantage of not needing any markers or technical sequences to recover the position of the barcode in the sequencing reads, which poses a significant restriction with other approaches.
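The difference between the two metrics named in this abstract can be made concrete with a short sketch. It implements the standard Hamming and edit (Levenshtein) distances, not the watermark construction itself; the barcode and the shifted read are made-up example strings. One deleted base followed by one base read from the adjacent sequence shifts every position, so the Hamming distance becomes large while the edit distance stays small.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def edit_distance(a, b):
    """Levenshtein distance: minimum substitutions, insertions and deletions."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,             # delete x from a
                current[j - 1] + 1,          # insert y into a
                previous[j - 1] + (x != y),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


# Hypothetical barcode and a read that lost its first base and picked up
# one trailing base from the neighbouring sequence.
barcode = "ACGTACGT"
shifted_read = "CGTACGTA"
distances = (hamming_distance(barcode, shifted_read),
             edit_distance(barcode, shifted_read))
```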


Subjects
Algorithms , Computational Biology/methods , DNA Barcoding, Taxonomic , DNA/chemistry , Gene Library , Genome, Human , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , DNA/analysis , DNA/genetics , Humans
5.
PLoS One ; 9(10): e108768, 2014.
Article in English | MEDLINE | ID: mdl-25271416

ABSTRACT

Overlapping genes are two protein-coding sequences sharing a significant part of the same DNA locus in different reading frames. Although an increasing number of examples have recently been found in bacteria, the underlying mechanisms of their evolution are unknown. In this work we explore how selective pressure in a protein-coding sequence influences its overlapping genes in alternative reading frames. We model evolution using a time-continuous Markov process and derive the corresponding model for the remaining frames to quantify selection pressure and genetic noise. Our findings lead to the presumption that, once information is embedded in the reverse reading frame -2 (relative to the mother gene in +1), purifying selection in the protein-coding reading frame automatically protects the sequences in both frames. We also found that this coincides with the fact that the genetic noise, measured using the conditional entropy, is minimal in frame -2 under selection in the coding frame.


Subjects
Evolution, Molecular , Genes, Overlapping , Models, Genetic , Reading Frames , Markov Chains , Open Reading Frames
6.
BMC Genomics ; 15: 353, 2014 May 09.
Article in English | MEDLINE | ID: mdl-24885796

ABSTRACT

BACKGROUND: Multiple infection sources for enterohemorrhagic Escherichia coli O157:H7 (EHEC) are known, including animal products, fruit and vegetables. The ecology of this pathogen outside its human host is largely unknown and one third of its annotated genes are still hypothetical. To identify genetic determinants expressed under a variety of environmental factors, we applied strand-specific RNA-sequencing, comparing the SOLiD and Illumina systems. RESULTS: Transcriptomes of EHEC were sequenced under 11 different biotic and abiotic conditions: LB medium at pH4, pH7, pH9, or at 15°C; LB with nitrite or trimethoprim-sulfamethoxazole; LB-agar surface, M9 minimal medium, spinach leaf juice, surface of living radish sprouts, and cattle feces. Of 5379 annotated genes in strain EDL933 (genome and plasmid), a surprising minority of only 144 had null sequencing reads under all conditions. We therefore developed a statistical method to distinguish weakly transcribed genes from background transcription. We find that 96% of all genes and 91.5% of the hypothetical genes exhibit a significant transcriptional signal under at least one condition. Comparing SOLiD and Illumina systems, we find a high correlation between both approaches for fold-changes of the induced or repressed genes. The pathogenicity island LEE showed highest transcriptional activity in LB medium, minimal medium, and after treatment with antibiotics. Unique sets of genes, including many hypothetical genes, are highly up-regulated on radish sprouts, cattle feces, or in the presence of antibiotics. Furthermore, we observed induction of the Shiga toxin-carrying phages by antibiotics and confirmed active biofilm-related genes on radish sprouts, in cattle feces, and on agar plates. CONCLUSIONS: Since only a minority of genes (2.7%) were not active under any condition tested (null reads), we suggest that the assumption of significant genome over-annotations is wrong. Environmental transcriptomics uncovered hitherto unknown gene functions and unique regulatory patterns in EHEC. For instance, the environmental function of azoR had been elusive, but this gene is highly active on radish sprouts. Thus, NGS-transcriptomics is an appropriate technique to propose new roles of hypothetical genes and to guide future research.


Subjects
Escherichia coli O157/genetics , Feces/microbiology , Raphanus/microbiology , Transcriptome , Animals , Anti-Bacterial Agents/pharmacology , Cattle , Escherichia coli O157/isolation & purification , Escherichia coli O157/metabolism , Escherichia coli Proteins/genetics , Gene Library , High-Throughput Nucleotide Sequencing , Humans , Hydrogen-Ion Concentration , Raphanus/genetics , Raphanus/metabolism , Sequence Analysis, RNA , Transcriptome/drug effects , Virulence Factors/genetics , Virulence Factors/metabolism
7.
BMC Bioinformatics ; 14 Suppl 10: S4, 2013.
Article in English | MEDLINE | ID: mdl-24267277

ABSTRACT

In this paper we present an algorithm based on the sum-product algorithm that finds elements in the preimage of a feed-forward Boolean network given an output of the network. Our probabilistic method runs in linear time with respect to the number of nodes in the network. We evaluate our algorithm for randomly constructed Boolean networks and a regulatory network of Escherichia coli and find that it gives a valid solution in most cases.


Subjects
Algorithms , Computational Biology , Gene Regulatory Networks , Computational Biology/methods , Computer Simulation , Escherichia coli/genetics , Models, Genetic , Probability
8.
PLoS One ; 8(5): e64371, 2013.
Article in English | MEDLINE | ID: mdl-23741321

ABSTRACT

Nested canalizing Boolean functions (NCFs) play an important role in biologically motivated regulatory networks and in signal processing, in particular describing stack filters. It has been conjectured that NCFs have a stabilizing effect on the network dynamics. It is well known that the average sensitivity plays a central role for the stability of (random) Boolean networks. Here we provide a tight upper bound on the average sensitivity of NCFs as a function of the number of relevant input variables. As conjectured in the literature, this bound is smaller than 4/3. This shows that a large number of functions appearing in biological networks belong to a class that has low average sensitivity, which is even close to a tight lower bound.
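The average sensitivity discussed in this abstract can be computed by brute force for small functions. The sketch below uses the standard definition under uniformly distributed inputs (the expected number of single-bit flips that change the output); the example function x1 AND (x2 OR x3) is a hand-picked nested canalizing function, not one taken from the paper, and its value can be compared against the 4/3 bound.

```python
from itertools import product


def average_sensitivity(f, n):
    """Expected number of input bits whose flip changes f(x), x uniform.

    f takes n arguments in {0, 1}; the cost is exponential in n, so this
    is only meant for small illustrative functions.
    """
    changes = 0
    for x in product((0, 1), repeat=n):
        for i in range(n):
            flipped = list(x)
            flipped[i] ^= 1
            changes += f(*x) != f(*flipped)
    return changes / 2 ** n


def example_ncf(x1, x2, x3):
    """Hand-picked nested canalizing function: x1 AND (x2 OR x3)."""
    return x1 & (x2 | x3)


def parity3(x1, x2, x3):
    """Three-input XOR, which is not canalizing, for contrast."""
    return x1 ^ x2 ^ x3


ncf_sensitivity = average_sensitivity(example_ncf, 3)
parity_sensitivity = average_sensitivity(parity3, 3)
```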


Subjects
Fourier Analysis , Gene Regulatory Networks , Models, Statistical , Computer Simulation
9.
EURASIP J Bioinform Syst Biol ; 2013(1): 6, 2013 May 04.
Article in English | MEDLINE | ID: mdl-23642003

ABSTRACT

Consider a large Boolean network with a feed forward structure. Given a probability distribution on the inputs, can one find, possibly small, collections of input nodes that determine the states of most other nodes in the network? To answer this question, a notion that quantifies the determinative power of an input over the states of the nodes in the network is needed. We argue that the mutual information (MI) between a given subset of the inputs X = {X_1, ..., X_n} of some node i and its associated function f_i(X) quantifies the determinative power of this set of inputs over node i. We compare the determinative power of a set of inputs to the sensitivity to perturbations to these inputs, and find that, maybe surprisingly, an input that has large sensitivity to perturbations does not necessarily have large determinative power. However, for unate functions, which play an important role in genetic regulatory networks, we find a direct relation between MI and sensitivity to perturbations. As an application of our results, we analyze the large-scale regulatory network of Escherichia coli. We identify the most determinative nodes and show that a small subset of those reduces the overall uncertainty of the network state significantly. Furthermore, the network is found to be tolerant to perturbations of its inputs.
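The MI measure described in this abstract can be illustrated on a toy function. The sketch below assumes independent, uniformly distributed inputs (the abstract allows a general input distribution) and computes the mutual information between a chosen subset of inputs and the output by enumeration. Two-input XOR shows the effect mentioned above: flipping either input always changes the output, yet a single input carries no information about it.

```python
from collections import Counter
from itertools import product
from math import log2


def mutual_information(f, n, subset):
    """MI in bits between the inputs indexed by `subset` and f(x).

    Assumes all 2**n input vectors are equally likely (an assumption of
    this sketch, not a requirement of the method in the abstract).
    """
    total = 2 ** n
    joint = Counter()
    for x in product((0, 1), repeat=n):
        joint[(tuple(x[i] for i in subset), f(*x))] += 1
    inputs = Counter()
    outputs = Counter()
    for (key, y), count in joint.items():
        inputs[key] += count
        outputs[y] += count
    return sum(
        count / total * log2(count * total / (inputs[key] * outputs[y]))
        for (key, y), count in joint.items()
    )


def xor2(x1, x2):
    return x1 ^ x2


def and2(x1, x2):
    return x1 & x2


mi_xor_single = mutual_information(xor2, 2, [0])     # one input alone
mi_xor_both = mutual_information(xor2, 2, [0, 1])    # both inputs together
mi_and_single = mutual_information(and2, 2, [0])     # unate function
```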

10.
EURASIP J Bioinform Syst Biol ; 2013(1): 1, 2013 Jan 11.
Article in English | MEDLINE | ID: mdl-23311536

ABSTRACT

Transcriptional regulation networks are often modeled as Boolean networks. We discuss certain properties of Boolean functions (BFs) that are considered important in such networks, namely, membership in the classes of unate or canalizing functions. Of further interest is the average sensitivity (AS) of functions. In this article, we discuss several algorithms to test the properties of interest. To test canalizing properties of functions, we apply spectral techniques, which can also be used to characterize the AS of functions as well as the influences of variables in unate BFs. Further, we provide and review upper and lower bounds on the AS of unate BFs based on the spectral representation. Finally, we apply these methods to a transcriptional regulation network of Escherichia coli, which controls central parts of the E. coli metabolism. We find that all functions are unate. Also, the analysis of the AS of the network reveals an exceptional robustness against transient fluctuations of the binary variables.

11.
PLoS One ; 8(12): e82933, 2013.
Article in English | MEDLINE | ID: mdl-24386128

ABSTRACT

We consider the design and evaluation of short barcodes, with a length between six and eight nucleotides, used for parallel sequencing on platforms where substitution errors dominate. Such codes should not only have good error correction properties; their code words should also fulfil certain biological constraints (experimental parameters). We compare published barcodes with codes obtained by two new construction methods, one based on the currently best known linear codes and the other a simple randomized construction method. The evaluation is done with respect to the error correction capabilities, barcode size and their experimental parameters, and fundamental bounds on the code size and their distance properties. We provide a list of codes for lengths between six and eight nucleotides, where for length eight, two substitution errors can be corrected. In fact, no code with larger minimum distance can exist.
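The link between minimum distance and correctable substitutions mentioned in this abstract follows the standard coding-theory rule: a code with minimum Hamming distance d corrects up to floor((d - 1) / 2) substitution errors. The sketch below checks this on a tiny made-up set of length-eight barcodes; these three words are hypothetical and are not taken from the published code lists.

```python
from itertools import combinations


def hamming_distance(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("code words must have equal length")
    return sum(x != y for x, y in zip(a, b))


def minimum_distance(code_words):
    """Smallest pairwise Hamming distance within a set of code words."""
    return min(hamming_distance(a, b) for a, b in combinations(code_words, 2))


def correctable_substitutions(min_dist):
    """Guaranteed number of correctable substitutions for a given distance."""
    return (min_dist - 1) // 2


# Hypothetical toy code of three length-eight barcodes.
toy_code = ["AAAAAAAA", "CCCCCAAA", "AAACCCCC"]
toy_distance = minimum_distance(toy_code)
toy_correctable = correctable_substitutions(toy_distance)
```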


Subjects
DNA/chemistry , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA
12.
PLoS One ; 7(9): e45103, 2012.
Article in English | MEDLINE | ID: mdl-23028785

ABSTRACT

An analytical model based on the statistical properties of Open Reading Frames (ORFs) of eubacterial genomes such as codon composition and sequence length of all reading frames was developed. This new model predicts the average length, maximum length as well as the length distribution of the ORFs of 70 species with GC contents varying between 21% and 74%. Furthermore, the number of annotated genes is predicted with high accordance. However, the ORF length distribution in the five alternative reading frames shows interesting deviations from the predicted distribution. In particular, long ORFs appear more often than expected statistically. The unexpected depletion of stop codons in these alternative open reading frames cannot completely be explained by a biased codon usage in the +1 frame. While it is unknown if the stop codon depletion has a biological function, it could be due to a protein coding capacity of alternative ORFs exerting a selection pressure which prevents the fixation of stop codon mutations. The comparison of the analytical model with bacterial genomes, therefore, leads to a hypothesis suggesting novel gene candidates which can now be investigated in subsequent wet lab experiments.
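To give a feel for why ORF length should depend on GC content, the following sketch uses a deliberately simplified null model, which is not the analytical model of this paper: nucleotides are assumed independent, with A and T each at frequency (1 - GC)/2 and G and C each at GC/2. Under that assumption the three stop codons TAA, TAG and TGA have a combined probability per codon, and the number of codons up to and including the first stop is geometrically distributed. All function names are hypothetical.

```python
def stop_codon_probability(gc_content):
    """Probability that a random codon is TAA, TAG or TGA.

    Assumes independent nucleotides with P(A) = P(T) = (1 - GC) / 2 and
    P(G) = P(C) = GC / 2, a simplification made for this sketch only.
    """
    if not 0.0 < gc_content < 1.0:
        raise ValueError("gc_content must lie strictly between 0 and 1")
    p_at = (1.0 - gc_content) / 2.0  # probability of A, and of T
    p_gc = gc_content / 2.0          # probability of G, and of C
    # TAA contributes p_at**3; TAG and TGA each contribute p_at**2 * p_gc.
    return p_at ** 3 + 2.0 * p_at ** 2 * p_gc


def expected_codons_to_stop(gc_content):
    """Mean of the geometric distribution: codons up to the first stop."""
    return 1.0 / stop_codon_probability(gc_content)


# The GC range quoted in the abstract (21% to 74%), plus the balanced case.
expected_lengths = {gc: expected_codons_to_stop(gc) for gc in (0.21, 0.50, 0.74)}
```

Under this null model a GC-rich genome has fewer chance stop codons and therefore longer random ORFs, which is the kind of baseline against which an excess of long ORFs in alternative frames could be judged.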


Subjects
Escherichia coli O157/genetics , Genome, Bacterial/genetics , Open Reading Frames/genetics , Statistics as Topic , Base Composition/genetics , Base Sequence , Markov Chains , Models, Genetic , Molecular Sequence Annotation
13.
EURASIP J Bioinform Syst Biol ; 2012(1): 14, 2012 Oct 03.
Article in English | MEDLINE | ID: mdl-23033959

ABSTRACT

Bacterial strains that were genetically blocked in important metabolic pathways and grown under selective conditions underwent a process of adaptive evolution: certain pathways may have been deregulated and therefore allowed for the circumvention of the given block. A block of endogenous pyruvate synthesis from glycerol was realized by a knockout of pyruvate kinase and phosphoenolpyruvate carboxylase in E. coli. The resulting mutant strain was able to grow on a medium containing glycerol and lactate, which served as an exogenous pyruvate source. Heterologous expression of a pyruvate carboxylase gene from Corynebacterium glutamicum was used for anaplerosis of the TCA cycle. Selective conditions were controlled in a continuous culture with limited lactate feed and an excess of glycerol feed. After 200-300 generations pyruvate-prototrophic mutants were isolated. The genomic analysis of an evolved strain revealed that the genotypic basis for the regained pyruvate-prototrophy was not obvious. A constraint-based model of the metabolism was employed to compute all possible detours around the given metabolic block by solving a hierarchy of linear programming problems. The regulatory network was expected to be responsible for the adaptation process. Hence, a Boolean model of the transcription factor network was connected to the metabolic model. Our model analysis only showed a marginal impact of transcriptional control on the biomass yield on substrate, which is a key variable in the selection process. In our experiment, microarray analysis confirmed that transcriptional control probably played a minor role in the deregulation of the alternative pathways for the circumvention of the block.

14.
EURASIP J Bioinform Syst Biol ; 2011: 6, 2011 Oct 11.
Article in English | MEDLINE | ID: mdl-21989141

ABSTRACT

Boolean models of regulatory networks are assumed to be tolerant to perturbations. That qualitatively implies that each function can only depend on a few nodes. Biologically motivated constraints further show that functions found in Boolean regulatory networks belong to certain classes of functions, for example, the unate functions. It turns out that these classes have specific properties in the Fourier domain. That motivates us to study the problem of detecting controlling nodes in classes of Boolean networks using spectral techniques. We consider networks with unbalanced functions and functions of an average sensitivity less than (2/3)k, where k is the number of controlling variables for a function. Further, we consider the class of 1-low networks which include unate networks, linear threshold networks, and networks with nested canalyzing functions. We show that the application of spectral learning algorithms leads to both better time and sample complexity for the detection of controlling nodes compared with algorithms based on exhaustive search. For a particular algorithm, we state analytical upper bounds on the number of samples needed to find the controlling nodes of the Boolean functions. Further, improved algorithms for detecting controlling nodes in large-scale unate networks are given and numerically studied.
