Results 1 - 9 of 9
1.
PLoS One ; 17(9): e0274591, 2022.
Article in English | MEDLINE | ID: mdl-36136981

ABSTRACT

The evolution of RNA-seq technologies has yielded datasets of scientific value that are often generated as condition-associated biological replicates within expression studies. With expanding data archives, the opportunity arises to augment replicate numbers when conditions of interest overlap. Despite correction procedures for estimating transcript abundance, a source of ambiguity is transcript-level intra-condition count variation, as indicated by disjointed results between analysis tools. We present TVscript, a tool that removes reference-based transcripts associated with intra-condition count variation above specified thresholds and we explore the effects of such variation on differential expression analysis. Initially, iterative differential expression analysis involving simulated counts, where levels of intra-condition variation and sets of over-represented transcripts are explicitly specified, was performed. Then counts derived from inter- and intra-study data representing brain samples of dogs, wolves and foxes (wolves vs. dogs and aggressive vs. tame foxes) were used. For simulations, the sensitivity in detecting differentially expressed transcripts increased after removing hyper-variable transcripts, although at levels of intra-condition variation above 5% detection became unreliable. For real data, prior to applying TVscript, ≈20% of the transcripts identified as being differentially expressed were associated with high levels of intra-condition variation, an over-representation relative to the reference set. As transcripts harbouring such variation were removed pre-analysis, a discordance of 26 to 40% in the lists of differentially expressed transcripts was observed when compared to those obtained using the non-filtered reference. The removal of transcripts possessing intra-condition variation values within (and above) the 97th and 95th percentiles, for wolves vs. dogs and aggressive vs. tame foxes, maximized the sensitivity in detecting differentially expressed transcripts as a result of alterations within gene-wise dispersion estimates. Through analysis of our real data, support is provided for seven genes potentially involved in selection for tameness. TVscript is available at: https://sourceforge.net/projects/tvscript/.
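
The percentile-based filtering described in this abstract can be illustrated with a minimal sketch. It is not TVscript's implementation: the abstract does not define the variation metric, so the coefficient of variation across replicates is assumed here, and the function names `intra_condition_cv` and `filter_hypervariable` are hypothetical.

```python
import numpy as np

def intra_condition_cv(counts):
    """Coefficient of variation per transcript (assumed metric, not from the paper).

    counts: 2-D array-like, rows = transcripts, columns = replicates of ONE condition.
    """
    counts = np.asarray(counts, dtype=float)
    mean = counts.mean(axis=1)
    sd = counts.std(axis=1, ddof=1)
    # Transcripts with zero mean get a CV of 0 to avoid division by zero.
    return np.divide(sd, mean, out=np.zeros_like(mean), where=mean > 0)

def filter_hypervariable(ids, counts_by_condition, percentile=95.0):
    """Keep transcripts whose worst intra-condition CV lies below the percentile cutoff."""
    # Take, per transcript, the largest CV observed in any condition.
    worst_cv = np.max([intra_condition_cv(c) for c in counts_by_condition], axis=0)
    cutoff = np.percentile(worst_cv, percentile)
    # Values within (and above) the chosen percentile are removed.
    return [t for t, cv in zip(ids, worst_cv) if cv < cutoff]

ids = ["t1", "t2", "t3", "t4", "t5"]
cond_a = [[10, 10, 10], [20, 21, 19], [5, 50, 100], [8, 8, 9], [30, 31, 29]]
cond_b = [[12, 12, 12], [18, 19, 20], [7, 7, 8], [9, 9, 9], [33, 30, 31]]
kept = filter_hypervariable(ids, [cond_a, cond_b], percentile=80.0)
```

In this toy input, transcript `t3` varies strongly between replicates of the first condition, so it is the one dropped before any differential expression step would run.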


Subject(s)
Wolves , Animals , Dogs , Foxes/genetics , Exome Sequencing , Wolves/genetics
2.
PLoS One ; 17(6): e0259726, 2022.
Article in English | MEDLINE | ID: mdl-35696379

ABSTRACT

To date, basic visualization of sequence alignments has largely focused on displaying per-site columns of nucleotide, or amino acid, residues along with associated frequency summarizations. The persistence of this tendency in recent tools designed for viewing mapped read data indicates that such a perspective not only provides a reliable visualization of per-site alterations, but also offers implicit reassurance to the end-user in relation to data accessibility. However, the initial insight gained is limited, something that is especially true when viewing alignments consisting of many sequences representing differing factors such as location, date and subtype. A basic alignment viewer has the potential to increase initial insight through visual enhancement, whilst not delving into the realms of complex sequence analysis. We present CView, a visualizer that expands on the per-site representation of residues through the incorporation of a dynamic network that is based on the summarization of diversity present across different regions of the alignment. Within the network, nodes are based on the clustering of sequence fragments that span windows placed consecutively along the alignment. Edges are placed between nodes of neighbouring windows where they share sequence identification(s), i.e. different regions of the same sequence(s). Thus, if a node is selected on the network, then the relationship that sequences passing through that node have to other regions of diversity within the alignment can be observed through path tracing. In addition to augmenting visual insight, CView provides export features including variant summarization, per-site residue and kmer frequencies, consensus sequence, alignment dissection as well as clustering; each useful across a range of research areas. The software has been designed to be user friendly, intuitive and interactive. It is open source and an executable jar, source code, quick start, usage tutorial and test data are available (under the GNU General Public License) from https://sourceforge.net/projects/cview/.
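
The window-based network described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not CView's code: fragments are clustered by exact identity (the abstract does not specify the clustering method), and `build_window_network` is a hypothetical name.

```python
from collections import defaultdict

def build_window_network(alignment, window):
    """Build nodes and edges from consecutive windows of an alignment.

    alignment: dict mapping sequence id -> aligned sequence (all the same length).
    Returns (nodes, edges): nodes[w] maps a fragment to the set of sequence ids
    in that cluster; each edge links clusters in neighbouring windows that share ids.
    """
    length = len(next(iter(alignment.values())))
    nodes = []
    for start in range(0, length, window):
        clusters = defaultdict(set)
        for seq_id, seq in alignment.items():
            # Assumption: identical fragments form one cluster (one node).
            clusters[seq[start:start + window]].add(seq_id)
        nodes.append(dict(clusters))

    edges = []
    for w in range(len(nodes) - 1):
        for frag_a, ids_a in nodes[w].items():
            for frag_b, ids_b in nodes[w + 1].items():
                shared = ids_a & ids_b
                if shared:
                    edges.append(((w, frag_a), (w + 1, frag_b), shared))
    return nodes, edges

alignment = {"s1": "AAAACCCC", "s2": "AAAAGGGG", "s3": "TTTTCCCC"}
nodes, edges = build_window_network(alignment, window=4)
```

Tracing the edges whose shared set contains a given sequence id reproduces, in miniature, the path-tracing idea: `s1` passes through the `AAAA` node in the first window and the `CCCC` node in the second.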


Subject(s)
Software , Sequence Alignment , Sequence Analysis
3.
PLoS Comput Biol ; 17(11): e1009631, 2021 11.
Article in English | MEDLINE | ID: mdl-34813594

ABSTRACT

With the exponential growth of sequence information stored over the last decade, including that of de novo assembled contigs from RNA-Seq experiments, quantification of chimeric sequences has become essential when assembling read data. In transcriptomics, de novo assembled chimeras can closely resemble underlying transcripts, but patterns such as those seen between co-evolving sites, or mapped read counts, become obscured. We have created a de Bruijn-based de novo assembler for RNA-Seq data that utilizes a classification system to describe the complexity of underlying graphs from which contigs are created. Each contig is labelled with one of three levels, indicating whether or not ambiguous paths exist. A by-product of this is information on the range of complexity of the underlying gene families present. As a demonstration of CStone's ability to assemble high-quality contigs, and to label them in this manner, both simulated and real data were used. For simulated data, ten million read pairs were generated from cDNA libraries representing four species, Drosophila melanogaster, Panthera pardus, Rattus norvegicus and Serinus canaria. These were assembled using CStone, Trinity and rnaSPAdes; the latter two being high-quality, well established, de novo assemblers. For real data, two RNA-Seq datasets, each consisting of ≈30 million read pairs, representing two adult D. melanogaster whole-body samples were used. The contigs that CStone produced were comparable in quality to those of Trinity and rnaSPAdes in terms of length, sequence identity of aligned regions and the range of cDNA transcripts represented, whilst providing additional information on chimerism. Here we describe the details of CStone's assembly and classification process, and propose that similar classification systems can be incorporated into other de novo assembly tools. Within a related side study, we explore the effects that chimeras within reference sets have on the identification of differentially expressed genes. CStone is available at: https://sourceforge.net/projects/cstone/.
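
As a rough illustration of what "ambiguous paths" in a de Bruijn graph means, the sketch below builds a graph from reads and reports whether any node branches. It is not CStone's algorithm: the abstract does not define the three classification levels, so only a binary flag is shown, and the function names are hypothetical.

```python
from collections import defaultdict

def de_bruijn(reads, k):
    """Map each (k-1)-mer prefix to the set of (k-1)-mer suffixes that follow it."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph

def has_ambiguous_paths(graph):
    """True if any node has more than one successor or more than one predecessor."""
    in_degree = defaultdict(int)
    for successors in graph.values():
        for node in successors:
            in_degree[node] += 1
    branches_out = any(len(successors) > 1 for successors in graph.values())
    branches_in = any(count > 1 for count in in_degree.values())
    return branches_out or branches_in

linear = de_bruijn(["ACGTA"], k=3)             # single unbranched path
branched = de_bruijn(["ACGTA", "ACGGA"], k=3)  # node "CG" has two successors
```

A contig assembled from the second graph could follow either branch after `CG`, which is the kind of situation in which chimeric contigs may arise.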


Subject(s)
Transcriptome , Animals , Chimerism , DNA, Complementary/genetics , Datasets as Topic , Sequence Analysis, RNA/methods , Software
4.
Chemosphere ; 220: 748-759, 2019 Apr.
Article in English | MEDLINE | ID: mdl-30611073

ABSTRACT

Bacteria harboring conjugative plasmids have the potential for spreading antibiotic resistance through horizontal gene transfer. It has been described that the selection and dissemination of antibiotic resistance is enhanced by stressors, like metals or antibiotics, which can occur as environmental contaminants. This study aimed at unveiling the composition of the conjugative plasmidome of a hospital effluent multidrug resistant Escherichia coli strain (H1FC54) under different mating conditions. To meet this objective, plasmid pulsed field gel electrophoresis, optical mapping analyses and DNA sequencing were used in combination with phenotype analysis. Strain H1FC54 was observed to harbor five plasmids, three of which were conjugative and two of these, pH1FC54_330 and pH1FC54_140, contained metal and antibiotic resistance genes. Transconjugants obtained in the absence or presence of tellurite (0.5 µM or 5 µM), arsenite (0.5 µM, 5 µM or 15 µM) or ceftazidime (10 mg/L) and selected in the presence of sodium azide (100 mg/L) and tetracycline (16 mg/L) presented distinct phenotypes, associated with the acquisition of different plasmid combinations, including two co-integrate plasmids, of 310 kbp and 517 kbp. The variable composition of the conjugative plasmidome, the formation of co-integrates during conjugation, as well as the transfer of non-transferable plasmids via co-integration, and the possible association between antibiotic, arsenite and tellurite tolerance were demonstrated. This evidence brings interesting insights into the comprehension of the molecular and physiological mechanisms that underlie antibiotic resistance propagation in the environment.


Subject(s)
Escherichia coli/genetics , Genetic Variation , Drug Resistance, Microbial/genetics , Drug Resistance, Multiple , Electrophoresis, Gel, Pulsed-Field , Escherichia coli/isolation & purification , Gene Transfer, Horizontal , Hospitals , Metals/pharmacology , Plasmids/genetics
5.
G3 (Bethesda) ; 7(8): 2763-2778, 2017 08 07.
Article in English | MEDLINE | ID: mdl-28637810

ABSTRACT

Transposable element (TE) insertions are among the most challenging types of variants to detect in genomic data because of their repetitive nature and complex mechanisms of replication. Nevertheless, the recent availability of large resequencing data sets has spurred the development of many new methods to detect TE insertions in whole-genome shotgun sequences. Here we report an integrated bioinformatics pipeline for the detection of TE insertions in whole-genome shotgun data, called McClintock (https://github.com/bergmanlab/mcclintock), which automatically runs and standardizes output for multiple TE detection methods. We demonstrate the utility of McClintock by evaluating six TE detection methods using simulated and real genome data from the model microbial eukaryote, Saccharomyces cerevisiae. We find substantial variation among McClintock component methods in their ability to detect nonreference TEs in the yeast genome, but show that nonreference TEs at nearly all biologically realistic locations can be detected in simulated data by combining multiple methods that use split-read and read-pair evidence. In general, our results reveal that split-read methods detect fewer nonreference TE insertions than read-pair methods, but generally have much higher positional accuracy. Analysis of a large sample of real yeast genomes reveals that most McClintock component methods can recover known aspects of TE biology in yeast such as the transpositional activity status of families, target preferences, and target site duplication structure, albeit with varying levels of accuracy. Our work provides a general framework for integrating and analyzing results from multiple TE detection methods, as well as useful guidance for researchers studying TEs in yeast resequencing data.


Subject(s)
Computational Biology/methods , DNA Transposable Elements/genetics , Genome , Sequence Analysis, DNA , Software , Animals , Drosophila melanogaster/genetics , Gene Duplication , Molecular Sequence Annotation , Mutagenesis, Insertional/genetics , RNA, Transfer/genetics , Saccharomyces cerevisiae/genetics
6.
Nature ; 482(7384): 173-8, 2012 Feb 08.
Article in English | MEDLINE | ID: mdl-22318601

ABSTRACT

A major challenge of biology is understanding the relationship between molecular genetic variation and variation in quantitative traits, including fitness. This relationship determines our ability to predict phenotypes from genotypes and to understand how evolutionary forces shape variation within and between species. Previous efforts to dissect the genotype-phenotype map were based on incomplete genotypic information. Here, we describe the Drosophila melanogaster Genetic Reference Panel (DGRP), a community resource for analysis of population genomics and quantitative traits. The DGRP consists of fully sequenced inbred lines derived from a natural population. Population genomic analyses reveal reduced polymorphism in centromeric autosomal regions and the X chromosome, evidence for positive and negative selection, and rapid evolution of the X chromosome. Many variants in novel genes, most at low frequency, are associated with quantitative traits and explain a large fraction of the phenotypic variance. The DGRP facilitates genotype-phenotype mapping using the power of Drosophila genetics.


Subject(s)
Drosophila melanogaster/genetics , Genome-Wide Association Study , Genomics , Quantitative Trait Loci/genetics , Alleles , Animals , Centromere/genetics , Chromosomes, Insect/genetics , Genotype , Phenotype , Polymorphism, Single Nucleotide/genetics , Selection, Genetic/genetics , Starvation/genetics , Telomere/genetics , X Chromosome/genetics
7.
PLoS One ; 7(2): e30008, 2012.
Article in English | MEDLINE | ID: mdl-22347367

ABSTRACT

Transposable elements are mobile DNA sequences that integrate into host genomes using diverse mechanisms with varying degrees of target site specificity. While the target site preferences of some engineered transposable elements are well studied, the natural target preferences of most transposable elements are poorly characterized. Using population genomic resequencing data from 166 strains of Drosophila melanogaster, we identified over 8,000 new insertion sites not present in the reference genome sequence that we used to decode the natural target preferences of 22 families of transposable element in this species. We found that terminal inverted repeat transposon and long terminal repeat retrotransposon families present clade-specific target site duplications and target site sequence motifs. Additionally, we found that the sequence motifs at transposable element target sites are always palindromes that extend beyond the target site duplication. Our results demonstrate the utility of population genomics data for high-throughput inference of transposable element targeting preferences in the wild and establish general rules for terminal inverted repeat transposon and long terminal repeat retrotransposon target site selection in eukaryotic genomes.


Subject(s)
Drosophila melanogaster/genetics , Genome/genetics , Retroelements , Animals , Sequence Analysis , Species Specificity , Terminal Repeat Sequences
8.
PLoS Genet ; 8(12): e1003129, 2012.
Article in English | MEDLINE | ID: mdl-23284297

ABSTRACT

Wolbachia are maternally inherited symbiotic bacteria, commonly found in arthropods, which are able to manipulate the reproduction of their host in order to maximise their transmission. The evolutionary history of endosymbionts like Wolbachia can be revealed by integrating information on infection status in natural populations with patterns of sequence variation in Wolbachia and host mitochondrial genomes. Here we use whole-genome resequencing data from 290 lines of Drosophila melanogaster from North America, Europe, and Africa to predict Wolbachia infection status, estimate relative cytoplasmic genome copy number, and reconstruct Wolbachia and mitochondrial genome sequences. Overall, 63% of Drosophila strains were predicted to be infected with Wolbachia by our in silico analysis pipeline, which shows 99% concordance with infection status determined by diagnostic PCR. Complete Wolbachia and mitochondrial genomes show congruent phylogenies, consistent with strict vertical transmission through the maternal cytoplasm and imperfect transmission of Wolbachia. Bayesian phylogenetic analysis reveals that the most recent common ancestor of all Wolbachia and mitochondrial genomes in D. melanogaster dates to around 8,000 years ago. We find evidence for a recent global replacement of ancestral Wolbachia and mtDNA lineages, but our data suggest that the derived wMel lineage arose several thousand years ago, not in the 20th century as previously proposed. Our data also provide evidence that this global replacement event is incomplete and is likely to be one of several similar incomplete replacement events that have occurred since the out-of-Africa migration that allowed D. melanogaster to colonize worldwide habitats. This study provides a complete genomic analysis of the evolutionary mode and temporal dynamics of the D. melanogaster-Wolbachia symbiosis, as well as important resources for further analyses of the impact of Wolbachia on host biology.


Subject(s)
Drosophila melanogaster , Metagenomics , Symbiosis , Wolbachia , Animals , Bayes Theorem , Drosophila melanogaster/genetics , Drosophila melanogaster/physiology , Evolution, Molecular , Genetic Variation , Genome, Mitochondrial , Haplotypes , Phylogeny , Wolbachia/genetics , Wolbachia/physiology
9.
Nucleic Acids Res ; 36(19): 6199-208, 2008 Nov.
Article in English | MEDLINE | ID: mdl-18829720

ABSTRACT

Understanding the molecular mechanisms that influence transposable element target site preferences is a fundamental challenge in functional and evolutionary genomics. Large-scale transposon insertion projects provide excellent material to study target site preferences in the absence of confounding effects of post-insertion evolutionary change. Growing evidence from a wide variety of prokaryotes and eukaryotes indicates that DNA transposons recognize staggered-cut palindromic target site motifs (TSMs). Here, we use over 10 000 accurately mapped P-element insertions in the Drosophila melanogaster genome to test predictions of the staggered-cut palindromic target site model for DNA transposon insertion. We provide evidence that the P-element targets a 14-bp palindromic motif that can be identified at the primary sequence level, which predicts the local spacing, hotspots and strand orientation of P-element insertions. Intriguingly, we find that although the P-element destroys the complete 14-bp target site upon insertion, the terminal three nucleotides of the P-element inverted repeats complement and restore the original TSM, suggesting a mechanistic link between transposon target sites and their terminal inverted repeats. Finally, we discuss how the staggered-cut palindromic target site model can be used to assess the accuracy of genome mappings for annotated P-element insertions.
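
For readers unfamiliar with the term, a DNA sequence is palindromic when it equals its own reverse complement. The short sketch below shows that check only; it is a generic illustration with hypothetical function names, not the motif-detection method of the paper, and the test sequences are not the P-element motif.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def is_palindromic(seq):
    """True if the sequence reads the same on both strands (5' to 3')."""
    return seq.upper() == reverse_complement(seq).upper()

example_palindrome = is_palindromic("GAATTC")      # the EcoRI recognition site
example_non_palindrome = is_palindromic("GAATTA")
```

A real target site motif is usually degenerate, so in practice the comparison would be made on position-wise base frequencies rather than on a single exact string.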


Subject(s)
DNA Transposable Elements , Drosophila melanogaster/genetics , Models, Genetic , Animals , Chromosome Mapping , Genome, Insect , Genomics , Terminal Repeat Sequences