Results 1 - 12 of 12
1.
Bioinformatics ; 39(5)2023 05 04.
Article in English | MEDLINE | ID: mdl-36961337

ABSTRACT

MOTIVATION: Iso-Seq RNA long-read sequencing enables the identification of full-length transcripts and isoforms, removing the need for complex analysis such as transcriptome assembly. However, the raw sequencing data need to be processed in a series of steps before annotation is complete. Here, we present nf-core/isoseq, a pipeline for automatic read processing and genome annotation. Following nf-core guidelines, the pipeline has few dependencies and can be run on any platform. AVAILABILITY AND IMPLEMENTATION: The pipeline is freely available online on the nf-core website (https://nf-co.re/isoseq) and on GitHub (https://github.com/nf-core/isoseq) under the MIT License (DOI: 10.5281/zenodo.7116979).


Subject(s)
Alternative Splicing , Genome , Protein Isoforms/genetics , Sequence Analysis, RNA , Transcriptome , Molecular Sequence Annotation
2.
Genomics ; 112(2): 1660-1673, 2020 03.
Article in English | MEDLINE | ID: mdl-31669705

ABSTRACT

Efforts to elucidate the causes of biological differences between wild fowl and their domesticated relative, the chicken, have to date mainly focused on the identification of single nucleotide mutations. Other types of genomic variation have, however, been demonstrated to be important in avian evolution and associated with variations in phenotype. They include several types of sequences duplicated in tandem that can vary in their repetition number. Here we report on genome size differences between the red jungle fowl and several domestic chicken breeds and selected lines. Sequences duplicated in tandem such as rDNA, telomere repeats, satellite DNA and segmental duplications were found to have been significantly re-shaped during domestication and subsequently by human-mediated selection. We discuss the extent to which changes in genome organization that occurred during domestication agree with the hypothesis that domesticated animal genomes have been shaped by evolutionary forces aiming to adapt them to anthropized environments.


Subject(s)
Breeding , Chickens/genetics , Domestication , Genome Size , Polymorphism, Genetic , Animals , Centromere/genetics , Gene Duplication , RNA, Ribosomal/genetics , Tandem Repeat Sequences , Telomere/genetics
3.
BMC Genomics ; 17(1): 659, 2016 08 19.
Article in English | MEDLINE | ID: mdl-27542599

ABSTRACT

BACKGROUND: The program RepeatMasker and the database Repbase-ISB are part of the most widely used strategy for annotating repeats in animal genomes. They have been used to show that avian genomes have a lower repeat content (8-12 %) than the sequenced genomes of many vertebrate species (30-55 %). However, the efficiency of such library-based strategies depends on the quality and completeness of the sequences in the database that is used. An alternative to these library-based methods is to identify repeats de novo. These alternative methods have existed for at least a decade and may be more powerful than the library-based methods. We have used an annotation strategy involving several complementary de novo tools to determine the repeat content of the model genome galGal4 (1.04 Gbp), including identifying simple sequence repeats (SSRs), tandem repeats and transposable elements (TEs). RESULTS: We annotated over one Gbp of the galGal4 genome and showed that it is composed of approximately 19 % SSR and TE repeats. Furthermore, we estimate that the actual genome of the red jungle fowl contains about 31-35 % repeats. We find that library-based methods tend to overestimate TE diversity. These results have a major impact on the current understanding of repeat distributions throughout chromosomes in the red jungle fowl. CONCLUSIONS: Our results are a proof of concept of the reliability of using de novo tools to annotate repeats in large animal genomes. They have also revealed issues that will need to be resolved in order to develop gold-standard methodologies for annotating repeats in eukaryote genomes.


Subject(s)
Chickens/genetics , Genome , Genomics , Tandem Repeat Sequences , Animals , Chromosome Mapping , Computational Biology/methods , CpG Islands , DNA Transposable Elements , Data Mining , Genomics/methods , Microsatellite Repeats , Molecular Sequence Annotation , Software
4.
BMC Bioinformatics ; 17(1): 204, 2016 May 06.
Article in English | MEDLINE | ID: mdl-27153821

ABSTRACT

BACKGROUND: Several tools are available for visualizing genomic data. Some, such as GBrowse and JBrowse, are very efficient for small genomic regions, but they are not suitable for entire genomes. Others, like Phenogram and CViT, can be used to visualise whole genomes, but are not designed to display very dense genomic features (e.g., interspersed repeats). We have therefore developed DensityMap, a lightweight Perl program that can display the densities of several features (genes, ncRNA, CpG, etc.) along chromosomes on the scale of the whole genome. A critical advantage of DensityMap is that it uses GFF annotation files directly to compute the densities of features without needing additional information from the user. The resulting picture is readily configurable, and the colour scales used can be customized for a best fit to the data plotted. RESULTS: DensityMap runs on Linux architecture with few requirements so that users can easily and quickly visualize the distributions and densities of genomic features for an entire genome. The input is GFF3-formatted data representing chromosomes (linkage groups or pseudomolecules) and sets of features which are used to calculate representations in density maps. In practice, DensityMap uses a tiling window to compute the density of one or more features and the number of bases covered by these features along chromosomes. The densities are represented by colour scales that can be customized to highlight critical points. DensityMap can compare the distributions of features; it calculates several chromosomal density maps in a single image, each of which describes a different genomic feature. It can also use the genome nucleotide sequence to compute and plot a density map of the GC content along chromosomes. CONCLUSIONS: DensityMap is a compact, easy-to-use tool for displaying the distribution and density of all types of genomic features within a genome. It is flexible enough to visualize the densities of several types of features in a single representation. The images produced are readily configurable and their SVG format ensures that they can be edited.
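
The tiling-window idea described in this abstract can be illustrated with a minimal Python sketch. This is not DensityMap's code (the tool itself is written in Perl), and the function names `parse_gff3`, `window_density` and `gc_fraction` are hypothetical names chosen for this example. It assumes standard tab-separated GFF3 with 1-based inclusive coordinates and non-overlapping windows of fixed size, and it does not merge overlapping features, so covered bases can be counted more than once.

```python
from collections import defaultdict


def parse_gff3(lines):
    """Yield (seqid, feature_type, start, end) from GFF3 lines.

    Assumes tab-separated GFF3 with 1-based, inclusive coordinates.
    Comment lines, blank lines and malformed lines are skipped.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 5:
            continue
        yield cols[0], cols[2], int(cols[3]), int(cols[4])


def window_density(features, feature_type, window):
    """Tile each sequence into non-overlapping windows of `window` bases.

    Returns {(seqid, window_index): [feature_count, covered_bases]} for
    features of the requested type. Overlapping features are not merged
    (a simplification), so covered_bases may exceed the window size.
    """
    density = defaultdict(lambda: [0, 0])
    for seqid, ftype, start, end in features:
        if ftype != feature_type:
            continue
        first = (start - 1) // window  # index of window holding `start`
        last = (end - 1) // window     # index of window holding `end`
        for w in range(first, last + 1):
            w_start = w * window + 1
            w_end = (w + 1) * window
            overlap = min(end, w_end) - max(start, w_start) + 1
            density[(seqid, w)][0] += 1
            density[(seqid, w)][1] += overlap
    return dict(density)


def gc_fraction(sequence, window):
    """Return the GC fraction of each non-overlapping window of a sequence."""
    fractions = []
    for i in range(0, len(sequence), window):
        chunk = sequence[i:i + window].upper()
        gc = chunk.count("G") + chunk.count("C")
        fractions.append(gc / len(chunk))
    return fractions


if __name__ == "__main__":
    # Hypothetical two-gene annotation on a sequence named chr1.
    gff = [
        "##gff-version 3",
        "chr1\tdemo\tgene\t1\t10\t.\t+\t.\tID=g1",
        "chr1\tdemo\tgene\t8\t25\t.\t+\t.\tID=g2",
    ]
    for key, (count, bases) in sorted(window_density(parse_gff3(gff), "gene", 10).items()):
        print(key, count, bases)
    print(gc_fraction("GGCCAATT", 4))
```

Each window's count or covered-base total could then be mapped to a colour scale to draw one density track per feature type; that rendering step is left out of this sketch.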


Subject(s)
Drosophila melanogaster/genetics , Genome , Genomics/methods , Software , Animals , Base Composition/genetics , Exons/genetics , Genetic Linkage , Long Interspersed Nucleotide Elements/genetics , RNA, Untranslated/genetics , RNA, Untranslated/metabolism , Retroelements/genetics
5.
Genome Biol Evol ; 7(3): 735-49, 2015 Jan 29.
Article in English | MEDLINE | ID: mdl-25637221

ABSTRACT

We used nine complete genome sequences, from grape, poplar, Arabidopsis, soybean, lotus, apple, strawberry, cacao, and papaya, to investigate the paleohistory of rosid crops. We characterized an ancestral rosid karyotype, structured into 7/21 protochromosomes, with a minimal set of 6,250 ordered protogenes and a minimum physical coding gene space of 50 megabases. We also proposed ancestral karyotypes for the Caricaceae, Brassicaceae, Malvaceae, Fabaceae, Rosaceae, Salicaceae, and Vitaceae families with 9, 8, 10, 6, 12, 9, 12, and 19 protochromosomes, respectively. On the basis of these ancestral karyotypes and present-day species comparisons, we proposed a two-step evolutionary scenario based on allohexaploidization involving the newly characterized A, B, and C diploid progenitors leading to dominant (stable) and sensitive (plastic) genomic compartments in any modern rosid crops. Finally, a new user-friendly online tool, "DicotSyntenyViewer" (available from http://urgi.versailles.inra.fr/synteny-dicot), has been made available for accurate translational genomics in rosids.


Subject(s)
Crops, Agricultural/genetics , Evolution, Molecular , Gene Order , Genome, Plant , Karyotype , Magnoliopsida/genetics , Chromosomes, Plant , Gene Duplication , Genomics , Magnoliopsida/classification , Phylogeny , Polyploidy , Synteny
6.
Stand Genomic Sci ; 9(3): 940-7, 2014 Jun 15.
Article in English | MEDLINE | ID: mdl-25197475

ABSTRACT

Members of the family Iridoviridae are animal viruses that infect only invertebrates and poikilothermic vertebrates. The invertebrate iridoviruses 22 (IIV22) and 25 (IIV25) were originally isolated from a single sample of blackfly larva (Simulium spp., order Diptera) collected from the Ystwyth river near Aberystwyth, Wales. Recently, the genomes of IIV22 (197.7 kbp) and IIV25 (204.8 kbp) were sequenced and reported. Here, we describe the complete genome sequence of IIV22A, a variant that was isolated from the same pool of virions collected from the blackfly larva from which the IIV22 virion genome originated. The IIV22A genome, 196.5 kbp, is smaller than IIV22. Nevertheless, it contains 7 supplementary putative ORFs. Its analysis enables evaluation of the degree of genomic polymorphisms within an IIV isolate. Despite the occurrence of this IIV variant with IIV22 and IIV25 in a single blackfly larva and the features of their DNA polymerase, we found no evidence of lateral genetic transfers between the genomes of these two IIV species.

7.
J Gen Virol ; 95(Pt 7): 1585-1590, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24722681

ABSTRACT

Members of the family Iridoviridae are animal viruses that infect only invertebrates and poikilothermic vertebrates. The invertebrate iridovirus 31 (IIV31) was originally isolated from adult pill bugs, Armadillidium vulgare (class Crustacea, order Isopoda, suborder Oniscidea), found in southern California on the campus of the University of California, Riverside, USA. IIV31 virions are icosahedral, have a diameter of about 135 nm, and contain a dsDNA genome 220.222 kbp in length, with 35.09 mol % G+C content and 203 ORFs. Here, we describe the complete genome sequence of this virus and its annotation. This is the eighth genome sequence of an IIV reported.


Subject(s)
DNA, Viral/chemistry , DNA, Viral/genetics , Genome, Viral , Iridovirus/classification , Iridovirus/genetics , Isopoda/virology , Animals , Base Composition , California , Iridovirus/isolation & purification , Iridovirus/ultrastructure , Microscopy, Electron, Transmission , Molecular Sequence Data , Open Reading Frames , Sequence Analysis, DNA , Virion/ultrastructure
8.
J Invertebr Pathol ; 116: 43-7, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24394746

ABSTRACT

Members of the family Iridoviridae are animal viruses that infect only invertebrates and poikilothermic vertebrates. The invertebrate iridovirus 30 (IIV30) was originally isolated from a larva of the corn earworm, Helicoverpa zea (order Lepidoptera, family Noctuidae) in Western Australia. The IIV30 virions are icosahedral, have a diameter of about 130 nm, and contain a dsDNA genome of 198.5 kbp with a GC content of 28.11% and 177 coding sequences. Here we describe its complete genome sequence and annotate the genes for which we could assign a putative function. This is the sixth genome sequence of an invertebrate iridovirus reported.


Subject(s)
Genome, Viral , Iridovirus/genetics , Moths/virology , Animals , Base Sequence , Chromosome Mapping , Iridovirus/isolation & purification , Molecular Sequence Data , Sequence Analysis, DNA
9.
Arch Virol ; 159(5): 1181-5, 2014 May.
Article in English | MEDLINE | ID: mdl-24232916

ABSTRACT

Members of the family Iridoviridae are animal viruses that infect only invertebrates and poikilothermic vertebrates. Invertebrate iridovirus 25 (IIV-25) was originally isolated from the larva of a blackfly (Simulium spp., order Diptera) found in the Ystwyth river near Aberystwyth, Wales. IIV-25 virions are icosahedral, have a diameter of ~130 nm, and contain a dsDNA genome of 204.8 kbp, with a G+C content of 30.32 %, that codes for 177 proteins. Here, we describe the complete genome sequence of this virus and its annotation. This is the fifth genome sequence of an invertebrate iridovirus reported.


Subject(s)
Diptera/virology , Genome, Viral , Iridovirus/genetics , Iridovirus/isolation & purification , Animals , Gene Expression Regulation, Viral , Larva/virology
10.
Genome Biol Evol ; 6(1): 12-33, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24317974

ABSTRACT

Modern plant genomes are diploidized paleopolyploids. We revisited grass genome paleohistory in response to the diploidization process through a detailed investigation of the evolutionary fate of duplicated blocks. Ancestrally duplicated genes can be conserved, deleted, and shuffled, defining dominant (bias toward duplicate retention) and sensitive (bias toward duplicate erosion) chromosomal fragments. We propose a new grass genome paleohistory deriving from an ancestral karyotype structured in seven protochromosomes containing 16,464 protogenes and following evolutionary rules where 1) ancestral shared polyploidizations shaped conserved dominant (D) and sensitive (S) subgenomes, 2) subgenome dominance is revealed by both gene deletion and shuffling from the S blocks, 3) duplicate deletion/movement may have been mediated by single-/double-stranded illegitimate recombination mechanisms, 4) modern genomes arose through centromeric fusion of protochromosomes, leading to functional monocentric neochromosomes, 5) the fusion of two dominant blocks leads to supradominant neochromosomes (D + D = D) with higher ancestral gene retention compared with D + S = D (i.e., fusion of blocks with opposite sensitivity) or even S + S = S (i.e., fusion of two sensitive ancestral blocks). A new user-friendly online tool named "PlantSyntenyViewer," available at http://urgi.versailles.inra.fr/synteny-cereal, presents the refined comparative genomics data.


Subject(s)
Evolution, Molecular , Genes, Dominant , Genes, Plant , Poaceae/genetics , Polyploidy , Software , Chromosomes, Plant , Gene Deletion , Gene Duplication , Genomics/methods , Karyotype , Phylogeny , Recombination, Genetic
11.
Plant J ; 76(6): 1030-44, 2013 Dec.
Article in English | MEDLINE | ID: mdl-24164652

ABSTRACT

Bread wheat derives from a grass ancestor structured in seven protochromosomes, followed by a paleotetraploidization to reach a 12-chromosome intermediate and a neohexaploidization event (involving subgenomes A, B and D) that finally shaped the 21 modern chromosomes. Insights into the wheat syntenome from sequencing conserved orthologous set (COS) genes unravelled differences in genomic structure (such as gene conservation and diversity) and genetic landscape (such as recombination pattern) between ancestral as well as recent duplicated blocks. Contrasted evolutionary plasticity is observed where the B subgenome appears more sensitive (i.e. plastic) in contrast to A as dominant (i.e. stable) in response to the neotetraploidization and D subgenome as supra-dominant (i.e. pivotal) in response to the neohexaploidization event. Finally, the wheat syntenome, delivered through a public web interface PlantSyntenyViewer at http://urgi.versailles.inra.fr/synteny-wheat, can be considered as a guide for accelerated dissection of major agronomical traits in wheat.


Subject(s)
Chromosomes, Plant/genetics , Evolution, Molecular , Genome, Plant/genetics , Genomics , Synteny/genetics , Triticum/genetics , Conserved Sequence , DNA, Plant/chemistry , DNA, Plant/genetics , Genes, Dominant , Genetic Markers , Models, Biological , Polymorphism, Single Nucleotide , Polyploidy , Sequence Analysis, DNA
12.
J Gen Virol ; 94(Pt 9): 2112-2116, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23804567

ABSTRACT

Members of the family Iridoviridae are animal viruses that infect only invertebrates and poikilothermic vertebrates. Invertebrate iridescent virus 22 (IIV-22) was originally isolated from the larva of a blackfly (Simulium sp., order Diptera) found in the Ystwyth river, near Aberystwyth, Wales, UK. IIV-22 virions are icosahedral, with a diameter of about 130 nm and contain a dsDNA genome that is 197.7 kb in length, has a G+C content of 28.05 mol% and contains 167 coding sequences. Here, we describe the complete genome sequence of this virus and its annotation. This is the fourth genome sequence of an invertebrate iridovirus to be reported.


Subject(s)
DNA, Viral/chemistry , DNA, Viral/genetics , Genome, Viral , Iridovirus/genetics , Simuliidae/virology , Animals , Base Composition , Base Sequence , Iridovirus/isolation & purification , Larva/virology , Molecular Sequence Data , Sequence Analysis, DNA , Wales