Results 1 - 19 of 19
1.
Mol Biol Evol ; 41(4)2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38577785

ABSTRACT

Transposable elements (TEs) are major components of eukaryotic genomes and are implicated in a range of evolutionary processes. Yet, TE annotation and characterization remain challenging, particularly for nonspecialists, since existing pipelines are typically complicated to install, run, and extract data from. Current methods of automated TE annotation are also subject to issues that reduce overall quality, particularly (i) fragmented and overlapping TE annotations, leading to erroneous estimates of TE count and coverage, and (ii) repeat models represented by short sections of total TE length, with poor capture of 5' and 3' ends. To address these issues, we present Earl Grey, a fully automated TE annotation pipeline designed for user-friendly curation and annotation of TEs in eukaryotic genome assemblies. Using nine simulated genomes and an annotation of Drosophila melanogaster, we show that Earl Grey outperforms current widely used TE annotation methodologies in ameliorating the issues mentioned above while scoring highly in benchmarking for TE annotation and classification and being robust across genomic contexts. Earl Grey provides a comprehensive and fully automated TE annotation toolkit that provides researchers with paper-ready summary figures and outputs in standard formats compatible with other bioinformatics tools. Earl Grey has a modular format, with great scope for the inclusion of additional modules focused on further quality control and tailored analyses in future releases.


Subject(s)
DNA Transposable Elements , Drosophila melanogaster , Animals , DNA Transposable Elements/genetics , Molecular Sequence Annotation , Drosophila melanogaster/genetics , Genomics/methods , Computational Biology
2.
Genome Biol Evol ; 16(4)2024 04 02.
Article in English | MEDLINE | ID: mdl-38489588

ABSTRACT

Comprehensive characterization of structural variation in natural populations has only become feasible in the last decade. To investigate the population genomic nature of structural variation, reproducible and high-confidence structural variation callsets are first required. We created a population-scale reference of the genome-wide landscape of structural variation across 33 Nordic house sparrows (Passer domesticus). To produce a consensus callset across all samples using short-read data, we compare heuristic-based quality filtering and visual curation (Samplot/PlotCritic and Samplot-ML) approaches. We demonstrate that curation of structural variants is important for reducing putative false positives and that the time invested in this step outweighs the potential costs of analyzing short-read-discovered structural variation data sets that include many potential false positives. We find that even a lenient manual curation strategy (e.g. applied by a single curator) can reduce the proportion of putative false positives by up to 80%, thus enriching the proportion of high-confidence variants. Crucially, in applying a lenient manual curation strategy with a single curator, nearly all (>99%) variants rejected as putative false positives were also classified as such by a more stringent curation strategy using three additional curators. Furthermore, variants rejected by manual curation failed to reflect the expected population structure from SNPs, whereas variants passing curation did. Combining heuristic-based quality filtering with rapid manual curation of structural variants in short-read data can therefore become a time- and cost-effective first step for functional and population genomic studies requiring high-confidence structural variation callsets.


Subject(s)
Genome , Genomics , Metagenomics , Polymorphism, Single Nucleotide
3.
BMC Res Notes ; 16(1): 335, 2023 Nov 16.
Article in English | MEDLINE | ID: mdl-37974222

ABSTRACT

OBJECTIVES: High-quality species-specific transposable element (TE) libraries are required for studies to elucidate the evolutionary dynamics of TEs and gain an understanding of their impacts on host genomes. Such high-quality TE resources are severely lacking for species in the fungal kingdom. To facilitate future studies on the putative role of TEs in rapid adaptation observed in the fungal wheat pathogen Zymoseptoria tritici, we produced a manually curated TE library. This was generated by detecting TEs in 19 reference genome assemblies representing the global diversity of the species supplemented by multiple sister species genomes. Improvements over previous TE libraries have been made on TE boundary resolution, detection of ORFs, TE domains, terminal inverted repeats, and class-specific motifs. DATA DESCRIPTION: A TE consensus library for Z. tritici formatted for use with RepeatMasker. This data is relevant to other researchers investigating TE-host evolutionary dynamics in Z. tritici or who are interested in comparative studies of the fungal kingdom. Further, this TE library can be used to improve gene annotation. Finally, this TE library increases the number of manually curated TE datasets, providing resources to further our understanding of TE diversity.


Subject(s)
Ascomycota , DNA Transposable Elements , DNA Transposable Elements/genetics , Ascomycota/genetics , Molecular Sequence Annotation , Gene Library
4.
Genome Res ; 33(10): 1718-1733, 2023 10.
Article in English | MEDLINE | ID: mdl-37852781

ABSTRACT

The evolution of resistance is a major challenge for the sustainable control of pests and pathogens. Thus, a deeper understanding of the evolutionary and genomic mechanisms underpinning resistance evolution is required to safeguard health and food production. Several studies have implicated transposable elements (TEs) in xenobiotic-resistance evolution in insects. However, analyses are generally restricted to one insect species and/or one or a few xenobiotic gene families (XGFs). We examine evidence for TE accumulation at XGFs by performing a comparative genomic analysis across 20 aphid genomes, considering major subsets of XGFs involved in metabolic resistance to insecticides: cytochrome P450s, glutathione S-transferases, esterases, UDP-glucuronosyltransferases, and ABC transporters. We find that TEs are significantly enriched at XGFs compared with other genes. XGFs show similar levels of TE enrichment to those of housekeeping genes. But unlike housekeeping genes, XGFs are not constitutively expressed in germline cells, supporting the selective enrichment of TEs at XGFs rather than enrichment owing to chromatin availability. Hotspots of extreme TE enrichment occur around certain XGFs. We find, in aphids of agricultural importance, particular enrichment of TEs around cytochrome P450 genes with known functions in the detoxification of synthetic insecticides. Our results provide evidence supporting a general role for TEs as a source of genomic variation at host XGFs and highlight the existence of considerable variability in TE content across XGFs and host species. These findings show the need for detailed functional verification analyses to clarify the significance of individual TE insertions and elucidate underlying mechanisms at TE-XGF hotspots.


Subject(s)
Aphids , Insecticides , Animals , Aphids/genetics , Xenobiotics , DNA Transposable Elements/genetics , Genomics
5.
Mol Biol Evol ; 40(5)2023 05 02.
Article in English | MEDLINE | ID: mdl-37183864

ABSTRACT

Chromosome-scale genome assemblies based on ultralong-read sequencing technologies are able to illuminate previously intractable aspects of genome biology such as fine-scale centromere structure and large-scale variation in genome features such as heterochromatin, GC content, recombination rate, and gene content. We present here a new chromosome-scale genome of the Mongolian gerbil (Meriones unguiculatus), which includes the complete sequence of all centromeres. Gerbils are thus one of the first vertebrates to have their centromeres completely sequenced. Gerbil centromeres are composed of four different repeats of length 6, 37, 127, or 1,747 bp, which occur in simple alternating arrays and span 1-6 Mb. Gerbil genomes have both an extensive set of GC-rich genes and chromosomes strikingly enriched for constitutive heterochromatin. We sought to determine if there was a link between these two phenomena and found that the two heterochromatic chromosomes of the Mongolian gerbil have distinct underpinnings: Chromosome 5 has a large block of intraarm heterochromatin as the result of a massive expansion of centromeric repeats, while chromosome 13 is comprised of extremely large (>150 kb) repeated sequences. In addition to characterizing centromeres, our results demonstrate the importance of including karyotypic features such as chromosome number and the locations of centromeres in the interpretation of genome sequence data and highlight novel patterns involved in the evolution of chromosomes.


Subject(s)
Centromere , Heterochromatin , Animals , Gerbillinae/genetics , Heterochromatin/genetics , Centromere/genetics , Genome , Repetitive Sequences, Nucleic Acid
6.
G3 (Bethesda) ; 12(9)2022 08 25.
Article in English | MEDLINE | ID: mdl-35929795

ABSTRACT

The scarce swallowtail, Iphiclides podalirius (Linnaeus, 1758), is a species of butterfly in the family Papilionidae. Here, we present a chromosome-level genome assembly for Iphiclides podalirius as well as gene and transposable element annotations. We investigate how the density of genomic features differs between the 30 Iphiclides podalirius chromosomes. We find that shorter chromosomes have higher heterozygosity at four-fold-degenerate sites and a greater density of transposable elements. While the first result is an expected consequence of differences in recombination rate, the second suggests a counter-intuitive relationship between recombination and transposable element evolution. This high-quality genome assembly, the first for any species in the tribe Leptocircini, will be a valuable resource for population genomics in the genus Iphiclides and comparative genomics more generally.


Subject(s)
Butterflies , Animals , Butterflies/genetics , DNA Transposable Elements/genetics , Genomics , Molecular Sequence Annotation
7.
Mol Ecol ; 31(16): 4332-4350, 2022 08.
Article in English | MEDLINE | ID: mdl-35801824

ABSTRACT

Insects are capable of extraordinary feats of long-distance movement that have profound impacts on the function of terrestrial ecosystems. The ability to undertake these movements arose multiple times through the evolution of a suite of traits that make up the migratory syndrome; however, the underlying genetic pathways involved remain poorly understood. Migratory hoverflies (Diptera: Syrphidae) are an emerging model group for studies of migration. They undertake seasonal movements in huge numbers across large parts of the globe and are important pollinators, biological control agents and decomposers. Here, we assembled a high-quality draft genome of the marmalade hoverfly (Episyrphus balteatus). We leveraged this genomic resource to undertake a genome-wide transcriptomic comparison of actively migrating Episyrphus, captured from a high mountain pass as they flew south to overwinter, with the transcriptomes of summer forms which were non-migratory. We identified 1543 genes with very strong evidence for differential expression. Interrogation of this gene set reveals a remarkable range of roles in metabolism, muscle structure and function, hormonal regulation, immunity, stress resistance, flight and feeding behaviour, longevity, reproductive diapause and sensory perception. These features of the migrant phenotype have arisen by the integration and modification of pathways such as insulin signalling for diapause and longevity, JAK/STAT for immunity, and those leading to octopamine production and fuelling to boost flight capabilities. Our results provide a powerful genomic resource for future research, and paint a comprehensive picture of global expression changes in an actively migrating insect, identifying key genomic components involved in this important life-history strategy.


Subject(s)
Diptera , Transcriptome , Animal Migration , Animals , Diptera/genetics , Ecosystem , Insecta/genetics , Phenotype , Transcriptome/genetics
8.
Genomics ; 114(4): 110440, 2022 07.
Article in English | MEDLINE | ID: mdl-35905835

ABSTRACT

The moth Heortia vitessoides Moore (Lepidoptera: Crambidae) is a major pest of ecologically, commercially and culturally important agarwood-producing trees in the genus Aquilaria. In particular, H. vitessoides is one of the most destructive defoliating pests of the incense tree Aquilaria sinensis, which produces a valuable fragrant wood used as incense and in traditional Chinese medicine [33]. Nevertheless, a genomic resource for H. vitessoides is lacking. Here, we present a chromosomal-level assembly for H. vitessoides, consisting of a 517 megabase (Mb) genome assembly with high physical contiguity (scaffold N50 of 18.2 Mb) and high completeness (97.9% complete BUSCO score). To aid gene annotation, 8 messenger RNA transcriptomes from different developmental stages were generated, and a total of 16,421 gene models were predicted. Expansions of gene families involved in xenobiotic metabolism and development were detected, including duplications of cytosolic sulfotransferase (SULT) genes shared among lepidopterans. In addition, small RNA sequencing of 5 developmental stages of H. vitessoides facilitated the identification of 85 lepidopteran conserved microRNAs, 94 lineage-specific microRNAs, as well as several microRNA clusters. A large proportion of the H. vitessoides genome consists of repeats, with a 29.12% total genomic contribution from transposable elements, of which long interspersed nuclear elements (LINEs) are the dominant component (17.41%). A sharp decrease in the genome-wide percentage of LINEs with lower levels of genetic distance to family consensus sequences suggests that LINE activity has peaked in H. vitessoides. In contrast, opposing patterns suggest a substantial recent increase in DNA and LTR element activity. Together with annotations of essential sesquiterpenoid hormonal pathways, neuropeptides, microRNAs and transposable elements, the high-quality genomic and transcriptomic resources we provide for the economically important moth H. vitessoides provide a platform for the development of genomic approaches to pest management, and contribute to addressing fundamental research questions in Lepidoptera.


Subject(s)
Lepidoptera , MicroRNAs , Moths , Animals , DNA Transposable Elements , Lepidoptera/genetics , Moths/genetics , Trees/genetics
9.
Nat Commun ; 13(1): 3010, 2022 05 30.
Article in English | MEDLINE | ID: mdl-35637228

ABSTRACT

Animals display a fascinating diversity of body plans. Correspondingly, genomic analyses have revealed dynamic evolution of gene gains and losses among animal lineages. Here we sequence six new myriapod genomes (three millipedes, three centipedes) at key phylogenetic positions within this major but understudied arthropod lineage. We combine these with existing genomic resources to conduct a comparative analysis across all available myriapod genomes. We find that millipedes generally have considerably smaller genomes than centipedes, with the repeatome being a major contributor to genome size, driven by independent large gains of transposons in three centipede species. In contrast to millipedes, centipedes gained a large number of gene families after the subphyla diverged, with gains contributing to sensory and locomotory adaptations that facilitated their ecological shift to predation. We identify distinct horizontal gene transfer (HGT) events from bacteria to millipedes and centipedes, with no identifiable HGTs shared among all myriapods. Loss of juvenile hormone O-methyltransferase, a key enzyme in catalysing sesquiterpenoid hormone production in arthropods, was also revealed in all millipede lineages. Our findings suggest that the rapid evolution of distinct genomic pathways in centipede and millipede lineages following their divergence from the myriapod ancestor was shaped by differing ecological pressures.


Subject(s)
Arthropods , Gene Transfer, Horizontal , Animals , Arthropods/genetics , Chilopoda , Genome/genetics , Phylogeny
10.
G3 (Bethesda) ; 12(6)2022 05 30.
Article in English | MEDLINE | ID: mdl-35348678

ABSTRACT

The lesser marbled fritillary, Brenthis ino (Rottemburg, 1775), is a species of Palearctic butterfly. Male Brenthis ino individuals have been reported to have between 12 and 14 pairs of chromosomes, a much lower chromosome number than is typical in butterflies. Here, we present a chromosome-level genome assembly for Brenthis ino, as well as gene and transposable element annotations. The assembly is 411.8 Mb in length with a contig N50 of 9.6 Mb and a scaffold N50 of 29.5 Mb. We also show evidence that the male individual from which we generated HiC data was heterozygous for a neo-Z chromosome, consistent with inheriting 14 chromosomes from one parent and 13 from the other. This genome assembly will be a valuable resource for studying chromosome evolution in Lepidoptera, as well as for comparative and population genomics more generally.


Subject(s)
Butterflies , Fritillaria , Animals , Butterflies/genetics , Chromosomes/genetics , Fritillaria/genetics , Genome , Male , Molecular Sequence Annotation , Sex Chromosomes
11.
Mob DNA ; 13(1): 5, 2022 Feb 16.
Article in English | MEDLINE | ID: mdl-35172896

ABSTRACT

BACKGROUND: Lepidoptera (butterflies and moths) are an important model system in ecology and evolution. A high-quality chromosomal genome assembly is available for the monarch butterfly (Danaus plexippus), but it lacks an in-depth transposable element (TE) annotation, presenting an opportunity to explore monarch TE dynamics and the impact of TEs on shaping the monarch genome. RESULTS: We find 6.21% of the monarch genome is comprised of TEs, a reduction of 6.85% compared to the original TE annotation performed on the draft genome assembly. Monarch TE content is low compared to two closely related species with available genomes, Danaus chrysippus (33.97% TE) and Danaus melanippus (11.87% TE). The biggest TE contributions to genome size in the monarch are LINEs and Penelope-like elements, and three newly identified families, r2-hero_dPle (LINE), penelope-1_dPle (Penelope-like), and hase2-1_dPle (SINE), collectively contribute 34.92% of total TE content. We find evidence of recent TE activity, with two novel Tc1 families rapidly expanding over recent timescales (tc1-1_dPle, tc1-2_dPle). LINE fragments show signatures of genomic deletions indicating a high rate of TE turnover. We investigate associations between TEs and wing colouration and immune genes and identify a three-fold increase in TE content around immune genes compared to other host genes. CONCLUSIONS: We provide a detailed TE annotation and analysis for the monarch genome, revealing a considerably smaller TE contribution to genome content compared to two closely related Danaus species with available genome assemblies. We identify highly successful novel DNA TE families rapidly expanding over recent timescales, and ongoing signatures of both TE expansion and removal highlight the dynamic nature of repeat content in the monarch genome. Our findings also suggest that insect immune genes are promising candidates for future interrogation of TE-mediated host adaptation.

12.
Commun Biol ; 4(1): 83, 2021 01 19.
Article in English | MEDLINE | ID: mdl-33469163

ABSTRACT

Whole genome duplication (WGD) has occurred in relatively few sexually reproducing invertebrates. Consequently, the WGD that occurred in the common ancestor of horseshoe crabs ~135 million years ago provides a rare opportunity to decipher the evolutionary consequences of a duplicated invertebrate genome. Here, we present a high-quality genome assembly for the mangrove horseshoe crab Carcinoscorpius rotundicauda (1.7 Gb, N50 = 90.2 Mb, with 89.8% sequences anchored to 16 pseudomolecules, 2n = 32), and a resequenced genome of the tri-spine horseshoe crab Tachypleus tridentatus (1.7 Gb, N50 = 109.7 Mb). Analyses of gene families, microRNAs, and synteny show that horseshoe crabs have undergone three rounds (3R) of WGD. Comparison of C. rotundicauda and T. tridentatus genomes from populations from several geographic locations further elucidates the diverse fates of both coding and noncoding genes. Together, the present study represents a cornerstone for improving our understanding of invertebrate WGD events on the evolutionary fates of genes and microRNAs, at both the individual and population level. We also provide improved genomic resources for horseshoe crabs, of applied value for breeding programs and conservation of this fascinating and unusual invertebrate lineage.


Subject(s)
Gene Duplication/genetics , Horseshoe Crabs/genetics , MicroRNAs/genetics , Animals , Evolution, Molecular , Genome/genetics , Genomics , Phylogeny
13.
Wellcome Open Res ; 6: 304, 2021.
Article in English | MEDLINE | ID: mdl-35136843

ABSTRACT

We present a genome assembly from an individual female Melitaea athalia (also known as Mellicta athalia; the heath fritillary; Arthropoda; Insecta; Lepidoptera; Nymphalidae). The genome sequence is 610 megabases in span. In total, 99.98% of the assembly is scaffolded into 32 chromosomal pseudomolecules, with the W and Z sex chromosomes assembled. Gene annotation of this assembly on Ensembl has identified 12,824 protein coding genes.

14.
BMC Genomics ; 21(1): 726, 2020 Oct 19.
Article in English | MEDLINE | ID: mdl-33076831

ABSTRACT

BACKGROUND: Teleost fish play important roles in aquatic ecosystems and aquaculture. Threadfins (Perciformes: Polynemidae) show a range of interesting biology, and are of considerable importance for both wild fisheries and aquaculture. Additionally, the four-finger threadfin Eleutheronema tetradactylum is of conservation relevance since its populations are considered to be in rapid decline and it is classified as endangered. However, no genomic resources are currently available for the threadfin family Polynemidae. RESULTS: We sequenced and assembled the first threadfin fish genome, the four-finger threadfin E. tetradactylum. We provide a genome assembly for E. tetradactylum with high contiguity (scaffold N50 = 56.3 kb) and high BUSCO completeness at 96.5%. The assembled genome size of E. tetradactylum is just 610.5 Mb, making it the second smallest perciform genome assembled to date. Just 9.07-10.91% of the genome sequence was found to consist of repetitive elements (standard RepeatMasker analysis vs custom analysis), making this the lowest repeat content identified to date for any perciform fish. A total of 37,683 protein-coding genes were annotated, and we include analyses of developmental transcription factors, including the Hox, ParaHox, and Sox families. MicroRNA genes were also annotated and compared with other chordate lineages, elucidating the gains and losses of chordate microRNAs. CONCLUSIONS: The four-finger threadfin E. tetradactylum genome presented here represents the first available genome sequence for the ecologically, biologically, and commercially important clade of threadfin fish. Our findings provide a useful genomic resource for future research into the interesting biology and evolution of this valuable group of food fish.


Subject(s)
Genome , Perciformes , Animals , Perciformes/genetics
15.
BMC Genomics ; 21(1): 713, 2020 Oct 15.
Article in English | MEDLINE | ID: mdl-33059600

ABSTRACT

BACKGROUND: Homeobox-containing genes encode crucial transcription factors involved in animal, plant and fungal development, and changes to homeobox genes have been linked to the evolution of novel body plans and morphologies. In animals, some homeobox genes are clustered together in the genome, either as remnants from ancestral genomic arrangements, or due to coordinated gene regulation. Consequently, analyses of homeobox gene organization across animal phylogeny provide important insights into the evolution of genome organization and developmental gene control, and their interaction. However, homeobox gene organization remains to be fully elucidated in several key animal ancestors, including those of molluscs, lophotrochozoans and bilaterians. RESULTS: Here, we present a high-quality chromosome-level genome assembly of the Hong Kong oyster, Magallana hongkongensis (2n = 20), for which 93.2% of the genomic sequences are contained on 10 pseudomolecules (~758 Mb, scaffold N50 = 72.3 Mb). Our genome assembly was scaffolded using Hi-C reads, facilitating a larger scaffold size compared to the recently published M. hongkongensis genome of Peng et al. (Mol Ecol Resources, 2020), which was scaffolded using the Crassostrea gigas assembly. A total of 46,963 predicted gene models (45,308 protein coding genes) were incorporated in our genome, and genome completeness estimated by BUSCO was 94.6%. Homeobox gene linkages were analysed in detail relative to available data for other mollusc lineages. CONCLUSIONS: The analyses performed in this study and the accompanying genome sequence provide important genetic resources for this economically and culturally valuable oyster species, and offer a platform to improve understanding of animal biology and evolution more generally. Transposable element content is comparable to that found in other mollusc species, contrary to the conclusion of another recent analysis. Also, our chromosome-level assembly allows the inference of ancient gene linkages (synteny) for the homeobox-containing genes, even though a number of the homeobox gene clusters, like the Hox/ParaHox clusters, are undergoing dispersal in molluscs such as this oyster.


Subject(s)
Genes, Homeobox , Ostreidae , Animals , Genes, Homeobox/genetics , Genome , Ostreidae/genetics , Phylogeny , Synteny
16.
PLoS Biol ; 18(9): e3000636, 2020 09.
Article in English | MEDLINE | ID: mdl-32991578

ABSTRACT

The Myriapoda, composed of millipedes and centipedes, is a fascinating but poorly understood branch of life, including species with a highly unusual body plan and a range of unique adaptations to their environment. Here, we sequenced and assembled 2 chromosomal-level genomes of the millipedes Helicorthomorpha holstii (assembly size = 182 Mb; shortest scaffold/contig length needed to cover 50% of the genome [N50] = 18.11 Mb mainly on 8 pseudomolecules) and Trigoniulus corallinus (assembly size = 449 Mb, N50 = 26.78 Mb mainly on 17 pseudomolecules). Unique genomic features, patterns of gene regulation, and defence systems in millipedes, not observed in other arthropods, are revealed. Both repeat content and intron size are major contributors to the observed differences in millipede genome size. Tight Hox and the first loose ecdysozoan ParaHox homeobox clusters are identified, and a myriapod-specific genomic rearrangement including Hox3 is also observed. The Argonaute (AGO) proteins for loading small RNAs are duplicated in both millipedes, but unlike in insects, an AGO duplicate has become a pseudogene. Evidence of post-transcriptional modification in small RNAs, including species-specific microRNA arm switching, providing differential gene regulation is also obtained. Millipedes possess a unique ozadene defensive gland unlike the venomous forcipules found in centipedes. We identify sets of genes associated with the ozadene that play roles in chemical defence as well as antimicrobial activity. Macro-synteny analyses revealed highly conserved genomic blocks between the 2 millipedes and deuterostomes. Collectively, our analyses of millipede genomes reveal that a series of unique adaptations have occurred in this major lineage of arthropod diversity. The 2 high-quality millipede genomes provided here shed new light on the conserved and lineage-specific features of millipedes and centipedes. These findings demonstrate the importance of the consideration of both centipede and millipede genomes, and in particular the reconstruction of the myriapod ancestral situation, for future research to improve understanding of arthropod evolution, and animal evolutionary genomics more widely.


Subject(s)
Adaptation, Biological/genetics , Arthropods , Evolution, Molecular , Genome/genetics , Animals , Arthropods/classification , Arthropods/genetics , Base Sequence , DNA Transposable Elements/genetics , Genes, Homeobox , Genome, Insect , Insecta/classification , Insecta/genetics , MicroRNAs/genetics , Phylogeny , Synteny
17.
Mob DNA ; 11: 21, 2020.
Article in English | MEDLINE | ID: mdl-32612713

ABSTRACT

BACKGROUND: Tc1/mariner transposons are widespread DNA transposable elements (TEs) that have made important contributions to the evolution of host genomic complexity in metazoans. However, the evolution and diversity of the Tc1/mariner superfamily remains poorly understood. Following recent developments in genome sequencing and the availability of a wealth of new genomes, Tc1/mariner TEs have been identified in many new taxa across the eukaryotic tree of life. To date, the majority of studies focussing on Tc1/mariner elements have considered only a single host lineage or just a small number of host lineages. Thus, much remains to be learnt about the evolution of Tc1/mariner TEs by performing analyses that consider elements that originate from across host diversity. RESULTS: We mined the non-redundant database of NCBI using BLASTp searches, with transposase sequences from a diverse set of reference Tc1/mariner elements as queries. A total of 5158 Tc1/mariner elements were retrieved and used to reconstruct evolutionary relationships within the superfamily. The resulting phylogeny is well resolved and includes several new groups of Tc1/mariner elements. In particular, we identify a new family of plant-genome restricted Tc1/mariner elements, which we call PlantMar. We also show that the pogo family is much larger and more diverse than previously appreciated, and we review evidence for a potential revision of its status to become a separate superfamily. CONCLUSIONS: Our study provides an overview of Tc1/mariner phylogeny and summarises the impressive diversity of Tc1/mariner TEs among sequenced eukaryotes. Tc1/mariner TEs are successful in a wide range of eukaryotes, especially unikonts (the taxonomic supergroup containing Amoebozoa, Opisthokonta, Breviatea, and Apusomonadida). In particular, Ecdysozoa, and especially arthropods, emerge as important hosts for Tc1/mariner elements (except the PlantMar family). Meanwhile, the pogo family, which is by far the largest Tc1/mariner family, also includes many elements from fungal and chordate genomes. Moreover, there is evidence of the repeated exaptation of pogo elements in vertebrates, including humans, in addition to the well-known example of CENP-B. Collectively, our findings provide a considerable advancement in understanding of Tc1/mariner elements, and more generally they suggest that much work remains to improve understanding of the diversity and evolution of DNA TEs.

18.
Mol Ecol Resour ; 20(4): 971-979, 2020 Jul.
Article in English | MEDLINE | ID: mdl-32157789

ABSTRACT

Trees in the genus Aquilaria (Thymelaeaceae) are known as lign aloes, and are native to the forests of southeast Asia. Lign aloes produce agarwood as an antimicrobial defence. Agarwood has a long history of cultural and medicinal use, and is of considerable commercial value. However, due to habitat destruction and over collection, lign aloes are threatened in the wild. We present a chromosomal-level assembly for Aquilaria sinensis, a lign aloe endemic to China known as the incense tree, based on Illumina short-read, 10X Genomics linked-read, and Hi-C sequencing data. Our 783.8 Mbp A. sinensis genome assembly is of high physical contiguity, with a scaffold N50 of 87.6 Mbp, and high completeness, with a 95.8% BUSCO score for eudicotyledon genes. We include 17 transcriptomes from various plant tissues, providing a total of 35,965 gene models. We reveal the first complete set of genes involved in sesquiterpenoid production, plant defence, and agarwood production for the genus Aquilaria, including genes involved in the biosynthesis of sesquiterpenoids via the mevalonic acid (MVA), 1-deoxy-D-xylulose-5-phosphate (DXP), and methylerythritol phosphate (MEP) pathways. We perform a detailed repeat content analysis, revealing that transposable elements account for ~61% of the genome, with major contributions from gypsy-like and copia-like LTR retroelements. We also provide a comparative analysis of repeat content across sequenced species in the order Malvales. Our study reveals the first chromosomal-level genome assembly for a tree in the genus Aquilaria and provides an unprecedented opportunity to address a variety of applied, genomic and evolutionary questions in the Thymelaeaceae more widely.


Subject(s)
Chromosomes, Plant/genetics , Genome, Plant/genetics , Thymelaeaceae/genetics , Trees/genetics , Genes, Plant/genetics , High-Throughput Nucleotide Sequencing/methods , Transcriptome/genetics
19.
Gigascience ; 9(1)2020 01 01.
Article in English | MEDLINE | ID: mdl-31942620

ABSTRACT

BACKGROUND: The giant squid (Architeuthis dux; Steenstrup, 1857) is an enigmatic giant mollusc with a circumglobal distribution in the deep ocean, except in the high Arctic and Antarctic waters. The elusiveness of the species makes it difficult to study. Thus, having a genome assembled for this deep-sea-dwelling species will allow several pending evolutionary questions to be unlocked. FINDINGS: We present a draft genome assembly that includes 200 Gb of Illumina reads, 4 Gb of Moleculo synthetic long reads, and 108 Gb of Chicago libraries, with a final size matching the estimated genome size of 2.7 Gb, and a scaffold N50 of 4.8 Mb. We also present an alternative assembly including 27 Gb raw reads generated using the Pacific Biosciences platform. In addition, we sequenced the proteome of the same individual and RNA from 3 different tissue types from 3 other species of squid (Onychoteuthis banksii, Dosidicus gigas, and Sthenoteuthis oualaniensis) to assist genome annotation. We annotated 33,406 protein-coding genes supported by evidence, and the genome completeness estimated by BUSCO reached 92%. Repetitive regions cover 49.17% of the genome. CONCLUSIONS: This annotated draft genome of A. dux provides a critical resource to investigate the unique traits of this species, including its gigantism and key adaptations to deep-sea environments.


Subject(s)
Decapodiformes/genetics , Genome , Genomics , Animals , Biological Evolution , Chromatography, Liquid , Computational Biology/methods , DNA Transposable Elements , Gene Expression Profiling , Genomics/methods , Molecular Sequence Annotation , Multigene Family , RNA, Untranslated , Tandem Mass Spectrometry , Transcriptome , Whole Genome Sequencing