1.
G3 (Bethesda) ; 2024 Jun 07.
Article in English | MEDLINE | ID: mdl-38845594

ABSTRACT

We present a reference genome for the federally endangered Gaviota tarplant, Deinandra increscens subsp. villosa (Madiinae, Asteraceae), an annual herb endemic to the Central California coast. Generating PacBio HiFi, Oxford Nanopore Technologies, and Dovetail Omni-C data, we assembled a haploid consensus genome of 1.67 Gbp as 28.7 K scaffolds with a scaffold N50 of 74.9 Mb. We annotated repeat content in 74.8% of the genome. Long terminal repeats (LTR) covered 44.0% of the genome with Copia families predominant at 22.9% followed by Gypsy at 14.2%. Both Gypsy and Copia elements were common in ancestral peaks of LTR, and the most abundant element was a Gypsy element containing nested Copia/Angela sequence similarity, reflecting a complex evolutionary history of repeat activity. Gene annotation produced 33,257 genes and 68,942 transcripts, of which 99% were functionally annotated. BUSCO scores for the annotated proteins were 96.0% complete (77.6% single-copy and 18.4% duplicated). Whole genome duplication (WGD) synonymous mutation rates of Gaviota tarplant and sunflower (Helianthus annuus) shared peaks that correspond to the last Asteraceae polyploidization event and subsequent divergence from a common ancestor at ∼27 mya. Regions of high-density tandem genes were identified, pointing to potentially important loci of environmental adaptation in this species.
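
For readers unfamiliar with the scaffold N50 statistic quoted in this abstract, the following is a minimal sketch of how it is conventionally computed: N50 is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Hypothetical helper for illustration; not the authors' code.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating length until
    # at least half of the total assembly length is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy scaffold lengths (arbitrary units), invented for this example.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))
```

Because the statistic depends only on the sorted length distribution, a few very long scaffolds can yield a high N50 even when the assembly also contains many short scaffolds.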

2.
Evol Appl ; 17(4): e13669, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38633133

ABSTRACT

DNA methylation is critical to the regulation of transposable elements and gene expression and can play an important role in the adaptation of stress response mechanisms in plants. Traditional methods of methylation quantification rely on bisulfite conversion that can compromise accuracy. Recent advances in long-read sequencing technologies allow for methylation detection in real time. The associated algorithms that interpret these modifications have evolved from strictly statistical approaches to Hidden Markov Models and, recently, deep learning approaches. Much of the existing software focuses on methylation in the CG context, but methylation in other contexts is important to quantify, as it is extensively leveraged in plants. Here, we present methylation profiles for two maple species across the full range of 5mC sequence contexts using Oxford Nanopore Technologies (ONT) long-reads. Hybrid and reference-guided assemblies were generated for two new Acer accessions: Acer negundo (box elder; 65x ONT and 111x Illumina) and Acer saccharum (sugar maple; 93x ONT and 148x Illumina). The ONT reads generated for these assemblies were re-basecalled, and methylation detection was conducted in a custom pipeline with the published Acer references (PacBio assemblies) and hybrid assemblies reported herein to generate four epigenomes. Examination of the transposable element landscape revealed the dominance of LTR Copia elements and patterns of methylation associated with different classes of TEs. Methylation distributions were examined at high resolution across gene and repeat density and described within the broader angiosperm context, and more narrowly in the context of gene family dynamics and candidate nutrient stress genes.
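
The abstract refers to 5mC sequence contexts beyond CG without naming them. Assuming the conventional plant classes CG, CHG, and CHH (where H is A, C, or T), the sketch below shows how a forward-strand cytosine could be assigned to a context. The function `cytosine_context` is a hypothetical illustration and is not part of the authors' custom pipeline.

```python
def cytosine_context(seq, i):
    """Classify the cytosine at index i of a forward-strand sequence.

    Returns "CG", "CHG", "CHH", or None when the context cannot be
    determined (sequence end or ambiguous base). Hypothetical helper.
    """
    seq = seq.upper()
    if seq[i] != "C":
        raise ValueError("position i is not a cytosine")
    window = seq[i + 1:i + 3]  # up to two downstream bases
    if len(window) >= 1 and window[0] == "G":
        return "CG"
    # CHG and CHH both need two unambiguous downstream bases.
    if len(window) < 2 or any(base not in "ACGT" for base in window):
        return None
    # The first downstream base is A, C, or T (that is, H) at this point.
    return "CHG" if window[1] == "G" else "CHH"


for toy in ("CGA", "CAG", "CTT"):
    print(toy, cytosine_context(toy, 0))
```

Reverse-strand cytosines would be handled the same way after reverse-complementing the sequence, which this sketch leaves out for brevity.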

3.
Appl Plant Sci ; 11(4): e11533, 2023.
Article in English | MEDLINE | ID: mdl-37601314

ABSTRACT

Premise: Robust standards to evaluate quality and completeness are lacking in eukaryotic structural genome annotation, as genome annotation software is developed using model organisms and typically lacks benchmarking to comprehensively evaluate the quality and accuracy of the final predictions. The annotation of plant genomes is particularly challenging due to their large sizes, abundant transposable elements, and variable ploidies. This study investigates the impact of genome quality, complexity, sequence read input, and method on protein-coding gene predictions. Methods: The impact of repeat masking, long-read and short-read inputs, and de novo and genome-guided protein evidence was examined in the context of the popular BRAKER and MAKER workflows for five plant genomes. The annotations were benchmarked for structural traits and sequence similarity. Results: Benchmarks that reflect gene structures, reciprocal similarity search alignments, and mono-exonic/multi-exonic gene counts provide a more complete view of annotation accuracy. Transcripts derived from RNA-read alignments alone are not sufficient for genome annotation. Gene prediction workflows that combine evidence-based and ab initio approaches are recommended, and a combination of short and long reads can improve genome annotation. Adding protein evidence from de novo assemblies, genome-guided transcriptome assemblies, or full-length proteins from OrthoDB generates more putative false positives as implemented in the current workflows. Post-processing with functional and structural filters is highly recommended. Discussion: While the annotation of non-model plant genomes remains complex, this study provides recommendations for inputs and methodological approaches. We discuss a set of best practices to generate an optimal plant genome annotation and present a more robust set of metrics to evaluate the resulting predictions.

4.
J Hered ; 114(5): 561-569, 2023 08 23.
Article in English | MEDLINE | ID: mdl-37262429

ABSTRACT

Dittrichia graveolens (L.) Greuter, or stinkwort, is a weedy annual plant within the family Asteraceae. The species is recognized for the rapid expansion of both its native and introduced ranges: in Europe, it has expanded its native distribution northward from the Mediterranean basin by nearly 7° of latitude since the mid-20th century, while in California and Australia the plant is an invasive weed of concern. Here, we present the first de novo D. graveolens genome assembly (1N = 9 chromosomes), including complete chloroplast (151,013 bp) and partial mitochondrial genomes (22,084 bp), created using Pacific Biosciences HiFi reads and Dovetail Omni-C data. The final primary assembly is 835 Mbp in length, of which 98.1% is represented by 9 scaffolds ranging from 66 to 119 Mbp. The contig N50 is 74.9 Mbp and the scaffold N50 is 96.9 Mbp, which, together with a 98.8% completeness based on the BUSCO embryophyta10 database containing 1,614 orthologs, underscores the high quality of this assembly. This pseudo-molecule-scale genome assembly is a valuable resource for our fundamental understanding of the genomic consequences of range expansion under global change, as well as comparative genomic studies in the Asteraceae.


Subject(s)
Genome , Genomics , Chromosomes , Biological Evolution , Phylogeny
6.
Appl Plant Sci ; 9(6): e11439, 2021 Jun.
Article in English | MEDLINE | ID: mdl-34268018

ABSTRACT

PREMISE: An informatics approach was used for the construction of an Axiom genotyping array from heterogeneous, high-throughput sequence data to assess the complex genome of loblolly pine (Pinus taeda). METHODS: High-throughput sequence data, sourced from exome capture and whole genome reduced-representation approaches from 2698 trees across five sequence populations, were analyzed with the improved genome assembly and annotation for the loblolly pine. A variant detection, filtering, and probe design pipeline was developed to detect true variants across and within populations. From 8.27 million variants, a total of 642,275 were evaluated and 423,695 of those were screened across a range-wide population. RESULTS: The final informatics and screening approach delivered an Axiom array representing 46,439 high-confidence variants to the forest tree breeding and genetics community. Based on the annotated reference genome, 34% were located in or directly upstream or downstream of genic regions. DISCUSSION: The Pita50K array represents a genome-wide resource developed from sequence data for an economically important conifer, loblolly pine. It uniquely integrates independent projects that assessed trees sampled across the native range. The challenges associated with the large and repetitive genome are addressed in the development of this resource.

7.
BMC Genomics ; 21(1): 9, 2020 Jan 03.
Article in English | MEDLINE | ID: mdl-31900111

ABSTRACT

BACKGROUND: In forest trees, genetic markers have been used to understand the genetic architecture of natural populations, identify quantitative trait loci, infer gene function, and enhance tree breeding. Recently, new, efficient technologies for genotyping thousands to millions of single nucleotide polymorphisms (SNPs) have finally made large-scale use of genetic markers widely available. These methods will be exceedingly valuable for improving tree breeding and understanding the ecological genetics of Douglas-fir, one of the most economically and ecologically important trees in the world. RESULTS: We designed SNP assays for 55,766 potential SNPs that were discovered from previous transcriptome sequencing projects. We tested the array on ~2300 related and unrelated coastal Douglas-fir trees (Pseudotsuga menziesii var. menziesii) from Oregon and Washington, and 13 trees of interior Douglas-fir (P. menziesii var. glauca). As many as ~28 K SNPs were reliably genotyped and polymorphic, depending on the selected SNP call rate. To increase the number of SNPs and improve genome coverage, we developed protocols to 'rescue' SNPs that did not pass the default Affymetrix quality control criteria (e.g., 97% SNP call rate). Lowering the SNP call rate threshold from 97 to 60% increased the number of successful SNPs from 20,669 to 28,094. We used a subset of 395 unrelated trees to calculate SNP population genetic statistics for coastal Douglas-fir. Over a range of call rate thresholds (97 to 60%), the median call rate for SNPs in Hardy-Weinberg equilibrium ranged from 99.2 to 99.7%, and the median minor allele frequency ranged from 0.198 to 0.233. The successful SNPs also worked well on interior Douglas-fir. CONCLUSIONS: Based on the original transcriptome assemblies and comparisons to version 1.0 of the Douglas-fir reference genome, we conclude that these SNPs can be used to genotype about 10 K to 15 K loci. The Axiom genotyping array will serve as an excellent foundation for studying the population genomics of Douglas-fir and for implementing genomic selection. We are currently using the array to construct a linkage map and test genomic selection in a three-generation breeding program for coastal Douglas-fir.
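
The effect of relaxing the SNP call rate threshold described in this abstract can be illustrated with a small sketch. It assumes genotypes coded as alternate-allele counts (0, 1, 2) with `None` for a no-call. The function names and the toy data are hypothetical and do not reproduce the Affymetrix quality-control procedure or the authors' rescue protocols.

```python
def call_rate(genotypes):
    """Fraction of samples with a genotype call (None marks a no-call)."""
    called = [g for g in genotypes if g is not None]
    return len(called) / len(genotypes)


def minor_allele_frequency(genotypes):
    """Minor allele frequency from diploid alternate-allele counts."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1 - alt_freq)


def passing_snps(snps, min_call_rate):
    """Names of SNPs whose call rate meets the chosen threshold."""
    return [name for name, g in snps.items() if call_rate(g) >= min_call_rate]


# Toy genotypes for ten trees, invented for this example.
toy_snps = {
    "snp_a": [0, 1, 2, 1, 0, 1, 0, 1, 2, 0],
    "snp_b": [0, None, None, 1, 1, None, 0, 2, None, 1],
    "snp_c": [None, None, 0, None, 1, None, 0, 1, None, 2],
}

print(passing_snps(toy_snps, 0.97))
print(passing_snps(toy_snps, 0.60))
print(minor_allele_frequency(toy_snps["snp_a"]))
```

Lowering the threshold admits more SNPs at the cost of more missing genotypes per SNP, which is the trade-off the abstract quantifies for the real array.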


Subject(s)
Genome, Plant/genetics , Polymorphism, Single Nucleotide/genetics , Pseudotsuga/genetics , Trees/genetics , Adaptation, Physiological/genetics , Breeding , Forests , Genotype , Genotyping Techniques , Humans , Oregon , Washington