1.
G3 (Bethesda) ; 2024 May 23.
Article in English | MEDLINE | ID: mdl-38781445

ABSTRACT

The first chromosome-scale reference genome of the rare narrow-endemic African moss Physcomitrellopsis africana is presented here. Assembled from 73x nanopore long reads and 163x BGI-seq short reads, the 414 Mb reference comprises 26 chromosomes and 22,925 protein-coding genes (BUSCO: C:94.8%[D:13.9%]). This genome holds two genes that withstood rigorous filtration of microbial contaminants, have no homolog in other land plants and are thus interpreted as resulting from two unique horizontal gene transfers from microbes. Further, Physcomitrellopsis africana shares 176 of the 273 published HGT candidates identified in Physcomitrium patens, but lacks 98 of these, highlighting that perhaps as many as 91 genes were acquired in P. patens in the last 40 million years following its divergence from its common ancestor with P. africana. These observations suggest rather continuous gene gains via HGT, followed by potential losses, during the diversification of the Funariaceae. Our findings showcase both a dynamic flux in plant HGTs over evolutionarily "short" timescales and the enduring impacts of successful integrations, like those still functionally maintained in extant Physcomitrellopsis africana. Furthermore, this study describes the informatic processes employed to distinguish contaminants from candidate HGT events.

2.
Appl Plant Sci ; 11(4): e11533, 2023.
Article in English | MEDLINE | ID: mdl-37601314

ABSTRACT

Premise: Robust standards to evaluate quality and completeness are lacking in eukaryotic structural genome annotation, as genome annotation software is developed using model organisms and typically lacks benchmarking to comprehensively evaluate the quality and accuracy of the final predictions. The annotation of plant genomes is particularly challenging due to their large sizes, abundant transposable elements, and variable ploidies. This study investigates the impact of genome quality, complexity, sequence read input, and method on protein-coding gene predictions. Methods: The impact of repeat masking, long-read and short-read inputs, and de novo and genome-guided protein evidence was examined in the context of the popular BRAKER and MAKER workflows for five plant genomes. The annotations were benchmarked for structural traits and sequence similarity. Results: Benchmarks that reflect gene structures, reciprocal similarity search alignments, and mono-exonic/multi-exonic gene counts provide a more complete view of annotation accuracy. Transcripts derived from RNA-read alignments alone are not sufficient for genome annotation. Gene prediction workflows that combine evidence-based and ab initio approaches are recommended, and a combination of short and long reads can improve genome annotation. Adding protein evidence from de novo assemblies, genome-guided transcriptome assemblies, or full-length proteins from OrthoDB generates more putative false positives as implemented in the current workflows. Post-processing with functional and structural filters is highly recommended. Discussion: While the annotation of non-model plant genomes remains complex, this study provides recommendations for inputs and methodological approaches. We discuss a set of best practices to generate an optimal plant genome annotation and present a more robust set of metrics to evaluate the resulting predictions.
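One of the benchmarks named in the Results, the mono-exonic/multi-exonic count, can be illustrated with a minimal sketch. This is not the study's implementation: it assumes a GFF3 annotation in which every exon feature carries a Parent attribute pointing at its transcript, and it counts per transcript rather than per gene. The function names mono_multi_counts and mono_multi_ratio are hypothetical.

```python
from collections import defaultdict


def mono_multi_counts(gff3_lines):
    """Return (mono, multi): transcripts with exactly one exon vs. more than one.

    Assumption: exons are 'exon' features whose Parent attribute names
    the transcript they belong to (standard GFF3 layout).
    """
    exons_per_transcript = defaultdict(int)
    for line in gff3_lines:
        # Skip blank lines, directives, and comments.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue
        # Column 9 holds semicolon-separated key=value attributes.
        attrs = dict(
            field.split("=", 1) for field in cols[8].split(";") if "=" in field
        )
        # An exon may list several parents, separated by commas.
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                exons_per_transcript[parent] += 1
    mono = sum(1 for n in exons_per_transcript.values() if n == 1)
    multi = sum(1 for n in exons_per_transcript.values() if n > 1)
    return mono, multi


def mono_multi_ratio(mono, multi):
    """Ratio of mono-exonic to multi-exonic transcripts (inf if multi is 0)."""
    return mono / multi if multi else float("inf")


# Toy input, invented for illustration only.
toy_gff3 = [
    "##gff-version 3",
    "chr1\ttoy\tmRNA\t1\t100\t.\t+\t.\tID=t1;Parent=g1",
    "chr1\ttoy\texon\t1\t100\t.\t+\t.\tID=e1;Parent=t1",
    "chr1\ttoy\texon\t200\t300\t.\t+\t.\tID=e2;Parent=t2",
    "chr1\ttoy\texon\t400\t500\t.\t+\t.\tID=e3;Parent=t2",
]
toy_mono, toy_multi = mono_multi_counts(toy_gff3)
```

On a real annotation the lines would come from an open file handle instead of the toy list. Which ratio counts as plausible depends on the species, so the sketch reports the counts and leaves interpretation to the reader.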
