ABSTRACT
Introduction: The coconut tree crop (Cocos nucifera L.) provides vital resources for millions of people worldwide. Coconut germplasm is largely classified into 'Tall' (Typica) and 'Dwarf' (Nana) types. While Tall coconuts are outcrossing, stress tolerant, and late flowering, Dwarf coconuts are inbred and flower early with a high rate of bunch emission. Precocity determines the earlier production of a plantation and facilitates management and harvest. Methods: A unique outbred F2 population was used, generated by intercrossing F1 hybrids between Brazilian Green Dwarf from Jiqui (BGDJ) and West African Tall (WAT) cultivars. Single-nucleotide polymorphism (SNP) markers fixed for alternative alleles in the two varieties, segregating in an F2 configuration, were used to build a high-density linkage map with ~3,000 SNPs, anchored to the existing chromosome-level genome assemblies, and a quantitative trait locus (QTL) mapping analysis was carried out. Results: The linkage map established the chromosome numbering correspondence between the two reference genome versions and the relationship between recombination rate, physical distance, and gene density in the coconut genomes. Leveraging the strong segregation for precocity inherited from the Dwarf cultivar in the F2, a major effect QTL with incomplete dominance was mapped for flowering time. FLOWERING LOCUS T (FT) gene homologs of coconut previously described as putatively involved in flowering time by alternative splice variant analysis were colocalized within a ~200-kb window of the major effect QTL [logarithm of the odds (LOD) = 11.86]. Discussion: Our work provides strong phenotype-based evidence for the role of the FT locus as the putative underlying functional variant for the flowering time difference between Dwarf and Tall coconuts. Major effect QTLs were also detected for developmental traits of the palm, plausibly suggesting pleiotropism of the FT locus for other precocity traits. 
Haplotypes of the two SNPs flanking the flowering time QTL inherited from the Dwarf parent BGDJ caused a reduction in the time to flower of approximately 400 days. These SNPs could be used for high-throughput marker-assisted selection of early-flowering and higher-productivity recombinant lines, providing innovative genetic material to the coconut industry.
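As a rough illustration of what a LOD score such as the one reported above measures, the sketch below computes a single-marker LOD in an F2 by comparing a model with one mean per genotype class (which allows additive and dominance effects) against a model with a single overall mean. The abstract does not state which mapping method was used, so this marker-regression form, the function name `marker_lod`, and the genotype coding are assumptions made only for illustration.

```python
import math


def marker_lod(genotypes, phenotypes):
    """Single-marker LOD score from a genotype-class means model.

    genotypes  : marker class per individual (hypothetical coding, e.g.
                 0 = Tall homozygote, 1 = heterozygote, 2 = Dwarf homozygote)
    phenotypes : trait value per individual (e.g. days to flowering)
    """
    if len(genotypes) != len(phenotypes):
        raise ValueError("genotypes and phenotypes must have equal length")
    n = len(phenotypes)

    # Null model: one overall mean for all individuals.
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)

    # Full model: one mean per genotype class.
    groups = {}
    for g, y in zip(genotypes, phenotypes):
        groups.setdefault(g, []).append(y)
    rss_full = 0.0
    for ys in groups.values():
        class_mean = sum(ys) / len(ys)
        rss_full += sum((y - class_mean) ** 2 for y in ys)

    if rss_full == 0:
        raise ValueError("perfect fit: LOD is undefined (infinite)")
    # LOD = (n / 2) * log10(RSS_null / RSS_full) for normal residuals.
    return (n / 2) * math.log10(rss_null / rss_full)


# Toy data, invented for the example only.
toy_genotypes = [0, 0, 1, 1, 2, 2]
toy_days_to_flower = [10, 12, 20, 22, 30, 32]
toy_lod = marker_lod(toy_genotypes, toy_days_to_flower)
```

A larger LOD means the marker classes explain more of the trait variance than a single mean does; interval mapping methods refine the same idea between markers.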
ABSTRACT
Eucalyptus dunnii is one of the most important Eucalyptus species for short-fiber pulp production in regions where other species of the genus are affected by poor soil and climatic conditions. In this context, E. dunnii holds promise as a resource to address and adapt to the challenges of climate change. Despite its rapid growth and favorable wood properties for solid wood products, its genetic improvement remains in its early stages. In this work, we evaluated the performance of two single nucleotide polymorphism (SNP) genotyping methods for population genetics analysis and genomic selection (GS) in E. dunnii. Double digest restriction-site associated DNA sequencing (ddRADseq) was compared with the EUChip60K array in 308 individuals from a provenance-progeny trial. The compared SNP sets included 8,011 and 19,008 informative SNPs, respectively, distributed along the 11 chromosomes. Although the two datasets differed in the percentage of missing data, genome coverage, minor allele frequency and estimated genetic diversity parameters, they revealed a similar genetic structure, showing two subpopulations with little differentiation between them, and low linkage disequilibrium. GS analyses were performed for eleven traits using Genomic Best Linear Unbiased Prediction (GBLUP) and a conventional pedigree-based model (ABLUP). Regardless of the SNP dataset, the predictive ability (PA) of GBLUP was better than that of ABLUP for six traits (cellulose content, total and ethanolic extractives, total and Klason lignin content, and syringyl and guaiacyl lignin monomer ratio). When contrasting the SNP datasets used to estimate PAs, the GBLUP-EUChip60K model gave significantly higher PA values for six traits, whereas ddRADseq gave higher values for three other traits. The PAs correlated positively with narrow-sense heritabilities, with the highest correlations shown by ABLUP and GBLUP-EUChip60K. 
The two genotyping methods, ddRADseq and EUChip60K, are generally comparable for population genetics and genomic prediction, demonstrating the utility of the former when subjected to rigorous SNP filtering. The results of this study provide a basis for future whole-genome studies using ddRADseq in non-model forest species for which SNP arrays have not yet been developed.
ABSTRACT
Introduction: Genomic selection (GS) experiments in forest trees have largely reported estimates of predictive abilities from cross-validation among individuals in the same breeding generation. In such conditions, no effects of recombination, selection, drift, and environmental changes are accounted for. Here, we assessed the effectively realized predictive ability (RPA) for volume growth at harvest age by GS across generations in an operational reciprocal recurrent selection (RRS) program of hybrid Eucalyptus. Methods: Genomic best linear unbiased prediction with additive (GBLUP_G), additive plus dominance (GBLUP_G+D), and additive single-step (HBLUP) models were trained with different combinations of growth data of hybrids and pure species individuals (N = 17,462) of the G1 generation, 1,944 of which were genotyped with ~16,000 SNPs from SNP arrays. The hybrid G2 progeny trial (HPT267) was the GS target, with 1,400 selection candidates, 197 of which were genotyped still at the seedling stage, and genomically predicted for their breeding and genotypic values at the operational harvest age (6 years). Seedlings were then grown to harvest and measured, and their pedigree-based breeding and genotypic values were compared to their originally predicted genomic counterparts. Results: Genomic RPAs ≥0.80 were obtained as the genetic relatedness between G1 and G2 increased, especially when the direct parents of selection candidates were used in training. GBLUP_G+D reached RPAs ≥0.70 only when hybrid or pure species data of G1 were included in training. HBLUP was only marginally better than GBLUP. Correlations ≥0.80 were obtained between pedigree and genomic individual ranks. Rank coincidence of the top 2.5% selections was the highest for GBLUP_G (45% to 60%) compared to GBLUP_G+D. 
To advance the pure species RRS populations, GS models performed better when trained on pure species data than on hybrid data, and HBLUP yielded ~20% higher predictive abilities than GBLUP, but was not better than ABLUP for ungenotyped trees. Discussion: We demonstrate that genomic data effectively enable accurate ranking of eucalypt hybrid seedlings for their yet-to-be-observed volume growth at harvest age. Our results support a two-stage GS approach involving family selection by average genomic breeding value, followed by within-top-families individual GS, significantly increasing selection intensity, optimizing genotyping costs, and accelerating RRS breeding.
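The two summary statistics used in the abstract above, a predictive ability and the rank coincidence of top selections, can be sketched as follows. This assumes that the realized predictive ability is a Pearson correlation between the values predicted at the seedling stage and the values estimated later from harvest-age data, and that rank coincidence is the share of top-ranked individuals common to both lists; the abstract does not give the exact formulas, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def top_coincidence(predicted, observed, fraction):
    """Share of the top `fraction` of individuals that both rankings agree on.

    predicted : genomic predictions made before the trait was measured
    observed  : values estimated later from the measured trait
    fraction  : selected proportion, e.g. 0.025 for the top 2.5%
    """
    n = len(predicted)
    k = max(1, round(n * fraction))  # number of individuals selected
    by_pred = sorted(range(n), key=lambda i: predicted[i], reverse=True)[:k]
    by_obs = sorted(range(n), key=lambda i: observed[i], reverse=True)[:k]
    return len(set(by_pred) & set(by_obs)) / k
```

With the 197 genotyped candidates mentioned above, a 2.5% selection corresponds to about five trees, so each coinciding individual shifts the coincidence by roughly 20 percentage points.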
ABSTRACT
Eucalyptus is an economically important genus comprising more than 890 species in different subgenera and sections. Approximately twenty species of subgenus Symphyomyrtus account for 95% of the world's planted eucalypts. Discrimination of closely related eucalypt taxa is challenging, consistent with their recent phylogenetic divergence and occasional hybridization in nature. Admixture, misclassification or mislabeling of Eucalyptus germplasm resources maintained as exotics have been suggested, although no reports are available. Moreover, hybrids with increased productivity and traits complementarity are planted worldwide, but little is known about their actual genomic ancestry. In this study we examined a set of 440 trees of 16 different Eucalyptus species and 44 interspecific hybrids of multi-species origin conserved in germplasm banks in Brazil. We used genome-wide SNP data to evaluate the agreement between the alleged phylogenetic classification of species and provenances as registered in their historical records, and their observed genetic clustering derived from SNP data. Genetic structure analyses correctly assigned each of the 16 species to a different cluster although the PCA positioning of E. longirostrata was inconsistent with its current taxonomy. Admixture was present for closely related species' materials derived from local germplasm banks, indicating unintended hybridization following germplasm introduction. Provenances could be discriminated for some species, indicating that SNP-based discrimination was directly proportional to geographical distance, consistent with an isolation-by-distance model. SNP-based genomic ancestry analysis showed that the majority of the hybrids displayed realized genomic composition deviating from the expected ones based on their pedigree records, consistent with admixture in their parents and pervasive genome-wide directional selection toward the fast-growing E. grandis genome. 
SNP data in support of tree breeding provide precise germplasm identity verification, and allow breeders to objectively recognize the actual ancestral origin of superior hybrids to more realistically guide the program toward the development of the desired genetic combinations.
Subjects
Eucalyptus , Single Nucleotide Polymorphism , Phylogeny , Plant Genome , Plant Breeding , Genomics
ABSTRACT
Tropical fruit tree species constitute a yet untapped supply of outstanding diversity of taste and nutritional value, barely developed from the genetics standpoint, with scarce or no genomic resources to tackle the challenges arising in modern breeding practice. We generated a de novo genome assembly of Psidium guajava, the super fruit "apple of the tropics", and successfully transferred 14,268 SNP probesets from Eucalyptus to Psidium at the nucleotide level, to detect genomic loci linked to resistance to the root knot nematode (RKN) Meloidogyne enterolobii derived from the wild relative P. guineense. Loci significantly associated with resistance across alternative analytical frameworks were detected at two SNPs on chromosome 3 in a pseudo-assembly of the Psidium guajava genome, built using a syntenic path approach with the Eucalyptus grandis genome to determine the order and orientation of the contigs. The P. guineense-derived resistance response to RKN and disease onset is conceivably triggered by mineral nutrient and phytohormone homeostasis or signaling, with the involvement of the miRNA pathway. Hotspots of mapped resistance quantitative trait loci and functional annotation in the same genomic region of Eucalyptus provide further indirect support for our results, highlighting the evolutionary conservation of genomes across genera of Myrtaceae in the adaptation to pathogens. Marker-assisted introgression of the mapped resistance loci should accelerate the development of improved guava cultivars and hybrid rootstocks.
Subjects
Eucalyptus , Myrtaceae , Psidium , Tylenchoidea , Animals , Tylenchoidea/genetics , Psidium/genetics , Eucalyptus/genetics , Myrtaceae/genetics , Single Nucleotide Polymorphism , Plant Breeding , Genomics
ABSTRACT
Cacao is a globally important crop with a long history of domestication and selective breeding. Despite the increased use of elite clones by cacao farmers, worldwide plantations are established mainly using hybrid progeny material derived from heterozygous parents, therefore displaying high tree-to-tree variability. The deliberate development of hybrids from advanced inbred lines produced by successive generations of self-pollination has not yet been fully considered in cacao breeding. This is largely due to the self-incompatibility of the species, the long generation cycles (3-5 years) and the extensive trial areas needed to accomplish the endeavor. We propose a simple and accessible approach to develop inbred lines by accelerating the buildup of homozygosity through regular selfing assisted by genome-wide SNP genotyping. In this study we genotyped 90 clones from the Brazilian CEPEC's germplasm collection and 49 inbred offspring of six S1 or S2 cacao families derived from self-pollinating clones CCN-51, PS-13.19, TSH-1188 and SIAL-169. A set of 3,380 SNPs distributed across the cacao genome was interrogated on the EMBRAPA multi-species 65k Infinium chip. The 90 cacao clones showed considerable variation in genome-wide SNP homozygosity (mean 0.727 ± 0.182), and 19 of them had homozygosity ≥90%. By assessing the increase in homozygosity across two generations of self-pollination, SNP data revealed wide variability in homozygosity within and between S1 and S2 families. Even in small families (<10 sibs), individuals were identified with homozygosity up to ~1.5 standard deviations above the family mean. From baseline homozygosities of 0.476 and 0.454, offspring with homozygosities of 0.862 and 0.879 were recovered for clones TSH-1188 and CCN-51, respectively, in only two generations of selfing (81-93% increase). 
SNP marker assisted monitoring and selection of inbred individuals can be a practical tool to optimize and accelerate the development of inbred lines of outbred tree species. This approach will allow a faster and more accurate exploitation of hybrid breeding strategies in cacao improvement programs and potentially in other perennial fruit and forest trees.
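The homozygosity figures in the abstract above can be related to each other with a small calculation. The sketch below assumes SNP genotypes coded as allele dosages (0, 1 or 2, with `None` for a missing call) and uses the textbook expectation that heterozygosity halves with each generation of selfing; the function names and the coding are illustrative assumptions, not taken from the study.

```python
def snp_homozygosity(genotypes):
    """Fraction of called SNPs that are homozygous (dosage 0 or 2)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes")
    return sum(1 for g in called if g != 1) / len(called)


def expected_homozygosity_after_selfing(baseline, generations):
    """Expected homozygosity after selfing, without selection.

    Heterozygosity (1 - baseline) is expected to halve each generation.
    """
    return 1.0 - (1.0 - baseline) * 0.5 ** generations


def relative_increase(baseline, final):
    """Proportional gain in homozygosity relative to the baseline."""
    return (final - baseline) / baseline


# Baseline and recovered values quoted in the abstract.
gain_tsh1188 = relative_increase(0.476, 0.862)
gain_ccn51 = relative_increase(0.454, 0.879)
expected_tsh1188 = expected_homozygosity_after_selfing(0.476, 2)
expected_ccn51 = expected_homozygosity_after_selfing(0.454, 2)
```

Under these assumptions the relative gains come to roughly 0.81 and 0.94, and the expected homozygosity after two generations is roughly 0.87 and 0.86. An individual above the expectation, such as the 0.879 offspring of CCN-51, is the kind of outlier that marker-assisted monitoring is meant to identify.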
Subjects
Cacao , Humans , Cacao/genetics , Trees , Genotype , Plant Breeding , Thyrotropin/genetics
ABSTRACT
KEY MESSAGE: We propose the application of enviromics to breeding practice, by which the similarity among sites assessed on an "omics" scale of environmental attributes drives the prediction of unobserved genotype performances. Genotype by environment interaction (GEI) studies in plant breeding have focused mainly on estimating genetic parameters over a limited number of experimental trials. However, recent geographic information system (GIS) techniques have opened new frontiers for better understanding and dealing with GEI. These advances allow increasing selection accuracy across all sites of interest, including those where experimental trials have not yet been deployed. Here, we introduce the term enviromics, within an envirotypic-assisted breeding framework. In summary, like genotypes at DNA markers, any particular site is characterized by a set of "envirotypes" at multiple "enviromic" markers corresponding to environmental variables that may interact with the genetic background, thus providing informative breeding re-rankings for optimized decisions over different environments. Based on simulated data, we illustrate an index-based enviromics method (the "GIS-GEI") which, due to its higher granular resolution than standard methods, allows for: (1) accurate matching of sites to their most appropriate genotypes; (2) better definition of breeding areas that have high genetic correlation to ensure selection gains across environments; and (3) efficient determination of the best sites to carry out experiments for further analyses. Environmental scenarios can also be optimized for productivity improvement and genetic resources management, especially in the current outlook of dynamic climate change. Envirotyping provides a new class of markers for genetic studies, which are fairly inexpensive, increasingly available and transferable across species. 
We envision a promising future for the integration of enviromics approaches into plant breeding when coupled with next-generation genotyping/phenotyping and powerful statistical modeling of genetic diversity.
Subjects
Environment , Gene-Environment Interaction , Plant Breeding/methods , Genetic Selection , Algorithms , Computer Simulation , Agricultural Crops/genetics , Genetic Markers , Genotype , Geographic Information Systems
ABSTRACT
MAIN CONCLUSION: Bignoniaceae species have conserved chloroplast structure, with hotspots of nucleotide diversity. Several genes are under positive selection, and can be targets for evolutionary studies. Bignoniaceae is one of the most species-rich families of woody plants in Neotropical seasonally dry forests. Here we report the assembly of the Handroanthus impetiginosus chloroplast genome and comparative evolutionary analyses of ten Bignoniaceae species representing the genera for which whole chloroplast genome sequences were available. The chloroplast genome of H. impetiginosus is 159,462 bp in size and has a similar structure compared to the other nine species. The total number of genes was slightly variable amongst the Bignoniaceae, ranging from 124 in H. impetiginosus to 144 in Anemopaegma acutifolium. The inverted repeat (IR) size was variable, ranging from 24,657 bp (Tecomaria capensis) to 40,481 bp (A. acutifolium), due to expansion and contraction at its boundaries. However, gene boundaries were very similar among the ten species. We found 98 forward and palindromic dispersed repeats, and 85 simple sequence repeats (SSRs). In general, chloroplast sequences were highly conserved, with a few nucleotide diversity hotspots in the genes accD, clpP, rpoA, ycf1 and ycf2. The phylogenetic analysis based on 77 coding genes was highly consistent with Angiosperm Phylogeny Group (APG) IV. Our results also indicate that most genes are under negative selection or neutral evolution. We found no evidence of branch-site selection, implying that H. impetiginosus is not evolving faster than the other species analyzed, although we found a signal of site-level positive selection in several genes. These genes can provide targets for evolutionary studies in Bignoniaceae and Lamiales species.
Subjects
Bignoniaceae , Molecular Evolution , Chloroplast Genome , Tabebuia , Bignoniaceae/classification , Bignoniaceae/genetics , Chloroplast Genome/genetics , Microsatellite Repeats/genetics , Phylogeny , Tabebuia/classification , Tabebuia/genetics
ABSTRACT
High-throughput SNP genotyping has become a precondition to move to higher precision and wider genome coverage genetic analysis of natural and breeding populations of non-model species. We developed a 44,318 annotated SNP catalog for Araucaria angustifolia, a grandiose subtropical conifer tree, one of the only two native Brazilian gymnosperms, critically endangered due to its valuable wood and seeds. Following transcriptome assembly and annotation, SNPs were discovered from RNA-seq and pooled RAD-seq data. From the SNP catalog, an Axiom® SNP array with 3,038 validated SNPs was developed and used to provide a comprehensive look at the genetic diversity and structure of 15 populations across the natural range of the species. RNA-seq was a far superior source of SNPs when compared to RAD-seq in terms of conversion rate to polymorphic markers on the array, likely due to the more efficient complexity reduction of the huge conifer genome. By matching microsatellite and SNP data on the same set of A. angustifolia individuals, we show that SNPs reflect more precisely the actual genome-wide patterns of genetic diversity and structure, challenging previous microsatellite-based assessments. Moreover, SNPs corroborated the known major north-south genetic cline, but allowed a more accurate attribution to regional versus among-population differentiation, indicating the potential to select ancestry-informative markers. The availability of a public, user-friendly 3K SNP array for A. angustifolia and a catalog of 44,318 SNPs predicted to provide ~29,000 informative SNPs across ~20,000 loci across the genome, will allow tackling still unsettled questions on its evolutionary history, toward a more comprehensive picture of the origin, past dynamics and future trend of the species' genetic resources. 
Additionally, and no less importantly, the SNP array described unlocks the potential to adopt genomic prediction methods to accelerate the still very timid efforts of systematic tree breeding of A. angustifolia.
Subjects
Araucaria/genetics , Brazil , Plant Genome/genetics , Genomics/methods , Genotype , Microsatellite Repeats/genetics , Single Nucleotide Polymorphism/genetics , Tracheophyta/genetics , Transcriptome/genetics , Trees/genetics
ABSTRACT
Several studies suggest a relation between DNA methylation and diseases in humans and important phenotypes in plants, drawing attention to this epigenetic mark as an important source of variability. In the last decades, several methodologies were developed to assess the methylation state of a genome. However, there is still a lack of affordable and precise methods for genome-wide analysis in studies with large sample sizes. The methyl-sensitive double-digestion MS-DArT sequencing method (MS-DArT-seq) emerges as a promising alternative for methylation profiling. We developed a computational pipeline for the identification of DNA methylation using MS-DArT-seq data and carried out a pilot study using the Eucalyptus grandis tree sequenced for the species reference genome. Using a statistical framework similar to that of differential expression analysis, 72,515 genomic sites were investigated and 5,846 methylated sites identified, several of them tissue-specific, distributed along the species' 11 chromosomes. We highlight a bias towards identification of DNA methylation in genic regions and the identification of 2,783 genes and 842 transposons containing methylated sites. Comparison with whole-genome bisulfite sequencing (WGBS) data demonstrated a precision rate higher than 95% for our approach. The availability of a reference genome is useful for determining the genomic context of methylated sites but not imperative, making this approach suitable for any species. Our approach provides a cost-effective, broad and reliable examination of the DNA methylation profile at MspI/HpaII restriction sites, is fully reproducible, and the source code is available on GitHub (https://github.com/wendelljpereira/ms-dart-seq).
Subjects
Cost-Benefit Analysis , DNA Methylation/genetics , Eucalyptus/genetics , Genotyping Techniques/methods , High-Throughput Nucleotide Sequencing/methods , Plant Leaves/genetics , DNA Sequence Analysis/methods , Trees/genetics , Plant Chromosomes/genetics , DNA Restriction Enzymes/genetics , DNA Transposable Elements/genetics , Plant Genes/genetics , Genotyping Techniques/economics , High-Throughput Nucleotide Sequencing/economics , Pilot Projects , Reproducibility of Results , Restriction Mapping , DNA Sequence Analysis/economics , Sulfites/pharmacology
ABSTRACT
A thorough understanding of the heritability, genetic correlations and additive and non-additive variance components of tree growth and wood properties is a requisite for effective tree breeding. This knowledge is essential to maximize genetic gain, that is, the amount of increase in trait performance achieved annually through directional selection. Understanding the genetic attributes of traits targeted by breeding is also important to sustain decade-long genetic progress, that is, the progress made by increasing the average genetic value of the offspring as compared to that of the parental generation. In this study, we report quantitative genetic parameters for fifteen growth, wood chemical and physical traits for the world-famous Eucalyptus urograndis hybrid (E. grandis × E. urophylla). These traits directly impact the optimal use of wood for cellulose pulp, paper, and energy production. A population of 1,000 trees sampled in a progeny trial was phenotyped directly or following the development and use of near-infrared spectroscopy calibration models. Trees were genotyped with 33,398 SNPs and 24,001 DArT-seq genome-wide markers, and genomic realized relationship matrices (GRMs) were used for parameter estimation with an individual-tree additive-dominant mixed model. Wood chemical properties and wood density showed stronger genetic control than growth, cellulose and fiber traits. Additive effects are the main drivers of genetic variation for all traits, but dominance plays an equally or more important role for growth, singularly in this hybrid. GRMs with >10,000 markers provided stable relationship estimates and more accurate parameters than pedigrees by capturing the full genetic relationships among individuals and disentangling the non-additive from the additive genetic component. Low correlations between growth and wood properties indicate that simultaneous selection for wood traits can be applied with minor effects on genetic gain for growth. 
Conversely, moderate to strong correlations between wood density and chemical traits exist, likely due to their interdependency on cell wall structure such that responses to selection will be connected for these traits. Our results illustrate the advantage of using genome-wide marker data to inform tree breeding in general and have important consequences for operational breeding of eucalypt urograndis hybrids.
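To make the notion of a genomic realized relationship matrix concrete, the sketch below builds an additive GRM from SNP dosages using the widely used VanRaden (2008) first method, G = ZZ' / (2 Σ p(1 − p)). The abstract does not say which estimator was used, nor how the dominance matrix was built, so the choice of estimator, the function name and the toy data are assumptions for illustration only.

```python
def vanraden_grm(genotypes):
    """Additive genomic relationship matrix (VanRaden method 1).

    genotypes : list of individuals, each a list of allele dosages
                (0, 1 or 2) at the same markers, with no missing calls.
    Returns an n x n matrix as a list of lists.
    """
    n = len(genotypes)
    m = len(genotypes[0])

    # Allele frequency of the counted allele at each marker.
    freqs = [sum(ind[j] for ind in genotypes) / (2 * n) for j in range(m)]

    # Scaling term: sum of expected marker variances, 2p(1 - p).
    denom = 2 * sum(p * (1 - p) for p in freqs)
    if denom == 0:
        raise ValueError("all markers are monomorphic")

    # Center dosages by twice the allele frequency.
    z = [[ind[j] - 2 * freqs[j] for j in range(m)] for ind in genotypes]

    return [
        [sum(a * b for a, b in zip(z[i], z[k])) / denom for k in range(n)]
        for i in range(n)
    ]


# Toy example: three individuals, three markers (invented values).
toy_grm = vanraden_grm([[0, 1, 2], [2, 1, 0], [1, 1, 1]])
```

Diagonal elements estimate one plus the inbreeding coefficient and off-diagonal elements estimate realized additive relationships, which is what lets such matrices capture Mendelian sampling that a pedigree cannot.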
Subjects
Eucalyptus/growth & development , Eucalyptus/genetics , Brazil , Eucalyptus/chemistry , Plant Genome , Genotype , Genetic Hybridization , Genetic Models , Phenotype , Plant Breeding/methods , Single Nucleotide Polymorphism , Heritable Quantitative Trait , Species Specificity , Near-Infrared Spectroscopy , Trees/chemistry , Trees/genetics , Trees/growth & development , Wood/chemistry , Wood/genetics , Wood/growth & development
ABSTRACT
Genomic Best Linear Unbiased Prediction (GBLUP) in tree breeding typically only uses information from genotyped trees. However, information from phenotyped but non-genotyped trees can also be highly valuable. The single-step GBLUP approach (ssGBLUP) allows genomic prediction to take into account both genotyped and non-genotyped trees simultaneously in a single evaluation. In this study, we investigated the advantage, in terms of breeding value accuracy and bias, of including phenotypic observations from non-genotyped trees in a standard tree GBLUP evaluation. We compared the efficiency of the conventional pedigree-based (ABLUP), GBLUP and ssGBLUP approaches to evaluate eight growth and wood quality traits in a Eucalyptus hybrid population, genotyped with 33,398 single nucleotide polymorphisms (SNPs) using the EUChip60K. Theoretical accuracies, predictive ability and bias were calculated by ten-fold cross-validation on all traits. The use of additional phenotypic information from non-genotyped trees by means of ssGBLUP provided higher predictive ability (from 37% to 75%) and lower prediction bias (from 21% to 73%) for the genetic component of non-phenotyped but genotyped trees when compared to GBLUP. The increase in prediction accuracy and the decrease in bias became stronger as trait heritability decreased. We concluded that ssGBLUP is a promising breeding tool to improve accuracy and reduce bias over classical GBLUP for genomic evaluation in Eucalyptus breeding practice.
Subjects
Eucalyptus/genetics , Wood/genetics , Eucalyptus/anatomy & histology , Eucalyptus/growth & development , Genetic Association Studies , Genome-Wide Association Study , Plant Breeding/methods , Heritable Quantitative Trait , Wood/anatomy & histology , Wood/growth & development
ABSTRACT
The role of natural selection in shaping patterns of diversity is still poorly understood in the Neotropics. We carried out the first genome-wide population genomics study in a Neotropical tree, Handroanthus impetiginosus (Bignoniaceae), sampling 75,838 SNPs by sequence capture in 128 individuals across 13 populations. We found evidence for local adaptation using Bayesian correlations of allele frequency and environmental variables (32 loci in 27 genes) complemented by an analysis of selective sweeps and genetic hitchhiking events using SweepFinder2 (81 loci in 47 genes). Fifteen genes were identified by both approaches. By accounting for population genetic structure, we also found 14 loci with a selection signal in a STRUCTURE-defined lineage comprising individuals from five populations, using OutFLANK. All approaches pinpointed highly diverse and structurally conserved genes affecting plant development and primary metabolic processes. Spatial interpolation forecasted differences in the expected allele frequencies at loci under selection over time, suggesting that H. impetiginosus may track its habitat during climate changes. However, local adaptation through natural selection may also take place, allowing species persistence due to niche evolution. A high genetic differentiation was seen among the H. impetiginosus populations, which, together with the limited power of the experiment, constrains the improved detection of other types of soft selective forces, such as background, balancing, and purifying selection. Small differences in allele frequency distribution among widespread populations and the low number of loci with detectable adaptive sweeps advocate for a polygenic model of adaptation involving a potentially large number of small genome-wide effects.
Subjects
Physiological Adaptation/genetics , Genetic Variation/genetics , Genetic Selection/genetics , Trees/genetics , Alleles , Bayes Theorem , Forests , Gene Frequency/genetics , Genetic Drift , Population Genetics/methods , Plant Genome/genetics , Genome-Wide Association Study/methods , Genomics/methods , Metagenomics/methods , Single Nucleotide Polymorphism/genetics
ABSTRACT
Genome-wide association studies (GWAS) in plants typically suffer from limited statistical power. An alternative to the logistical and cost challenge of increasing sample sizes is to gain power by meta-analysis using information from independent studies. We carried out GWAS for growth traits with six single-marker models and regional heritability mapping (RHM) in four Eucalyptus breeding populations independently and by Joint-GWAS, using gene and segment-based models, with data for 3,373 individuals genotyped with a common EUChip60K SNP platform. While single-SNP (single nucleotide polymorphism) GWAS hardly detected significant associations at high stringency in each population, gene-based Joint-GWAS revealed nine genes significantly associated with tree height. Associations detected using single-SNP GWAS, RHM and Joint-GWAS set-based models explained on average 3-20% of the phenotypic variance. Whole-genome regression, conversely, captured 64-89% of the pedigree-based heritability in all populations. Several associations independently detected for the same SNPs in different populations provided unprecedented GWAS validation results in forest trees. Rare and common associations were discovered in eight genes involved in cell wall biosynthesis and lignification. With the increasing adoption of genomic prediction of complex phenotypes using shared SNPs and much larger tree breeding populations, Joint-GWAS approaches should provide increasing power to pinpoint discrete associations potentially useful toward tree breeding and molecular applications.
Subjects
Eucalyptus/genetics , Plant Genome , Genome-Wide Association Study , Plant Breeding , Heritable Quantitative Trait , Inheritance Patterns/genetics , Linkage Disequilibrium/genetics , Single Nucleotide Polymorphism/genetics , Principal Component Analysis
ABSTRACT
Forest tree breeding has been successful at delivering genetically improved material for multiple traits based on recurrent cycles of selection, mating, and testing. However, long breeding cycles, late flowering, variable juvenile-mature correlations, emerging pests and diseases, climate, and market changes, all pose formidable challenges. Genetic dissection approaches such as quantitative trait mapping and association genetics have failed to effectively drive operational marker-assisted selection (MAS) in forest trees, largely because of the complex multifactorial inheritance of most, if not all, traits of interest. The convergence of high-throughput genomics and quantitative genetics has established two new paradigms that are changing contemporary tree breeding dogmas. Genomic selection (GS) uses a large number of genome-wide markers to predict complex phenotypes. It has the potential to accelerate breeding cycles, increase selection intensity and improve the accuracy of breeding values. Realized genomic relationship matrices, on the other hand, provide innovations in genetic parameter estimation and breeding approaches by tracking the variation arising from random Mendelian segregation in pedigrees. In light of a recent flow of promising experimental results, here we briefly review the main concepts, analytical tools and remaining challenges that currently underlie the application of genomics data to tree breeding. With easy and cost-effective genotyping, we are now on the brink of extensive adoption of GS in tree breeding. Areas for future GS research include optimizing strategies for updating prediction models, adding validated functional genomics data to improve prediction accuracy, and integrating genomic and multi-environment data for forecasting the performance of genetic material in untested sites or under changing climate scenarios. 
The buildup of phenotypic and genome-wide data across large-scale breeding populations and advances in computational prediction of discrete genomic features should also provide opportunities to enhance the application of genomics to tree breeding.
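As an illustration of what a realized genomic relationship matrix is, the sketch below computes one from SNP genotypes. It assumes the widely used VanRaden (2008) method 1 formulation and genotypes coded as 0/1/2 copies of a counted allele; the abstract does not say which estimator the review favors, and the function name `vanraden_grm` is hypothetical.

```python
import numpy as np

def vanraden_grm(genotypes):
    """Realized genomic relationship matrix, VanRaden (2008) method 1.

    genotypes: array of shape (n_individuals, n_markers), coded as
    0, 1 or 2 copies of the counted allele (an assumed coding).
    """
    M = np.asarray(genotypes, dtype=float)
    p = M.mean(axis=0) / 2.0               # sample allele frequency per marker
    Z = M - 2.0 * p                        # center each marker column
    denom = 2.0 * np.sum(p * (1.0 - p))    # scales G to the pedigree-matrix scale
    return Z @ Z.T / denom
```

Off-diagonal elements of the returned matrix estimate the realized proportion of the genome shared between two individuals, which can deviate from the pedigree expectation because of Mendelian sampling.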
ABSTRACT
Modern genotyping techniques, such as SNP analysis and genotyping by sequencing (GBS), are hampered by poor DNA quality and purity, particularly in challenging plant species rich in secondary metabolites. We therefore investigated the utility of a pre-wash step using a buffered sorbitol solution, prior to DNA extraction using a high-salt CTAB extraction protocol, in a high-throughput or miniprep setting. This pre-wash appears to remove interfering metabolites, such as polyphenols and polysaccharides, from tissue macerates. We also investigated the adaptability of the sorbitol pre-wash for RNA extraction using a lithium chloride-based protocol. The method was successfully applied to a variety of tissues, including leaf, cambium and fruit of diverse plant species including annual crops, forest and fruit trees, herbarium leaf material and lyophilized fungal mycelium. We consistently obtained good yields of high-purity DNA or RNA in all species tested. The protocol has been validated for thousands of DNA samples by generating high-quality data in dense SNP arrays. DNA extracted from Eucalyptus spp. leaf and cambium as well as mycelium from Trichoderma spp. was readily digested with restriction enzymes and performed consistently in AFLP assays. Scaled-up DNA extractions were also suitable for long-read sequencing. Successful RNA quality control and good RNA-Seq data for Eucalyptus and cashew confirm the effectiveness of the sorbitol buffer pre-wash for high-quality RNA extraction.
Subjects
DNA/standards, Eucalyptus/genetics, Single Nucleotide Polymorphism, RNA/standards, Trichoderma/genetics, Buffers, Cambium/genetics, DNA/isolation & purification, Fungal DNA/isolation & purification, Fungal DNA/standards, Plant DNA/isolation & purification, Plant DNA/standards, Genotyping Techniques, Mycelium/genetics, Plant Leaves/genetics, RNA/isolation & purification, Fungal RNA/standards, Plant RNA/isolation & purification, Plant RNA/standards, DNA Sequence Analysis, RNA Sequence Analysis, Sorbitol/chemistry
ABSTRACT
Targeted sequence capture coupled to high-throughput sequencing has become a powerful method for the study of genome-wide sequence variation. Following our recent development of a genome assembly for the Pink Ipê tree (Handroanthus impetiginosus), a widely distributed Neotropical timber species, we now report the development of a set of 24,751 capture probes for single-nucleotide polymorphism (SNP) characterization and genotyping across 18,216 distinct loci, sampling more than 10 Mbp of the species genome. This system identifies nearly 200,000 SNPs located inside or in close proximity to almost 14,000 annotated protein-coding genes, generating quality genotypic data in populations spanning wide geographic distances across the species' native range. To provide recommendations for future developments of similar systems for highly heterozygous plant genomes, we investigated issues such as probe design, sequencing coverage and bioinformatics, including the evaluation of the capture efficiency and a reassessment of the technical reproducibility of the assay for SNP recall and genotyping precision. Our results highlight the value of a detailed probe screening on a preliminary genome assembly to produce reliable data for downstream genetic studies. This work should inspire and assist the development of similar genomic resources for other orphan crops and forest trees with highly heterozygous genomes.
Subjects
Molecular Evolution, Plant Genome, Genomics, Single Nucleotide Polymorphism, Tabebuia/genetics, Trees/genetics, Genetic Variation, Genomics/methods, Genotyping Techniques, High-Throughput Nucleotide Sequencing, Quantitative Trait Loci, Reproducibility of Results
ABSTRACT
Background: Handroanthus impetiginosus (Mart. ex DC.) Mattos is a keystone Neotropical hardwood tree widely distributed in seasonally dry tropical forests of South America and Mesoamerica. Regarded as the "new mahogany," it is the second most expensive timber, the most logged species in Brazil, and is currently under significant illegal trading pressure. The plant produces large amounts of quinoids, specialized metabolites with documented antitumor and antibiotic effects. The development of genomic resources is needed to better understand and conserve the diversity of the species, to empower forensic identification of the origin of timber, and to identify genes for important metabolic compounds. Findings: The genome assembly covers 503.7 Mb (N50 = 81 316 bp), 90.4% of the 557-Mbp genome, with 13 206 scaffolds. A repeat database with 1508 sequences was developed, allowing masking of ~31% of the assembly. Depth of coverage indicated that consensus determination adequately removed haplotypes assembled separately due to the extensive heterozygosity of the species. Automatic gene prediction provided 31 688 structures and 35 479 messenger RNA transcripts, while external evidence supported a well-curated set of 28 603 high-confidence models (90% of total). Finally, we used the genomic sequence and the comprehensive gene content annotation to identify genes related to the production of specialized metabolites. Conclusions: This genome assembly is the first well-curated resource for a Neotropical forest tree and the first one for a member of the Bignoniaceae family, opening exceptional opportunities to empower molecular, phytochemical, and breeding studies. This work should inspire the development of similar genomic resources for the largely neglected forest trees of the mega-diverse tropical biomes.
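For readers unfamiliar with the N50 statistic quoted above, the following minimal sketch shows the standard definition: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. The function name `n50` and the toy lengths are illustrative and are not taken from the assembly itself.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() requires at least one sequence length")

# Toy example: total is 20, and 6 + 5 = 11 reaches half of it.
print(n50([2, 3, 4, 5, 6]))  # 5

# The coverage figure in the abstract follows from simple division.
print(round(503.7 / 557 * 100, 1))  # 90.4
```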
Subjects
Chromosome Mapping/methods, Genetic Databases, Plant Genome, Quinones/metabolism, Tabebuia/genetics, Trees/genetics, Brazil, DNA Transposable Elements, Forests, Genome Size, Haplotypes, Heterozygote, High-Throughput Nucleotide Sequencing, Tabebuia/growth & development, Trees/growth & development, Tropical Climate
ABSTRACT
Streptococcus pyogenes, also known as group A Streptococcus (GAS), is a human pathogen that causes diverse diseases, including streptococcal toxic shock syndrome (STSS). A GAS outbreak occurred in Brasilia, Brazil, during the second half of 2011, causing 26 deaths. Whole-genome sequencing was performed using the Illumina platform. The sequences were assembled and genes were predicted for comparative analysis with the emm type 1 strains MGAS5005 and M1 GAS. Genomic comparison revealed that one of the invasive strains differs from the other isolates and from the emm 1 reference genomes. This new invasive strain also showed differences in virulence factor content compared to the other strains isolated in the same outbreak. The evolution of contemporary GAS strains is strongly associated with horizontal gene transfer. This is the first genomic study of a streptococcal emm 1 outbreak in Brazil, and it revealed rapid bacterial evolution leading to new clones. The emergence of new invasive strains may be a consequence of the injudicious use of antibiotics in Brazil during the past decades.
ABSTRACT
BACKGROUND: The advent of high-throughput genotyping technologies coupled to genomic prediction methods established a new paradigm to integrate genomics and breeding. We carried out whole-genome prediction and contrasted it to a genome-wide association study (GWAS) for growth traits in breeding populations of Eucalyptus benthamii (n = 505) and Eucalyptus pellita (n = 732). Both species are of increasing commercial interest for the development of germplasm adapted to environmental stresses. RESULTS: Predictive ability reached 0.16 in E. benthamii and 0.44 in E. pellita for diameter growth. Predictive abilities using either Genomic BLUP or different Bayesian methods were similar, suggesting that growth adequately fits the infinitesimal model. Genomic prediction models using ~5000-10,000 SNPs provided predictive abilities equivalent to using all 13,787 and 19,506 SNPs genotyped in the E. benthamii and E. pellita populations, respectively. No difference was detected in predictive ability when different sets of SNPs were utilized, based on position (equidistantly genome-wide, inside genes, linkage disequilibrium pruned or on single chromosomes), as long as the total number of SNPs used was above ~5000. Predictive abilities obtained by removing relatedness between training and validation sets fell near zero for E. benthamii and were halved for E. pellita. These results corroborate the current view that relatedness is the main driver of genomic prediction, although some short-range historical linkage disequilibrium (LD) was likely captured for E. pellita. A GWAS identified only one significant association for volume growth in E. pellita, illustrating that while genome-wide regression is able to account for large proportions of the heritability, very little or none of it is captured in significant associations using GWAS in breeding populations of the size evaluated in this study. CONCLUSIONS: This study provides further experimental data supporting positive prospects of using genome-wide data to capture large proportions of trait heritability and predict growth traits in trees with accuracies equal to or better than those attainable by phenotypic selection. Additionally, our results document the superiority of the whole-genome regression approach in accounting for large proportions of the heritability of complex traits such as growth, in contrast to the limited value of the local GWAS approach toward breeding applications in forest trees.
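To make the notion of predictive ability concrete, the sketch below estimates it as the correlation between cross-validated predictions and observed phenotypes. It uses ridge regression on SNP genotypes, which is equivalent to Genomic BLUP under standard assumptions. It is not the study's pipeline: the function names, the shrinkage parameter `lam`, the fold count and the random fold assignment are all assumptions chosen for illustration.

```python
import numpy as np

def ridge_fit_predict(X_train, y_train, X_test, lam):
    """Ridge regression on centered markers, solved in its n x n dual form."""
    mu = y_train.mean()
    col_means = X_train.mean(axis=0)
    Xc = X_train - col_means
    K = Xc @ Xc.T                                   # marker-based kernel
    alpha = np.linalg.solve(K + lam * np.eye(len(y_train)), y_train - mu)
    marker_effects = Xc.T @ alpha                   # shrunken SNP effects
    return mu + (X_test - col_means) @ marker_effects

def predictive_ability(X, y, lam, k=5, seed=0):
    """Correlation between k-fold cross-validated predictions and phenotypes."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    preds = np.empty(len(y))
    for fold in folds:
        train = np.setdiff1d(np.arange(len(y)), fold)
        preds[fold] = ridge_fit_predict(X[train], y[train], X[fold], lam)
    return np.corrcoef(preds, y)[0, 1]
```

Random folds leave relatives in both training and validation sets. Assigning whole families to a single fold instead would mimic the relatedness-removal comparison described in the abstract.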