Search | VHL Regional Portal

A method for partitioning trends in genetic mean and variance to understand breeding practices.

Oliveira, Thiago P; Obsteter, Jana; Pocrnic, Ivan; Heslot, Nicolas; Gorjanc, Gregor.

Genet Sel Evol ; 55(1): 36, 2023 Jun 02.

Article in English | MEDLINE | ID: mdl-37268883

ABSTRACT

BACKGROUND: In breeding programmes, the observed genetic change is a sum of the contributions of different selection paths represented by groups of individuals. Quantifying these sources of genetic change is essential for identifying the key breeding actions and optimizing breeding programmes. However, it is difficult to disentangle the contribution of individual paths due to the inherent complexity of breeding programmes. Here we extend the previously developed method for partitioning genetic mean by paths of selection to work both with the mean and variance of breeding values. METHODS: First, we extended the partitioning method to quantify the contribution of different paths to genetic variance assuming that the breeding values are known. Second, we combined the partitioning method with the Markov Chain Monte Carlo approach to draw samples from the posterior distribution of breeding values and use these samples for computing the point and interval estimates of partitions for the genetic mean and variance. We implemented the method in the R package AlphaPart. We demonstrated the method with a simulated cattle breeding programme. RESULTS: We show how to quantify the contribution of different groups of individuals to genetic mean and variance and that the contributions of different selection paths to genetic variance are not necessarily independent. Finally, we observed that the partitioning method under the pedigree-based model has some limitations, which suggests the need for a genomic extension. CONCLUSIONS: We presented a partitioning method to quantify sources of change in genetic mean and variance in breeding programmes. The method can help breeders and researchers understand the dynamics in genetic mean and variance in a breeding programme. The developed method for partitioning genetic mean and variance is a powerful method for understanding how different selection paths interact within a breeding programme and how they can be optimised.

Subject(s)

Genome , Genomics , Animals , Cattle/genetics , Monte Carlo Method , Pedigree , Markov Chains , Models, Genetic , Selection, Genetic

An alternative covariance estimator to investigate genetic heterogeneity in populations.

Heslot, Nicolas; Jannink, Jean-Luc.

Genet Sel Evol ; 47: 93, 2015 Nov 26.

Article in English | MEDLINE | ID: mdl-26612537

ABSTRACT

BACKGROUND: For genomic prediction and genome-wide association studies (GWAS) using mixed models, covariance between individuals is estimated using molecular markers. Based on the properties of mixed models, using available molecular data for prediction is optimal if this covariance is known. Under this assumption, adding individuals to the analysis should never be detrimental. However, some empirical studies showed that increasing training population size decreased prediction accuracy. Recently, results from theoretical models indicated that even if marker density is high and the genetic architecture of traits is controlled by many loci with small additive effects, the covariance between individuals, which depends on relationships at causal loci, is not always well estimated by the whole-genome kinship. RESULTS: We propose an alternative covariance estimator named K-kernel, to account for potential genetic heterogeneity between populations that is characterized by a lack of genetic correlation, and to limit the information flow between a priori unknown populations in a trait-specific manner. This is similar to a multi-trait model and parameters are estimated by REML and, in extreme cases, it can allow for an independent genetic architecture between populations. As such, K-kernel is useful to study the problem of the design of training populations. K-kernel was compared to other covariance estimators or kernels to examine its fit to the data, cross-validated accuracy and suitability for GWAS on several datasets. It provides a significantly better fit to the data than the genomic best linear unbiased prediction model and, in some cases it performs better than other kernels such as the Gaussian kernel, as shown by an empirical null distribution. In GWAS simulations, alternative kernels control type I errors as well as or better than the classical whole-genome kinship and increase statistical power. No or small gains were observed in cross-validated prediction accuracy. CONCLUSIONS: This alternative covariance estimator can be used to gain insight into trait-specific genetic heterogeneity by identifying relevant sub-populations that lack genetic correlation between them. Genetic correlation can be 0 between identified sub-populations by performing automatic selection of relevant sets of individuals to be included in the training population. It may also increase statistical power in GWAS.

Subject(s)

Genetic Heterogeneity , Genetics, Population , Models, Genetic , Algorithms , Breeding , Computer Simulation , Datasets as Topic , Genetic Association Studies , Genetic Markers , Genetics, Population/methods , Genome-Wide Association Study/methods , Genomics/methods , Models, Statistical , Quantitative Trait, Heritable , Reproducibility of Results

Training set optimization under population structure in genomic selection.

Isidro, Julio; Jannink, Jean-Luc; Akdemir, Deniz; Poland, Jesse; Heslot, Nicolas; Sorrells, Mark E.

Theor Appl Genet ; 128(1): 145-58, 2015 Jan.

Article in English | MEDLINE | ID: mdl-25367380

ABSTRACT

KEY MESSAGE: Population structure must be evaluated before optimization of the training set population. Maximizing the phenotypic variance captured by the training set is important for optimal performance. The optimization of the training set (TRS) in genomic selection has received much interest in both animal and plant breeding, because it is critical to the accuracy of the prediction models. In this study, five different TRS sampling algorithms, stratified sampling, mean of the coefficient of determination (CDmean), mean of predictor error variance (PEVmean), stratified CDmean (StratCDmean) and random sampling, were evaluated for prediction accuracy in the presence of different levels of population structure. In the presence of population structure, the most phenotypic variation captured by a sampling method in the TRS is desirable. The wheat dataset showed mild population structure, and CDmean and stratified CDmean methods showed the highest accuracies for all the traits except for test weight and heading date. The rice dataset had strong population structure and the approach based on stratified sampling showed the highest accuracies for all traits. In general, CDmean minimized the relationship between genotypes in the TRS, maximizing the relationship between TRS and the test set. This makes it suitable as an optimization criterion for long-term selection. Our results indicated that the best selection criterion used to optimize the TRS seems to depend on the interaction of trait architecture and population structure.

Subject(s)

Genetics, Population/methods , Genomics/methods , Selection, Genetic , Breeding , Cluster Analysis , Genotype , Models, Statistical , Oryza/genetics , Phenotype , Principal Component Analysis , Triticum/genetics

Integrating environmental covariates and crop modeling into the genomic selection framework to predict genotype by environment interactions.

Heslot, Nicolas; Akdemir, Deniz; Sorrells, Mark E; Jannink, Jean-Luc.

Theor Appl Genet ; 127(2): 463-80, 2014 Feb.

Article in English | MEDLINE | ID: mdl-24264761

ABSTRACT

KEY MESSAGE: Development of models to predict genotype by environment interactions, in unobserved environments, using environmental covariates, a crop model and genomic selection. Application to a large winter wheat dataset. Genotype by environment interaction (G*E) is one of the key issues when analyzing phenotypes. The use of environment data to model G*E has long been a subject of interest but is limited by the same problems as those addressed by genomic selection methods: a large number of correlated predictors each explaining a small amount of the total variance. In addition, non-linear responses of genotypes to stresses are expected to further complicate the analysis. Using a crop model to derive stress covariates from daily weather data for predicted crop development stages, we propose an extension of the factorial regression model to genomic selection. This model is further extended to the marker level, enabling the modeling of quantitative trait loci (QTL) by environment interaction (Q*E), on a genome-wide scale. A newly developed ensemble method, soft rule fit, was used to improve this model and capture non-linear responses of QTL to stresses. The method is tested using a large winter wheat dataset, representative of the type of data available in a large-scale commercial breeding program. Accuracy in predicting genotype performance in unobserved environments for which weather data were available increased by 11.1% on average and the variability in prediction accuracy decreased by 10.8%. By leveraging agronomic knowledge and the large historical datasets generated by breeding programs, this new model provides insight into the genetic architecture of genotype by environment interactions and could predict genotype performance based on past and future weather scenarios.

Subject(s)

Crops, Agricultural/genetics , Gene-Environment Interaction , Genotype , Models, Genetic

Impact of marker ascertainment bias on genomic selection accuracy and estimates of genetic diversity.

Heslot, Nicolas; Rutkoski, Jessica; Poland, Jesse; Jannink, Jean-Luc; Sorrells, Mark E.

PLoS One ; 8(9): e74612, 2013.

Article in English | MEDLINE | ID: mdl-24040295

ABSTRACT

Genome-wide molecular markers are often being used to evaluate genetic diversity in germplasm collections and for making genomic selections in breeding programs. To accurately predict phenotypes and assay genetic diversity, molecular markers should assay a representative sample of the polymorphisms in the population under study. Ascertainment bias arises when marker data is not obtained from a random sample of the polymorphisms in the population of interest. Genotyping-by-sequencing (GBS) is rapidly emerging as a low-cost genotyping platform, even for the large, complex, and polyploid wheat (Triticum aestivum L.) genome. With GBS, marker discovery and genotyping occur simultaneously, resulting in minimal ascertainment bias. The previous platform of choice for whole-genome genotyping in many species such as wheat was DArT (Diversity Array Technology) and has formed the basis of most of our knowledge about cereals genetic diversity. This study compared GBS and DArT marker platforms for measuring genetic diversity and genomic selection (GS) accuracy in elite U.S. soft winter wheat. From a set of 365 breeding lines, 38,412 single nucleotide polymorphism GBS markers were discovered and genotyped. The GBS SNPs gave a higher GS accuracy than 1,544 DArT markers on the same lines, despite 43.9% missing data. Using a bootstrap approach, we observed significantly more clustering of markers and ascertainment bias with DArT relative to GBS. The minor allele frequency distribution of GBS markers had a deficit of rare variants compared to DArT markers. Despite the ascertainment bias of the DArT markers, GS accuracy for three traits out of four was not significantly different when an equal number of markers were used for each platform. This suggests that the gain in accuracy observed using GBS compared to DArT markers was mainly due to a large increase in the number of markers available for the analysis.

Subject(s)

Genetic Markers , Genetic Variation , Genome, Plant , Triticum/genetics , Gene Frequency , Genotype , Models, Statistical , Phenotype , Polymorphism, Single Nucleotide , Principal Component Analysis , Reproducibility of Results , Sequence Analysis, DNA
