Results 1 - 20 of 38
1.
Theor Appl Genet ; 136(12): 252, 2023 Nov 21.
Article in English | MEDLINE | ID: mdl-37987845

ABSTRACT

KEY MESSAGE: Simulations demonstrated that estimates of realized genetic gain from linear mixed models using regional trials are biased to some degree. Thus, we recommend applying multiple selected models to obtain a range of reasonable estimates. Genetic improvements of discrete characteristics are obvious and easy to demonstrate, while quantitative traits require reliable and accurate methods to disentangle the confounding genetic and non-genetic components. Stochastic simulations of soybean [Glycine max (L.) Merr.] breeding programs were performed to evaluate linear mixed models to estimate the realized genetic gain (RGG) from annual multi-environment trials (MET). True breeding values were simulated under an infinitesimal model to represent the genetic contributions to soybean seed yield under various MET conditions. Estimators were evaluated using objective criteria of bias and linearity. Covariance modeling and direct versus indirect estimation-based models resulted in a substantial range of estimated values, all of which were biased to some degree. Although no models produced unbiased estimates, the three best-performing models resulted in an average bias of [Formula: see text] kg/ha/yr ([Formula: see text] bu/ac/yr). Rather than relying on a single model to estimate RGG, we recommend the application of several models with minimal and directional bias. Further, based on the parameters used in the simulations, we do not think it is appropriate to use any single model to compare breeding programs or quantify the efficiency of proposed new breeding strategies. Lastly, for public soybean programs breeding for maturity groups II and III in North America, the estimated RGG values ranged from 18.16 to 39.68 kg/ha/yr (0.27-0.59 bu/ac/yr) from 1989 to 2019.
These results provide strong evidence that public breeders have significantly improved soybean germplasm for seed yield in the primary production areas of North America.


Subject(s)
Glycine max , Plant Breeding , Glycine max/genetics , Cytoplasm , Linear Models , Seeds/genetics
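The idea of estimating RGG as a yearly rate of change can be illustrated with a much simpler estimator than the linear mixed models evaluated in the abstract above. The following is a minimal sketch, assuming an ordinary least-squares regression of cohort means on year; the function names, the simulated trend of 30 kg/ha/yr, and the noise level are all hypothetical values chosen for illustration, not parameters from the study.

```python
import random

def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

def simulate_cohort_means(true_gain, years, lines_per_year, noise_sd, rng):
    """Return (years, cohort means) for annual cohorts on a linear genetic trend."""
    xs, ys = [], []
    for year in years:
        # Hypothetical cohort: values scattered around the assumed linear trend.
        values = [true_gain * (year - years[0]) + rng.gauss(0.0, noise_sd)
                  for _ in range(lines_per_year)]
        xs.append(year)
        ys.append(sum(values) / lines_per_year)
    return xs, ys

rng = random.Random(1)                      # fixed seed for reproducibility
years = list(range(1989, 2020))             # same span as the empirical data
TRUE_GAIN = 30.0                            # assumed kg/ha/yr, illustration only
xs, ys = simulate_cohort_means(TRUE_GAIN, years, lines_per_year=50,
                               noise_sd=200.0, rng=rng)
estimated_rgg = ols_slope(xs, ys)
bias = estimated_rgg - TRUE_GAIN            # estimator bias in this one replicate
print(f"estimated RGG = {estimated_rgg:.2f} kg/ha/yr, bias = {bias:.2f}")
```

Because the true trend is known in a simulation, the difference between the estimated slope and the simulated gain gives the bias directly, which is the kind of criterion the abstract uses to compare estimators.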
2.
Front Plant Sci ; 14: 1218042, 2023.
Article in English | MEDLINE | ID: mdl-37860246

ABSTRACT

In maize, doubled haploid (DH) lines are created in vivo through crosses with maternal haploid inducers. Their induction ability, usually expressed as haploid induction rate (HIR), is known to be under polygenic control. Although two major genes (MTL and ZmDMP) affecting this trait were recently described, many others remain unknown. To identify them, we designed and performed a SNP-based (~9007 SNPs) genome-wide association study using a large and diverse panel of 159 maternal haploid inducers. Our analyses identified a major gene near MTL, which is present in all inducers and necessary to disrupt haploid induction. We also found a significant quantitative trait locus (QTL) on chromosome 10 using a case-control mapping approach, in which 793 noninducers were used as controls. This QTL harbors a kokopelli ortholog, whose role in maternal haploid induction was recently described in Arabidopsis. QTL with smaller effects were identified on six of the ten maize chromosomes, confirming the polygenic nature of this trait. These QTL could be incorporated into inducer breeding programs through marker-assisted selection approaches. Further improving HIR is important to reduce the cost of DH line production.

3.
Plant Genome ; 16(1): e20308, 2023 03.
Article in English | MEDLINE | ID: mdl-36744727

ABSTRACT

Soybean is grown primarily for the protein and oil extracted from its seed, and its value is influenced by these components. The objective of this study was to map marker-trait associations (MTAs) for the concentration of seed protein, oil, and meal protein using the soybean nested association mapping (SoyNAM) population. The composition traits were evaluated on seed harvested from over 5000 inbred lines of the SoyNAM population grown in 10 field locations across 3 years. Estimated heritabilities were at least 0.85 for all three traits. The genotyping of lines with single nucleotide polymorphism markers resulted in the identification of 107 MTAs for the three traits. When MTAs for the three traits that mapped within 5 cM intervals were binned together, the MTAs were mapped to 64 intervals on 19 of the 20 soybean chromosomes. The majority of the MTA effects were small; of the 107 MTAs, 37 were for protein content, 39 for meal protein, and 31 for oil content. For cases where protein and oil MTAs mapped to the same interval, most (94%) significant effects were opposite for the two traits, consistent with the negative correlation between these traits. A coexpression analysis identified 18 candidate genes linked to MTAs. The large number of small-effect MTAs for the composition traits suggests that genomic prediction would be more effective in improving these traits than marker-assisted selection.


Subject(s)
Glycine max , Quantitative Trait Loci , Glycine max/genetics , Chromosome Mapping/methods , Genome, Plant , Seeds/genetics
4.
Genetics ; 221(2)2022 05 31.
Article in English | MEDLINE | ID: mdl-35451475

ABSTRACT

Photosynthesis is a key target to improve crop production in many species including soybean [Glycine max (L.) Merr.]. A challenge is that phenotyping photosynthetic traits by traditional approaches is slow and destructive. There is proof-of-concept for leaf hyperspectral reflectance as a rapid method to model photosynthetic traits. However, the crucial step of demonstrating that hyperspectral approaches can be used to advance understanding of the genetic architecture of photosynthetic traits is untested. To address this challenge, we used full-range (500-2,400 nm) leaf reflectance spectroscopy to build partial least squares regression models to estimate leaf traits, including the rate-limiting processes of photosynthesis, maximum Rubisco carboxylation rate, and maximum electron transport. In total, 11 models were produced from a diverse population of soybean sampled over multiple field seasons to estimate photosynthetic parameters, chlorophyll content, leaf carbon and leaf nitrogen percentage, and specific leaf area (with R² from 0.56 to 0.96 and root mean square error approximately <10% of the range of calibration data). We explore the utility of these models by applying them to the soybean nested association mapping population, which showed variability in photosynthetic and leaf traits. Genetic mapping provided insights into the underlying genetic architecture of photosynthetic traits and potential improvement in soybean. Notably, the maximum Rubisco carboxylation rate mapped to a region of chromosome 19 containing genes encoding multiple small subunits of Rubisco. We also mapped the maximum electron transport rate to a region of chromosome 10 containing a fructose 1,6-bisphosphatase gene, encoding an important enzyme in the regeneration of ribulose 1,5-bisphosphate and the sucrose biosynthetic pathway.
The estimated rate-limiting steps of photosynthesis showed low or negative correlations with yield, suggesting that these traits are not influenced by the same genetic mechanisms and are not limiting yield in the soybean NAM population. Leaf carbon percentage, leaf nitrogen percentage, and specific leaf area showed strong correlations with yield and may be of interest in breeding programs as a proxy for yield. This work is among the first to use hyperspectral reflectance to model and map the genetic architecture of the rate-limiting steps of photosynthesis.


Subject(s)
Glycine max , Ribulose-Bisphosphate Carboxylase , Carbon , Nitrogen/metabolism , Photosynthesis/genetics , Plant Breeding , Plant Leaves/genetics , Plant Leaves/metabolism , Ribulose-Bisphosphate Carboxylase/genetics , Ribulose-Bisphosphate Carboxylase/metabolism , Glycine max/genetics
5.
Genetics ; 219(3)2021 11 05.
Article in English | MEDLINE | ID: mdl-34740243

ABSTRACT

The Beavis effect in quantitative trait locus (QTL) mapping describes the phenomenon in which the estimated effect size of a statistically significant QTL (measured by the QTL variance) is greater than the true effect size of the QTL if the sample size is not sufficiently large. This is a typical example of the Winners' curse applied to molecular quantitative genetics. Theoretical evaluation and correction for the Winners' curse have been studied for interval mapping. However, similar methods have not been available for current models of QTL mapping and genome-wide association studies, where a polygene is often included in the linear mixed models to control the genetic background effect. In this study, we developed the theory of the Beavis effect in a linear mixed model using a truncated noncentral Chi-square distribution. We equated the observed Wald test statistic of a significant QTL to the expectation of a truncated noncentral Chi-square distribution to obtain a bias-corrected estimate of the QTL variance. The results were validated with replicated Monte Carlo simulation experiments. We applied the new method to the grain width (GW) trait of a rice population consisting of 524 homozygous varieties with over 300 k single nucleotide polymorphism markers. Two loci were identified and the estimated QTL heritabilities were corrected for the Beavis effect. Bias correction for the larger QTL on chromosome 5 (GW5), with an estimated heritability of 12%, did not change the QTL heritability because of the extremely large test score and estimated QTL effect. The smaller QTL on chromosome 9 (GW9) had an estimated QTL heritability of 9%, which was reduced to 6% after the bias correction.


Subject(s)
Chromosome Mapping/methods , Models, Genetic , Oryza/genetics , Quantitative Trait Loci , Chromosomes, Plant/genetics , Computer Simulation , Genome-Wide Association Study , Monte Carlo Method , Multifactorial Inheritance , Multivariate Analysis , Seeds/genetics
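The correction step described in the abstract above (equating the observed Wald statistic to the expectation of a truncated noncentral chi-square distribution) can be sketched for the single-degree-of-freedom case. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the statistic is modeled as W = (Z + delta)^2 with Z standard normal, and the mapping from the noncentrality parameter to a QTL variance or heritability is not reproduced here.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def truncated_mean(ncp, threshold):
    """E[W | W > threshold] for W ~ noncentral chi-square (df = 1, noncentrality ncp)."""
    delta = math.sqrt(ncp)
    a = math.sqrt(threshold) - delta     # W > threshold when Z > a ...
    b = -math.sqrt(threshold) - delta    # ... or when Z < b, where W = (Z + delta)^2
    pdf, cdf = STD_NORMAL.pdf, STD_NORMAL.cdf
    upper_p = 1.0 - cdf(a)
    lower_p = cdf(b)
    # Partial expectations of (Z + delta)^2 over the two significant tails.
    upper = a * pdf(a) + upper_p + 2.0 * delta * pdf(a) + ncp * upper_p
    lower = lower_p - b * pdf(b) - 2.0 * delta * pdf(b) + ncp * lower_p
    return (upper + lower) / (upper_p + lower_p)

def corrected_ncp(observed_wald, threshold, tol=1e-8):
    """Solve truncated_mean(ncp) = observed_wald for ncp by bisection."""
    if observed_wald <= truncated_mean(0.0, threshold):
        return 0.0                       # statistic explained by selection alone
    lo, hi = 0.0, observed_wald          # the truncated mean exceeds 1 + ncp, so the root is below hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if truncated_mean(mid, threshold) < observed_wald:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

THRESHOLD = 15.0                         # hypothetical significance threshold on the Wald scale
for wald in (20.0, 200.0):
    naive = wald - 1.0                   # uncorrected estimate, since E[W] = 1 + ncp
    print(wald, naive, corrected_ncp(wald, THRESHOLD))
```

In this sketch a statistic just above the threshold is shrunk noticeably, while a very large statistic is left essentially unchanged, which parallels the contrast between the two loci reported in the abstract.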
6.
Front Genet ; 12: 675500, 2021.
Article in English | MEDLINE | ID: mdl-34630507

ABSTRACT

Plant breeding is a decision-making discipline based on understanding project objectives. Genetic improvement projects can have two competing objectives: maximize the rate of genetic improvement and minimize the loss of useful genetic variance. For commercial plant breeders, competition in the marketplace forces greater emphasis on maximizing immediate genetic improvements. In contrast, public plant breeders have an opportunity, perhaps an obligation, to place greater emphasis on minimizing the loss of useful genetic variance while realizing genetic improvements. Considerable research indicates that short-term genetic gains from genomic selection are much greater than phenotypic selection, while phenotypic selection provides better long-term genetic gains because it retains useful genetic diversity during the early cycles of selection. With limited resources, must a soybean breeder choose between the two extreme responses provided by genomic selection or phenotypic selection? Or is it possible to develop novel breeding strategies that will provide a desirable compromise between the competing objectives? To address these questions, we decomposed breeding strategies into decisions about selection methods, mating designs, and whether the breeding population should be organized as family islands. For breeding populations organized into islands, decisions about possible migration rules among family islands were included. From among 60 possible strategies, genetic improvement is maximized for the first five to 10 cycles using genomic selection and a hub network mating design, where the hub parents with the largest selection metric make large parental contributions. It also requires that the breeding populations be organized as fully connected family islands, where every island is connected to every other island, and migration rules allow the exchange of two lines among islands every other cycle of selection. 
If the objectives are to maximize both short-term and long-term gains, then the best compromise strategy is similar except that the mating design could be hub network, chain rule, or a multi-objective optimization method-based mating design. Weighted genomic selection applied to centralized populations also resulted in the realization of the greatest proportion of the genetic potential of the founders but required more cycles than the best compromise strategy.

7.
Front Plant Sci ; 12: 544854, 2021.
Article in English | MEDLINE | ID: mdl-34220873

ABSTRACT

Trait introgression is a complex process that plant breeders use to introduce desirable alleles from one variety or species to another. Two of the major types of decisions that must be made during this sophisticated and uncertain workflow are: parental selection and resource allocation. We formulated the trait introgression problem as an engineering process and proposed a Markov Decision Processes (MDP) model to optimize the resource allocation procedure. The efficiency of the MDP model was compared with static resource allocation strategies and their trade-offs among budget, deadline, and probability of success are demonstrated. Simulation results suggest that dynamic resource allocation strategies from the MDP model significantly improve the efficiency of the trait introgression by allocating the right amount of resources according to the genetic outcome of previous generations.

8.
PLoS One ; 16(7): e0240948, 2021.
Article in English | MEDLINE | ID: mdl-34242220

ABSTRACT

In soybean variety development and genetic improvement projects, iron deficiency chlorosis (IDC) is visually assessed as an ordinal response variable. Linear Mixed Models for Genomic Prediction (GP) have been developed, compared, and used to select continuous plant traits such as yield, height, and maturity, but can be inappropriate for ordinal traits. Generalized Linear Mixed Models have been developed for GP of ordinal response variables. However, neither approach addresses the most important questions for cultivar development and genetic improvement: How frequently are the 'wrong' genotypes retained, and how often are the 'correct' genotypes discarded? The research objective reported herein was to compare outcomes from four data modeling and six algorithmic modeling GP methods applied to IDC using decision metrics appropriate for variety development and genetic improvement projects. Appropriate metrics for decision making consist of specificity, sensitivity, precision, decision accuracy, and area under the receiver operating characteristic curve. Data modeling methods for GP included ridge regression, logistic regression, penalized logistic regression, and Bayesian generalized linear regression. Algorithmic modeling methods include Random Forest, Gradient Boosting Machine, Support Vector Machine, K-Nearest Neighbors, Naïve Bayes, and Artificial Neural Network. We found that a Support Vector Machine model provided the most specific decisions of correctly discarding IDC susceptible genotypes, while a Random Forest model resulted in the best decisions of retaining IDC tolerant genotypes, as well as the best outcomes when considering all decision metrics. Overall, the predictions from algorithmic modeling result in better decisions than from data modeling methods applied to soybean IDC.


Subject(s)
Algorithms , Glycine max/metabolism , Iron Deficiencies , Models, Statistical , Bayes Theorem , Cluster Analysis , Logistic Models , Machine Learning
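The decision metrics named in the abstract above can be computed from a simple confusion matrix. The following is a minimal sketch, assuming a binary retain/discard decision in which "positive" means an IDC-tolerant genotype that should be retained; that coding, the function name, and the example data are assumptions for illustration, and the area under the ROC curve is omitted because it needs continuous scores rather than decisions.

```python
def decision_metrics(actual, predicted):
    """Confusion-matrix metrics; True = tolerant genotype (retain), False = susceptible (discard).

    Assumes both classes occur in `actual` and at least one genotype is predicted True.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)            # tolerant, retained
    fn = sum(1 for a, p in pairs if a and not p)        # tolerant, wrongly discarded
    tn = sum(1 for a, p in pairs if not a and not p)    # susceptible, discarded
    fp = sum(1 for a, p in pairs if not a and p)        # susceptible, wrongly retained
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / len(pairs),
    }

# Hypothetical observed classes and model decisions for eight genotypes.
actual = [True, True, True, False, False, False, False, True]
predicted = [True, True, False, False, False, True, False, True]
metrics = decision_metrics(actual, predicted)
print(metrics)
```

Framing the comparison this way makes the two questions in the abstract explicit: one minus specificity is how often 'wrong' genotypes are retained, and one minus sensitivity is how often 'correct' genotypes are discarded.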
9.
Theor Appl Genet ; 132(3): 817-849, 2019 Mar.
Article in English | MEDLINE | ID: mdl-30798332

ABSTRACT

Maize has for many decades been both one of the most important crops worldwide and one of the primary genetic model organisms. More recently, maize breeding has been impacted by rapid technological advances in sequencing and genotyping technology, transformation including genome editing, and doubled haploid technology, paralleled by progress in data sciences and the development of novel breeding approaches utilizing genomic information. Herein, we report on past, current and future developments relevant for maize breeding with regard to (1) genome analysis, (2) germplasm diversity characterization and utilization, (3) manipulation of genetic diversity by transformation and genome editing, (4) inbred line development and hybrid seed production, (5) understanding and prediction of hybrid performance, (6) breeding methodology and (7) synthesis of opportunities and challenges for future maize breeding.


Subject(s)
Plant Breeding/methods , Zea mays/genetics , Chromosome Mapping , Genetic Variation , Genome, Plant , Genomics
10.
Heredity (Edinb) ; 122(5): 672-683, 2019 05.
Article in English | MEDLINE | ID: mdl-30262841

ABSTRACT

The purpose of breeding programs is to obtain sustainable gains in multiple traits while controlling the loss of genetic variation. The decisions at each breeding cycle involve multiple, usually competing, objectives; these complex decisions can be supported by the insights that are gained by applying multi-objective optimization principles to breeding. The discussion in this manuscript includes the definition of several multi-objective optimized breeding approaches within the phenotypic or genomic breeding frameworks and the comparison of these approaches with the standard multi-trait breeding schemes such as tandem selection, independent culling and index selection. Proposed methods are demonstrated with two empirical data sets and simulations. In addition, we have described several graphical tools that can aid breeders in arriving at a compromise decision. The results show that the proposed methodology is a viable approach to answer several real breeding problems. In simulations, the newly proposed methods resulted in gains larger than the methods previously proposed, including index selection: compared to the best alternative breeding strategy, the gains from multi-objective optimized parental proportions approaches were about 20-30% higher at the end of long-term simulations of breeding cycles. In addition, the flexibility of the multi-objective optimized breeding strategies was displayed with methods and examples covering non-dominated selection, assignment of optimal parental proportions, using genome-wide marker effects in producing optimal mating designs, and finally in selection of training populations for genomic prediction.


Subject(s)
Breeding , Genome/genetics , Computer Simulation , Genetic Markers/genetics , Genetic Variation , Genomics , Models, Genetic , Phenotype , Quantitative Trait, Heritable , Selection, Genetic
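Non-dominated selection, one of the approaches listed in the abstract above, keeps every candidate that no other candidate beats on all objectives at once. The following is a minimal sketch, assuming two objectives that are both maximized (for example, expected gain and retained diversity); the function names and the candidate values are hypothetical.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly better on one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def non_dominated(candidates):
    """Return the candidates that no other candidate dominates (the Pareto front)."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]

# Hypothetical (expected gain, retained diversity) scores for five candidate parents.
candidates = [(10, 0.2), (8, 0.5), (6, 0.4), (4, 0.9), (10, 0.1)]
front = non_dominated(candidates)
print(front)
```

The breeder then chooses a compromise among the candidates on the front, which is where graphical tools of the kind mentioned in the abstract come in.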
11.
Bioinformatics ; 35(14): 2512-2514, 2019 07 15.
Article in English | MEDLINE | ID: mdl-30508039

ABSTRACT

SUMMARY: We present GWASpro, a high-performance web server for the analyses of large-scale genome-wide association studies (GWAS). GWASpro was developed to provide data analyses for large-scale molecular genetic data, coupled with complex replicated experimental designs such as found in plant science investigations and to overcome the steep learning curves of existing GWAS software tools. GWASpro supports building complex design matrices, by which complex experimental designs that may include replications, treatments, locations and times, can be accounted for in the linear mixed model. GWASpro is optimized to handle GWAS data that may consist of up to 10 million markers and 10 000 samples from replicable lines or hybrids. GWASpro provides an interface that significantly reduces the learning curve for new GWAS investigators. AVAILABILITY AND IMPLEMENTATION: GWASpro is freely available at https://bioinfo.noble.org/GWASPRO. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genome-Wide Association Study , Software , Computers
12.
G3 (Bethesda) ; 8(10): 3367-3375, 2018 10 03.
Article in English | MEDLINE | ID: mdl-30131329

ABSTRACT

Soybean is the world's leading source of vegetable protein and demand for its seed continues to grow. Breeders have successfully increased soybean yield, but the genetic architecture of yield and key agronomic traits is poorly understood. We developed a 40-mating soybean nested association mapping (NAM) population of 5,600 inbred lines that were characterized by single nucleotide polymorphism (SNP) markers and six agronomic traits in field trials in 22 environments. Analysis of the yield, agronomic, and SNP data revealed 23 significant marker-trait associations for yield, 19 for maturity, 15 for plant height, 17 for plant lodging, and 29 for seed mass. A higher frequency of estimated positive yield alleles was evident from elite founder parents than from exotic founders, although unique desirable alleles from the exotic group were identified, demonstrating the value of expanding the genetic base of US soybean breeding.


Subject(s)
Glycine max/genetics , Quantitative Trait Loci , Quantitative Trait, Heritable , Chromosome Mapping , Chromosomes, Plant , Gene Expression Regulation, Plant , Genetics, Population , Genome, Plant , Phenotype , Polymorphism, Single Nucleotide
13.
G3 (Bethesda) ; 8(2): 519-529, 2018 02 02.
Article in English | MEDLINE | ID: mdl-29217731

ABSTRACT

Genetic improvement toward optimized and stable agronomic performance of soybean genotypes is desirable for food security. Understanding how genotypes perform in different environmental conditions helps breeders develop sustainable cultivars adapted to target regions. Complex traits of importance are known to be controlled by a large number of genomic regions with small effects whose magnitude and direction are modulated by environmental factors. Knowledge of the constraints and undesirable effects resulting from genotype by environmental interactions is a key objective in improving selection procedures in soybean breeding programs. In this study, the genetic basis of soybean grain yield responsiveness to environmental factors was examined in a large soybean nested association population. For this, a genome-wide association to performance stability estimates generated from a Finlay-Wilkinson analysis and the inclusion of the interaction between marker genotypes and environmental factors was implemented. Genomic footprints were investigated by analysis and meta-analysis using a recently published multiparent model. Results indicated that specific soybean genomic regions were associated with stability, and that multiplicative interactions were present between environments and genetic background. Seven genomic regions in six chromosomes were identified as being associated with genotype-by-environment interactions. This study provides insight into genomic assisted breeding aimed at achieving a more stable agronomic performance of soybean, and documented opportunities to exploit genomic regions that were specifically associated with interactions involving environments and subpopulations.


Subject(s)
Edible Grain/genetics , Gene-Environment Interaction , Genome, Plant/genetics , Genome-Wide Association Study/methods , Glycine max/genetics , Chromosome Mapping , Chromosomes, Plant/genetics , Genes, Plant/genetics , Genotype , Polymorphism, Single Nucleotide , Quantitative Trait Loci/genetics , Seeds/genetics
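The stability estimates mentioned in the abstract above come from a Finlay-Wilkinson analysis, in which each genotype's performance is regressed on an environmental index. The following is a minimal sketch, assuming balanced data and the simple two-step form in which the index is the environment mean minus the grand mean; the function name and the yield values are hypothetical, and the study's own model may differ.

```python
def finlay_wilkinson_slopes(yields):
    """Regress each genotype's yields on the environmental index.

    yields: dict mapping genotype -> list of yields, one per environment (balanced).
    Returns a dict mapping genotype -> regression slope (responsiveness).
    """
    genotypes = list(yields)
    n_env = len(yields[genotypes[0]])
    env_means = [sum(yields[g][j] for g in genotypes) / len(genotypes)
                 for j in range(n_env)]
    grand_mean = sum(env_means) / n_env
    index = [m - grand_mean for m in env_means]      # environmental index
    sxx = sum(i * i for i in index)
    slopes = {}
    for g in genotypes:
        mean_g = sum(yields[g]) / n_env
        sxy = sum(i * (y - mean_g) for i, y in zip(index, yields[g]))
        slopes[g] = sxy / sxx
    return slopes

# Hypothetical yields (t/ha) of three genotypes in three environments.
yields = {
    "A": [2.0, 3.0, 4.0],   # average responsiveness
    "B": [2.5, 3.0, 3.5],   # stable: changes little across environments
    "C": [1.5, 3.0, 4.5],   # highly responsive to better environments
}
slopes = finlay_wilkinson_slopes(yields)
print(slopes)
```

A slope near 1 indicates average responsiveness, while slopes below or above 1 indicate more stable or more responsive genotypes; slopes of this kind could then serve as the phenotype in an association analysis.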
14.
Plant Genome ; 10(2)2017 07.
Article in English | MEDLINE | ID: mdl-28724064

ABSTRACT

A set of nested association mapping (NAM) families was developed by crossing 40 diverse soybean [Glycine max (L.) Merr.] genotypes to a common cultivar. The 41 parents were deeply sequenced for SNP discovery. Based on the polymorphism of the single-nucleotide polymorphisms (SNPs) and other selection criteria, a set of SNPs was selected to be included in the SoyNAM6K BeadChip for genotyping the parents and 5600 recombinant inbred lines (RILs) from the 40 families. Analysis of the SNP profiles of the RILs showed a low average recombination rate. We constructed genetic linkage maps for each family and a composite linkage map based on RILs across the families, and identified and annotated 525,772 high-confidence SNPs that were used to impute the SNP alleles in the RILs. The segregation distortion in most families significantly favored the alleles from the female parent, and there was no significant difference in residual heterozygosity between the euchromatic and heterochromatic regions. The genotypic datasets for the RILs and parents are publicly available and are anticipated to be useful to map quantitative trait loci (QTL) controlling important traits in soybean.


Subject(s)
Genes, Plant , Glycine max/genetics , Alleles , Genetic Linkage , Genotype , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Recombination, Genetic
15.
G3 (Bethesda) ; 7(9): 3103-3113, 2017 09 07.
Article in English | MEDLINE | ID: mdl-28720710

ABSTRACT

An epistatic genetic architecture can have a significant impact on prediction accuracies of genomic prediction (GP) methods. Machine learning methods predict traits comprised of epistatic genetic architectures more accurately than statistical methods based on additive mixed linear models. The differences between these types of GP methods suggest a diagnostic for revealing genetic architectures underlying traits of interest. In addition to genetic architecture, the performance of GP methods may be influenced by the sample size of the training population, the number of QTL, and the proportion of phenotypic variability due to genotypic variability (heritability). Possible values for these factors and the number of combinations of the factor levels that influence the performance of GP methods can be large. Thus, efficient methods for identifying combinations of factor levels that produce the most accurate GPs are needed. Herein, we employ response surface methods (RSMs) to find the experimental conditions that produce the most accurate GPs. We illustrate RSM with an example of simulated doubled haploid populations and identify the combination of factors that maximizes the difference between prediction accuracies of best linear unbiased prediction (BLUP) and support vector machine (SVM) GP methods. The greatest impact on the response is due to the genetic architecture of the population, heritability of the trait, and the sample size. When epistasis is responsible for all of the genotypic variance, heritability is equal to one, and the sample size of the training population is large, the advantage of using the SVM method vs. the BLUP method is greatest. However, except for values close to the maximum, most of the response surface shows little difference between the methods. We also determined that the conditions resulting in the greatest prediction accuracy for BLUP occurred when genetic architecture consists solely of additive effects, and heritability is equal to one.


Subject(s)
Computational Biology/methods , Genomics/methods , Models, Genetic , Algorithms , Computer Simulation , Epistasis, Genetic , Haploidy , Machine Learning , Quantitative Trait Loci , Reproducibility of Results
16.
PLoS One ; 12(6): e0179191, 2017.
Article in English | MEDLINE | ID: mdl-28598989

ABSTRACT

The objective of this study was to explore the potential of genomic prediction (GP) for soybean resistance against Sclerotinia sclerotiorum (Lib.) de Bary, the causal agent of white mold (WM). A diverse panel of 465 soybean plant introduction accessions was phenotyped for WM resistance in replicated field and greenhouse tests. All plant accessions were previously genotyped using the SoySNP50K BeadChip. The predictive abilities of six GP models were compared, and the impact of marker density and training population size on the predictive ability was investigated. Cross-prediction among environments was tested to determine the effectiveness of the prediction models. GP models had similar prediction accuracies for all experiments. Predictive ability did not improve significantly by using more than 5k SNPs, or by increasing the training population size (from 50% to 90% of the total number of individuals). The GP model effectively predicted WM resistance across field and greenhouse experiments when each was used as either the training or validation population. The GP model was able to identify WM-resistant accessions in the USDA soybean germplasm collection that had previously been reported and were not included in the study panel. This study demonstrated the applicability of GP to identify useful genetic sources of WM resistance for soybean breeding. Further research will confirm the applicability of the proposed approach to other complex disease resistance traits and in other crops.


Subject(s)
Crops, Agricultural/genetics , Genetic Association Studies , Genome, Plant , Genomics , Seeds/genetics , Disease Resistance/genetics , Genetic Markers , Genetics, Population , Genomics/methods , Genotype , Phenotype , Polymorphism, Single Nucleotide , Glycine max/genetics
17.
Theor Appl Genet ; 130(10): 1993-2004, 2017 Oct.
Article in English | MEDLINE | ID: mdl-28647895

ABSTRACT

KEY MESSAGE: Using an Operations Research approach, we demonstrate design of optimal trait introgression projects with respect to competing objectives. We demonstrate an innovative approach for designing Trait Introgression (TI) projects based on optimization principles from Operations Research. If the designs of TI projects are based on clear and measurable objectives, they can be translated into mathematical models with decision variables and constraints that can be translated into Pareto optimality plots associated with any arbitrary selection strategy. The Pareto plots can be used to make rational decisions concerning the trade-offs between maximizing the probability of success while minimizing costs and time. The systematic rigor associated with a cost, time and probability of success (CTP) framework is well suited to designing TI projects that require dynamic decision making. The CTP framework also revealed that previously identified 'best' strategies can be improved to be at least twice as effective without increasing time or expenses.


Subject(s)
Models, Genetic , Plant Breeding/methods , Alleles , Computer Simulation , Crosses, Genetic , Genetic Loci , Genotype , Selection, Genetic
18.
Evol Bioinform Online ; 13: 1176934316688663, 2017.
Article in English | MEDLINE | ID: mdl-28469375

ABSTRACT

We introduce software, Numericware i, to compute the identical-by-state (IBS) matrix based on genotypic data. Calculating an IBS matrix with a large dataset requires large computer memory and takes lengthy processing time. Numericware i addresses these challenges with 2 algorithmic methods: multithreading and forward chopping. The multithreading allows computational routines to run concurrently on multiple central processing unit (CPU) processors. The forward chopping addresses memory limitation by dividing a dataset into appropriately sized subsets. Numericware i allows calculation of the IBS matrix for a large genotypic dataset using a laptop or a desktop computer. For comparison with different software, we calculated genetic relationship matrices using Numericware i, SPAGeDi, and TASSEL with the same genotypic dataset. Numericware i calculates IBS coefficients between 0 and 2, whereas SPAGeDi and TASSEL produce different ranges of values including negative values. The Pearson correlation coefficient between the matrices from Numericware i and TASSEL was high at .9972, whereas SPAGeDi showed low correlation with Numericware i (.0505) and TASSEL (.0587). With a high-dimensional dataset of 500 entities by 10 000 000 SNPs, Numericware i spent 382 minutes using 19 CPU threads and 64 GB memory by dividing the dataset into 3 pieces, whereas SPAGeDi and TASSEL failed with the same dataset. Numericware i is freely available for Windows and Linux under CC-BY 4.0 license at https://figshare.com/s/f100f33a8857131eb2db.
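The memory-saving idea described in the abstract above, dividing the dataset into subsets and combining the partial results, can be sketched as follows. This is a minimal illustration and not the Numericware i implementation: it assumes genotypes coded as allele counts (0, 1, 2), scores each marker as 2 minus the absolute difference so that coefficients fall between 0 and 2, and splits the data by marker; the function name and chunking scheme are hypothetical, and multithreading is omitted.

```python
def ibs_matrix(genotypes, chunk_size=2):
    """Mean identical-by-state sharing between all pairs of individuals.

    genotypes: list of per-individual lists of allele counts (0, 1 or 2).
    Markers are processed chunk_size at a time so that only one chunk
    has to be held in memory at once (assumed chunking scheme).
    """
    n = len(genotypes)
    n_markers = len(genotypes[0])
    totals = [[0] * n for _ in range(n)]
    for start in range(0, n_markers, chunk_size):
        chunk = [row[start:start + chunk_size] for row in genotypes]
        for i in range(n):
            for j in range(i, n):
                # Alleles shared at each marker in this chunk: 2 - |g_i - g_j|.
                shared = sum(2 - abs(a - b) for a, b in zip(chunk[i], chunk[j]))
                totals[i][j] += shared
                totals[j][i] = totals[i][j]
    return [[t / n_markers for t in row] for row in totals]

# Hypothetical genotypes: three individuals scored at four markers.
genotypes = [
    [0, 1, 2, 2],
    [0, 1, 2, 0],
    [2, 1, 0, 0],
]
ibs = ibs_matrix(genotypes, chunk_size=2)
print(ibs)
```

Because the per-pair totals are simple sums, the result does not depend on how the markers are divided, which is what makes it safe to choose the chunk size from the available memory.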

19.
Genetics ; 205(4): 1409-1423, 2017 04.
Article in English | MEDLINE | ID: mdl-28122824

ABSTRACT

We consider the plant genetic improvement challenge of introgressing multiple alleles from a homozygous donor to a recipient. First, we frame the project as an algorithmic process that can be mathematically formulated. We then introduce a novel metric for selecting breeding parents that we refer to as the predicted cross value (PCV). Unlike estimated breeding values, which represent predictions of general combining ability, the PCV predicts specific combining ability. The PCV takes estimates of recombination frequencies as an input vector and calculates the probability that a pair of parents will produce a gamete with desirable alleles at all specified loci. We compared the PCV approach with existing estimated-breeding-value approaches in two simulation experiments, in which 7 and 20 desirable alleles were to be introgressed from a donor line into a recipient line. Results suggest that the PCV is more efficient and effective for multi-allelic trait introgression. We also discuss how operations research can be used for other crop genetic improvement projects and suggest several future research directions.


Subject(s)
Alleles , Crops, Agricultural/genetics , Models, Genetic , Plant Breeding/methods , Algorithms , Genetic Loci , Hybridization, Genetic , Recombination, Genetic , Selection, Genetic
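A building block of the calculation described in the abstract above is the probability that a single gamete carries the desirable allele at every target locus, given recombination frequencies between adjacent loci. The following is a simplified sketch and not the published PCV: it considers one diploid individual rather than a pair of parents, assumes no crossover interference, and uses hypothetical names and values throughout.

```python
def prob_all_desirable(hap1, hap2, recomb):
    """Probability that a gamete carries the desirable allele at every locus.

    hap1, hap2: the individual's two haplotypes as 0/1 lists (1 = desirable allele).
    recomb: recomb[k] is the recombination frequency between loci k and k + 1.
    """
    # p1 / p2: probability that the gamete is currently copying haplotype 1 / 2
    # and every allele copied so far has been desirable.
    p1 = 0.5 * hap1[0]
    p2 = 0.5 * hap2[0]
    for k, r in enumerate(recomb, start=1):
        next_p1 = (p1 * (1 - r) + p2 * r) * hap1[k]
        next_p2 = (p2 * (1 - r) + p1 * r) * hap2[k]
        p1, p2 = next_p1, next_p2
    return p1 + p2

# Desirable alleles in repulsion: a crossover between the loci is required.
print(prob_all_desirable([1, 0], [0, 1], [0.1]))
# Desirable alleles in coupling: the gamete must avoid a crossover.
print(prob_all_desirable([1, 1], [0, 0], [0.1]))
```

The contrast between the two calls shows why a metric built on recombination frequencies can rank crosses differently from a sum of allele effects: the same alleles are far more likely to be transmitted together when they already sit on one haplotype.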
20.
J Hered ; 107(7): 686-690, 2016.
Article in English | MEDLINE | ID: mdl-27729447

ABSTRACT

We present the generalized numerator relationship matrix (GNRM) algorithm and Numericware N as a software tool for calculating the numerator relationship matrix (NRM). The GNRM algorithm aims to build the NRM based on plant pedigrees. Customary plant pedigrees have a sparse format representing multiple ancestors and offspring. Applying the existing NRM algorithm to plant pedigrees requires transforming the pedigree statements from sparse (multi-founders to offspring) to dense (bi-parents to offspring). The GNRM algorithm enables the computation of the NRM using sparse pedigrees. Because sparse pedigrees can be used, Numericware N produces an NRM of smaller dimensions, which makes computation much faster. Moreover, Numericware N enables expansion of the identical-by-state (IBS) matrix for scheduled pedigrees, which allows prediction of the IBS matrix.


Subject(s)
Computational Biology/methods , Models, Genetic , Software , Algorithms , Pedigree
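For context, the conventional NRM calculation that the abstract above contrasts with GNRM works on dense, bi-parental pedigrees. The following is a minimal sketch of that conventional tabular method, not of the GNRM algorithm itself; the function name and the example pedigree are hypothetical, and repeated selfing as used to develop inbred lines is not modeled.

```python
def numerator_relationship_matrix(pedigree):
    """Tabular method for the NRM from a dense (two-parent) pedigree.

    pedigree: list of (parent1, parent2) index pairs, with None for an unknown
    parent; every parent must appear earlier in the list than its offspring.
    """
    n = len(pedigree)
    nrm = [[0.0] * n for _ in range(n)]
    for i, (p1, p2) in enumerate(pedigree):
        for j in range(i):
            # Relationship with an earlier individual: average over the two parents.
            a_j_p1 = nrm[j][p1] if p1 is not None else 0.0
            a_j_p2 = nrm[j][p2] if p2 is not None else 0.0
            nrm[i][j] = nrm[j][i] = 0.5 * (a_j_p1 + a_j_p2)
        # Diagonal: 1 plus the inbreeding coefficient (half the parents' relationship).
        a_parents = nrm[p1][p2] if (p1 is not None and p2 is not None) else 0.0
        nrm[i][i] = 1.0 + 0.5 * a_parents
    return nrm

# Hypothetical pedigree: two founders, two full sibs, and one offspring of the sibs.
pedigree = [(None, None), (None, None), (0, 1), (0, 1), (2, 3)]
nrm = numerator_relationship_matrix(pedigree)
for row in nrm:
    print(row)
```

Each row of such a pedigree lists exactly two parents, which is the dense format the abstract describes; a sparse plant pedigree would first have to be expanded into rows like these before this method could be applied.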