Results 1 - 19 of 19
1.
Genet Sel Evol ; 55(1): 57, 2023 Aug 07.
Article in English | MEDLINE | ID: mdl-37550618

ABSTRACT

BACKGROUND: Most genomic prediction applications in animal breeding use genotypes with tens of thousands of single nucleotide polymorphisms (SNPs). However, modern sequencing technologies and imputation algorithms can generate ultra-high-density genotypes (including millions of SNPs) at an affordable cost. Empirical studies have not produced clear evidence that using ultra-high-density genotypes can significantly improve prediction accuracy. However, (whole-genome) prediction accuracy is not very informative about the ability of a model to capture the genetic signals from specific genomic regions. To address this problem, we propose a simple methodology that detects chromosome regions for which a specific model (e.g., single-step genomic best linear unbiased prediction (ssGBLUP)) may fail to fully capture the genetic signal present in such segments, a phenomenon that we refer to as signal leakage. We propose to detect regions with evidence of signal leakage by testing the association of residuals from a pedigree or a genomic model with SNP genotypes. We discuss how this approach can be used to map regions with signals that are poorly captured by a model and to identify strategies to fix those problems (e.g., using a different prior or increasing marker density). Finally, we explored the proposed approach to scan for signal leakage of different models (pedigree-based, ssGBLUP, and various Bayesian models) applied to growth-related phenotypes (average daily gain and backfat thickness) in pigs. RESULTS: We report widespread evidence of signal leakage for pedigree-based models. Including a percentage of animals with SNP data in ssGBLUP reduced the extent of signal leakage. However, local peaks of missed signals remained in some regions, even when all animals were genotyped. Using variable selection priors solves leakage points that are caused by excessive shrinkage of marker effects.
Nevertheless, these models still miss signals in some regions due to low linkage disequilibrium between the SNPs on the array used and causal variants. Thus, we discuss how such problems could be addressed by adding sequence SNPs from those regions to the prediction model. CONCLUSIONS: Residual single-marker regression analysis is a simple approach that can be used to detect regional genomic signals that are poorly captured by a model and to indicate ways to fix such problems.
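The residual single-marker regression described in this abstract can be sketched in plain Python. This is a minimal illustration, assuming the residuals have already been obtained from some fitted pedigree or genomic model; the function names `slope_t_stat` and `residual_marker_scan`, the toy data, and the use of an ordinary least-squares slope with its t statistic are assumptions made here, not the authors' implementation.

```python
import math

def slope_t_stat(x, y):
    """Least-squares slope of y on x and its t statistic (simple regression)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:  # monomorphic marker carries no information
        return 0.0, 0.0
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    sse = sum((yi - my - b * (xi - mx)) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(sse / (n - 2) / sxx)
    t = b / se if se > 0 else math.inf
    return b, t

def residual_marker_scan(residuals, genotypes):
    """Regress model residuals on each SNP (columns coded 0/1/2), one SNP at a time."""
    n_snps = len(genotypes[0])
    return [slope_t_stat([row[j] for row in genotypes], residuals)
            for j in range(n_snps)]

# Toy data (hypothetical): six animals, two SNPs; residuals track SNP 0 only.
genotypes = [[0, 1], [1, 0], [2, 1], [0, 0], [1, 1], [2, 0]]
residuals = [-0.4, -0.1, 0.55, -0.55, 0.1, 0.4]
scan = residual_marker_scan(residuals, genotypes)  # list of (slope, t) per SNP
```

A large absolute t statistic for a SNP would point to signal that the model left in its residuals in that region; in practice a significance threshold and a correction for multiple testing would be needed, which this sketch omits.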


Subject(s)
Genome , Genomics , Animals , Swine , Bayes Theorem , Genomics/methods , Genotype , Phenotype , Polymorphism, Single Nucleotide , Pedigree , Models, Genetic
2.
Genet Sel Evol ; 54(1): 65, 2022 Sep 24.
Article in English | MEDLINE | ID: mdl-36153511

ABSTRACT

BACKGROUND: Early simulations indicated that whole-genome sequence data (WGS) could improve the accuracy of genomic predictions within and across breeds. However, empirical results have been ambiguous so far. Large datasets that capture most of the genomic diversity in a population must be assembled so that allele substitution effects are estimated with high accuracy. The objectives of this study were to use a large pig dataset from seven intensely selected lines to assess the benefits of using WGS for genomic prediction compared to using commercial marker arrays and to identify scenarios in which WGS provides the largest advantage. METHODS: We sequenced 6931 individuals from seven commercial pig lines with different numerical sizes. Genotypes of 32.8 million variants were imputed for 396,100 individuals (17,224 to 104,661 per line). We used BayesR to perform genomic prediction for eight complex traits. Genomic predictions were performed using either data from a standard marker array or variants preselected from WGS based on association tests. RESULTS: The accuracies of genomic predictions based on preselected WGS variants were not robust across traits and lines and the improvements in prediction accuracy that we achieved so far with WGS compared to standard marker arrays were generally small. The most favourable results for WGS were obtained when the largest training sets were available and standard marker arrays were augmented with preselected variants with statistically significant associations to the trait. With this method and training sets of around 80k individuals, the accuracy of within-line genomic predictions was on average improved by 0.025. With multi-line training sets, improvements of 0.04 compared to marker arrays could be expected. CONCLUSIONS: Our results showed that WGS has limited potential to improve the accuracy of genomic predictions compared to marker arrays in intensely selected pig lines. 
Thus, although we expect that larger improvements in accuracy from the use of WGS are possible with a combination of larger training sets and optimised pipelines for generating and analysing such datasets, the use of WGS in the current implementations of genomic prediction should be carefully evaluated against the cost of large-scale WGS data on a case-by-case basis.


Subject(s)
Genome-Wide Association Study , Polymorphism, Single Nucleotide , Alleles , Animals , Genomics/methods , Genotype , Swine/genetics
3.
Genet Sel Evol ; 54(1): 39, 2022 Jun 03.
Article in English | MEDLINE | ID: mdl-35659233

ABSTRACT

BACKGROUND: It is expected that functional, mainly missense and loss-of-function (LOF), and regulatory variants are responsible for most phenotypic differences between breeds and genetic lines of livestock species that have undergone diverse selection histories. However, there is still limited knowledge about the existing missense and LOF variation in commercial livestock populations, in particular regarding population-specific variation and how it can affect applications such as across-breed genomic prediction. METHODS: We re-sequenced the whole genome of 7848 individuals from nine commercial pig lines (average sequencing coverage: 4.1×) and imputed whole-genome genotypes for 440,610 pedigree-related individuals. The called variants were categorized according to predicted functional annotation (from LOF to intergenic) and prevalence level (number of lines in which the variant segregated; from private to widespread). Variants in each category were examined in terms of their distribution along the genome, alternative allele frequency, per-site Wright's fixation index (FST), individual load, and association to production traits. RESULTS: Of the 46 million called variants, 28% were private (called in only one line) and 21% were widespread (called in all nine lines). Genomic regions with a low recombination rate were enriched with private variants. Low-prevalence variants (called in one or a few lines only) were enriched for lower allele frequencies, lower FST, and putatively functional and regulatory roles (including LOF and deleterious missense variants). On average, individuals carried fewer private deleterious missense alleles than expected compared to alleles with other predicted consequences. Only a small subset of the low-prevalence variants had intermediate allele frequencies and explained small fractions of phenotypic variance (up to 3.2%) of production traits. The significant low-prevalence variants had higher per-site FST than the non-significant ones. 
These associated low-prevalence variants were tagged by other more widespread variants in high linkage disequilibrium, including intergenic variants. CONCLUSIONS: Most low-prevalence variants have low minor allele frequencies and only a small subset of low-prevalence variants contributed detectable fractions of phenotypic variance of production traits. Accounting for low-prevalence variants is therefore unlikely to noticeably benefit across-breed analyses, such as the prediction of genomic breeding values in a population using reference populations of a different genetic background.


Subject(s)
Genome , Polymorphism, Single Nucleotide , Animals , Gene Frequency , Genetic Variation , Genomics , Genotype , Swine/genetics
4.
J Anim Sci ; 99(1)2021 Jan 01.
Article in English | MEDLINE | ID: mdl-33313883

ABSTRACT

In the pig industry, purebred animals are raised in nucleus herds and selected to produce crossbred progeny to perform in commercial environments. Crossbred and purebred performances are different, correlated traits. All purebreds in a pen have their performance assessed together at the end of a performance test. However, only selected crossbreds are removed (based on visual inspection) and measured at different times creating many small contemporary groups (CGs). This may reduce estimated breeding value (EBV) prediction accuracies. Considering this sequential recording of crossbreds, the objective was to investigate the impact of different CG definitions on genetic parameters and EBV prediction accuracy for crossbred traits. Growth rate (GP) and ultrasound backfat (BFP) records were available for purebreds. Lifetime growth (GX) and backfat (BFX) were recorded on crossbreds. Different CGs were tested: CG_all included farm, sex, birth year, and birth week; CG_week added slaughter week; and CG_day used slaughter day instead of week. Data of 124,709 crossbreds were used. The purebred phenotypes (62,274 animals) included three generations of purebred ancestors of these crossbreds and their CG mates. Variance components for four-trait models with different CG definitions were estimated with average information restricted maximum likelihood. Purebred traits' variance components remained stable across CG definitions and varied slightly for BFX. Additive genetic variances (and heritabilities) for GX fluctuated more: 812 ± 36 (0.28 ± 0.01), 257 ± 15 (0.17 ± 0.01), and 204 ± 13 (0.15 ± 0.01) for CG_all, CG_week, and CG_day, respectively. Age at slaughter (AAS) and hot carcass weight (HCW) adjusted for age were investigated as alternatives for GX. Both have potential for selection but lower heritabilities compared with GX: 0.21 ± 0.01 (0.18 ± 0.01), 0.16 ± 0.02 (0.16 ± 0.01), and 0.10 ± 0.01 (0.14 ± 0.01) for AAS (HCW) using CG_all, CG_week, and CG_day, respectively.
The predictive ability, linear regression (LR) accuracy, bias, and dispersion of crossbred traits in crossbreds favored CG_day, but correlations with unadjusted phenotypes favored CG_all. In purebreds, CG_all showed the best LR accuracy, while showing small relative differences in bias and dispersion. Different CG scenarios showed no relevant impact on BFX EBV. This study shows that different CG definitions may affect evaluation stability and animal ranking. Results suggest that ignoring slaughter dates in CG is more appropriate for estimating crossbred trait EBV for purebred animals.


Subject(s)
Hybridization, Genetic , Models, Genetic , Animals , Phenotype , Swine/genetics
5.
Prev Vet Med ; 174: 104856, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31786406

ABSTRACT

Pig production in the United States is based on multi-site systems in which pigs are transported between farms after the conclusion of each particular production phase. Although ground transportation is a critical component of the pork supply chain, it might constitute a potential route of infectious disease dissemination. Here, we used a time series network analysis to: (1) describe pig movement flow in a multi-site production system in Iowa, USA, (2) conduct percolation analysis to investigate network robustness to interventions for diseases with different transmissibility, and (3) assess the potential impact of each farm type on disease dissemination across the system. Movement reports from 2014-2016 were provided by Iowa Select Farms, Iowa Falls, IA. A total of 76,566 shipments across sites was analyzed, and time series network analyses with temporal resolution of 1, 3, 6, 12, and 36 months were considered. The general topological properties of networks with resolution of 1, 3, 6, and 12 months were compared with the whole period static network (36 months) and included the following features: number of nodes and edges, degree assortativity, density, average path length, diameter, clustering coefficients, giant strongly connected component, giant weakly connected component, giant in component, and giant out component. Small-world and scale-free topologies, centrality parameters, and percolation analysis were investigated for the networks with 1-month window. Networks' robustness to interventions was assessed by using the Basic Reproduction Number (R0). Centrality parameters indicate that gilt development units (GDU), nursery, and sow farms play a more central role in the hierarchical structure of pig production. Therefore, they are potentially major factors of introduction and spread of diseases over the system. Wean-to-finishing and finishing sites displayed high in-degree values, indicating that they are more susceptible to infection.
Percolation analysis combined with general properties (i.e. heavy-tailed distributions and degree disassortativity) suggested that networks with 1-month time resolution were highly responsive to interventions. Furthermore, the characteristics of a disease should have strong implications in the biosecurity practices across production sites. For instance, biosecurity practices should be focused on sow farms for highly contagious diseases (e.g., foot and mouth disease), while they should target nursery sites in the case of less contagious diseases (e.g., mycobacterial infections). Understanding the patterns of swine movements is crucial for the swine industry decision-making in the case of an epidemic, as well as to design cost-effective approaches to monitor, prevent, control and eradicate infectious diseases in multi-site systems.
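The in-degree and out-degree values mentioned in this abstract can be illustrated with a small sketch. The site names and shipments below are hypothetical, and counting each distinct origin-destination pair once is an assumption about how a static network for one time window might be built, not a description of the study's pipeline.

```python
from collections import Counter

def degrees(shipments):
    """In- and out-degree per site, counting each distinct directed edge once."""
    edges = set(shipments)  # repeated shipments collapse into one edge
    out_deg = Counter(src for src, _ in edges)
    in_deg = Counter(dst for _, dst in edges)
    return in_deg, out_deg

# Hypothetical shipments (origin, destination) within a one-month window.
shipments = [
    ("gdu_1", "sow_1"),
    ("sow_1", "nursery_1"),
    ("sow_1", "nursery_2"),
    ("sow_1", "nursery_1"),  # repeat shipment on the same route
    ("nursery_1", "finisher_1"),
    ("nursery_2", "finisher_1"),
]
in_deg, out_deg = degrees(shipments)
```

In this toy network the finishing site receives pigs from two sources (in-degree 2), while the sow farm ships to two destinations (out-degree 2).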


Subject(s)
Sus scrofa , Swine Diseases/transmission , Transportation , Animal Husbandry , Animals , Iowa , Swine
6.
Front Genet ; 9: 455, 2018.
Article in English | MEDLINE | ID: mdl-30356716

ABSTRACT

Network-based statistical models accounting for putative causal relationships among multiple phenotypes can be used to infer the single-nucleotide polymorphism (SNP) effects that are transmitted through a given causal path in genome-wide association studies (GWAS). In GWAS with multiple phenotypes, reconstructing underlying causal structures among traits and SNPs using a single statistical framework is essential for understanding the entirety of genotype-phenotype maps. A structural equation model (SEM) can be used for such purposes. We applied SEM to GWAS (SEM-GWAS) in chickens, taking into account putative causal relationships among breast meat (BM), body weight (BW), hen-house production (HHP), and SNPs. We assessed the performance of SEM-GWAS by comparing the model results with those obtained from traditional multi-trait association analyses (MTM-GWAS). Three different putative causal path diagrams were inferred from highest posterior density (HPD) intervals of 0.75, 0.85, and 0.95 using the inductive causation algorithm. A positive path coefficient was estimated for BM → BW, and negative values were obtained for BM → HHP and BW → HHP in all implemented scenarios. Further, the application of SEM-GWAS enabled the decomposition of SNP effects into direct, indirect, and total effects, identifying whether a SNP effect is acting directly or indirectly on a given trait. In contrast, MTM-GWAS only captured overall genetic effects on traits, which is equivalent to combining the direct and indirect SNP effects from SEM-GWAS. Although MTM-GWAS and SEM-GWAS use similar probabilistic models, we provide evidence that SEM-GWAS captures complex relationships in terms of causal meaning and mediation and delivers a more comprehensive understanding of SNP effects compared to MTM-GWAS. Our results showed that SEM-GWAS provides important insight regarding the mechanism by which identified SNPs control traits by partitioning them into direct, indirect, and total SNP effects.
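The decomposition of a SNP effect into direct, indirect, and total effects can be sketched for a recursive path diagram such as BM → BW, BM → HHP, BW → HHP. The numeric path coefficients and direct effects below are hypothetical (only their signs follow the abstract), and the function `decompose` is an illustrative name; the sketch assumes a linear recursive SEM in which the total effect on a trait is its direct effect plus the path-weighted total effects on its causal parents.

```python
def decompose(direct, paths, order):
    """Total and indirect SNP effects in a recursive (acyclic) linear SEM.

    direct: direct SNP effect on each trait
    paths:  {(cause, effect): path coefficient}
    order:  traits listed in causal (topological) order
    """
    total = {}
    for trait in order:
        # Effects transmitted from upstream traits that were already processed.
        inherited = sum(coef * total[cause]
                        for (cause, effect), coef in paths.items()
                        if effect == trait)
        total[trait] = direct[trait] + inherited
    indirect = {t: total[t] - direct[t] for t in order}
    return total, indirect

# Hypothetical values: positive BM -> BW, negative BM -> HHP and BW -> HHP.
paths = {("BM", "BW"): 0.5, ("BM", "HHP"): -0.2, ("BW", "HHP"): -0.4}
direct = {"BM": 1.0, "BW": 0.3, "HHP": 0.1}
total, indirect = decompose(direct, paths, order=["BM", "BW", "HHP"])
```

With these toy numbers the SNP has a small positive direct effect on HHP but a negative total effect, because the effects mediated through BM and BW outweigh it. A multi-trait model without causal paths would report only the total.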

7.
J Dairy Sci ; 100(10): 8443-8450, 2017 Oct.
Article in English | MEDLINE | ID: mdl-28780093

ABSTRACT

In animal production, it is often important to investigate causal relationships among variables. The gold standard tool for such investigation is randomized experiments. However, randomized experiments may not always be feasible, possible, or cost effective or reflect real-world farm conditions. Sometimes it is necessary to infer effects from farm-recorded data. Inferring causal effects between variables from field data is challenging because the association between them may arise not only from the effect of one on another but also from confounding background factors. Propensity score (PS) methods address this issue by correcting for confounding in different levels of the causal variable, which allows unbiased inference of causal effects. Here the objective was to estimate the causal effect of prolificacy on milk yield (MY) in dairy sheep using PS based on matched samples. Data consisted of 4,319 records from 1,534 crossbred ewes. Confounders were lactation number (first, second, and third through sixth) and dairy breed composition (<0.5, 0.5-0.75, and >0.75 of East Friesian or Lacaune). The causal variable prolificacy was considered as 2 levels (single or multiple lambs at birth). The outcome MY represented the volume of milk produced in the whole lactation. Pairs of single- and multiple-birth ewes (1,166) with similar PS were formed. The matching process diminished major discrepancies in the distribution of prolificacy for each confounder variable indicating bias reduction (cutoff standardized bias = 20%). The causal effect was estimated as the average difference within pairs. The effect of prolificacy on MY per lactation was 20.52 L of milk with a simple matching estimator and 12.62 L after correcting for remaining biases. A core advantage of causal over probabilistic approaches is that they allow inference of how variables would react as a result of external interventions (e.g., changes in the production system). 
Therefore, results imply that management and decision-making practices increasing prolificacy would positively affect MY, which is important knowledge at the farm level. Farm-recorded data can be a valuable source of information given its low cost, and it reflects real-world herd conditions. In this context, PS methods can be extremely useful as an inference tool for investigating causal effects. In addition, PS analysis can be implemented as a preliminary evaluation or a hypothesis generator for future randomized trials (if the trait analyzed allows randomization).
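The matching step and the simple matching estimator (average difference within pairs) can be sketched as follows. The sketch assumes propensity scores have already been estimated; the greedy nearest-neighbour rule, the `caliper` value, the record fields `ps` and `my`, and the toy records are all assumptions for illustration and are not taken from the study.

```python
def match_pairs(treated, control, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching on propensity score, without replacement."""
    available = list(control)
    pairs = []
    for t in sorted(treated, key=lambda r: r["ps"]):
        if not available:
            break
        best = min(available, key=lambda c: abs(c["ps"] - t["ps"]))
        if abs(best["ps"] - t["ps"]) <= caliper:  # accept only similar scores
            pairs.append((t, best))
            available.remove(best)
    return pairs

def matched_effect(pairs):
    """Simple matching estimator: mean within-pair difference in the outcome."""
    return sum(t["my"] - c["my"] for t, c in pairs) / len(pairs)

# Hypothetical ewes: ps = propensity score, my = milk yield per lactation (L).
multiple = [{"ps": 0.30, "my": 200.0}, {"ps": 0.60, "my": 230.0}, {"ps": 0.90, "my": 250.0}]
single = [{"ps": 0.31, "my": 185.0}, {"ps": 0.58, "my": 210.0}, {"ps": 0.10, "my": 170.0}]
pairs = match_pairs(multiple, single)
effect = matched_effect(pairs)
```

The ewe with a propensity score of 0.90 stays unmatched because no single-birth ewe has a similar score. Discarding such records trades sample size for better balance on the confounders.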


Subject(s)
Lactation , Milk/metabolism , Propensity Score , Animals , Breeding , Confounding Factors, Epidemiologic , Female , Litter Size/physiology , Sheep
8.
Genet Sel Evol ; 49(1): 16, 2017 Feb 01.
Article in English | MEDLINE | ID: mdl-28148241

ABSTRACT

BACKGROUND: Genomic selection has been successfully implemented in plant and animal breeding programs to shorten generation intervals and accelerate genetic progress per unit of time. In practice, genomic selection can be used to improve several correlated traits simultaneously via multiple-trait prediction, which exploits correlations between traits. However, few studies have explored multiple-trait genomic selection. Our aim was to infer genetic correlations between three traits measured in broiler chickens by exploring kinship matrices based on a linear combination of measures of pedigree and marker-based relatedness. A predictive assessment was used to gauge genetic correlations. METHODS: A multivariate genomic best linear unbiased prediction model was designed to combine information from pedigree and genome-wide markers in order to assess genetic correlations between three complex traits in chickens, i.e. body weight at 35 days of age (BW), ultrasound area of breast meat (BM) and hen-house egg production (HHP). A dataset with 1351 birds that were genotyped with the 600 K Affymetrix platform was used. A kinship kernel (K) was constructed as K = λ G + (1 - λ)A, where A is the numerator relationship matrix, measuring pedigree-based relatedness, and G is a genomic relationship matrix. The weight (λ) assigned to each source of information varied over the grid λ = (0, 0.2, 0.4, 0.6, 0.8, 1). Maximum likelihood estimates of heritability and genetic correlations were obtained at each λ, and the "optimum" λ was determined using cross-validation. RESULTS: Estimates of genetic correlations were affected by the weight placed on the source of information used to build K. For example, the genetic correlation between BW-HHP and BM-HHP changed markedly when λ varied from 0 (only A used for measuring relatedness) to 1 (only genomic information used). 
As λ increased, predictive correlations (correlation between observed phenotypes and predicted breeding values) increased and mean-squared predictive error decreased. However, the improvement in predictive ability was not monotonic, with an optimum found at some 0 < λ < 1, i.e., when both sources of information were used together. CONCLUSIONS: Our findings indicate that multiple-trait prediction may benefit from combining pedigree and marker information. Also, it appeared that expected correlated responses to selection computed from standard theory may differ from realized responses. The predictive assessment provided a metric for performance evaluation as well as a means for expressing uncertainty of outcomes of multiple-trait selection.
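The kinship kernel K = λG + (1 - λ)A and the grid of λ values can be written out directly. The 2 × 2 matrices below are hypothetical toy values, and `blend_kernel` is an illustrative name; the estimation of variance components and the cross-validation used to choose λ are not shown.

```python
def blend_kernel(G, A, lam):
    """K = lam * G + (1 - lam) * A, computed element by element."""
    return [[lam * g + (1 - lam) * a for g, a in zip(g_row, a_row)]
            for g_row, a_row in zip(G, A)]

# Hypothetical relationship matrices for two birds.
A = [[1.0, 0.25], [0.25, 1.0]]  # pedigree-based (numerator relationship matrix)
G = [[1.1, 0.30], [0.30, 0.9]]  # marker-based (genomic relationship matrix)

grid = [0, 0.2, 0.4, 0.6, 0.8, 1]
kernels = {lam: blend_kernel(G, A, lam) for lam in grid}
```

At λ = 0 the kernel reduces to A and at λ = 1 to G, which matches the two extremes described in the abstract.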


Subject(s)
Chickens/genetics , Genetic Association Studies , Genetic Markers , Quantitative Trait Loci , Quantitative Trait, Heritable , Animals , Body Weight/genetics , Genome-Wide Association Study , Genotype , Models, Genetic , Phenotype
9.
Genet Sel Evol ; 48: 10, 2016 Feb 03.
Article in English | MEDLINE | ID: mdl-26842494

ABSTRACT

BACKGROUND: Genome-wide association studies in humans have found enrichment of trait-associated single nucleotide polymorphisms (SNPs) in coding regions of the genome and depletion of these in intergenic regions. However, a recent release of the ENCyclopedia of DNA elements showed that ~80 % of the human genome has a biochemical function. Similar studies on the chicken genome are lacking, thus assessing the relative contribution of its genic and non-genic regions to variation is relevant for biological studies and genetic improvement of chicken populations. METHODS: A dataset including 1351 birds that were genotyped with the 600K Affymetrix platform was used. We partitioned SNPs according to genome annotation data into six classes to characterize the relative contribution of genic and non-genic regions to genetic variation as well as their predictive power using all available quality-filtered SNPs. Target traits were body weight, ultrasound measurement of breast muscle and hen house egg production in broiler chickens. Six genomic regions were considered: intergenic regions, introns, missense, synonymous, 5' and 3' untranslated regions, and regions that are located 5 kb upstream and downstream of coding genes. Genomic relationship matrices were constructed for each genomic region and fitted in the models, separately or simultaneously. Kernel-based ridge regression was used to estimate variance components and assess predictive ability. Contribution of each class of genomic regions to dominance variance was also considered. RESULTS: Variance component estimates indicated that all genomic regions contributed to marked additive genetic variation and that the class of synonymous regions tended to have the greatest contribution. The marked dominance genetic variation explained by each class of genomic regions was similar and negligible (~0.05). In terms of prediction mean-square error, the whole-genome approach showed the best predictive ability. 
CONCLUSIONS: All genic and non-genic regions contributed to phenotypic variation for the three traits studied. Overall, the contribution of additive genetic variance to the total genetic variance was much greater than that of dominance variance. Our results show that all genomic regions are important for the prediction of the targeted traits, and the whole-genome approach was reaffirmed as the best tool for genome-enabled prediction of quantitative traits.
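Building one genomic relationship matrix per annotation class can be sketched as below. The abstract does not state which formula was used, so the sketch assumes the common construction G = ZZ'/(2Σp(1 - p)) with genotypes centred by observed allele frequencies; the class names, SNP indices, and genotypes are hypothetical.

```python
def grm(genotypes, snp_idx):
    """Genomic relationship matrix from the SNP columns in snp_idx (coded 0/1/2)."""
    n = len(genotypes)
    # Observed allele frequency of each selected SNP.
    freqs = [sum(row[j] for row in genotypes) / (2 * n) for j in snp_idx]
    scale = 2 * sum(p * (1 - p) for p in freqs)
    if scale == 0:
        raise ValueError("all selected SNPs are monomorphic")
    # Centre each genotype by twice the allele frequency.
    Z = [[row[j] - 2 * p for j, p in zip(snp_idx, freqs)] for row in genotypes]
    return [[sum(a * b for a, b in zip(zi, zk)) / scale for zk in Z] for zi in Z]

# Hypothetical data: three birds, four SNPs assigned to two annotation classes.
genotypes = [[0, 1, 2, 1], [1, 1, 0, 2], [2, 0, 1, 0]]
snp_classes = {"intergenic": [0, 1], "intron": [2, 3]}
G_by_class = {name: grm(genotypes, idx) for name, idx in snp_classes.items()}
```

Each matrix in `G_by_class` could then enter a model separately or jointly with the others, as the abstract describes; the model fitting itself is outside this sketch.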


Subject(s)
Chickens/genetics , Genome , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Animals , Body Weight/genetics , Datasets as Topic , Eggs , Female , Genomics , Genotype , Meat/analysis , Phenotype , Selection, Genetic
10.
Genet Sel Evol ; 48: 7, 2016 Jan 29.
Article in English | MEDLINE | ID: mdl-26830208

ABSTRACT

BACKGROUND: The objective of this study was to evaluate the accuracy of genomic predictions for rib eye area (REA), backfat thickness (BFT), and hot carcass weight (HCW) in Nellore beef cattle from Brazilian commercial herds using different prediction models. METHODS: Phenotypic data from 1756 Nellore steers from ten commercial herds in Brazil were used. Animals were offspring of 294 sires and 1546 dams, reared on pasture, feedlot finished, and slaughtered at approximately 2 years of age. All animals were genotyped using a 777k Illumina Bovine HD SNP chip. Accuracy of genomic predictions of breeding values was evaluated by using a 5-fold cross-validation scheme and considering three models: Bayesian ridge regression (BRR), Bayes C (BC) and Bayesian Lasso (BL), and two types of response variables: traditional estimated breeding value (EBV), and phenotype adjusted for fixed effects (Y*). RESULTS: The prediction accuracies achieved with the BRR model were equal to 0.25 (BFT), 0.33 (HCW) and 0.36 (REA) when EBV was used as response variable, and 0.21 (BFT), 0.37 (HCW) and 0.46 (REA) when using Y*. Results obtained with the BC and BL models were similar. Accuracies increased for traits with a higher heritability, and using Y* instead of EBV as response variable resulted in higher accuracy when heritability was higher. CONCLUSIONS: Our results indicate that the accuracy of genomic prediction of carcass traits in Nellore cattle is moderate to high. Prediction of genomic breeding values from adjusted phenotypes Y* was more accurate than from EBV, especially for highly heritable traits. The three models considered (BRR, BC and BL) led to similar predictive abilities and, thus, either one could be used to implement genomic prediction for carcass traits in Nellore cattle.


Subject(s)
Cattle/genetics , Models, Genetic , Quantitative Trait, Heritable , Red Meat , Selective Breeding , Animals , Bayes Theorem , Brazil , Genomics/methods , Genotype , Male , Phenotype , Polymorphism, Single Nucleotide
11.
BMC Syst Biol ; 9: 58, 2015 Sep 16.
Article in English | MEDLINE | ID: mdl-26376630

ABSTRACT

BACKGROUND: Joint modeling and analysis of phenotypic, genotypic and transcriptomic data have the potential to uncover the genetic control of gene activity and phenotypic variation, as well as shed light on the manner and extent of connectedness among these variables. Current studies mainly report associations, i.e. undirected connections among variables without causal interpretation. Knowledge regarding causal relationships among genes and phenotypes can be used to predict the behavior of complex systems, as well as to optimize management practices and selection strategies. Here, we performed a multistep procedure for inferring causal networks underlying carcass fat deposition and muscularity in pigs using multi-omics data obtained from an F2 Duroc x Pietrain resource pig population. RESULTS: We initially explored marginal associations between genotypes and phenotypic and expression traits through whole-genome scans, and then, in genomic regions with multiple significant hits, we assessed gene-phenotype network reconstruction using causal structural learning algorithms. One genomic region on SSC6 showed significant associations with three relevant phenotypes, off-midline 10th-rib backfat thickness, loin muscle weight, and average intramuscular fat percentage, and also with the expression of seven genes, including ZNF24, SSX2IP, and AKR7A2. The inferred network indicated that the genotype affects the three phenotypes mainly through the expression of several genes. Among the phenotypes, fat deposition traits negatively affected loin muscle weight. CONCLUSIONS: Our findings shed light on the antagonistic relationship between carcass fat deposition and lean meat content in pigs. In addition, the procedure described in this study has the potential to unravel gene-phenotype networks underlying complex phenotypes.


Subject(s)
Adipose Tissue/metabolism , Gene Expression Profiling , Genotype , Meat , Muscles/metabolism , Phenotype , Swine/genetics , Adipose Tissue/cytology , Algorithms , Animals , Female , Male , Muscles/anatomy & histology , Organ Size , Quantitative Trait Loci/genetics , Swine/anatomy & histology
12.
Genet Sel Evol ; 47: 45, 2015 May 13.
Article in English | MEDLINE | ID: mdl-25968045

ABSTRACT

BACKGROUND: Recently, selection for milk technological traits was initiated in the Italian dairy cattle industry based on direct measures of milk coagulation properties (MCP) such as rennet coagulation time (RCT) and curd firmness 30 min after rennet addition (a30) and on some traditional milk quality traits that are used as predictors, such as somatic cell score (SCS) and casein percentage (CAS). The aim of this study was to shed light on the causal relationships between traditional milk quality traits and MCP. Different structural equation models that included causal effects of SCS and CAS on RCT and a30 and of RCT on a30 were implemented in a Bayesian framework. RESULTS: Our results indicate a non-zero magnitude of the causal relationships between the traits studied. Causal effects of SCS and CAS on RCT and a30 were observed, which suggests that the relationship between milk coagulation ability and traditional milk quality traits depends more on phenotypic causal pathways than directly on common genetic influence. While RCT does not seem to be largely controlled by SCS and CAS, some of the variation in a30 depends on the phenotypes of these traits. However, a30 depends heavily on coagulation time. Our results also indicate that, when direct effects of SCS, CAS and RCT are considered simultaneously, most of the overall genetic variability of a30 is mediated by other traits. CONCLUSIONS: This study suggests that selection for RCT and a30 should not be performed on correlated traits such as SCS or CAS but on direct measures because the ability of milk to coagulate is improved through the causal effect that the former have on the latter, rather than from a common source of genetic variation. Breaking the causal link (e.g. standardizing SCS or CAS before the milk is processed into cheese) would reduce the impact of the improvement due to selective breeding.
Since a30 depends heavily on RCT, the relative emphasis that is put on this trait should be reconsidered and weighted for the fact that the pure measure of a30 almost double-counts RCT.


Subject(s)
Milk , Animals , Caseins/analysis , Cattle , Chymosin , Dairying , Genetic Variation , Italy , Milk/chemistry , Milk/standards
13.
Genetics ; 200(2): 483-94, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25908318

ABSTRACT

The term "effect" in additive genetic effect suggests a causal meaning. However, inferences of such quantities for selection purposes are typically viewed and conducted as a prediction task. Predictive ability as tested by cross-validation is currently the most acceptable criterion for comparing models and evaluating new methodologies. Nevertheless, it does not directly indicate if predictors reflect causal effects. Such evaluations would require causal inference methods that are not typical in genomic prediction for selection. This suggests that the usual approach to infer genetic effects contradicts the label of the quantity inferred. Here we investigate if genomic predictors for selection should be treated as standard predictors or if they must reflect a causal effect to be useful, requiring causal inference methods. Conducting the analysis as a prediction or as a causal inference task affects, for example, how covariates of the regression model are chosen, which may heavily affect the magnitude of genomic predictors and therefore selection decisions. We demonstrate that selection requires learning causal genetic effects. However, genomic predictors from some models might capture noncausal signal, providing good predictive ability but poorly representing true genetic effects. Simulated examples are used to show that aiming for predictive ability may lead to poor modeling decisions, while causal inference approaches may guide the construction of regression models that better infer the target genetic effect even when they underperform in cross-validation tests. In conclusion, genomic selection models should be constructed to aim primarily for identifiability of causal genetic effects, not for predictive ability.


Subject(s)
Genetics, Population , Genomics , Models, Genetic , Selection, Genetic , Algorithms , Computer Simulation , Genomics/methods , Humans
14.
Poult Sci ; 94(4): 772-80, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25713397

ABSTRACT

The prediction of total egg production (TEP) potential in poultry is an important task to aid optimized management decisions in commercial enterprises. The objective of the present study was to compare different modeling approaches for prediction of TEP in meat type quails (Coturnix coturnix coturnix) using phenotypes such as weight, weight gain, egg production and egg quality measurements. Phenotypic data on 30 traits from two lines (L1, n=180; and L2, n=205) of quail were modeled to predict TEP. Prediction models included multiple linear regression and artificial neural network (ANN). Moreover, Bayesian network (BN) and a stepwise approach were used as variable selection methods. BN results showed that TEP is independent from other earlier expressed traits when conditioned on egg production from 35 to 80 days of age (EP1). In addition, the prediction accuracy was much lower when EP1 was not included in the model. The best predictive model was ANN, after feature selection, showing prediction correlations of r=0.792 and r=0.714 for L1 and L2, respectively. In conclusion, machine learning methods may be useful, but reasonable prediction accuracies are obtained only when partial egg production measurements are included in the model.


Subject(s)
Animal Husbandry/methods , Coturnix/physiology , Reproduction , Animals , Bayes Theorem , Brazil , Models, Biological , Neural Networks, Computer , Regression Analysis
15.
Genet Sel Evol ; 46: 2, 2014 Jan 17.
Article in English | MEDLINE | ID: mdl-24438068

ABSTRACT

BACKGROUND: Knowledge regarding causal relationships among traits is important to understand complex biological systems. Structural equation models (SEM) can be used to quantify the causal relations between traits, which allows prediction of the outcomes of interventions applied to such a network. Such models are fitted conditionally on a causal structure among traits, represented by a directed acyclic graph, and an Inductive Causation (IC) algorithm can be used to search for causal structures. The aim of this study was to explore the space of causal structures involving bovine milk fatty acids and to select a network supported by data as the structure of a SEM. RESULTS: The IC algorithm adapted to mixed models settings was applied to study 14 correlated bovine milk fatty acids, resulting in an undirected network. The undirected pathway from C4:0 to C12:0 resembled the de novo synthesis pathway of short and medium chain saturated fatty acids. By using prior knowledge, directions were assigned to that part of the network and the resulting structure was used to fit a SEM that led to structural coefficients ranging from 0.85 to 1.05. The deviance information criterion indicated that the SEM was more plausible than the multi-trait model. CONCLUSIONS: The IC algorithm output pointed towards causal relations between the studied traits. This changed the focus from marginal associations between traits to direct relationships, thus towards relationships that may result in changes when external interventions are applied. The causal structure can give more insight into underlying mechanisms and the SEM can predict conditional changes due to such interventions.


Subject(s)
Algorithms , Fatty Acids/analysis , Milk/chemistry , Animals , Cattle , Fatty Acids/genetics , Models, Genetic , Phenotype
16.
Genetics ; 194(3): 561-72, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23608193

ABSTRACT

Structural equation models (SEMs) are multivariate specifications capable of conveying causal relationships among traits. Although these models offer insights into how phenotypic traits relate to each other, it is unclear whether and how they can improve multiple-trait selection. Here, we explored concepts involved in SEMs, seeking benefits that could be brought to breeding programs, relative to the standard multitrait model (MTM) commonly used. Genetic effects pertaining to SEMs and MTMs have distinct meanings. In SEMs, they represent genetic effects acting directly on each trait, without mediation by other traits in the model; in MTMs they express overall genetic effects on each trait, equivalent to lumping together direct and indirect genetic effects discriminated by SEMs. However, in breeding programs the goal is selecting candidates that produce offspring with best phenotypes, regardless of how traits are causally associated, so overall additive genetic effects are what matter. Thus, no information is lost in standard settings by using MTM-based predictions, even if traits are indeed causally associated. Nonetheless, causal information allows predicting effects of external interventions. One may be interested in predictions for scenarios where interventions are performed, e.g., artificially defining the value of a trait, blocking causal associations, or modifying their magnitudes. We demonstrate that with information provided by SEMs, predictions for these scenarios are possible from data recorded under no interventions. Contrariwise, MTMs do not provide information for such predictions. As livestock and crop production involves interventions such as management practices, SEMs may be advantageous in many settings.
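
The distinction between direct (SEM) and overall (MTM) genetic effects can be illustrated with a short derivation. This is a sketch in assumed notation, not taken from the article: fixed effects are omitted, and the symbols (the matrix of structural coefficients \(\boldsymbol{\Lambda}\), direct genetic effects \(\mathbf{u}_i\), residuals \(\mathbf{e}_i\)) are illustrative choices for a recursive linear SEM.

```latex
\begin{align}
\mathbf{y}_i &= \boldsymbol{\Lambda}\mathbf{y}_i + \mathbf{u}_i + \mathbf{e}_i
  && \text{(structural form)} \\
\mathbf{y}_i &= (\mathbf{I}-\boldsymbol{\Lambda})^{-1}\mathbf{u}_i
              + (\mathbf{I}-\boldsymbol{\Lambda})^{-1}\mathbf{e}_i
  && \text{(reduced form)} \\
\mathbf{u}_i^{*} &= (\mathbf{I}-\boldsymbol{\Lambda})^{-1}\mathbf{u}_i
  && \text{(overall genetic effects, as in an MTM)}
\end{align}
```

For two traits with a single path \(y_1 \rightarrow y_2\) of magnitude \(\lambda\), the overall effect on the second trait is \(u_2^{*} = \lambda u_1 + u_2\). If an intervention fixes \(y_1\) at a set value, the path from \(u_1\) to \(y_2\) is blocked and only the direct effect \(u_2\) remains relevant for \(y_2\); this quantity is identifiable from the structural form but not from \(u_2^{*}\) alone.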


Subject(s)
Genotype , Models, Genetic , Phenotype , Quantitative Trait Loci/genetics , Algorithms , Animals , Models, Theoretical , Plants/genetics
17.
Genet Sel Evol ; 43: 37, 2011 Nov 02.
Article in English | MEDLINE | ID: mdl-22047591

ABSTRACT

BACKGROUND: Structural equation models (SEM) are used to model multiple traits and the causal links among them. The number of different causal structures that can be used to fit a SEM is typically very large, even when only a few traits are studied. In recent applications of SEM in quantitative genetics mixed model settings, causal structures were pre-selected based on prior beliefs alone. Alternatively, there are algorithms that search for structures that are compatible with the joint distribution of the data. However, such a search cannot be performed directly on the joint distribution of the phenotypes since causal relationships are possibly masked by genetic covariances. In this context, the application of the Inductive Causation (IC) algorithm to the joint distribution of phenotypes conditional on unobservable genetic effects has been proposed. METHODS: Here, we applied this approach to five traits in European quail: birth weight (BW), weight at 35 days of age (W35), age at first egg (AFE), average egg weight from 77 to 110 days of age (AEW), and number of eggs laid in the same period (NE). We have focused the discussion on the challenges and difficulties resulting from applying this method to field data. Statistical decisions regarding partial correlations were based on different Highest Posterior Density (HPD) interval contents and models based on the selected causal structures were compared using the Deviance Information Criterion (DIC). In addition, we used temporal information to perform additional edge orienting, overriding the algorithm output when necessary. RESULTS: As a result, the final causal structure consisted of two separate substructures: BW→AEW and W35→AFE→NE, where an arrow represents a direct effect. Comparison between a SEM with the selected structure and a Multiple Trait Animal Model using DIC indicated that the SEM is more plausible. 
CONCLUSIONS: Coupling prior knowledge with the output provided by the IC algorithm allowed further learning regarding phenotypic causal structures when compared to standard mixed effects SEM applications.


Subject(s)
Quail/genetics , Quantitative Trait, Heritable , Algorithms , Animals , Female , Models, Genetic , Ovum/growth & development , Phenotype , Quail/growth & development , Quail/physiology , Reproduction
18.
Genet Sel Evol ; 43: 6, 2011 Feb 10.
Article in English | MEDLINE | ID: mdl-21310061

ABSTRACT

Phenotypic traits may exert causal effects between them. For example, on the one hand, high yield in dairy cows may increase the liability to certain diseases and, on the other hand, the incidence of a disease may affect yield negatively. Likewise, the transcriptome may be a function of the reproductive status in mammals and the latter may depend on other physiological variables. Knowledge of phenotype networks describing such interrelationships can be used to predict the behavior of complex systems, e.g. biological pathways underlying complex traits such as diseases, growth and reproduction. Structural Equation Models (SEM) can be used to study recursive and simultaneous relationships among phenotypes in multivariate systems such as genetical genomics, system biology, and multiple trait models in quantitative genetics. Hence, SEM can produce an interpretation of relationships among traits which differs from that obtained with traditional multiple trait models, in which all relationships are represented by symmetric linear associations among random variables, such as covariances and correlations. In this review, we discuss the application of SEM and related techniques for the study of multiple phenotypes. Two basic scenarios are considered, one pertaining to genetical genomics studies, in which QTL or molecular marker information is used to facilitate causal inference, and another related to quantitative genetic analysis in livestock, in which only phenotypic and pedigree information is available. Advantages and limitations of SEM compared to traditional approaches commonly used for the analysis of multiple traits, as well as some indication of future research in this area are presented in a concluding section.


Subject(s)
Models, Genetic , Phenotype , Quantitative Trait Loci , Algorithms , Animals , Humans
19.
Genetics ; 185(2): 633-44, 2010 Jun.
Article in English | MEDLINE | ID: mdl-20351220

ABSTRACT

Biology is characterized by complex interactions between phenotypes, such as recursive and simultaneous relationships between substrates and enzymes in biochemical systems. Structural equation models (SEMs) can be used to study such relationships in multivariate analyses, e.g., with multiple traits in a quantitative genetics context. Nonetheless, the number of different recursive causal structures that can be used for fitting a SEM to multivariate data can be huge, even when only a few traits are considered. In recent applications of SEMs in mixed-model quantitative genetics settings, causal structures were preselected on the basis of prior biological knowledge alone. Therefore, the wide range of possible causal structures has not been properly explored. Alternatively, causal structure spaces can be explored using algorithms that, using data-driven evidence, can search for structures that are compatible with the joint distribution of the variables under study. However, the search cannot be performed directly on the joint distribution of the phenotypes as it is possibly confounded by genetic covariance among traits. In this article we propose to search for recursive causal structures among phenotypes using the inductive causation (IC) algorithm after adjusting the data for genetic effects. A standard multiple-trait model is fitted using Bayesian methods to obtain a posterior covariance matrix of phenotypes conditional to unobservable additive genetic effects, which is then used as input for the IC algorithm. As an illustrative example, the proposed methodology was applied to simulated data related to multiple traits measured on a set of inbred lines.
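
One building block of such a search can be sketched in Python: computing partial correlations from a covariance matrix of phenotypes conditional on genetic effects. This is a minimal illustration under stated assumptions, not the authors' implementation. The function name `partial_correlations`, the three-trait chain, and its coefficients are hypothetical, and a single known covariance matrix stands in for posterior samples. The IC algorithm itself tests conditional independence over many conditioning subsets; only the case of conditioning on all remaining traits is shown here.

```python
import numpy as np


def partial_correlations(cov):
    """Partial correlation of each pair of variables given all the others.

    Uses the standard identity based on the precision (inverse covariance)
    matrix P: rho_ij = -P_ij / sqrt(P_ii * P_jj).
    """
    precision = np.linalg.inv(cov)
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor


# Hypothetical recursive structure y1 -> y2 -> y3 with unit residual variances:
#   y1 = e1,  y2 = 0.8 * y1 + e2,  y3 = 0.5 * y2 + e3
lam = np.array([[0.0, 0.0, 0.0],
                [0.8, 0.0, 0.0],
                [0.0, 0.5, 0.0]])
mixing = np.linalg.inv(np.eye(3) - lam)

# Covariance implied by the structure; it plays the role of the covariance of
# phenotypes after adjusting for genetic effects (an assumption of this sketch).
cov = mixing @ mixing.T

pcor = partial_correlations(cov)
print(np.round(pcor, 3))
```

In this chain the partial correlation between the first and third traits given the second is zero, so no edge would be kept between them, while the pairs linked by a direct path retain nonzero partial correlations.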


Subject(s)
Algorithms , Factor Analysis, Statistical , Bayes Theorem , Humans , Multivariate Analysis , Phenotype