Results 1 - 20 of 131
1.
G3 (Bethesda) ; 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-38979923

ABSTRACT

Substantial functional metabolic diversity exists within species of cultivated grain crops that directly or indirectly provide more than half of all calories consumed by humans around the globe. While such diversity is the molecular currency used for improving agronomic traits, diversity is poorly characterized for its effects on human nutrition and utilization by gut microbes. Moreover, we know little about agronomic traits' potential trade-offs and pleiotropic effects on human nutritional traits. Here we applied a quantitative genetics approach using a meta-analysis and parallel genome-wide association studies of Sorghum bicolor traits describing changes in the composition and function of human gut microbe communities and any of 200 sorghum seed and agronomic traits across a diverse sorghum population to identify associated genetic variants. A total of fifteen multiple-effect loci (MEL) were initially found where different alleles in the sorghum genome produced changes in seed that affected the abundance of multiple bacterial taxa across two human microbiomes in automated in vitro fermentations. Next, parallel genome-wide studies conducted for seed, biochemical, and agronomic traits in the same population identified significant associations within the boundaries of 13/15 MEL for microbiome traits. In several instances, the co-localization of variation affecting gut microbiome and agronomic traits provided hypotheses for causal mechanisms through which variation could affect both agronomic traits and human gut microbes. This work demonstrates that genetic factors affecting agronomic traits in sorghum seed can also drive significant effects on human gut microbes, particularly bacterial taxa considered beneficial. Understanding these pleiotropic relationships will inform future strategies for crop improvement toward yield, sustainability, and human health.

2.
Planta ; 260(2): 44, 2024 Jul 04.
Article in English | MEDLINE | ID: mdl-38963439

ABSTRACT

MAIN CONCLUSION: The pilot-scale genome-wide association study in US proso millet identified twenty marker-trait associations for five morpho-agronomic traits, identifying genomic regions for future studies (e.g. molecular breeding and map-based cloning). Proso millet (Panicum miliaceum L.) is an ancient grain recognized for its excellent water-use efficiency and short growing season. It is an indispensable part of the winter wheat-based dryland cropping system in the High Plains of the USA. Its grains are endowed with high nutritional and health-promoting properties, making it increasingly popular in the global market for healthy grains. There is a dearth of genomic resources in proso millet for developing molecular tools to complement conventional breeding for developing high-yielding varieties. Genome-wide association study (GWAS) is a widely used method to dissect the genetics of complex traits. In this pilot study, the first-ever GWAS in US proso millet, 71 globally diverse genotypes out of the 109 in the US proso millet core collection were evaluated for five major morpho-agronomic traits at two locations in western Nebraska, and GWAS was conducted to identify single nucleotide polymorphisms (SNPs) associated with these traits. Analysis of variance showed that there was a significant difference among the genotypes, and all five traits were also found to be highly correlated with each other. Sequence reads from genotyping-by-sequencing (GBS) were used to identify 11,147 high-quality bi-allelic SNPs. Population structure analysis with those SNPs showed stratification within the core collection. The GWAS identified twenty marker-trait associations (MTAs) for the five traits. Twenty-nine putative candidate genes associated with the five traits were also identified. These genomic regions can be used to develop genetic markers for marker-assisted selection in proso millet breeding.


Subject(s)
Genome-Wide Association Study , Panicum , Polymorphism, Single Nucleotide , Panicum/genetics , Polymorphism, Single Nucleotide/genetics , Genetic Markers , Genotype , Phenotype , Quantitative Trait Loci/genetics , Pilot Projects , Genome, Plant/genetics , Plant Breeding/methods
3.
Plant Commun ; : 101010, 2024 Jun 25.
Article in English | MEDLINE | ID: mdl-38918950

ABSTRACT

Genome-wide association study (GWAS) identifies trait-associated loci, but due in part to slow decay of linkage disequilibrium (LD), identifying the causal genes can be a bottleneck. Transcriptome-wide association study (TWAS) addresses this by identifying gene expression-phenotype associations or integrating gene expression quantitative trait loci (eQTLs) with GWAS results. Here, we used self-pollinated soybean (Glycine max [L.] Merr.) as a model to evaluate the application of TWAS in the genetic dissection of traits in plant species with slow LD decay. We generated RNA-Seq data of a soybean diversity panel, and identified the genetic expression regulation of 29,286 genes in soybean. Different TWAS solutions were less affected by LD and were robust to the source of expression data, identifying known genes related to traits from different developmental stages and tissues. A novel gene named pod color L2 was identified via TWAS and functionally validated by genome editing. By introducing the new exon proportion feature, we significantly improved the detection of expression variations resulting from structural variations and alternative splicing. As a result, the genes identified by our TWAS approach exhibited a diverse range of causal variations, including SNP, insertion/deletion, gene fusion, copy number variation, and alternative splicing. Using our TWAS approach, we identified genes associated with flowering time, including both previously known genes and novel genes that had not previously been linked to this trait, providing insights complementary to GWAS. In summary, this study supports the application of TWAS for candidate gene identification in species with low rates of LD decay.

4.
Food Chem ; 456: 140062, 2024 Jun 10.
Article in English | MEDLINE | ID: mdl-38876073

ABSTRACT

Differences in moisture and protein content impact both nutritional value and processing efficiency of corn kernels. Near-infrared (NIR) spectroscopy can be used to estimate kernel composition, but models trained on a few environments may underestimate error rates and bias. We assembled corn samples from diverse international environments and used NIR with chemometrics and partial least squares regression (PLSR) to determine moisture and protein. The potential of five feature selection methods to improve prediction accuracy was assessed by extracting sensitive wavelengths. Gradient boosting machines (GBMs), particularly CatBoost and LightGBM, were found to effectively select crucial wavelengths for moisture (1409, 1900, 1908, 1932, 1953, 2174 nm) and protein (887, 1212, 1705, 1891, 2097, 2456 nm). SHAP plots highlighted significant wavelength contributions to model prediction. These results illustrate GBMs' effectiveness in feature engineering for agricultural and food sector applications, including developing multi-country global calibration models for moisture and protein in corn kernels.

5.
J Exp Bot ; 2024 May 29.
Article in English | MEDLINE | ID: mdl-38808657

ABSTRACT

Chilling stress threatens plant growth and development, particularly affecting membrane fluidity and cellular integrity. Understanding plant membrane responses to chilling stress is important for unraveling the molecular mechanisms of stress tolerance. Whereas core transcriptional responses to chilling stress and stress tolerance are conserved across species, the associated changes in membrane lipids appear to be less conserved, as which lipids are affected by chilling stress varies by species. Here, we investigated changes in gene expression and membrane lipids in response to chilling stress during one 24-hour cycle in chilling-tolerant foxtail millet (Setaria italica), chilling-sensitive sorghum (Sorghum bicolor), and Urochloa (browntop signal grass, Urochloa fusca, lipids only), leveraging their evolutionary relatedness and differing levels of chilling-stress tolerance. We show that most chilling-induced lipid changes are conserved across the three species, while we observed distinct, time-specific responses in chilling-tolerant foxtail millet, indicating the presence of a finely orchestrated adaptive mechanism. We detected rhythmicity in lipid responses to chilling stress in the three grasses, which was also present in Arabidopsis (Arabidopsis thaliana), suggesting the conservation of rhythmic patterns across species and highlighting the importance of accounting for time of day. When integrating lipid datasets with gene expression profiles, we identified potential candidate genes that showed corresponding transcriptional changes in response to chilling stress, providing insights into the differences in regulatory mechanisms between chilling-sensitive sorghum and chilling-tolerant foxtail millet.

6.
J Plant Physiol ; 297: 154261, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38705078

ABSTRACT

Non-photochemical quenching (NPQ) protects plants from photodamage caused by excess light energy. Substantial variation in NPQ has been reported among different genotypes of the same species. However, comparatively little is known about how environmental perturbations, including nutrient deficits, impact natural variation in NPQ kinetics. Here, we analyzed natural variation in NPQ kinetics in a diversity panel of 225 maize (Zea mays L.) genotypes under nitrogen-replete and nitrogen-deficient field conditions. Individual maize genotypes from the diversity panel exhibited a range of changes in NPQ in response to low nitrogen. Replicated genotypes exhibited consistent responses across two field experiments conducted in different years. At the seedling and pre-flowering stages, a similar proportion of the genotypes (∼33%) showed a decrease, no change, or an increase in NPQ under low nitrogen relative to control. Genotypes with increased NPQ under low nitrogen also showed greater reductions in dry biomass and photosynthesis than genotypes with stable NPQ when exposed to low nitrogen conditions. Maize genotypes in which an increase in NPQ was observed under low nitrogen also exhibited a reduction in the ratio of chlorophyll a to chlorophyll b. Our results underline that, since thermal dissipation of excess excitation energy measured via NPQ helps to balance the energy absorbed with the energy utilized, changes in NPQ reflect broader molecular and biochemical changes that occur under stresses such as low soil fertility. Here, we have demonstrated that the genetic and environmental factors that produce variation in NPQ kinetics are not independent of each other. Natural genetic variation controlling plastic responses of NPQ kinetics to environmental perturbation increases the likelihood that it will be possible to optimize NPQ kinetics in crop plants for different environments.


Subject(s)
Chlorophyll A , Chlorophyll , Genotype , Nitrogen , Zea mays , Zea mays/genetics , Zea mays/metabolism , Zea mays/physiology , Nitrogen/metabolism , Nitrogen/deficiency , Chlorophyll/metabolism , Chlorophyll A/metabolism , Photosynthesis
7.
Plant J ; 119(2): 844-860, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38812347

ABSTRACT

Transcriptome-wide association studies (TWAS) can provide single-gene resolution for candidate genes in plants, complementing genome-wide association studies (GWAS), but efforts in plants have been met with, at best, mixed success. We generated expression data from 693 maize genotypes, measured in a common field experiment, sampled over a 2-h period to minimize diurnal and environmental effects, using full-length RNA-seq to maximize the accurate estimation of transcript abundance. TWAS could identify roughly 10 times as many genes likely to play a role in flowering time regulation as GWAS conducted using data from the same experiment. TWAS using mature leaf tissue identified true-positive flowering time genes known to act in the shoot apical meristem, and trait data from a new environment enabled the identification of additional flowering time genes without the need for new expression data. eQTL analysis of TWAS-tagged genes identified at least one additional known maize flowering time gene through trans-eQTL interactions. Collectively, these results suggest that the gene expression resource described here can link genes to functions across different plant phenotypes expressed in a range of tissues and scored in different experiments.


Subject(s)
Flowers , Gene Expression Regulation, Plant , Genome-Wide Association Study , Quantitative Trait Loci , Transcriptome , Zea mays , Zea mays/genetics , Zea mays/physiology , Flowers/genetics , Flowers/physiology , Quantitative Trait Loci/genetics , Genotype , Phenotype , Genes, Plant/genetics , Plant Leaves/genetics , Plant Leaves/physiology , Plant Leaves/metabolism , Gene Expression Profiling
8.
Sensors (Basel) ; 24(7), 2024 Mar 28.
Article in English | MEDLINE | ID: mdl-38610383

ABSTRACT

Unmanned aerial vehicle (UAV)-based imagery has become widely used to collect time-series agronomic data, which are then incorporated into plant breeding programs to enhance crop improvements. To make efficient analysis possible, in this study, by leveraging an aerial photography dataset for a field trial of 233 different inbred lines from the maize diversity panel, we developed machine learning methods for obtaining automated tassel counts at the plot level. We employed both an object-based counting-by-detection (CBD) approach and a density-based counting-by-regression (CBR) approach. Using an image segmentation method that removes most of the pixels not associated with the plant tassels, the results showed a dramatic improvement in the accuracy of object-based (CBD) detection, with the cross-validation prediction accuracy (r2) peaking at 0.7033 on a detector trained with images with a filter threshold of 90. The CBR approach showed the greatest accuracy when using unfiltered images, with a mean absolute error (MAE) of 7.99. However, when using bootstrapping, images filtered at a threshold of 90 showed a slightly better MAE (8.65) than the unfiltered images (8.90). These methods will allow for accurate estimates of flowering-related traits and help to make breeding decisions for crop improvement.


Subject(s)
Inflorescence , Zea mays , Plant Breeding , Algorithms , Machine Learning
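
Note on the metrics in record 8: the abstract reports tassel-counting accuracy as r2 and as mean absolute error (MAE) but does not say how either was computed. The sketch below is one plausible reading, assuming r2 is the squared Pearson correlation between observed and predicted plot-level counts. The function names and the example counts are hypothetical and are not taken from the study.

```python
def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted counts."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def squared_pearson_r(observed, predicted):
    """Squared Pearson correlation (assumed here to be what 'r2' denotes)."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return (cov * cov) / (var_o * var_p)


# Hypothetical tassel counts for four plots (illustration only).
observed_counts = [10, 20, 30, 40]
predicted_counts = [12, 18, 33, 37]

mae = mean_absolute_error(observed_counts, predicted_counts)
r2 = squared_pearson_r(observed_counts, predicted_counts)
print(f"MAE = {mae:.2f}, r2 = {r2:.4f}")
```

The two metrics answer different questions: MAE is in units of tassels per plot, while r2 is unitless and insensitive to a constant offset in the predictions, which may be why the abstract reports both.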
9.
MicroPubl Biol ; 2024, 2024.
Article in English | MEDLINE | ID: mdl-38495581

ABSTRACT

Leaf chlorophyll concentration was measured for 84 publicly available maize hybrids grown under three nitrogen fertilizer treatments in two contrasting environments in Nebraska. The effect of nitrogen treatment on chlorophyll response was found to be significant (p < 0.05) for both locations. In Scottsbluff, chlorophyll concentrations increased significantly with increasing nitrogen rate, while no significant difference was found between medium and high nitrogen in Lincoln. Within equivalent nitrogen treatments, chlorophyll was more abundant in Lincoln than Scottsbluff for nearly every hybrid. Hybrid response was not consistent between environments, with approximately 11% of variance explained by genotype by environment interaction.

10.
Ann Bot ; 132(3): 413-428, 2023 11 23.
Article in English | MEDLINE | ID: mdl-37675505

ABSTRACT

BACKGROUND AND AIMS: Phosphoenolpyruvate (PEP) carboxylase (PEPC) catalyses the irreversible carboxylation of PEP with bicarbonate to produce oxaloacetate. This reaction powers the carbon-concentrating mechanism (CCM) in plants that perform C4 photosynthesis. This CCM is generally driven by a single PEPC gene product that is highly expressed in the cytosol of mesophyll cells. We found two C4 grasses, Panicum miliaceum and Echinochloa colona, that each have two highly expressed PEPC genes. We characterized the kinetic properties of the two most abundant PEPCs in E. colona and P. miliaceum to better understand how the enzyme's amino acid structure influences its function. METHODS: Coding sequences of the two most abundant PEPC proteins in E. colona and P. miliaceum were synthesized by GenScript and were inserted into bacterial expression plasmids. Point mutations resulting in substitutions at conserved amino acid residues (e.g. N-terminal serine and residue 890) were created via site-directed PCR mutagenesis. The kinetic properties of semi-purified plant PEPCs from Escherichia coli were analysed using membrane-inlet mass spectrometry and a spectrophotometric enzyme-coupled reaction. KEY RESULTS: The two most abundant P. miliaceum PEPCs (PmPPC1 and PmPPC2) have similar sequence identities (>95 %), and as a result had similar kinetic properties. The two most abundant E. colona PEPCs (EcPPC1 and EcPPC2) had identities of ~78 % and had significantly different kinetic properties. The PmPPCs and EcPPCs had different responses to allosteric inhibitors and activators, and substitutions at the conserved N-terminal serine and residue 890 resulted in significantly altered responses to allosteric regulators. CONCLUSIONS: The two significantly expressed C4 Ppc genes in P. miliaceum were probably the result of genomes combining from two closely related C4 Panicum species. We found natural variation in PEPC's sensitivity to allosteric inhibition that seems to bypass the conserved 890 residue, suggesting alternative evolutionary pathways for increased malate tolerance and other kinetic properties.


Subject(s)
Phosphoenolpyruvate Carboxylase , Poaceae , Amino Acid Sequence , Poaceae/genetics , Poaceae/metabolism , Phosphoenolpyruvate Carboxylase/genetics , Phosphoenolpyruvate Carboxylase/chemistry , Phosphoenolpyruvate Carboxylase/metabolism , Biological Evolution , Plants/metabolism , Serine/genetics , Kinetics
11.
Methods Mol Biol ; 2698: 361-379, 2023.
Article in English | MEDLINE | ID: mdl-37682485

ABSTRACT

Leveraging existing resources in studied species to predict gene functions has the potential to rapidly expand understanding of annotated genes in other, less well-studied, species with assembled genomes. However, orthology is not a reliable predictor for the transcriptional responses of genes to stress. Machine learning methods can quantitatively estimate expression patterns and gene functions using known annotations and collections of features describing each gene. In this chapter, we describe a supervised machine learning framework to predict stress-responsive genes across species using only features derived from nucleotide sequences, using the example of cold stress-responsive genes in different Panicoid grass species.


Subject(s)
Machine Learning , Supervised Machine Learning , Cold-Shock Response , Poaceae/genetics
12.
BMC Res Notes ; 16(1): 219, 2023 Sep 14.
Article in English | MEDLINE | ID: mdl-37710302

ABSTRACT

OBJECTIVES: This release note describes the Maize GxE project datasets within the Genomes to Fields (G2F) Initiative. The Maize GxE project aims to understand genotype by environment (GxE) interactions and use the information collected to improve resource allocation efficiency and increase genotype predictability and stability, particularly in scenarios of variable environmental patterns. Hybrids and inbreds are evaluated across multiple environments, and phenotypic, genotypic, environmental, and metadata information is made publicly available. DATA DESCRIPTION: The datasets include phenotypic data of the hybrids and inbreds evaluated in 30 locations across the US and one location in Germany in 2020 and 2021, soil and climatic measurements and metadata information for all environments (combination of year and location), ReadMe, and description files for each data type. A set of common hybrids is present in each environment to connect with previous evaluations. Each environment had a collaborator responsible for collecting and submitting the data; the GxE coordination team combined all the collected information and removed obviously erroneous data. Collaborators received the combined data to use, verify, and declare that the data generated in their own environments were accurate. Combined data are released to the public with minimal filtering to maintain fidelity to the original data.


Subject(s)
Resource Allocation , Zea mays , Zea mays/genetics , Seasons , Genotype , Germany
13.
BMC Res Notes ; 16(1): 148, 2023 Jul 17.
Article in English | MEDLINE | ID: mdl-37461058

ABSTRACT

OBJECTIVES: The Genomes to Fields (G2F) 2022 Maize Genotype by Environment (GxE) Prediction Competition aimed to develop models for predicting grain yield for the 2022 Maize GxE project field trials, leveraging the datasets previously generated by this project and other publicly available data. DATA DESCRIPTION: This resource used data from the Maize GxE project within the G2F Initiative [1]. The dataset included phenotypic and genotypic data of the hybrids evaluated in 45 locations from 2014 to 2022. It also included soil, weather, and environmental covariate data and metadata information for all environments (combination of year and location). Competitors also had access to ReadMe files that described all the files provided. The Maize GxE project is collaborative, and all the data generated become publicly available [2]. The dataset used in the 2022 Prediction Competition was curated and lightly filtered for quality and to ensure naming uniformity across years.


Subject(s)
Genome, Plant , Zea mays , Phenotype , Zea mays/genetics , Genotype , Genome, Plant/genetics , Edible Grain/genetics
14.
Nat Genet ; 55(7): 1221-1231, 2023 07.
Article in English | MEDLINE | ID: mdl-37322109

ABSTRACT

A complete telomere-to-telomere (T2T) finished genome has long been a goal of genomic research. By generating deep-coverage ultralong Oxford Nanopore Technology (ONT) and PacBio HiFi reads, we report here a complete genome assembly of maize with each chromosome entirely traversed in a single contig. The 2,178.6 Mb T2T Mo17 genome, with a base accuracy of over 99.99%, unveiled the structural features of all repetitive regions of the genome. There were several super-long simple-sequence-repeat arrays having consecutive thymine-adenine-guanine (TAG) tri-nucleotide repeats up to 235 kb. The assembly of the entire nucleolar organizer region of the 26.8 Mb array with 2,974 45S rDNA copies revealed the enormously complex patterns of rDNA duplications and transposon insertions. Additionally, complete assemblies of all ten centromeres enabled us to precisely dissect the repeat compositions of both CentC-rich and CentC-poor centromeres. The complete Mo17 genome represents a major step forward in understanding the complexity of the highly recalcitrant repetitive regions of higher plant genomes.


Subject(s)
Genomics , Zea mays , Zea mays/genetics , Repetitive Sequences, Nucleic Acid/genetics , Genome, Plant , Telomere/genetics , Sequence Analysis, DNA , High-Throughput Nucleotide Sequencing
15.
J Exp Bot ; 74(17): 5405-5417, 2023 09 13.
Article in English | MEDLINE | ID: mdl-37357909

ABSTRACT

Severe cold, defined as a damaging cold beyond acclimation temperatures, has unique responses, but the signaling and evolution of these responses are not well understood. Production of oligogalactolipids, which is triggered by cytosolic acidification in Arabidopsis (Arabidopsis thaliana), contributes to survival in severe cold. Here, we investigated oligogalactolipid production in species from bryophytes to angiosperms. Production of oligogalactolipids differed within each clade, suggesting multiple evolutionary origins of severe cold tolerance. We also observed greater oligogalactolipid production in control samples than in temperature-challenged samples of some species. Further examination of representative species revealed a tight association between temperature, damage, and oligogalactolipid production that scaled with the cold tolerance of each species. Based on oligogalactolipid production and transcript changes, multiple angiosperm species share a signal of oligogalactolipid production initially described in Arabidopsis, namely cytosolic acidification. Together, these data suggest that oligogalactolipid production is a severe cold response that originated from an ancestral damage response that remains in many land plant lineages and that cytosolic acidification may be a common signaling mechanism for its activation.


Subject(s)
Arabidopsis Proteins , Arabidopsis , Magnoliopsida , Arabidopsis/metabolism , Cold Temperature , Arabidopsis Proteins/metabolism , Temperature , Magnoliopsida/metabolism , Acclimatization/physiology , Gene Expression Regulation, Plant
16.
New Phytol ; 239(3): 1068-1082, 2023 08.
Article in English | MEDLINE | ID: mdl-37212042

ABSTRACT

Photoprotection against excess light via nonphotochemical quenching (NPQ) is indispensable for plant survival. However, slow NPQ relaxation under low light conditions can decrease the yield of field-grown crops by up to 40%. Using a semi-high-throughput assay, we quantified the kinetics of NPQ and photosystem II operating efficiency (ΦPSII) in a replicated field trial of more than 700 maize (Zea mays) genotypes across 2 yr. Parametrized kinetics data were used to conduct genome-wide association studies. For six candidate genes involved in NPQ and ΦPSII kinetics in maize, the loss-of-function alleles of orthologous genes in Arabidopsis (Arabidopsis thaliana) were characterized: two thioredoxin genes, and genes encoding a transporter in the chloroplast envelope, an initiator of chloroplast movement, a putative regulator of cell elongation and stomatal patterning, and a protein involved in plant energy homeostasis. Since maize and Arabidopsis are distantly related, we propose that genes involved in photoprotection and PSII function are conserved across vascular plants. The genes and naturally occurring functional alleles identified here considerably expand the toolbox for achieving a sustainable increase in crop productivity.


Subject(s)
Arabidopsis , Arabidopsis/genetics , Arabidopsis/metabolism , Photosystem II Protein Complex/genetics , Photosystem II Protein Complex/metabolism , Light , Genome-Wide Association Study , Chloroplasts/metabolism , Photosynthesis , Chlorophyll/metabolism
17.
BMC Genom Data ; 24(1): 29, 2023 05 25.
Article in English | MEDLINE | ID: mdl-37231352

ABSTRACT

OBJECTIVES: This report provides information about the public release of the 2018-2019 Maize G X E project datasets of the Genomes to Fields (G2F) Initiative. G2F is an umbrella initiative that evaluates maize hybrids and inbred lines across multiple environments and makes available phenotypic, genotypic, environmental, and metadata information. The initiative understands the necessity to characterize and deploy public sources of genetic diversity to face the challenges for more sustainable agriculture in the context of variable environmental conditions. DATA DESCRIPTION: Datasets include phenotypic, climatic, and soil measurements, metadata information, and inbred genotypic information for each combination of location and year. Collaborators in the G2F initiative collected data for each location and year; members of the group responsible for coordination and data processing combined all the collected information and removed obviously erroneous data. The collaborators received the data before the DOI release to verify and declare that the data generated in their own locations were accurate. ReadMe and description files are available for each dataset. Previous years of evaluation are already publicly available, with common hybrids present to connect across all locations and years evaluated since this project's inception.


Subject(s)
Genome, Plant , Zea mays , Phenotype , Zea mays/genetics , Seasons , Genotype , Genome, Plant/genetics
18.
J Exp Bot ; 74(14): 4050-4062, 2023 08 03.
Article in English | MEDLINE | ID: mdl-37018460

ABSTRACT

Leaf-level hyperspectral reflectance has become an effective tool for high-throughput phenotyping of plant leaf traits due to its rapid, low-cost, multi-sensing, and non-destructive nature. However, collecting samples for model calibration can still be expensive, and models show poor transferability among different datasets. This study had three specific objectives: first, to assemble a large library of leaf hyperspectral data (n=2460) from maize and sorghum; second, to evaluate two machine-learning approaches to estimate nine leaf properties (chlorophyll, thickness, water content, nitrogen, phosphorus, potassium, calcium, magnesium, and sulfur); and third, to investigate the usefulness of this spectral library for predicting external datasets (n=445) including soybean and camelina using extra-weighted spiking. Internal cross-validation showed satisfactory performance of the spectral library in estimating all nine traits (mean R2=0.688), with partial least-squares regression outperforming deep neural network models. Models calibrated solely using the spectral library showed degraded performance on external datasets (mean R2=0.159 for camelina, 0.337 for soybean). Models improved significantly when a small portion of external samples (n=20) was added to the library via extra-weighted spiking (mean R2=0.574 for camelina, 0.536 for soybean). The leaf-level spectral library greatly benefits plant physiological and biochemical phenotyping, whilst extra-weighted spiking improves model transferability and extends its utility.


Subject(s)
Chlorophyll , Edible Grain , Chlorophyll/metabolism , Phenotype , Edible Grain/metabolism , Plant Leaves/metabolism , Least-Squares Analysis , Glycine max/metabolism
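
Note on extra-weighted spiking in record 18: the abstract does not describe how the external samples were weighted. A common way to implement the idea, assumed here and not confirmed by the abstract, is to append a few local samples to the calibration library and replicate each one several times so that the small local set is not swamped by the much larger library. The function name, the `copies` parameter, and the sample labels below are hypothetical.

```python
def spike_with_extra_weight(library, local_samples, copies):
    """Return a calibration set: the library plus `copies` replicates of
    every local (spiking) sample. Replication is the assumed weighting."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    replicated = [sample for sample in local_samples for _ in range(copies)]
    return list(library) + replicated


def local_share(library, local_samples, copies):
    """Fraction of the spiked calibration set made up of local replicates."""
    total = len(library) + copies * len(local_samples)
    return copies * len(local_samples) / total


# Hypothetical records: (sample id, measured trait value).
library = [("lib1", 41.0), ("lib2", 38.5), ("lib3", 44.2),
           ("lib4", 40.1), ("lib5", 39.7), ("lib6", 42.3)]
local = [("ext1", 30.2), ("ext2", 28.9)]

calibration = spike_with_extra_weight(library, local, copies=3)
print(len(calibration), round(local_share(library, local, copies=3), 2))
```

With sizes like those in the abstract (2460 library samples, 20 spiking samples), unreplicated local samples would make up under 1% of the calibration set, which illustrates why some form of extra weighting is needed for them to influence the model.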
19.
Genome Biol ; 24(1): 55, 2023 03 24.
Article in English | MEDLINE | ID: mdl-36964601

ABSTRACT

BACKGROUND: Transcription bridges genetic information and phenotypes. Here, we evaluated how changes in transcriptional regulation enable maize (Zea mays), a crop originally domesticated in the tropics, to adapt to temperate environments. RESULT: We generated 572 unique RNA-seq datasets from the roots of 340 maize genotypes. Genes involved in core processes such as cell division, chromosome organization and cytoskeleton organization showed lower heritability of gene expression, while genes involved in anti-oxidation activity exhibited higher expression heritability. An expression genome-wide association study (eGWAS) identified 19,602 expression quantitative trait loci (eQTLs) associated with the expression of 11,444 genes. A GWAS for alternative splicing identified 49,897 splicing QTLs (sQTLs) for 7614 genes. Genes harboring both cis-eQTLs and cis-sQTLs in linkage disequilibrium were disproportionately likely to encode transcription factors or were annotated as responding to one or more stresses. Independent component analysis of gene expression data identified loci regulating co-expression modules involved in oxidation reduction, response to water deprivation, plastid biogenesis, protein biogenesis, and plant-pathogen interaction. Several genes involved in cell proliferation, flower development, DNA replication, and gene silencing showed lower gene expression variation explained by genetic factors between temperate and tropical maize lines. A GWAS of 27 previously published phenotypes identified several candidate genes overlapping with genomic intervals showing signatures of selection during adaptation to temperate environments. CONCLUSION: Our results illustrate how maize transcriptional regulatory networks enable changes in transcriptional regulation to adapt to temperate regions.


Subject(s)
Transcriptome , Zea mays , Genome-Wide Association Study , Quantitative Trait Loci , Phenotype , Polymorphism, Single Nucleotide
20.
G3 (Bethesda) ; 13(4), 2023 04 11.
Article in English | MEDLINE | ID: mdl-36625555

ABSTRACT

Accurate prediction of the phenotypic outcomes produced by different combinations of genotypes, environments, and management interventions remains a key goal in biology with direct applications to agriculture, research, and conservation. The past decades have seen an expansion of new methods applied toward this goal. Here we predict maize yield using deep neural networks, compare the efficacy of 2 model development methods, and contextualize model performance using conventional linear and machine learning models. We examine the usefulness of incorporating interactions between disparate data types. We find that deep learning and best linear unbiased predictor (BLUP) models with interactions had the best overall performance. BLUP models achieved the lowest average error, but deep learning models performed more consistently with similar average error. Optimizing deep neural network submodules for each data type improved model performance relative to optimizing the whole model for all data types at once. Examining the effect of interactions in the best-performing model revealed that including interactions altered the model's sensitivity to weather and management features, including a reduction of the importance scores for timepoints expected to have a limited physiological basis for influencing yield, namely those at the extreme end of the season, nearly 200 days post planting. Based on these results, deep learning provides a promising avenue for the phenotypic prediction of complex traits in complex environments and a potential mechanism to better understand the influence of environmental and genetic factors.


Subject(s)
Deep Learning , Neural Networks, Computer , Machine Learning , Genotype , Multifactorial Inheritance