Search | VHL Regional Portal

A parametric framework for multidimensional linear measurement error regression.

Luck, Stanley.

PLoS One ; 17(1): e0262148, 2022.

Article in English | MEDLINE | ID: mdl-35061791

ABSTRACT

The ordinary linear regression method is limited to bivariate data because it is based on the Cartesian representation y = f(x). Using the chain rule, we transform the method to the parametric representation (x(t), y(t)) and obtain a linear regression framework in which the weighted average is used as a parameter for a multivariate linear relation for a set of linearly related variable vectors (LRVVs). We confirm the proposed approach by a Monte Carlo simulation, where the minimum coefficient of variation for error (CVE) provides the optimal weights when forming a weighted average of LRVVs. Then, we describe a parametric linear regression (PLR) algorithm in which the Moore-Penrose pseudoinverse is used to estimate measurement error regression (MER) parameters individually for the given variable vectors. We demonstrate that MER parameters from the PLR and nonlinear ODRPACK methods are quite similar for a wide range of reliability ratios, but ODRPACK is formulated only for bivariate data. We identify scale invariant quantities for the PLR and weighted orthogonal regression (WOR) methods and their correspondences with the partitioned residual effects between the variable vectors. Thus, the specification of an error model for the data is essential for MER and we discuss the use of Monte Carlo methods for estimating the distributions and confidence intervals for MER slope and correlation coefficient. We distinguish between elementary covariance for the y = f(x) representation and covariance vector for the (x(t), y(t)) representation. We also discuss the multivariate generalization of the Pearson correlation as the contraction between Cartesian polyad alignment tensors for the LRVVs and weighted average. Finally, we demonstrate the use of multidimensional PLR in estimating the MER parameters for replicate RNA-Seq data and quadratic regression for estimating the parameters of the conical dispersion of read count data about the MER line.
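As an illustration of the idea summarized above, the following is a minimal sketch, not the published PLR algorithm: each variable vector is regressed on a weighted-average parameter t using the Moore-Penrose pseudoinverse, and the slope between two variables is taken as the ratio of their slopes with respect to t. The function name `parametric_fit`, the equal default weights, and the simulated data are assumptions made for this example; the abstract selects weights by minimum CVE, which is not reproduced here.

```python
import numpy as np

def parametric_fit(X, weights=None):
    """Fit every column of X as a linear function of a weighted average t.

    Hypothetical helper for illustration only. Returns (intercepts, slopes, t).
    """
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    # Assumption: equal weights unless the caller supplies others.
    w = np.full(k, 1.0 / k) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.sum()
    t = X @ w                                  # weighted average used as the parameter
    design = np.column_stack([np.ones(n), t])  # columns: constant term, t
    coef = np.linalg.pinv(design) @ X          # least-squares fit for all columns at once
    return coef[0], coef[1], t

# Simulated bivariate data with error in both variables (assumed error model).
rng = np.random.default_rng(0)
t_true = np.linspace(0.0, 10.0, 200)
x = t_true + rng.normal(scale=0.5, size=t_true.size)
y = 1.0 + 2.0 * t_true + rng.normal(scale=0.5, size=t_true.size)

intercepts, slopes, t = parametric_fit(np.column_stack([x, y]))
slope_y_on_x = slopes[1] / slopes[0]           # slope of y relative to x via the parameter t
print(slope_y_on_x)
```

Because t is itself a weighted sum of the columns, the weighted sum of the fitted slopes is 1 and the weighted sum of the intercepts is 0, which gives a quick consistency check on any implementation of this kind.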

Subject(s)

Algorithms; Linear Models; Monte Carlo Method

Nonoverlap proportion and the representation of point-biserial variation.

Luck, Stanley.

PLoS One ; 15(12): e0244517, 2020.

Article in English | MEDLINE | ID: mdl-33370394

ABSTRACT

We consider the problem of constructing a complete set of parameters that account for all of the degrees of freedom for point-biserial variation. We devise an algorithm where sort, as an intrinsic property of both numbers and labels, is used to generate the parameters. Algebraically, point-biserial variation is represented by a Cartesian product of statistical parameters for two sets of [Formula: see text] data, and the difference between mean values (δ) corresponds to the representation of variation in the center of mass coordinates, (δ, µ). The existence of alternative effect size measures is explained by the fact that mathematical considerations alone do not specify a preferred coordinate system for the representation of point-biserial variation. We develop a novel algorithm for estimating the nonoverlap proportion (ρpb) of two sets of [Formula: see text] data. ρpb is obtained by sorting the labeled [Formula: see text] data and analyzing the induced order in the categorical data using a diagonally symmetric 2 × 2 contingency table. We examine the correspondence between ρpb and point-biserial correlation (rpb) for uniform and normal distributions. We identify the [Formula: see text], [Formula: see text], and [Formula: see text] representations for Pearson product-moment correlation, Cohen's d, and rpb. We compare the performance of rpb versus ρpb and the sample size proportion corrected correlation (rpbd), confirm that invariance with respect to the sample size proportion is important in the formulation of the effect size, and give an example where three parameters (rpbd, µ, ρpb) are needed to distinguish different forms of point-biserial variation in CART regression tree analysis. We discuss the importance of providing an assessment of cost-benefit trade-offs between relevant system parameters because 'substantive significance' is specified by mapping functional or engineering requirements into the effect size coordinates. Distributions and confidence intervals for the statistical parameters are obtained using Monte Carlo methods.
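For readers unfamiliar with two of the quantities named in this abstract, the sketch below computes the textbook point-biserial correlation (rpb) and Cohen's d for labeled data. It does not implement the paper's nonoverlap proportion (ρpb) or the corrected correlation (rpbd), whose definitions are not given in the abstract. The function names and the simulated data are assumptions made for this example.

```python
import numpy as np

def point_biserial(x, labels):
    """Textbook point-biserial correlation between values x and a binary label."""
    x = np.asarray(x, dtype=float)
    labels = np.asarray(labels).astype(bool)
    x1, x0 = x[labels], x[~labels]
    p = x1.size / x.size                      # sample size proportion of group 1
    # Difference of group means, scaled by the population standard deviation.
    return (x1.mean() - x0.mean()) / x.std() * np.sqrt(p * (1.0 - p))

def cohens_d(x, labels):
    """Cohen's d using the pooled (unbiased) within-group variance."""
    x = np.asarray(x, dtype=float)
    labels = np.asarray(labels).astype(bool)
    x1, x0 = x[labels], x[~labels]
    n1, n0 = x1.size, x0.size
    pooled_var = ((n1 - 1) * x1.var(ddof=1) + (n0 - 1) * x0.var(ddof=1)) / (n1 + n0 - 2)
    return (x1.mean() - x0.mean()) / np.sqrt(pooled_var)

# Simulated example: two normal groups with unequal sizes (assumed data).
rng = np.random.default_rng(1)
values = np.concatenate([rng.normal(0.0, 1.0, 300), rng.normal(1.0, 1.0, 100)])
groups = np.concatenate([np.zeros(300, dtype=int), np.ones(100, dtype=int)])

r_pb = point_biserial(values, groups)
d = cohens_d(values, groups)
print(r_pb, d)
```

The factor sqrt(p(1 - p)) in rpb shows directly why rpb depends on the sample size proportion while Cohen's d does not, which is the dependence the abstract refers to.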

Subject(s)

Algorithms; Data Analysis; Regression Analysis; Monte Carlo Method; Sample Size

Factoring a 2 x 2 contingency table.

Luck, Stanley.

PLoS One ; 14(10): e0224460, 2019.

Article in English | MEDLINE | ID: mdl-31652283

ABSTRACT

We show that a two-component proportional representation provides the necessary framework to account for the properties of a 2 × 2 contingency table. This corresponds to the factorization of the table as a product of proportion and diagonal row or column sum matrices. The row and column sum invariant measures for proportional variation are obtained. Geometrically, these correspond to displacements of two point vectors in the standard one-simplex, which are reduced to a center-of-mass coordinate representation, [Formula: see text]. Then, effect size measures, such as the odds ratio and relative risk, correspond to different perspective functions for the mapping of (δ, µ) to [Formula: see text]. Furthermore, variations in δ and µ will be associated with different cost-benefit trade-offs for a given application. Therefore, pure mathematics alone does not provide the specification of a general form for the perspective function. This implies that the question of the merits of the odds ratio versus relative risk cannot be resolved in a general way. Expressions are obtained for the marginal sum dependence and the relations between various effect size measures, including the simple matching coefficient, odds ratio, relative risk, Yule's Q, φ, and Goodman and Kruskal's τc|r. We also show that Gini information gain (IGG) is equivalent to φ2 in the classification and regression tree (CART) algorithm. Then, IGG can yield misleading results due to the dependence on marginal sums. Monte Carlo methods facilitate the detailed specification of stochastic effects in the data acquisition process and provide a practical way to estimate the confidence interval for an effect size.
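The effect size measures listed in this abstract have standard textbook definitions for a 2 × 2 table with cells [[a, b], [c, d]]. The sketch below computes several of them side by side; it does not reproduce the paper's factorization or its perspective functions, and the function name `effect_sizes` and the example counts are assumptions made for illustration.

```python
import math

def effect_sizes(a, b, c, d):
    """Standard effect size measures for the 2 x 2 table [[a, b], [c, d]].

    Rows are the two groups; the first column is the outcome of interest.
    """
    n = a + b + c + d
    p1 = a / (a + b)                      # outcome proportion in row 1
    p2 = c / (c + d)                      # outcome proportion in row 2
    return {
        "simple_matching": (a + d) / n,
        "odds_ratio": (a * d) / (b * c),
        "relative_risk": p1 / p2,
        "yule_q": (a * d - b * c) / (a * d + b * c),
        "phi": (a * d - b * c)
               / math.sqrt((a + b) * (c + d) * (a + c) * (b + d)),
    }

# Hypothetical counts chosen only to exercise the formulas.
measures = effect_sizes(30, 10, 15, 45)
for name, value in measures.items():
    print(f"{name}: {value:.4f}")
```

Computing the measures together makes it easy to see that they rank the same table differently once the row or column sums change, which is the marginal sum dependence the abstract discusses.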

Subject(s)

Statistics as Topic/methods; Confidence Intervals; Monte Carlo Method; Regression Analysis; Sample Size

Genotypic and Environmental Impact on Natural Variation of Nutrient Composition in 50 Non Genetically Modified Commercial Maize Hybrids in North America.

Cong, Bin; Maxwell, Carl; Luck, Stanley; Vespestad, Deanne; Richard, Keith; Mickelson, James; Zhong, Cathy.

J Agric Food Chem ; 63(22): 5321-34, 2015 Jun 10.

Article in English | MEDLINE | ID: mdl-25971869

ABSTRACT

This study was designed to assess natural variation in composition and metabolites in 50 genetically diverse non genetically modified maize hybrids grown at six locations in North America. Results showed that levels of compositional components in maize forage were affected by environment more than genotype. Crude protein, all amino acids except lysine, manganese, and β-carotene in maize grain were affected by environment more than genotype; however, most proximates and fibers, all fatty acids, lysine, most minerals, vitamins, and secondary metabolites in maize grain were affected by genotype more than environment. A strong interaction between genotype and environment was seen for some analytes. The results could be used as reference values for future nutrient composition studies of genetically modified crops and to expand conventional compositional data sets. These results may be further used as a genetic basis for improvement of the nutritional value of maize grain by molecular breeding and biotechnology approaches.

Subject(s)

Zea mays/chemistry; Zea mays/genetics; Amino Acids/analysis; Amino Acids/metabolism; Ecosystem; Environment; Fatty Acids/analysis; Fatty Acids/metabolism; Gene-Environment Interaction; Genotype; Minerals/analysis; Minerals/metabolism; North America; Nutritive Value; Vitamins/analysis; Vitamins/metabolism; Zea mays/classification; beta Carotene/analysis; beta Carotene/metabolism

GIANT EMBRYO encodes CYP78A13, required for proper size balance between embryo and endosperm in rice.

Nagasawa, Nobuhiro; Hibara, Ken-ichiro; Heppard, Elmer P; Vander Velden, Kent A; Luck, Stanley; Beatty, Mary; Nagato, Yasuo; Sakai, Hajime.

Plant J ; 75(4): 592-605, 2013 Aug.

Article in English | MEDLINE | ID: mdl-23621326

ABSTRACT

Among angiosperms there is a high degree of variation in embryo/endosperm size in mature seeds. However, little is known about the molecular mechanism underlying size control between these neighboring tissues. Here we report the rice GIANT EMBRYO (GE) gene that is essential for controlling the size balance. The function of GE in each tissue is distinct, controlling cell size in the embryo and cell death in the endosperm. GE, which encodes CYP78A13, is predominantly expressed in the interfacing tissues of both the embryo and endosperm. GE expression is under negative feedback regulation; endogenous GE expression is upregulated in ge mutants. In contrast to the loss-of-function mutant with large embryo and small endosperm, GE overexpression causes a small embryo and enlarged endosperm. A complementation analysis coupled with heterofertilization showed that complementation of ge mutation in either embryo or endosperm failed to restore the wild-type embryo/endosperm ratio. Thus, embryo and endosperm interact in determining embryo/endosperm size balance. Among genes associated with embryo/endosperm size, REDUCED EMBRYO genes, whose loss-of-function causes a phenotype opposite to ge, are revealed to regulate endosperm size upstream of GE. To fully understand the embryo-endosperm size control, the genetic network of the related genes should be elucidated.

Subject(s)

Endosperm/genetics; Gene Expression Regulation, Developmental; Oryza/genetics; Plant Proteins/genetics; Alleles; Amino Acid Sequence; Chromosome Mapping; Cytochrome P-450 Enzyme System/genetics; Cytochrome P-450 Enzyme System/metabolism; Endosperm/cytology; Endosperm/growth & development; Endosperm/metabolism; Gene Expression Regulation, Plant; Genetic Complementation Test; Genotype; Molecular Sequence Data; Mutation; Organ Specificity; Oryza/cytology; Oryza/growth & development; Oryza/metabolism; Phenotype; Phylogeny; Plant Proteins/metabolism; Plants, Genetically Modified; Seeds/cytology; Seeds/genetics; Seeds/growth & development; Seeds/metabolism; Sequence Alignment; Up-Regulation

Genome-wide expression quantitative trait loci (eQTL) analysis in maize.

Holloway, Beth; Luck, Stanley; Beatty, Mary; Rafalski, J-Antoni; Li, Bailin.

BMC Genomics ; 12: 336, 2011 Jun 30.

Article in English | MEDLINE | ID: mdl-21718468

ABSTRACT

BACKGROUND: Expression QTL analyses have shed light on transcriptional regulation in numerous species of plants, animals, and yeasts. These microarray-based analyses identify regulators of gene expression as either cis-acting factors that regulate proximal genes, or trans-acting factors that function through a variety of mechanisms to affect transcript abundance of unlinked genes. RESULTS: A hydroponics-based genetical genomics study in roots of a Zea mays IBM2 Syn10 double haploid population identified tens of thousands of cis-acting and trans-acting eQTL. Cases of false-positive eQTL, which result from the lack of complete genomic sequences from both parental genomes, were described. A candidate gene for a trans-acting regulatory factor was identified through positional cloning. The unexpected regulatory function of a class I glutamine amidotransferase controls the expression of an ABA 8'-hydroxylase pseudogene. CONCLUSIONS: Identification of a candidate gene underlying a trans-eQTL demonstrated the feasibility of eQTL cloning in maize and could help to understand the mechanism of gene expression regulation. Lack of complete genome sequences from both parents could cause the identification of false-positive cis- and trans-acting eQTL.

Subject(s)

Genome; Quantitative Trait Loci; Zea mays/genetics; Cytochrome P-450 Enzyme System/genetics; Cytochrome P-450 Enzyme System/metabolism; Gene Expression Regulation, Plant; Haploidy; Hydroponics; Plant Proteins; Plant Roots/genetics

Folding mechanism of reduced Cytochrome c: equilibrium and kinetic properties in the presence of carbon monoxide.

Latypov, Ramil F; Maki, Kosuke; Cheng, Hong; Luck, Stanley D; Roder, Heinrich.

J Mol Biol ; 383(2): 437-53, 2008 Nov 07.

Article in English | MEDLINE | ID: mdl-18761351

ABSTRACT

Despite close structural similarity, the ferric and ferrous forms of cytochrome c differ greatly in terms of their ligand binding properties, stability, folding, and dynamics. The reduced heme iron binds diatomic ligands such as CO only under destabilizing conditions that promote weakening or disruption of native methionine-iron linkage. This makes CO a useful conformational probe for detecting partially structured states that cannot be observed in the absence of endogenous ligands. Heme absorbance, circular dichroism, and NMR were used to characterize the denaturant-induced unfolding equilibrium of ferrocytochrome c in the presence and in the absence of CO. In addition to the native state (N), which does not bind CO, and the unfolded CO complex (U-CO), a structurally distinct CO-bound form (M-CO) accumulates to high levels (approximately 75% of the population) at intermediate guanidine HCl concentrations. Comparison of the unfolding transitions for different conformational probes reveals that M-CO is a compact state containing a native-like helical core and regions of local disorder in the segment containing the native Met80 ligand and adjacent loops. Kinetic measurements of CO binding and dissociation under native, partially denaturing, and fully unfolded conditions indicate that a state M that is structurally analogous to M-CO is populated even in the absence of CO. The binding energy of the CO ligand lowers the free energy of this high-energy state to such an extent that it accumulates even under mildly denaturing equilibrium conditions. The thermodynamic and kinetic parameters obtained in this study provide a fully self-consistent description of the linked unfolding/CO binding equilibria of reduced cytochrome c.

Subject(s)

Carbon Monoxide/chemistry; Cytochromes c/chemistry; Animals; Binding Sites; Carbon Monoxide/metabolism; Circular Dichroism; Cytochromes c/metabolism; Horses; Kinetics; Ligands; Nuclear Magnetic Resonance, Biomolecular; Protein Folding; Thermodynamics

Whole genome scan detects an allelic variant of fad2 associated with increased oleic acid levels in maize.

Beló, André; Zheng, Peizhong; Luck, Stanley; Shen, Bo; Meyer, David J; Li, Bailin; Tingey, Scott; Rafalski, Antoni.

Mol Genet Genomics ; 279(1): 1-10, 2008 Jan.

Article in English | MEDLINE | ID: mdl-17934760

ABSTRACT

We used whole genome scan association mapping to identify loci with major effect on oleic acid content in maize kernels. Single nucleotide polymorphism haplotypes at 8,590 loci were tested for association with oleic acid content in 553 maize inbreds. A single locus with major effect on oleic acid was mapped between 380 and 384 cM in the IBM2 neighbors genetic map on chromosome 4 and confirmed in a biparental population. A fatty acid desaturase, fad2, identified approximately 2 kb from the associated genetic marker, is the most likely candidate gene responsible for the differences in the phenotype. The fad2 alleles with high- and low-oleic acid content were sequenced and allelic differences in fad2 RNA level in developing embryos were investigated. We propose that a non-conservative amino acid polymorphism near the active site of fad2 contributes to the effect on oleic acid content. This is the first report of the use of a high resolution whole genome scan association mapping where a putative gene responsible for a quantitative trait was identified in plants.

Subject(s)

Fatty Acid Desaturases/genetics; Fatty Acid Desaturases/metabolism; Oleic Acid/metabolism; Zea mays/genetics; Zea mays/metabolism; Alleles; Chromosome Mapping; DNA, Plant/genetics; Gene Expression; Genetic Variation; Genome, Plant; Molecular Sequence Data; RNA, Messenger/genetics; RNA, Messenger/metabolism; RNA, Plant/genetics; RNA, Plant/metabolism
