ABSTRACT
In this work, we propose an extension of a semiparametric nonlinear mixed-effects model for longitudinal data that gains flexibility by using penalized splines (P-splines) as smooth terms. The novelty of the proposed approach lies in formulating the model within the stochastic approximation version of the EM algorithm for maximum likelihood, the so-called SAEM algorithm. The approach exploits the formulation of a P-spline as a mixed-effects model and the computational advantages of existing SAEM software for estimating the random effects and the variance components. Additionally, we develop a supervised classification method for these nonlinear mixed models using an adaptive importance sampling scheme. To illustrate our proposal, we consider two studies on pregnant women in which two biomarkers are used as indicators of changes during pregnancy. In both studies, the women's pregnancy outcomes are known. Our proposal provides a unified framework for the classification of longitudinal profiles that may have important implications for the early detection and monitoring of pregnancy-related changes and may contribute to improved maternal and fetal health outcomes. We show that the proposed models improve the analysis of this type of data compared with previous studies, both in the fit of the models and in the classification of the groups.
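The link between a P-spline and a mixed-effects model can be illustrated with the difference penalty alone. The sketch below is not the authors' implementation; the function names `difference_matrix` and `roughness_penalty` are hypothetical, and a second-order penalty is assumed. It shows that linear coefficient sequences receive zero penalty, which is why, in the usual mixed-model representation, that unpenalized part is treated as fixed effects and the remainder as random effects.

```python
def difference_matrix(n_coef, order=2):
    """Build the order-th difference matrix D as a list of rows (hypothetical helper)."""
    # Start from the identity and difference its rows `order` times.
    d = [[1.0 if i == j else 0.0 for j in range(n_coef)] for i in range(n_coef)]
    for _ in range(order):
        d = [[d[i + 1][j] - d[i][j] for j in range(n_coef)]
             for i in range(len(d) - 1)]
    return d


def roughness_penalty(coef, order=2):
    """Return ||D a||^2, the P-spline roughness penalty without the smoothing parameter."""
    d = difference_matrix(len(coef), order)
    return sum(sum(dij * a for dij, a in zip(row, coef)) ** 2 for row in d)


# A linear coefficient sequence lies in the null space of the penalty.
linear = [1.0, 2.0, 3.0, 4.0, 5.0]
# A single spike is penalized.
spike = [0.0, 0.0, 1.0, 0.0, 0.0]
print(roughness_penalty(linear), roughness_penalty(spike))
```

In a full model the penalty would be multiplied by a smoothing parameter, which in the mixed-model form corresponds to a ratio of variance components; that step is omitted here.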
Subject(s)
Algorithms , Software , Female , Humans , Pregnancy , Pregnancy Outcome , Models, Statistical , Longitudinal Studies
ABSTRACT
Spatially referenced geostatistical responses collected in environmental sciences research are often subject to detection limits, beyond which the measurements are not fully quantifiable. This leads to censoring (left, right, interval, etc.), and various ad hoc statistical methods (such as choosing arbitrary detection limits, or data augmentation) are routinely employed during subsequent statistical analysis for inference and prediction. However, inference may be imprecise and sensitive to the assumptions and approximations involved in those arbitrary choices. To circumvent this, we propose an exact maximum likelihood estimation framework for the fixed effects and variance components, and for related prediction, via a novel application of the Stochastic Approximation of the Expectation Maximization (SAEM) algorithm, allowing for easy and elegant estimation of model parameters under censoring. Both simulation studies and an application to a real dataset on arsenic concentration collected by the Michigan Department of Environmental Quality demonstrate the advantages of our method over the available naïve techniques in terms of finite sample properties of the estimates, prediction, and robustness. The proposed methods can be implemented using the R package CensSpatial.
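The SAEM idea under censoring can be sketched on a deliberately simplified problem. The code below assumes independent, left-censored normal observations with a single known detection limit, not the spatially correlated model of the paper, and the function name `saem_left_censored` and its arguments are hypothetical. Each iteration imputes the censored values from a truncated normal, updates the sufficient statistics by stochastic approximation, and then re-estimates the mean and standard deviation.

```python
import random
from statistics import NormalDist


def saem_left_censored(y, censored, n_iter=300, burn_in=100, seed=0):
    """Toy SAEM for i.i.d. left-censored normal data (illustrative sketch only).

    y[i] holds the observed value, or the detection limit when censored[i] is True.
    """
    rng = random.Random(seed)
    std = NormalDist()
    n = len(y)
    # Naive start: treat censored values as if observed at the detection limit.
    mu = sum(y) / n
    sigma = max((sum((v - mu) ** 2 for v in y) / n) ** 0.5, 1e-6)
    s1 = s2 = 0.0
    for k in range(1, n_iter + 1):
        # Simulation step: draw censored values from N(mu, sigma^2) truncated above at the limit.
        t1 = t2 = 0.0
        for v, c in zip(y, censored):
            if c:
                upper = std.cdf((v - mu) / sigma)
                u = min(max(rng.random() * upper, 1e-12), 1 - 1e-12)
                v = mu + sigma * std.inv_cdf(u)
            t1 += v
            t2 += v * v
        # Stochastic approximation step: step size 1 during burn-in, then decreasing.
        gamma = 1.0 if k <= burn_in else 1.0 / (k - burn_in)
        s1 += gamma * (t1 - s1)
        s2 += gamma * (t2 - s2)
        # Maximization step: closed-form normal estimates from the averaged statistics.
        mu = s1 / n
        sigma = max(s2 / n - mu * mu, 1e-12) ** 0.5
    return mu, sigma


# Usage on simulated data: true mean 2, true sd 1, detection limit 1.5.
data_rng = random.Random(1)
limit = 1.5
raw = [data_rng.gauss(2.0, 1.0) for _ in range(2000)]
is_censored = [v < limit for v in raw]
observed = [limit if c else v for v, c in zip(raw, is_censored)]
print(saem_left_censored(observed, is_censored))
```

Substituting the detection limit for the censored values, as the starting values here do, is the kind of ad hoc choice the abstract warns against; the iterations are what move the estimates away from that biased start.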
ABSTRACT
This paper develops a likelihood-based approach for analyzing quantile regression (QR) models for continuous longitudinal data via the asymmetric Laplace distribution (ALD). Compared to the conventional mean regression approach, QR can characterize the entire conditional distribution of the outcome variable and is more robust to the presence of outliers and misspecification of the error distribution. Exploiting the hierarchical representation of the ALD, our classical approach follows a Stochastic Approximation of the EM (SAEM) algorithm in deriving exact maximum likelihood estimates of the fixed effects and variance components. We evaluate the finite sample performance of the algorithm and the asymptotic properties of the ML estimates through empirical experiments and applications to two real-life datasets. Our empirical results clearly indicate that the SAEM estimates outperform the estimates obtained via the combination of Gaussian quadrature and non-smooth optimization routines of the Geraci and Bottai (2014) approach in terms of standard errors and mean square error. The proposed SAEM algorithm is implemented in the R package qrLMM.
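The hierarchical representation mentioned above can be sketched as a normal-exponential mixture. The code assumes the standard parameterization in which an ALD variable with location mu, scale sigma and skewness p is written as mu + theta*U + tau*sqrt(sigma*U)*Z, with U exponential with mean sigma and Z standard normal. The function name `sample_ald` is hypothetical, and this is an illustration rather than the qrLMM code. The check at the end uses the defining property that mu is the p-th quantile.

```python
import math
import random


def sample_ald(mu, sigma, p, rng):
    """Draw one asymmetric Laplace variate via its normal-exponential mixture form."""
    theta = (1 - 2 * p) / (p * (1 - p))
    tau = math.sqrt(2 / (p * (1 - p)))
    u = rng.expovariate(1 / sigma)  # exponential latent variable with mean sigma
    z = rng.gauss(0.0, 1.0)         # standard normal noise
    return mu + theta * u + tau * math.sqrt(sigma * u) * z


# Empirical check: the share of draws below mu should be close to p.
rng = random.Random(42)
mu, sigma, p = 1.0, 0.5, 0.25
draws = [sample_ald(mu, sigma, p, rng) for _ in range(20000)]
share_below = sum(d <= mu for d in draws) / len(draws)
print(round(share_below, 3))
```

Conditional on the latent exponential variable the response is Gaussian, which is what makes an EM-type scheme such as SAEM convenient for this model.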