Results 1 - 20 of 42
1.
Am J Epidemiol ; 2024 Jul 10.
Article in English | MEDLINE | ID: mdl-38988237

ABSTRACT

The incubation period is of paramount importance in infectious disease epidemiology as it informs about the transmission potential of a pathogenic organism and helps to plan public health strategies to keep an epidemic outbreak under control. Estimation of the incubation period distribution from reported exposure times and symptom onset times is challenging as the underlying data is coarse. We develop a new Bayesian methodology using Laplacian-P-splines that provides a semi-parametric estimation of the incubation density based on a Langevinized Gibbs sampler. A finite mixture density smoother informs a set of parametric distributions via moment matching and an information criterion arbitrates between competing candidates. Algorithms underlying our method find a natural nest within the EpiLPS package, which has been extended to cover estimation of incubation times. Various simulation scenarios accounting for different levels of data coarseness are considered with encouraging results. Applications to real data on COVID-19, MERS and Mpox reveal results that are in alignment with what has been obtained in recent studies. The proposed flexible approach is an interesting alternative to classic Bayesian parametric methods for estimation of the incubation distribution.

2.
Stat Med ; 42(27): 4952-4971, 2023 11 30.
Article in English | MEDLINE | ID: mdl-37668286

ABSTRACT

In this work, we propose an extension of a semiparametric nonlinear mixed-effects model for longitudinal data that incorporates more flexibility with penalized splines (P-splines) as smooth terms. The novelty of the proposed approach consists of the formulation of the model within the stochastic approximation version of the EM algorithm for maximum likelihood, the so-called SAEM algorithm. The proposed approach takes advantage of the formulation of a P-spline as a mixed-effects model and the use of the computational advantages of the existing software for the SAEM algorithm for the estimation of the random effects and the variance components. Additionally, we developed a supervised classification method for these non-linear mixed models using an adaptive importance sampling scheme. To illustrate our proposal, we consider two studies on pregnant women where two biomarkers are used as indicators of changes during pregnancy. In both studies, information about the women's pregnancy outcomes is known. Our proposal provides a unified framework for the classification of longitudinal profiles that may have important implications for the early detection and monitoring of pregnancy-related changes and contribute to improved maternal and fetal health outcomes. We show that the proposed models improve the analysis of this type of data compared to previous studies. These improvements are reflected both in the fit of the models and in the classification of the groups.


Subject(s)
Algorithms , Software , Female , Humans , Pregnancy , Pregnancy Outcome , Models, Statistical , Longitudinal Studies
3.
Biom J ; 65(6): e2200024, 2023 08.
Article in English | MEDLINE | ID: mdl-36639234

ABSTRACT

In epidemic models, the effective reproduction number is of central importance to assess the transmission dynamics of an infectious disease and to orient health intervention strategies. Publicly shared data during an outbreak often suffers from two sources of misreporting (underreporting and delay in reporting) that should not be overlooked when estimating epidemiological parameters. The main statistical challenge in models that intrinsically account for a misreporting process lies in the joint estimation of the time-varying reproduction number and the delay/underreporting parameters. Existing Bayesian approaches typically rely on Markov chain Monte Carlo algorithms that are extremely costly from a computational perspective. We propose a much faster alternative based on Laplacian-P-splines (LPS) that combines Bayesian penalized B-splines for flexible and smooth estimation of the instantaneous reproduction number and Laplace approximations to selected posterior distributions for fast computation. Assuming a known generation interval distribution, the incidence at a given calendar time is governed by the epidemic renewal equation and the delay structure is specified through a composite link framework. Laplace approximations to the conditional posterior of the spline vector are obtained from analytical versions of the gradient and Hessian of the log-likelihood, implying a drastic speed-up in the computation of posterior estimates. Furthermore, the proposed LPS approach can be used to obtain point estimates and approximate credible intervals for the delay and reporting probabilities. Simulations of epidemics with different combinations of the underreporting rate and delay structure (one-day, two-day, and weekend delays) show that the proposed LPS methodology delivers fast and accurate estimates, outperforming existing methods that do not take into account underreporting and delay patterns. Finally, LPS is illustrated in two real case studies of epidemic outbreaks.


Subject(s)
Communicable Diseases , Epidemics , Humans , Bayes Theorem , Lipopolysaccharides , Computer Simulation , Communicable Diseases/epidemiology , Monte Carlo Method
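
Note on entry 3: the abstract states that, given a known generation interval distribution, incidence follows the epidemic renewal equation. The sketch below shows only that one relationship, under illustrative assumptions: the function name, the argument names and the example numbers are hypothetical and are not taken from the article, and the delay and underreporting structure is left out entirely.

```python
def expected_incidence(past_cases, reproduction_number, generation_interval):
    """Mean incidence at the next time step under the renewal equation.

    past_cases: observed counts, oldest first.
    reproduction_number: R_t at the next time step (assumed known here).
    generation_interval: probabilities for lags 1..k (assumed to sum to 1).
    """
    total_infectiousness = 0.0
    for lag, phi in enumerate(generation_interval, start=1):
        if lag <= len(past_cases):
            # Cases observed `lag` steps ago contribute with weight phi.
            total_infectiousness += phi * past_cases[-lag]
    return reproduction_number * total_infectiousness


# Hypothetical example values, for illustration only.
cases = [10, 12, 15]
phi = [0.5, 0.3, 0.2]
mu = expected_incidence(cases, 1.2, phi)
```

In a full model this mean would enter a count likelihood (for example Poisson or negative binomial), with R_t expressed through a penalized B-spline basis; that part is not sketched here.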
4.
Stat Comput ; 33(1): 1, 2023.
Article in English | MEDLINE | ID: mdl-36415568

ABSTRACT

The selection of the smoothing parameter is central to the estimation of penalized splines. The best value of the smoothing parameter is often the one that optimizes a smoothness selection criterion, such as generalized cross-validation error (GCV) and restricted likelihood (REML). To correctly identify the global optimum rather than being trapped in an undesired local optimum, grid search is recommended for optimization. Unfortunately, the grid search method requires a pre-specified search interval that contains the unknown global optimum, yet no guideline is available for providing this interval. As a result, practitioners have to find it by trial and error. To overcome this difficulty, we develop novel algorithms to automatically find this interval. Our automatic search interval has four advantages. (i) It specifies a smoothing parameter range where the associated penalized least squares problem is numerically solvable. (ii) It is criterion-independent so that different criteria, such as GCV and REML, can be explored on the same parameter range. (iii) It is sufficiently wide to contain the global optimum of any criterion, so that, for example, the global minimum of GCV and the global maximum of REML can both be identified. (iv) It is computationally cheap compared with the grid search itself, carrying no extra computational burden in practice. Our method is ready to use through our recently developed R package gps (version ≥ 1.1). It may be embedded in more advanced statistical modeling methods that rely on penalized splines. Supplementary Information: The online version contains supplementary material available at 10.1007/s11222-022-10178-z.
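
Note on entry 4: for orientation, the penalized least squares problem and the GCV criterion mentioned in the abstract can be written in the usual textbook form for a standard P-spline. The notation is an assumption made here, not the article's: $B$ is the B-spline design matrix, $D_d$ a $d$-th order difference matrix, $\lambda$ the smoothing parameter and $n$ the number of observations.

```latex
\hat{\theta}_\lambda
  = \arg\min_{\theta}\; \|y - B\theta\|^2 + \lambda\,\|D_d\theta\|^2
  = \left(B^\top B + \lambda D_d^\top D_d\right)^{-1} B^\top y,
\qquad
H_\lambda = B\left(B^\top B + \lambda D_d^\top D_d\right)^{-1} B^\top,
\qquad
\mathrm{GCV}(\lambda) = \frac{n\,\|y - H_\lambda y\|^2}{\left\{n - \operatorname{tr}(H_\lambda)\right\}^2}.
```

A grid search evaluates $\mathrm{GCV}(\lambda)$ (or the restricted likelihood) on a set of $\lambda$ values, usually equally spaced on the log scale, which is why the end points of the search interval matter.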

5.
Stat Methods Med Res ; 31(12): 2352-2367, 2022 12.
Article in English | MEDLINE | ID: mdl-36113153

ABSTRACT

The distribution of time-to-event outcomes is usually right-skewed. While for symmetric and moderately skewed data the mean and median are appropriate location measures, the mode is preferable for heavily skewed data as it better represents the center of the distribution. Mode regression has been introduced for uncensored data to model the relationship between covariates and the mode of the outcome. Starting from nonparametric kernel density based mode regression, we examine the use of inverse probability of censoring weights to extend mode regression to handle right-censored data. We add a semiparametric predictor to add further flexibility to the model and we construct a pseudo Akaike's information criterion to select the bandwidth and smoothing parameters. We use simulations to evaluate the performance of our proposed approach. We demonstrate the benefit of adding mode regression to one's toolbox for analyzing survival data on a pancreatic cancer data set from a prospectively maintained cancer registry.


Subject(s)
Models, Statistical , Computer Simulation , Probability
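
Note on entry 5: inverse probability of censoring weights are commonly computed from a Kaplan-Meier estimate of the censoring distribution. The sketch below is a minimal version of that common construction, not the article's implementation: the function name and the toy data are hypothetical, and ties between event and censoring times are handled naively.

```python
def ipcw_weights(times, events):
    """Inverse probability of censoring weights, delta_i / G(T_i-).

    times: observed follow-up times.
    events: 1 if the event was observed, 0 if the observation was censored.
    G is the Kaplan-Meier estimate of the censoring survival function,
    obtained by treating censorings as the 'events' of interest.
    """
    g_before = {}  # G evaluated just before each distinct observed time
    g = 1.0
    for t in sorted(set(times)):
        g_before[t] = g
        at_risk = sum(1 for s in times if s >= t)
        censored = sum(1 for s, d in zip(times, events) if s == t and d == 0)
        g *= 1.0 - censored / at_risk
    # Censored observations get weight 0; observed events are up-weighted.
    return [d / g_before[t] if d == 1 else 0.0 for t, d in zip(times, events)]


# Hypothetical toy data: the second subject is censored at time 3.
weights = ipcw_weights([2, 3, 5, 7], [1, 0, 1, 1])
```

Events occurring after a censoring time receive weights above 1, so that they also stand in for comparable subjects who were lost to follow-up.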
6.
Sensors (Basel) ; 22(16)2022 Aug 17.
Article in English | MEDLINE | ID: mdl-36015930

ABSTRACT

The rapid growth of digital information has produced massive amounts of time series data with rich features. Most time series data are noisy and contain some outlier samples, which leads to a decline in clustering quality. To efficiently discover the hidden statistical information about the data, a fast weighted fuzzy C-medoids clustering algorithm based on P-splines (PS-WFCMdd) is proposed for time series datasets in this study. Specifically, the P-spline method is used to fit the functional data related to the original time series data, and the obtained smooth-fitting data is used as the input of the clustering algorithm to enhance the ability to process the data set during the clustering process. Then, we define a new weighted method to further avoid the influence of outlier sample points in the weighted fuzzy C-medoids clustering process, to improve the robustness of our algorithm. We propose using the third version of Mueen's algorithm for similarity search (MASS 3) to measure the similarity between time series quickly and accurately, to further improve the clustering efficiency. Our new algorithm is compared with several other time series clustering algorithms, and the performance of the algorithm is evaluated experimentally on different types of time series examples. The experimental results show that our new method can speed up data processing and that its performance on each clustering evaluation index is relatively good.


Subject(s)
Algorithms , Fuzzy Logic , Cluster Analysis , Time Factors
7.
J Comput Graph Stat ; 31(2): 553-562, 2022.
Article in English | MEDLINE | ID: mdl-35873662

ABSTRACT

This paper focuses on the problem of modeling and estimating interaction effects between covariates and a continuous treatment variable on an outcome, using a single-index regression. The primary motivation is to estimate an optimal individualized dose rule and individualized treatment effects. To model possibly nonlinear interaction effects between patients' covariates and a continuous treatment variable, we employ a two-dimensional penalized spline regression on an index-treatment domain, where the index is defined as a linear projection of the covariates. The method is illustrated using two applications as well as simulation experiments. A unique contribution of this work is in the parsimonious (single-index) parametrization specifically defined for the interaction effect term.

8.
Stat Med ; 41(14): 2602-2626, 2022 06 30.
Article in English | MEDLINE | ID: mdl-35699121

ABSTRACT

The mixture cure model for analyzing survival data is characterized by the assumption that the population under study is divided into a group of subjects who will experience the event of interest over some finite time horizon and another group of cured subjects who will never experience the event irrespective of the duration of follow-up. When using the Bayesian paradigm for inference in survival models with a cure fraction, it is common practice to rely on Markov chain Monte Carlo (MCMC) methods to sample from posterior distributions. Although computationally feasible, the iterative nature of MCMC often implies long sampling times to explore the target space with chains that may suffer from slow convergence and poor mixing. Furthermore, extra efforts have to be invested in diagnostic checks to monitor the reliability of the generated posterior samples. A sampling-free strategy for fast and flexible Bayesian inference in the mixture cure model is suggested in this article by combining Laplace approximations and penalized B-splines. A logistic regression model is assumed for the cure proportion and a Cox proportional hazards model with a P-spline approximated baseline hazard is used to specify the conditional survival function of susceptible subjects. Laplace approximations to the posterior conditional latent vector are based on analytical formulas for the gradient and Hessian of the log-likelihood, resulting in a substantial speed-up in approximating posterior distributions. The spline specification yields smooth estimates of survival curves, and functions of latent variables together with their associated credible intervals are estimated in seconds. A fully stochastic algorithm based on a Metropolis-Langevin-within-Gibbs sampler is also suggested as an alternative to the proposed Laplacian-P-splines mixture cure (LPSMC) methodology. The statistical performance and computational efficiency of LPSMC are assessed in a simulation study. Results show that LPSMC is an appealing alternative to MCMC for approximate Bayesian inference in standard mixture cure models. Finally, the novel LPSMC approach is illustrated on three applications involving real survival data.


Subject(s)
Algorithms , Bayes Theorem , Humans , Markov Chains , Monte Carlo Method , Proportional Hazards Models , Reproducibility of Results
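
Note on entry 8: the abstract describes Laplace approximations built from an analytical gradient and Hessian. The toy example below illustrates that idea on a deliberately simple one-parameter problem, a Poisson log-rate with a Gaussian prior. It is an assumption-laden sketch, not the LPSMC algorithm: the model, the function name and the data are hypothetical.

```python
import math


def laplace_poisson_lograte(counts, prior_sd=10.0, tol=1e-10, max_iter=100):
    """Gaussian (Laplace) approximation to the posterior of a Poisson log-rate.

    Model (illustrative): y_i ~ Poisson(exp(theta)), theta ~ N(0, prior_sd^2).
    Returns the posterior mode and the approximate posterior standard deviation.
    """
    n, total = len(counts), sum(counts)
    prior_precision = 1.0 / prior_sd ** 2
    theta = math.log((total + 0.5) / n)  # start near the maximum likelihood value
    for _ in range(max_iter):
        # Analytical gradient and Hessian of the log-posterior.
        grad = total - n * math.exp(theta) - theta * prior_precision
        hess = -n * math.exp(theta) - prior_precision
        step = grad / hess
        theta -= step  # Newton update
        if abs(step) < tol:
            break
    hess = -n * math.exp(theta) - prior_precision
    return theta, math.sqrt(-1.0 / hess)


# Hypothetical counts, for illustration only.
mode, sd = laplace_poisson_lograte([3, 5, 4, 6, 2])
```

The approximation replaces sampling with a few Newton steps: the mode plays the role of the point estimate and the inverse negative Hessian at the mode gives the approximate posterior variance.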
9.
BMC Pediatr ; 21(1): 529, 2021 11 30.
Article in English | MEDLINE | ID: mdl-34847925

ABSTRACT

BACKGROUND: The geographical differences that cause anaemia can be partially explained by the variability in environmental factors, particularly nutrition and infections. Existing studies have failed to explain the non-linear effect of the continuous covariates on childhood anaemia. The present paper aims to investigate the risk factors of childhood anaemia in India with a focus on the geographical spatial effect. METHODS: Geo-additive logistic regression models were fitted to the data to understand fixed as well as spatial effects of childhood anaemia. Logistic regression was fitted for the categorical variable with outcomes (anaemia (Hb < 11) and no anaemia (Hb ≥ 11)). Continuous covariates were modelled by the penalized spline and spatial effects were smoothed by the two-dimensional spline. RESULTS: At the 95% posterior credible interval, the influence of unobserved factors on childhood anaemia is very strong in the Northern and Central parts of India. However, most of the states in the North Eastern part of India showed negative spatial effects. A U-shaped non-linear relationship was observed between childhood anaemia and mother's age. This indicates that mothers of young and old ages are more likely to have anaemic children; in particular mothers aged 15 years to about 25 years. The risk of childhood anaemia then starts declining after the age of 25 years and continues to decline until the age of around 37 years, after which it starts increasing again. Further, the non-linear effects of duration of breastfeeding on childhood anaemia show that the risk of childhood anaemia decreases until 29 months and increases thereafter. CONCLUSION: Strong evidence of a residual spatial effect on childhood anaemia in India is observed. Government child health programmes should gear up to treat childhood anaemia by focusing on known measurable factors such as mother's education, mother's anaemia status, family wealth status, child health (fever), stunting, underweight, and wasting, which have been found to be significant in this study. Attention should also be given to the effects of unknown or unmeasured factors on childhood anaemia at the community level. Special attention to unmeasurable factors should be focused on the states of central and northern India, which have shown significant positive spatial effects.


Subject(s)
Anemia , Adolescent , Adult , Anemia/epidemiology , Anemia/etiology , Bayes Theorem , Female , Growth Disorders , Humans , India/epidemiology , Infant , Prevalence , Risk Factors , Thinness
10.
Stat Med ; 40(25): 5501-5520, 2021 11 10.
Article in English | MEDLINE | ID: mdl-34272749

ABSTRACT

Expectile regression can be used to analyze the entire conditional distribution of a response, omitting all distributional assumptions. Among its benefits are computational simplicity, efficiency, and the possibility to incorporate a semiparametric predictor. Due to its advantages in full data settings, we propose an extension to right-censored data situations, where conventional methods typically focus only on mean effects. We propose to extend expectile regression with inverse probability weights. Estimates are easy to implement and computationally simple. Expectiles can be converted to more easily interpreted tail expectations, that is, the expected residual life. It provides a meaningful effect measure, similar to the hazard rate. The results from an extensive simulation study are presented, evaluating consistency and sensitivity to violations of assumptions. We use the proposed method to analyze survival times of colorectal cancer patients from a regional certified high volume cancer center.


Subject(s)
Models, Statistical , Computer Simulation , Humans , Probability
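
Note on entry 10: an expectile is the minimizer of an asymmetrically weighted squared loss, and observation weights can be folded into that loss. The sketch below computes a weighted sample expectile by iteratively reweighted averaging. It is a minimal illustration under assumptions, not the article's estimator: the function name is hypothetical, there are no covariates, and the optional weights are simply assumed to be supplied by the user (for example inverse probability of censoring weights).

```python
def weighted_expectile(values, tau, weights=None, tol=1e-12, max_iter=200):
    """Sample expectile at level tau (0 < tau < 1) with observation weights.

    Minimizes sum_i w_i * |tau - 1(y_i <= m)| * (y_i - m)^2 over m by
    repeatedly recomputing an asymmetrically weighted mean.
    """
    if weights is None:
        weights = [1.0] * len(values)
    # Start from the weighted mean, which is the tau = 0.5 expectile.
    m = sum(w * y for w, y in zip(weights, values)) / sum(weights)
    for _ in range(max_iter):
        # Observations above the current value get weight tau, the rest 1 - tau.
        a = [w * (tau if y > m else 1.0 - tau) for w, y in zip(weights, values)]
        new_m = sum(ai * y for ai, y in zip(a, values)) / sum(a)
        if abs(new_m - m) < tol:
            return new_m
        m = new_m
    return m


# Hypothetical toy data, for illustration only.
e50 = weighted_expectile([1, 2, 3, 4], 0.5)
e90 = weighted_expectile([1, 2, 3, 4], 0.9)
```

Setting a weight to zero removes that observation from the loss, which is how censored observations would drop out when censoring weights are used.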
11.
BMC Med Res Methodol ; 20(1): 299, 2020 12 09.
Article in English | MEDLINE | ID: mdl-33297980

ABSTRACT

BACKGROUND: Precise predictions of incidence and mortality rates due to breast cancer (BC) are required for planning of public health programs as well as for clinical services. A number of approaches have been established for prediction of mortality using stochastic models. The performance of these models depends strongly on the different patterns shown by mortality data in different countries. METHODS: The BC mortality data is retrieved from the Global Burden of Disease (GBD) study 2017 database. This study includes BC mortality rates from 1990 to 2017, for women aged 20 to 80+ years, for different Asian countries. Our study extends the current literature on Asian BC mortality data, in both the number of stochastic mortality models considered and their rigorous evaluation using the multivariate Diebold-Mariano test and a range of graphical analyses for multiple countries. RESULTS: Study findings reveal that stochastic smoothed mortality models based on functional data analysis generally outperform the other Lee-Carter models on the quadratic structure of BC mortality rates, both in terms of goodness of fit and forecast accuracy. Besides, the smoothed Lee-Carter (SLC) model outperforms the functional demographic model (FDM) in the case of a symmetric structure of BC mortality rates, and provides almost comparable results to FDM in within- and outside-data forecast accuracy for a heterogeneous set of BC mortality rates. CONCLUSION: Considering the SLC model in comparison to the others can be helpful for forecasting BC mortality and life expectancy at birth, since it provides even better results in some cases. In the current situation, we can assume that there is no single model that can truly outperform all the others on every population. Therefore, we also suggest generating BC mortality forecasts using multiple models rather than relying upon any single model.


Subject(s)
Breast Neoplasms , Adult , Aged , Aged, 80 and over , Databases, Factual , Female , Forecasting , Humans , Incidence , Infant, Newborn , Life Expectancy , Middle Aged , Mortality , Young Adult
12.
BMC Med Res Methodol ; 20(1): 261, 2020 10 20.
Article in English | MEDLINE | ID: mdl-33081698

ABSTRACT

BACKGROUND: Network meta-analysis (NMA) provides a powerful tool for the simultaneous evaluation of multiple treatments by combining evidence from different studies, allowing for direct and indirect comparisons between treatments. In recent years, NMA is becoming increasingly popular in the medical literature and underlying statistical methodologies are evolving both in the frequentist and Bayesian framework. Traditional NMA models are often based on the comparison of two treatment arms per study. These individual studies may measure outcomes at multiple time points that are not necessarily homogeneous across studies. METHODS: In this article we present a Bayesian model based on B-splines for the simultaneous analysis of outcomes across time points, that allows for indirect comparison of treatments across different longitudinal studies. RESULTS: We illustrate the proposed approach in simulations as well as on real data examples available in the literature and compare it with a model based on P-splines and one based on fractional polynomials, showing that our approach is flexible and overcomes the limitations of the latter. CONCLUSIONS: The proposed approach is computationally efficient and able to accommodate a large class of temporal treatment effect patterns, allowing for direct and indirect comparisons of widely varying shapes of longitudinal profiles.


Subject(s)
Algorithms , Bayes Theorem , Humans , Longitudinal Studies , Network Meta-Analysis
13.
Biom J ; 62(7): 1670-1686, 2020 11.
Article in English | MEDLINE | ID: mdl-32520420

ABSTRACT

This paper focuses on the problems of estimation and variable selection in the functional linear regression model (FLM) with functional response and scalar covariates. To this end, two different types of regularization (L1 and L2) are considered in this paper. On the one hand, a sample approach for functional LASSO in terms of basis representation of the sample values of the response variable is proposed. On the other hand, we propose a penalized version of the FLM by introducing a P-spline penalty in the least squares fitting criterion. But our aim is to propose P-splines as a powerful tool simultaneously for variable selection and functional parameters estimation. In that sense, the importance of smoothing the response variable before fitting the model is also studied. In summary, penalized (L1 and L2) and nonpenalized regression are combined with a presmoothing of the response variable sample curves, based on regression splines or P-splines, providing a total of six approaches to be compared in two simulation schemes. Finally, the most competitive approach is applied to a real data set based on the graft-versus-host disease, which is one of the most frequent complications (30%-50%) in allogeneic hematopoietic stem-cell transplantation.


Subject(s)
Computer Simulation , Graft vs Host Disease , Linear Models , Graft vs Host Disease/diagnosis , Hematopoietic Stem Cell Transplantation/adverse effects , Humans , Least-Squares Analysis
14.
Acta sci., Health sci ; 42: e51437, 2020.
Article in English | LILACS | ID: biblio-1372266

ABSTRACT

Concerning the specificities of a longitudinal study, the trajectories of a subject's mean responses do not always present a linear behavior, which calls for tools that take into account the non-linearity of individual trajectories and that describe them by associating possible random effects with each individual. Generalized additive mixed models (GAMMs) have come to solve this problem, since, in this class of models, it is possible to assign specific random effects to individuals, in addition to rewriting the linear term by summing unknown smooth functions, not parametrically specified, then using the P-splines smoothing technique. Thus, this article aims to introduce this methodology applied to a dataset referring to an experiment involving 57 Swiss mice infected by Trypanosoma cruzi, which had their weights monitored for 12 weeks. The analyses showed significant differences in the weight trajectory of the individuals by treatment group; besides, the assumptions required to validate the model were met. Therefore, it is possible to conclude that this methodology is satisfactory in modeling longitudinal data, because, with this approach, in addition to the possibility of including fixed and random effects, these models allow adding complex correlation structures to residuals.


Subject(s)
Animals , Male , Mice , Trypanosoma cruzi/immunology , Trypanosoma cruzi/parasitology , Biotherapics/antagonists & inhibitors , Serum/immunology , Serum/parasitology , Body-Weight Trajectory , Body Weights and Measures , Antibodies, Protozoan/immunology , Chickens , Chagas Disease/drug therapy , Randomized Controlled Trial, Veterinary , Mice , Antigens, Protozoan/immunology
15.
J Multivar Anal ; 171: 382-396, 2019 May.
Article in English | MEDLINE | ID: mdl-31588153

ABSTRACT

By optimizing index functions against different outcomes, we propose a multivariate single-index model (SIM) for development of medical indices that simultaneously work with multiple outcomes. Fitting of a multivariate SIM is not fundamentally different from fitting a univariate SIM, as the former can be written as a sum of multiple univariate SIMs with appropriate indicator functions. What has not been carefully studied are the theoretical properties of the parameter estimators. Because of the lack of asymptotic results, no formal inference procedure has been made available for multivariate SIMs. In this paper, we examine the asymptotic properties of the multivariate SIM parameter estimators. We show that, under mild regularity conditions, estimators for the multivariate SIM parameters are indeed √n-consistent and asymptotically normal. We conduct a simulation study to investigate the finite-sample performance of the corresponding estimation and inference procedures. To illustrate its use in practice, we construct an index measure of urine electrolyte markers for assessing the risk of hypertension in individual subjects.

16.
J Am Stat Assoc ; 114(525): 48-60, 2019.
Article in English | MEDLINE | ID: mdl-31178611

ABSTRACT

Identifying patient-specific prognostic biomarkers is of critical importance in developing personalized treatment for clinically and molecularly heterogeneous diseases such as cancer. In this article, we propose a novel regression framework, Bayesian hierarchical varying-sparsity regression (BEHAVIOR) models, to select clinically relevant disease markers by integrating proteogenomic (proteomic+genomic) and clinical data. Our methods allow flexible modeling of protein-gene relationships and induce sparsity in both protein-gene and protein-survival relationships, to select genomically driven prognostic protein markers at the patient level. Simulation studies demonstrate the superior performance of BEHAVIOR against competing methods in terms of both protein marker selection and survival prediction. We apply BEHAVIOR to The Cancer Genome Atlas (TCGA) proteogenomic pan-cancer data and find several interesting prognostic proteins and pathways that are shared across multiple cancers and some that exclusively pertain to specific cancers.

17.
Clin Epidemiol ; 11: 403-417, 2019.
Article in English | MEDLINE | ID: mdl-31191033

ABSTRACT

Background: Methodological challenges arise with the analysis of patient satisfaction as a measure of health care quality. One of them is the necessity to adjust for differences in patient characteristics or other variables. A combination of several helpful extensions to regression analysis is shown based on patients with inflammatory bowel disease (IBD) to help identify important covariates associated with the distribution of satisfaction. Patients and methods: Analyses were based on cross-sectional data from a postal survey on the health care of patients with IBD aged 15-25, with satisfaction assessed using a 32-item validated questionnaire weighing experience by perceived relevance. The weighted summary score was modeled using a Beta distribution in a generalized additive model for location, scale and shape. Covariates were distinguished in 3 groups and the model was entered in separate, consecutive analyses. First, demographic and disease-related variables were included. Next, information about the IBD specialist was added. The third step added care quality indicators. Results are presented as OR with 95% CI. Results: In the survey, 619 questionnaires were returned and the data set had 453 complete cases for analysis. Satisfaction appeared increased for patients working (OR 1.59, 95% CI: 1.19-2.11) or studying (1.25, 1.00-1.56) as compared to those still at school or in non-academic job training. High anxiety scores and an older age of onset were associated with lower satisfaction. The variation of satisfaction is higher for patients with Crohn's disease or who have statutory insurance (1.19, 1.01-1.40 and 1.22, 1.06-1.40). Conclusion: Modeling the entire distribution of the response uncovered additional influences on the variance of patient satisfaction not previously identified by classical regression. It also resulted in a richer model for the mean. The construction of a combined model for different features of the distribution also helped to improve the control of confounding.

18.
Stat Methods Med Res ; 28(7): 2112-2124, 2019 07.
Article in English | MEDLINE | ID: mdl-29278101

ABSTRACT

Alzheimer's disease is a firmly incurable and progressive disease. The pathology of Alzheimer's disease usually evolves from cognitive normal, to mild cognitive impairment, to Alzheimer's disease. The aim of this paper is to develop a Bayesian hidden Markov model to characterize disease pathology, identify hidden states corresponding to the diagnosed stages of cognitive decline, and examine the dynamic changes of potential risk factors associated with the cognitive normal-mild cognitive impairment-Alzheimer's disease transition. The hidden Markov model framework consists of two major components. The first one is a state-dependent semiparametric regression for delineating the complex associations between clinical outcomes of interest and a set of prognostic biomarkers across neurodegenerative states. The second one is a parametric transition model, while accounting for potential covariate effects on the cross-state transition. The inter-individual and inter-process differences are taken into account via correlated random effects in both components. Based on the Alzheimer's Disease Neuroimaging Initiative data set, we are able to identify four states of Alzheimer's disease pathology, corresponding to common diagnosed cognitive decline stages, including cognitive normal, early mild cognitive impairment, late mild cognitive impairment, and Alzheimer's disease, and examine the effects of hippocampus, age, gender, and APOE-ε4 on degeneration of cognitive function across the four cognitive states.


Subject(s)
Alzheimer Disease/pathology , Bayes Theorem , Markov Chains , Age Factors , Biomarkers , Computer Simulation , Disease Progression , Female , Humans , Male , Prognosis , Risk Factors
19.
Ann Appl Stat ; 13(4): 2539-2563, 2019 Dec 01.
Article in English | MEDLINE | ID: mdl-33479569

ABSTRACT

One of the data structures generated by medical imaging technology is high resolution point clouds representing anatomical surfaces. Stereophotogrammetry and laser scanning are two widely available sources of this kind of data. A standardised surface representation is required to provide a meaningful correspondence across different images as a basis for statistical analysis. Point locations with anatomical definitions, referred to as landmarks, have been the traditional approach. Landmarks can also be taken as the starting point for more general surface representations, often using templates which are warped on to an observed surface by matching landmark positions and subsequent local adjustment of the surface. The aim of the present paper is to provide a new approach which places anatomical curves at the heart of the surface representation and its analysis. Curves provide intermediate structures which capture the principal features of the manifold (surface) of interest through its ridges and valleys. As landmarks are often available these are used as anchoring points, but surface curvature information is the principal guide in estimating the curve locations. The surface patches between these curves are relatively flat and can be represented in a standardised manner by appropriate surface transects to give a complete surface model. This new approach does not require the use of a template, reference sample or any external information to guide the method and, when compared with a surface based approach, the estimation of curves is shown to have improved performance. In addition, examples involving applications to mussel shells and human faces show that the analysis of curve information can deliver more targeted and effective insight than the use of full surface information.

20.
Stat Med ; 38(6): 1002-1012, 2019 03 15.
Article in English | MEDLINE | ID: mdl-30430613

ABSTRACT

In many global health analyses, it is of interest to examine countries' progress using indicators of socio-economic conditions based on national surveys from varying sources. This results in longitudinal data where heteroscedastic summary measures, rather than individual level data, are available. Administration of national surveys can be sporadic, resulting in sparse data measurements for some countries. Furthermore, the trend of the indicators over time is usually nonlinear and varies by country. It is of interest to track the current level of indicators to determine if countries are meeting certain thresholds, such as those indicated in the United Nations Sustainable Development Goals. In addition, estimation of confidence and prediction intervals is vital to determine true changes in prevalence and where data is low in quantity and/or quality. In this article, we use heteroscedastic penalized longitudinal models with survey summary data to estimate yearly prevalence of malnutrition quantities. We develop and compare methods to estimate confidence and prediction intervals using asymptotic and parametric bootstrap techniques. The intervals can incorporate data from multiple sources or other general data-smoothing steps. The methods are applied to African countries in the UNICEF-WHO-The World Bank joint child malnutrition data set. The properties of the intervals are demonstrated through simulation studies and cross-validation of real data.


Subject(s)
Child Nutrition Disorders/epidemiology , Longitudinal Studies , Models, Statistical , Africa/epidemiology , Child , Global Health/statistics & numerical data , Health Surveys , Humans , Prevalence , Sustainable Development , Time Factors