Results 1 - 13 of 13
1.
PLoS One ; 18(4): e0283798, 2023.
Article in English | MEDLINE | ID: mdl-37011065

ABSTRACT

In regression modelling, measurement error models are often needed to correct for uncertainty arising from measurements of covariates/predictor variables. The literature on measurement error (or errors-in-variables) modelling is plentiful; however, general algorithms and software for maximum likelihood estimation of models with measurement error are not as readily available in a form that applied researchers without relatively advanced statistical expertise can use. In this study, we develop a novel algorithm for measurement error modelling which could, in principle, take any regression model fitted by maximum likelihood, or penalised likelihood, and extend it to account for uncertainty in covariates. This is achieved by exploiting an interesting property of the Monte Carlo Expectation-Maximization (MCEM) algorithm, namely that it can be expressed as an iteratively reweighted maximisation of complete-data likelihoods (formed by imputing the missing values). Thus we can take any regression model for which we have an algorithm for (penalised) likelihood estimation when covariates are error-free, nest it within our proposed iteratively reweighted MCEM algorithm, and thereby account for uncertainty in covariates. The approach is demonstrated on examples involving generalized linear models, point process models, generalized additive models and capture-recapture models. Because the proposed method uses maximum (penalised) likelihood, it inherits advantageous optimality and inferential properties, as illustrated by simulation. We also study the robustness of the model to violations of the distributional assumptions on the predictors. Software is provided as the refitME package in R, whose key function behaves like a refit() function, taking a fitted regression model object and re-fitting it with a pre-specified amount of measurement error.
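To make the iteratively reweighted MCEM idea concrete, here is a minimal Python sketch for a linear model with one covariate measured with known error variance. It illustrates the general mechanism only and is not the refitME implementation: the true covariate is assumed normal, with its mean and variance estimated by moments from the error-prone measurements, and all names and values are illustrative.

```python
# Minimal MCEM sketch: linear regression with one error-prone covariate.
import numpy as np

rng = np.random.default_rng(1)

# --- simulate data: w = x + u is the error-prone version of x ---
n, sigma_u, sigma_e = 1000, 0.5, 1.0
x = rng.normal(0.0, 1.0, n)
w = x + rng.normal(0.0, sigma_u, n)
y = 1.0 + 2.0 * x + rng.normal(0.0, sigma_e, n)

# Moment estimates of the (assumed normal) true-covariate distribution.
mu_x = w.mean()
var_x = max(w.var() - sigma_u**2, 1e-6)

# Conditional distribution of x given w alone (normal-normal shrinkage).
shrink = var_x / (var_x + sigma_u**2)
cond_mean = mu_x + shrink * (w - mu_x)
cond_sd = np.sqrt(shrink * sigma_u**2)

M = 200                                   # Monte Carlo imputations per point
beta = np.polyfit(w, y, 1)[::-1]          # naive start: (intercept, slope)
for _ in range(50):
    # E-step: draw x from f(x | w) and importance-weight by f(y | x; beta),
    # so the weighted draws target f(x | w, y; beta).
    x_imp = cond_mean[:, None] + cond_sd * rng.standard_normal((n, M))
    resid = y[:, None] - beta[0] - beta[1] * x_imp
    lw = -0.5 * (resid / sigma_e) ** 2
    wts = np.exp(lw - lw.max(axis=1, keepdims=True))
    wts /= wts.sum(axis=1, keepdims=True)
    # M-step: one weighted least-squares fit on the expanded data set.
    X = np.column_stack([np.ones(n * M), x_imp.ravel()])
    W = wts.ravel()
    beta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * np.repeat(y, M)))

print("naive slope:", np.polyfit(w, y, 1)[0])  # attenuated (about 1.6 here)
print("MCEM  slope:", beta[1])                 # close to the true value 2
```

The point of the sketch is the structure of the loop: the M-step is an ordinary weighted fit, so any model with a weighted (penalised) likelihood fitting routine can be slotted in.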


Subject(s)
Algorithms , Motivation , Likelihood Functions , Linear Models , Computer Simulation , Monte Carlo Method , Models, Statistical
2.
J Agric Biol Environ Stat ; 27(2): 303-320, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35813491

ABSTRACT

Population size estimation is an important research field in the biological sciences. In practice, covariates are often measured on individuals upon capture when sampling from the population. However, some biological measurements, such as body weight, may vary over time within a subject's capture history. This can be treated as a population size estimation problem in the presence of covariate measurement error. We show that if the unobserved true covariate and the measurement error are both normally distributed, then a naïve estimator that does not take measurement error into account will under-estimate the population size. We then develop new methods to correct for the effect of measurement errors. In particular, we present a conditional score and a nonparametric corrected score approach that are both consistent for population size estimation. Importantly, the proposed approaches do not require distributional assumptions on the true covariates; furthermore, the latter does not require normality assumptions on the measurement errors. This is highly relevant in biological applications, as the distribution of covariates is often non-normal or unknown. We investigate the finite sample performance of the new estimators via extensive simulation studies. The methods are applied to real data from a capture-recapture study.
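A small simulation sketch in Python, with hypothetical parameter values, of the bias the abstract describes: a Huggins-type conditional-likelihood estimator that naively plugs in the error-prone covariate under-estimates the population size relative to the same estimator given the true covariate.

```python
# Naive plug-in of an error-prone covariate in a conditional-likelihood
# population size estimator (Huggins-type); illustrative values only.
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(7)
N, T, sigma_u = 2000, 5, 0.8
x = rng.normal(0.0, 1.0, N)                      # true covariate
p = expit(-1.0 + 1.0 * x)                        # capture prob per occasion
y = rng.binomial(T, p)                           # number of captures
seen = y > 0                                     # only these are observed
w = x[seen] + rng.normal(0.0, sigma_u, seen.sum())  # measured at capture

def negll(theta, cov, y):
    """Conditional log-likelihood given capture counts y out of T occasions."""
    pi = expit(theta[0] + theta[1] * cov)
    ll = (y * np.log(pi) + (T - y) * np.log1p(-pi)
          - np.log1p(-(1 - pi) ** T))            # condition on >=1 capture
    return -ll.sum()

def estimate_N(cov, y):
    fit = minimize(negll, x0=[0.0, 0.0], args=(cov, y))
    pi = expit(fit.x[0] + fit.x[1] * cov)
    return (1.0 / (1.0 - (1.0 - pi) ** T)).sum()  # Horvitz-Thompson sum

print("true N               :", N)
print("N-hat, true covariate:", round(estimate_N(x[seen], y[seen])))
print("N-hat, naive (uses w):", round(estimate_N(w, y[seen])))
```

Under these (assumed normal) covariate and error distributions, the naive estimate comes out below the covariate-based one, in line with the under-estimation result stated above.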

3.
Biometrics ; 78(2): 598-611, 2022 06.
Article in English | MEDLINE | ID: mdl-33527374

ABSTRACT

Spatial or temporal clustering commonly arises in various biological and ecological applications; for example, species or communities may cluster in groups. In this paper, we develop a new clustered occurrence data model in which presence-absence data are modeled under a multivariate negative binomial framework. We account for spatial or temporal clustering by introducing a community parameter into the model that controls the strength of dependence between observations, thereby enhancing the estimation of the mean and dispersion parameters. We provide conditions for the existence of maximum likelihood estimates when cluster sizes are homogeneous and equal to 2 or 3, and we consider a composite likelihood approach that allows for additional robustness and flexibility when fitting clustered occurrence data. The proposed method is evaluated in a simulation study and demonstrated using forest plot data from the Center for Tropical Forest Science. Finally, we present several examples using multiple-visit occupancy data to illustrate the difference between the proposed model and N-mixture models.
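The following Python sketch illustrates the pairwise composite-likelihood idea for clustered presence-absence data. The shared gamma-frailty construction used here is one convenient multivariate negative binomial formulation assumed for illustration; it is not necessarily the paper's exact parameterisation.

```python
# Pairwise composite likelihood for clustered presence-absence data under a
# shared gamma-frailty (multivariate negative binomial) construction.
import numpy as np
from scipy.optimize import minimize
from itertools import combinations

rng = np.random.default_rng(3)

# Simulate clusters of size m: counts share one gamma frailty per cluster,
# and only presence-absence indicators are retained.
mu_true, k_true, m, n_clusters = 1.5, 0.7, 3, 400
lam = rng.gamma(k_true, 1.0 / k_true, n_clusters)       # mean-1 frailty
counts = rng.poisson(mu_true * lam[:, None], (n_clusters, m))
z = (counts > 0).astype(int)                            # presence-absence

def neg_composite_ll(theta):
    mu, k = np.exp(theta)                     # optimise on the log scale
    # Joint absence: P(all of S absent) = (1 + |S| mu / k)^(-k).
    p0 = (1 + mu / k) ** (-k)                 # one site absent (marginal)
    p00 = (1 + 2 * mu / k) ** (-k)            # both sites of a pair absent
    probs = np.log([p00, p0 - p00, 1 - 2 * p0 + p00])  # 0, 1, 2 presences
    nll = 0.0
    for i, j in combinations(range(m), 2):    # all within-cluster pairs
        s = z[:, i] + z[:, j]                 # number of presences in pair
        nll -= probs[s].sum()
    return nll

fit = minimize(neg_composite_ll, x0=np.log([1.0, 1.0]), method="Nelder-Mead")
print("mu-hat, k-hat:", np.exp(fit.x))        # near (1.5, 0.7)
```

Note that both the mean and the dispersion are recovered from presence-absence data alone here, precisely because the within-cluster dependence is modelled.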


Subject(s)
Likelihood Functions , Cluster Analysis , Computer Simulation
4.
Stat Med ; 40(10): 2467-2497, 2021 05 10.
Article in English | MEDLINE | ID: mdl-33629367

ABSTRACT

Multiple imputation and maximum likelihood estimation (via the expectation-maximization algorithm) are two well-known methods readily used for analyzing data with missing values. While these two methods are often considered distinct from one another, multiple imputation (when using improper imputation) is actually equivalent to a stochastic expectation-maximization approximation to the likelihood. In this article, we exploit this key result to show that familiar likelihood-based approaches to model selection, such as Akaike's information criterion (AIC) and the Bayesian information criterion (BIC), can be used to choose the imputation model that best fits the observed data. A poor choice of imputation model is known to bias inference, and while sensitivity analysis has often been used to explore the implications of different imputation models, we show that the data themselves can be used to choose an appropriate imputation model via conventional model selection tools. We show that BIC can be consistent for selecting the correct imputation model in the presence of missing data. We verify these results empirically through simulation studies, and demonstrate their practicality on two classical missing data examples. An interesting result observed in the simulations was that parameter estimates can be biased not only by misspecifying the imputation model but also by overfitting it. This emphasizes the importance of using model selection not just to choose the appropriate type of imputation model, but also to decide on the appropriate level of imputation model complexity.
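A minimal Python sketch of the practical recipe: fit candidate imputation models for a partially missing covariate to the observed data by maximum likelihood and compare their BICs. The candidate models and data-generating process below are illustrative assumptions.

```python
# Choosing an imputation model by BIC: candidate normal regressions for a
# partially missing covariate x, fitted to the observed cases only.
import numpy as np

rng = np.random.default_rng(11)

# Simulate: x depends on z quadratically; ~30% of x goes missing at random.
n = 500
z = rng.normal(0.0, 1.0, n)
x = 0.5 + 1.0 * z + 0.8 * z**2 + rng.normal(0.0, 1.0, n)
obs = rng.random(n) > 0.3

def bic_normal_regression(X, y):
    """BIC of a Gaussian linear model fitted by maximum likelihood."""
    n_obs, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = (resid**2).mean()                       # ML variance estimate
    loglik = -0.5 * n_obs * (np.log(2 * np.pi * sigma2) + 1)
    return -2 * loglik + (p + 1) * np.log(n_obs)     # +1 for sigma^2

zo, xo = z[obs], x[obs]
candidates = {
    "x ~ 1":          np.ones((obs.sum(), 1)),
    "x ~ z":          np.column_stack([np.ones(obs.sum()), zo]),
    "x ~ z + z^2":    np.column_stack([np.ones(obs.sum()), zo, zo**2]),
    "x ~ poly(z, 5)": np.column_stack([zo**d for d in range(6)]),
}
for name, X in candidates.items():
    print(f"{name:15s} BIC = {bic_normal_regression(X, xo):.1f}")
# The quadratic model should win: both underfitting and overfitting the
# imputation model are penalised, echoing the simulation finding above.
```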


Subject(s)
Algorithms , Bayes Theorem , Bias , Computer Simulation , Humans , Likelihood Functions
5.
Biom J ; 61(4): 1073-1087, 2019 07.
Article in English | MEDLINE | ID: mdl-31090104

ABSTRACT

Zero-truncated data arise in various disciplines where counts are observed but the zero count category cannot be observed during sampling. Maximum likelihood estimation can be used to model these data; however, due to its nonstandard form, it cannot be easily implemented using well-known software packages, and additional programming is often required. Motivated by the Rao-Blackwell theorem, we develop a weighted partial likelihood approach to estimate model parameters for zero-truncated binomial and Poisson data. The resulting estimating function is equivalent to a weighted score function for standard count data models, which allows readily available software to be applied. We evaluate the efficiency of this new approach and show that it performs almost as well as maximum likelihood estimation. The weighted partial likelihood approach is then extended to regression modelling and variable selection. We examine the performance of the proposed methods through simulations and present two case studies using real data.
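To make the motivation concrete, the Python sketch below programs the nonstandard zero-truncated Poisson likelihood directly, which is exactly the extra work the weighted partial likelihood approach is designed to avoid by reusing standard count-model software. Data and parameter values are illustrative.

```python
# Direct maximum likelihood for a zero-truncated Poisson regression:
# the "additional programming" that zero truncation normally requires.
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

rng = np.random.default_rng(5)

# Simulate Poisson counts, then discard the zeros (they are unobservable).
n = 2000
z = rng.normal(0.0, 1.0, n)
y_full = rng.poisson(np.exp(0.2 + 0.6 * z))
keep = y_full > 0
z, y = z[keep], y_full[keep]

def negll(beta):
    """Zero-truncated Poisson: P(Y=y | Y>0) = e^-mu mu^y / (y! (1 - e^-mu))."""
    mu = np.exp(beta[0] + beta[1] * z)
    ll = y * np.log(mu) - mu - gammaln(y + 1) - np.log1p(-np.exp(-mu))
    return -ll.sum()

fit = minimize(negll, x0=[0.0, 0.0])
print("truncated-ML coefficients:", fit.x)   # near (0.2, 0.6)
# A naive untruncated Poisson fit to the same data would be biased, since
# the missing zero counts are ignored.
```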


Subject(s)
Biometry/methods , Models, Statistical , Aged , Animals , Female , Humans , Likelihood Functions , Male , Marsupialia , Medicare/statistics & numerical data , Poisson Distribution , Population Density , United States
6.
Biometrics ; 74(1): 280-288, 2018 03.
Article in English | MEDLINE | ID: mdl-28632891

ABSTRACT

Sparse capture-recapture data from open populations are difficult to analyze using currently available frequentist statistical methods. However, in closed capture-recapture experiments, the Chao sparse estimator (Chao, 1989, Biometrics 45, 427-438) may be used to estimate population sizes when there are few recaptures. Here, we extend the Chao (1989) closed population size estimator to the open population setting by using linear regression and extrapolation techniques. We conduct a small simulation study and apply the models to several sparse capture-recapture data sets.
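For reference, here is a small Python sketch of the closed-population Chao (1989) lower-bound estimator that the paper extends, written in its familiar frequency-count form, with the usual bias-corrected variant when there are no doubletons; the example counts are hypothetical.

```python
# Chao lower-bound estimator for a closed population from capture frequencies.
import numpy as np

def chao_sparse(capture_counts):
    """Estimate N from per-individual capture counts (observed animals only)."""
    counts = np.asarray(capture_counts)
    s = counts.size                      # number of distinct animals seen
    f1 = (counts == 1).sum()             # captured exactly once
    f2 = (counts == 2).sum()             # captured exactly twice
    if f2 > 0:
        return s + f1**2 / (2 * f2)
    return s + f1 * (f1 - 1) / 2         # bias-corrected form when f2 == 0

# Example: 40 animals seen once, 8 seen twice, 2 seen three times.
counts = np.r_[np.ones(40), np.full(8, 2), np.full(2, 3)]
print(chao_sparse(counts))               # 50 + 1600/16 = 150 animals
```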


Subject(s)
Biometry/methods , Data Interpretation, Statistical , Linear Models , Animals , Computer Simulation , Models, Statistical , Population Density
7.
Nat Commun ; 8(1): 1071, 2017 10 20.
Article in English | MEDLINE | ID: mdl-29057865

ABSTRACT

Genetic rescue has now been attempted in several threatened species, but the contribution of genetics per se to any increase in population health can be hard to identify. Rescue is expected to be particularly useful when individuals are introduced into small isolated populations with low levels of genetic variation. Here we consider such a situation by documenting genetic rescue in the mountain pygmy possum, Burramys parvus. Rapid population recovery occurred in the target population after the introduction of a small number of males from a large genetically diverged population. Initial hybrid fitness was more than two-fold higher than non-hybrids; hybrid animals had a larger body size, and female hybrids produced more pouch young and lived longer. Genetic rescue likely contributed to the largest population size ever being recorded at this site. These data point to genetic rescue as being a potentially useful option for the recovery of small threatened populations.


Subject(s)
Conservation of Natural Resources/methods , Endangered Species/statistics & numerical data , Marsupialia/genetics , Animals , Female , Genetics, Population , Male , Population Density
8.
Biom J ; 58(6): 1409-1427, 2016 Nov.
Article in English | MEDLINE | ID: mdl-27477340

ABSTRACT

The negative binomial distribution is a common model for the analysis of count data in biology and ecology. In many applications, we may not observe the complete frequency count in a quadrat but only whether a species occurred in the quadrat. If only occurrence data are available, then the two parameters of the negative binomial distribution, the aggregation index and the mean, are not identifiable. This can be overcome by data augmentation or by modeling the dependence between quadrat occupancies. Here, we propose to record the (first) detection time while collecting occurrence data in a quadrat. We show that under what we call proportionate sampling, where the time to survey a region is proportional to the area of the region, both negative binomial parameters are estimable. When the mean parameter is larger than two, our proposed approach is more efficient than the data augmentation method developed by Solow and Smith (Am. Nat. 176, 96-98), and in general is cheaper to conduct. We also investigate the effect of misidentification when collecting negative binomially distributed data, and conclude that, in general, its effect can be simply adjusted for provided that the mean and variance of the misidentification probabilities are known. The results are demonstrated in a simulation study and illustrated in several real examples.
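The identifiability problem the paper starts from can be verified numerically: occurrence data alone only determine P(absence) = (1 + mu/k)^(-k), so distinct (mean, aggregation) pairs are indistinguishable. A quick Python check with illustrative numbers:

```python
# Occurrence data alone cannot separate the NB mean and aggregation index.
import numpy as np

def p_absent(mu, k):
    """Probability a quadrat has a zero count under NB(mean=mu, aggregation=k)."""
    return (1 + mu / k) ** (-k)

p0 = p_absent(mu=2.0, k=0.5)              # one parameter pair ...
k2 = 3.0
mu2 = k2 * (p0 ** (-1 / k2) - 1)          # ... and another matching it exactly
print(p_absent(2.0, 0.5), p_absent(mu2, k2))   # identical absence probability
print("second pair:", mu2, k2)
```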


Subject(s)
Biometry/methods , Models, Statistical , Binomial Distribution , Computer Simulation , Humans , Probability , Selection Bias , Time Factors
9.
Front Zool ; 13: 31, 2016.
Article in English | MEDLINE | ID: mdl-27398088

ABSTRACT

BACKGROUND: As increasingly fragmented and isolated populations of threatened species become subjected to climate change, invasive species and other stressors, there is an urgent need to consider adaptive potential when making conservation decisions, rather than focusing on past processes. In many cases, populations identified as unique and currently managed separately suffer increased risk of extinction through demographic and genetic processes. Other populations currently not at risk are likely to be on a trajectory where declines in population size and fitness soon appear inevitable. RESULTS: Using datasets from natural Australian mammal populations, we show that drift processes are likely to be driving uniqueness in populations of many threatened species as a result of small population size and fragmentation. Conserving and managing such remnant populations separately will therefore often decrease their adaptive potential and increase species extinction risk. CONCLUSIONS: These results highlight the need for a paradigm shift in conservation biology practice; strategies need to focus on the preservation of genetic diversity at the species level, rather than at the population, subspecies or evolutionarily significant unit level. The introduction of new genetic variants into populations through in situ translocation needs to be considered more broadly in conservation programs as a way of decreasing extinction risk by increasing neutral genetic diversity, which may increase the adaptive potential of populations if adaptive variation is also increased.

10.
Math Biosci ; 255: 43-51, 2014 Sep.
Article in English | MEDLINE | ID: mdl-24998623

ABSTRACT

To accommodate seasonal effects that change from year to year into models for the size of an open population, we consider a time-varying coefficient model. We fit this model to a capture-recapture data set collected on the little penguin Eudyptula minor in south-eastern Australia over a 25-year period, using Jolly-Seber type estimators and nonparametric P-spline techniques. The time-varying coefficient model identified strong changes in the seasonal pattern across the years, which we further examined using functional data analysis techniques. To evaluate the methodology, we also conducted several simulation studies that incorporate seasonal variation.
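As a rough illustration of the smoothing machinery, the Python sketch below fits a penalized spline to a seasonal signal whose amplitude drifts across years; a truncated-line basis with a ridge penalty stands in for the paper's P-splines, and all numbers are invented.

```python
# Toy penalized-spline smoother of a seasonal signal that changes across years.
import numpy as np

rng = np.random.default_rng(9)

# Weekly observations over several "years", seasonal amplitude drifting.
t = np.arange(0, 8, 1 / 52.0)                       # time in years
ampl = 1.0 + 0.15 * t                               # amplitude grows yearly
y = ampl * np.sin(2 * np.pi * t) + rng.normal(0.0, 0.4, t.size)

# Design matrix: intercept, linear trend, and truncated-line spline terms.
knots = np.linspace(t.min(), t.max(), 82)[1:-1]
X = np.column_stack([np.ones_like(t), t]
                    + [np.clip(t - k, 0, None) for k in knots])

# Ridge penalty on the spline coefficients only (not intercept or trend).
lam = 1.0
D = np.diag([0.0, 0.0] + [1.0] * knots.size)
beta = np.linalg.solve(X.T @ X + lam * D, X.T @ y)
fhat = X @ beta                                     # the fitted smooth

print("residual SD:", np.std(y - fhat))             # close to the noise SD 0.4
```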


Subject(s)
Models, Biological , Spheniscidae , Animals , Computer Simulation , Female , Male , Mathematical Concepts , Population Density , Seasons , Statistics, Nonparametric , Victoria
11.
Biometrics ; 70(1): 110-20, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24350717

ABSTRACT

We propose a new variable selection criterion designed for use with forward selection algorithms: the score information criterion (SIC). The proposed criterion is based on score statistics which incorporate correlated response data. The main advantage of the SIC is that it is much faster to compute than existing model selection criteria when the number of predictor variables added to a model is large, because the SIC can be computed for all candidate models without actually fitting them. A second advantage is that it incorporates the correlation between variables into its quasi-likelihood, leading to more desirable properties than competing selection criteria. Consistency and prediction properties are shown for the SIC. We conduct simulation studies to evaluate its selection and prediction performance, and compare these, as well as computation times, with some well-known variable selection criteria. We apply the SIC to a real data set collected on arthropods, considering variable selection over a large number of interaction terms formed from species traits and environmental covariates.
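A Python sketch of the underlying trick in the simplest (Gaussian, independent-response) setting: every candidate variable is scored from a single fit of the current model, with no candidate model ever fitted. The SIC itself is built on quasi-likelihood score statistics for correlated responses, so this conveys only the flavour of the computation.

```python
# Forward selection by score statistics: candidates ranked without refitting.
import numpy as np

rng = np.random.default_rng(2)

n, p = 300, 50
Z = rng.normal(size=(n, p))                       # candidate predictors
y = 1.0 + 2.0 * Z[:, 3] - 1.5 * Z[:, 17] + rng.normal(0.0, 1.0, n)

X = np.ones((n, 1))                               # current model: intercept only
selected = []
for step in range(2):
    # One fit of the current model gives residuals and the hat projection.
    Q, _ = np.linalg.qr(X)
    r = y - Q @ (Q.T @ y)
    sigma2 = r @ r / (n - X.shape[1])
    # Score statistic for adding candidate j, computed from the current fit:
    # (x_j' r)^2 / (sigma^2 * x_j' (I - H) x_j).
    Zres = Z - Q @ (Q.T @ Z)
    score = (Zres.T @ r) ** 2 / (sigma2 * (Zres**2).sum(axis=0))
    score[selected] = -np.inf                     # skip already-chosen ones
    j = int(np.argmax(score))
    selected.append(j)
    X = np.column_stack([X, Z[:, j]])             # forward-select it
    print(f"step {step}: selected variable {j} (score {score[j]:.1f})")
```

The loop should pick variables 3 and 17, the two with real signal, while each step costs one QR of the small current model rather than p candidate fits.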


Subject(s)
Algorithms , Data Interpretation, Statistical , Likelihood Functions , Longitudinal Studies/methods , Models, Statistical , Animals , Arthropods/growth & development , Australia , Computer Simulation , Ecosystem
12.
Biom J ; 54(6): 861-74, 2012 Nov.
Article in English | MEDLINE | ID: mdl-23027314

ABSTRACT

In capture-recapture models, survival and capture probabilities can be modelled as functions of time-varying covariates, such as temperature or rainfall. The Cormack-Jolly-Seber (CJS) model allows for flexible modelling of these covariates; however, the functional relationship may not be linear. We extend the CJS model by semi-parametrically modelling capture and survival probabilities using a frequentist approach via P-spline techniques. We investigate the performance of the estimators by conducting simulation studies. We also apply these models and compare them with known semi-parametric Bayesian approaches on simulated and real data sets.
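For orientation, here is a compact Python implementation of the baseline CJS log-likelihood with constant survival (phi) and capture (p) probabilities, the model the paper generalises by letting these probabilities vary smoothly; the simulated histories and values are illustrative.

```python
# Cormack-Jolly-Seber log-likelihood with constant survival and capture.
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(4)

# Simulate: n animals all released at occasion 0, T later occasions.
n, T, phi_true, p_true = 400, 6, 0.8, 0.5
alive = np.ones(n, dtype=bool)
H = np.zeros((n, T + 1), dtype=int)
H[:, 0] = 1                                     # initial release
for t in range(1, T + 1):
    alive &= rng.random(n) < phi_true           # survive the interval
    H[:, t] = alive & (rng.random(n) < p_true)  # detected only if alive

def negll(theta):
    phi, p = expit(theta)                       # keep both in (0, 1)
    # chi[t] = P(never seen again | alive at t); backward recursion.
    chi = np.ones(T + 1)
    for t in range(T - 1, -1, -1):
        chi[t] = (1 - phi) + phi * (1 - p) * chi[t + 1]
    ll = 0.0
    for h in H:
        seen = np.flatnonzero(h)
        f, l = seen[0], seen[-1]
        for t in range(f + 1, l + 1):           # between first and last sighting
            ll += np.log(phi) + np.log(p if h[t] else 1 - p)
        ll += np.log(chi[l])                    # never seen after occasion l
    return -ll

fit = minimize(negll, x0=[0.0, 0.0])
print("phi-hat, p-hat:", expit(fit.x))          # near (0.8, 0.5)
```

The semi-parametric extension replaces the two constants with smooth functions of covariates, with the smoothness controlled by a P-spline penalty.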


Subject(s)
Environment , Models, Statistical , Analysis of Variance , Animals , Bayes Theorem , Population Dynamics , Probability , Spheniscidae
13.
Biometrics ; 67(4): 1659-65, 2011 Dec.
Article in English | MEDLINE | ID: mdl-21466530

ABSTRACT

In practice, when analyzing data from a capture-recapture experiment, it is tempting to apply modern advanced statistical methods to the observed capture histories. However, unless the analysis takes into account that the data have only been collected from individuals who have been captured at least once, the results may be biased. Without the development of new software packages, methods such as generalized additive models, generalized linear mixed models, and simulation-extrapolation cannot be readily implemented. In contrast, the partial likelihood approach allows the analysis of a capture-recapture experiment to be conducted using commonly available software. Here we examine the efficiency of this approach and apply it to several data sets.
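One simple version of the conditioning idea, sketched in Python for a closed population: the capture occasions strictly after an animal's first capture are ordinary Bernoulli trials, so the capture probability can be fitted with any standard logistic-regression routine (for an intercept-only model the MLE is just a mean). This is a hypothetical illustration, not the paper's exact estimator.

```python
# Partial likelihood via post-first-capture occasions in a closed population.
import numpy as np

rng = np.random.default_rng(6)

N, T, p_true = 1500, 6, 0.3
Y = (rng.random((N, T)) < p_true).astype(int)
Y = Y[Y.sum(axis=1) > 0]                     # only ever-captured animals

# Partial likelihood data: all occasions strictly after first capture.
first = np.argmax(Y == 1, axis=1)            # first-capture occasion
rows, cols = np.nonzero(np.arange(T) > first[:, None])
trials = Y[rows, cols]                       # independent Bernoulli(p) draws

p_hat = trials.mean()                        # intercept-only logistic MLE
print("p-hat:", p_hat)
n_seen = Y.shape[0]
print("N-hat:", n_seen / (1 - (1 - p_hat) ** T))   # Horvitz-Thompson style
```

With covariates, `trials` simply becomes the response in an off-the-shelf logistic regression, which is the point of the approach.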


Subject(s)
Censuses , Data Interpretation, Statistical , Emigration and Immigration/statistics & numerical data , Models, Statistical , Population Density , Animals , Computer Simulation , Likelihood Functions