2.
Stat Med ; 39(21): 2815-2842, 2020 09 20.
Article in English | MEDLINE | ID: mdl-32419182

ABSTRACT

Missing data due to loss to follow-up or intercurrent events are unintended, but unfortunately inevitable, in clinical trials. Since the true values of missing data are never known, it is necessary to assess the impact of untestable and unavoidable assumptions about the unobserved data in sensitivity analysis. This tutorial provides an overview of controlled multiple imputation (MI) techniques and a practical guide to their use for sensitivity analysis of trials with missing continuous outcome data. These include δ- and reference-based MI procedures. In δ-based imputation, an offset term, δ, is typically added to the expected value of the missing data to assess the impact of unobserved participants having a worse or better response than those observed. Reference-based imputation draws imputed values with some reference to observed data in other groups of the trial, typically in other treatment arms. We illustrate the accessibility of these methods using data from a pediatric eczema trial and a chronic headache trial and provide Stata code to facilitate adoption. We discuss issues surrounding the choice of δ in δ-based sensitivity analysis. We also review the debate on variance estimation within reference-based analysis and justify the use of Rubin's variance estimator in this setting, since, as we elaborate on below, it provides information-anchored inference.
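A minimal Python sketch of the δ-adjustment step described above, assuming imputations have already been drawn under MAR by a standard MI procedure (the paper itself provides Stata code; all data and names below are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy setup: observed outcomes and, for the dropouts, M imputations drawn
# under MAR (in practice these come from a standard MI routine).
M = 20
y_obs = rng.normal(0.0, 1.0, size=80)            # completers
y_imp_mar = rng.normal(0.0, 1.0, size=(M, 40))   # MAR draws for 40 dropouts

def delta_adjusted_estimate(y_obs, y_imp_mar, delta):
    """Add the offset delta to the MAR imputations, then pool the means."""
    ests = []
    for m in range(y_imp_mar.shape[0]):
        y_full = np.concatenate([y_obs, y_imp_mar[m] + delta])
        ests.append(y_full.mean())
    return np.mean(ests)  # Rubin's rules point estimate: mean of the M estimates

# Tip the unobserved responses towards being worse by 0, 0.5 or 1 units:
for delta in (0.0, 0.5, 1.0):
    print(delta, round(delta_adjusted_estimate(y_obs, y_imp_mar, delta), 3))
```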


Subject(s)
Data Interpretation, Statistical , Child , Humans
3.
Stat Med ; 39(11): 1658-1674, 2020 May 20.
Article in English | MEDLINE | ID: mdl-32059073

ABSTRACT

Nonignorable missing data poses key challenges for estimating treatment effects because the substantive model may not be identifiable without imposing further assumptions. For example, the Heckman selection model has been widely used for handling nonignorable missing data but requires correct assumptions both about the joint distribution of the missingness and the outcome and about the existence of a valid exclusion restriction. Recent studies have revisited how alternative selection model approaches, for example estimated by multiple imputation (MI) and maximum likelihood, relate to Heckman-type approaches in addressing the first hurdle. However, the extent to which these different selection models rely on the exclusion restriction assumption with nonignorable missing data is unclear. Motivated by an interventional study (REFLUX) with nonignorable missing outcome data in half of the sample, this article critically examines the role of the exclusion restriction in Heckman, MI, and full-likelihood selection models when addressing nonignorability. We explore the implications of the different methodological choices concerning the exclusion restriction for relative bias and root-mean-squared error in estimating treatment effects. We find that the relative performance of the methods differs in practically important ways according to the relevance and strength of the exclusion restriction. The full-likelihood approach is less sensitive to alternative assumptions about the exclusion restriction than Heckman-type models and appears to be an appropriate method for handling nonignorable missing data. We illustrate the implications of method choice for inference in the REFLUX study, which evaluates the effect of laparoscopic surgery on long-term quality of life for patients with gastro-oesophageal reflux disease.
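A minimal sketch of the classic Heckman two-step estimator discussed above: a probit selection equation followed by OLS augmented with the inverse Mills ratio, on simulated data in which z provides a valid exclusion restriction (all names and values are illustrative, not from the paper):

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import norm

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)                    # covariate in both equations
z = rng.normal(size=n)                    # instrument: exclusion restriction
u, e = rng.multivariate_normal([0, 0], [[1, .5], [.5, 1]], size=n).T
observed = (0.5 + 0.8 * z + 0.3 * x + u) > 0   # selection equation
y = 1.0 + 2.0 * x + e                          # outcome equation

# Step 1: probit for the probability of being observed.
Xs = sm.add_constant(np.column_stack([x, z]))
probit = sm.Probit(observed.astype(float), Xs).fit(disp=0)
xb = Xs @ probit.params
imr = norm.pdf(xb) / norm.cdf(xb)         # inverse Mills ratio

# Step 2: OLS on the selected sample, augmented with the inverse Mills ratio.
Xo = sm.add_constant(np.column_stack([x[observed], imr[observed]]))
ols = sm.OLS(y[observed], Xo).fit()
print(ols.params)   # slope on x should be close to the true value 2.0
```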


Subject(s)
Gastroesophageal Reflux , Quality of Life , Bias , Humans , Likelihood Functions , Models, Statistical
4.
Pharm Stat ; 18(6): 645-658, 2019 11.
Article in English | MEDLINE | ID: mdl-31309730

ABSTRACT

The analysis of time-to-event data typically makes the censoring at random assumption, ie, that, conditional on covariates in the model, the distribution of event times is the same whether they are observed or unobserved (ie, right censored). When patients who remain in follow-up stay on their assigned treatment, then analysis under this assumption broadly addresses the de jure, or "while on treatment strategy," estimand. In such cases, we may well wish to explore the robustness of our inference to more pragmatic, de facto or "treatment policy strategy" assumptions about the behaviour of patients post-censoring. This is particularly the case when censoring occurs because patients change, or revert, to the usual (ie, reference) standard of care. Recent work has shown how such questions can be addressed for trials with continuous outcome data and longitudinal follow-up, using reference-based multiple imputation. For example, patients in the active arm may have their missing data imputed assuming they reverted to the control (ie, reference) intervention on withdrawal. Reference-based imputation has two advantages: (a) it avoids the user specifying numerous parameters describing the distribution of patients' postwithdrawal data and (b) it is, to a good approximation, information anchored, so that the proportion of information lost due to missing data under the primary analysis is held constant across the sensitivity analyses. In this article, we build on recent work in the survival context, proposing a class of reference-based assumptions appropriate for time-to-event data. We report a simulation study exploring the extent to which the multiple imputation estimator (using Rubin's variance formula) is information anchored in this setting and then illustrate the approach by reanalysing data from a randomized trial, which compared medical therapy with angioplasty for patients presenting with angina.
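A minimal sketch of one reference-based assumption for time-to-event data ("jump to reference"), assuming constant (exponential) hazards so that, by memorylessness, the post-withdrawal residual time is a fresh exponential draw at the reference rate. Rates and names are illustrative; a real analysis would draw multiple imputations and fit a survival model to each:

```python
import numpy as np

rng = np.random.default_rng(7)
lam_active, lam_ref = 0.05, 0.10   # illustrative constant hazards

n = 1000
t_event = rng.exponential(1 / lam_active, n)   # latent event times, active arm
t_cens = rng.uniform(0, 20, n)                 # e.g. treatment-withdrawal times
censored = t_cens < t_event

# "Jump to reference": after withdrawal, the residual time-to-event follows
# the reference (control) hazard; with exponential hazards this is simply a
# fresh exponential draw with rate lam_ref added to the censoring time.
t_imputed = np.where(censored,
                     t_cens + rng.exponential(1 / lam_ref, n),
                     t_event)
print(round(t_event.mean(), 2), round(t_imputed.mean(), 2))
```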


Subject(s)
Clinical Trials as Topic/methods , Data Interpretation, Statistical , Models, Statistical , Computer Simulation , Follow-Up Studies , Humans , Randomized Controlled Trials as Topic/methods , Research Design , Time Factors
5.
J R Stat Soc Ser A Stat Soc ; 182(2): 623-645, 2019 Feb.
Article in English | MEDLINE | ID: mdl-30828138

ABSTRACT

Analysis of longitudinal randomized clinical trials is frequently complicated because patients deviate from the protocol. Where such deviations are relevant for the estimand, we are typically required to make an untestable assumption about post-deviation behaviour to perform our primary analysis and estimate the treatment effect. In such settings, it is now widely recognized that we should follow this with sensitivity analyses to explore the robustness of our inferences to alternative assumptions about post-deviation behaviour. Although there has been much work on how to conduct such sensitivity analyses, little attention has been given to the appropriate loss of information due to missing data within sensitivity analysis. We argue that more attention needs to be given to this issue, showing that sensitivity analysis can either decrease or increase the information about the treatment effect relative to the primary analysis. To address this, we introduce the concept of information-anchored sensitivity analysis: sensitivity analyses in which the proportion of information about the treatment estimate lost because of missing data is the same as in the primary analysis. We argue that this forms a transparent, practical starting point for the interpretation of sensitivity analysis. We then derive results showing that, for longitudinal continuous data, a broad class of controlled and reference-based sensitivity analyses performed by multiple imputation are information anchored. We illustrate the theory with simulations and an analysis of a peer review trial, and then discuss our work in the context of other recent work in this area. Our results give a theoretical basis for the use of controlled multiple-imputation procedures for sensitivity analysis.
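A minimal sketch of checking information anchoring numerically: combine per-imputation estimates by Rubin's rules and compare the fraction of information lost to missing data under the primary and sensitivity analyses (the input numbers below are invented for illustration):

```python
import numpy as np

def rubin_combine(estimates, variances):
    """Rubin's rules: pooled estimate, total variance and the (approximate)
    fraction of information about the estimand lost to missing data."""
    estimates, variances = np.asarray(estimates), np.asarray(variances)
    M = len(estimates)
    qbar = estimates.mean()
    W = variances.mean()              # within-imputation variance
    B = estimates.var(ddof=1)         # between-imputation variance
    T = W + (1 + 1 / M) * B
    frac_missing_info = (1 + 1 / M) * B / T
    return qbar, T, frac_missing_info

# Illustrative per-imputation estimates/variances from a primary (MAR)
# analysis and a reference-based sensitivity analysis of the same trial:
_, _, f_primary = rubin_combine([1.9, 2.1, 2.0, 2.2, 1.8], [0.20] * 5)
_, _, f_sens = rubin_combine([1.5, 1.7, 1.6, 1.8, 1.4], [0.20] * 5)
# Information anchoring requires f_sens to approximately equal f_primary.
print(round(f_primary, 3), round(f_sens, 3))
```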

6.
BMC Psychol ; 6(1): 62, 2018 Dec 20.
Article in English | MEDLINE | ID: mdl-30572936

ABSTRACT

BACKGROUND: Habits (learned automatic responses to contextual cues) are considered important in sustaining health behaviour change. While habit formation is promoted by repeating behaviour in a stable context, little is known about what other variables may contribute, and whether there are variables which may accelerate the habit formation process. The aim of this study was to explore variables relating to the perceived reward value of behaviour - pleasure, perceived utility, perceived benefits, and intrinsic motivation. The paper tests whether reward has an impact on habit formation that is mediated by behavioural repetition, and whether reward moderates the relationship between repetition and habit formation. METHODS: Habit formation for flossing and vitamin C tablet adherence was investigated in the general public following an intervention, using a longitudinal, single-group design. Of a total sample of 118 participants, 80 received an online vitamin C intervention at baseline, and all 118 received a face-to-face flossing intervention four weeks later. Behaviour, habit, intention, context stability (whether the behaviour was conducted in the same place and point in routine every time), and reward variables were self-reported every four weeks, for sixteen weeks. Structural equation modelling was used to model reward-related variables as predictors of intention, repetition, and habit, and as moderators of the repetition-habit relationship. RESULTS: Habit strength and behaviour increased for both target behaviours. Intrinsic motivation and pleasure moderated the relationship between behavioural repetition and habit. Neither perceived utility nor perceived benefits predicted behaviour or interacted with repetition. Limited support was obtained for the mediation hypothesis. Strong intentions unexpectedly weakened the repetition-habit relationship. Context stability mediated and, for vitamin C, also moderated the repetition-habit relationship. CONCLUSIONS: Pleasure and intrinsic motivation can aid habit formation by promoting a greater increase in habit strength per behavioural repetition. Perceived reward can therefore reinforce habits, beyond the impact of reward upon repetition. Habit-formation interventions may be most successful where target behaviours are pleasurable or intrinsically valued.
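The study used structural equation modelling; as a simplified stand-in, moderation of the repetition-habit relationship can be illustrated with an ordinary regression containing a product term (simulated data; variable names are illustrative, not the study's measures):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 118
df = pd.DataFrame({"repetition": rng.normal(size=n),
                   "pleasure": rng.normal(size=n)})
# Simulate a habit score whose dependence on repetition grows with pleasure:
df["habit"] = (0.5 * df["repetition"] + 0.2 * df["pleasure"]
               + 0.4 * df["repetition"] * df["pleasure"]
               + rng.normal(size=n))

# Moderation is tested by the repetition:pleasure interaction coefficient.
fit = smf.ols("habit ~ repetition * pleasure", data=df).fit()
print(fit.params)
print(fit.pvalues["repetition:pleasure"])
```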


Subject(s)
Behavior Control/psychology , Habits , Health Behavior , Motivation , Pleasure , Reinforcement, Psychology , Reward , Self Care/psychology , Adult , Female , Humans , Intention , Male , Self Report
7.
Stat Med ; 37(9): 1419-1438, 2018 04 30.
Article in English | MEDLINE | ID: mdl-29349792

ABSTRACT

Quantitative evidence synthesis through meta-analysis is central to evidence-based medicine. For well-documented reasons, the meta-analysis of individual patient data is held in higher regard than that of aggregate data. With access to individual patient data, the analysis is not restricted to a "two-stage" approach (combining estimates and standard errors) but can estimate parameters of interest by fitting a single model to all of the data, a so-called "one-stage" analysis. There has been debate about the merits of one- and two-stage analysis. Arguments for one-stage analysis have typically noted that a wider range of models can be fitted and overall estimates may be more precise. The two-stage side has emphasised that the models that can be fitted in two stages are sufficient to answer the relevant questions, with less scope for mistakes because there are fewer modelling choices to be made in the two-stage approach. For Gaussian data, we consider the statistical arguments for flexibility and precision in small-sample settings. Regarding flexibility, several of the models that can be fitted only in one stage may not be of serious interest to most meta-analysis practitioners. Regarding precision, we consider fixed- and random-effects meta-analysis and see that, for a model making certain assumptions, the number of stages used to fit this model is irrelevant; the precision is approximately the same either way. Meta-analysts should choose modelling assumptions carefully. Sometimes relevant models can only be fitted in one stage. Otherwise, meta-analysts are free to use whichever procedure is most convenient to fit the identified model.
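A minimal sketch of the two-stage route for Gaussian data: stage 1 produces per-study treatment-effect estimates and standard errors, and stage 2 pools them by fixed-effect inverse-variance weighting (the numbers are invented; under the common-effect assumptions the paper discusses, a one-stage fit to all the data would give approximately the same precision):

```python
import numpy as np

# Stage 1 outputs: per-study treatment-effect estimates and standard errors
# (illustrative values; in IPD meta-analysis these come from fitting the same
# model within each study).
beta = np.array([0.30, 0.45, 0.25, 0.40])
se = np.array([0.10, 0.15, 0.12, 0.20])

# Stage 2: fixed-effect inverse-variance pooling.
w = 1 / se**2
beta_pooled = np.sum(w * beta) / np.sum(w)
se_pooled = np.sqrt(1 / np.sum(w))
print(round(beta_pooled, 3), round(se_pooled, 3))
```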


Subject(s)
Meta-Analysis as Topic , Normal Distribution , Data Interpretation, Statistical , Humans , Linear Models , Models, Statistical
8.
J Stat Comput Simul ; 87(8): 1541-1558, 2017 May 24.
Article in English | MEDLINE | ID: mdl-28515536

ABSTRACT

The linear mixed model with an added integrated Ornstein-Uhlenbeck (IOU) process (the linear mixed IOU model) allows for serial correlation and estimation of the degree of derivative tracking. It is rarely used, partly owing to the lack of available software. We implemented the linear mixed IOU model in Stata and, using simulations, assessed the feasibility of fitting the model by restricted maximum likelihood when applied to balanced and unbalanced data. We compared different (1) optimization algorithms, (2) parameterizations of the IOU process, (3) data structures and (4) random-effects structures. Fitting the model was practical and feasible when applied to large and moderately sized balanced datasets (20,000 and 500 observations) and to large unbalanced datasets with (non-informative) dropout and intermittent missingness. Analysis of a real dataset showed that the linear mixed IOU model fitted the data better than the standard linear mixed model (i.e. independent within-subject errors with constant variance).
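A minimal sketch of simulating an integrated Ornstein-Uhlenbeck path: exact AR(1) updates for the OU "velocity" process, then numerical integration. This illustrates the process itself under assumed parameter values, not the paper's Stata estimation routine:

```python
import numpy as np

rng = np.random.default_rng(42)
alpha, sigma = 0.8, 1.0      # illustrative IOU parameters
dt, n_steps = 0.1, 200

# Simulate the OU "velocity" process exactly on a regular grid.
v = np.zeros(n_steps)
decay = np.exp(-alpha * dt)
sd = sigma * np.sqrt((1 - decay**2) / (2 * alpha))
for t in range(1, n_steps):
    v[t] = v[t - 1] * decay + sd * rng.normal()

# Integrated OU path (trapezoidal rule). Large alpha gives little
# "derivative tracking" (close to Brownian motion); small alpha gives
# smooth paths with persistent slopes.
w = np.concatenate([[0.0], np.cumsum((v[1:] + v[:-1]) / 2 * dt)])
print(np.round(w[:5], 3))
```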

9.
Biometrics ; 73(3): 938-948, 2017 09.
Article in English | MEDLINE | ID: mdl-28134978

ABSTRACT

Distributed lag non-linear models (DLNMs) are a modelling tool for describing potentially non-linear and delayed dependencies. Here, we illustrate an extension of the DLNM framework through the use of penalized splines within generalized additive models (GAM). This extension offers built-in model selection procedures and the possibility of accommodating assumptions on the shape of the lag structure through specific penalties. In addition, this framework includes, as special cases, simpler models previously proposed for linear relationships (DLMs). Alternative versions of penalized DLNMs are compared with each other and with the standard unpenalized version in a simulation study. Results show that this penalized extension to the DLNM class provides greater flexibility and improved inferential properties. The framework exploits recent theoretical developments of GAMs and is implemented using efficient routines within freely available software. Real-data applications are illustrated through two reproducible examples in time series and survival analysis.
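The paper's framework rests on penalized splines within GAMs (the R dlnm and mgcv packages are the usual implementation route); as a crude, hedged stand-in, the sketch below builds a distributed-lag design matrix and shrinks the lag coefficients with a ridge penalty instead of the spline penalties of the full framework (all data are simulated):

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n, max_lag = 500, 10
x = rng.normal(size=n + max_lag)   # exposure series

# Distributed-lag design: column l holds the exposure lagged by l days.
lag_basis = np.column_stack(
    [x[max_lag - l:n + max_lag - l] for l in range(max_lag + 1)])

# True lag-response: effect decays smoothly over lags 0..10.
true_coefs = 0.5 * np.exp(-np.arange(max_lag + 1) / 3)
y = lag_basis @ true_coefs + rng.normal(size=n)

# Ridge shrinkage as a rough stand-in for the smoothness penalties the
# penalized DLNM framework places on the lag structure.
fit = Ridge(alpha=10.0).fit(lag_basis, y)
print(np.round(fit.coef_, 2))
```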


Subject(s)
Nonlinear Dynamics , Software
10.
BMC Public Health ; 16(1): 1109, 2016 10 21.
Article in English | MEDLINE | ID: mdl-27769194

ABSTRACT

BACKGROUND: Over recent decades, hand, foot and mouth disease (HFMD) has emerged as a serious public health threat in the Asia-Pacific region because of its high rates of severe complications. Understanding the differences and similarities between mild and severe cases can be helpful in the control of HFMD. In this study, we compared the temporal trends of the two types of HFMD cases. METHODS: We retrieved the daily series of counts of mild and severe HFMD cases reported in mainland China over the period 2009-2014. We applied a quasi-Poisson regression model to decompose each series into a long-term linear trend, periodic variations, and short-term fluctuations, and then compared each component between the two series. RESULTS: A total of 11,101,860 clinical HFMD cases, including 115,596 severe cases, were included in this analysis. We found a biennial increase of 24.46% (95% CI: 22.80-26.14%) in the baseline disease incidence of mild cases, whereas a biennial decrease of 8.80% (95% CI: 7.26-10.31%) was seen for severe cases. The periodic variations of both series could be characterized by a mixture of biennial, annual, semi-annual and eight-monthly cycles. However, compared with the mild-case series, the severe-case series varied more widely over the biennial and annual cycles and started its annual epidemic earlier. The short-term fluctuations of the two series were also significantly correlated on the same day, with a correlation coefficient of 0.46 (95% CI: 0.43-0.49). CONCLUSIONS: We found some noticeable differences, as well as similarities, between the daily series of mild and severe HFMD cases at different time scales. Our findings deepen understanding of the transmission of the different types of HFMD cases and provide evidence for planning the associated disease control strategies.
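A minimal sketch of the decomposition described above: a quasi-Poisson GLM (Poisson family with Pearson-based dispersion) with a linear trend and harmonic terms for the periodic variation, fitted to simulated daily counts; the residuals would carry the short-term fluctuations. All values are illustrative:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
days = np.arange(6 * 365)
# Illustrative daily counts with a trend plus annual and semi-annual cycles:
mu = np.exp(3 + 0.0003 * days
            + 0.4 * np.sin(2 * np.pi * days / 365)
            + 0.2 * np.sin(2 * np.pi * days / 182.5))
y = rng.poisson(mu)

# Design: linear trend plus harmonic (Fourier) terms for the periodicities.
X = sm.add_constant(np.column_stack([
    days,
    np.sin(2 * np.pi * days / 365), np.cos(2 * np.pi * days / 365),
    np.sin(2 * np.pi * days / 182.5), np.cos(2 * np.pi * days / 182.5),
]))

# scale='X2' uses the Pearson chi-square dispersion, i.e. a quasi-Poisson fit.
fit = sm.GLM(y, X, family=sm.families.Poisson()).fit(scale='X2')
# Approximate biennial log-linear trend; exp() - 1 gives the percentage change.
print(fit.params[1] * 365 * 2)
```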


Subject(s)
Epidemics/prevention & control , Hand, Foot and Mouth Disease/epidemiology , China/epidemiology , Female , Humans , Male , Models, Theoretical , Public Health , Seroepidemiologic Studies
11.
Epidemics ; 17: 1-9, 2016 12.
Article in English | MEDLINE | ID: mdl-27639116

ABSTRACT

OBJECTIVE: Infectious disease spread depends on contact rates between infectious and susceptible individuals. Transmission models are commonly informed using empirically collected contact data, but the relevance of different contact types to transmission is still not well understood. Some studies select contacts based on a single characteristic such as proximity (physical/non-physical), location, duration or frequency. This study aimed to explore whether clusters of contacts similar to each other across multiple characteristics could better explain disease transmission. METHODS: Individual contact data from the POLYMOD survey in Poland, Great Britain, Belgium, Finland and Italy were grouped into clusters by the k-medoids clustering algorithm with a Manhattan distance metric to stratify contacts using all four characteristics. Contact clusters were then used to fit a transmission model to sero-epidemiological data for varicella-zoster virus (VZV) in each country. RESULTS AND DISCUSSION: Across the five countries, 9-15 clusters were found to optimise both quality of clustering (measured using average silhouette width) and quality of fit (measured using several information criteria). Of these, 2-3 clusters were most relevant to VZV transmission, characterised by (i) one or two clusters of age-assortative contacts in schools and (ii) a cluster of less age-assortative contacts in non-school settings. Quality of fit was similar to using contacts stratified by a single characteristic, providing validation that single stratifications are appropriate. However, using clustering to stratify contacts by multiple characteristics provided insight into the structures underlying infection transmission, particularly the role of age-assortative contacts involving school-age children in VZV transmission between households.
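A minimal PAM-style k-medoids implementation with a Manhattan (cityblock) distance metric, applied to simulated contact records; a real analysis would use a dedicated implementation (e.g. R's cluster::pam) and the actual POLYMOD characteristics. Feature names here are illustrative:

```python
import numpy as np
from scipy.spatial.distance import cdist

def k_medoids(X, k, n_iter=50, seed=0):
    """Basic alternating k-medoids clustering with Manhattan distance."""
    rng = np.random.default_rng(seed)
    D = cdist(X, X, metric="cityblock")
    medoids = rng.choice(len(X), size=k, replace=False)
    for _ in range(n_iter):
        labels = np.argmin(D[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for j in range(k):
            members = np.where(labels == j)[0]
            if len(members):
                sub = D[np.ix_(members, members)]
                new_medoids[j] = members[np.argmin(sub.sum(axis=1))]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    return labels, medoids

# Illustrative contact records: duration, frequency, physical (0/1), school (0/1)
rng = np.random.default_rng(1)
contacts = rng.integers(0, 5, size=(300, 4)).astype(float)
labels, medoids = k_medoids(contacts, k=4)
print(np.bincount(labels))
```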


Subject(s)
Cluster Analysis , Communicable Diseases , Varicella Zoster Virus Infection/epidemiology , Child , Europe/epidemiology , Herpesvirus 3, Human , Humans , Italy , United Kingdom
12.
J Biom Biostat ; 7(1)2016 Feb.
Article in English | MEDLINE | ID: mdl-27175309

ABSTRACT

Often, sample size is not fixed by design. A key example is a sequential trial with a stopping rule, where stopping is based on what has been observed at an interim look. While such designs are used for time and cost efficiency, and hypothesis testing theory has been well developed, estimation following a sequential trial is a challenging, still controversial problem. Progress has been made in the literature, predominantly for normal outcomes and/or for a deterministic stopping rule. Here, we place these settings in the broader context of outcomes following an exponential family distribution, with a stochastic stopping rule that includes a deterministic rule and a completely random sample size as special cases. It is shown that the estimation problem is usually simpler than often thought. In particular, it is established that the ordinary sample average is a very sensible choice, contrary to commonly encountered statements. We study (1) the so-called incompleteness property of the sufficient statistics, (2) a general class of linear estimators, and (3) joint and conditional likelihood estimation. Apart from the general exponential family setting, normal and binary outcomes are considered as key examples. While our results hold for a general number of looks, for ease of exposition, we focus on the simple yet generic setting of two possible sample sizes, N=n or N=2n.
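A hedged simulation of the two-look setting described above: stop at n if the interim mean exceeds a threshold, otherwise continue to 2n. It illustrates that the sample average is close to unbiased marginally even though it looks biased once one conditions on the realized sample size (threshold and sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(11)
mu, n, reps, c = 0.0, 50, 20000, 0.0
overall, by_n, by_2n = [], [], []

for _ in range(reps):
    first = rng.normal(mu, 1, n)
    if first.mean() > c:                 # stopping rule at the interim look
        est, N = first.mean(), n
    else:
        both = np.concatenate([first, rng.normal(mu, 1, n)])
        est, N = both.mean(), 2 * n
    overall.append(est)
    (by_n if N == n else by_2n).append(est)

# Marginally the sample average is approximately unbiased; conditioning on
# the realized sample size makes it *look* biased in opposite directions.
print(round(np.mean(overall), 4))                       # close to mu
print(round(np.mean(by_n), 4), round(np.mean(by_2n), 4))  # straddle mu
```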

13.
Sci Rep ; 6: 22578, 2016 Mar 02.
Article in English | MEDLINE | ID: mdl-26931301

ABSTRACT

It is widely held that decisions whether or when to attend health facilities for childbirth are influenced not only by risk awareness and household wealth but also by factors such as autonomy, or a woman's ability to act upon her own preferences. How autonomy should be constructed and measured - namely, as an individual or cluster-level variable - has been less examined. We drew on household survey data from Zambia to study the effect of several autonomy dimensions (financial, relationship, freedom of movement, health care seeking and violence) on place of delivery for 3200 births across 203 rural clusters (villages). In multilevel logistic regression, two autonomy dimensions (relationship and health care seeking) were strongly associated with facility delivery when measured at the cluster level (OR 1.27 and 1.57, respectively), though not at the individual level. This suggests that power relations and gender norms at the community level may override an individual woman's autonomy, and that cluster-level measurement may prove critical to understanding the interplay between autonomy and care seeking in this and similar contexts.
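A simplified sketch of the individual- versus cluster-level contrast: aggregate an autonomy indicator to the village level and compare logistic fits. The study used multilevel logistic regression; random intercepts are omitted here for brevity, and all data and names are simulated:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n_clusters, per = 203, 16
cluster = np.repeat(np.arange(n_clusters), per)
cluster_norm = rng.normal(size=n_clusters)[cluster]  # village-level norms
autonomy = (cluster_norm + rng.normal(size=len(cluster)) > 0).astype(float)

df = pd.DataFrame({"cluster": cluster, "autonomy": autonomy})
# Facility delivery driven by the cluster-level norm, not individual autonomy:
p = 1 / (1 + np.exp(-(0.8 * cluster_norm - 0.5)))
df["facility"] = rng.binomial(1, p)
df["autonomy_cluster"] = df.groupby("cluster")["autonomy"].transform("mean")

# Compare individual-level vs cluster-level effects:
print(smf.logit("facility ~ autonomy", df).fit(disp=0).params)
print(smf.logit("facility ~ autonomy_cluster", df).fit(disp=0).params)
```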


Subject(s)
Freedom , Patient Acceptance of Health Care , Adolescent , Adult , Cluster Analysis , Female , Humans , Middle Aged , Probability , Young Adult , Zambia
14.
JMIR Res Protoc ; 5(1): e9, 2016 Jan 15.
Article in English | MEDLINE | ID: mdl-26772143

ABSTRACT

BACKGROUND: Ensuring rapid access to high quality sexual health services is a key public health objective, both in the United Kingdom and internationally. Internet-based testing services for sexually transmitted infections (STIs) are considered to be a promising way to achieve this goal. This study will evaluate a nascent online STI testing and results service in South East London, delivered alongside standard face-to-face STI testing services. OBJECTIVE: The aim of this study is to establish whether an online testing and results service can (1) increase diagnoses of STIs and (2) increase uptake of STI testing, when delivered alongside standard face-to-face STI testing services. METHODS: This is a single-blind randomized controlled trial. We will recruit 3000 participants who meet the following eligibility criteria: 16-30 years of age, resident in the London boroughs of Lambeth and Southwark, having at least one sexual partner in the last 12 months, having access to the Internet and willing to take an STI test. People unable to provide informed consent and unable to read and understand English (the websites will be in English) will be excluded. Baseline data will be collected at enrolment. This includes participant contact details, demographic data (date of birth, gender, ethnicity, and sexual orientation), and sexual health behaviors (last STI test, service used at last STI test and number of sexual partners in the last 12 months). Once enrolled, participants will be randomly allocated either (1) to an online STI testing and results service (Sexual Health 24) offering postal self-administered STI kits for chlamydia, gonorrhoea, syphilis, and HIV; results via text message (short message service, SMS), except positive results for HIV, which will be delivered by phone; and direct referrals to local clinics for treatment or (2) to a conventional sexual health information website with signposting to local clinic-based sexual health services. Participants will be free to use any other interventions or services during the trial period. At 6 weeks from randomization we will collect self-reported follow-up data on service use, STI tests and results, treatment prescribed, and acceptability of STI testing services. We will also collect objective data from participating STI testing services on uptake of STI testing, STI diagnoses and treatment. We hypothesise that uptake of STI testing and STI diagnoses will be higher in the intervention arm. Our hypothesis is based on the assumption that the intervention is less time-consuming, more convenient, more private, and incurs less stigma and embarrassment than face-to-face STI testing pathways. The primary outcome measure is diagnosis of any STI at 6 weeks from randomization and our co-primary outcome is completion of any STI test at 6 weeks from randomization. We define completion of a test as samples returned, processed, and results delivered to the intervention and/or clinic settings. We will use risk ratios to calculate the effect of the intervention on our primary outcomes with 95% confidence intervals. All analyses will be based on the intention-to-treat (ITT) principle. RESULTS: This study is funded by Guy's and St Thomas' Charity and has received ethical approval from NRES Committee London-Camberwell St Giles (Ref 14/LO/1477). Research and Development approval has been obtained from Kings College Hospital NHS Foundation Trust and Guy's and St Thomas' NHS Foundation Trust. Results are expected in June 2016.
CONCLUSIONS: This study will provide evidence on the effectiveness of an online STI testing and results service in South East London. Our findings may also be generalizable to similar populations in the United Kingdom. TRIAL REGISTRATION: International Standard Randomized Controlled Trial Number (ISRCTN): 13354298; http://www.isrctn.com/ISRCTN13354298 (Archived by WebCite at http://www.webcitation.org/6d9xT2bPj).
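A minimal sketch of the planned effect measure: a risk ratio with a Wald 95% confidence interval computed on the log scale (the counts below are invented for illustration):

```python
import numpy as np
from scipy.stats import norm

def risk_ratio(events_1, n_1, events_0, n_0, alpha=0.05):
    """Risk ratio with a Wald (1 - alpha) CI on the log scale."""
    rr = (events_1 / n_1) / (events_0 / n_0)
    se_log = np.sqrt(1/events_1 - 1/n_1 + 1/events_0 - 1/n_0)
    z = norm.ppf(1 - alpha / 2)
    lo = np.exp(np.log(rr) - z * se_log)
    hi = np.exp(np.log(rr) + z * se_log)
    return rr, (lo, hi)

# Illustrative counts: STI diagnoses in intervention vs control arms.
print(risk_ratio(90, 1500, 60, 1500))
```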

15.
Ethn Health ; 21(1): 1-19, 2016.
Article in English | MEDLINE | ID: mdl-25494665

ABSTRACT

OBJECTIVES: Research on inequalities in child pedestrian injury risk has identified some puzzling trends: although, in general, living in more affluent areas protects children from injury, this is not true for those in some minority ethnic groups. This study aimed to identify whether 'group density' effects are associated with injury risk, and whether taking these into account alters the relationship between area deprivation and injury risk. 'Group density' effects exist when ethnic minorities living in an area with a higher proportion of people from a similar ethnic group enjoy better health than those who live in areas with a lower proportion, even though areas with dense minority ethnic populations can be relatively more materially disadvantaged. DESIGN: This study utilised variation in minority ethnic densities in London between two census periods to identify any associations between group density and injury risk. Using police data on road traffic injury and population census data from 2001 to 2011, the numbers of 'White,' 'Asian' and 'Black' child pedestrian injuries in an area were modelled as a function of the percentage of the population in that area that is 'White,' 'Asian' or 'Black,' controlling for socio-economic disadvantage and characteristics of the road environment. RESULTS: There was strong evidence (p < 0.001) of a negative association between 'Black' population density and 'Black' child pedestrian injury risk [incidence rate ratio (IRR) 0.575, 95% CI 0.515-0.642]. There was weak evidence (p = 0.083) of a negative association between 'Asian' density and 'Asian' child pedestrian injury risk (IRR 0.901, 95% CI 0.801-1.014) and no evidence (p = 0.412) of an association between 'White' density and 'White' child pedestrian injury risk (IRR 1.075, 95% CI 0.904-1.279). When group density effects are taken into account, area deprivation is associated with injury risk in all ethnic groups. CONCLUSIONS: Group density appears to protect 'Black' children living in London against pedestrian injury risk. These findings suggest that future research should focus on structural properties of societies to explain the relationships between minority ethnicity and risk.
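A minimal sketch of the modelling approach: injury counts as a Poisson GLM with log child population as an offset, so exponentiated coefficients are incidence rate ratios (IRRs). The data, covariates and the protective group-density effect below are simulated, not the study's:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
areas = 600
pct_group = rng.uniform(0, 60, areas)      # % same-ethnic-group population
deprivation = rng.normal(size=areas)
pop = rng.integers(500, 5000, areas)       # child population at risk

# Simulate injuries with a protective group-density effect (IRR < 1 per
# percentage point) and a positive deprivation effect:
log_mu = np.log(pop) - 6 - 0.01 * pct_group + 0.2 * deprivation
y = rng.poisson(np.exp(log_mu))

X = sm.add_constant(np.column_stack([pct_group, deprivation]))
fit = sm.GLM(y, X, family=sm.families.Poisson(), offset=np.log(pop)).fit()
print(np.exp(fit.params))   # exponentiated coefficients are the IRRs
```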


Subject(s)
Accidents, Traffic/statistics & numerical data , Ethnicity , Pedestrians , Population Density , Wounds and Injuries/ethnology , Adolescent , Asian People , Black People , Child , Child, Preschool , Female , Humans , London , Male , Risk Factors , Socioeconomic Factors , Walking
16.
Stat Methods Med Res ; 25(4): 1661-76, 2016 08.
Article in English | MEDLINE | ID: mdl-23868542

ABSTRACT

We combine conjugate and normal random effects in a joint model for outcomes, at least one of which is non-Gaussian, with particular emphasis on cases in which one of the outcomes is of survival type. Conjugate random effects are used to relax the often-restrictive mean-variance prescription in the non-Gaussian outcome, while normal random effects account for not only the correlation induced by repeated measurements from the same subject but also the association between the different outcomes. Using a case study in chronic heart failure, we show that switching to our extended framework can improve model fit, even to the point of changing the conclusions of significance tests. Because the conjugate random effects can be integrated out analytically, the model is straightforward to estimate by maximum likelihood in standard software.


Subject(s)
Likelihood Functions , Software , Heart Failure , Humans
17.
Stata J ; 16(2): 443-463, 2016 Apr.
Article in English | MEDLINE | ID: mdl-29398978

ABSTRACT

Randomized controlled trials provide essential evidence for the evaluation of new and existing medical treatments. Unfortunately, the statistical analysis is often complicated by the occurrence of protocol deviations, which mean we cannot always measure the intended outcomes for individuals who deviate, resulting in a missing-data problem. In such settings, however one approaches the analysis, an untestable assumption about the distribution of the unobserved data must be made. To understand how far the results depend on these assumptions, the primary analysis should be supplemented by a range of sensitivity analyses, which explore how the conclusions vary over a range of different credible assumptions for the missing data. In this article, we describe a new command, mimix, that can be used to perform reference-based sensitivity analyses for randomized controlled trials with longitudinal quantitative outcome data, using the approach proposed by Carpenter, Roger, and Kenward (2013, Journal of Biopharmaceutical Statistics 23: 1352-1371). Under this approach, we make qualitative assumptions about how individuals' missing outcomes relate to those observed in relevant groups in the trial, based on plausible clinical scenarios. Statistical analysis then proceeds using the method of multiple imputation.

18.
Stat Biosci ; 7(2): 187-205, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26478751

ABSTRACT

Group sequential trials are one important instance of studies for which the sample size is not fixed a priori but rather takes one of a finite set of pre-specified values, dependent on the observed data. Much work has been devoted to the inferential consequences of this design feature. Molenberghs et al (2012) and Milanzi et al (2012) reviewed and extended the existing literature, focusing on a collection of seemingly disparate, but related, settings, namely completely random sample sizes, group sequential studies with deterministic and random stopping rules, incomplete data, and random cluster sizes. They showed that the ordinary sample average is a viable option for estimation following a group sequential trial, for a wide class of stopping rules and for random outcomes with a distribution in the exponential family. Their results are somewhat surprising in the sense that the sample average is not optimal, and further, there does not exist an optimal, or even unbiased, linear estimator. However, the sample average is asymptotically unbiased, both conditionally upon the observed sample size and marginalized over it. By exploiting ignorability they showed that the sample average is the conventional maximum likelihood estimator. They also showed that a conditional maximum likelihood estimator is finite-sample unbiased, but is less efficient than the sample average and has a larger mean squared error. Asymptotically, the sample average and the conditional maximum likelihood estimator are equivalent. This previous work is restricted, however, to the situation in which the random sample size can take only two values, N = n or N = 2n. In this paper, we consider the more practically useful setting of sample sizes in the finite set {n1, n2, …, nL}. It is shown that the sample average is then a justifiable estimator, in the sense that it follows from joint likelihood estimation, and it is consistent and asymptotically unbiased. We also show why simulations can give the false impression of bias in the sample average when considered conditional upon the sample size. The consequence is that no corrections need to be made to estimators following sequential trials. When small-sample bias is of concern, the conditional likelihood estimator provides a relatively straightforward modification to the sample average. Finally, it is shown that classical likelihood-based standard errors and confidence intervals can be applied, obviating the need for technical corrections.

19.
Epidemiology ; 26(6): 839-45, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26247488

ABSTRACT

BACKGROUND: In some common episodic conditions, such as diarrhea, respiratory infections, or fever, episode duration can reflect disease severity. The mean episode duration in a population can be estimated if both the incidence and prevalence of the condition are known. In this article, we discuss how an estimator of the average episode duration may be obtained based on prevalence alone if data are collected for two consecutive units of time (usually days) in the same person. METHODS: We derive a maximum likelihood estimator of episode duration, explore its behavior through a simulation study, and illustrate its use through a real example. RESULTS: We show that for two consecutive days, the estimator of the mean episode duration in a population equals one plus twice the ratio of the number of subjects with the condition on both days to the number of subjects ill on only one of the two days. The estimator can be extended to account for 3 or 4 consecutive days. The estimator assumes nonoverlapping episodes and a time-constant incidence rate and is more precise for shorter than for longer average episode durations. CONCLUSION: The proposed method allows estimation of the mean duration of disease episodes in cross-sectional studies and is applicable to large demographic and health surveys in low-income settings that routinely collect data on diarrhea and respiratory illness. The method may further be used to calculate the duration of infectiousness if test results are available for two consecutive days, such as paired throat swabs for influenza.
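The two-day estimator quoted in the results translates directly into code; a minimal sketch with invented counts:

```python
def mean_episode_duration_2day(n_both, n_one):
    """Two-consecutive-day prevalence estimator of mean episode duration:
    1 + 2 * (subjects ill on both days) / (subjects ill on exactly one day)."""
    return 1 + 2 * n_both / n_one

# Illustrative survey: 120 respondents ill on both days, 80 on exactly one day.
print(mean_episode_duration_2day(120, 80))   # -> 4.0 days
```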


Subject(s)
Diarrhea/epidemiology , Severity of Illness Index , Time Factors , Cross-Sectional Studies , Humans , Incidence , Likelihood Functions , Prevalence
20.
Stat Methods Med Res ; 24(4): 399-402, 2015 Aug.
Article in English | MEDLINE | ID: mdl-26206561