Results 1 - 20 of 69
1.
Stat Methods Med Res ; : 9622802241247725, 2024 Apr 27.
Article in English | MEDLINE | ID: mdl-38676359

ABSTRACT

This article proposes a Bayesian approach for jointly estimating marginal conditional quantiles of multi-response longitudinal data with a multivariate mixed-effects model. The multivariate asymmetric Laplace distribution is employed to construct the working likelihood of the considered model. Penalization priors on regression parameters are incorporated into the working likelihood to conduct Bayesian high-dimensional inference. A Markov chain Monte Carlo algorithm is used to sample from the full conditional posterior distributions of all parameters and latent variables. Monte Carlo simulations are conducted to evaluate the finite-sample performance of the proposed joint quantile regression approach. Finally, we analyze a longitudinal medical dataset from the primary biliary cirrhosis sequential cohort study to illustrate a real application of the proposed modeling method.
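
As background for the working likelihood mentioned in this abstract, the following is a minimal univariate sketch, not the multivariate model of the article. With a fixed scale, maximizing an asymmetric Laplace likelihood is equivalent to minimizing the check loss rho_tau(u) = u * (tau - 1{u < 0}), and the constant that minimizes the summed check loss is a sample tau-quantile. The function names `check_loss` and `best_constant` are illustrative assumptions.

```python
def check_loss(u, tau):
    """Quantile check loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def best_constant(ys, tau):
    """Return the observed value that minimizes the total check loss.

    Searching over the observed values is enough here because a sample
    quantile always coincides with one of the data points.
    """
    return min(ys, key=lambda c: sum(check_loss(y - c, tau) for y in ys))


data = [1, 2, 3, 4, 100]
median_fit = best_constant(data, 0.5)   # minimizer for tau = 0.5
upper_fit = best_constant(data, 0.9)    # minimizer for tau = 0.9
```

The outlier 100 barely affects the tau = 0.5 fit, which is one reason quantile-based models are often used for skewed longitudinal responses.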

2.
Article in English | MEDLINE | ID: mdl-38409814

ABSTRACT

A sufficient number of participants should be included to adequately address the research interest in surveys with sensitive questions. In this paper, sample size formulas/iterative algorithms are developed from the perspective of controlling the confidence interval width of the prevalence of a sensitive attribute under four non-randomized response models: the crosswise model, parallel model, Poisson item count technique model and negative binomial item count technique model. In contrast to the conventional approach for sample size determination, our sample size formulas/algorithms explicitly incorporate an assurance probability of controlling the width of a confidence interval within the pre-specified range. The performance of the proposed methods is evaluated with respect to the empirical coverage probability, empirical assurance probability and confidence interval width. Simulation results show that all formulas/algorithms are effective and hence are recommended for practical applications. A real example is used to illustrate the proposed methods.
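
To make the crosswise model concrete, here is a hedged sketch of the standard moment estimator and a Wald interval; it is not the assurance-probability procedure of the article. It assumes each respondent reports only whether the answers to the sensitive question and to an unrelated question with known prevalence p (p != 0.5) are the same, so that P(same) = pi * p + (1 - pi) * (1 - p). The function name `crosswise_estimate` and its parameters are hypothetical.

```python
import math
from statistics import NormalDist


def crosswise_estimate(n_same, n, p, level=0.95):
    """Moment estimate and Wald interval for the sensitive prevalence pi.

    n_same: number of respondents reporting "both answers are the same"
    n:      total number of respondents
    p:      known prevalence of the non-sensitive attribute (must not be 0.5)
    """
    if p == 0.5:
        raise ValueError("p must differ from 0.5, otherwise pi is not identified")
    lam_hat = n_same / n                                   # observed share of "same"
    pi_hat = (lam_hat + p - 1.0) / (2.0 * p - 1.0)         # invert P(same)
    se = math.sqrt(lam_hat * (1.0 - lam_hat) / n) / abs(2.0 * p - 1.0)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return pi_hat, (pi_hat - z * se, pi_hat + z * se)


pi_hat, (lower, upper) = crosswise_estimate(n_same=400, n=1000, p=0.25)
```

The interval width grows as p approaches 0.5, which is why the required sample size depends strongly on the chosen non-sensitive question.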

3.
Stat Med ; 42(26): 4794-4823, 2023 Nov 20.
Article in English | MEDLINE | ID: mdl-37652405

ABSTRACT

In spatio-temporal epidemiological analysis, it is critically important to identify the significant covariates and estimate their time-varying effects on the health outcome. Due to the heterogeneity of spatio-temporal data, the subsets of important covariates may vary across space and the temporal trends of covariate effects may differ locally. However, many spatial models neglect these potential local variation patterns, leading to inappropriate inference. Thus, this article proposes a flexible Bayesian hierarchical model to simultaneously identify spatial clusters of regression coefficients with common temporal trends, select significant covariates for each spatial group by introducing binary entry parameters, and estimate spatio-temporally varying disease risks. A multistage strategy is employed to reduce the confounding bias caused by spatially structured random components. A simulation study shows that the proposed method outperforms several alternatives under different assessment criteria. The methodology is motivated by two important case studies. The first concerns low birth weight incidence data in 159 counties of Georgia, USA, for the years 2007 to 2018 and investigates the time-varying effects of potential contributing covariates in different cluster regions. The second concerns circulatory disease risks across 323 local authorities in England over 10 years and explores the underlying spatial clusters and associated important risk factors.

4.
PLoS One ; 18(1): e0279918, 2023.
Article in English | MEDLINE | ID: mdl-36649269

ABSTRACT

One of the main concerns in multidimensional item response theory (MIRT) is to detect the relationship between observed items and latent traits, which is typically addressed by exploratory analysis and factor rotation techniques. Recently, an EM-based L1-penalized log-likelihood method (EML1) was proposed as a viable alternative to factor rotation. Based on the observed test response data, EML1 can yield a sparse and interpretable estimate of the loading matrix. However, EML1 suffers from a high computational burden. In this paper, we use the coordinate descent algorithm to optimize a new weighted log-likelihood, and consequently propose an improved EML1 (IEML1) that is more than 30 times faster than EML1. The performance of IEML1 is evaluated through simulation studies, and an application to a real data set related to the Eysenck Personality Questionnaire is used to demonstrate our methodology.
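
The role of coordinate descent under an L1 penalty can be illustrated with a deliberately simplified stand-in: a lasso for least squares rather than the weighted log-likelihood of the article. Each coordinate update is a soft-thresholding step, which is what produces exact zeros in the estimate. The names `soft_threshold` and `lasso_cd` are hypothetical, and the sketch assumes every column of X has at least one non-zero entry.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, sweeps=100):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            rho = 0.0    # correlation of column j with the partial residual
            norm = 0.0   # squared norm of column j
            for i in range(n):
                others = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - others)
                norm += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (norm / n)
    return beta


# Two orthogonal columns: the first carries signal, the second only noise.
X = [[1, 0], [0, 1], [1, 0], [0, 1]]
y = [2.0, 0.1, 2.0, 0.1]
beta = lasso_cd(X, y, lam=0.1)
```

In this toy design the weak second coefficient is set exactly to zero, mirroring how an L1 penalty yields a sparse loading matrix.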


Subjects
Statistical Models , Motivation , Logistic Models , Algorithms , Computer Simulation
5.
J Appl Stat ; 49(10): 2629-2656, 2022.
Article in English | MEDLINE | ID: mdl-35757045

ABSTRACT

In this paper, we propose a new kind of multivariate t distribution by allowing different degrees of freedom for each univariate component. Compared with the classical multivariate t distribution, it is more flexible in model specification and can accommodate varying tail weights across the marginals in multivariate data modeling. In particular, it can include components following the multivariate normal distribution, and it contains the product of independent t-distributions as a special case. Subsequently, it is extended to the regression model as the joint distribution of the error terms. Important distributional properties are explored and useful statistical methods are developed. The flexibility of the specified structure in better capturing the characteristics of the data is exemplified by both simulation studies and real data analyses.

6.
Transbound Emerg Dis ; 69(5): e2731-e2744, 2022 Sep.
Article in English | MEDLINE | ID: mdl-35751843

ABSTRACT

The transmission of the coronavirus disease 2019 (COVID-19) epidemic is a global emergency, which is worsened by the genetic mutations of SARS-CoV-2. However, to date, few statistical studies have examined COVID-19 spread patterns in terms of variant cases. Hence, this paper aims to explore the risk factors associated with the Delta variant, the most contagious strain of COVID-19. The study collected state-level COVID-19 Delta variant cases in the United States during a 12-week period and included potential environmental, socioeconomic, and public prevention factors as independent variables. Instead of regarding the covariate effects as constant, this paper proposes a flexible Bayesian hierarchical model with spatio-temporally varying coefficients to account for data heterogeneity. The method enables us to cluster the states into distinct groups based on the temporal trends of the coefficients and simultaneously identify significant risk factors for each cluster. The findings contribute novel insight into the dynamics of covariate effects on the COVID-19 Delta variant over space and time, which could help the government develop targeted prevention measures for vulnerable regions based on the selected risk factors.


Subjects
COVID-19 , Animals , Bayes Theorem , COVID-19/epidemiology , COVID-19/veterinary , Risk Factors , SARS-CoV-2/genetics , Spatio-Temporal Analysis , United States/epidemiology
7.
J Biopharm Stat ; 32(6): 871-896, 2022 Nov 02.
Article in English | MEDLINE | ID: mdl-35536693

ABSTRACT

This article investigates the confidence interval (CI) construction of proportion difference for two independent partially validated series under the double-sampling scheme in which both classifiers are fallible. Several CIs based on the variance estimates recovery method of combining confidence limits from asymptotic, bootstrap, and Bayesian methods for two independent binomial proportions are developed under two models. Simulation results show that all CIs except for the bootstrap percentile-t CI and Bayesian credible interval with uniform prior under the independence model and all CIs under the dependence model generally perform well and are recommended. Two examples are used to illustrate the methodologies.


Subjects
Statistical Models , Humans , Bayes Theorem , Confidence Intervals , Computer Simulation
8.
Psychometrika ; 87(4): 1361-1389, 2022 Dec.
Article in English | MEDLINE | ID: mdl-35306631

ABSTRACT

Studies with sensitive questions should include a sufficient number of respondents to adequately address the research interest. While studies with an inadequate number of respondents may not yield significant conclusions, studies with an excess of respondents waste investigators' budgets. Therefore, determining the required number of participants is an important step in survey sampling. In this article, we derive sample size formulas based on confidence interval estimation of prevalence for four randomized response models, namely, Warner's randomized response model, the unrelated question model, the item count technique model and the cheater detection model. Specifically, our sample size formulas control, with a given assurance probability, the width of a confidence interval within the planned range. Simulation results demonstrate that all formulas are accurate in terms of empirical coverage probabilities and empirical assurance probabilities. All formulas are illustrated using a real-life application about the use of unethical tactics in negotiation.
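
For orientation, the sketch below shows the conventional width-based sample size calculation under Warner's model, without the assurance probability that the article adds. It assumes each respondent is directed with known probability p (p != 0.5) to the sensitive statement and otherwise to its negation, so that P(yes) = p * pi + (1 - p) * (1 - pi), and it plugs a planning guess for pi into the Wald variance. The function name `warner_sample_size` and its arguments are hypothetical.

```python
import math
from statistics import NormalDist


def warner_sample_size(pi_guess, p, half_width, level=0.95):
    """Smallest n whose expected Wald interval for pi has the given half-width.

    pi_guess:   planning value for the sensitive prevalence
    p:          design probability of receiving the sensitive statement
    half_width: target half-width of the confidence interval
    """
    if p == 0.5:
        raise ValueError("p must differ from 0.5, otherwise pi is not identified")
    lam = p * pi_guess + (1.0 - p) * (1.0 - pi_guess)   # P(answer is "yes")
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    n = z ** 2 * lam * (1.0 - lam) / ((2.0 * p - 1.0) ** 2 * half_width ** 2)
    return math.ceil(n)


n_required = warner_sample_size(pi_guess=0.3, p=0.7, half_width=0.05)
```

Because this calculation fixes only the expected width, the realized interval can still be wider than planned in a given sample; controlling that risk is the purpose of an assurance probability.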


Subjects
Statistical Models , Humans , Sample Size , Prevalence , Psychometrics , Probability , Computer Simulation , Confidence Intervals
9.
Stat Med ; 41(5): 932, 2022 Feb 28.
Article in English | MEDLINE | ID: mdl-35194816
10.
Biom J ; 64(4): 714-732, 2022 Apr.
Article in English | MEDLINE | ID: mdl-34914842

ABSTRACT

Zeros in compositional data are very common and can be classified into rounded and essential zeros. A rounded zero refers to a small proportion or a value below the detection limit, while an essential zero refers to the complete absence of the component in the composition. In this article, we propose a new framework for analyzing compositional data with zero entries by introducing a stochastic representation. In particular, a new distribution, namely the Dirichlet composition distribution, is developed to accommodate the possible essential-zero feature in compositional data. We derive its distributional properties (e.g., its moments). Maximum likelihood estimates are calculated via the Expectation-Maximization (EM) algorithm. A regression model based on the new Dirichlet composition distribution is also considered. Simulation studies are conducted to evaluate the performance of the proposed methodologies. Finally, our method is employed to analyze a dataset of fluorescence in situ hybridization (FISH) for chromosome detection.


Subjects
Algorithms , Chromosomes , Computer Simulation , Fluorescence In Situ Hybridization , Likelihood Functions , Poisson Distribution
11.
Br J Math Stat Psychol ; 75(2): 363-394, 2022 May.
Article in English | MEDLINE | ID: mdl-34918834

ABSTRACT

The aim of latent variable selection in multidimensional item response theory (MIRT) models is to identify latent traits probed by test items of a multidimensional test. In this paper the expectation model selection (EMS) algorithm proposed by Jiang et al. (2015) is applied to minimize the Bayesian information criterion (BIC) for latent variable selection in MIRT models with a known number of latent traits. Under mild assumptions, we prove the numerical convergence of the EMS algorithm for model selection by minimizing the BIC of observed data in the presence of missing data. For the identification of MIRT models, we assume that the variances of all latent traits are unity and each latent trait has an item that is only related to it. Under this identifiability assumption, the convergence of the EMS algorithm for latent variable selection in the multidimensional two-parameter logistic (M2PL) models can be verified. We give an efficient implementation of the EMS for the M2PL models. Simulation studies show that the EMS outperforms the EM-based L1 regularization in terms of correctly selected latent variables and computation time. The EMS algorithm is applied to a real data set related to the Eysenck Personality Questionnaire.


Subjects
Algorithms , Motivation , Bayes Theorem , Computer Simulation
12.
Article in English | MEDLINE | ID: mdl-33477576

ABSTRACT

With the rapid spread of the coronavirus disease 2019 (COVID-19) pandemic, the virus has already led to considerable mortality and morbidity worldwide, as well as having a severe impact on economic development. In this article, we analyze the state-level correlation between COVID-19 risk and weather/climate factors in the USA. For this purpose, we consider a spatio-temporal multivariate time series model under a hierarchical framework, which is especially suitable for describing the virus transmission tendency across a geographic area over time. Briefly, our model decomposes the COVID-19 risk into: (i) an autoregressive component that describes the within-state COVID-19 risk effect; (ii) a spatio-temporal component that describes the across-state COVID-19 risk effect; (iii) an exogenous component that includes other factors (e.g., weather/climate) that could help anticipate the risk of future epidemic development; and (iv) an endemic component that captures the function of time and other predictors mainly for individual states. Our results indicate that maximum temperature, minimum temperature, humidity, the percentage of cloud coverage, and the columnar density of total atmospheric ozone have a strong association with the COVID-19 pandemic in many states. In particular, the maximum temperature, minimum temperature, and the columnar density of total atmospheric ozone demonstrate statistically significant associations with the tendency of COVID-19 spreading in almost all states. Furthermore, our results from the transmission tendency analysis suggest that community-level transmission has been relatively mitigated in the USA, and that the daily confirmed cases within a state are driven predominantly by the earlier daily confirmed cases within that state rather than by other factors, which implies that states such as Texas, California, and Florida with a large number of confirmed cases still need strategies like stay-at-home orders to prevent another outbreak.


Subjects
COVID-19/epidemiology , Pandemics , Weather , COVID-19/transmission , California , Florida , Humans , Theoretical Models , Ozone , Risk Factors , Spatio-Temporal Analysis , Texas , United States/epidemiology
13.
Stat Med ; 40(1): 119-132, 2021 Jan 15.
Article in English | MEDLINE | ID: mdl-33015853

ABSTRACT

In this article, we develop a so-called profile likelihood ratio test (PLRT) based on the estimated error density for the multiple linear regression model. Unlike the existing likelihood ratio test (LRT), our proposed PLRT does not require any specification of the error distribution. The asymptotic properties are developed and the Wilks phenomenon is studied. Simulation studies are conducted to examine the performance of the PLRT. It is observed that our proposed PLRT generally outperforms the existing LRT, the empirical likelihood ratio test and the weighted profile likelihood ratio test in the sense that (i) its type I error rates are closer to the prespecified nominal level; (ii) it generally has higher power; (iii) it performs satisfactorily when moments of the error do not exist (eg, Cauchy distribution); and (iv) it has a higher probability of selecting the correct model in the multiple testing problem. A mammalian eye gene expression dataset and a concrete compressive strength dataset are analyzed to illustrate our methodologies.


Subjects
Likelihood Functions , Computer Simulation , Humans , Linear Models
14.
Stat Methods Med Res ; 30(1): 129-150, 2021 Jan.
Article in English | MEDLINE | ID: mdl-32746735

ABSTRACT

In this paper, we consider variable selection for ultra-high dimensional quantile regression model with missing data and measurement errors in covariates. Specifically, we correct the bias in the loss function caused by measurement error by applying the orthogonal quantile regression approach and remove the bias caused by missing data using the inverse probability weighting. A nonconvex Atan penalized estimation method is proposed for simultaneous variable selection and estimation. With the proper choice of the regularization parameter and under some relaxed conditions, we show that the proposed estimate enjoys the oracle properties. The choice of smoothing parameters is also discussed. The performance of the proposed variable selection procedure is assessed by Monte Carlo simulation studies. We further demonstrate the proposed procedure with a breast cancer data set.


Subjects
Breast Neoplasms , Bias , Computer Simulation , Female , Humans , Monte Carlo Method , Probability
15.
Stat Med ; 39(29): 4480-4498, 2020 Dec 20.
Article in English | MEDLINE | ID: mdl-32909318

ABSTRACT

The Poisson item count technique (PICT) is a survey method that was recently developed to elicit respondents' truthful answers to sensitive questions. It simplifies the well-known item count technique (ICT) by replacing a list of independent innocuous questions in known proportions with a single innocuous counting question. However, ICT and PICT both rely on the strong "no design effect" assumption (ie, respondents give the same answers to the innocuous items regardless of the absence or presence of the sensitive item in the list) and the "no liar" assumption (ie, all respondents give truthful answers). To address the problem of self-protective behavior and provide more reliable analyses, we introduced a noncompliance parameter into the existing PICT. Based on the survey design of PICT, we considered more practical model assumptions and developed the corresponding statistical inferences. Simulation studies were conducted to evaluate the performance of our method. Finally, a real example of automobile insurance fraud was used to demonstrate our method.
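
The baseline PICT design that this abstract extends can be sketched as a difference in means, under the "no design effect" and "no liar" assumptions and without the noncompliance parameter introduced in the article. The sketch assumes a control group that reports only a Poisson-type innocuous count and a treatment group that reports that count plus 1 when the sensitive attribute is present. The function name `pict_estimate` and the toy data are hypothetical.

```python
import math
from statistics import mean, variance


def pict_estimate(treatment_counts, control_counts):
    """Difference-in-means estimate of the sensitive prevalence and its standard error.

    treatment_counts: reported totals (innocuous count + sensitive indicator)
    control_counts:   reported innocuous counts only
    Each group needs at least two observations for the sample variance.
    """
    pi_hat = mean(treatment_counts) - mean(control_counts)
    se = math.sqrt(
        variance(treatment_counts) / len(treatment_counts)
        + variance(control_counts) / len(control_counts)
    )
    return pi_hat, se


# Toy data for illustration only, not taken from any study.
pi_hat, se = pict_estimate([3, 2, 4, 3], [2, 2, 3, 3])
```

If some respondents with the sensitive attribute protect themselves by not adding 1, this difference underestimates the prevalence, which motivates modeling noncompliance explicitly.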


Subjects
Patient Compliance , Research Design , Computer Simulation , Humans , Surveys and Questionnaires
16.
Comb Chem High Throughput Screen ; 23(8): 740-756, 2020.
Article in English | MEDLINE | ID: mdl-32342803

ABSTRACT

AIM AND OBJECTIVE: Near Infrared (NIR) spectroscopy data are characterized by a few dozen to many thousands of samples and highly correlated variables. Quantitative analysis of such data usually requires a combination of analytical methods with variable selection or screening methods. Commonly used variable screening methods fail to recover the true model when (i) some of the variables are highly correlated, and (ii) the sample size is less than the number of relevant variables. In these cases, Partial Least Squares (PLS) regression based approaches can be useful alternatives. MATERIALS AND METHODS: In this research, a fast variable screening strategy, namely the preconditioned screening for ridge partial least squares regression (PSRPLS), is proposed for modelling NIR spectroscopy data with high-dimensional and highly correlated covariates. Under rather mild assumptions, we prove that, using the Puffer transformation, the proposed approach successfully transforms the problem of variable screening with highly correlated predictor variables into one with weakly correlated covariates at little extra computational cost. RESULTS: We show that our proposed method leads to theoretically consistent model selection results. Four simulation studies and two real examples are then analyzed to illustrate the effectiveness of the proposed approach. CONCLUSION: By introducing the Puffer transformation, the high-correlation problem can be mitigated using the PSRPLS procedure we construct. By employing RPLS regression, our approach becomes simpler and more computationally efficient in coping with the situation where the model size is larger than the sample size, while maintaining high prediction precision.


Subjects
Soil/chemistry , Near-Infrared Spectroscopy/methods , Computer Simulation , Chemical Databases , Least-Squares Analysis , Theoretical Models , Monte Carlo Method
17.
J Appl Stat ; 47(8): 1375-1401, 2020.
Article in English | MEDLINE | ID: mdl-35706696

ABSTRACT

Disease prevalence can be estimated by classifying subjects according to whether they have the disease. When gold-standard tests are too expensive to be applied to all subjects, partially validated data can be obtained by double sampling, in which all individuals are classified by a fallible classifier and some individuals are validated by the gold-standard classifier. However, it can happen in practice that such an infallible classifier is not available. In this article, we consider two models in which both classifiers are fallible and propose four asymptotic test procedures for comparing disease prevalence in two groups. Corresponding sample size formulae and the validated ratio given the total sample sizes are also derived and evaluated. Simulation results show that (i) the score test performs well and the corresponding sample size formula is also accurate in terms of the empirical power and size in the two models; (ii) the Wald test based on the variance estimator with parameters estimated under the null hypothesis outperforms the others even under small sample sizes in Model II, and the sample size estimated by this test is also accurate; (iii) the estimated validated ratios based on all tests are accurate. The malarial data are used to illustrate the proposed methodologies.

18.
Stat Med ; 38(23): 4670-4685, 2019 Oct 15.
Article in English | MEDLINE | ID: mdl-31359443

ABSTRACT

The proportional subdistribution hazard regression model has been widely used by clinical researchers for analyzing competing risks data. It is well known that quantile regression provides a more comprehensive alternative to model how covariates influence not only the location but also the entire conditional distribution. In this paper, we develop variable selection procedures based on penalized estimating equations for competing risks quantile regression. Asymptotic properties of the proposed estimators including consistency and oracle properties are established. Monte Carlo simulation studies are conducted, confirming that the proposed methods are efficient. A bone marrow transplant data set is analyzed to demonstrate our methodologies.


Subjects
Proportional Hazards Models , Bone Marrow Transplantation , Computer Simulation , Humans , Acute Myeloid Leukemia/therapy , Monte Carlo Method
19.
J Biopharm Stat ; 29(3): 446-467, 2019.
Article in English | MEDLINE | ID: mdl-30933654

ABSTRACT

A stratified study is often designed to adjust for a confounding effect or for the effect of different centers/groups when comparing two treatments or diagnostic tests, and the risk difference is one of the most frequently used indices for comparing efficiency between two treatments or diagnostic tests. This article presents five simultaneous confidence intervals (CIs) for risk differences in stratified bilateral designs accounting for the intraclass correlation and develops seven CIs for the common risk difference under the homogeneity assumption. The performance of the CIs is evaluated with respect to the empirical coverage probabilities, empirical coverage widths and ratios of the mesial noncoverage probability to the noncoverage probability under various scenarios. Empirical results show that the Wald simultaneous CI, the Haldane simultaneous CI, the score simultaneous CI based on the Bonferroni method and the simultaneous CI based on the bootstrap-resampling method perform satisfactorily and hence are recommended for applications. For the common risk difference, the CI based on the weighted-least-square (WLS) estimator, the CIs based on the Mantel-Haenszel estimator, the CI based on the Cochran statistic and the CI based on the score statistic behave well even under small sample sizes. A real data example is used to demonstrate the proposed methodologies.


Subjects
Confidence Intervals , Statistical Models , Randomized Controlled Trials as Topic/methods , Randomized Controlled Trials as Topic/statistics & numerical data , Research Design/statistics & numerical data , Computer Simulation , Humans , Least-Squares Analysis , Probability , Risk , Sample Size
20.
Stat Methods Med Res ; 28(4): 1019-1043, 2019 Apr.
Article in English | MEDLINE | ID: mdl-29233082

ABSTRACT

Double sampling is usually applied to collect necessary information in situations where an infallible classifier is available for validating a subset of the sample that has already been classified by a fallible classifier. Inference procedures have previously been developed based on the partially validated data obtained by the double-sampling process. However, it can happen in practice that such an infallible classifier or gold standard does not exist. In this article, we consider the case in which both classifiers are fallible and propose asymptotic and approximate unconditional test procedures based on six test statistics for a population proportion and five approximate sample size formulas based on the recommended test procedures under two models. Our results suggest that both asymptotic and approximate unconditional procedures based on the score statistic perform satisfactorily for small to large sample sizes and are highly recommended. When the sample size is moderate or large, asymptotic procedures based on the Wald statistic with the variance being estimated under the null hypothesis, the likelihood ratio statistic, and the log- and logit-transformation statistics based on both models generally perform well and are hence recommended. The approximate unconditional procedures based on the log-transformation statistic under Model I, and on the Wald statistic with the variance being estimated under the null hypothesis and the log- and logit-transformation statistics under Model II, are recommended when the sample size is small. In general, sample size formulae based on the Wald statistic with the variance being estimated under the null hypothesis, the likelihood ratio statistic and the score statistic are recommended in practical applications. The applicability of the proposed methods is illustrated by a real-data example.


Subjects
Statistical Models , Sampling Studies , Algorithms , Humans , Likelihood Functions , Norway , Sample Size