ABSTRACT
Ignoring dependent censoring in data analysis can lead to biased estimates; for example, failing to account for abandonment of tuberculosis treatment may distort inferences about the cure probability. To assess the relationship between the cure and abandonment outcomes, we propose a Bayesian copula approach. The main objective of this work is to introduce a Bayesian survival regression model that accounts for dependent censoring in the fit. The proposed approach uses Clayton's copula to model the relationship between survival and dependent censoring times, with Weibull and piecewise exponential marginal distributions fitted to the times. A simulation study compares different dependence scenarios and different prior specifications, and also compares the approach with maximum likelihood inference. Finally, we apply the proposed approach to a tuberculosis treatment adherence dataset from an HIV cohort in Alvorada-RS, Brazil. Results show that the cure and abandonment outcomes are negatively correlated; that is, as the chance of abandoning treatment increases, the chance of tuberculosis cure decreases.
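As a rough illustration of how a Clayton copula can link a survival time and a dependent censoring time, the sketch below evaluates the joint survival probability from two Weibull marginals. It is not the authors' implementation: the function names, the parameterization, and all numeric values are assumptions made for this example, and the Bayesian regression structure and the piecewise exponential marginals are left out.

```python
import math


def weibull_survival(t, shape, scale):
    """Weibull survival function S(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


def clayton_joint_survival(s_t, s_c, theta):
    """Clayton copula applied to two marginal survival probabilities.

    theta > 0 controls the strength of dependence; as theta approaches 0
    the result approaches the independence product s_t * s_c.
    """
    if theta <= 0:
        raise ValueError("theta must be positive")
    return (s_t ** (-theta) + s_c ** (-theta) - 1.0) ** (-1.0 / theta)


def kendall_tau(theta):
    """Kendall's tau implied by the Clayton parameter theta."""
    return theta / (theta + 2.0)


# Hypothetical parameter values, chosen only for illustration.
s_cure = weibull_survival(6.0, shape=1.5, scale=8.0)
s_abandon = weibull_survival(6.0, shape=1.0, scale=12.0)
joint = clayton_joint_survival(s_cure, s_abandon, theta=2.0)
```

Here `joint` is the probability that both times exceed 6 time units under the assumed parameters; it lies between the independence product and the smaller of the two marginal probabilities.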
Subject(s)
Treatment Adherence and Compliance , Tuberculosis , Humans , Bayes Theorem , Brazil , Computer Simulation , Tuberculosis/drug therapy
ABSTRACT
Studies in seed science serve a wide range of purposes, and statistical analysis of the data is essential for experimental reliability and evidence. Given the characteristics of seed data, several statistical methods can be applied; among them, survival analysis stands out because it accommodates censored data and describes phenomena over time. This bibliometric study therefore aimed to verify the use of survival analysis in seed germination studies and to examine its applications in original articles from the Web of Science database for the period from 2000 to 2020. Few publications in seed science applied survival analysis over these years; the USA was the country with the highest number of publications, with research mainly related to plant ecology and physiology. In general, the studies focused on evaluating factors that influence dormancy, physiological stresses, dispersal capacity, population differences, and development habitats that affected seed germination. The qualitative overview thus shows that survival analysis is a statistical tool of great potential for studies in the area.
Subject(s)
Seeds , Bibliometrics , Germination , Plant Development
ABSTRACT
Anglo-Nubian goats are hardy animals that adapt well to tropical environments, where they are usually raised on pasture. However, the climatic conditions of these environments also favor endoparasitism, which is detrimental if sensitivity to worm infections shortens the reproductive lifespan of females, since their length of stay in the herd is a trait of great importance for efficient goat farming. This study therefore uses survival analysis with the Cox-Gompertz proportional hazards model to assess length of stay in the herd, contrasting the removal of females due to death, with worm infection as the main cause, with other forms of culling. Data came from 101 Anglo-Nubian goats born from 2009 to 2013 in an experimental herd in Teresina, Piauí (PI), Brazil. The Cox-Gompertz proportional hazards model selected body condition score, birth weight, and birth season as important covariates (p-value ≤ 10%). Body condition score was a favorable factor for a longer stay of the goats in the herd. Eggs per gram, age at first kidding, birth type, and dam age were not significant. The Cox-Gompertz proportional hazards model is suitable for estimating length of stay in the herd, with censoring related to endoparasitism, in Anglo-Nubian goats.
Subject(s)
Animals , Female , Ruminants/parasitology , Helminthiasis, Animal/diagnosis , Helminthiasis, Animal/mortality
ABSTRACT
A particular source of difficulty in statistical analysis is the possibility that the response variable is not completely observed for some subjects. Such incomplete observation of the response variable is called censoring. Censoring can occur for a variety of reasons, including limitations of the measurement equipment, the design of the experiment, and non-occurrence of the event of interest before the end of the study. In the presence of censoring, the dependence of the response variable on the explanatory variables can be explored through regression analysis. In this paper, we examine the censoring problem in the context of a class of asymmetric distributions; that is, we propose a linear regression model with censored responses based on skew scale mixtures of normal distributions. We develop a Monte Carlo EM (MCEM) algorithm to perform maximum likelihood inference for the parameters of the proposed model. The MCEM algorithm is discussed with an emphasis on the skew-normal, skew Student-t-normal, skew-slash, and skew-contaminated normal distributions. To examine the performance of the proposed method, we present some simulation studies and analyze a real dataset.
ABSTRACT
Serial interval (SI), defined as the time between symptom onset in an infector and infectee pair, is commonly used to understand infectious disease transmission. Slow progression to active disease, as well as the small percentage of individuals who will eventually develop active disease, complicate the estimation of the SI for tuberculosis (TB). In this paper, we showed via simulation studies that when there is credible information on the percentage of those who will develop TB disease following infection, a cure model, first introduced by Boag in 1949, should be used to estimate the SI for TB. This model includes a parameter in the likelihood function to account for the study population being composed of those who will have the event of interest and those who will never have the event. We estimated the SI for TB to be approximately 0.5 years for the United States and Canada (January 2002 to December 2006) and approximately 2.0 years for Brazil (March 2008 to June 2012), which might imply a higher occurrence of reinfection TB in a developing country like Brazil.
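The extra likelihood parameter described above can be sketched with a simple mixture cure model, in which a fraction pi of the population will eventually have the event and the rest never will. The exponential time-to-event distribution, the function name, and the argument names below are assumptions made for illustration only; the abstract does not state which distribution the authors used.

```python
import math


def cure_model_loglik(times, events, pi, rate):
    """Log-likelihood of a mixture cure model with exponential event times.

    times  : observed follow-up times
    events : True if the event was observed, False if censored
    pi     : assumed probability that a subject will ever have the event
    rate   : exponential rate for subjects who do have the event
    """
    loglik = 0.0
    for t, observed in zip(times, events):
        if observed:
            # Susceptible subject whose event occurred at time t.
            loglik += math.log(pi) + math.log(rate) - rate * t
        else:
            # Either never susceptible, or susceptible but event-free so far.
            loglik += math.log((1.0 - pi) + pi * math.exp(-rate * t))
    return loglik
```

With pi fixed from external information, as the abstract suggests, only the parameters of the time-to-event distribution would remain to be estimated.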
Subject(s)
Biostatistics/methods , Disease Transmission, Infectious/statistics & numerical data , Mycobacterium tuberculosis , Time Factors , Tuberculosis/transmission , Brazil/epidemiology , Canada/epidemiology , Humans , Tuberculosis/epidemiology , United States/epidemiology
ABSTRACT
In this study, we analyzed the role of individuals' health-related factors, along with socio-demographic and economic characteristics, in both the likelihood of tobacco consumption and the quantity demanded, using two competing econometric models: the double-hurdle model and the inverse hyperbolic sine double-hurdle model. Statistical tests confirmed dependence between the errors of the smoking prevalence and consumption level equations, while the inverse hyperbolic sine double-hurdle model fits the data best, both in normalizing the data and in describing the two data-generating processes: the probability of smoking and the level of cigarette consumption. The specification of the selected model's variance-covariance as a function of additional exogenous variables is also confirmed. The correlation between the error terms of the likelihood of smoking and the consumption level is positive and statistically significant, indicating that, holding the control variables fixed, unobserved variables outside the system that increase the prevalence of smoking also raise the consumption level, and vice versa. Many individual disease variables are significant in both equations, breaking new ground in the literature on how both the prevalence of smoking and the amount consumed are shaped.
Subject(s)
Humans , Tobacco Use/adverse effects , Censuses , Public Health , Turkey
ABSTRACT
In many fields and applications, count data can be subject to delayed reporting. This is where the total count, such as the number of disease cases contracted in a given week, may not be immediately available, instead arriving in parts over time. For short-term decision making, the statistical challenge lies in predicting the total count based on any observed partial counts, along with a robust quantification of uncertainty. We discuss previous approaches to modeling delayed reporting and present a multivariate hierarchical framework where the count generating process and delay mechanism are modeled simultaneously in a flexible way. This framework can also be easily adapted to allow for the presence of underreporting in the final observed count. To illustrate our approach and to compare it with existing frameworks, we present a case study of reported dengue fever cases in Rio de Janeiro. Based on both within-sample and out-of-sample posterior predictive model checking and arguments of interpretability, adaptability, and computational efficiency, we discuss the relative merits of different approaches.
Subject(s)
Models, Statistical , Brazil
ABSTRACT
A key biomarker in the study of differentiated thyroid cancer is thyroglobulin. Measurements of the levels of this protein in the blood are determined using laboratory instruments that cannot detect very small concentrations below a threshold, generating left-censored measurements. In the presence of censoring, ordinary least-squares regression models generate biased parameter estimates; therefore, it is necessary to resort to more complex models that consider the censored observations and the behavior of the distribution of the response variable, such as censored and mixed regression models. These techniques were used to model the relationship between thyroglobulin levels in individuals with differentiated thyroid cancer before and after treatment with radioactive iodine (I-131). Log-normal, log-skew-normal, log-power-normal, and log-generalized-gamma probability distributions were used to model the behavior of errors in the adjusted models. Log-generalized-gamma distribution yielded the best results according to the established model selection criteria.
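To make the idea of a censored regression concrete, the sketch below writes the log-likelihood of a log-normal regression with left-censored responses: detected values contribute a density term, and values below the detection limit contribute the probability of falling below that limit. This covers only the log-normal case among the distributions named above, and the function names, argument names, and single-covariate form are assumptions made for illustration, not the authors' code.

```python
import math


def normal_logpdf(z):
    """Log-density of the standard normal distribution."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def left_censored_loglik(log_y, censored, x, beta0, beta1, sigma):
    """Log-likelihood on the log scale with left-censored responses.

    log_y    : log response, or log detection limit when censored
    censored : True if the response fell below the detection limit
    x        : hypothetical covariate (e.g. log pre-treatment level)
    """
    loglik = 0.0
    for ly, is_censored, xi in zip(log_y, censored, x):
        z = (ly - (beta0 + beta1 * xi)) / sigma
        if is_censored:
            # Only "below the limit" is known: use P(log Y <= log limit).
            loglik += math.log(normal_cdf(z))
        else:
            # Fully observed value: normal density of log Y.
            loglik += normal_logpdf(z) - math.log(sigma)
    return loglik
```

Maximizing this function over beta0, beta1, and sigma with a general-purpose optimizer would give censored-regression estimates; the density term omits the Jacobian of the log transform, which does not depend on the parameters.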
Subject(s)
Models, Statistical , Thyroglobulin/blood , Thyroid Neoplasms/radiotherapy , Adult , Biomarkers, Tumor/blood , Female , Humans , Iodine Radioisotopes , Likelihood Functions , Male , Thyroid Neoplasms/surgery
ABSTRACT
We consider a simple linear regression model that accommodates situations where both the dependent and the independent variables are interval censored. We obtain maximum likelihood estimators of its parameters and compare their performance with that of estimators derived under ordinary linear regression models. We also develop prediction intervals for the response and illustrate the results with data from an audiometric study designed to evaluate the possibility of prediction of behavioural thresholds from physiological thresholds.
Subject(s)
Likelihood Functions , Linear Models , Algorithms , Audiometry/statistics & numerical data , Clinical Trials as Topic/statistics & numerical data , Diagnostic Tests, Routine/methods , Hearing Loss/diagnosis , Humans , Infant , Language Development
ABSTRACT
A major challenge when monitoring risks in socially deprived areas of underdeveloped countries is that economic, epidemiological, and social data are typically underreported. Thus, statistical models that do not take the data quality into account will produce biased estimates. To deal with this problem, counts in suspected regions are usually treated as censored information. The censored Poisson model can be considered, but all censored regions must be precisely known a priori, which is not a reasonable assumption in most practical situations. We introduce the random-censoring Poisson model (RCPM), which accounts for the uncertainty about both the count and the data reporting processes. Consequently, for each region, we are able to estimate the relative risk for the event of interest as well as the censoring probability. To facilitate the posterior sampling process, we propose a Markov chain Monte Carlo scheme based on the data augmentation technique. We run a simulation study comparing the proposed RCPM with two competing models under different scenarios. The RCPM and the censored Poisson model are applied to account for potential underreporting of early neonatal mortality counts in regions of Minas Gerais State, Brazil, where data quality is known to be poor.
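The standard censored Poisson model mentioned above, in which the censored regions are known in advance, can be sketched as follows: an underreported count is treated as a lower bound on the true count. The function and argument names are assumptions made for illustration. The sketch does not implement the RCPM itself, which additionally treats the censoring indicators as unknown.

```python
import math


def poisson_logpmf(y, mu):
    """Log of the Poisson probability P(Y = y) with mean mu."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)


def poisson_upper_tail(y, mu):
    """P(Y >= y), computed as one minus the sum of P(Y = k) for k < y."""
    return 1.0 - sum(math.exp(poisson_logpmf(k, mu)) for k in range(y))


def censored_poisson_loglik(counts, censored, means):
    """Log-likelihood when some reported counts are only lower bounds.

    counts   : reported counts per region
    censored : True if the region's count is treated as underreported
    means    : Poisson means per region (e.g. expected count times risk)
    """
    loglik = 0.0
    for y, is_censored, mu in zip(counts, censored, means):
        if is_censored:
            # True count is at least the reported count.
            loglik += math.log(poisson_upper_tail(y, mu))
        else:
            loglik += poisson_logpmf(y, mu)
    return loglik
```

For large counts the upper-tail subtraction can lose precision, so a library survival function would be preferable in practice.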
Subject(s)
Models, Statistical , Poisson Distribution , Algorithms , Bayes Theorem , Biostatistics , Brazil/epidemiology , Computer Simulation , Humans , Infant , Infant Mortality , Infant, Newborn , Markov Chains , Monte Carlo Method , Probability
ABSTRACT
This paper describes a novel scheme for the fusion of spectrum sensing information in cooperative spectrum sensing for cognitive radio applications. The scheme combines a spectrum-efficient, pre-distortion-based fusion strategy with an energy-efficient censoring-based fusion strategy to reduce both bandwidth and power consumption during the transmission of the local decisions to the fusion center. Expressions for computing the key spectrum sensing performance metrics of the proposed scheme are derived and validated by means of computer simulations. An extensive analysis of the overall energy efficiency is made, along with comparisons with reference strategies proposed in the literature. It is demonstrated that the proposed fusion scheme can outperform the energy efficiency attained by these reference strategies. Moreover, it attains approximately the same global decision performance as the best among these strategies.
ABSTRACT
Chlordecone (CLD) was an organochlorine insecticide whose previous use resulted in an extensive pollution of the environment with severe health effects and social consequences. A closely related compound, 5b-hydrochlordecone (5b-hydroCLD), has been searched for and often detected in environmental matrices from the geographical area where CLD was applied. The current consensus considered that its presence was not the result of a biotic or abiotic dechlorination of CLD in these matrices but rather the consequence of its presence as impurity (synthesis by-product) in the CLD released into the environment. The aim of the present study was to determine if and to what extent degradation of CLD into 5b-hydroCLD occurred in the field. To test this hypothesis, the ratios of 5b-hydroCLD and CLD concentrations in a dataset of 810 soils collected between 2006 and 2012 in Martinique were compared to the ratios measured in 3 samples of the CLD dust commercial formulations applied in the banana fields of French West Indies (FWI) and 1 sample of the technical-grade CLD corresponding to the active ingredient used in such formulations. Soil data were processed with a hierarchical Bayesian model to account for random measurement errors and data censoring. Any pathway of CLD transformation into 5b-hydroCLD occurring over the long term in FWI soils would indeed change the ratio of 5b-hydroCLD/CLD compared to what it was in the initially applied formulations. Results showed a significant increase of the 5b-hydroCLD/CLD ratio in the soils (25 times greater in soil than in commercial formulations), which suggested that natural CLD transformation into 5b-hydroCLD over the long term occurred in these soils. Results from this study may impact future decisions for the remediation of the polluted areas.
Subject(s)
Biodegradation, Environmental , Chlordecone/analogs & derivatives , Chlordecone/metabolism , Insecticides/metabolism , Soil Pollutants/metabolism , Soil/chemistry , Bayes Theorem , Humans , Martinique , Musa , Time , West Indies
ABSTRACT
We describe CD4 counts at 6-month intervals for 5 years after combination antiretroviral therapy initiation among 12 879 antiretroviral-naive human immunodeficiency virus-infected adults from Latin America and the Caribbean. Median CD4 counts increased from 154 cells/mm³ at baseline (interquartile range [IQR], 60-251) to 413 cells/mm³ (IQR, 234-598) by year 5.