ABSTRACT
Dengue fever is a tropical disease, transmitted mainly by the female Aedes aegypti mosquito, that affects millions of people every year. As there is still no safe and effective vaccine, the best way to prevent the disease at present is to control the proliferation of the transmitting mosquito. Since the proliferation and life cycle of the mosquito depend on environmental variables such as temperature and water availability, among others, statistical models are needed to understand the relationships between environmental variables and the recorded number of dengue cases and to predict the number of cases for some future time interval. This prediction is of paramount importance for the establishment of control policies. In general, dengue-fever datasets contain the number of cases recorded periodically (in days, weeks, months or years). Since many dengue-fever datasets tend to be overdispersed and long-tailed, some common models, such as the Poisson regression model or the negative binomial regression model, are not adequate for modeling them. For this reason, in this paper we propose modeling a dengue-fever dataset by using a Poisson-inverse-Gaussian regression model. The main advantage of this model is that it adequately models overdispersed long-tailed data because it has a wider skewness range than the negative binomial distribution. We illustrate the application of this model to a real dataset and compare its performance to that of a negative binomial regression model.
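One quick way to see whether a count series is overdispersed, as the abstract describes, is to compare its sample variance with its sample mean; under a Poisson model the ratio is close to 1. The following is a minimal Python sketch, not the authors' method: the function name `dispersion_index` and the weekly counts are hypothetical and chosen only for illustration.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Return sample variance divided by sample mean (about 1 for Poisson data)."""
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion index is undefined when the mean is zero")
    return variance(counts) / m


# Hypothetical weekly case counts with one large outbreak week (not from the paper).
weekly_cases = [0, 1, 0, 2, 1, 3, 0, 1, 2, 40]

ratio = dispersion_index(weekly_cases)
print(f"variance-to-mean ratio: {ratio:.2f}")
```

A ratio far above 1 suggests that a Poisson model would understate the variance. It does not by itself distinguish between negative binomial and Poisson-inverse-Gaussian alternatives, which differ mainly in how heavy a tail they allow.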
ABSTRACT
Healthcare is going through a big data revolution. The amount of data generated by healthcare is expected to increase significantly in the coming years. Therefore, efficient and effective data processing methods are required to transform data into information. In addition, applying statistical analysis can transform that information into useful knowledge. We developed a data mining method that can uncover new knowledge from these data to support clinical decision making and to generate scientific methods and hypotheses. The proposed pipeline can be applied to a variety of data mining tasks in medical informatics. For this study, we applied the proposed pipeline to post-marketing surveillance of drug safety using FAERS, the data warehouse created by the FDA. We used 14 kinds of neurology drugs to illustrate our methods. Our results indicate that this approach can reveal insights for further drug safety evaluation.
ABSTRACT
OBJECTIVE: We examine and compare pedestrian-vehicle collisions and injury outcomes involving school-age children between 5 and 18 years of age in the capital cities of Santiago, Chile, and Seoul, South Korea. METHODS: We conduct descriptive analysis of the child pedestrian-vehicle collision (P-VC) data (904 collisions for Santiago and 3,505 for Seoul) reported by the police between 2010 and 2011. We also statistically analyze factors associated with child P-VCs, by both incident severity and age group, using 3 regression models: negative binomial, probit, and spatial lag models. RESULTS: Descriptive statistics suggest that child pedestrians in Seoul have a higher risk of being involved in traffic crashes than their counterparts in Santiago. However, in Seoul a greater proportion of children are unharmed as a result of these incidents, whereas more child pedestrians are killed in Santiago. Younger children in Seoul suffer more injuries from P-VCs than in Santiago. The majority of P-VCs in both cities tend to occur in the afternoon and evening, at intersections in Santiago and at midblock locations in Seoul. Our model results suggest that the resident population of children is positively associated with P-VCs in both cities, and school concentrations apparently increase P-VC risk among older children in Santiago. Bus stops are associated with higher P-VCs in Seoul, and subway stations relate to higher P-VCs among older children in Santiago. Zone-level land use mix was negatively related to child P-VCs in Seoul but not in Santiago. Arterial roads are associated with fewer P-VCs, especially for younger children in both cities. The share of collector roads is associated with increased P-VCs in Seoul but fewer P-VCs in Santiago. Hilliness is related to fewer P-VCs in both cities. Differences in these model results for Santiago and Seoul warrant additional analysis, as do the differences in results across model types (negative binomial versus spatial lag models).
CONCLUSIONS: To reduce child P-VCs, this study suggests the need to assess subway station and bus stop area conditions in Santiago and Seoul, respectively; areas with high density of schools in Santiago; areas with greater concentrations of children in both cities; and collector roads in Seoul.
Subject(s)
Accidents, Traffic/statistics & numerical data; Pedestrians; Wounds and Injuries/epidemiology; Adolescent; Child; Child, Preschool; Chile/epidemiology; Cities; Female; Humans; Male; Risk; Schools/statistics & numerical data; Seoul/epidemiology
ABSTRACT
The aim of this study was to conduct an ecological analysis of suicide among people aged 60 years or older in Brazilian municipalities in the three-year period 2005-2007, investigating factors associated with the event. Data on suicide deaths were extracted from the Mortality Information System (Sistema de Informação sobre Mortalidade, SIM), codes X60 to X86 and Y87.0 (ICD-10). Poisson, negative binomial and zero-inflated negative binomial (ZINB) regression models were fitted. The ZINB model gave the best results when the models were compared. The factors identified as associated with suicide were the proportion of non-whites (negative association), the rate of hospitalization for mood disorders (positive association) and the sex ratio (negative association).
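To make the zero-inflated negative binomial model named above concrete, the sketch below writes out its probability mass function in Python. It is an illustration of the standard ZINB form, not the authors' fitted model: the function names, the mean/size parameterization of the negative binomial, and the parameter values are all assumptions made for this example.

```python
from math import exp, lgamma, log


def nb_pmf(y, mu, r):
    """Negative binomial pmf with mean mu and size (inverse dispersion) r."""
    log_p = (
        lgamma(y + r) - lgamma(r) - lgamma(y + 1)
        + r * log(r / (r + mu))
        + y * log(mu / (r + mu))
    )
    return exp(log_p)


def zinb_pmf(y, mu, r, pi):
    """ZINB pmf: a structural zero with probability pi, otherwise an NB count."""
    base = (1.0 - pi) * nb_pmf(y, mu, r)
    return pi + base if y == 0 else base


# Hypothetical parameter values, chosen only for illustration.
mu, r, pi = 2.0, 1.5, 0.3

print("P(Y=0) under NB:  ", nb_pmf(0, mu, r))
print("P(Y=0) under ZINB:", zinb_pmf(0, mu, r, pi))
```

The extra mass at zero is what lets the model accommodate the many municipalities that record no events in a given period, while the negative binomial part handles overdispersion among the remaining counts.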