1.
J R Stat Soc Ser C Appl Stat ; 73(3): 715-734, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38883260

ABSTRACT

In many contexts, particularly when study subjects are adolescents, peer effects can invalidate typical statistical requirements in the data. For instance, it is plausible that a student's academic performance is influenced both by their own mother's educational level and by that of their peers. Since the underlying social network is measured, the Add Health study provides a unique opportunity to examine the impact of maternal college education on adolescent school performance, both direct and indirect. However, causal inference on populations embedded in social networks poses technical challenges, since the typical no-interference assumption no longer holds. While inverse probability-of-treatment weighted (IPW) estimators have been developed for this setting, they are often highly unstable. Motivated by the question of maternal education, we propose doubly robust (DR) estimators combining models for treatment and outcome that are consistent and asymptotically normal if either model is correctly specified. We present empirical results that illustrate the DR property and the efficiency gain of DR over IPW estimators even when the treatment model is misspecified. Contrary to previous studies, our robust analysis does not provide evidence of an indirect effect of maternal education on academic performance within adolescents' social circles in Add Health.
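The doubly robust idea in the abstract above can be illustrated with a minimal sketch. It assumes the classical no-interference setting (one binary treatment, no network), so it is not the network estimator the paper proposes; it only shows why an augmented IPW estimate can stay close to the truth when one of the two working models is wrong. All function names, the simulated data and the true effect of 2 are hypothetical.

```python
import numpy as np

def ipw_estimate(y, a, e):
    """Plain IPW estimate of E[Y(1)] - E[Y(0)] given propensity scores e."""
    return np.mean(a * y / e) - np.mean((1 - a) * y / (1 - e))

def dr_estimate(y, a, e, m1, m0):
    """Augmented IPW (doubly robust) estimate of E[Y(1)] - E[Y(0)].

    e  : working propensity scores P(A=1 | X)
    m1 : working outcome predictions E[Y | A=1, X]
    m0 : working outcome predictions E[Y | A=0, X]
    """
    mu1 = np.mean(m1 + a * (y - m1) / e)
    mu0 = np.mean(m0 + (1 - a) * (y - m0) / (1 - e))
    return mu1 - mu0

# Simulated illustration (hypothetical data, true effect = 2).
rng = np.random.default_rng(0)
n = 20000
x = rng.normal(size=n)
e_true = 1 / (1 + np.exp(-x))           # treatment depends on x (confounding)
a = rng.binomial(1, e_true)
y = 2.0 * a + x + rng.normal(size=n)

# Correct outcome model, deliberately wrong propensity model (constant 0.5).
e_wrong = np.full(n, 0.5)
dr_wrong_ps = dr_estimate(y, a, e_wrong, m1=2.0 + x, m0=x)

# Correct propensity model, deliberately wrong outcome model (all zeros).
dr_wrong_om = dr_estimate(y, a, e_true, m1=np.zeros(n), m0=np.zeros(n))

# IPW with the wrong propensity model, for comparison.
ipw_wrong_ps = ipw_estimate(y, a, e_wrong)

print(dr_wrong_ps, dr_wrong_om, ipw_wrong_ps)
```

In this toy setup both DR estimates should land near the true effect, while IPW with the misspecified propensity model should not. Under interference the estimands and weights change, so this sketch should be read only as intuition for the DR property.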

2.
Spat Spatiotemporal Epidemiol ; 44: 100559, 2023 02.
Article in English | MEDLINE | ID: mdl-36707192

ABSTRACT

Quantifying the impact of lockdowns on COVID-19 mortality risks is an important priority in the public health fight against the virus, but almost all of the existing research has conducted only macro country-wide assessments or limited multi-country comparisons. In contrast, the extent of within-country variation in the impacts of a nation-wide lockdown has yet to be thoroughly investigated, which is the gap in the knowledge base that this paper fills. Our study focuses on England, which was subject to 3 national lockdowns between March 2020 and March 2021. We model weekly COVID-19 mortality counts for the 312 Local Authority Districts in mainland England, and our aim is to understand the impact that lockdowns had at both a national and a regional level. Specifically, we aim to quantify how long after the implementation of a lockdown mortality risks begin to reduce at a national level, the extent to which these impacts vary regionally within a country, and which parts of England exhibit similar impacts. As the spatially aggregated weekly COVID-19 mortality counts are small, we estimate the spatio-temporal trends in mortality risks with a Poisson log-linear smoothing model that borrows strength in the estimation between neighbouring data points. Inference is carried out in a Bayesian framework, using Markov chain Monte Carlo simulation. Our main findings are that mortality risks typically begin to reduce between 3 and 4 weeks after lockdown, and that there appears to be an urban-rural divide in lockdown impacts.


Subjects
COVID-19, Humans, Bayes Theorem, COVID-19/prevention & control, Communicable Disease Control, Computer Simulation, England/epidemiology
3.
J R Stat Soc Ser A Stat Soc ; 182(3): 1061-1080, 2019 Jun.
Article in English | MEDLINE | ID: mdl-31217673

ABSTRACT

Health inequalities are the unfair and avoidable differences in people's health between different social groups. These inequalities have a huge influence on people's lives, particularly those who live at the poorer end of the socio-economic spectrum, as they result in prolonged ill health and shorter lives. Most studies estimate health inequalities for a single disease, but this will give an incomplete picture of the overall inequality in population health. Here we propose a novel multivariate spatiotemporal model for quantifying health inequalities in Scotland across multiple diseases, which will enable us to understand better how these inequalities vary across diseases and have changed over time. In developing this model we are interested in estimating health inequalities between Scotland's 14 regional health boards, which are responsible for the protection and improvement of their population's health. The methodology is applied to hospital admissions data for cerebrovascular disease, coronary heart disease and respiratory disease, which are three of the leading causes of death, from 2003 to 2012 across Scotland.

4.
Biom J ; 59(1): 41-56, 2017 Jan.
Article in English | MEDLINE | ID: mdl-27492753

ABSTRACT

Spatiotemporal disease mapping focuses on estimating the spatial pattern in disease risk across a set of nonoverlapping areal units over a fixed period of time. The key aim of such research is to identify areas that have a high average level of disease risk or where disease risk is increasing over time, thus allowing public health interventions to be focused on these areas. Such aims are well suited to the statistical approach of clustering, and while much research has been done in this area in a purely spatial setting, only a handful of approaches have focused on spatiotemporal clustering of disease risk. Therefore, this paper outlines a new modeling approach for clustering spatiotemporal disease risk data, by clustering areas based on both their mean risk levels and the behavior of their temporal trends. The efficacy of the methodology is established by a simulation study, and is illustrated by a study of respiratory disease risk in Glasgow, Scotland.


Subjects
Cluster Analysis, Public Health/methods, Risk Assessment/methods, Risk, Bayes Theorem, Humans
5.
Anal Chim Acta ; 931: 34-46, 2016 08 10.
Article in English | MEDLINE | ID: mdl-27282749

ABSTRACT

Many chemometric tools are invaluable and have proven effective in data mining and substantial dimensionality reduction of highly multivariate data. This becomes vital for interpreting various physicochemical data due to rapid development of advanced analytical techniques, delivering much information in a single measurement run. This concerns especially spectra, which are frequently used as the subject of comparative analysis in e.g. forensic sciences. In the presented study the microtraces collected from the scenarios of hit-and-run accidents were analysed. Plastic containers and automotive plastics (e.g. bumpers, headlamp lenses) were subjected to Fourier transform infrared spectrometry and car paints were analysed using Raman spectroscopy. In the forensic context analytical results must be interpreted and reported according to the standards of the interpretation schemes acknowledged in forensic sciences using the likelihood ratio approach. However, for proper construction of LR models for highly multivariate data, such as spectra, chemometric tools must be employed for substantial data compression. Conversion from classical feature representation to distance representation was proposed for revealing hidden data peculiarities and linear discriminant analysis was further applied for minimising the within-sample variability while maximising the between-sample variability. Both techniques enabled substantial reduction of data dimensionality. Univariate and multivariate likelihood ratio models were proposed for such data. It was shown that the combination of chemometric tools and the likelihood ratio approach is capable of solving the comparison problem of highly multivariate and correlated data after proper extraction of the most relevant features and variance information hidden in the data structure.

6.
Spat Spatiotemporal Epidemiol ; 16: 11-20, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26919751

ABSTRACT

Disease mapping aims to estimate the spatial pattern in disease risk across an area, identifying units which have elevated disease risk. Existing methods use Bayesian hierarchical models with spatially smooth conditional autoregressive priors to estimate risk, but these methods are unable to identify the geographical extent of spatially contiguous high-risk clusters of areal units. Our proposed solution to this problem is a two-stage approach, which produces a set of potential cluster structures for the data and then chooses the optimal structure via a Bayesian hierarchical model. The first stage uses a spatially adjusted hierarchical agglomerative clustering algorithm. The second stage fits a Poisson log-linear model to the data to estimate the optimal cluster structure and the spatial pattern in disease risk. The methodology was applied to a study of chronic obstructive pulmonary disease (COPD) in local authorities in England, where a number of high risk clusters were identified.


Subjects
Bayes Theorem, Disease Outbreaks/statistics & numerical data, Statistical Models, Chronic Obstructive Pulmonary Disease/epidemiology, Spatial Analysis, Algorithms, Cluster Analysis, England/epidemiology, Female, Humans, Male, Risk
7.
Biostatistics ; 15(3): 457-69, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24622038

ABSTRACT

Disease mapping is the field of spatial epidemiology interested in estimating the spatial pattern in disease risk across $n$ areal units. One aim is to identify units exhibiting elevated disease risks, so that public health interventions can be made. Bayesian hierarchical models with a spatially smooth conditional autoregressive prior are used for this purpose, but they cannot identify the spatial extent of high-risk clusters. Therefore, we propose a two-stage solution to this problem, with the first stage being a spatially adjusted hierarchical agglomerative clustering algorithm. This algorithm is applied to data prior to the study period, and produces $n$ potential cluster structures for the disease data. The second stage fits a separate Poisson log-linear model to the study data for each cluster structure, which allows for step-changes in risk where two clusters meet. The most appropriate cluster structure is chosen by model comparison techniques, specifically by minimizing the Deviance Information Criterion. The efficacy of the methodology is established by a simulation study, and is illustrated by a study of respiratory disease risk in Glasgow, Scotland.


Subjects
Bayes Theorem, Epidemiologic Methods, Statistical Models, Spatial Analysis, Humans, Respiration Disorders/epidemiology, Scotland/epidemiology
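
For reference, the Deviance Information Criterion that the abstract of entry 7 minimizes has the standard form below. The notation is generic and is an assumption of this note, not taken from the abstract: $y_k$ is the observed count in areal unit $k$, $\mu_k(\theta)$ is its Poisson mean under parameters $\theta$, and $\bar{\theta}$ is the posterior mean of $\theta$.

```latex
\begin{align*}
D(\theta) &= -2 \sum_{k} \bigl[\, y_k \log \mu_k(\theta) - \mu_k(\theta) - \log (y_k!) \,\bigr]
  && \text{(Poisson deviance)} \\
\bar{D} &= \mathbb{E}_{\theta \mid y}\bigl[ D(\theta) \bigr]
  && \text{(posterior mean deviance)} \\
p_D &= \bar{D} - D(\bar{\theta})
  && \text{(effective number of parameters)} \\
\mathrm{DIC} &= \bar{D} + p_D
\end{align*}
```

With MCMC output, $\bar{D}$ is usually approximated by averaging $D(\theta)$ over the posterior draws. A candidate cluster structure with a smaller DIC would then be preferred.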
8.
Ann Appl Stat ; 4(1): 396-421, 2010 Mar 01.
Article in English | MEDLINE | ID: mdl-20936055

ABSTRACT

Food authenticity studies are concerned with determining if food samples have been correctly labelled or not. Discriminant analysis methods are an integral part of the methodology for food authentication. Motivated by food authenticity applications, a model-based discriminant analysis method that includes variable selection is presented. The discriminant analysis model is fitted in a semi-supervised manner using both labeled and unlabeled data. The method is shown to give excellent classification performance on several high-dimensional multiclass food authenticity datasets with more variables than observations. The variables selected by the proposed method provide information about which variables are meaningful for classification purposes. A headlong search strategy for variable selection is shown to be efficient in terms of computation and achieves excellent classification performance. In applications to several food authenticity datasets, our proposed method outperformed default implementations of Random Forests, AdaBoost, transductive SVMs and Bayesian Multinomial Regression by substantial margins.

9.
Ann Inst Stat Math ; 62(1): 11-35, 2010 Feb 01.
Article in English | MEDLINE | ID: mdl-20827439

ABSTRACT

We propose a method for selecting variables in latent class analysis, which is the most common model-based clustering method for discrete data. The method assesses a variable's usefulness for clustering by comparing two models, given the clustering variables already selected. In one model the variable contributes information about cluster allocation beyond that contained in the already selected variables, and in the other model it does not. A headlong search algorithm is used to explore the model space and select clustering variables. In simulated datasets we found that the method selected the correct clustering variables, and also led to improvements in classification performance and in accuracy of the choice of the number of classes. In two real datasets, our method discovered the same group structure with fewer variables. In a dataset from the International HapMap Project consisting of 639 single nucleotide polymorphisms (SNPs) from 210 members of different groups, our method discovered the same group structure with a much smaller number of SNPs.
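
The headlong search mentioned in the abstract above can be sketched as follows. This is a simplified, inclusion-only illustration under stated assumptions: the `gain` callback stands in for whatever model comparison is used to judge a candidate given the variables already selected, and the function names, the threshold and the toy scoring rule are all hypothetical. The published procedure may differ, for example by also considering removal of variables.

```python
def headlong_forward(variables, gain, threshold=0.0):
    """Inclusion-only headlong search.

    Instead of scoring every candidate and taking the best one, accept the
    FIRST candidate whose gain exceeds the threshold, then rescan.
    """
    selected, remaining = [], list(variables)
    while True:
        for v in remaining:
            if gain(selected, v) > threshold:
                selected.append(v)
                remaining.remove(v)
                break              # restart the scan with the enlarged set
        else:
            return selected        # no remaining candidate cleared the threshold

def toy_gain(selected, candidate):
    """Hypothetical score: only x1 and x3 carry clustering information."""
    informative = {"x1", "x3"}
    return 1.0 if candidate in informative else -1.0

chosen = headlong_forward(["x1", "x2", "x3", "x4"], toy_gain)
print(chosen)
```

The appeal of this strategy is computational: each step stops at the first acceptable variable, so far fewer model comparisons are needed than in an exhaustive greedy search when there are many candidates.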

10.
BMC Bioinformatics ; 6: 173, 2005 Jul 12.
Article in English | MEDLINE | ID: mdl-16011807

ABSTRACT

BACKGROUND: One of the primary tasks in analysing gene expression data is finding genes that are differentially expressed in different samples. Multiple testing issues due to the thousands of tests run make some of the more popular methods for doing this problematic. RESULTS: We propose a simple method, Normal Uniform Differential Gene Expression (NUDGE) detection for finding differentially expressed genes in cDNA microarrays. The method uses a simple univariate normal-uniform mixture model, in combination with new normalization methods for spread as well as mean that extend the lowess normalization of Dudoit, Yang, Callow and Speed (2002) [1]. It takes account of multiple testing, and gives probabilities of differential expression as part of its output. It can be applied to either single-slide or replicated experiments, and it is very fast. Three datasets are analyzed using NUDGE, and the results are compared to those given by other popular methods: unadjusted and Bonferroni-adjusted t tests, Significance Analysis of Microarrays (SAM), and Empirical Bayes for microarrays (EBarrays) with both Gamma-Gamma and Lognormal-Normal models. CONCLUSION: The method gives a high probability of differential expression to genes known/suspected a priori to be differentially expressed and a low probability to the others. In terms of known false positives and false negatives, the method outperforms all multiple-replicate methods except for the Gamma-Gamma EBarrays method to which it offers comparable results with the added advantages of greater simplicity, speed, fewer assumptions and applicability to the single replicate case. An R package called nudge to implement the methods in this paper will be made available soon at http://www.bioconductor.org.


Subjects
Computational Biology/methods, Gene Expression Profiling/methods, Gene Expression Regulation, Oligonucleotide Array Sequence Analysis/methods, Algorithms, CD4-Positive T-Lymphocytes/virology, Complementary DNA/metabolism, Statistical Data Interpretation, False Negative Reactions, False Positive Reactions, HIV/genetics, Humans, Internet, Genetic Models, Statistical Models, Nucleic Acid Hybridization, Reproducibility of Results, Sensitivity and Specificity, DNA Sequence Analysis, Software
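
The univariate normal-uniform mixture described in entry 10 can be sketched with a short EM routine. This is a minimal illustration, not the `nudge` package: it omits the normalization steps from the abstract, fixes the uniform support to the observed data range (an assumption of this sketch), and runs on simulated values. The function name and all simulation settings are hypothetical.

```python
import numpy as np

def fit_normal_uniform(x, n_iter=200, pi_init=0.1):
    """EM for the mixture (1 - pi) * N(mu, sigma^2) + pi * U(lo, hi).

    The uniform support [lo, hi] is fixed to the observed range of x.
    Returns (pi, mu, sigma, post), where post[i] is the posterior probability
    that x[i] belongs to the uniform component, computed in the final E-step.
    """
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    u_dens = 1.0 / (hi - lo)                 # constant uniform density
    pi, mu, sigma = pi_init, np.median(x), x.std()
    for _ in range(n_iter):
        # E-step: responsibility of the uniform component for each point.
        n_dens = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
        post = pi * u_dens / (pi * u_dens + (1 - pi) * n_dens)
        # M-step: update mixing proportion and the normal component's parameters.
        w = 1.0 - post
        pi = post.mean()
        mu = np.sum(w * x) / np.sum(w)
        sigma = np.sqrt(np.sum(w * (x - mu) ** 2) / np.sum(w))
    return pi, mu, sigma, post

# Simulated log ratios (hypothetical): 1900 unchanged genes, 100 changed genes.
rng = np.random.default_rng(1)
unchanged = rng.normal(0.0, 0.3, size=1900)
changed = rng.uniform(-4.0, 4.0, size=100)
x = np.concatenate([unchanged, changed])

pi_hat, mu_hat, sigma_hat, post = fit_normal_uniform(x)
print(pi_hat, mu_hat, sigma_hat)
```

Here `post` plays the role of the per-gene probability of differential expression: values far from the normal bulk receive probabilities near 1, and values near the centre receive probabilities near 0.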