Results 1 - 7 of 7
1.
Stat Sci; 38(4): 557-575, 2023 Nov 01.
Article in English | MEDLINE | ID: mdl-38223302

ABSTRACT

Modern data analysis frequently involves large-scale hypothesis testing, which naturally gives rise to the problem of maintaining control of a suitable type I error rate, such as the false discovery rate (FDR). In many biomedical and technological applications, an additional complexity is that hypotheses are tested in an online manner, one-by-one over time. However, traditional procedures that control the FDR, such as the Benjamini-Hochberg procedure, assume that all p-values are available to be tested at a single time point. To address these challenges, a new field of methodology has developed over the past 15 years showing how to control error rates for online multiple hypothesis testing. In this framework, hypotheses arrive in a stream, and at each time point the analyst decides whether to reject the current hypothesis based both on the evidence against it, and on the previous rejection decisions. In this paper, we present a comprehensive exposition of the literature on online error rate control, with a review of key theory as well as a focus on applied examples. We also provide simulation results comparing different online testing algorithms and an up-to-date overview of the many methodological extensions that have been proposed.
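
To make the stream-based framework concrete, here is a minimal Python sketch of LORD++, one of the online FDR procedures surveyed in the paper: each threshold depends only on the past, and "wealth" is earned back at every rejection. The spending sequence gamma_t = 6/(pi^2 t^2) and the initial wealth w0 are illustrative choices, not the paper's recommendations.

```python
import numpy as np

def lord_plus_plus(p_values, alpha=0.05, w0=0.025):
    """Sketch of online FDR control with LORD++.

    p_values arrive one at a time; returns a boolean array of rejections.
    gamma_j = 6 / (pi^2 j^2) is one valid spending sequence (it sums to 1).
    """
    p = np.asarray(list(p_values), float)
    T = len(p)
    gamma = 6.0 / (np.pi ** 2 * np.arange(1, T + 1) ** 2)

    def g(k):                       # gamma_k, with gamma_k = 0 for k <= 0
        return gamma[k - 1] if 1 <= k <= T else 0.0

    reject = np.zeros(T, bool)
    rej_times = []                  # tau_1, tau_2, ... (times of past rejections)
    for t in range(1, T + 1):
        alpha_t = g(t) * w0
        if rej_times:
            alpha_t += (alpha - w0) * g(t - rej_times[0])
            alpha_t += alpha * sum(g(t - tau) for tau in rej_times[1:])
        if p[t - 1] <= alpha_t:
            reject[t - 1] = True
            rej_times.append(t)
    return reject
```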

2.
Commun Biol; 5(1): 852, 2022 Aug 22.
Article in English | MEDLINE | ID: mdl-35995976

ABSTRACT

Magnetoencephalography (MEG) is used to study a wide variety of cognitive processes. Increasingly, researchers are adopting principles of open science and releasing their MEG data. While essential for reproducibility, sharing MEG data has unforeseen privacy risks. Individual differences may make a participant identifiable from their anonymized recordings. However, our ability to identify individuals based on these individual differences has not yet been assessed. Here, we propose interpretable MEG features to characterize individual differences. We term these features brainprints (brain fingerprints). We show through several datasets that brainprints accurately identify individuals across days, tasks, and even between MEG and electroencephalography (EEG). Furthermore, we identify consistent brainprint components that are important for identification. We study the dependence of identifiability on the amount of data available. We also relate identifiability to the level of preprocessing and the experimental task. Our findings reveal specific aspects of individual variability in MEG. They also raise concerns about unregulated sharing of brain data, even if anonymized.
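
As a rough illustration of fingerprint-style identification (not the paper's feature set or pipeline), the sketch below matches each subject's session-two feature vector to the most correlated session-one vector; the arrays and noise level are synthetic stand-ins for brainprint features.

```python
import numpy as np

def identification_accuracy(enroll, query):
    """Match each query feature vector to the most correlated enrolled one.

    enroll : (n_subjects, n_features) features from session 1
    query  : (n_subjects, n_features) features from session 2, same row order
    Returns accuracy of 1-nearest-correlation matching.
    """
    # cross-correlation between every query subject and every enrolled subject
    corr = np.corrcoef(query, enroll)[:len(query), len(query):]
    predicted = corr.argmax(axis=1)
    return (predicted == np.arange(len(query))).mean()

# toy example with synthetic "brainprint" features
rng = np.random.default_rng(0)
base = rng.normal(size=(20, 50))        # 20 subjects, 50 features
acc = identification_accuracy(base + 0.3 * rng.normal(size=base.shape),
                              base + 0.3 * rng.normal(size=base.shape))
print(f"identification accuracy: {acc:.2f}")
```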


Subjects
Brain Mapping, Magnetoencephalography, Brain, Electroencephalography, Humans, Reproducibility of Results
3.
Biometrika; 109(2): 277-293, 2022 Jun.
Article in English | MEDLINE | ID: mdl-37416628

ABSTRACT

We consider the problem of conditional independence testing: given a response Y and covariates (X,Z), we test the null hypothesis that Y⫫X∣Z. The conditional randomization test was recently proposed as a way to use distributional information about X∣Z to exactly and nonasymptotically control Type-I error using any test statistic in any dimensionality without assuming anything about Y∣(X,Z). This flexibility, in principle, allows one to derive powerful test statistics from complex prediction algorithms while maintaining statistical validity. Yet the direct use of such advanced test statistics in the conditional randomization test is prohibitively computationally expensive, especially with multiple testing, due to the requirement to recompute the test statistic many times on resampled data. We propose the distilled conditional randomization test, a novel approach to using state-of-the-art machine learning algorithms in the conditional randomization test while drastically reducing the number of times those algorithms need to be run, thereby taking advantage of their power and the conditional randomization test's statistical guarantees without suffering the usual computational expense. In addition to distillation, we propose a number of other tricks, like screening and recycling computations, to further speed up the conditional randomization test without sacrificing its high power and exact validity. Indeed, we show in simulations that all our proposals combined lead to a test that has similar power to the most powerful existing conditional randomization test implementations, but requires orders of magnitude less computation, making it a practical tool even for large datasets. We demonstrate these benefits on a breast cancer dataset by identifying biomarkers related to cancer stage.
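
A minimal sketch of the distillation idea, assuming the conditional law of X given Z is Gaussian with known mean and standard deviation: the expensive model for Y given Z is fitted once, and only a cheap residual statistic is recomputed on the resampled copies of X. The choice of LassoCV and of the covariance-type statistic are illustrative, not the paper's exact recipe.

```python
import numpy as np
from sklearn.linear_model import LassoCV

def dcrt_pvalue(y, x, Z, mu_x, sigma_x, n_resamples=2000, seed=0):
    """Sketch of a distilled CRT p-value for H0: Y independent of X given Z.

    Assumes X | Z ~ N(mu_x, sigma_x^2) elementwise with mu_x, sigma_x known
    (in practice they must be known or estimated from other data).
    """
    rng = np.random.default_rng(seed)
    # Distillation: learn Y's dependence on Z once, with any ML model.
    resid = y - LassoCV(cv=5).fit(Z, y).predict(Z)

    # Cheap statistic: absolute covariance between residual and (resampled) X.
    def stat(x_draw):
        return abs(np.dot(resid, x_draw - mu_x))

    t_obs = stat(x)
    t_null = np.array([stat(mu_x + sigma_x * rng.standard_normal(len(y)))
                       for _ in range(n_resamples)])
    return (1 + (t_null >= t_obs).sum()) / (n_resamples + 1)
```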

4.
Stat Methods Med Res; 30(4): 976-993, 2021 Apr.
Article in English | MEDLINE | ID: mdl-33413033

ABSTRACT

Biological research often involves testing a growing number of null hypotheses as new data are accumulated over time. We study the problem of online control of the familywise error rate, that is, testing an a priori unbounded sequence of hypotheses (p-values) one by one over time without knowing the future, such that with high probability there are no false discoveries in the entire sequence. This paper unifies algorithmic concepts developed for offline (single-batch) familywise error rate control and online false discovery rate control to develop novel online familywise error rate control methods. Though many offline familywise error rate methods (e.g., Bonferroni, fallback procedures and Sidak's method) can trivially be extended to the online setting, our main contribution is the design of new, powerful, adaptive online algorithms that control the familywise error rate when the p-values are independent or locally dependent in time. Our numerical experiments demonstrate substantial gains in power, which are also formally proved in an idealized Gaussian sequence model. A promising application to the International Mouse Phenotyping Consortium is described.
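
For contrast with the adaptive procedures the paper develops, here is a sketch of the simplest online baseline it starts from, an alpha-spending (online Bonferroni) rule; the spending sequence is an illustrative choice.

```python
import numpy as np

def online_bonferroni(p_stream, alpha=0.05):
    """Online alpha-spending sketch: spend alpha * gamma_t on hypothesis t.

    gamma_t = 6 / (pi^2 t^2) sums to 1 over the infinite stream, so the total
    level spent never exceeds alpha and FWER <= alpha by a union bound.
    """
    decisions = []
    for t, p in enumerate(p_stream, start=1):
        alpha_t = alpha * 6.0 / (np.pi ** 2 * t ** 2)
        decisions.append(p <= alpha_t)
    return decisions
```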


Subjects
Algorithms, Research Design, Animals, Mice, Normal Distribution
5.
Proc Natl Acad Sci U S A; 117(29): 16880-16890, 2020 Jul 21.
Article in English | MEDLINE | ID: mdl-32631986

ABSTRACT

We propose a general method for constructing confidence sets and hypothesis tests that have finite-sample guarantees without regularity conditions. We refer to such procedures as "universal." The method is very simple and is based on a modified version of the usual likelihood-ratio statistic that we call "the split likelihood-ratio test" (split LRT) statistic. The (limiting) null distribution of the classical likelihood-ratio statistic is often intractable when used to test composite null hypotheses in irregular statistical models. Our method is especially appealing for statistical inference in these complex setups. The method we suggest works for any parametric model and also for some nonparametric models, as long as computing a maximum-likelihood estimator (MLE) is feasible under the null. Canonical examples arise in mixture modeling and shape-constrained inference, for which constructing tests and confidence sets has been notoriously difficult. We also develop various extensions of our basic methods. We show that in settings when computing the MLE is hard, for the purpose of constructing valid tests and intervals, it is sufficient to upper bound the maximum likelihood. We investigate some conditions under which our methods yield valid inferences under model misspecification. Further, the split LRT can be used with profile likelihoods to deal with nuisance parameters, and it can also be run sequentially to yield anytime-valid P values and confidence sequences. Finally, when combined with the method of sieves, it can be used to perform model selection with nested model classes.
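
A minimal sketch of the split LRT in the simplest possible setting, testing H0: mu = 0 against N(mu, 1) data with unit variance: estimate mu on one half of the sample, evaluate the likelihood ratio on the other half, and reject when it exceeds 1/alpha, which is finite-sample valid by Markov's inequality.

```python
import numpy as np

def split_lrt_reject(x, alpha=0.05, seed=0):
    """Sketch of the split LRT for H0: data ~ N(0, 1) vs N(mu, 1)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    d1, d0 = x[idx[: len(x) // 2]], x[idx[len(x) // 2:]]
    mu_hat = d1.mean()          # estimator computed on D1 (any estimator is allowed)
    # log likelihood ratio on D0: sum of log N(x; mu_hat, 1) - log N(x; 0, 1)
    log_u = np.sum(-0.5 * (d0 - mu_hat) ** 2 + 0.5 * d0 ** 2)
    return log_u >= np.log(1.0 / alpha)
```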

6.
Ann Appl Stat; 9(4): 1997-2022, 2015 Dec.
Article in English | MEDLINE | ID: mdl-34326914

ABSTRACT

Functional neuroimaging measures how the brain responds to complex stimuli. However, sample sizes are modest, noise is substantial, and stimuli are high dimensional. Hence, direct estimates are inherently imprecise and call for regularization. We compare a suite of approaches which regularize via shrinkage: ridge regression, the elastic net (a generalization of ridge regression and the lasso), and a hierarchical Bayesian model based on small area estimation (SAE). We contrast regularization with spatial smoothing and combinations of smoothing and shrinkage. All methods are tested on functional magnetic resonance imaging (fMRI) data from multiple subjects participating in two different experiments related to reading, for both predicting neural response to stimuli and decoding stimuli from responses. Interestingly, when the regularization parameters are chosen by cross-validation independently for every voxel, low/high regularization is chosen in voxels where the classification accuracy is high/low, indicating that the regularization intensity is a good tool for identification of relevant voxels for the cognitive task. Surprisingly, all the regularization methods work about equally well, suggesting that beating basic smoothing and shrinkage will take not only clever methods, but also careful modeling.
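
A sketch of the voxelwise ridge baseline with the penalty chosen by cross-validation independently for every voxel; scikit-learn's RidgeCV stands in for the paper's implementation, and the elastic net, SAE and spatial-smoothing comparisons are omitted.

```python
import numpy as np
from sklearn.linear_model import RidgeCV

def fit_voxelwise_ridge(stimuli, responses, alphas=(0.1, 1.0, 10.0, 100.0)):
    """Fit one ridge encoding model per voxel.

    stimuli   : (n_trials, n_features) stimulus design matrix
    responses : (n_trials, n_voxels) fMRI responses
    Returns the weight maps and the cross-validated penalty chosen per voxel.
    """
    n_voxels = responses.shape[1]
    chosen_alpha = np.empty(n_voxels)
    weights = np.empty((n_voxels, stimuli.shape[1]))
    for v in range(n_voxels):
        model = RidgeCV(alphas=alphas).fit(stimuli, responses[:, v])
        chosen_alpha[v] = model.alpha_      # regularization intensity per voxel
        weights[v] = model.coef_
    return weights, chosen_alpha
```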

7.
PLoS One; 9(11): e112575, 2014.
Article in English | MEDLINE | ID: mdl-25426840

ABSTRACT

Story understanding involves many perceptual and cognitive subprocesses, from perceiving individual words, to parsing sentences, to understanding the relationships among the story characters. We present an integrated computational model of reading that incorporates these and additional subprocesses, simultaneously discovering their fMRI signatures. Our model predicts the fMRI activity associated with reading arbitrary text passages, well enough to distinguish which of two story segments is being read with 74% accuracy. This approach is the first to simultaneously track diverse reading subprocesses during complex story processing and predict the detailed neural representation of diverse story features, ranging from visual word properties to the mention of different story characters and different actions they perform. We construct brain representation maps that replicate many results from a wide range of classical studies that each focus on one aspect of language processing and offer new insights into which types of information are processed by different areas involved in language processing. Additionally, this approach is promising for studying individual differences: it can be used to create single-subject maps that may potentially be used to measure reading comprehension and diagnose reading disorders.
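
The 74% figure comes from a two-alternative comparison between predicted and observed activity; below is a generic sketch of that kind of evaluation (not the paper's full generative model), using Pearson correlation as the match score between predicted and observed activity patterns.

```python
import numpy as np

def two_afc_accuracy(observed, pred_true, pred_foil):
    """Two-alternative evaluation sketch.

    For each observed fMRI pattern, decide which of two predicted patterns
    (true segment vs. foil segment) matches it better by correlation.
    """
    def corr(a, b):
        return np.corrcoef(a, b)[0, 1]
    correct = [corr(o, t) > corr(o, f)
               for o, t, f in zip(observed, pred_true, pred_foil)]
    return float(np.mean(correct))
```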


Subjects
Brain/physiology, Comprehension/physiology, Mental Recall/physiology, Reading, Adolescent, Adult, Attention, Brain/anatomy & histology, Brain Mapping, Female, Humans, Image Processing, Computer-Assisted, Language, Magnetic Resonance Imaging, Male