1.
Sci Adv ; 10(18): eadl2524, 2024 May 03.
Article in English | MEDLINE | ID: mdl-38691613

ABSTRACT

The U.S. Census Bureau faces a difficult trade-off between the accuracy of Census statistics and the protection of individual information. We conduct an independent evaluation of bias and noise induced by the Bureau's two main disclosure avoidance systems: the TopDown algorithm used for the 2020 Census and the swapping algorithm implemented for the three previous Censuses. Our evaluation leverages the Noisy Measurement File (NMF) as well as two independent runs of the TopDown algorithm applied to the 2010 decennial Census. We find that the NMF contains too much noise to be directly useful without measurement error modeling, especially for Hispanic and multiracial populations. TopDown's postprocessing reduces the NMF noise and produces data whose accuracy is similar to that of swapping. While the estimated errors for both TopDown and swapping algorithms are generally no greater than other sources of Census error, they can be relatively substantial for geographies with small total populations.


Subject(s)
Algorithms , Bias , Censuses , United States , Humans , Privacy
3.
Org Lett ; 25(44): 7953-7957, 2023 Nov 10.
Article in English | MEDLINE | ID: mdl-37901962

ABSTRACT

The Pd-catalyzed stereoselective construction of decalins with one-carbon units bearing heteroatoms at the ring junction is described. The Pd-catalyzed cyclization of silyl enol ether resulted in exclusive formation of the cis isomer (89%, >100/1 cis/trans). On the contrary, Pd-catalyzed carboiodination and carboborylation (with oxidative workup) provided products in 56% yield (1/>100 cis/trans) and 69% yield (1/11 cis/trans), respectively.

4.
Proc Natl Acad Sci U S A ; 120(25): e2217322120, 2023 Jun 20.
Article in English | MEDLINE | ID: mdl-37310996

ABSTRACT

Congressional district lines in many US states are drawn by partisan actors, raising concerns about gerrymandering. To separate the partisan effects of redistricting from the effects of other factors including geography and redistricting rules, we compare possible party compositions of the US House under the enacted plan to those under a set of alternative simulated plans that serve as a nonpartisan baseline. We find that partisan gerrymandering is widespread in the 2020 redistricting cycle, but most of the electoral bias it creates cancels at the national level, giving Republicans two additional seats on average. Geography and redistricting rules separately contribute a moderate pro-Republican bias. Finally, we find that partisan gerrymandering reduces electoral competition and makes the partisan composition of the US House less responsive to shifts in the national vote.
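The comparison described in this abstract, an enacted plan measured against an ensemble of simulated plans, can be sketched roughly as follows. This is only an illustration: the function names, the 0.5 winning threshold, and all vote shares are hypothetical and do not come from the article.

```python
from statistics import mean


def seats_won(district_vote_shares, threshold=0.5):
    """Count districts where the party's two-party vote share exceeds the threshold."""
    return sum(1 for share in district_vote_shares if share > threshold)


def seat_bias(enacted_plan, simulated_plans):
    """Seats under the enacted plan minus the average seats across simulated plans."""
    baseline = mean(seats_won(plan) for plan in simulated_plans)
    return seats_won(enacted_plan) - baseline


# Hypothetical vote shares for one party in a 4-district state (illustrative only).
enacted = [0.56, 0.55, 0.54, 0.30]
simulated = [
    [0.52, 0.51, 0.47, 0.45],
    [0.58, 0.49, 0.46, 0.42],
    [0.53, 0.52, 0.48, 0.42],
]
print(seat_bias(enacted, simulated))
```

A positive value means the enacted plan yields more seats for the party than the simulated baseline does on average; a real analysis would use many more simulated plans and report the full distribution rather than a single mean.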

5.
Science ; 380(6648): 902-903, 2023 06 02.
Article in English | MEDLINE | ID: mdl-37262166
6.
Sci Data ; 10(1): 299, 2023 05 19.
Article in English | MEDLINE | ID: mdl-37208389

ABSTRACT

We provide the largest compiled publicly available dictionaries of first, middle, and surnames for the purpose of imputing race and ethnicity using, for example, Bayesian Improved Surname Geocoding (BISG). The dictionaries are based on the voter files of six U.S. Southern States that collect self-reported racial data upon voter registration. Our data cover the racial make-up of a larger set of names than any comparable dataset, containing 136 thousand first names, 125 thousand middle names, and 338 thousand surnames. Individuals are categorized into five mutually exclusive racial and ethnic groups - White, Black, Hispanic, Asian, and Other - and racial/ethnic probabilities by name are provided for every name in each dictionary. We provide both probabilities of the form ℙ(race|name) and ℙ(name|race), and conditions under which they can be assumed to be representative of a given target population. These conditional probabilities can then be deployed for imputation in a data analytic task for which self-reported racial and ethnic data is not available.


Subject(s)
Ethnicity , Hispanic or Latino , Humans , Bayes Theorem , Black People , Self Report , United States
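The two forms of probability mentioned in the abstract are linked by Bayes' rule. The sketch below converts ℙ(name|race) into ℙ(race|name) given a prior over groups. The function name and every number are hypothetical and are not values from the published dictionaries.

```python
def race_given_name(name_given_race, race_prior):
    """Bayes' rule: P(race | name) is proportional to P(name | race) * P(race)."""
    joint = {race: name_given_race[race] * race_prior[race] for race in race_prior}
    total = sum(joint.values())
    if total == 0:
        raise ValueError("name has zero probability under every group")
    return {race: value / total for race, value in joint.items()}


# Hypothetical inputs for a single name (not real dictionary entries).
name_given_race = {"White": 0.001, "Black": 0.004, "Hispanic": 0.0,
                   "Asian": 0.0, "Other": 0.001}
race_prior = {"White": 0.60, "Black": 0.20, "Hispanic": 0.10,
              "Asian": 0.05, "Other": 0.05}

posterior = race_given_name(name_given_race, race_prior)
for race, prob in posterior.items():
    print(f"{race}: {prob:.3f}")
```

The prior should reflect the target population, which is one reason the abstract stresses the conditions under which the probabilities can be treated as representative.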
8.
Biometrics ; 79(3): 2370-2381, 2023 09.
Article in English | MEDLINE | ID: mdl-36285364

ABSTRACT

Two-stage randomized experiments are becoming an increasingly popular experimental design for causal inference when the outcome of one unit may be affected by the treatment assignments of other units in the same cluster. In this paper, we provide a methodological framework for general tools of statistical inference and power analysis for two-stage randomized experiments. Under the randomization-based framework, we consider the estimation of a new direct effect of interest as well as the average direct and spillover effects studied in the literature. We provide unbiased estimators of these causal quantities and their conservative variance estimators in a general setting. Using these results, we then develop hypothesis testing procedures and derive sample size formulas. We theoretically compare the two-stage randomized design with the completely randomized and cluster randomized designs, which represent two limiting designs. Finally, we conduct simulation studies to evaluate the empirical performance of our sample size formulas. For empirical illustration, the proposed methodology is applied to the randomized evaluation of the Indian National Health Insurance Program. An open-source software package is available for implementing the proposed methodology.


Subject(s)
Research Design , Software , Computer Simulation , Sample Size , Causality , Models, Statistical
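A two-stage randomized design can be illustrated with a short assignment routine. The sketch assumes clusters are first randomized to a treatment saturation (the share of units to treat) and units are then randomized within each cluster. The function name, the saturation values, and the cluster sizes are hypothetical, and the code is not taken from the software package mentioned in the abstract.

```python
import random


def two_stage_assign(clusters, saturations, rng):
    """Return {cluster_id: (saturation, set_of_treated_units)}.

    Stage 1: shuffle clusters and deal them evenly across the saturations.
    Stage 2: within each cluster, treat a random subset of that share of units.
    """
    cluster_ids = list(clusters)
    rng.shuffle(cluster_ids)
    assignment = {}
    for position, cluster_id in enumerate(cluster_ids):
        saturation = saturations[position % len(saturations)]
        units = clusters[cluster_id]
        n_treated = round(saturation * len(units))
        treated = set(rng.sample(units, n_treated))
        assignment[cluster_id] = (saturation, treated)
    return assignment


rng = random.Random(42)
# Four hypothetical clusters of ten units each.
clusters = {c: [f"{c}-{i}" for i in range(10)] for c in ["A", "B", "C", "D"]}
plan = two_stage_assign(clusters, saturations=[0.2, 0.8], rng=rng)
for cluster_id, (saturation, treated) in sorted(plan.items()):
    print(cluster_id, saturation, len(treated))
```

Setting every saturation to 0 or 1 gives a cluster randomized design, and using a single saturation for all clusters approaches complete randomization within clusters, which is one way to picture the two limiting designs the abstract mentions.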
9.
Sci Adv ; 8(49): eadc9824, 2022 Dec 09.
Article in English | MEDLINE | ID: mdl-36490334

ABSTRACT

Prediction of individuals' race and ethnicity plays an important role in studies of racial disparity. Bayesian Improved Surname Geocoding (BISG), which relies on detailed census information, has emerged as a leading methodology for this prediction task. Unfortunately, BISG suffers from two data problems. First, the census often contains zero counts for minority groups in the locations where members of those groups reside. Second, many surnames, especially those of minorities, are missing from the census data. We introduce a fully Bayesian BISG (fBISG) methodology that accounts for census measurement error by extending the naïve Bayesian inference of the BISG methodology. We also use additional data on last, first, and middle names taken from the voter files of six Southern states where self-reported race is available. Our empirical validation shows that the fBISG methodology and name supplements substantially improve the accuracy of race imputation, especially for racial minorities.
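The zero-count problem can be seen in a minimal sketch of the standard BISG update, which multiplies a surname-based probability by a geography-based one and renormalizes. This shows only the naive calculation, not the fBISG method introduced in the article. The function name and all numbers are hypothetical.

```python
def bisg_posterior(race_given_surname, geo_given_race):
    """Naive BISG: P(race | surname, geo) is proportional to
    P(race | surname) * P(geo | race), assuming surname and location are
    independent given race."""
    unnormalized = {
        race: race_given_surname[race] * geo_given_race[race]
        for race in race_given_surname
    }
    total = sum(unnormalized.values())
    return {race: value / total for race, value in unnormalized.items()}


# Hypothetical surname that is strongly associated with one group.
race_given_surname = {"White": 0.05, "Black": 0.02, "Hispanic": 0.02,
                      "Asian": 0.90, "Other": 0.01}
# Hypothetical census block in which the recorded Asian count is zero.
geo_given_race = {"White": 0.0004, "Black": 0.0002, "Hispanic": 0.0001,
                  "Asian": 0.0, "Other": 0.0001}

posterior = bisg_posterior(race_given_surname, geo_given_race)
print(posterior["Asian"])
```

Because the geographic term for the group is exactly zero, the posterior for that group is forced to zero no matter how informative the surname is, which is the failure mode the abstract describes.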

10.
Sci Data ; 9(1): 689, 2022 11 11.
Article in English | MEDLINE | ID: mdl-36369510

ABSTRACT

This article introduces the 50STATESIMULATIONS, a collection of simulated congressional districting plans and underlying code developed by the Algorithm-Assisted Redistricting Methodology (ALARM) Project. The 50STATESIMULATIONS allow for the evaluation of enacted and other congressional redistricting plans in the United States. While the use of redistricting simulation algorithms has become standard in academic research and court cases, any simulation analysis requires non-trivial efforts to combine multiple data sets, identify state-specific redistricting criteria, implement complex simulation algorithms, and summarize and visualize simulation outputs. We have developed a complete workflow that facilitates this entire process of simulation-based redistricting analysis for the congressional districts of all 50 states. The resulting 50STATESIMULATIONS include ensembles of simulated 2020 congressional redistricting plans and necessary replication data. We also provide the underlying code, which serves as a template for customized analyses. All data and code are free and publicly available. This article details the design, creation, and validation of the data.

11.
Sci Adv ; 7(41): eabk3283, 2021 Oct 08.
Article in English | MEDLINE | ID: mdl-34613778

ABSTRACT

Census statistics play a key role in public policy decisions and social science research. However, given the risk of revealing individual information, many statistical agencies are considering disclosure control methods based on differential privacy, which add noise to tabulated data. Unlike other applications of differential privacy, however, census statistics must be postprocessed after noise injection to be usable. We study the impact of the U.S. Census Bureau's latest disclosure avoidance system (DAS) on a major application of census statistics, the redrawing of electoral districts. We find that the DAS systematically undercounts the population in mixed-race and mixed-partisan precincts, yielding unpredictable racial and partisan biases. While the DAS leads to a likely violation of the "One Person, One Vote" standard as currently interpreted, it does not prevent accurate predictions of an individual's race and ethnicity. Our findings underscore the difficulty of balancing accuracy and respondent privacy in the Census.
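The two steps named in this abstract, noise injection followed by postprocessing, can be illustrated with a deliberately simplified stand-in. The sketch uses Laplace noise and a clamp-and-round postprocessing step. The Census Bureau's actual system is considerably more elaborate, so the noise distribution, the scale, and the counts here are all assumptions chosen for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_counts(counts, scale, rng):
    """Add independent Laplace noise to each tabulated count."""
    return [count + laplace_noise(scale, rng) for count in counts]


def postprocess(noisy):
    """Make noisy values usable as counts: round to integers, clamp at zero."""
    return [max(0, round(value)) for value in noisy]


rng = random.Random(0)
true_counts = [3, 0, 1250, 12]  # hypothetical block-level counts
released = postprocess(noisy_counts(true_counts, scale=2.0, rng=rng))
print(released)

# Clamping at zero biases small cells upward: a true zero can only stay or grow.
draws = postprocess(noisy_counts([0] * 2000, scale=2.0, rng=rng))
zero_cell_mean = sum(draws) / len(draws)
print(zero_cell_mean)
```

The raw noise is unbiased, but the clamp is not, so small cells are inflated on average. This gives some intuition for why postprocessing can introduce systematic error in small populations.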

13.
Stat Med ; 37(20): 2907-2922, 2018 09 10.
Article in English | MEDLINE | ID: mdl-29707818

ABSTRACT

The matched-pairs design enables researchers to efficiently infer causal effects from randomized experiments. In this paper, we exploit the key feature of the matched-pairs design and develop a sensitivity analysis for missing outcomes due to truncation by death, in which the outcomes of interest (e.g., quality of life measures) are not even well defined for some units (e.g., deceased patients). Our key idea is that if 2 nearly identical observations are paired prior to the randomization of the treatment, the missingness of one unit's outcome is informative about the potential missingness of the other unit's outcome under an alternative treatment condition. We consider the average treatment effect among always-observed pairs (ATOP) whose units exhibit no missing outcome regardless of their treatment status. The naive estimator based on available pairs is unbiased for the ATOP if 2 units of the same pair are identical in terms of their missingness patterns. The proposed sensitivity analysis characterizes how the bounds of the ATOP widen as the degree of the within-pair similarity decreases. We further extend the methodology to the matched-pairs design in observational studies. Our simulation studies show that informative bounds can be obtained under some scenarios when the proportion of missing data is not too large. The proposed methodology is also applied to the randomized evaluation of the Mexican universal health insurance program. An open-source software package is available for implementing the proposed research.


Subject(s)
Matched-Pair Analysis , Mortality , Humans , Mexico , Models, Statistical , Observational Studies as Topic , Randomized Controlled Trials as Topic , Software , Treatment Outcome , Universal Health Insurance
15.
Psychol Methods ; 19(4): 482-7, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25486116

ABSTRACT

Mediation analysis has been extensively applied in psychological and other social science research. A number of methodologists have recently developed a formal theoretical framework for mediation analysis from a modern causal inference perspective. In Imai, Keele, and Tingley (2010), we have offered such an approach to causal mediation analysis that formalizes identification, estimation, and sensitivity analysis in a single framework. This approach has been used by a number of substantive researchers, and in subsequent work we have also further extended it to more complex settings and developed new research designs. In an insightful article, Pearl (2014) proposed an alternative approach that is based on a set of assumptions weaker than ours. In this comment, we demonstrate that the theoretical differences between our identification assumptions and his alternative conditions are likely to be of little practical relevance in the substantive research settings faced by most psychologists and other social scientists. We also show that our proposed estimation algorithms can be easily applied in the situations discussed in Pearl (2014). The methods discussed in this comment and many more are implemented via mediation, an open-source software package (Tingley, Yamamoto, Hirose, Keele, & Imai, 2013).


Subject(s)
Causality , Models, Statistical , Statistics as Topic , Humans
16.
Pediatr Blood Cancer ; 60(5): 836-41, 2013 May.
Article in English | MEDLINE | ID: mdl-23023736

ABSTRACT

BACKGROUND: Wiskott-Aldrich syndrome (WAS) is a rare X-linked immunodeficiency caused by defects of the WAS protein (WASP) gene. Patients with WAS typically demonstrate micro-thrombocytopenia. PROCEDURES: The report describes seven male infants with WAS who initially presented with leukocytosis, monocytosis, and myeloid and erythroid precursors in the peripheral blood (PB) and dysplasia in the bone marrow (BM), which was initially indistinguishable from juvenile myelomonocytic leukaemia (JMML). RESULTS: The median age of affected patients was 1 month (range, 1-4 months). Splenomegaly was absent in four of these patients, which was unusual for JMML. A mutation analysis of genes in the RAS-signalling pathway did not support a diagnosis of JMML. Non-haematological features, such as eczema (n = 7) and bloody stools (n = 6), ultimately led to the diagnosis of WAS at a median age of 4 months (range, 3-8 months), which was confirmed by absent (n = 6) or reduced (n = 1) WASP expression in lymphocytes by flow cytometry (FCM) and a WASP gene mutation. Interestingly, mean platelet volume (MPV) was normal in three of five patients and six of seven patients demonstrated occasional giant platelets, which was not compatible with WAS. CONCLUSIONS: These data suggest that WAS should be considered in male infants presenting with JMML-like features if no molecular markers of JMML can be detected.


Subject(s)
Leukemia, Myelomonocytic, Juvenile/diagnosis , Leukemia, Myelomonocytic, Juvenile/genetics , Wiskott-Aldrich Syndrome/diagnosis , Wiskott-Aldrich Syndrome/genetics , Bone Marrow/pathology , DNA Mutational Analysis , Diagnosis, Differential , Erythroid Precursor Cells , GTP Phosphohydrolases/genetics , Humans , Infant , Infant, Newborn , Leukocytosis/complications , Male , Membrane Proteins/genetics , Myeloid Progenitor Cells , Protein Tyrosine Phosphatase, Non-Receptor Type 11/genetics , Proto-Oncogene Proteins/genetics , Proto-Oncogene Proteins p21(ras) , Thrombocytopenia , Wiskott-Aldrich Syndrome/blood , Wiskott-Aldrich Syndrome Protein/genetics , ras Proteins/genetics
17.
Eur J Pediatr ; 171(8): 1273-6, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22430350

ABSTRACT

A Japanese patient presented with lymphedema, severe Varicella zoster and Salmonella infection, recurrent respiratory infections, panniculitis, monocytopenia, B- and NK-cell lymphopenia, and myelodysplasia. The phenotype was a mixture of the monocytopenia and mycobacterial infection (MonoMAC) and Emberger syndromes. Sequencing of the GATA-2 cDNA revealed the heterozygous missense mutation 1187 G > A. This mutation resulted in the amino acid substitution Arg396Gln in the zinc finger-2 domain, which is predicted to cause significant structural change and prevent a critical interaction with DNA. Functional analysis of the patient's GATA-2 mutation is required to understand the relationship between these distinctive syndromes.


Subject(s)
GATA2 Transcription Factor/genetics , Immunologic Deficiency Syndromes/diagnosis , Lymphedema/diagnosis , Myelodysplastic Syndromes/diagnosis , Female , Genetic Markers , Humans , Immunologic Deficiency Syndromes/genetics , Lymphedema/genetics , Mutation, Missense , Myelodysplastic Syndromes/genetics , Phenotype , Syndrome , Young Adult
18.
Multivariate Behav Res ; 46(5): 861-873, 2011 Sep.
Article in English | MEDLINE | ID: mdl-23788819

ABSTRACT

In this commentary, we demonstrate how the potential outcomes framework can help understand the key identification assumptions underlying causal mediation analysis. We show that this framework can lead to the development of alternative research design and statistical analysis strategies applicable to the longitudinal data settings considered by Maxwell, Cole, and Mitchell (2011).

19.
Psychol Methods ; 15(4): 309-34, 2010 Dec.
Article in English | MEDLINE | ID: mdl-20954780

ABSTRACT

Traditionally in the social sciences, causal mediation analysis has been formulated, understood, and implemented within the framework of linear structural equation models. We argue and demonstrate that this is problematic for 3 reasons: the lack of a general definition of causal mediation effects independent of a particular statistical model, the inability to specify the key identification assumption, and the difficulty of extending the framework to nonlinear models. In this article, we propose an alternative approach that overcomes these limitations. Our approach is general because it offers the definition, identification, estimation, and sensitivity analysis of causal mediation effects without reference to any specific statistical model. Further, our approach explicitly links these 4 elements closely together within a single framework. As a result, the proposed framework can accommodate linear and nonlinear relationships, parametric and nonparametric models, continuous and discrete mediators, and various types of outcome variables. The general definition and identification result also allow us to develop sensitivity analysis in the context of commonly used models, which enables applied researchers to formally assess the robustness of their empirical conclusions to violations of the key assumption. We illustrate our approach by applying it to the Job Search Intervention Study. We also offer easy-to-use software that implements all our proposed methods.


Subject(s)
Causality , Models, Statistical , Algorithms , Data Interpretation, Statistical , Linear Models , Sensitivity and Specificity , Social Sciences/methods
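For readers who want the definitions behind this abstract in symbols, the quantities can be written in potential-outcomes notation. The notation is standard in this literature but is an assumption here, since the abstract itself introduces no symbols: $T_i$ is a binary treatment, $M_i(t)$ the potential mediator, $Y_i(t, m)$ the potential outcome, and $X_i$ the pretreatment covariates.

```latex
\begin{align*}
\delta_i(t) &\equiv Y_i\bigl(t, M_i(1)\bigr) - Y_i\bigl(t, M_i(0)\bigr)
  && \text{(causal mediation effect)} \\
\zeta_i(t) &\equiv Y_i\bigl(1, M_i(t)\bigr) - Y_i\bigl(0, M_i(t)\bigr)
  && \text{(direct effect)} \\
\tau_i &\equiv Y_i\bigl(1, M_i(1)\bigr) - Y_i\bigl(0, M_i(0)\bigr)
  = \delta_i(t) + \zeta_i(1 - t), \quad t \in \{0, 1\}
  && \text{(total effect)}
\end{align*}
% Sequential ignorability, the key identification assumption:
\begin{align*}
\{Y_i(t', m),\, M_i(t)\} &\perp\!\!\!\perp T_i \mid X_i = x, \\
Y_i(t', m) &\perp\!\!\!\perp M_i(t) \mid T_i = t,\ X_i = x .
\end{align*}
```

The decomposition of $\tau_i$ can be checked by substitution: the term $Y_i(1, M_i(0))$ cancels when $t = 1$, and the term $Y_i(0, M_i(1))$ cancels when $t = 0$. Because none of the definitions refers to a particular regression model, they apply equally to linear and nonlinear settings.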
20.
Lancet ; 373(9673): 1447-54, 2009 Apr 25.
Article in English | MEDLINE | ID: mdl-19359034

ABSTRACT

BACKGROUND: We assessed aspects of Seguro Popular, a programme aimed to deliver health insurance, regular and preventive medical care, medicines, and health facilities to 50 million uninsured Mexicans. METHODS: We randomly assigned treatment within 74 matched pairs of health clusters (ie, health facility catchment areas) representing 118 569 households in seven Mexican states, and measured outcomes in a 2005 baseline survey (August, 2005, to September, 2005) and follow-up survey 10 months later (July, 2006, to August, 2006) in 50 pairs (n=32 515). The treatment consisted of encouragement to enrol in a health-insurance programme and upgraded medical facilities. Participant states also received funds to improve health facilities and to provide medications for services in treated clusters. We estimated intention to treat and complier average causal effects non-parametrically. FINDINGS: Intention-to-treat estimates indicated a 23% reduction from baseline in catastrophic expenditures (1.9% points; 95% CI 0.14-3.66). The effect in poor households was 3.0% points (0.46-5.54) and in experimental compliers was 6.5% points (1.65-11.28), 30% and 59% reductions, respectively. The intention-to-treat effect on health spending in poor households was 426 pesos (39-812), and the complier average causal effect was 915 pesos (147-1684). Contrary to expectations and previous observational research, we found no effects on medication spending, health outcomes, or utilisation. INTERPRETATION: Programme resources reached the poor. However, the programme did not show some other effects, possibly due to the short duration of treatment (10 months). Although Seguro Popular seems to be successful at this early stage, further experiments and follow-up studies, with longer assessment periods, are needed to ascertain the long-term effects of the programme.


Subject(s)
Health Policy , Insurance, Health , National Health Programs , Universal Health Insurance , Adult , Child , Child, Preschool , Cluster Analysis , Female , Health Care Surveys , Health Expenditures/statistics & numerical data , Humans , Infant , Male , Mexico , Program Evaluation , Socioeconomic Factors
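The relationship between an intention-to-treat (ITT) effect and a complier average causal effect (CACE) can be shown with a simple ratio estimator. This is a textbook sketch that ignores the matched-pair cluster structure of the actual study and assumes randomized encouragement, no effect of encouragement except through enrolment, and no units who do the opposite of what they are encouraged to do. The function names and all data are hypothetical.

```python
from statistics import mean


def itt_effect(values, assigned):
    """Difference in means between units assigned to treatment and to control."""
    treated = [v for v, z in zip(values, assigned) if z == 1]
    control = [v for v, z in zip(values, assigned) if z == 0]
    return mean(treated) - mean(control)


def cace(outcomes, enrolled, assigned):
    """ITT effect on the outcome divided by the ITT effect on enrolment."""
    return itt_effect(outcomes, assigned) / itt_effect(enrolled, assigned)


# Hypothetical data: 1 = encouraged to enrol, 0 = not encouraged.
assigned = [1, 1, 1, 1, 0, 0, 0, 0]
enrolled = [1, 1, 0, 0, 0, 0, 0, 0]  # only half of the encouraged enrol
outcomes = [2, 4, 6, 8, 6, 8, 6, 8]  # e.g. spending in arbitrary units

print(itt_effect(outcomes, assigned))
print(cace(outcomes, enrolled, assigned))
```

Because only some encouraged units enrol, the CACE is larger in magnitude than the ITT effect, which matches the pattern in the abstract where the complier estimates exceed the intention-to-treat estimates.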