Results 1 - 8 of 8
1.
Nat Neurosci; 23(12): 1655-1665, 2020 Dec.
Article in English | MEDLINE | ID: mdl-33230329

ABSTRACT

Electrophysiological signals exhibit both periodic and aperiodic properties. Periodic oscillations have been linked to numerous physiological, cognitive, behavioral and disease states. Emerging evidence demonstrates that the aperiodic component has putative physiological interpretations and that it dynamically changes with age, task demands and cognitive states. Electrophysiological neural activity is typically analyzed using canonically defined frequency bands, without consideration of the aperiodic (1/f-like) component. We show that standard analytic approaches can conflate periodic parameters (center frequency, power, bandwidth) with aperiodic ones (offset, exponent), compromising physiological interpretations. To overcome these limitations, we introduce an algorithm to parameterize neural power spectra as a combination of an aperiodic component and putative periodic oscillatory peaks. This algorithm requires no a priori specification of frequency bands. We validate this algorithm on simulated data, and demonstrate how it can be used in applications ranging from analyzing age-related changes in working memory to large-scale data exploration and analysis.
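The parameterization described above can be sketched in a few lines: model log power as an aperiodic component (offset minus exponent times log frequency) plus Gaussian oscillatory peaks, and fit all parameters jointly, with no preset frequency bands. The snippet below is a minimal single-peak illustration, not the authors' released implementation; the function and parameter names are ours.

```python
import numpy as np
from scipy.optimize import curve_fit

def spectrum_model(freqs, offset, exponent, peak_cf, peak_pw, peak_bw):
    """Log-power model: aperiodic 1/f component plus one Gaussian peak."""
    aperiodic = offset - exponent * np.log10(freqs)
    peak = peak_pw * np.exp(-((freqs - peak_cf) ** 2) / (2 * peak_bw ** 2))
    return aperiodic + peak

# Simulate a spectrum with a known 10 Hz alpha peak on a 1/f background.
freqs = np.linspace(3, 40, 200)
true_params = (1.0, 1.5, 10.0, 0.6, 1.2)
log_power = spectrum_model(freqs, *true_params)

# Joint fit recovers periodic (center frequency, power, bandwidth) and
# aperiodic (offset, exponent) parameters without conflating them.
p0 = (0.5, 1.0, 9.0, 0.3, 1.0)
popt, _ = curve_fit(spectrum_model, freqs, log_power, p0=p0)
print(popt)
```

On real spectra one would fit multiple peaks and add noise handling; this sketch only shows why separating the two components avoids attributing 1/f slope changes to band power.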


Subject(s)
Electrophysiological Phenomena/physiology , Periodicity , Adult , Aged , Aging/psychology , Algorithms , Animals , Cognition/physiology , Electroencephalography , Female , Humans , Macaca mulatta , Magnetic Resonance Imaging , Magnetoencephalography , Male , Memory, Short-Term , Middle Aged , Psychomotor Performance/physiology , Reproducibility of Results , Young Adult
2.
Circ Genom Precis Med; 13(6): e003014, 2020 Dec.
Article in English | MEDLINE | ID: mdl-33125279

ABSTRACT

BACKGROUND: The aortic valve is an important determinant of cardiovascular physiology and the anatomic location of common human diseases.
METHODS: From a sample of 34 287 white British ancestry participants, we estimated functional aortic valve area by planimetry from prospectively obtained cardiac magnetic resonance imaging sequences of the aortic valve. Aortic valve area measurements were submitted to genome-wide association testing, followed by polygenic risk scoring and phenome-wide screening, to identify genetic comorbidities.
RESULTS: A genome-wide association study of aortic valve area in these UK Biobank participants showed 3 significant associations, indexed by rs71190365 (chr13:50764607, DLEU1, P=1.8×10-9), rs35991305 (chr12:94191968, CRADD, P=3.4×10-8), and chr17:45013271:C:T (GOSR2, P=5.6×10-8). Replication in an independent set of 8145 unrelated European ancestry participants showed consistent effect sizes at all 3 loci, although rs35991305 did not meet nominal significance. We constructed a polygenic risk score for aortic valve area, which in a separate cohort of 311 728 individuals without imaging demonstrated that smaller aortic valve area is predictive of increased risk for aortic valve disease (odds ratio, 1.14; P=2.3×10-6). After excluding subjects with a medical diagnosis of aortic valve stenosis (remaining n=308 683 individuals), phenome-wide association of >10 000 traits showed multiple links between the polygenic score for aortic valve disease and key health-related comorbidities involving the cardiovascular system and autoimmune disease. Genetic correlation analysis supports a shared genetic etiology between aortic valve area and birth weight, along with other cardiovascular conditions.
CONCLUSIONS: These results illustrate the use of automated phenotyping of cardiac imaging data from the general population to investigate the genetic etiology of aortic valve disease, perform clinical prediction, and uncover new clinical and genetic correlates of cardiac anatomy.
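In spirit, a polygenic risk score like the one above is a weighted sum of risk-allele dosages across associated variants. A minimal sketch, using the three index variants named in the abstract but invented effect sizes (the real score would span many more variants with GWAS-estimated weights):

```python
# Hypothetical per-variant effect sizes on aortic valve area; negative
# values mean the risk allele is associated with a smaller valve area.
# The variant IDs are from the abstract, the numbers are invented.
effect_sizes = {
    "rs71190365": -0.08,
    "rs35991305": -0.05,
    "chr17:45013271:C:T": -0.04,
}

def polygenic_score(dosages):
    """Weighted sum of risk-allele dosages (0, 1, or 2 copies per variant)."""
    return sum(effect_sizes[v] * d for v, d in dosages.items())

# Two hypothetical individuals: a lower polygenic score here predicts a
# smaller valve area, and hence higher risk of aortic valve disease.
low_risk = {"rs71190365": 0, "rs35991305": 0, "chr17:45013271:C:T": 1}
high_risk = {"rs71190365": 2, "rs35991305": 2, "chr17:45013271:C:T": 2}
print(polygenic_score(low_risk), polygenic_score(high_risk))
```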


Subject(s)
Aortic Valve/diagnostic imaging , Biological Specimen Banks , Cardiovascular Diseases/diagnostic imaging , Cardiovascular Diseases/genetics , Genome-Wide Association Study , Magnetic Resonance Imaging , Adult , Aged , Aortic Valve/pathology , Aortic Valve Stenosis/diagnostic imaging , Aortic Valve Stenosis/genetics , Comorbidity , Female , Genome, Human , Humans , Male , Middle Aged , Multifactorial Inheritance/genetics , Phenomics , Phenotype , Survival Analysis , United Kingdom
3.
Nat Commun; 10(1): 3111, 2019 Jul 15.
Article in English | MEDLINE | ID: mdl-31308376

ABSTRACT

Biomedical repositories such as the UK Biobank provide increasing access to prospectively collected cardiac imaging; however, these data are unlabeled, which creates barriers to their use in supervised machine learning. We develop a weakly supervised deep learning model for classification of aortic valve malformations using up to 4,000 unlabeled cardiac MRI sequences. Instead of requiring highly curated training data, weak supervision relies on noisy heuristics defined by domain experts to programmatically generate large-scale, imperfect training labels. For aortic valve classification, models trained with imperfect labels substantially outperform a supervised model trained on hand-labeled MRIs. In an orthogonal validation experiment using health outcomes data, our model identifies individuals with a 1.8-fold increase in risk of a major adverse cardiac event. This work formalizes a deep learning baseline for aortic valve classification and outlines a general strategy for using weak supervision to train machine learning models using unlabeled medical images at scale.
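The core weak-supervision mechanism above, aggregating noisy, expert-written heuristics into programmatic training labels, can be sketched as follows. The heuristics and cutoffs here are invented stand-ins for the domain-expert rules the paper describes:

```python
import numpy as np

# Noisy heuristics over a toy two-element feature vector x.
# Each returns +1 (abnormal), -1 (normal), or 0 (abstain).
def h1(x): return 1 if x[0] > 0.7 else -1   # e.g. a geometric cutoff
def h2(x): return 1 if x[1] > 0.5 else 0    # abstains when unsure
def h3(x): return -1 if x[0] < 0.3 else 1

heuristics = [h1, h2, h3]

def weak_label(x):
    """Aggregate heuristic votes; ties or all-abstain yield no label."""
    votes = sum(h(x) for h in heuristics)
    if votes == 0:
        return None
    return 1 if votes > 0 else -1

X = np.array([[0.9, 0.8], [0.1, 0.2], [0.5, 0.6]])
labels = [weak_label(x) for x in X]
print(labels)  # → [1, -1, 1]
```

The resulting imperfect labels would then train an ordinary supervised model; the point is that label volume compensates for label noise.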


Subject(s)
Aortic Valve/abnormalities , Heart Valve Diseases/pathology , Machine Learning , Aortic Valve/diagnostic imaging , Aortic Valve/pathology , Heart Diseases/pathology , Heart Valve Diseases/diagnostic imaging , Humans , Magnetic Resonance Imaging , Neural Networks, Computer , Supervised Machine Learning
4.
Proc IEEE Int Conf Comput Vis; 2019: 2580-2590, 2019.
Article in English | MEDLINE | ID: mdl-32218709

ABSTRACT

Visual knowledge bases such as Visual Genome power numerous applications in computer vision, including visual question answering and captioning, but suffer from sparse, incomplete relationships. All scene graph models to date are limited to training on a small set of visual relationships that have thousands of training labels each. Hiring human annotators is expensive, and textual knowledge base completion methods are incompatible with visual data. In this paper, we introduce a semi-supervised method that assigns probabilistic relationship labels to a large number of unlabeled images using few labeled examples. We analyze visual relationships to suggest two types of image-agnostic features that are used to generate noisy heuristics, whose outputs are aggregated using a factor graph-based generative model. With as few as 10 labeled examples per relationship, the generative model creates enough training data to train any existing state-of-the-art scene graph model. We demonstrate that our method outperforms all baseline approaches on scene graph prediction by 5.16 recall@100 for PREDCLS. In our limited label setting, we define a complexity metric for relationships that serves as an indicator (R2 = 0.778) for conditions under which our method succeeds over transfer learning, the de facto approach for training with limited labels.
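The factor graph-based generative model is beyond a short snippet, but a simplified naive-Bayes aggregation conveys the key idea: combining heuristic votes into probabilistic (rather than hard) labels, weighted by how reliable each heuristic is. The per-heuristic accuracies below are assumed constants; the actual generative model learns them from vote agreement:

```python
import math

# Assumed accuracies of three noisy heuristics (invented for illustration;
# in practice these are estimated without ground truth).
accuracies = [0.9, 0.7, 0.6]

def prob_label(votes, prior=0.5):
    """P(y = +1 | votes) under a naive-Bayes model of heuristic errors.
    votes[i] is +1 or -1; heuristic i is correct with probability
    accuracies[i], independently of the others."""
    log_odds = math.log(prior / (1 - prior))
    for v, a in zip(votes, accuracies):
        log_odds += v * math.log(a / (1 - a))
    return 1 / (1 + math.exp(-log_odds))

print(round(prob_label([1, 1, -1]), 3))  # → 0.933
```

Note how the accurate heuristics outvote the weak dissenter but the output stays a calibrated probability, which downstream models can consume as a soft training label.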

5.
Proceedings VLDB Endowment; 12(3): 223-236, 2018 Nov.
Article in English | MEDLINE | ID: mdl-31777681

ABSTRACT

As deep learning models are applied to increasingly diverse problems, a key bottleneck is gathering enough high-quality training labels tailored to each task. Users therefore turn to weak supervision, relying on imperfect sources of labels like pattern matching and user-defined heuristics. Unfortunately, users have to design these sources for each task. This process can be time consuming and expensive: domain experts often perform repetitive steps like guessing optimal numerical thresholds and developing informative text patterns. To address these challenges, we present Snuba, a system to automatically generate heuristics using a small labeled dataset to assign training labels to a large, unlabeled dataset in the weak supervision setting. Snuba generates heuristics, each of which labels the subset of the data on which it is accurate, and iteratively repeats this process until the heuristics together label a large portion of the unlabeled data. We develop a statistical measure that guarantees the iterative process will automatically terminate before it degrades training label quality. Snuba automatically generates heuristics in under five minutes and performs up to 9.74 F1 points better than the best known user-defined heuristics developed over many days. In collaborations with users at research labs, Stanford Hospital, and on open source datasets, Snuba outperforms other automated approaches like semi-supervised learning by up to 14.35 F1 points.
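One pass of a Snuba-style loop can be caricatured as: fit a simple heuristic (here a threshold rule) on the small labeled set, then let it label only the unlabeled points it is confident about, leaving the rest for later iterations. The one-feature data, margin, and stump learner below are all invented for illustration, and the termination statistic is omitted:

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny labeled set (one feature) and a large unlabeled pool.
X_lab = np.array([0.1, 0.2, 0.3, 0.7, 0.8, 0.9])
y_lab = np.array([-1, -1, -1, 1, 1, 1])
X_unlab = rng.uniform(0, 1, 1000)

def fit_stump(X, y):
    """Choose the threshold (midpoints of sorted neighbors) that best
    separates the labeled data."""
    order = np.argsort(X)
    Xs, ys = X[order], y[order]
    best_t, best_acc = None, -1.0
    for t in (Xs[:-1] + Xs[1:]) / 2:
        acc = np.mean(np.where(Xs > t, 1, -1) == ys)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

t = fit_stump(X_lab, y_lab)

# The heuristic labels only points it is confident about; a margin
# around the threshold is left unlabeled for later iterations.
margin = 0.1
confident = np.abs(X_unlab - t) > margin
weak_labels = np.where(X_unlab > t, 1, -1)[confident]
coverage = confident.mean()
print(t, round(coverage, 2))
```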

6.
Proc Conf Assoc Comput Linguist Meet; 2018: 1884-1895, 2018 Jul.
Article in English | MEDLINE | ID: mdl-31130772

ABSTRACT

Training accurate classifiers requires many labels, but each label provides only limited information (one bit for binary classification). In this work, we propose BabbleLabble, a framework for training classifiers in which an annotator provides a natural language explanation for each labeling decision. A semantic parser converts these explanations into programmatic labeling functions that generate noisy labels for an arbitrary amount of unlabeled data, which is used to train a classifier. On three relation extraction tasks, we find that users are able to train classifiers with comparable F1 scores 5-100× faster by providing explanations instead of just labels. Furthermore, given the inherent imperfection of labeling functions, we find that a simple rule-based semantic parser suffices.
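A rule-based semantic parser of the kind the abstract finds sufficient can be very small: match an explanation against a fixed grammar and emit a labeling function. The toy grammar below handles a single contains-phrase pattern and is our invention, not the paper's actual grammar:

```python
import re

def parse_explanation(explanation):
    """Turn a natural-language explanation like
        label positive because the sentence contains "his wife"
    into a programmatic labeling function returning +1, -1, or 0 (abstain)."""
    m = re.match(
        r'label (positive|negative) because the sentence contains "(.+)"',
        explanation,
    )
    if not m:
        raise ValueError("explanation does not match the toy grammar")
    vote = 1 if m.group(1) == "positive" else -1
    phrase = m.group(2)
    return lambda sentence: vote if phrase in sentence else 0

lf = parse_explanation('label positive because the sentence contains "his wife"')
print(lf("Tom and his wife Ann"), lf("Tom went home"))  # → 1 0
```

Because the resulting labeling functions are applied to unlimited unlabeled text, even a crude parser extracts far more than one bit per annotation.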

7.
Article in English | MEDLINE | ID: mdl-31131381

ABSTRACT

Using machine learning to analyze data often results in developer exhaust: code, logs, or metadata that do not define the learning algorithm but are byproducts of the data analytics pipeline. We study how the rich information present in developer exhaust can be used to approximately solve otherwise complex tasks. Specifically, we focus on using log data associated with training deep learning models to perform model search by predicting performance metrics for untrained models. Instead of designing a different model for each performance metric, we present two preliminary methods that rely only on information present in logs to predict these characteristics for different architectures. We introduce (i) a nearest neighbor approach with a hand-crafted edit distance metric to compare model architectures and (ii) a more generalizable, end-to-end approach that trains an LSTM using model architectures and associated logs to predict performance metrics of interest. We perform model search optimizing for best validation accuracy, degree of overfitting, and best validation accuracy given a constraint on training time. Our approaches can predict validation accuracy within 1.37% error on average, while the baseline achieves 4.13% by using the performance of a trained model with the closest number of layers. When choosing the best performing model given constraints on training time, our approaches select the top-3 models that overlap with the true top-3 models 82% of the time, while the baseline only achieves this 54% of the time. Our preliminary experiments hold promise for how developer exhaust can help learn models that can approximate various complex tasks efficiently.
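The nearest-neighbor approach (i) can be sketched as follows. The architecture strings, accuracies, and similarity measure (difflib's sequence ratio standing in for the paper's hand-crafted edit distance) are all invented for illustration:

```python
from difflib import SequenceMatcher

# Hypothetical "exhaust": architecture strings from training logs mapped
# to the observed validation accuracy of already-trained models.
trained = {
    "conv3-conv3-pool-fc": 0.88,
    "conv3-conv3-conv3-pool-fc": 0.91,
    "fc-fc": 0.74,
}

def predict_accuracy(arch):
    """Predict an untrained model's accuracy from its most similar
    trained neighbor, so no new training run is needed."""
    best = max(trained, key=lambda a: SequenceMatcher(None, a, arch).ratio())
    return trained[best]

print(predict_accuracy("conv3-pool-fc"))  # → 0.88
```

Model search then reduces to ranking candidate architecture strings by predicted accuracy, which is cheap compared to training each candidate.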

8.
Adv Neural Inf Process Syst; 30: 239-249, 2017 Dec.
Article in English | MEDLINE | ID: mdl-29391769

ABSTRACT

Obtaining enough labeled data to robustly train complex discriminative models is a major bottleneck in the machine learning pipeline. A popular solution is combining multiple sources of weak supervision using generative models. The structure of these models affects training label quality, but is difficult to learn without any ground truth labels. We instead rely on these weak supervision sources having some structure by virtue of being encoded programmatically. We present Coral, a paradigm that infers generative model structure by statically analyzing the code for these heuristics, thus significantly reducing the data required to learn structure. We prove that Coral's sample complexity scales quasilinearly with the number of heuristics and number of relations found, improving over the standard sample complexity, which is exponential in n for identifying nth degree relations. Experimentally, Coral matches or outperforms traditional structure learning approaches by up to 3.81 F1 points. Using Coral to model dependencies instead of assuming independence results in better performance than a fully supervised model by 3.07 accuracy points when heuristics are used to label radiology data without ground truth labels.
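The static-analysis idea can be illustrated with Python introspection: read which named primitives each heuristic consumes, and connect any two heuristics that share one, since shared inputs induce correlated votes. This is a toy stand-in for Coral's actual code analysis; the heuristics and primitives are invented:

```python
import inspect

# Toy heuristics over named primitives extracted from a radiology image.
def h_area(area, perimeter):
    return 1 if area > 100 else -1

def h_ratio(area, intensity):
    return 1 if area / intensity > 2 else -1

def h_bright(intensity):
    return 1 if intensity > 0.5 else -1

def infer_dependencies(heuristics):
    """Statically read each heuristic's signature; two heuristics that
    share a primitive input get a dependency edge in the generative
    model's structure, with no labeled data required."""
    prims = {h.__name__: set(inspect.signature(h).parameters)
             for h in heuristics}
    names = list(prims)
    return {(a, b) for i, a in enumerate(names) for b in names[i + 1:]
            if prims[a] & prims[b]}

deps = infer_dependencies([h_area, h_ratio, h_bright])
print(deps)
```

Here `h_area` and `h_ratio` share `area`, and `h_ratio` and `h_bright` share `intensity`, so two edges are inferred while the independent pair stays unconnected.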
