Results 1 - 20 of 42
1.
medRxiv ; 2024 Mar 28.
Article in English | MEDLINE | ID: mdl-38585855

ABSTRACT

Cough is a common and commonly ignored symptom of lung disease. Cough is often perceived as difficult to quantify, frequently self-limiting, and non-specific. However, cough has a central role in the clinical detection of many lung diseases including tuberculosis (TB), which remains the leading infectious disease killer worldwide. TB screening currently relies on self-reported cough which fails to meet the World Health Organization (WHO) accuracy targets for a TB triage test. Artificial intelligence (AI) models based on cough sound have been developed for several respiratory conditions, with limited work being done in TB. To support the development of an accurate, point-of-care cough-based triage tool for TB, we have compiled a large multi-country database of cough sounds from individuals being evaluated for TB. The dataset includes more than 700,000 cough sounds from 2,143 individuals with detailed demographic, clinical and microbiologic diagnostic information. We aim to empower researchers in the development of cough sound analysis models to improve TB diagnosis, where innovative approaches are critically needed to end this long-standing pandemic.

2.
BMC Med Inform Decis Mak ; 24(1): 57, 2024 Feb 20.
Article in English | MEDLINE | ID: mdl-38378636

ABSTRACT

BACKGROUND: The two-way partial AUC has been recently proposed as a way to directly quantify partial area under the ROC curve with simultaneous restrictions on the sensitivity and specificity ranges of diagnostic tests or classifiers. The metric, as originally implemented in the tpAUC R package, is estimated using a nonparametric estimator based on a trimmed Mann-Whitney U-statistic, which becomes computationally expensive in large sample sizes. (Its computational complexity is of order [Formula: see text], where [Formula: see text] and [Formula: see text] represent the number of positive and negative cases, respectively.) This is problematic since the statistical methodology for comparing estimates generated from alternative diagnostic tests/classifiers relies on bootstrap resampling and requires repeated computations of the estimator on a large number of bootstrap samples. METHODS: By leveraging the graphical and probabilistic representations of the AUC, partial AUCs, and two-way partial AUC, we derive a novel estimator for the two-way partial AUC, which can be directly computed from the output of any software able to compute AUC and partial AUCs. We implemented our estimator using the computationally efficient pROC R package, which leverages a nonparametric approach using the trapezoidal rule for the computation of AUC and partial AUC scores. (Its computational complexity is of order [Formula: see text], where [Formula: see text].) We compare the empirical bias and computation time of the proposed estimator against the original estimator provided in the tpAUC package in a series of simulation studies and on two real datasets. RESULTS: Our estimator tended to be less biased than the original estimator based on the trimmed Mann-Whitney U-statistic across all experiments (and showed considerably less bias in the experiments based on small sample sizes).
But, most importantly, because the computational complexity of the proposed estimator is of order [Formula: see text], rather than [Formula: see text], it is much faster to compute when sample sizes are large. CONCLUSIONS: The proposed estimator provides an improvement for the computation of two-way partial AUC, and allows the comparison of diagnostic tests/machine learning classifiers in large datasets where repeated computations of the original estimator on bootstrap samples become too expensive to compute.
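
The contrast between a pairwise Mann-Whitney computation and a sort-based trapezoidal computation can be illustrated on the ordinary (full) AUC. The following is a minimal sketch only: it does not implement the two-way partial AUC or the estimator proposed in the article, and the function names and toy scores are hypothetical. It shows that the two forms give the same value, while the first visits every positive/negative pair and the second needs one sort.

```python
from itertools import product


def auc_pairwise(pos, neg):
    """Mann-Whitney form of the full AUC: the fraction of (positive, negative)
    pairs in which the positive scores higher, with ties counted as 1/2.
    Visits every pair, so the work grows with len(pos) * len(neg)."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def auc_trapezoid(pos, neg):
    """Full AUC as the area under the empirical ROC curve (trapezoidal rule).
    Sorts the scores once, then sweeps the threshold from high to low."""
    scored = sorted(
        [(s, 1) for s in pos] + [(s, 0) for s in neg],
        key=lambda t: -t[0],
    )
    n1, n0 = len(pos), len(neg)
    tp = fp = 0
    prev_tpr = prev_fpr = 0.0
    area = 0.0
    i = 0
    while i < len(scored):
        j = i
        # Consume all observations tied at the current score together.
        while j < len(scored) and scored[j][0] == scored[i][0]:
            tp += scored[j][1]
            fp += 1 - scored[j][1]
            j += 1
        tpr, fpr = tp / n1, fp / n0
        area += (fpr - prev_fpr) * (tpr + prev_tpr) / 2
        prev_tpr, prev_fpr = tpr, fpr
        i = j
    return area


# Hypothetical classifier scores, including a tie across classes.
pos_scores = [0.9, 0.8, 0.4, 0.4]
neg_scores = [0.7, 0.4, 0.2, 0.1]
print(auc_pairwise(pos_scores, neg_scores), auc_trapezoid(pos_scores, neg_scores))
```

On this toy input both functions return 0.8125 (13 of 16 pairs, counting ties as half).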


Subject(s)
Area Under Curve , Humans , Computer Simulation
3.
JMIR Public Health Surveill ; 9: e42963, 2023 06 19.
Article in English | MEDLINE | ID: mdl-37335609

ABSTRACT

BACKGROUND: Public involvement in research is a growing phenomenon as well as a condition of research funding, and it is often referred to as coproduction. Coproduction involves stakeholder contributions at every stage of research, but different processes exist. However, the impact of coproduction on research is not well understood. Web-based young people's advisory groups (YPAGs) were established as part of the MindKind study at 3 sites (India, South Africa, and the United Kingdom) to coproduce the wider research study. Each group site, led by a professional youth advisor, conducted all youth coproduction activities collaboratively with other research staff. OBJECTIVE: This study aimed to evaluate the impact of youth coproduction in the MindKind study. METHODS: To measure the impact of web-based youth coproduction on all stakeholders, the following methods were used: analysis of project documents, capturing the views of stakeholders using the Most Significant Change technique, and impact frameworks to assess the impact of youth coproduction on specific stakeholder outcomes. Data were analyzed in collaboration with researchers, advisors, and YPAG members to explore the impact of youth coproduction on research. RESULTS: The impact was recorded on 5 levels. First, at the paradigmatic level, a novel method of conducting research allowed for a widely diverse group of YPAG representations, influencing study priorities, conceptualization, and design. Second, at the infrastructural level, the YPAG and youth advisors meaningfully contributed to the dissemination of materials; infrastructural constraints of undertaking coproduction were also identified. Third, at the organizational level, coproduction necessitated implementing new communication practices, such as a web-based shared platform. This meant that materials were easily accessible to the whole team and communication streams remained consistent. 
Fourth, at the group level, authentic relationships developed between the YPAG members, advisors, and the rest of the team, facilitated by regular web-based contact. Finally, at the individual level, participants reported enhanced insights into mental well-being and appreciation for the opportunity to engage in research. CONCLUSIONS: This study revealed several factors that shape the creation of web-based coproduction, with clear positive outcomes for advisors, YPAG members, researchers, and other project staff. However, several challenges of coproduced research were also encountered in multiple contexts and amid pressing timelines. For systematic reporting of the impact of youth coproduction, we propose that monitoring, evaluation, and learning systems be designed and implemented early.


Subject(s)
Learning , Mental Health , Humans , Adolescent , United Kingdom , Communication , Internet
4.
PLoS One ; 18(4): e0279857, 2023.
Article in English | MEDLINE | ID: mdl-37074995

ABSTRACT

Mobile devices offer a scalable opportunity to collect longitudinal data that facilitate advances in mental health treatment to address the burden of mental health conditions in young people. Sharing these data with the research community is critical to gaining maximal value from rich data of this nature. However, the highly personal nature of the data necessitates understanding the conditions under which young people are willing to share them. To answer this question, we developed the MindKind Study, a multinational, mixed methods study that solicits young people's preferences for how their data are governed and quantifies potential participants' willingness to join under different conditions. We employed a community-based participatory approach, involving young people as stakeholders and co-researchers. At sites in India, South Africa, and the UK, we enrolled 3575 participants ages 16-24 in the mobile app-mediated quantitative study and 143 participants in the public deliberation-based qualitative study. We found that while youth participants have strong preferences for data governance, these preferences did not translate into (un)willingness to join the smartphone-based study. Participants grappled with the risks and benefits of participation as well as their desire that the "right people" access their data. Throughout the study, we recognized young people's commitment to finding solutions and co-producing research architectures to allow for more open sharing of mental health data to accelerate and derive maximal benefit from research.


Subject(s)
Mental Health , Adolescent , Humans , Young Adult , Adult , South Africa , Qualitative Research , United Kingdom , India
5.
PLOS Digit Health ; 2(3): e0000208, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36976789

ABSTRACT

One of the promising opportunities of digital health is its potential to lead to more holistic understandings of diseases by interacting with the daily life of patients and through the collection of large amounts of real-world data. Validating and benchmarking indicators of disease severity in the home setting is difficult, however, given the large number of confounders present in the real world and the challenges in collecting ground truth data in the home. Here we leverage two datasets collected from patients with Parkinson's disease, which couple continuous wrist-worn accelerometer data with frequent symptom reports in the home setting, to develop digital biomarkers of symptom severity. Using these data, we performed a public benchmarking challenge in which participants were asked to build measures of severity across 3 symptoms (on/off medication, dyskinesia, and tremor). Forty-two teams participated, and performance was improved over baseline models for each subchallenge. Additional ensemble modeling across submissions further improved performance, and the top models were validated in a subset of patients whose symptoms were observed and rated by trained clinicians.

7.
Biol Psychiatry ; 91(1): 92-101, 2022 01 01.
Article in English | MEDLINE | ID: mdl-34154796

ABSTRACT

BACKGROUND: While schizophrenia differs between males and females in the age of onset, symptomatology, and disease course, the molecular mechanisms underlying these differences remain uncharacterized. METHODS: To address questions about the sex-specific effects of schizophrenia, we performed a large-scale transcriptome analysis of RNA sequencing data from 437 controls and 341 cases from two distinct cohorts from the CommonMind Consortium. RESULTS: Analysis across the cohorts identified a reproducible gene expression signature of schizophrenia that was highly concordant with previous work. Differential expression across sex was reproducible across cohorts and identified X- and Y-linked genes, as well as those involved in dosage compensation. Intriguingly, the sex expression signature was also enriched for genes involved in neurexin family protein binding and synaptic organization. Differential expression analysis testing a sex-by-diagnosis interaction effect did not identify any genome-wide signature after multiple testing corrections. Gene coexpression network analysis was performed to reduce dimensionality from thousands of genes to dozens of modules and elucidate interactions among genes. We found enrichment of coexpression modules for sex-by-diagnosis differential expression signatures, which were highly reproducible across the two cohorts and involved a number of diverse pathways, including neural nucleus development, neuron projection morphogenesis, and regulation of neural precursor cell proliferation. CONCLUSIONS: Overall, our results indicate that the effect size of sex differences in schizophrenia gene expression signatures is small and underscore the challenge of identifying robust sex-by-diagnosis signatures, which will require future analyses in larger cohorts.


Subject(s)
Schizophrenia , Transcriptome , Brain , Female , Gene Expression Profiling , Humans , Male , Schizophrenia/genetics , Sex Characteristics
8.
Genome Med ; 13(1): 76, 2021 05 04.
Article in English | MEDLINE | ID: mdl-33947463

ABSTRACT

BACKGROUND: Alzheimer's disease (AD) is an incurable neurodegenerative disease currently affecting 1.75% of the US population, with projected growth to 3.46% by 2050. Identifying common genetic variants driving differences in transcript expression that confer AD risk is necessary to elucidate AD mechanism and develop therapeutic interventions. We modify the FUSION transcriptome-wide association study (TWAS) pipeline to ingest gene expression values from multiple neocortical regions. METHODS: A combined dataset of 2003 genotypes clustered to 1000 Genomes individuals from Utah with Northern and Western European ancestry (CEU) was used to construct a training set of 790 genotypes paired to 888 RNASeq profiles from temporal cortex (TCX = 248), prefrontal cortex (FP = 50), inferior frontal gyrus (IFG = 41), superior temporal gyrus (STG = 34), parahippocampal cortex (PHG = 34), and dorsolateral prefrontal cortex (DLPFC = 461). Following within-tissue normalization and covariate adjustment, predictive weights to impute expression components based on a gene's surrounding cis-variants were trained. The FUSION pipeline was modified to support input of pre-scaled expression values and support cross validation with a repeated measure design arising from the presence of multiple transcriptome samples from the same individual across different tissues. RESULTS: Cis-variant architecture alone was informative to train weights and impute expression for 6780 (49.67%) autosomal genes, the majority of which significantly correlated with gene expression; FDR < 5%: N = 6775 (99.92%), Bonferroni: N = 6716 (99.06%). Validation of weights in 515 matched genotype to RNASeq profiles from the CommonMind Consortium (CMC) was 72.14% in DLPFC profiles. Association of imputed expression components from all 2003 genotype profiles yielded 8 genes significantly associated with AD (FDR < 0.05): APOC1, EED, CD2AP, CEACAM19, CLPTM1, MTCH2, TREM2, and KNOP1.
CONCLUSIONS: We provide evidence of cis-genetic variation conferring AD risk through 8 genes across six distinct genomic loci. Moreover, we provide expression weights for 6780 genes as a valuable resource to the community, which can be abstracted across the neocortex and a wide range of neuronal phenotypes.


Subject(s)
Alzheimer Disease/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study , Neocortex/metabolism , Quantitative Trait Loci , Transcriptome , Computational Biology/methods , Gene Expression Regulation , Genome-Wide Association Study/methods , Humans , Organ Specificity/genetics
9.
NPJ Digit Med ; 4(1): 53, 2021 Mar 19.
Article in English | MEDLINE | ID: mdl-33742069

ABSTRACT

Consumer wearables and sensors are a rich source of data about patients' daily disease and symptom burden, particularly in the case of movement disorders like Parkinson's disease (PD). However, interpreting these complex data into so-called digital biomarkers requires complicated analytical approaches, and validating these biomarkers requires sufficient data and unbiased evaluation methods. Here we describe the use of crowdsourcing to specifically evaluate and benchmark features derived from accelerometer and gyroscope data in two different datasets to predict the presence of PD and severity of three PD symptoms: tremor, dyskinesia, and bradykinesia. Forty teams from around the world submitted features, and achieved drastically improved predictive performance for PD status (best AUROC = 0.87), as well as tremor- (best AUPR = 0.75), dyskinesia- (best AUPR = 0.48) and bradykinesia-severity (best AUPR = 0.95).

10.
Sci Data ; 8(1): 48, 2021 02 05.
Article in English | MEDLINE | ID: mdl-33547309

ABSTRACT

Parkinson's disease (PD) is a neurodegenerative disorder associated with motor and non-motor symptoms. Current treatments primarily focus on managing motor symptom severity such as tremor, bradykinesia, and rigidity. However, as the disease progresses, treatment side-effects can emerge such as on/off periods and dyskinesia. The objective of the Levodopa Response Study was to identify whether wearable sensor data can be used to objectively quantify symptom severity in individuals with PD exhibiting motor fluctuations. Thirty-one subjects with PD were recruited from 2 sites to participate in a 4-day study. Data was collected using 2 wrist-worn accelerometers and a waist-worn smartphone. During Days 1 and 4, a portion of the data was collected in the laboratory while subjects performed a battery of motor tasks as clinicians rated symptom severity. The remaining recordings were performed in the home and community settings. To our knowledge, this is the first dataset collected using wearable accelerometers with specific focus on individuals with PD experiencing motor fluctuations that is made available via an open data repository.


Subject(s)
Accelerometry/methods , Parkinson Disease/diagnosis , Wearable Electronic Devices , Humans , Parabrachial Nucleus , Parkinson Disease/physiopathology , Smartphone , Wrist
11.
Sci Data ; 8(1): 47, 2021 02 05.
Article in English | MEDLINE | ID: mdl-33547317

ABSTRACT

Parkinson's disease (PD) is a neurodegenerative disorder characterized by motor and non-motor symptoms. Dyskinesia and motor fluctuations are complications of PD medications. An objective measure of on/off time with/without dyskinesia has been sought for some time because it would facilitate the titration of medications. The objective of the dataset herein presented is to assess if wearable sensor data can be used to generate accurate estimates of limb-specific symptom severity. Nineteen subjects with PD experiencing motor fluctuations were asked to wear a total of five wearable sensors on both forearms and shanks, as well as on the lower back. Accelerometer data was collected for four days, including two laboratory visits lasting 3 to 4 hours each while the remainder of the time was spent at home and in the community. During the laboratory visits, subjects performed a battery of motor tasks while clinicians rated limb-specific symptom severity. At home, subjects were instructed to use a smartphone app that guided the periodic performance of a set of motor tasks.


Subject(s)
Accelerometry/instrumentation , Monitoring, Ambulatory , Parkinson Disease/diagnosis , Wearable Electronic Devices , Forearm , Humans , Leg , Mobile Applications , Parkinson Disease/physiopathology , Smartphone , Torso
12.
PLoS Genet ; 17(1): e1009224, 2021 01.
Article in English | MEDLINE | ID: mdl-33417599

ABSTRACT

Discovering drugs that efficiently treat brain diseases has been challenging. Genetic variants that modulate the expression of potential drug targets can be utilized to assess the efficacy of therapeutic interventions. We therefore employed Mendelian Randomization (MR) on gene expression measured in brain tissue to identify drug targets involved in neurological and psychiatric diseases. We conducted a two-sample MR using cis-acting brain-derived expression quantitative trait loci (eQTLs) from the Accelerating Medicines Partnership for Alzheimer's Disease consortium (AMP-AD) and the CommonMind Consortium (CMC) meta-analysis study (n = 1,286) as genetic instruments to predict the effects of 7,137 genes on 12 neurological and psychiatric disorders. We conducted Bayesian colocalization analysis on the top MR findings (using P < 6 × 10⁻⁷ as evidence threshold, Bonferroni-corrected for 80,557 MR tests) to confirm sharing of the same causal variants between gene expression and trait in each genomic region. We then intersected the colocalized genes with known monogenic disease genes recorded in Online Mendelian Inheritance in Man (OMIM) and with genes annotated as drug targets in the Open Targets platform to identify promising drug targets. 80 eQTLs showed MR evidence of a causal effect, from which we prioritised 47 genes based on colocalization with the trait. We causally linked the expression of 23 genes with schizophrenia and a single gene each with anorexia, bipolar disorder and major depressive disorder within the psychiatric diseases and 9 genes with Alzheimer's disease, 6 genes with Parkinson's disease, 4 genes with multiple sclerosis and two genes with amyotrophic lateral sclerosis within the neurological diseases we tested.
From these we identified five genes (ACE, GPNMB, KCNQ5, RERE and SUOX) as attractive drug targets that may warrant follow-up in functional studies and clinical trials, demonstrating the value of this study design for discovering drug targets in neuropsychiatric diseases.
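
The Bonferroni threshold quoted in the abstract can be checked with one division. The sketch below assumes a family-wise error rate of 0.05, which the abstract does not state; the function name is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# alpha = 0.05 is an assumption; 80,557 is the number of MR tests
# reported in the abstract.
threshold = bonferroni_threshold(0.05, 80_557)
print(f"{threshold:.2e}")
```

Under that assumption the threshold comes to roughly 6.2 × 10⁻⁷, consistent with the reported cutoff of P < 6 × 10⁻⁷.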


Subject(s)
Alzheimer Disease/genetics , Drug Discovery , Genetic Predisposition to Disease , Transcriptome/genetics , Alzheimer Disease/drug therapy , Bipolar Disorder/drug therapy , Bipolar Disorder/genetics , Bipolar Disorder/pathology , Brain/metabolism , Brain/pathology , Genome-Wide Association Study , Humans , Mendelian Randomization Analysis , Molecular Targeted Therapy , Nervous System Diseases/drug therapy , Nervous System Diseases/genetics , Nervous System Diseases/pathology , Polymorphism, Single Nucleotide , Quantitative Trait Loci/genetics , Schizophrenia/drug therapy , Schizophrenia/genetics , Schizophrenia/pathology
14.
Nat Commun ; 11(1): 5781, 2020 11 13.
Article in English | MEDLINE | ID: mdl-33188183

ABSTRACT

The temporal molecular changes that lead to disease onset and progression in Alzheimer's disease (AD) are still unknown. Here we develop a temporal model for these unobserved molecular changes with a manifold learning method applied to RNA-Seq data collected from human postmortem brain samples collected within the ROS/MAP and Mayo Clinic RNA-Seq studies. We define an ordering across samples based on their similarity in gene expression and use this ordering to estimate the molecular disease stage (or disease pseudotime) for each sample. Disease pseudotime is strongly concordant with the burden of tau (Braak score, P = 1.0 × 10⁻⁵), Aβ (CERAD score, P = 1.8 × 10⁻⁵), and cognitive diagnosis (P = 3.5 × 10⁻⁷) of late-onset (LO) AD. Early stage disease pseudotime samples are enriched for controls and show changes in basic cellular functions. Late stage disease pseudotime samples are enriched for late stage AD cases and show changes in neuroinflammation and amyloid pathologic processes. We also identify a set of late stage pseudotime samples that are controls and show changes in genes enriched for protein trafficking, splicing, regulation of apoptosis, and prevention of amyloid cleavage pathways. In summary, we present a method for ordering patients along a trajectory of LOAD disease progression from brain transcriptomic data.


Subject(s)
Brain/pathology , Nerve Degeneration/pathology , Algorithms , Alzheimer Disease/pathology , Disease Progression , Gene Expression Profiling , Gene Expression Regulation , Humans , Nerve Degeneration/genetics , Prefrontal Cortex/pathology , Time Factors , Unsupervised Machine Learning
15.
Curr Protoc Hum Genet ; 108(1): e105, 2020 12.
Article in English | MEDLINE | ID: mdl-33085189

ABSTRACT

The AD Knowledge Portal (adknowledgeportal.org) is a public data repository that shares data and other resources generated by multiple collaborative research programs focused on aging, dementia, and Alzheimer's disease (AD). In this article, we highlight how to use the Portal to discover and download genomic variant and transcriptomic data from the same individuals. First, we show how to use the web interface to browse and search for data of interest using relevant file annotations. We demonstrate how to learn more about the context surrounding the data, including diagnostic criteria and methodological details about sample preparation and data analysis. We present two primary ways to download data: using a web interface, and using a programmatic method that provides access using the command line. Finally, we show how to merge separate sources of metadata into a comprehensive file that contains factors and covariates necessary in downstream analyses. © 2020 The Authors. Basic Protocol 1: Find and download files associated with a selected study. Basic Protocol 2: Download files in bulk using the command line client. Basic Protocol 3: Working with file annotations and metadata.
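
The metadata-merging step described in the abstract can be pictured as a chain of joins. The following is a rough sketch, not the protocol from the article: the three tables, the join keys `individualID` and `specimenID`, and every value in the toy rows are assumptions made for illustration.

```python
def merge_metadata(individuals, biospecimens, assays):
    """Left-join each assay row to its biospecimen row (on specimenID) and
    then to its individual row (on individualID). Key names are assumed."""
    by_individual = {row["individualID"]: row for row in individuals}
    by_specimen = {row["specimenID"]: row for row in biospecimens}
    merged = []
    for assay in assays:
        # Missing matches fall back to an empty dict, so the assay row survives.
        specimen = by_specimen.get(assay["specimenID"], {})
        individual = by_individual.get(specimen.get("individualID"), {})
        merged.append({**individual, **specimen, **assay})
    return merged


# Hypothetical toy rows standing in for three downloaded metadata files.
individuals = [{"individualID": "ind1", "diagnosis": "control", "sex": "female"}]
biospecimens = [{"specimenID": "spec1", "individualID": "ind1", "tissue": "cortex"}]
assays = [
    {"specimenID": "spec1", "platform": "RNA-seq"},
    {"specimenID": "spec9", "platform": "RNA-seq"},  # no matching biospecimen
]

merged = merge_metadata(individuals, biospecimens, assays)
for row in merged:
    print(row)
```

A left join is used here so that assay rows without matching biospecimen or individual records are kept and can be inspected, rather than silently dropped.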


Subject(s)
Aging , Alzheimer Disease/therapy , Databases, Genetic/statistics & numerical data , Genomics/methods , Information Storage and Retrieval/methods , Software , Alzheimer Disease/diagnosis , Genomics/statistics & numerical data , Humans , Internet
16.
Sci Data ; 7(1): 340, 2020 10 12.
Article in English | MEDLINE | ID: mdl-33046718

ABSTRACT

The availability of high-quality RNA-sequencing and genotyping data of post-mortem brain collections from consortia such as the CommonMind Consortium (CMC) and the Accelerating Medicines Partnership for Alzheimer's Disease (AMP-AD) Consortium enables the generation of a large-scale brain cis-eQTL meta-analysis. Here we generate cerebral cortical eQTL from 1433 samples available from four cohorts (identifying >4.1 million significant eQTL for >18,000 genes), as well as cerebellar eQTL from 261 samples (identifying 874,836 significant eQTL for >10,000 genes). We find substantially improved power in the meta-analysis over individual cohort analyses, particularly in comparison to the Genotype-Tissue Expression (GTEx) Project eQTL. Additionally, we observed differences in eQTL patterns between cerebral and cerebellar brain regions. We provide these brain eQTL as a resource for use by the research community. As a proof of principle for their utility, we apply a colocalization analysis to identify genes underlying the GWAS association peaks for schizophrenia and identify a potentially novel gene colocalization with lncRNA RP11-677M14.2 (posterior probability of colocalization 0.975).


Subject(s)
Cerebellar Cortex/metabolism , Cerebral Cortex/metabolism , Gene Expression Profiling , Quantitative Trait Loci , Datasets as Topic , Genome-Wide Association Study , Humans , Meta-Analysis as Topic , RNA, Long Noncoding/genetics , Schizophrenia/genetics
17.
Cell Rep ; 32(2): 107908, 2020 07 14.
Article in English | MEDLINE | ID: mdl-32668255

ABSTRACT

We present a consensus atlas of the human brain transcriptome in Alzheimer's disease (AD), based on meta-analysis of differential gene expression in 2,114 postmortem samples. We discover 30 brain coexpression modules from seven regions as the major source of AD transcriptional perturbations. We next examine overlap with 251 brain differentially expressed gene sets from mouse models of AD and other neurodegenerative disorders. Human-mouse overlaps highlight responses to amyloid versus tau pathology and reveal age- and sex-dependent expression signatures for disease progression. Human coexpression modules enriched for neuronal and/or microglial genes broadly overlap with mouse models of AD, Huntington's disease, amyotrophic lateral sclerosis, and aging. Other human coexpression modules, including those implicated in proteostasis, are not activated in AD models but rather following other, unexpected genetic manipulations. Our results comprise a cross-species resource, highlighting transcriptional networks altered by human brain pathophysiology and identifying correspondences with mouse models for AD preclinical studies.


Subject(s)
Alzheimer Disease/genetics , Brain/metabolism , Brain/pathology , Transcriptome/genetics , Animals , Case-Control Studies , Disease Models, Animal , Female , Gene Expression Profiling , Gene Expression Regulation , Gene Regulatory Networks , Humans , Male , Mice , Sex Characteristics , Species Specificity , Transcription, Genetic
18.
Nat Commun ; 11(1): 2990, 2020 06 12.
Article in English | MEDLINE | ID: mdl-32533064

ABSTRACT

Structural variants (SVs) contribute to many disorders, yet functionally annotating them remains a major challenge. Here, we integrate SVs with RNA-sequencing from human post-mortem brains to quantify their dosage and regulatory effects. We show that genic and regulatory SVs exist at significantly lower frequencies than intergenic SVs. Functional impact of copy number variants (CNVs) stems from both the proportion of genic and regulatory content altered and loss-of-function intolerance of the gene. We train a linear model to predict expression effects of rare CNVs and use it to annotate regulatory disruption of CNVs from 14,891 independent genome-sequenced individuals. Pathogenic deletions implicated in neurodevelopmental disorders show significantly more extreme regulatory disruption scores and, if rank ordered, would be prioritized higher than using frequency or length alone. This work shows the deleteriousness of regulatory SVs, particularly those altering CTCF sites, and provides a simple approach for functionally annotating the regulatory consequences of CNVs.


Subject(s)
Brain/metabolism , DNA Copy Number Variations , Gene Expression Regulation , Genetic Variation , Genome, Human/genetics , Autopsy/methods , Brain/pathology , Female , Gene Expression Profiling/methods , Humans , Male , Neurodevelopmental Disorders/genetics , Sequence Analysis, RNA/methods
19.
Bioinformatics ; 35(14): i568-i576, 2019 07 15.
Article in English | MEDLINE | ID: mdl-31510680

ABSTRACT

MOTIVATION: Late onset Alzheimer's disease is currently a disease with no known effective treatment options. To better understand disease, new multi-omic datasets have recently been generated with the goal of identifying molecular causes of disease. However, most analytic studies using these datasets focus on uni-modal analysis of the data. Here, we propose a data driven approach to integrate multiple data types and analytic outcomes to aggregate evidence to support the hypothesis that a gene is a genetic driver of the disease. The main algorithmic contributions of our article are: (i) a general machine learning framework to learn the key characteristics of a few known driver genes from multiple feature sets and identify other potential driver genes which have similar feature representations, and (ii) a flexible ranking scheme with the ability to integrate external validation in the form of Genome Wide Association Study summary statistics. While we currently focus on demonstrating the effectiveness of the approach using different analytic outcomes from RNA-Seq studies, this method is easily generalizable to other data modalities and analysis types. RESULTS: We demonstrate the utility of our machine learning algorithm on two benchmark multiview datasets by significantly outperforming the baseline approaches in predicting missing labels. We then use the algorithm to predict and rank potential drivers of Alzheimer's. We show that our ranked genes show a significant enrichment for single nucleotide polymorphisms associated with Alzheimer's and are enriched in pathways that have been previously associated with the disease. AVAILABILITY AND IMPLEMENTATION: Source code and links to all feature sets are available at https://github.com/Sage-Bionetworks/EvidenceAggregatedDriverRanking.


Subject(s)
Algorithms , Alzheimer Disease , Genome-Wide Association Study , Alzheimer Disease/genetics , Humans , Machine Learning , Software
20.
Sci Data ; 6(1): 180, 2019 09 24.
Article in English | MEDLINE | ID: mdl-31551426

ABSTRACT

Schizophrenia and bipolar disorder are serious mental illnesses that affect more than 2% of adults. While large-scale genetics studies have identified genomic regions associated with disease risk, less is known about the molecular mechanisms by which risk alleles with small effects lead to schizophrenia and bipolar disorder. In order to fill this gap between genetics and disease phenotype, we have undertaken a multi-cohort genomics study of postmortem brains from controls and from individuals with schizophrenia or bipolar disorder. Here we present a public resource of functional genomic data from the dorsolateral prefrontal cortex (DLPFC; Brodmann areas 9 and 46) of 986 individuals from 4 separate brain banks, including 353 diagnosed with schizophrenia and 120 with bipolar disorder. The genomic data include RNA-seq and SNP genotypes on 980 individuals, and ATAC-seq on 269 individuals, of which 264 are a subset of individuals with RNA-seq. We have performed extensive preprocessing and quality control on these data so that the research community can take advantage of this public resource available on the Synapse platform at http://CommonMind.org.


Subject(s)
Bipolar Disorder , Schizophrenia , Bipolar Disorder/genetics , Bipolar Disorder/pathology , Cohort Studies , Epigenomics , Humans , Prefrontal Cortex/metabolism , Prefrontal Cortex/pathology , Schizophrenia/genetics , Schizophrenia/pathology , Transcriptome