Results 1 - 12 of 12
1.
J Med Internet Res ; 25: e42621, 2023 07 12.
Article in English | MEDLINE | ID: mdl-37436815

ABSTRACT

BACKGROUND: Machine learning and artificial intelligence have shown promising results in many areas and are driven by the increasing amount of available data. However, these data are often distributed across different institutions and cannot be easily shared owing to strict privacy regulations. Federated learning (FL) allows the training of distributed machine learning models without sharing sensitive data. However, implementing FL is time-consuming and requires advanced programming skills and complex technical infrastructures. OBJECTIVE: Various tools and frameworks have been developed to simplify the development of FL algorithms and provide the necessary technical infrastructure. Although there are many high-quality frameworks, most focus only on a single application case or method. To our knowledge, there are no generic frameworks, meaning that the existing solutions are restricted to a particular type of algorithm or application field. Furthermore, most of these frameworks provide an application programming interface that requires programming knowledge. There is no collection of ready-to-use FL algorithms that are extendable and allow users (eg, researchers) without programming knowledge to apply FL. A central FL platform for both FL algorithm developers and users does not exist. This study aimed to address this gap and make FL available to everyone by developing FeatureCloud, an all-in-one platform for FL in biomedicine and beyond. METHODS: The FeatureCloud platform consists of 3 main components: a global frontend, a global backend, and a local controller. Our platform uses Docker to separate the locally acting components of the platform from the sensitive data systems. We evaluated our platform using 4 different algorithms on 5 data sets for both accuracy and runtime.
RESULTS: FeatureCloud removes the complexity of distributed systems for developers and end users by providing a comprehensive platform for executing multi-institutional FL analyses and implementing FL algorithms. Through its integrated artificial intelligence store, federated algorithms can easily be published and reused by the community. To secure sensitive raw data, FeatureCloud supports privacy-enhancing technologies to secure the shared local models and assures high standards in data privacy to comply with the strict General Data Protection Regulation. Our evaluation shows that applications developed in FeatureCloud can produce highly similar results compared with centralized approaches and scale well for an increasing number of participating sites. CONCLUSIONS: FeatureCloud provides a ready-to-use platform that integrates the development and execution of FL algorithms while reducing the complexity to a minimum and removing the hurdles of federated infrastructure. Thus, we believe that it has the potential to greatly increase the accessibility of privacy-preserving and distributed data analyses in biomedicine and beyond.
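The core idea of FL described in this abstract, training without sharing raw data, can be illustrated with a minimal sketch of sample-size-weighted parameter averaging (often called federated averaging). This is not FeatureCloud's API or its aggregation protocol; the function name `federated_average` and the toy numbers are assumptions made only for illustration.

```python
from typing import List, Sequence


def federated_average(local_weights: Sequence[Sequence[float]],
                      sample_counts: Sequence[int]) -> List[float]:
    """Combine per-site parameter vectors into one global vector.

    Each site contributes only its trained parameters and its sample
    count; raw records never leave the site. (Hypothetical helper, not
    part of any real FL framework.)
    """
    if not local_weights or len(local_weights) != len(sample_counts):
        raise ValueError("need exactly one sample count per site")
    total = sum(sample_counts)
    dim = len(local_weights[0])
    averaged = [0.0] * dim
    for weights, n in zip(local_weights, sample_counts):
        if len(weights) != dim:
            raise ValueError("all sites must share the same model shape")
        for i, w in enumerate(weights):
            # Sites with more samples get proportionally more influence.
            averaged[i] += w * n / total
    return averaged


# Toy example: three sites, a two-parameter model.
site_models = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
site_sizes = [100, 100, 200]
global_model = federated_average(site_models, site_sizes)
print(global_model)
```

In a real deployment the shared parameters would additionally be protected, for example by the privacy-enhancing technologies the abstract mentions; the sketch omits that layer.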


Subject(s)
Algorithms , Artificial Intelligence , Humans , Health Occupations , Software , Computer Communication Networks , Privacy
2.
Front Immunol ; 13: 1043579, 2022.
Article in English | MEDLINE | ID: mdl-36532064

ABSTRACT

Infectious agents have long been considered to play a role in the pathogenesis of neurological diseases as part of the interaction between genetic susceptibility and the environment. The role of bacteria in CNS autoimmunity has also been highlighted by changes in the diversity of gut microbiota in patients with neurological diseases such as Parkinson's disease, Alzheimer's disease and multiple sclerosis, emphasizing the role of the gut-brain axis. We discuss the hypothesis of a brain microbiota, the BrainBiota: bacteria living in symbiosis with brain cells. Existence of various bacteria in the human brain is suggested by morphological evidence, presence of bacterial proteins, metabolites, transcripts and mucosal-associated invariant T cells. Based on our data, we discuss the hypothesis that these bacteria are an integral part of brain development and immune tolerance as well as directly linked to the gut microbiome. We further suggest that changes of the BrainBiota during brain diseases may be the consequence or cause of chronic inflammation, similar to changes in the gut microbiota.


Subject(s)
Gastrointestinal Microbiome , Microbiota , Multiple Sclerosis , Humans , Inflammation , Autoimmunity , Bacteria
3.
Bioinformatics ; 38(8): 2278-2286, 2022 04 12.
Article in English | MEDLINE | ID: mdl-35139148

ABSTRACT

MOTIVATION: Limited data access has hindered the field of precision medicine from exploring its full potential, e.g. concerning machine learning and privacy and data protection rules. Our study evaluates the efficacy of federated Random Forests (FRF) models, focusing particularly on the heterogeneity within and between datasets. We addressed three common challenges: (i) number of parties, (ii) sizes of datasets and (iii) imbalanced phenotypes, evaluated on five biomedical datasets. RESULTS: The FRF outperformed the average local models and performed comparably to the data-centralized models trained on the entire data. With an increasing number of models and decreasing dataset size, the performance of local models decreases drastically. The performance of the FRF, however, does not decrease significantly. When combining datasets of different sizes, the FRF vastly improve compared to the average local models. We demonstrate that the FRF remain more robust and outperform the local models by analyzing different class-imbalances. Our results support that FRF overcome boundaries of clinical research and enable collaborations across institutes without violating privacy or legal regulations. Clinicians benefit from a vast collection of unbiased data aggregated from different geographic locations, demographics and other varying factors. They can build more generalizable models to make better clinical decisions, which will have relevance, especially for patients in rural areas and rare or geographically uncommon diseases, enabling personalized treatment. In combination with secure multi-party computation, federated learning has the power to revolutionize clinical practice by increasing the accuracy and robustness of healthcare AI and thus paving the way for precision medicine. AVAILABILITY AND IMPLEMENTATION: The implementation of the federated random forests can be found at https://featurecloud.ai/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
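One plausible way to build a federated forest is for each site to train trees on bootstrap samples of its own data and send only the fitted trees to a coordinator, which merges them into a single voting ensemble. The sketch below illustrates that idea under strong simplifying assumptions: it is not the FRF implementation from the paper, one-feature decision stumps stand in for full trees, and every name (`train_stump`, `train_local_forest`, `predict`) is hypothetical.

```python
import random
from collections import Counter
from typing import List, Tuple

Sample = Tuple[float, int]      # (single feature value, class label 0 or 1)
Stump = Tuple[float, int, int]  # (threshold, label if x <= threshold, label otherwise)


def train_stump(data: List[Sample]) -> Stump:
    """Pick the threshold with the fewest training errors."""
    best = None
    for threshold, _ in data:
        left = [y for x, y in data if x <= threshold]
        right = [y for x, y in data if x > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = (sum(y != left_label for y in left)
                  + sum(y != right_label for y in right))
        if best is None or errors < best[0]:
            best = (errors, (threshold, left_label, right_label))
    return best[1]


def bootstrap(data: List[Sample], rng: random.Random) -> List[Sample]:
    """Sample with replacement; redraw if a class is missing.

    Assumes the site's data contain both classes.
    """
    while True:
        sample = [rng.choice(data) for _ in data]
        if len({y for _, y in sample}) == 2:
            return sample


def train_local_forest(data: List[Sample], n_trees: int,
                       rng: random.Random) -> List[Stump]:
    """Runs at one site; only the returned stumps are shared."""
    return [train_stump(bootstrap(data, rng)) for _ in range(n_trees)]


def predict(forest: List[Stump], x: float) -> int:
    """Majority vote over all stumps in the merged forest."""
    votes = Counter(lo if x <= t else hi for t, lo, hi in forest)
    return votes.most_common(1)[0][0]


rng = random.Random(0)
site_a = [(0.1, 0), (0.2, 0), (0.3, 0), (0.8, 1), (0.9, 1)]
site_b = [(0.15, 0), (0.7, 1), (0.85, 1), (0.95, 1)]  # imbalanced site
# The coordinator sees only the stumps, never the samples.
global_forest = train_local_forest(site_a, 5, rng) + train_local_forest(site_b, 5, rng)
print(predict(global_forest, 0.05), predict(global_forest, 0.9))
```

The imbalanced second site still contributes usable trees, which is the kind of robustness to class imbalance the abstract reports, although this toy example proves nothing about real datasets.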


Subject(s)
Privacy , Random Forest , Machine Learning , Precision Medicine , Delivery of Health Care
4.
Genome Biol ; 23(1): 32, 2022 01 24.
Article in English | MEDLINE | ID: mdl-35073941

ABSTRACT

Meta-analysis has been established as an effective approach to combining summary statistics of several genome-wide association studies (GWAS). However, the accuracy of meta-analysis can be attenuated in the presence of cross-study heterogeneity. We present sPLINK, a hybrid federated and user-friendly tool, which performs privacy-aware GWAS on distributed datasets while preserving the accuracy of the results. sPLINK is robust against heterogeneous distributions of data across cohorts while meta-analysis considerably loses accuracy in such scenarios. sPLINK achieves practical runtime and acceptable network usage for chi-square and linear/logistic regression tests. sPLINK is available at https://exbio.wzw.tum.de/splink.
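To see why a federated chi-square test can match the pooled analysis exactly, note that the Pearson statistic for a 2x2 table depends only on the four cell counts, and counts add across cohorts. The sketch below assumes each cohort shares one allele-count table per variant; it does not reproduce sPLINK's actual protocol or its privacy safeguards, and the names `aggregate` and `chi_square_2x2` are hypothetical.

```python
from typing import Sequence, Tuple

# (case_minor, case_major, control_minor, control_major) allele counts
Table = Tuple[int, int, int, int]


def aggregate(tables: Sequence[Table]) -> Table:
    """Sum the per-cohort count tables cell by cell."""
    return tuple(sum(cell) for cell in zip(*tables))


def chi_square_2x2(table: Table) -> float:
    """Pearson chi-square statistic (1 degree of freedom, no continuity correction)."""
    a, b, c, d = table
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denom


# Three cohorts with made-up counts for a single variant.
cohorts = [(30, 70, 20, 80), (45, 55, 25, 75), (10, 40, 15, 35)]
pooled = aggregate(cohorts)
stat = chi_square_2x2(pooled)
print(pooled, round(stat, 4))
```

Because the aggregated table equals the table one would build from the pooled individual-level data, the statistic is the same as in a centralized analysis, whereas a meta-analysis combines per-cohort test results and can diverge when cohorts are heterogeneous.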


Subject(s)
Genome-Wide Association Study , Privacy , Genome-Wide Association Study/methods , Linear Models , Logistic Models , Meta-Analysis as Topic
5.
Genome Biol ; 22(1): 338, 2021 12 14.
Article in English | MEDLINE | ID: mdl-34906207

ABSTRACT

Aggregating transcriptomics data across hospitals can increase sensitivity and robustness of differential expression analyses, yielding deeper clinical insights. As data exchange is often restricted by privacy legislation, meta-analyses are frequently employed to pool local results. However, the accuracy might drop if class labels are inhomogeneously distributed among cohorts. Flimma (https://exbio.wzw.tum.de/flimma/) addresses this issue by implementing the state-of-the-art workflow limma voom in a federated manner, i.e., patient data never leaves its source site. Flimma results are identical to those generated by limma voom on aggregated datasets even in imbalanced scenarios where meta-analysis approaches fail.
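The claim that federated results can be identical to those on aggregated data follows, for linear models, from the fact that least-squares estimates depend only on sums that add across sites. The following sketch shows this for a one-predictor regression. It is an illustration of the general principle, not Flimma's implementation of limma voom, and the helper names are hypothetical.

```python
from typing import List, Sequence, Tuple

Stats = Tuple[float, float, float, float, float]  # n, sum x, sum y, sum x*x, sum x*y


def local_stats(xs: Sequence[float], ys: Sequence[float]) -> Stats:
    """Computed at each site; only these five numbers are shared."""
    return (float(len(xs)), sum(xs), sum(ys),
            sum(x * x for x in xs), sum(x * y for x, y in zip(xs, ys)))


def combine(all_stats: List[Stats]) -> Stats:
    """Coordinator step: element-wise sum of the site statistics."""
    return tuple(sum(values) for values in zip(*all_stats))


def fit_from_stats(stats: Stats) -> Tuple[float, float]:
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n, sx, sy, sxx, sxy = stats
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return intercept, slope


# Two sites holding different parts of the same relationship y = 1 + 2x.
site_1 = ([1.0, 2.0, 3.0], [3.0, 5.0, 7.0])
site_2 = ([4.0, 5.0], [9.0, 11.0])

federated = fit_from_stats(combine([local_stats(*site_1), local_stats(*site_2)]))
pooled = fit_from_stats(local_stats(site_1[0] + site_2[0], site_1[1] + site_2[1]))
print(federated, pooled)
```

Both fits use the same five totals, so they agree up to floating-point rounding regardless of how samples are split between sites.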


Subject(s)
Gene Expression , Privacy , Biomedical Research , Computer Communication Networks , Computer Security/legislation & jurisprudence , Computer Security/standards , Databases, Factual/legislation & jurisprudence , Databases, Factual/standards , Gene Expression/ethics , Genes , Government Regulation , Humans , Machine Learning
6.
Mult Scler ; 27(12): 1829-1837, 2021 10.
Article in English | MEDLINE | ID: mdl-33464158

ABSTRACT

BACKGROUND: Human endogenous retrovirus (HERV) expression in multiple sclerosis (MS) brain lesions may contribute to chronic inflammation, but expression of genome-wide HERVs in different MS lesions is unknown. OBJECTIVE: We examined the HERV expression landscape in different MS lesions compared to control brains. METHODS: Transcripts from 71 MS brain samples and 25 control white matter (WM) samples were obtained by next-generation RNA sequencing and mapped against HERV transcripts across the human genome. Differential expression of mapped HERV-W and HERV-H reads between MS lesion types and controls was analysed. RESULTS: Out of 6.38 billion high-quality paired-end reads, 174 million reads (2.73%) mapped to HERV transcripts. There was no difference in HERV expression levels between MS and control brains, but HERV-W transcripts were significantly reduced in chronic active lesions. Of the four HERV-W transcripts exclusively present in MS, ERV3633503, located on chromosome 7q21.13 close to the MS genetic risk locus, had the highest number of reads. In the HERV-H family, 75% of transcripts located nearby at 7q21-22 were overrepresented in MS, and ERV3643914 was expressed more than 16 times higher in MS than in control brains. CONCLUSION: Novel HERV-W and HERV-H transcripts located at chromosome 7 regions were uniquely expressed in MS lesions, indicating their potential role in brain lesion evolution.


Subject(s)
Endogenous Retroviruses , Multiple Sclerosis , Brain , Endogenous Retroviruses/genetics , Genome, Human , High-Throughput Nucleotide Sequencing , Humans , Multiple Sclerosis/genetics
7.
Netw Syst Med ; 3(1): 122-129, 2020.
Article in English | MEDLINE | ID: mdl-32954379

ABSTRACT

Introduction: Multiple sclerosis (MS) is a chronic disorder of the central nervous system with an untreatable late progressive phase. Molecular maps of different stages of brain lesion evolution in patients with progressive multiple sclerosis (PMS) are missing but critical for understanding disease development and identifying novel targets to halt progression. Materials and Methods: The MS Atlas database comprises comprehensive high-quality transcriptomic profiles of 98 white matter (WM) brain samples of different lesion types (normal-appearing WM [NAWM], active, chronic active, inactive, remyelinating) from ten progressive MS patients and 25 WM areas from five non-neurological diseased cases. Results: We introduce the first MS brain lesion atlas (msatlas.dk), developed to address the current challenges of understanding mechanisms driving the fate on a lesion basis. The MS Atlas provides a means of testing research hypotheses and validating biomarkers and drug targets. It comes with a user-friendly web interface, and it fosters bioinformatic methods for de novo network enrichment to extract mechanistic markers for specific lesion types and pathway-based lesion type comparison. We describe examples of how the MS Atlas can be used to extract systems medicine signatures and demonstrate the interface of MS Atlas. Conclusion: This compendium of mechanistic PMS WM lesion profiles is an invaluable resource to fuel future MS research and a new basis for treatment development.

8.
Acta Neuropathol Commun ; 7(1): 205, 2019 12 11.
Article in English | MEDLINE | ID: mdl-31829262

ABSTRACT

To identify pathogenetic markers and potential drivers of different lesion types in the white matter (WM) of patients with progressive multiple sclerosis (PMS), we sequenced RNA from 73 different WM areas. Compared to 25 WM controls, 6713 out of 18,609 genes were significantly differentially expressed in MS tissues (FDR < 0.05). A computational systems medicine analysis was performed to describe the MS lesion endophenotypes. The cellular source of specific molecules was examined by RNAscope, immunohistochemistry, and immunofluorescence. To examine common lesion specific mechanisms, we performed de novo network enrichment based on shared differentially expressed genes (DEGs), and found TGFβ-R2 as a central hub. RNAscope revealed astrocytes as the cellular source of TGFβ-R2 in remyelinating lesions. Since lesion-specific unique DEGs were more common than shared signatures, we examined lesion-specific pathways and de novo networks enriched with unique DEGs. Such network analysis indicated classic inflammatory responses in active lesions; catabolic and heat shock protein responses in inactive lesions; neuronal/axonal specific processes in chronic active lesions. In remyelinating lesions, de novo analyses identified axonal transport responses and adaptive immune markers, which was also supported by the most heterogeneous immunoglobulin gene expression. The signature of the normal-appearing white matter (NAWM) was more similar to control WM than to lesions: only 465 DEGs differentiated NAWM from controls, and 16 were unique. The upregulated marker CD26/DPP4 was expressed by microglia in the NAWM but by mononuclear cells in active lesions, which may indicate a special subset of microglia before the lesion develops, but also emphasizes that omics related to MS lesions should be interpreted in the context of different lesion types.
While chronic active lesions were the most distinct from control WM based on the highest number of unique DEGs (n = 2213), remyelinating lesions had the highest gene expression levels, and the most different molecular map from chronic active lesions. This may suggest that these two lesion types represent two ends of the spectrum of lesion evolution in PMS. The profound changes in chronic active lesions, the predominance of synaptic/neural/axonal signatures coupled with minor inflammation may indicate end-stage irreversible molecular events responsible for this less treatable phase.


Subject(s)
Brain/pathology , High-Throughput Nucleotide Sequencing/methods , Multiple Sclerosis, Chronic Progressive/genetics , Multiple Sclerosis, Chronic Progressive/pathology , Sequence Analysis, RNA/methods , White Matter/pathology , Gene Expression Profiling/methods , Humans , Receptor, Transforming Growth Factor-beta Type II/genetics
9.
Acta Neuropathol Commun ; 7(1): 136, 2019 08 21.
Article in English | MEDLINE | ID: mdl-31434573

ABSTRACT

The authors have retracted this article [1] because a line was omitted from the data sheet; this was due to a bug in the analysis scripts.

10.
Acta Neuropathol Commun ; 7(1): 58, 2019 04 25.
Article in English | MEDLINE | ID: mdl-31023379

ABSTRACT

The heterogeneity of multiple sclerosis is reflected by dynamic changes of different lesion types in the brain white matter (WM). To identify potential drivers of this process, we RNA-sequenced 73 WM areas from patients with progressive MS (PMS) and 25 control WM. Lesion endophenotypes were described by a computational systems medicine analysis combined with RNAscope, immunohistochemistry, and immunofluorescence. The signature of the normal-appearing WM (NAWM) was more similar to control WM than to lesions: one of the six upregulated genes in NAWM was CD26/DPP4 expressed by microglia. Chronic active lesions that become prominent in PMS had a signature that was different from all other lesion types, and were differentiated from them by two clusters of 62 differentially expressed genes (DEGs). An upcoming MS biomarker, CHI3L1, was among the top ten upregulated genes in chronic active lesions and was expressed by astrocytes in the rim. TGFβ-R2 was the central hub in a remyelination-related protein interaction network, and was expressed there by astrocytes. We used de novo networks enriched by unique DEGs to determine lesion-specific pathway regulation, i.e. cellular trafficking and activation in active lesions; healing and immune responses in remyelinating lesions characterized by the most heterogeneous immunoglobulin gene expression; coagulation and ion balance in inactive lesions; and metabolic changes in chronic active lesions. Because we found inverse differential regulation of particular genes among different lesion types, our data emphasize that omics related to MS lesions should be interpreted in the context of lesion pathology. Our data indicate that the impact of molecular pathways is substantially changing as different lesions develop. This was also reflected by the high number of unique DEGs that were more common than shared signatures.
A special microglia subset characterized by CD26 may play a role in early lesion development, while astrocyte-derived TGFβ-R2 and TGFβ pathways may be drivers of repair in contrast to chronic tissue damage. The highly specific mechanistic signature of chronic active lesions indicates that as these lesions develop in PMS, the molecular changes are substantially skewed: the unique mitochondrial/metabolic changes and specific downregulation of molecules involved in tissue repair may reflect a stage of exhaustion.

11.
Methods Mol Biol ; 1807: 51-62, 2018.
Article in English | MEDLINE | ID: mdl-30030803

ABSTRACT

DNA methylation has a strong influence on gene expression such that differences in methylation are associated with a wide range of diseases. Array-based approaches like the Illumina 450K or 850K EPIC chips have been used in a wide range of studies, mostly comparing a disease group with healthy controls, but also, for instance, to correlate methylation with survival times. Processing, normalization, and analysis of raw data require extensive knowledge in statistics and programming languages such as R. Here we introduce DiMmer, an easy-to-use Java tool for the analysis of epigenome-wide association studies (EWAS). A graphical user interface guides the user through preprocessing, normalization, testing for differentially methylated CpGs, and finally the discovery of differentially methylated regions (DMRs). The software performs randomization tests to compute empirical P-values, corrects for multiple testing, and requires no prior knowledge in programming. All computed results are provided as plots or tables and can be easily exported. DiMmer is thus a powerful one-stop-shop for EWAS data analysis.
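A randomization (permutation) test of the kind mentioned above can be sketched for a single CpG site as follows. The abstract does not state which test statistic or permutation scheme DiMmer uses, so the absolute difference in mean methylation, the add-one correction, and the function name `empirical_p_value` are all assumptions made for illustration.

```python
import random
from typing import Sequence


def mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def empirical_p_value(group_a: Sequence[float], group_b: Sequence[float],
                      n_permutations: int = 1000, seed: int = 0) -> float:
    """Two-sided permutation test on the difference in group means.

    Group labels are shuffled repeatedly; the p-value is the share of
    shuffles whose mean difference is at least as large as the observed
    one. Adding 1 to numerator and denominator avoids reporting p = 0.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)


# Made-up methylation beta values at one CpG site.
disease = [0.81, 0.79, 0.85, 0.80, 0.83, 0.82]
control = [0.40, 0.42, 0.38, 0.45, 0.41, 0.39]
p_separated = empirical_p_value(disease, control)
p_identical = empirical_p_value([0.5, 0.6, 0.7], [0.5, 0.6, 0.7], n_permutations=100)
print(p_separated, p_identical)
```

In an EWAS this test would be repeated for every CpG site, which is why the resulting P-values then need the multiple-testing correction the abstract mentions.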


Subject(s)
DNA Methylation/genetics , Epigenesis, Genetic , Genome-Wide Association Study/methods , Software , CpG Islands/genetics , Humans , Molecular Sequence Annotation , User-Computer Interface
12.
Metabolites ; 5(2): 344-63, 2015 Jun 10.
Article in English | MEDLINE | ID: mdl-26065494

ABSTRACT

Computational breath analysis is a growing research area aiming at identifying volatile organic compounds (VOCs) in human breath to assist medical diagnostics of the next generation. While inexpensive and non-invasive bioanalytical technologies for metabolite detection in exhaled air and bacterial/fungal vapor exist and the first studies on the power of supervised machine learning methods for profiling of the resulting data were conducted, we lack methods to extract hidden data features emerging from confounding factors. Here, we present Carotta, a new cluster analysis framework dedicated to uncovering such hidden substructures by sophisticated unsupervised statistical learning methods. We study the power of transitivity clustering and hierarchical clustering to identify groups of VOCs with similar expression behavior over most patient breath samples and/or groups of patients with a similar VOC intensity pattern. This enables the discovery of dependencies between metabolites. On the one hand, this allows us to eliminate the effect of potential confounding factors hindering disease classification, such as smoking. On the other hand, we may also identify VOCs associated with disease subtypes or concomitant diseases. Carotta is open-source software with an intuitive graphical user interface promoting data handling, analysis and visualization. The back-end is designed to be modular, allowing for easy extensions with plugins in the future, such as new clustering methods and statistics. It does not require much prior knowledge or technical skills to operate. We demonstrate its power and applicability by means of one artificial dataset. We also apply Carotta exemplarily to a real-world example dataset on chronic obstructive pulmonary disease (COPD).
While the artificial data are utilized as a proof of concept, we will demonstrate how Carotta finds candidate markers in our real dataset associated with confounders rather than the primary disease (COPD) and bronchial carcinoma (BC). Carotta is publicly available at http://carotta.compbio.sdu.dk [1].
