Results 1 - 20 of 223
1.
Genome Biol ; 25(1): 154, 2024 Jun 13.
Article in English | MEDLINE | ID: mdl-38872191

ABSTRACT

Genomic data holds huge potential for medical progress but requires strict safety measures due to its sensitive nature to comply with data protection laws. This conflict is especially pronounced in genome-wide association studies (GWAS) which rely on vast amounts of genomic data to improve medical diagnoses. To ensure both their benefits and sufficient data security, we propose a federated approach in combination with privacy-enhancing technologies utilising the findings from a systematic review on federated learning and legal regulations in general and applying these to GWAS.


Subject(s)
Computer Security , Genome-Wide Association Study , Humans , Computer Security/legislation & jurisprudence , Genetic Privacy/legislation & jurisprudence
2.
JMIR AI ; 3: e47652, 2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38875678

ABSTRACT

BACKGROUND: Central collection of distributed medical patient data is problematic due to strict privacy regulations. Especially in clinical environments, such as clinical time-to-event studies, large sample sizes are critical but usually not available at a single institution. It has been shown recently that federated learning, combined with privacy-enhancing technologies, is an excellent and privacy-preserving alternative to data sharing. OBJECTIVE: This study aims to develop and validate a privacy-preserving, federated survival support vector machine (SVM) and make it accessible for researchers to perform cross-institutional time-to-event analyses. METHODS: We extended the survival SVM algorithm to be applicable in federated environments. We further implemented it as a FeatureCloud app, enabling it to run in the federated infrastructure provided by the FeatureCloud platform. Finally, we evaluated our algorithm on 3 benchmark data sets, a large sample size synthetic data set, and a real-world microbiome data set and compared the results to the corresponding central method. RESULTS: Our federated survival SVM produces highly similar results to the centralized model on all data sets. The maximal difference between the model weights of the central model and the federated model was only 0.001, and the mean difference over all data sets was 0.0002. We further show that by including more data in the analysis through federated learning, predictions are more accurate even in the presence of site-dependent batch effects. CONCLUSIONS: The federated survival SVM extends the palette of federated time-to-event analysis methods by a robust machine learning approach. To our knowledge, the implemented FeatureCloud app is the first publicly available implementation of a federated survival SVM, is freely accessible for all kinds of researchers, and can be directly used within the FeatureCloud platform.

3.
Nucleic Acids Res ; 2024 May 23.
Article in English | MEDLINE | ID: mdl-38783119

ABSTRACT

In recent decades, the development of new drugs has become increasingly expensive and inefficient, and the molecular mechanisms of most pharmaceuticals remain poorly understood. In response, computational systems and network medicine tools have emerged to identify potential drug repurposing candidates. However, these tools often require complex installation and lack intuitive visual network mining capabilities. To tackle these challenges, we introduce Drugst.One, a platform that assists specialized computational medicine tools in becoming user-friendly, web-based utilities for drug repurposing. With just three lines of code, Drugst.One turns any systems biology software into an interactive web tool for modeling and analyzing complex protein-drug-disease networks. Demonstrating its broad adaptability, Drugst.One has been successfully integrated with 21 computational systems medicine tools. Available at https://drugst.one, Drugst.One has significant potential for streamlining the drug discovery process, allowing researchers to focus on essential aspects of pharmaceutical treatment research.

4.
Neurol Neuroimmunol Neuroinflamm ; 11(3): e200213, 2024 May.
Article in English | MEDLINE | ID: mdl-38564686

ABSTRACT

BACKGROUND AND OBJECTIVES: In progressive multiple sclerosis (MS), compartmentalized inflammation plays a pivotal role in the complex pathology of tissue damage. The interplay between epigenetic regulation, transcriptional modifications, and location-specific alterations within white matter (WM) lesions at the single-cell level remains underexplored. METHODS: We examined intracellular and intercellular pathways in the MS brain WM using a novel dataset obtained by integrated single-cell multi-omics techniques from 3 active lesions, 3 chronic active lesions, 3 remyelinating lesions, and 3 control WM of 6 patients with progressive MS and 3 non-neurologic controls. Single-nucleus RNA-seq and ATAC-seq were combined and additionally enriched with newly conducted spatial transcriptomics from 1 chronic active lesion. Functional gene modules were then validated in our previously published bulk tissue transcriptome data obtained from 73 WM lesions of patients with progressive MS and 25 WM of non-neurologic disease controls. RESULTS: Our analysis uncovered an MS-specific oligodendrocyte genetic signature influenced by the KLF/SP gene family. This modulation has potential associations with the autocrine iron uptake signaling observed in transcripts of transferrin and its receptor LRP2. In addition, an inflammatory profile emerged within these oligodendrocytes. We observed unique cellular endophenotypes both at the periphery and within the chronic active lesion. These include a distinct metabolic astrocyte phenotype, the importance of FGF signaling among astrocytes and neurons, and a notable enrichment of mitochondrial genes at the lesion edge populated predominantly by astrocytes. Our study also identified B-cell coexpression networks indicating different functional B-cell subsets with differential location and specific tendencies toward certain lesion types. 
DISCUSSION: The use of single-cell multi-omics has offered a detailed perspective into the cellular dynamics and interactions in MS. These nuanced findings might pave the way for deeper insights into lesion pathogenesis in progressive MS.


Subject(s)
Multiple Sclerosis, Chronic Progressive , Multiple Sclerosis , White Matter , Humans , Multiple Sclerosis/genetics , Multiple Sclerosis/pathology , Epigenesis, Genetic , Multiomics , Multiple Sclerosis, Chronic Progressive/genetics , Multiple Sclerosis, Chronic Progressive/pathology , White Matter/pathology
5.
Bioinform Adv ; 4(1): vbae034, 2024.
Article in English | MEDLINE | ID: mdl-38505804

ABSTRACT

Summary: Diseases can be caused by molecular perturbations that induce specific changes in regulatory interactions and their coordinated expression, also referred to as network rewiring. However, the detection of complex changes in regulatory connections remains a challenging task and would benefit from the development of novel nonparametric approaches. We develop a new ensemble method called BoostDiff (boosted differential regression trees) to infer a differential network discriminating between two conditions. BoostDiff builds an adaptively boosted (AdaBoost) ensemble of differential trees with respect to a target condition. To build the differential trees, we propose differential variance improvement as a novel splitting criterion. Variable importance measures derived from the resulting models are used to reflect changes in gene expression predictability and to build the output differential networks. BoostDiff outperforms existing differential network methods on simulated data evaluated in four different complexity settings. We then demonstrate the power of our approach when applied to real transcriptomics data in COVID-19, Crohn's disease, breast cancer, prostate adenocarcinoma, and stress response in Bacillus subtilis. BoostDiff identifies context-specific networks that are enriched with genes of known disease-relevant pathways and complements standard differential expression analyses. Availability and implementation: BoostDiff is available at https://github.com/scibiome/boostdiff_inference.

6.
Microb Genom ; 10(2), 2024 Feb.
Article in English | MEDLINE | ID: mdl-38421266

ABSTRACT

Molecular profiling techniques such as metagenomics, metatranscriptomics or metabolomics offer important insights into the functional diversity of the microbiome. In contrast, 16S rRNA gene sequencing, a widespread and cost-effective technique to measure microbial diversity, only allows for indirect estimation of microbial function. To mitigate this, tools such as PICRUSt2, Tax4Fun2, PanFP and MetGEM infer functional profiles from 16S rRNA gene sequencing data using different algorithms. Prior studies have cast doubts on the quality of these predictions, motivating us to systematically evaluate these tools using matched 16S rRNA gene sequencing, metagenomic datasets, and simulated data. Our contribution is threefold: (i) using simulated data, we investigate if technical biases could explain the discordance between inferred and expected results; (ii) considering human cohorts for type two diabetes, colorectal cancer and obesity, we test if health-related differential abundance measures of functional categories are concordant between 16S rRNA gene-inferred and metagenome-derived profiles and; (iii) since 16S rRNA gene copy number is an important confounder in functional profiles inference, we investigate if a customised copy number normalisation with the rrnDB database could improve the results. Our results show that 16S rRNA gene-based functional inference tools generally do not have the necessary sensitivity to delineate health-related functional changes in the microbiome and should thus be used with care. Furthermore, we outline important differences in the individual tools tested and offer recommendations for tool selection.
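The copy number normalisation mentioned in point (iii) can be sketched as follows. This is a minimal illustration, not the evaluated tools' implementation: the function name, the taxon labels, and the counts are hypothetical, and in practice the copy numbers would be looked up in a resource such as rrnDB.

```python
def normalise_by_copy_number(counts, copy_numbers, default_copies=1.0):
    """Correct 16S read counts for gene copy number, then rescale to fractions.

    counts:       taxon -> raw 16S read count
    copy_numbers: taxon -> number of 16S rRNA gene copies per genome
    Taxa missing from copy_numbers fall back to default_copies (an assumption).
    """
    corrected = {}
    for taxon, count in counts.items():
        copies = copy_numbers.get(taxon, default_copies)
        corrected[taxon] = count / copies

    total = sum(corrected.values())
    if total == 0:
        return {taxon: 0.0 for taxon in corrected}
    # Relative abundance after copy number correction.
    return {taxon: value / total for taxon, value in corrected.items()}


# Hypothetical values, for illustration only.
counts = {"taxon_A": 700, "taxon_B": 300}
copy_numbers = {"taxon_A": 7, "taxon_B": 1}
abundances = normalise_by_copy_number(counts, copy_numbers)
```

Without the correction, taxon_A would appear to make up 70% of the community; after dividing by its seven gene copies it accounts for 25%.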


Subject(s)
Metagenome , Microbiota , Humans , RNA, Ribosomal, 16S/genetics , Genes, rRNA , Microbiota/genetics , Algorithms
7.
Sci Rep ; 14(1): 2808, 2024 Feb 02.
Article in English | MEDLINE | ID: mdl-38307916

ABSTRACT

Bulk RNA sequencing (RNA-seq) of blood is typically used for gene expression analysis in biomedical research but is still rarely used in clinical practice. In this study, we propose that RNA-seq should be considered a diagnostic tool, as it offers not only insights into aberrant gene expression and splicing but also delivers additional readouts on immune cell type composition as well as B-cell and T-cell receptor (BCR/TCR) repertoires. We demonstrate that RNA-seq offers insights into a patient's immune status via integrative analysis of RNA-seq data from patients infected with various SARS-CoV-2 variants (in total 196 samples with up to 200 million reads sequencing depth). We compare the results of computational cell-type deconvolution methods (e.g., MCP-counter, xCell, EPIC, quanTIseq) to complete blood count data, the current gold standard in clinical practice. We observe varying levels of lymphocyte depletion and significant differences in neutrophil levels between SARS-CoV-2 variants. Additionally, we identify B and T cell receptor (BCR/TCR) sequences using the tools MiXCR and TRUST4 to show that, combined with sequence alignments and BLASTp, they could be used to classify a patient's disease. Finally, we investigated the sequencing depth required for such analyses and concluded that 10 million reads per sample is sufficient. In conclusion, our study reveals that computational cell-type deconvolution and BCR/TCR methods using bulk RNA-seq analyses can supplement missing CBC data and offer insights into immune responses, disease severity, and pathogen-specific immunity, all achievable with a sequencing depth of 10 million reads per sample.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , COVID-19/genetics , Gene Expression Profiling , Receptors, Antigen, T-Cell/genetics , Sequence Analysis, RNA/methods , Immunity
8.
bioRxiv ; 2024 Jan 15.
Article in English | MEDLINE | ID: mdl-38313260

ABSTRACT

RNA sequencing offers unique insights into transcriptome diversity, and a plethora of tools have been developed to analyze alternative splicing. One important task is to detect changes in the relative transcript abundance in differential transcript usage (DTU) analysis. The choice of the right analysis tool is non-trivial and depends on experimental factors such as the availability of single- or paired-end and bulk or single-cell data. To help users select the most promising tool for their task, we performed a comprehensive benchmark of DTU detection tools. We cover a wide array of experimental settings, using simulated bulk and single-cell RNA-seq data as well as real transcriptomics datasets, including time-series data. Our results suggest that DEXSeq, edgeR, and LimmaDS are better choices for paired-end data, while DSGseq and DEXSeq can be used for single-end data. In single-cell simulation settings, we showed that satuRn performs better than DTUrtle. In addition, we showed that Spycone is optimal for time series DTU/IS analysis based on the evidence provided using GO terms enrichment analysis.
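The quantity that DTU analysis compares, the relative abundance of each transcript within its gene, can be sketched as below. This is a simplified illustration with hypothetical transcript names and counts; it only computes usage proportions and does not reproduce the statistical testing performed by the benchmarked tools.

```python
from collections import defaultdict


def transcript_usage(transcript_counts, transcript_to_gene):
    """Return each transcript's share of its gene's total expression."""
    gene_totals = defaultdict(float)
    for transcript, count in transcript_counts.items():
        gene_totals[transcript_to_gene[transcript]] += count

    usage = {}
    for transcript, count in transcript_counts.items():
        total = gene_totals[transcript_to_gene[transcript]]
        # Genes with no expression get a usage of 0 for all their transcripts.
        usage[transcript] = count / total if total > 0 else 0.0
    return usage


# Hypothetical gene with two isoforms measured in two conditions.
mapping = {"tx1": "geneG", "tx2": "geneG"}
usage_a = transcript_usage({"tx1": 80, "tx2": 20}, mapping)
usage_b = transcript_usage({"tx1": 40, "tx2": 60}, mapping)

# Change in usage between conditions; a DTU tool would test whether
# such a shift is statistically significant across replicates.
shift_tx1 = usage_b["tx1"] - usage_a["tx1"]
```

Here the total expression of the gene is identical in both conditions (100 counts), so a gene-level differential expression test would see no change, while the usage of tx1 drops from 0.8 to 0.4.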

9.
CPT Pharmacometrics Syst Pharmacol ; 13(2): 257-269, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37950385

ABSTRACT

High drug development costs and the limited number of new annual drug approvals increase the need for innovative approaches for drug effect prediction. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the cause of coronavirus disease 2019 (COVID-19), led to a global pandemic with high morbidity and mortality. Although effective preventive measures exist, there are few effective treatments for hospitalized patients with SARS-CoV-2 infection. Drug repurposing and drug effect prediction are promising strategies that could shorten development time and reduce costs compared with de novo drug discovery. In this work, we present a machine learning framework to integrate a variety of target network features and physicochemical properties of compounds, and analyze their influence on the therapeutic effects for SARS-CoV-2 infection and on host cell cytotoxic effects. Random forest models trained on compounds with known experimental effects on SARS-CoV-2 infection and subsequent feature importance analysis based on Shapley values provided insights into the determinants of drug efficacy and cytotoxicity, which can be incorporated into novel drug discovery approaches. Given the complexity of molecular mechanisms of drug action and limited sample sizes, our models achieve a reasonable mean area under the receiver operating characteristic curve (ROC-AUC) of 0.73 on an unseen validation set. To our knowledge, this is the first work to incorporate a combination of network and physicochemical features of compounds into a machine learning model to predict drug effects on SARS-CoV-2 infection. Our systems pharmacology-based machine learning framework can be used to classify other existing drugs for SARS-CoV-2 infection and can easily be adapted to drug effect prediction for future viral outbreaks.
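For readers unfamiliar with the reported metric, ROC-AUC can be computed as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below uses this pairwise formulation with made-up labels and scores; it is not the authors' evaluation code.

```python
def roc_auc(labels, scores):
    """ROC-AUC via pairwise comparison of positive and negative scores.

    labels: iterable of 0/1 class labels
    scores: iterable of model scores, higher meaning "more likely positive"
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical compounds: 1 = active, 0 = inactive.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
auc = roc_auc(labels, scores)
```

In this toy example three of the four positive-negative pairs are ranked correctly, giving an AUC of 0.75. A value of 0.5 corresponds to random ranking and 1.0 to perfect separation.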


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , Drug Discovery , Drug Development , Machine Learning
10.
bioRxiv ; 2023 Nov 06.
Article in English | MEDLINE | ID: mdl-38076885

ABSTRACT

Bulk RNA sequencing (RNA-seq) of blood is typically used for gene expression analysis in biomedical research but is still rarely used in clinical practice. In this study, we argue that RNA-seq should be considered a routine diagnostic tool, as it offers not only insights into aberrant gene expression and splicing but also delivers additional readouts on immune cell type composition as well as B-cell and T-cell receptor (BCR/TCR) repertoires. We demonstrate that RNA-seq offers vital insights into a patient's immune status via integrative analysis of RNA-seq data from patients infected with various SARS-CoV-2 variants (in total 240 samples with up to 200 million reads sequencing depth). We compare the results of computational cell-type deconvolution methods (e.g., MCP-counter, xCell, EPIC, quanTIseq) to complete blood count data, the current gold standard in clinical practice. We observe varying levels of lymphocyte depletion and significant differences in neutrophil levels between SARS-CoV-2 variants. Additionally, we identify B and T cell receptor (BCR/TCR) sequences using the tools MiXCR and TRUST4 to show that, combined with sequence alignments and BLASTp, they could be used to classify a patient's disease. Finally, we investigated the sequencing depth required for such analyses and concluded that 10 million reads per sample is sufficient. In conclusion, our study reveals that computational cell-type deconvolution and BCR/TCR methods using bulk RNA-seq analyses can supplement missing CBC data and offer insights into immune responses, disease severity, and pathogen-specific immunity, all achievable with a sequencing depth of 10 million reads per sample.

11.
medRxiv ; 2023 Nov 09.
Article in English | MEDLINE | ID: mdl-38076997

ABSTRACT

Most heritable diseases are polygenic. To comprehend the underlying genetic architecture, it is crucial to discover the clinically relevant epistatic interactions (EIs) between genomic single nucleotide polymorphisms (SNPs). Existing statistical computational methods for EI detection are mostly limited to pairs of SNPs due to the combinatorial explosion of higher-order EIs. With NeEDL (network-based epistasis detection via local search), we leverage network medicine to inform the selection of EIs that are an order of magnitude more statistically significant compared to existing tools and consist, on average, of five SNPs. We further show that this computationally demanding task can be substantially accelerated once quantum computing hardware becomes available. We apply NeEDL to eight different diseases and discover genes (affected by EIs of SNPs) that are partly known to affect the disease; additionally, these results are reproducible across independent cohorts. EIs for these eight diseases can be interactively explored in the Epistasis Disease Atlas (https://epistasis-disease-atlas.com). In summary, NeEDL is the first application that demonstrates the potential of seamlessly integrated quantum computing techniques to accelerate biomedical research. Our network medicine approach detects higher-order EIs with unprecedented statistical and biological evidence, yielding unique insights into polygenic diseases and providing a basis for the development of improved risk scores and combination therapies.
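The combinatorial explosion that limits exhaustive EI detection to SNP pairs can be made concrete with a short calculation. The SNP count of 1,000 is an arbitrary illustrative figure, far smaller than a real genome-wide panel, and the function name is hypothetical.

```python
import math


def candidate_sets(n_snps, order):
    """Number of distinct SNP sets of a given size (order) among n_snps SNPs."""
    return math.comb(n_snps, order)


n_snps = 1000  # illustrative only; real panels contain far more SNPs
pairs = candidate_sets(n_snps, 2)
quintuples = candidate_sets(n_snps, 5)

print(f"pairs:      {pairs:,}")
print(f"quintuples: {quintuples:,}")
print(f"ratio:      {quintuples / pairs:.2e}")
```

Even for this small panel, sets of five SNPs outnumber pairs by more than seven orders of magnitude, which is why higher-order searches rely on heuristics such as local search rather than exhaustive enumeration.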

12.
Biomedicines ; 11(12), 2023 Nov 28.
Article in English | MEDLINE | ID: mdl-38137391

ABSTRACT

BACKGROUND: Blood-brain barrier (BBB) breakdown and active inflammation are hallmarks of relapsing multiple sclerosis (RMS), but the molecular events contributing to the development of new lesions are not well explored. Leaky endothelial junctions are associated with increased production of endothelial-derived extracellular microvesicles (EVs) and result in the entry of circulating immune cells into the brain. MRI with intravenous gadolinium (Gd) can visualize acute blood-brain barrier disruption as the initial event of the evolution of new lesions. METHODS: Here, weekly MRI with Gd was combined with proteomics, multiplex immunoassay, and endothelial stress-optimized EV array to identify early markers related to BBB disruption. Five patients with RMS with no disease-modifying treatment were monitored weekly using high-resolution 3T MRI scanning with intravenous gadolinium (Gd) for 8 weeks. Patients were then divided into three groups (low, medium, or high MRI activity) defined by the number of new, total, and maximally enhancing Gd-enhancing lesions and the number of new FLAIR lesions. Plasma samples taken at each MRI were analyzed for protein biomarkers of inflammation by quantitative proteomics, and cytokines using multiplex immunoassays. EVs were characterized with an optimized endothelial stress EV array based on exosome surface protein markers for the detection of soluble secreted EVs. RESULTS: Proteomics analysis of plasma yielded quantitative information on 208 proteins at each patient time point (n = 40). We observed the highest number of unique dysregulated proteins (DEPs) and the highest functional enrichment in the low vs. high MRI activity comparison. Complement activation and complement/coagulation cascade were also strongly overrepresented in the low vs. high MRI activity comparison. 
Activation of the alternative complement pathway, pathways of blood coagulation, extracellular matrix organization, and the regulation of TLR and IGF transport were unique for the low vs. high MRI activity comparison as well, with these pathways being overrepresented in the patient with high MRI activity. Principal component analysis indicated the individuality of plasma profiles in patients. IL-17 was upregulated at all time points during 8 weeks in patients with high vs. low MRI activity. Hierarchical clustering of soluble markers in the plasma indicated that all four MRI outcomes clustered together with IL-17, IL-12p70, and IL-1β. MRI outcomes also showed clustering with EV markers CD62E/P, MIC A/B, ICAM-1, and CD42A. The combined cluster of these cytokines, EV markers, and MRI outcomes clustered also with IL-12p40 and IL-7. All four MRI outcomes correlated positively with levels of IL-17 (p < 0.001, respectively), and EV-ICAM-1 (p < 0.0003, respectively). IL-1β levels positively correlated with the number of new Gd-enhancing lesions (p < 0.01), new FLAIR lesions (p < 0.001), and total number of Gd-enhancing lesions (p < 0.05). IL-6 levels positively correlated with the number of new FLAIR lesions (p < 0.05). Random Forests and linear mixed models identified IL-17, CCL17/TARC, CCL3/MIP-1α, and TNF-α as composite biomarkers predicting new lesion evolution. CONCLUSIONS: Combination of serial frequent MRI with proteome, neuroinflammation markers, and protein array data of EVs enabled assessment of temporal changes in inflammation and endothelial dysfunction in RMS related to the evolution of new and enhancing lesions. Particularly, the Th17 pathway and IL-1β clustered and correlated with new lesions and Gd enhancement, indicating their importance in BBB disruption and initiating acute brain inflammation in MS. 
In addition to the Th17 pathway, abundant protein changes between MRI activity groups suggested the role of EVs and the coagulation system along with innate immune responses including acute phase proteins, complement components, and neutrophil degranulation.

13.
Bioinformatics ; 39(11), 2023 Nov 01.
Article in English | MEDLINE | ID: mdl-37862243

ABSTRACT

MOTIVATION: The reconstruction of small key regulatory networks that explain the differences in the development of cell (sub)types from single-cell RNA sequencing is a yet unresolved computational problem. RESULTS: To this end, we have developed SCANet, an all-in-one package for single-cell profiling that covers the whole differential mechanotyping workflow, from inference of trait/cell-type-specific gene co-expression modules, driver gene detection, and transcriptional gene regulatory network reconstruction to mechanistic drug repurposing candidate prediction. To illustrate the power of SCANet, we examined data from two studies. First, we identify the drivers of the mechanotype of a cytokine storm associated with increased mortality in patients with acute respiratory illness. Secondly, we find 20 drugs for eight potential pharmacological targets in cellular driver mechanisms in the intestinal stem cells of obese mice. AVAILABILITY AND IMPLEMENTATION: SCANet is a free, open-source, and user-friendly Python package that can be seamlessly integrated into single-cell-based systems medicine research and mechanistic drug discovery.


Subject(s)
Gene Expression Profiling , Software , Humans , Animals , Mice , Sequence Analysis, RNA , Drug Repositioning , Single-Cell Gene Expression Analysis , Single-Cell Analysis , Gene Regulatory Networks
14.
NPJ Syst Biol Appl ; 9(1): 49, 2023 Oct 10.
Article in English | MEDLINE | ID: mdl-37816770

ABSTRACT

Proteomics technologies, which include a diverse range of approaches such as mass spectrometry-based, array-based, and others, are key technologies for the identification of biomarkers and disease mechanisms, referred to as mechanotyping. Despite over 15,000 published studies in 2022 alone, leveraging publicly available proteomics data for biomarker identification, mechanotyping and drug target identification is not readily possible. Proteomic data addressing similar biological/biomedical questions are made available by multiple research groups in different locations using different model organisms. Furthermore, not only various organisms are employed but different assay systems, such as in vitro and in vivo systems, are used. Finally, even though proteomics data are deposited in public databases, such as ProteomeXchange, they are provided at different levels of detail. Thus, data integration is hampered by non-harmonized usage of identifiers when reviewing the literature or performing meta-analyses to consolidate existing publications into a joint picture. To address this problem, we present ProHarMeD, a tool for harmonizing and comparing proteomics data gathered in multiple studies and for the extraction of disease mechanisms and putative drug repurposing candidates. It is available as a website, Python library and R package. ProHarMeD facilitates ID and name conversions between protein and gene levels, or organisms via ortholog mapping, and provides detailed logs on the loss and gain of IDs after each step. The web tool further determines IDs shared by different studies, proposes potential disease mechanisms as well as drug repurposing candidates automatically, and visualizes these results interactively. We apply ProHarMeD to a set of four studies on bone regeneration. First, we demonstrate the benefit of ID harmonization which increases the number of shared genes between studies by 50%. 
Second, we identify a potential disease mechanism, with five corresponding drug targets, and the top 20 putative drug repurposing candidates, of which Fondaparinux, the candidate with the highest score, and multiple others are known to have an impact on bone regeneration. Hence, ProHarMeD allows users to harmonize multi-centric proteomics research data in meta-analyses, evaluates the success of the ID conversions and remappings, and finally, it closes the gaps between proteomics, disease mechanism mining and drug repurposing. It is publicly available at https://apps.cosy.bio/proharmed/ .


Subject(s)
Drug Repositioning , Proteomics , Proteomics/methods , Proteins , Biomarkers
15.
NAR Genom Bioinform ; 5(3): lqad081, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37705830

ABSTRACT

MicroRNAs (miRNAs) are small non-coding RNA molecules that bind to target sites in different gene regions and regulate post-transcriptional gene expression. Approximately 95% of human multi-exon genes can be spliced alternatively, which enables the production of functionally diverse transcripts and proteins from a single gene. Through alternative splicing, transcripts might lose the exon with the miRNA target site and become unresponsive to miRNA regulation. To check this hypothesis, we studied the role of miRNA target sites in both coding and non-coding regions using six cancer data sets from The Cancer Genome Atlas (TCGA) and Parkinson's disease data from PPMI. First, we predicted miRNA target sites on mRNAs from their sequence using TarPmiR. To check whether alternative splicing interferes with this regulation, we trained linear regression models to predict miRNA expression from transcript expression. Using nested models, we compared the predictive power of transcripts with miRNA target sites in the coding regions to that of transcripts without target sites. Models containing transcripts with target sites perform significantly better. We conclude that alternative splicing does interfere with miRNA regulation by skipping exons with miRNA target sites within the coding region.

16.
Bioinform Adv ; 3(1): vbad093, 2023.
Article in English | MEDLINE | ID: mdl-37485422

ABSTRACT

Motivation: Circular RNAs (circRNAs) are long noncoding RNAs (lncRNAs) often associated with diseases and considered potential biomarkers for diagnosis and treatment. Among other functions, circRNAs have been shown to act as microRNA (miRNA) sponges, preventing the role of miRNAs that repress their targets. However, there is no pipeline to systematically assess the sponging potential of circRNAs. Results: We developed circRNA-sponging, a Nextflow pipeline that (i) identifies circRNAs via backsplicing junctions detected in RNA-seq data, (ii) quantifies their expression values in relation to their linear counterparts spliced from the same gene, (iii) performs differential expression analysis, (iv) identifies and quantifies miRNA expression from miRNA-sequencing (miRNA-seq) data, (v) predicts miRNA binding sites on circRNAs, (vi) systematically investigates potential circRNA-miRNA sponging events, (vii) creates a network of competing endogenous RNAs and (viii) identifies potential circRNA biomarkers. We showed the functionality of the circRNA-sponging pipeline using RNA sequencing data from brain tissues, where we identified two distinct types of circRNAs characterized by a specific ratio of the number of binding sites to the length of the transcript. The circRNA-sponging pipeline is the first end-to-end pipeline to identify circRNAs and their sponging systematically with raw total RNA-seq and miRNA-seq files, allowing us to better indicate the functional impact of circRNAs as a routine aspect in transcriptomic research. Availability and implementation: https://github.com/biomedbigdata/circRNA-sponging. Supplementary information: Supplementary data are available at Bioinformatics Advances online.

17.
J Med Internet Res ; 25: e42621, 2023 Jul 12.
Article in English | MEDLINE | ID: mdl-37436815

ABSTRACT

BACKGROUND: Machine learning and artificial intelligence have shown promising results in many areas and are driven by the increasing amount of available data. However, these data are often distributed across different institutions and cannot be easily shared owing to strict privacy regulations. Federated learning (FL) allows the training of distributed machine learning models without sharing sensitive data. However, implementing FL is time-consuming and requires advanced programming skills and complex technical infrastructures. OBJECTIVE: Various tools and frameworks have been developed to simplify the development of FL algorithms and provide the necessary technical infrastructure. Although there are many high-quality frameworks, most focus only on a single application case or method. To our knowledge, there are no generic frameworks, meaning that the existing solutions are restricted to a particular type of algorithm or application field. Furthermore, most of these frameworks provide an application programming interface that needs programming knowledge. There is no collection of ready-to-use FL algorithms that are extendable and allow users (eg, researchers) without programming knowledge to apply FL. A central FL platform for both FL algorithm developers and users does not exist. This study aimed to address this gap and make FL available to everyone by developing FeatureCloud, an all-in-one platform for FL in biomedicine and beyond. METHODS: The FeatureCloud platform consists of 3 main components: a global frontend, a global backend, and a local controller. Our platform uses Docker to separate the local acting components of the platform from the sensitive data systems. We evaluated our platform using 4 different algorithms on 5 data sets for both accuracy and runtime. 
RESULTS: FeatureCloud removes the complexity of distributed systems for developers and end users by providing a comprehensive platform for executing multi-institutional FL analyses and implementing FL algorithms. Through its integrated artificial intelligence store, federated algorithms can easily be published and reused by the community. To secure sensitive raw data, FeatureCloud supports privacy-enhancing technologies to secure the shared local models and assures high standards in data privacy to comply with the strict General Data Protection Regulation. Our evaluation shows that applications developed in FeatureCloud can produce highly similar results compared with centralized approaches and scale well for an increasing number of participating sites. CONCLUSIONS: FeatureCloud provides a ready-to-use platform that integrates the development and execution of FL algorithms while reducing the complexity to a minimum and removing the hurdles of federated infrastructure. Thus, we believe that it has the potential to greatly increase the accessibility of privacy-preserving and distributed data analyses in biomedicine and beyond.
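
The idea of training across sites without sharing raw data can be illustrated with a minimal sketch of sample-size-weighted model averaging, a common aggregation step in FL. The abstract does not describe which aggregation FeatureCloud apps use, so this is a generic illustration under that assumption; the function name and data layout are hypothetical and do not reflect the FeatureCloud API.

```python
from typing import List, Sequence, Tuple


def federated_average(
    site_updates: Sequence[Tuple[List[float], int]]
) -> List[float]:
    """Combine local model weights into one global weight vector.

    Each entry is (local_weights, n_samples). Only these weights and
    counts leave a site; the raw records stay local. The layout is a
    hypothetical one chosen for this sketch.
    """
    if not site_updates:
        raise ValueError("at least one site update is required")
    total_samples = sum(n for _, n in site_updates)
    if total_samples <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(site_updates[0][0])
    aggregated = [0.0] * dim
    for weights, n_samples in site_updates:
        if len(weights) != dim:
            raise ValueError("all sites must send vectors of equal length")
        # Each site contributes in proportion to its share of the samples.
        for i, w in enumerate(weights):
            aggregated[i] += w * n_samples / total_samples
    return aggregated


# Two hypothetical sites with 1 and 3 samples, respectively.
global_weights = federated_average([([1.0, 2.0], 1), ([3.0, 4.0], 3)])
```

Weighting by sample count keeps a small site from pulling the global model as strongly as a large one. Privacy-enhancing technologies, as mentioned in the abstract, would additionally protect the weight vectors themselves.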


Subject(s)
Algorithms , Artificial Intelligence , Humans , Health Occupations , Software , Computer Communication Networks , Privacy
18.
Cells ; 12(10)2023 05 10.
Article in English | MEDLINE | ID: mdl-37408191

ABSTRACT

Architectural proteins are essential epigenetic regulators that play a critical role in organizing chromatin and controlling gene expression. CTCF (CCCTC-binding factor) is a key architectural protein responsible for maintaining the intricate 3D structure of chromatin. Because of its multivalent properties and its plasticity in binding various sequences, CTCF is similar to a Swiss Army knife for genome organization. Despite the importance of this protein, its mechanisms of action are not fully elucidated. It has been hypothesized that its versatility is achieved through interaction with multiple partners, forming a complex network that regulates chromatin folding within the nucleus. In this review, we delve into CTCF's interactions with other molecules involved in epigenetic processes, particularly histone and DNA demethylases, as well as several long non-coding RNAs (lncRNAs) that are able to recruit CTCF. Our review highlights the importance of CTCF partners to shed light on chromatin regulation and pave the way for future exploration of the mechanisms that enable the finely-tuned role of CTCF as a master regulator of chromatin.


Subject(s)
Chromatin , DNA , CCCTC-Binding Factor/genetics , DNA/metabolism , Cell Nucleus/metabolism , Genome
19.
Cells ; 12(13)2023 07 06.
Article in English | MEDLINE | ID: mdl-37443829

ABSTRACT

Glomerular disease due to podocyte malfunction is a major factor in the pathogenesis of chronic kidney disease. Identification of podocyte-specific signaling pathways is therefore a prerequisite to characterizing relevant disease pathways and developing novel treatment approaches. Here, we employed loss of function studies for EPB41L5 (Yurt) as a central podocyte gene to generate a cell type-specific disease model. Loss of Yurt in fly nephrocytes caused protein uptake and slit diaphragm defects. Transcriptomic and proteomic analysis of human EPB41L5 knockout podocytes demonstrated impaired mechanotransduction via the YAP/TAZ signaling pathway. Further analysis of specific inhibition of the YAP/TAZ-TEAD transcription factor complex by TEADi led to the identification of ARHGAP29 as a podocyte RhoGAP whose expression depends on EPB41L5 and YAP/TAZ. Knockdown of ARHGAP29 caused increased RhoA activation, defective lamellipodia formation, and increased maturation of integrin adhesion complexes, explaining similar phenotypes caused by loss of EPB41L5 and TEADi expression in podocytes. Detection of increased levels of ARHGAP29 in early disease stages of human glomerular disease implies a novel negative feedback loop for mechanotransductive RhoA-YAP/TAZ signaling in podocyte physiology and disease.


Subject(s)
Podocytes , Humans , Podocytes/metabolism , Adaptor Proteins, Signal Transducing/metabolism , YAP-Signaling Proteins , Mechanotransduction, Cellular , Integrins/metabolism , Proteomics , rhoA GTP-Binding Protein/metabolism , Signal Transduction , GTPase-Activating Proteins/metabolism , Membrane Proteins/metabolism
20.
ArXiv ; 2023 Jul 04.
Article in English | MEDLINE | ID: mdl-37332567

ABSTRACT

In recent decades, the development of new drugs has become increasingly expensive and inefficient, and the molecular mechanisms of most pharmaceuticals remain poorly understood. In response, computational systems and network medicine tools have emerged to identify potential drug repurposing candidates. However, these tools often require complex installation and lack intuitive visual network mining capabilities. To tackle these challenges, we introduce Drugst.One, a platform that assists specialized computational medicine tools in becoming user-friendly, web-based utilities for drug repurposing. With just three lines of code, Drugst.One turns any systems biology software into an interactive web tool for modeling and analyzing complex protein-drug-disease networks. Demonstrating its broad adaptability, Drugst.One has been successfully integrated with 21 computational systems medicine tools. Available at https://drugst.one, Drugst.One has significant potential for streamlining the drug discovery process, allowing researchers to focus on essential aspects of pharmaceutical treatment research.
