Results 1 - 5 of 5
1.
J Pers Med ; 14(5), 2024 May 11.
Article in English | MEDLINE | ID: mdl-38793096

ABSTRACT

Despite the extensive literature on missing data theory and cautionary articles emphasizing the importance of realistic analysis for healthcare data, a critical gap persists in incorporating domain knowledge into missing data methods. In this paper, we argue that the remedy is to identify the key scenarios that lead to data missingness and investigate their theoretical implications. Based on this proposal, we first introduce an analysis framework in which we investigate how different observation agents, such as physicians, influence data availability, and then scrutinize each scenario with respect to the steps in the missing data analysis. We apply this framework to the case study of observational data in healthcare facilities. We identify ten fundamental missingness scenarios and show how they influence the identification step for missing data graphical models, inverse probability weighting estimation, and exponential tilting sensitivity analysis. To emphasize how domain-informed analysis can improve method reliability, we conduct simulation studies under the influence of various missingness scenarios. We compare the results of three common methods in medical data analysis: complete-case analysis, MissForest imputation, and inverse probability weighting estimation. The experiments are conducted for two objectives: variable mean estimation and classification accuracy. We advocate for our analysis approach as a reference for observational health data analysis. Beyond that, we also posit that the proposed analysis framework is applicable to other medical domains.
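To illustrate the contrast between complete-case analysis and inverse probability weighting for mean estimation, here is a minimal sketch. It is not the authors' implementation: the toy records, the group labels, the function names, and the choice to estimate observation probabilities as per-group observed fractions (with a normalized, Hajek-style weighted mean) are all assumptions made for illustration.

```python
from collections import defaultdict

def complete_case_mean(records):
    """Mean of the observed values only, ignoring why others are missing."""
    observed = [y for _, y in records if y is not None]
    return sum(observed) / len(observed)

def ipw_mean(records):
    """Weighted mean where each observed value is weighted by the inverse
    of its group's estimated probability of being observed."""
    counts = defaultdict(lambda: [0, 0])  # group -> [n_observed, n_total]
    for group, y in records:
        counts[group][1] += 1
        if y is not None:
            counts[group][0] += 1

    numerator = 0.0
    denominator = 0.0
    for group, y in records:
        if y is None:
            continue
        n_observed, n_total = counts[group]
        weight = n_total / n_observed  # 1 / estimated observation probability
        numerator += weight * y
        denominator += weight
    return numerator / denominator

# Hypothetical toy data: (group, value); None marks a missing value.
# Group "B" is measured only half of the time, so missingness depends
# on the group (a missing-at-random pattern given the group).
records = (
    [("A", 10.0)] * 4
    + [("B", 20.0), ("B", 20.0), ("B", None), ("B", None)]
)

print(complete_case_mean(records))  # biased toward group "A"
print(ipw_mean(records))            # reweights group "B" back to its share
```

In this toy setting the full-data mean is 15; complete-case analysis under-represents group "B", while the weighted estimate restores its share. The sketch assumes every group has at least one observed value.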

2.
BMC Pediatr ; 24(1): 249, 2024 Apr 11.
Article in English | MEDLINE | ID: mdl-38605404

ABSTRACT

BACKGROUND: Long-term survival after premature birth is significantly determined by development of morbidities, primarily affecting the cardio-respiratory or central nervous system. Existing studies are limited to pairwise morbidity associations, thereby lacking a holistic understanding of morbidity co-occurrence and respective risk profiles. METHODS: Our study, for the first time, aimed at delineating and characterizing morbidity profiles at near-term age and investigated the most prevalent morbidities in preterm infants: bronchopulmonary dysplasia (BPD), pulmonary hypertension (PH), mild cardiac defects, perinatal brain pathology and retinopathy of prematurity (ROP). For analysis, we employed two independent, prospective cohorts, comprising a total of 530 very preterm infants: AIRR ("Attention to Infants at Respiratory Risks") and NEuroSIS ("Neonatal European Study of Inhaled Steroids"). Using a data-driven strategy, we successfully characterized morbidity profiles of preterm infants in a stepwise approach and (1) quantified pairwise morbidity correlations, (2) assessed the discriminatory power of BPD (complemented by imaging-based structural and functional lung phenotyping) in relation to these morbidities, (3) investigated collective co-occurrence patterns, and (4) identified infant subgroups who share similar morbidity profiles using machine learning techniques. RESULTS: First, we showed that, in line with pathophysiologic understanding, BPD and ROP have the highest pairwise correlation, followed by BPD and PH as well as BPD and mild cardiac defects. Second, we revealed that BPD exhibits only limited capacity in discriminating morbidity occurrence, despite its prevalence and clinical indication as a driver of comorbidities. Further, we demonstrated that structural and functional lung phenotyping did not exhibit higher association with morbidity severity than BPD. Lastly, we identified patient clusters that share similar morbidity patterns using machine learning in AIRR (n=6 clusters) and NEuroSIS (n=8 clusters). CONCLUSIONS: By capturing correlations as well as more complex morbidity relations, we provided a comprehensive characterization of morbidity profiles at discharge, linked to shared disease pathophysiology. Future studies could benefit from identifying risk profiles to thereby develop personalized monitoring strategies. TRIAL REGISTRATION: AIRR: DRKS.de, DRKS00004600, 28/01/2013. NEuroSIS: ClinicalTrials.gov, NCT01035190, 18/12/2009.
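The first analysis step, quantifying pairwise morbidity correlations, can be sketched for binary morbidity indicators as follows. The abstract does not say which correlation measure was used; the phi coefficient here is an assumption, as are the function name and the toy indicator vectors, which are invented and not taken from the AIRR or NEuroSIS cohorts.

```python
from math import sqrt

def phi_coefficient(x, y):
    """Phi coefficient (Pearson correlation of two binary variables),
    computed from the 2x2 contingency table of x and y."""
    n11 = sum(1 for a, b in zip(x, y) if a and b)
    n10 = sum(1 for a, b in zip(x, y) if a and not b)
    n01 = sum(1 for a, b in zip(x, y) if not a and b)
    n00 = sum(1 for a, b in zip(x, y) if not a and not b)
    denominator = sqrt(
        (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    )
    if denominator == 0:
        return 0.0  # undefined when a variable is constant; report 0 here
    return (n11 * n00 - n10 * n01) / denominator

# Hypothetical indicators for eight infants (1 = morbidity present).
bpd = [1, 1, 1, 0, 0, 0, 0, 0]
rop = [1, 1, 0, 0, 0, 0, 0, 0]
ph = [1, 0, 0, 1, 0, 0, 0, 0]

print(phi_coefficient(bpd, rop))
print(phi_coefficient(bpd, ph))
```

Applying such a measure to every pair of morbidities yields a correlation matrix, which only captures pairwise structure; the co-occurrence and clustering steps described in the abstract go beyond it.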


Subject(s)
Bronchopulmonary Dysplasia , Infant, Premature, Diseases , Retinopathy of Prematurity , Infant , Female , Pregnancy , Infant, Newborn , Humans , Infant, Premature , Prospective Studies , Infant, Very Low Birth Weight , Infant, Premature, Diseases/epidemiology , Bronchopulmonary Dysplasia/complications , Morbidity , Retinopathy of Prematurity/epidemiology , Gestational Age
3.
Bioinform Adv ; 3(1): vbad093, 2023.
Article in English | MEDLINE | ID: mdl-37485422

ABSTRACT

Motivation: Circular RNAs (circRNAs) are long noncoding RNAs (lncRNAs) often associated with diseases and considered potential biomarkers for diagnosis and treatment. Among other functions, circRNAs have been shown to act as microRNA (miRNA) sponges, preventing miRNAs from repressing their targets. However, there is no pipeline to systematically assess the sponging potential of circRNAs. Results: We developed circRNA-sponging, a Nextflow pipeline that (i) identifies circRNAs via backsplicing junctions detected in RNA-seq data, (ii) quantifies their expression values in relation to their linear counterparts spliced from the same gene, (iii) performs differential expression analysis, (iv) identifies and quantifies miRNA expression from miRNA-sequencing (miRNA-seq) data, (v) predicts miRNA binding sites on circRNAs, (vi) systematically investigates potential circRNA-miRNA sponging events, (vii) creates a network of competing endogenous RNAs and (viii) identifies potential circRNA biomarkers. We showed the functionality of the circRNA-sponging pipeline using RNA sequencing data from brain tissues, where we identified two distinct types of circRNAs characterized by a specific ratio of the number of binding sites to the length of the transcript. The circRNA-sponging pipeline is the first end-to-end pipeline to identify circRNAs and their sponging systematically with raw total RNA-seq and miRNA-seq files, allowing us to better indicate the functional impact of circRNAs as a routine aspect in transcriptomic research. Availability and implementation: https://github.com/biomedbigdata/circRNA-sponging. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
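The ratio mentioned in the results, number of miRNA binding sites relative to transcript length, is simple to compute. The sketch below is not part of the circRNA-sponging pipeline; the function name, the circRNA identifiers, and the counts are hypothetical and serve only to show the quantity.

```python
def binding_site_density(n_binding_sites, transcript_length_nt):
    """Number of predicted miRNA binding sites per nucleotide of transcript."""
    if transcript_length_nt <= 0:
        raise ValueError("transcript length must be positive")
    return n_binding_sites / transcript_length_nt

# Hypothetical circRNAs: name -> (predicted binding sites, length in nt).
circrnas = {
    "circ_example_1": (10, 500),
    "circ_example_2": (12, 6000),
}

densities = {
    name: binding_site_density(sites, length)
    for name, (sites, length) in circrnas.items()
}

for name, density in sorted(densities.items()):
    print(f"{name}: {density:.4f} sites per nt")
```

Comparing such densities across all detected circRNAs is one way a separation into groups with different ratios could become visible; how the authors defined their two types is not stated in the abstract.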

4.
bioRxiv ; 2023 Jun 23.
Article in English | MEDLINE | ID: mdl-36789427

ABSTRACT

MOTIVATION: Circular RNAs (circRNAs) are long non-coding RNAs (lncRNAs) often associated with diseases and considered potential biomarkers for diagnosis and treatment. Among other functions, circRNAs have been shown to act as microRNA (miRNA) sponges, preventing miRNAs from repressing their targets. However, there is no pipeline to systematically assess the sponging potential of circRNAs. RESULTS: We developed circRNA-sponging, a Nextflow pipeline that (1) identifies circRNAs via backsplicing junctions detected in RNA-seq data, (2) quantifies their expression values in relation to their linear counterparts spliced from the same gene, (3) performs differential expression analysis, (4) identifies and quantifies miRNA expression from miRNA-sequencing (miRNA-seq) data, (5) predicts miRNA binding sites on circRNAs, (6) systematically investigates potential circRNA-miRNA sponging events, (7) creates a network of competing endogenous RNAs, and (8) identifies potential circRNA biomarkers. We showed the functionality of the circRNA-sponging pipeline using RNA sequencing data from brain tissues, where we identified two distinct types of circRNAs characterized by a specific ratio of the number of binding sites to the length of the transcript. The circRNA-sponging pipeline is the first end-to-end pipeline to identify circRNAs and their sponging systematically with raw total RNA-seq and miRNA-seq files, allowing us to better indicate the functional impact of circRNAs as a routine aspect in transcriptomic research. AVAILABILITY: https://github.com/biomedbigdata/circRNA-sponging Contact: markus.daniel.hoffmann@tum.de; markus.list@tum.de Supplementary Material: Supplementary data are available at Bioinformatics Advances online.

5.
Proc Natl Acad Sci U S A ; 119(16): e2118210119, 2022 04 19.
Article in English | MEDLINE | ID: mdl-35412913

ABSTRACT

Improving access to increasing amounts of biomedical data provides new opportunities for advanced patient stratification and disease subtyping strategies. This requires computational tools that produce uniformly robust results across highly heterogeneous molecular data. Unsupervised machine learning methodologies are able to discover de novo patterns in such data. Biclustering is especially well suited because it simultaneously identifies sample groups and corresponding feature sets across heterogeneous omics data. The performance of available biclustering algorithms heavily depends on individual parameterization and varies with their application. Here, we developed MoSBi (molecular signature identification using biclustering), an automated multialgorithm ensemble approach that integrates results utilizing an error model-supported similarity network. We systematically evaluated the performance of 11 available and established biclustering algorithms together with MoSBi. For this, we used transcriptomics, proteomics, and metabolomics data, as well as synthetic datasets covering various data properties. Profiting from multialgorithm integration, MoSBi identified robust group and disease-specific signatures across all scenarios, overcoming single algorithm specificities. Furthermore, we developed a scalable network-based visualization of bicluster communities that supports biological hypothesis generation. MoSBi is available as an R package and web service to make automated biclustering analysis accessible for application in molecular sample stratification.
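The general idea of integrating biclusters from several algorithms through a similarity network can be sketched as below. This is not MoSBi's API or method: Jaccard similarity over bicluster cells, a fixed threshold in place of the error model, and connected components as "communities" are all simplifying assumptions, and the sample and feature names are invented.

```python
from itertools import combinations

def bicluster_cells(rows, cols):
    """Represent a bicluster as the set of (sample, feature) cells it covers."""
    return {(r, c) for r in rows for c in cols}

def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similarity_edges(biclusters, threshold):
    """Connect every pair of biclusters whose cell overlap meets the threshold."""
    cells = [bicluster_cells(rows, cols) for rows, cols in biclusters]
    return [
        (i, j)
        for i, j in combinations(range(len(cells)), 2)
        if jaccard(cells[i], cells[j]) >= threshold
    ]

def communities(n_nodes, edges):
    """Connected components of the similarity network, via union-find."""
    parent = list(range(n_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)

    groups = {}
    for node in range(n_nodes):
        groups.setdefault(find(node), []).append(node)
    return sorted(groups.values())

# Hypothetical biclusters, e.g. produced by different algorithms:
# each is (set of samples, set of features).
biclusters = [
    ({"s1", "s2", "s3"}, {"g1", "g2"}),
    ({"s1", "s2"}, {"g1", "g2"}),
    ({"s7", "s8"}, {"g9"}),
]

edges = similarity_edges(biclusters, threshold=0.5)
print(edges)
print(communities(len(biclusters), edges))
```

Biclusters that end up in the same component largely agree on samples and features, so a component backed by several algorithms is a candidate for a robust signature. A fixed threshold is the weakest part of this sketch, since the abstract describes the similarity network as supported by an error model.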


Subject(s)
Disease , Gene Expression Profiling , Metabolomics , Patients , Proteomics , Software , Algorithms , Cluster Analysis , Disease/classification , Humans , Patients/classification