Results 1 - 20 of 21
1.
Med Image Anal ; 91: 103042, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38000257

ABSTRACT

Appendicitis is among the most frequent reasons for pediatric abdominal surgeries. Previous decision support systems for appendicitis have focused on clinical, laboratory, scoring, and computed tomography data and have ignored abdominal ultrasound, despite its noninvasive nature and widespread availability. In this work, we present interpretable machine learning models for predicting the diagnosis, management and severity of suspected appendicitis using ultrasound images. Our approach utilizes concept bottleneck models (CBM) that facilitate interpretation and interaction with high-level concepts understandable to clinicians. Furthermore, we extend CBMs to prediction problems with multiple views and incomplete concept sets. Our models were trained on a dataset comprising 579 pediatric patients with 1709 ultrasound images accompanied by clinical and laboratory data. Results show that our proposed method enables clinicians to utilize a human-understandable and intervenable predictive model without compromising performance or requiring time-consuming image annotation when deployed. For predicting the diagnosis, the extended multiview CBM attained an AUROC of 0.80 and an AUPR of 0.92, performing comparably to similar black-box neural networks trained and tested on the same dataset.


Subject(s)
Appendicitis , Humans , Child , Appendicitis/diagnostic imaging , Ultrasonography/methods , Machine Learning , Tomography, X-Ray Computed , Neural Networks, Computer
2.
Front Pediatr ; 11: 1296904, 2023.
Article in English | MEDLINE | ID: mdl-38155742

ABSTRACT

Background: The overarching goal of blood glucose forecasting is to assist individuals with type 1 diabetes (T1D) in avoiding hyper- or hypoglycemic conditions. While deep learning approaches have shown promising results for blood glucose forecasting in adults with T1D, it is not known if these results generalize to children. Factors that may hinder generalization include physical activity (PA), which is often unplanned in children, as well as the age and development of a child, both of which affect the blood glucose level. Materials and Methods: In this study, we collected time series measurements of glucose levels, carbohydrate intake, insulin dosing and physical activity from children with T1D for one week in an ethics-approved prospective observational study, which included daily physical activities. We investigate the performance of state-of-the-art deep learning methods for adult data, namely (dilated) recurrent neural networks and a transformer, on our dataset for short-term (30 min) and long-term (2 h) prediction. We propose to integrate static patient characteristics, such as age, gender, BMI, and percentage of basal insulin, to account for the heterogeneity of our study group. Results: Integrating static patient characteristics (SPC) proves beneficial, especially for short-term prediction. LSTMs and GRUs with SPC perform best for a prediction horizon of 30 min (RMSE of 1.66 mmol/l), a vanilla RNN with SPC performs best across different prediction horizons, while the performance significantly decays for long-term prediction. For prediction during the night, the best method improves to an RMSE of 1.50 mmol/l. Overall, the results for our baselines and RNN models indicate that blood glucose forecasting for children conducting regular physical activity is more challenging than for previously studied adult data. Conclusion: We find that integrating static data improves the performance of deep-learning architectures for blood glucose forecasting of children with T1D and achieves promising results for short-term prediction. Despite these improvements, additional clinical studies are warranted to extend forecasting to longer-term prediction horizons.

3.
Front Pediatr ; 11: 1229462, 2023.
Article in English | MEDLINE | ID: mdl-37876524

ABSTRACT

Background: Hyperbilirubinemia of the newborn infant is a common disease worldwide. However, recognized early and treated appropriately, it typically remains innocuous. We recently developed an early phototherapy prediction tool (EPPT) by means of machine learning (ML) utilizing just one bilirubin measurement and few clinical variables. The aim of this study is to test applicability and performance of the EPPT on a new patient cohort from a different population. Materials and methods: This work is a retrospective study of prospectively recorded neonatal data from infants born in 2018 in an academic hospital, Regensburg, Germany, meeting the following inclusion criteria: born with 34 completed weeks of gestation or more, at least two total serum bilirubin (TSB) measurements prior to phototherapy. First, the original EPPT (an ensemble of a logistic regression and a random forest) was used in its freely accessible version and evaluated in terms of the area under the receiver operating characteristic curve (AUROC). Second, a new version of the EPPT model was re-trained on the data from the new cohort. Third, the predictive performance, variable importance, sensitivity and specificity were analyzed and compared across the original and re-trained models. Results: In total, 1,109 neonates were included with a median (IQR) gestational age of 38.4 (36.6-39.9) and a total of 3,940 bilirubin measurements prior to any phototherapy treatment, which was required in 154 neonates (13.9%). For the phototherapy treatment prediction, the original EPPT achieved a predictive performance of 84.6% AUROC on the new cohort. After re-training the model on a subset of the new dataset, 88.8% AUROC was achieved as evaluated by cross validation. The same five variables as for the original model were found to be most important for the prediction on the new cohort, namely gestational age at birth, birth weight, bilirubin to weight ratio, hours since birth, bilirubin value. Discussion: The individual risk for treatment requirement in neonatal hyperbilirubinemia is robustly predictable in different patient cohorts with a previously developed ML tool (EPPT) demanding just one TSB value and only four clinical parameters. Further prospective validation studies are needed to develop an effective and safe clinical decision support system.

4.
Front Immunol ; 14: 1158905, 2023.
Article in English | MEDLINE | ID: mdl-37313411

ABSTRACT

The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) induces B and T cell responses, contributing to virus neutralization. In a cohort of 2,911 young adults, we identified 65 individuals who had an asymptomatic or mildly symptomatic SARS-CoV-2 infection and characterized their humoral and T cell responses to the Spike (S), Nucleocapsid (N) and Membrane (M) proteins. We found that previous infection induced CD4 T cells that vigorously responded to pools of peptides derived from the S and N proteins. By using statistical and machine learning models, we observed that the T cell response highly correlated with a compound titer of antibodies against the Receptor Binding Domain (RBD), S and N. However, while serum antibodies decayed over time, the cellular phenotype of these individuals remained stable over four months. Our computational analysis demonstrates that in young adults, asymptomatic and paucisymptomatic SARS-CoV-2 infections can induce robust and long-lasting CD4 T cell responses that exhibit slower decays than antibody titers. These observations imply that next-generation COVID-19 vaccines should be designed to induce stronger cellular responses to sustain the generation of potent neutralizing antibodies.


Subject(s)
COVID-19 , Humans , COVID-19 Vaccines , SARS-CoV-2 , Antibodies, Neutralizing , Machine Learning
5.
Medicina (Kaunas) ; 59(3)2023 Mar 20.
Article in English | MEDLINE | ID: mdl-36984618

ABSTRACT

Background and Objectives: Remote patient monitoring (RPM) of vital signs and symptoms for lung transplant recipients (LTRs) has become increasingly relevant in many situations. Nevertheless, RPM research integrating multisensory home monitoring in LTRs is scarce. We developed a novel multisensory home monitoring device and tested it in the context of COVID-19 vaccinations. We hypothesize that multisensory RPM and smartphone-based questionnaire feedback on signs and symptoms will be well accepted among LTRs. The objective was to assess the usability and acceptability of a remote monitoring system consisting of wearable devices, including home spirometry, and a smartphone-based questionnaire application for symptom and vital sign monitoring during the first and second SARS-CoV-2 vaccination. Materials and Methods: Observational usability pilot study for six weeks of home monitoring with the COVIDA Desk for LTRs. During the first week after the vaccination, intensive monitoring was performed by recording data on physical activity, spirometry, temperature, pulse oximetry and self-reported symptoms, signs and additional measurements. During the subsequent days, the number of monitoring assessments was reduced. LTRs reported on their perceptions of the usability of the monitoring device through a purpose-designed questionnaire. Results: Ten LTRs planning to receive the first COVID-19 vaccinations were recruited. For the intensive monitoring study phase, LTRs recorded symptoms, signs and additional measurements. The most frequent adverse events reported were local pain, fatigue, sleep disturbance and headache. The duration of these symptoms was 5-8 days post-vaccination. Adherence to the main monitoring devices was high. LTRs rated usability as high. The majority were willing to continue monitoring. Conclusions: The COVIDA Desk showed favorable technical performance and was well accepted by the LTRs during the vaccination phase of the pandemic. The feasibility of the RPM system deployment was proven by the rapid recruitment uptake, technical performance (i.e., low number of errors), favorable user experience questionnaires and detailed individual user feedback.


Subject(s)
COVID-19 Vaccines , COVID-19 , Transplant Recipients , Wearable Electronic Devices , Humans , COVID-19/prevention & control , COVID-19 Vaccines/administration & dosage , Pilot Projects , Vaccination , Lung Transplantation
6.
Digit Health ; 8: 20552076221074488, 2022.
Article in English | MEDLINE | ID: mdl-35173981

ABSTRACT

Using artificial intelligence to improve patient care is a cutting-edge methodology, but its implementation in clinical routine has been limited due to significant concerns about understanding its behavior. One major barrier is the explainability dilemma and how much explanation is required to use artificial intelligence safely in healthcare. A key issue is the lack of consensus on the definition of explainability by experts, regulators, and healthcare professionals, resulting in a wide variety of terminology and expectations. This paper aims to fill the gap by defining minimal explainability standards to serve the views and needs of essential stakeholders in healthcare. In that sense, we propose to define minimal explainability criteria that can support doctors' understanding, meet patients' needs, and fulfill legal requirements. Therefore, explainability need not be exhaustive but sufficient for doctors and patients to comprehend the artificial intelligence models' clinical implications and be integrated safely into clinical practice. Thus, minimally acceptable standards for explainability are context-dependent and should respond to the specific needs and potential risks of each clinical scenario for a responsible and ethical implementation of artificial intelligence.

8.
Pediatr Infect Dis J ; 41(3): 248-254, 2022 03 01.
Article in English | MEDLINE | ID: mdl-34508027

ABSTRACT

BACKGROUND: Current strategies for risk stratification and prediction of neonatal early-onset sepsis (EOS) are inefficient and lack diagnostic performance. The aim of this study was to use machine learning to analyze the diagnostic accuracy of risk factors (RFs), clinical signs and biomarkers and to develop a prediction model for culture-proven EOS. We hypothesized that the contribution of biomarkers to diagnostic accuracy is higher than that of RFs or clinical signs. STUDY DESIGN: Secondary analysis of the prospective international multicenter NeoPInS study. Neonates born after 34 completed weeks of gestation with antibiotic therapy due to suspected EOS within the first 72 hours of life participated. Primary outcome was defined as predictive performance for culture-proven EOS with variables known at the start of antibiotic therapy. Machine learning was used in the form of a random forest classifier. RESULTS: One thousand six hundred eighty-five neonates treated for suspected infection were analyzed. Biomarkers were superior to clinical signs and RFs for prediction of culture-proven EOS. C-reactive protein and white blood cells were most important for the prediction of the culture result. Our full model achieved an area-under-the-receiver-operating-characteristic-curve of 83.41% (±8.8%) and an area-under-the-precision-recall-curve of 28.42% (±11.5%). The predictive performance of the model with RFs alone was comparable with random. CONCLUSIONS: Biomarkers have to be considered in algorithms for the management of neonates suspected of EOS. A 2-step approach with a screening tool for all neonates in combination with our model in the preselected population with an increased risk for EOS may have the potential to reduce the start of unnecessary antibiotics.


Subject(s)
Biomarkers/blood , Machine Learning , Neonatal Sepsis/diagnosis , Anti-Bacterial Agents/therapeutic use , C-Reactive Protein/analysis , Female , Humans , Infant , Infant, Newborn , Male , Neonatal Sepsis/drug therapy , Prospective Studies , ROC Curve , Risk Factors
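
The area under the ROC curve reported in entry 8 can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. A minimal sketch of that pairwise definition follows; the function name `auroc` and the toy labels and scores are illustrative assumptions and are not data from the study.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC via the pairwise (Mann-Whitney) definition.

    labels: 1 for a positive case, 0 for a negative case.
    scores: predicted risk, higher means more likely positive.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # A positive scored above a negative counts 1, a tie counts 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auroc(toy_labels, toy_scores)
```

A model that scores cases at random has an expected value of 0.5 under this definition, which is one way to interpret the abstract's remark that the risk-factor-only model was comparable with random.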
9.
Front Pediatr ; 9: 662183, 2021.
Article in English | MEDLINE | ID: mdl-33996697

ABSTRACT

Background: Given the absence of consolidated and standardized international guidelines for managing pediatric appendicitis and the few strictly data-driven studies in this specific field, we investigated the use of machine learning (ML) classifiers for predicting the diagnosis, management and severity of appendicitis in children. Materials and Methods: Predictive models were developed and validated on a dataset acquired from 430 children and adolescents aged 0-18 years, based on a range of information encompassing history, clinical examination, laboratory parameters, and abdominal ultrasonography. Logistic regression, random forests, and gradient boosting machines were used for predicting the three target variables. Results: A random forest classifier achieved areas under the precision-recall curve of 0.94, 0.92, and 0.70, respectively, for the diagnosis, management, and severity of appendicitis. We identified smaller subsets of 6, 17, and 18 predictors for each of the targets that sufficed to achieve the same performance as the model based on the full set of 38 variables. We used these findings to develop the user-friendly online Appendicitis Prediction Tool for children with suspected appendicitis. Discussion: This pilot study considered the most extensive set of predictor and target variables to date and is the first to simultaneously predict all three targets in children: diagnosis, management, and severity. Moreover, this study presents the first ML model for appendicitis that was deployed as an open access easy-to-use online tool. Conclusion: ML algorithms help to overcome the diagnostic and management challenges posed by appendicitis in children and pave the way toward a more personalized approach to medical decision-making. Further validation studies are needed to develop a finished clinical decision support system.

10.
J Am Med Inform Assoc ; 28(4): 868-873, 2021 03 18.
Article in English | MEDLINE | ID: mdl-33338231

ABSTRACT

Unplanned hospital readmissions are a burden to patients and increase healthcare costs. A wide variety of machine learning (ML) models have been suggested to predict unplanned hospital readmissions. These ML models were often specifically trained on patient populations with certain diseases. However, it is unclear whether these specialized ML models (trained on patient subpopulations with certain diseases or defined by other clinical characteristics) are more accurate than a general ML model trained on an unrestricted hospital cohort. In this study based on an electronic health record cohort of consecutive inpatient cases of a single tertiary care center, we demonstrate that accurate prediction of hospital readmissions may be obtained by general, disease-independent, ML models. This general approach may substantially decrease the cost of development and deployment of respective ML models in daily clinical routine, as all predictions are obtained by the use of a single model.


Subject(s)
Hospitalization , Machine Learning , Models, Statistical , Patient Readmission , Area Under Curve , Cardiovascular Diseases , Chronic Disease , Cohort Studies , Datasets as Topic , Electronic Health Records , Female , Humans , Lung Diseases , Male , Neoplasms , Prognosis , Tertiary Care Centers , Treatment Outcome
11.
Nephrol Dial Transplant ; 36(3): 519-528, 2021 02 20.
Article in English | MEDLINE | ID: mdl-32510143

ABSTRACT

BACKGROUND: The mortality risk remains significant in paediatric and adult patients on chronic haemodialysis (HD) treatment. We aimed to identify factors associated with mortality in patients who started HD as children and continued HD as adults. METHODS: The data originated from a cohort of patients <30 years of age who started HD in childhood (≤19 years) on thrice-weekly HD in outpatient DaVita dialysis centres between 2004 and 2016. Patients with at least 5 years of follow-up since the initiation of HD or death within 5 years were included; 105 variables relating to demographics, HD treatment and laboratory measurements were evaluated as predictors of 5-year mortality utilizing a machine learning approach (random forest). RESULTS: A total of 363 patients were included in the analysis, with 84 patients having started HD at <12 years of age. Low albumin and elevated lactate dehydrogenase (LDH) were the two most important predictors of 5-year mortality. Other predictors included elevated red blood cell distribution width or blood pressure and decreased red blood cell count, haemoglobin, albumin:globulin ratio, ultrafiltration rate, z-score weight for age or single-pool Kt/V (below target). Mortality was predicted with an accuracy of 81%. CONCLUSIONS: Mortality in paediatric and young adult patients on chronic HD is associated with multifactorial markers of nutrition, inflammation, anaemia and dialysis dose. This highlights the importance of multimodal intervention strategies besides adequate HD treatment as determined by Kt/V alone. The association with elevated LDH was not previously reported and may indicate the relevance of blood-membrane interactions, organ malperfusion or haematologic and metabolic changes during maintenance HD in this population.


Subject(s)
Anemia/mortality , Biomarkers/analysis , Inflammation/mortality , Kidney Failure, Chronic/mortality , Machine Learning , Renal Dialysis/mortality , Adolescent , Adult , Anemia/etiology , Anemia/pathology , Body Weight , Child , Child, Preschool , Female , Humans , Infant , Infant, Newborn , Inflammation/etiology , Inflammation/pathology , Kidney Failure, Chronic/pathology , Kidney Failure, Chronic/therapy , Male , Nutritional Status , Prognosis , Renal Dialysis/adverse effects , Retrospective Studies , Survival Rate , Young Adult
12.
Clin Pharmacol Ther ; 107(4): 926-933, 2020 04.
Article in English | MEDLINE | ID: mdl-31930487

ABSTRACT

Clinical pharmacology is a multidisciplinary data sciences field that utilizes mathematical and statistical methods to generate maximal knowledge from data. Pharmacometrics (PMX) is a well-recognized tool to characterize disease progression, pharmacokinetics, and risk factors. Because the amount of data produced keeps growing at an increasing pace, the computational effort necessary for PMX models is also increasing. Additionally, computationally efficient methods, such as machine learning (ML), are becoming increasingly important in medicine. However, ML is currently not an integrated part of PMX, for various reasons. The goals of this article are to (i) provide an introduction to ML classification methods, (ii) provide examples for an ML classification analysis to identify covariates based on specific research questions, (iii) examine a clinically relevant example to investigate possible relationships of ML and PMX, and (iv) present a summary of ML and PMX tasks to develop clinical decision support tools.


Subject(s)
Data Analysis , Databases, Factual/statistics & numerical data , Decision Trees , Machine Learning/statistics & numerical data , Pharmacology, Clinical/statistics & numerical data , Humans , Pharmacology, Clinical/methods
13.
Front Cell Infect Microbiol ; 10: 594030, 2020.
Article in English | MEDLINE | ID: mdl-33489933

ABSTRACT

Rationale: Tuberculosis diagnosis in children remains challenging. Microbiological confirmation of tuberculosis disease is often lacking, and standard immunodiagnostics, including the tuberculin skin test and interferon-γ release assay for tuberculosis infection, have limited sensitivity. Recent research suggests that inclusion of novel Mycobacterium tuberculosis antigens has the potential to improve standard immunodiagnostic tests for tuberculosis. Objective: To identify optimal antigen-cytokine combinations using novel Mycobacterium tuberculosis antigens and cytokine read-outs by machine learning algorithms to improve immunodiagnostic assays for tuberculosis. Methods: A total of 80 children undergoing investigation of tuberculosis were included (15 confirmed tuberculosis disease, five unconfirmed tuberculosis disease, 28 tuberculosis infection and 32 unlikely tuberculosis). Whole blood was stimulated with 10 novel Mycobacterium tuberculosis antigens and a fusion protein of early secretory antigenic target (ESAT)-6 and culture filtrate protein (CFP) 10. Cytokines were measured using xMAP multiplex assays. Machine learning algorithms defined a discriminative classifier with performance measured using the area under the receiver operating characteristic curve. Measurements and main results: We found the following four antigen-cytokine pairs had a higher weight in the discriminative classifier compared to the standard ESAT-6/CFP-10-induced interferon-γ: Rv2346/47c- and Rv3614/15c-induced interferon-gamma inducible protein-10; Rv2031c-induced granulocyte-macrophage colony-stimulating factor and ESAT-6/CFP-10-induced tumor necrosis factor-α. A combination of the 10 best antigen-cytokine pairs resulted in area under the curve of 0.92 ± 0.04. Conclusion: We exploited the use of machine learning algorithms as a key tool to evaluate large immunological datasets. This identified several antigen-cytokine pairs with the potential to improve immunodiagnostic tests for tuberculosis in children.


Subject(s)
Mycobacterium tuberculosis , Tuberculosis , Algorithms , Antigens, Bacterial , Bacterial Proteins , Child , Humans , Immunity , Machine Learning , Tuberculosis/diagnosis
14.
Clin Pharmacol Ther ; 107(4): 786-795, 2020 04.
Article in English | MEDLINE | ID: mdl-31863465

ABSTRACT

Despite the application of advanced statistical and pharmacometric approaches to pediatric trial data, a large pediatric evidence gap still remains. Here, we discuss how to collect more data from children by using real-world data from electronic health records, mobile applications, wearables, and social media. The large datasets collected with these approaches enable and may demand the use of artificial intelligence and machine learning to allow the data to be analyzed for decision making. Applications of this approach are presented, which include the prediction of future clinical complications, medical image analysis, identification of new pediatric end points and biomarkers, the prediction of treatment nonresponders, and the prediction of placebo-responders for trial enrichment. Finally, we discuss how to bring machine learning from science to pediatric clinical practice. We conclude that advantage should be taken of the current opportunities offered by innovations in data science and machine learning to close the pediatric evidence gap.


Subject(s)
Data Science/trends , Evidence-Based Medicine/trends , Inventions/trends , Machine Learning/trends , Pediatrics/trends , Artificial Intelligence/trends , Child , Data Science/methods , Evidence-Based Medicine/methods , Humans , Pediatrics/methods , Randomized Controlled Trials as Topic/methods
15.
Pediatr Res ; 86(1): 122-127, 2019 07.
Article in English | MEDLINE | ID: mdl-30928997

ABSTRACT

BACKGROUND: Machine learning models may enhance the early detection of clinically relevant hyperbilirubinemia based on patient information available in every hospital. METHODS: We conducted a longitudinal study on preterm and term born neonates with serial measurements of total serum bilirubin in the first two weeks of life. An ensemble that combines a logistic regression with a random forest classifier was trained to discriminate between the two classes: phototherapy treatment vs. no treatment. RESULTS: Of 362 neonates included in this study, 98 had a phototherapy treatment, which our model was able to predict up to 48 h in advance with an area under the ROC curve of 95.20%. From a set of 44 variables, including potential laboratory and clinical confounders, a subset of just four (bilirubin, weight, gestational age, hours since birth) suffices for a strong predictive performance. The resulting early phototherapy prediction tool (EPPT) is provided as an open web application. CONCLUSION: Early detection of clinically relevant hyperbilirubinemia can be enhanced by the application of machine learning. Existing guidelines can be further improved to optimize timing of bilirubin measurements to avoid toxic hyperbilirubinemia in high-risk patients while minimizing unneeded measurements in neonates who are at low risk.


Subject(s)
Bilirubin/blood , Hyperbilirubinemia, Neonatal/blood , Hyperbilirubinemia, Neonatal/diagnosis , Machine Learning , Phototherapy , Area Under Curve , Female , Gestational Age , Humans , Infant, Newborn , Infant, Premature , Internet , Longitudinal Studies , Male , ROC Curve , Regression Analysis , Retrospective Studies , Sensitivity and Specificity
16.
Infect Control Hosp Epidemiol ; 39(12): 1457-1462, 2018 12.
Article in English | MEDLINE | ID: mdl-30394238

ABSTRACT

To exploit the full potential of big routine data in healthcare and to efficiently communicate and collaborate with information technology specialists and data analysts, healthcare epidemiologists should have some knowledge of large-scale analysis techniques, particularly about machine learning. This review focuses on the broad area of machine learning and its first applications in the emerging field of digital healthcare epidemiology.


Subject(s)
Big Data , Epidemiologic Studies , Machine Learning , Biomedical Research/methods , Electronic Health Records , Humans
17.
J Pathol Clin Res ; 2(2): 80-92, 2016 Apr.
Article in English | MEDLINE | ID: mdl-27499918

ABSTRACT

Molecular classification of hepatocellular carcinomas (HCC) could guide patient stratification for personalized therapies targeting subclass-specific cancer 'driver pathways'. Currently, there are several transcriptome-based molecular classifications of HCC with different subclass numbers, ranging from two to six. They were established using resected tumours that introduce a selection bias towards patients without liver cirrhosis and with early stage HCCs. We generated and analyzed gene expression data from paired HCC and non-cancerous liver tissue biopsies from 60 patients as well as five normal liver samples. Unbiased consensus clustering of HCC biopsy profiles identified 3 robust classes. Class membership correlated with survival, tumour size and with Edmondson and Barcelona Clinical Liver Cancer (BCLC) stage. When focusing only on the gene expression of the HCC biopsies, we could validate previously reported classifications of HCC based on expression patterns of signature genes. However, the subclass-specific gene expression patterns were no longer preserved when the fold-change relative to the normal tissue was used. The majority of genes believed to be subclass-specific turned out to be cancer-related genes differentially regulated in all HCC patients, with quantitative rather than qualitative differences between the molecular subclasses. With the exception of a subset of samples with a definitive β-catenin gene signature, biological pathway analysis could not identify class-specific pathways reflecting the activation of distinct oncogenic programs. In conclusion, we have found that gene expression profiling of HCC biopsies has limited potential to direct therapies that target specific driver pathways, but can identify subgroups of patients with different prognosis.

18.
Article in English | MEDLINE | ID: mdl-26357313

ABSTRACT

A major challenge in computational biology is to find simple representations of high-dimensional data that best reveal the underlying structure. In this work, we present an intuitive and easy-to-implement method based on ranked neighborhood comparisons that detects structure in unsupervised data. The method is based on ordering objects in terms of similarity and on the mutual overlap of nearest neighbors. This basic framework was originally introduced in the field of social network analysis to detect actor communities. We demonstrate that the same ideas can successfully be applied to biomedical data sets in order to reveal complex underlying structure. The algorithm is very efficient and works on distance data directly without requiring a vectorial embedding of data. Comprehensive experiments demonstrate the validity of this approach. Comparisons with state-of-the-art clustering methods show that the presented method outperforms hierarchical methods as well as density-based clustering methods and model-based clustering. A further advantage of the method is that it simultaneously provides a visualization of the data. Especially in biomedical applications, the visualization of data can be used as a first pre-processing step when analyzing real-world data sets to get an intuition of the underlying data structure. We apply this model to synthetic data as well as to various biomedical data sets, which demonstrate the high quality and usefulness of the inferred structure.


Subject(s)
Artificial Intelligence , Computational Biology/methods , Data Mining/methods , Databases, Factual , Pattern Recognition, Automated/methods , Algorithms , Anti-HIV Agents , Drug Discovery , HIV Infections/drug therapy , Humans , Neoplasms/classification , Neoplasms/genetics , Neoplasms/metabolism , Neoplasms/pathology
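
Entry 18 describes ranking objects by similarity and comparing the overlap of their nearest neighbors, working directly on distance data. The sketch below illustrates that general idea with a plain shared-nearest-neighbor count; it is not the authors' algorithm, and the function names, the neighborhood size `k`, and the toy distance matrix are all assumptions made for illustration.

```python
from typing import List, Sequence


def ranked_neighbors(dist: Sequence[Sequence[float]], i: int, k: int) -> List[int]:
    """Indices of the k objects closest to object i, excluding i itself."""
    others = [j for j in range(len(dist)) if j != i]
    # Rank by distance; ties are broken by index so the result is deterministic.
    others.sort(key=lambda j: (dist[i][j], j))
    return others[:k]


def neighbor_overlap(dist: Sequence[Sequence[float]], k: int) -> List[List[int]]:
    """overlap[i][j] = number of objects shared by the k-neighborhoods of i and j."""
    n = len(dist)
    hoods = [set(ranked_neighbors(dist, i, k)) for i in range(n)]
    return [[len(hoods[i] & hoods[j]) for j in range(n)] for i in range(n)]


# Hypothetical toy data: two well-separated groups on a line.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
dist = [[abs(a - b) for b in points] for a in points]
overlap = neighbor_overlap(dist, k=2)
```

Only the distance matrix is used, so no vector embedding of the objects is needed. In this toy case, objects within the same group share neighbors while objects from different groups share none, which is the kind of structure a neighborhood-overlap approach is meant to expose.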
19.
J Clin Invest ; 124(4): 1568-81, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24569457

ABSTRACT

The use of pegylated interferon-α (pegIFN-α) has replaced unmodified recombinant IFN-α for the treatment of chronic viral hepatitis. While the superior antiviral efficacy of pegIFN-α is generally attributed to improved pharmacokinetic properties, the pharmacodynamic effects of pegIFN-α in the liver have not been studied. Here, we analyzed pegIFN-α-induced signaling and gene regulation in paired liver biopsies obtained prior to treatment and during the first week following pegIFN-α injection in 18 patients with chronic hepatitis C. Despite sustained high concentrations of pegIFN-α in serum, the Jak/STAT pathway was activated in hepatocytes only on the first day after pegIFN-α administration. Evaluation of liver biopsies revealed that pegIFN-α induces hundreds of genes that can be classified into four clusters based on different temporal expression profiles. In all clusters, gene transcription was mainly driven by IFN-stimulated gene factor 3 (ISGF3). Compared with conventional IFN-α therapy, pegIFN-α induced a broader spectrum of gene expression, including many genes involved in cellular immunity. IFN-induced secondary transcription factors did not result in additional waves of gene expression. Our data indicate that the superior antiviral efficacy of pegIFN-α is not the result of prolonged Jak/STAT pathway activation in hepatocytes, but rather is due to induction of additional genes that are involved in cellular immune responses.


Subject(s)
Interferon-alpha/pharmacology , Janus Kinases/metabolism , Liver/drug effects , Liver/metabolism , Polyethylene Glycols/pharmacology , STAT Transcription Factors/metabolism , Adult , Aged , Antiviral Agents/pharmacology , Endopeptidases/genetics , Endopeptidases/metabolism , Female , Gene Expression/drug effects , Hepatitis C, Chronic/drug therapy , Hepatitis C, Chronic/genetics , Hepatitis C, Chronic/metabolism , Humans , Immunity, Cellular/drug effects , Immunity, Cellular/genetics , Interferon alpha-2 , Interferon-Stimulated Gene Factor 3, gamma Subunit/genetics , Interferon-Stimulated Gene Factor 3, gamma Subunit/metabolism , Janus Kinases/genetics , Kinetics , Liver/immunology , Male , Middle Aged , Recombinant Proteins/pharmacology , STAT Transcription Factors/genetics , STAT1 Transcription Factor/genetics , STAT1 Transcription Factor/metabolism , Signal Transduction/drug effects , Suppressor of Cytokine Signaling 1 Protein , Suppressor of Cytokine Signaling 3 Protein , Suppressor of Cytokine Signaling Proteins/genetics , Suppressor of Cytokine Signaling Proteins/metabolism , Ubiquitin Thiolesterase
20.
Stat Med ; 32(21): 3737-51, 2013 Sep 20.
Article in English | MEDLINE | ID: mdl-23609602

ABSTRACT

We present a Bayesian approach for estimating the relative frequencies of multi-single nucleotide polymorphism (SNP) haplotypes in populations of the malaria parasite Plasmodium falciparum by using microarray SNP data from human blood samples. Each sample comes from a malaria patient and contains one or several parasite clones that may genetically differ. Samples containing multiple parasite clones with different genetic markers pose a special challenge. The situation is comparable with a polyploid organism. The data from each blood sample indicates whether the parasites in the blood carry a mutant or a wildtype allele at various selected genomic positions. If both mutant and wildtype alleles are detected at a given position in a multiply infected sample, the data indicates the presence of both alleles, but the ratio is unknown. Thus, the data only partially reveals which specific combinations of genetic markers (i.e. haplotypes across the examined SNPs) occur in distinct parasite clones. In addition, SNP data may contain errors at non-negligible rates. We use a multinomial mixture model with partially missing observations to represent this data and a Markov chain Monte Carlo method to estimate the haplotype frequencies in a population. Our approach addresses both challenges, multiple infections and data errors.


Subject(s)
Data Interpretation, Statistical , Genetic Variation/genetics , Malaria, Falciparum/genetics , Models, Statistical , Plasmodium falciparum/genetics , Polymorphism, Single Nucleotide/genetics , Algorithms , Animals , Haplotypes/genetics , Humans , Malaria, Falciparum/blood , Malaria, Falciparum/parasitology , Markov Chains , Monte Carlo Method , Oligonucleotide Array Sequence Analysis , Papua New Guinea
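
Entry 20 notes that a multiply infected sample only partially reveals which haplotypes are present, because a position showing both alleles does not say which clone carries which allele. The sketch below enumerates the clone sets consistent with one observation to make that ambiguity concrete. It is not the Bayesian mixture model or the MCMC sampler from the paper: the encoding (`W`, `M`, `B`), the limit of two clones, and the assumption of error-free data are all simplifications introduced here.

```python
from itertools import combinations, product
from typing import List, Tuple

# Assumed encoding: 'W' wildtype only, 'M' mutant only, 'B' both alleles detected.
ALLOWED = {"W": "0", "M": "1", "B": "01"}
EXPECTED = {"W": {"0"}, "M": {"1"}, "B": {"0", "1"}}


def candidate_haplotypes(obs: str) -> List[str]:
    """All haplotypes (0 = wildtype, 1 = mutant) whose alleles the observation permits."""
    return ["".join(h) for h in product(*(ALLOWED[o] for o in obs))]


def explains(clones: Tuple[str, ...], obs: str) -> bool:
    """True if the clones together reproduce the observation at every SNP position."""
    return all({h[pos] for h in clones} == EXPECTED[o] for pos, o in enumerate(obs))


def compatible_clone_sets(obs: str, max_clones: int = 2) -> List[Tuple[str, ...]]:
    """Enumerate sets of up to max_clones distinct haplotypes that explain obs."""
    cands = candidate_haplotypes(obs)
    found = []
    for size in range(1, max_clones + 1):
        for clones in combinations(cands, size):
            if explains(clones, obs):
                found.append(clones)
    return found


# Hypothetical sample: both alleles at SNPs 1 and 3, mutant only at SNP 2.
clone_sets = compatible_clone_sets("BMB")
```

With two mixed positions, two different pairs of haplotypes explain the same observation equally well, so the sample alone cannot decide between them. A statistical model over the whole population, such as the one the abstract describes, is what resolves this kind of ambiguity.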