Results 1 - 13 of 13
1.
Stat Methods Med Res ; 33(4): 669-680, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38490184

ABSTRACT

Diagnostic accuracy studies assess the sensitivity and specificity of a new index test in relation to an established comparator or the reference standard. The development and selection of the index test are usually assumed to be conducted prior to the accuracy study. In practice, this is often violated, for instance, if the choice of the (apparently) best biomarker, model or cutpoint is based on the same data that is used later for validation purposes. In this work, we investigate several multiple comparison procedures which provide family-wise error rate control for the emerging multiple testing problem. Due to the nature of the co-primary hypothesis problem, conventional approaches for multiplicity adjustment are too conservative for the specific problem and thus need to be adapted. In an extensive simulation study, five multiple comparison procedures are compared with regard to statistical error rates in least-favourable and realistic scenarios. This covers parametric and non-parametric methods and one Bayesian approach. All methods have been implemented in the new open-source R package cases which allows us to reproduce all simulation results. Based on our numerical results, we conclude that the parametric approaches (maxT and Bonferroni) are easy to apply but can have inflated type I error rates for small sample sizes. The two investigated Bootstrap procedures, in particular the so-called pairs Bootstrap, allow for a family-wise error rate control in finite samples and in addition have a competitive statistical power.


Subject(s)
Diagnostic Tests, Routine , Bayes Theorem , Data Interpretation, Statistical , Computer Simulation , Sample Size
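
The co-primary testing problem described in the abstract above can be illustrated with a minimal Python sketch. It is not the API of the `cases` R package and not the authors' adapted procedures: it only shows a plain Bonferroni adjustment across candidates, where each candidate must pass both a sensitivity and a specificity test. The function names, the Wald-type test, the benchmark values `se0` and `sp0`, and all counts are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_p(successes: int, n: int, p0: float) -> float:
    """Wald-type p-value for H0: p <= p0 against H1: p > p0."""
    p_hat = successes / n
    se = sqrt(p_hat * (1.0 - p_hat) / n)
    if se == 0.0:  # degenerate estimate (0% or 100%): no usable variance
        return 0.0 if p_hat > p0 else 1.0
    return 1.0 - NormalDist().cdf((p_hat - p0) / se)


def adjusted_p_values(candidates: dict, se0: float, sp0: float) -> dict:
    """Bonferroni-adjusted p-values for m candidate tests.

    candidates maps a name to (true positives, diseased, true negatives, healthy).
    A candidate succeeds only if BOTH endpoints are significant, so its
    joint p-value is the maximum of the two endpoint p-values.
    """
    m = len(candidates)
    adjusted = {}
    for name, (tp, n_diseased, tn, n_healthy) in candidates.items():
        p_joint = max(one_sided_p(tp, n_diseased, se0),
                      one_sided_p(tn, n_healthy, sp0))
        adjusted[name] = min(1.0, m * p_joint)  # adjust over candidates only
    return adjusted


# Hypothetical counts for two candidate index tests (assumed, not from the paper).
candidates = {"model_A": (95, 100, 95, 100), "model_B": (82, 100, 95, 100)}
p_adj = adjusted_p_values(candidates, se0=0.80, sp0=0.80)
selected = [name for name, p in p_adj.items() if p <= 0.025]
print(p_adj, selected)
```

The adjustment multiplies by the number of candidates rather than by the number of individual hypotheses, because within a candidate both null hypotheses must be rejected. As the abstract notes, parametric tests of this kind can have inflated type I error rates in small samples.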
2.
JMIR Form Res ; 7: e44549, 2023 Jun 27.
Article in English | MEDLINE | ID: mdl-37368487

ABSTRACT

BACKGROUND: During the COVID-19 pandemic, local health authorities were responsible for managing and reporting current cases in Germany. Since March 2020, employees had to contain the spread of COVID-19 by monitoring and contacting infected persons as well as tracing their contacts. In the EsteR project, we implemented existing and newly developed statistical models as decision support tools to assist in the work of the local health authorities. OBJECTIVE: The main goal of this study was to validate the EsteR toolkit in two complementary ways: first, investigating the stability of the answers provided by our statistical tools regarding model parameters in the back end and, second, evaluating the usability and applicability of our web application in the front end by test users. METHODS: For model stability assessment, a sensitivity analysis was carried out for all 5 developed statistical models. The default parameters of our models as well as the test ranges of the model parameters were based on a previous literature review on COVID-19 properties. The obtained answers resulting from different parameters were compared using dissimilarity metrics and visualized using contour plots. In addition, the parameter ranges of general model stability were identified. For the usability evaluation of the web application, cognitive walk-throughs and focus group interviews were conducted with 6 containment scouts located at 2 different local health authorities. They were first asked to complete small tasks with the tools and then express their general impressions of the web application. RESULTS: The simulation results showed that some statistical models were more sensitive to changes in their parameters than others. For each of the single-person use cases, we determined an area where the respective model could be rated as stable. In contrast, the results of the group use cases highly depended on the user inputs, and thus, no area of parameters with general model stability could be identified. We have also provided a detailed simulation report of the sensitivity analysis. In the user evaluation, the cognitive walk-throughs and focus group interviews revealed that the user interface needed to be simplified and more information was necessary as guidance. In general, the testers rated the web application as helpful, especially for new employees. CONCLUSIONS: This evaluation study allowed us to refine the EsteR toolkit. Using the sensitivity analysis, we identified suitable model parameters and analyzed how stable the statistical models were in terms of changes in their parameters. Furthermore, the front end of the web application was improved with the results of the conducted cognitive walk-throughs and focus group interviews regarding its user-friendliness.

3.
Hepatology ; 78(1): 258-271, 2023 07 01.
Article in English | MEDLINE | ID: mdl-36994719

ABSTRACT

BACKGROUND AND AIMS: Detecting NASH remains challenging, while at-risk NASH (steatohepatitis and F ≥ 2) tends to progress and is of interest for drug development and clinical application. We developed prediction models by supervised machine learning techniques, with clinical data and biomarkers to stage and grade patients with NAFLD. APPROACH AND RESULTS: Learning data were collected in the Liver Investigation: Testing Marker Utility in Steatohepatitis metacohort (966 biopsy-proven NAFLD adults), staged and graded according to NASH CRN. Conditions of interest were the clinical trial definition of NASH (NAS ≥ 4; 53%), at-risk NASH (NASH with F ≥ 2; 35%), significant (F ≥ 2; 47%), and advanced fibrosis (F ≥ 3; 28%). Thirty-five predictors were included. Missing data were handled by multiple imputations. Data were randomly split into training/validation (75/25) sets. A gradient boosting machine was applied to develop 2 models for each condition: clinical versus extended (clinical and biomarkers). Two variants of the NASH and at-risk NASH models were constructed: direct and composite models. Clinical gradient boosting machine models for steatosis/inflammation/ballooning had AUCs of 0.94/0.79/0.72. There were no improvements when biomarkers were included. The direct NASH model produced AUCs (clinical/extended) of 0.61/0.65. The composite NASH model performed significantly better (0.71) for both variants. The composite at-risk NASH model had an AUC of 0.83 (clinical and extended), an improvement over the direct model. Significant fibrosis models had AUCs (clinical/extended) of 0.76/0.78. The extended advanced fibrosis model (0.86) performed significantly better than the clinical version (0.82). CONCLUSIONS: Detection of NASH and at-risk NASH can be improved by constructing independent machine learning models for each component, using only clinical predictors. Adding biomarkers only improved the accuracy of fibrosis.


Subject(s)
Non-alcoholic Fatty Liver Disease , Adult , Humans , Non-alcoholic Fatty Liver Disease/diagnosis , Non-alcoholic Fatty Liver Disease/pathology , Liver/pathology , Fibrosis , Algorithms , Biomarkers , Machine Learning , Biopsy , Liver Cirrhosis/diagnosis , Liver Cirrhosis/pathology
4.
Stud Health Technol Inform ; 296: 17-24, 2022 Aug 17.
Article in English | MEDLINE | ID: mdl-36073484

ABSTRACT

In Germany, the current COVID-19 cases are managed and reported by the local health authorities. The workload of their employees during the pandemic is high, especially in periods of high infection numbers. In this work, a decision support toolkit for local health authorities is introduced. A demonstrator web application was developed with the R Shiny framework and is publicly accessible online. It contains five separate tools based on statistical models for specific use cases and corresponding questions of COVID-19 cases and their contacts. The underlying statistical methods have been implemented in a new open-source R package. The toolkit has the potential to support local health authorities' employees in their daily work. A simulation-based validation of the statistical models and a usability evaluation of the demonstrator application in a user study will be carried out in the future.


Subject(s)
COVID-19 , Esters , Humans , Models, Statistical , Pandemics , Software
6.
Mod Pathol ; 35(12): 1759-1769, 2022 12.
Article in English | MEDLINE | ID: mdl-36088478

ABSTRACT

Artificial intelligence (AI) solutions that automatically extract information from digital histology images have shown great promise for improving pathological diagnosis. Prior to routine use, it is important to evaluate their predictive performance and obtain regulatory approval. This assessment requires appropriate test datasets. However, compiling such datasets is challenging and specific recommendations are missing. A committee of various stakeholders, including commercial AI developers, pathologists, and researchers, discussed key aspects and conducted extensive literature reviews on test datasets in pathology. Here, we summarize the results and derive general recommendations on compiling test datasets. We address several questions: Which and how many images are needed? How to deal with low-prevalence subsets? How can potential bias be detected? How should datasets be reported? What are the regulatory requirements in different countries? The recommendations are intended to help AI developers demonstrate the utility of their products and to help pathologists and regulatory agencies verify reported performance measures. Further research is needed to formulate criteria for sufficiently representative test datasets so that AI solutions can operate with less user intervention and better support diagnostic workflows in the future.


Subject(s)
Artificial Intelligence , Pathology , Humans , Forecasting , Datasets as Topic
7.
Stat Med ; 41(5): 891-909, 2022 02 28.
Article in English | MEDLINE | ID: mdl-35075684

ABSTRACT

Major advances have been made regarding the utilization of machine learning techniques for disease diagnosis and prognosis based on complex and high-dimensional data. Despite all justified enthusiasm, overoptimistic assessments of predictive performance are still common in this area. However, predictive models and medical devices based on such models should undergo a thorough evaluation before being implemented into clinical practice. In this work, we propose a multiple testing framework for (comparative) phase III diagnostic accuracy studies with sensitivity and specificity as co-primary endpoints. Our approach challenges the frequent recommendation to strictly separate model selection and evaluation, that is, to only assess a single diagnostic model in the evaluation study. We show that our parametric simultaneous test procedure asymptotically allows strong control of the family-wise error rate. A multiplicity correction is also available for point and interval estimates. Moreover, we demonstrate in an extensive simulation study that our multiple testing strategy on average leads to a better final diagnostic model and increased statistical power. To plan such studies, we propose a Bayesian approach to determine the optimal number of models to evaluate simultaneously. For this purpose, our algorithm optimizes the expected final model performance given previous (hold-out) data from the model development phase. We conclude that an assessment of multiple promising diagnostic models in the same evaluation study has several advantages when suitable adjustments for multiple comparisons are employed.


Subject(s)
Algorithms , Machine Learning , Bayes Theorem , Humans , Prognosis , Sensitivity and Specificity
8.
Article in English | MEDLINE | ID: mdl-34501757

ABSTRACT

In Germany, local health departments are responsible for surveillance of the current pandemic situation. One of their major tasks is to monitor infected persons. For instance, the direct contacts of infectious persons at group meetings have to be traced and potentially quarantined. Such quarantine requirements may be revoked when all contact persons obtain a negative polymerase chain reaction (PCR) test result. However, contact tracing and testing are time-consuming, costly and not always feasible. In this work, we present a statistical model for the probability that no transmission of COVID-19 occurred given an arbitrary number of negative test results among contact persons. Here, the time-dependent sensitivity and specificity of the PCR test are taken into account. We employ a parametric Bayesian model which combines an adaptable Beta-Binomial prior and two likelihood components in a novel fashion. This is illustrated for group events in German school classes. The first evaluation on a real-world dataset showed that our approach can support important quarantine decisions with the goal to achieve a better balance between necessary containment of the pandemic and preservation of social and economic life. Future work will focus on further refinement and evaluation of quarantine decisions based on our statistical model.


Subject(s)
COVID-19 , Quarantine , Bayes Theorem , Contact Tracing , Humans , Models, Statistical , SARS-CoV-2
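
The kind of calculation described in the abstract above can be sketched in Python under strong simplifying assumptions. This is not the authors' model: it assumes that every contact person was tested once and tested negative, that sensitivity and specificity are constant rather than time-dependent, and that test results are independent given infection status. The function names and all parameter values are hypothetical.

```python
from math import comb, exp, lgamma


def log_beta(a: float, b: float) -> float:
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(x: int, n: int, a: float, b: float) -> float:
    """Prior probability that x of n contacts were infected."""
    return comb(n, x) * exp(log_beta(x + a, n - x + b) - log_beta(a, b))


def prob_no_transmission(n_contacts: int, a: float, b: float,
                         sensitivity: float, specificity: float) -> float:
    """Posterior P(no contact infected | all n contacts tested negative).

    Likelihood of n negative results given x infected contacts:
    (1 - sensitivity)^x * specificity^(n - x).
    """
    weights = [
        beta_binomial_pmf(x, n_contacts, a, b)
        * (1.0 - sensitivity) ** x
        * specificity ** (n_contacts - x)
        for x in range(n_contacts + 1)
    ]
    return weights[0] / sum(weights)


# Hypothetical school-class example: 20 contacts, assumed prior and test accuracy.
posterior = prob_no_transmission(20, a=1.0, b=9.0,
                                 sensitivity=0.80, specificity=0.99)
print(round(posterior, 3))
```

In this simplified setting the posterior rises with test sensitivity, which is why a time-dependent sensitivity, as used in the paper, matters for when contacts are tested.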
9.
Stat Methods Med Res ; 29(6): 1728-1745, 2020 06.
Article in English | MEDLINE | ID: mdl-31510862

ABSTRACT

Model selection and performance assessment for prediction models are important tasks in machine learning, e.g. for the development of medical diagnosis or prognosis rules based on complex data. A common approach is to select the best model via cross-validation and to evaluate this final model on an independent dataset. In this work, we propose to instead evaluate several models simultaneously. These may result from varied hyperparameters or completely different learning algorithms. Our main goal is to increase the probability of correctly identifying a model that performs sufficiently well. In this case, adjusting for multiplicity is necessary in the evaluation stage to avoid an inflation of the family-wise error rate. We apply the so-called maxT-approach which is based on the joint distribution of test statistics and suitable to (approximately) control the family-wise error rate for a wide variety of performance measures. We conclude that evaluating only a single final model is suboptimal. Instead, several promising models should be evaluated simultaneously, e.g. all models within one standard error of the best validation model. This strategy has proven to increase the probability of correctly identifying a good model as well as the final model performance in extensive simulation studies.


Subject(s)
Algorithms , Machine Learning , Computer Simulation , Prognosis
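
The "within one standard error" selection rule mentioned in the abstract above can be sketched in a few lines of Python. The binomial standard error of the best validation accuracy, the function name and the example accuracies are assumptions for illustration. The paper itself covers a wider variety of performance measures and follows the selection with a maxT-adjusted evaluation, which is not shown here.

```python
from math import sqrt


def within_one_se(val_accuracy: dict, n_val: int) -> list:
    """Names of all models within one standard error of the best one.

    val_accuracy maps a model name to its validation accuracy, estimated
    on n_val observations. Results are ordered from best to worst.
    """
    best = max(val_accuracy.values())
    se = sqrt(best * (1.0 - best) / n_val)  # binomial SE of the best accuracy
    keep = [name for name, acc in val_accuracy.items() if acc >= best - se]
    return sorted(keep, key=lambda name: -val_accuracy[name])


# Hypothetical validation results on 400 observations.
val_accuracy = {"gbm": 0.90, "rf": 0.89, "svm": 0.88, "knn": 0.80}
shortlist = within_one_se(val_accuracy, n_val=400)
print(shortlist)
```

With these assumed numbers the standard error is about 0.015, so only models with a validation accuracy of at least roughly 0.885 are carried forward to the evaluation study.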
10.
Eur J Radiol ; 94: 78-84, 2017 Sep.
Article in English | MEDLINE | ID: mdl-28647206

ABSTRACT

PURPOSE: To describe early signs for restrictive subtype of chronic lung allograft dysfunction (CLAD) after lung transplantation in computed tomography (CT) and to evaluate the predictive value for disease progression and survival. MATERIAL AND METHODS: 52 CT examinations in lung transplant patients at CLAD onset were scored for CT features referring to airways disease, parenchymal or pleural abnormality. Patients with and without later development of restrictive CLAD (TLC≤80%) were compared. A radiological score for inflammation including pleural effusion, central and peripheral ground glass opacities and consolidations was calculated and used for survival analysis. RESULTS: CT of patients with later development of restrictive CLAD showed abnormalities at CLAD onset significantly more often, in particular consolidations (57% vs. 4%; p<0.001) and ground glass attenuations (71% vs. 7%; p<0.001), than those of patients without the restrictive phenotype. CT score for inflammation was significantly higher in patients with than without later restrictive CLAD (3.4 vs. 0.6; p<0.001). Survival of patients with a high score (>2) for inflammation in CT at CLAD onset was significantly lower than that of those with a low score (443 vs. 2415 days; p=0.019). CONCLUSIONS: CT at CLAD onset differs in patients with/without later development of the restrictive phenotype. It is therefore an indicator for future development of restrictive CLAD and a predictor of survival. It should be implemented in the diagnostic work-up at diagnosis of CLAD.


Subject(s)
Lung Diseases/diagnostic imaging , Lung Transplantation/adverse effects , Primary Graft Dysfunction/diagnostic imaging , Tomography, X-Ray Computed , Adult , Disease Progression , Female , Follow-Up Studies , Humans , Lung Diseases/mortality , Lung Diseases/physiopathology , Male , Middle Aged , Phenotype , Predictive Value of Tests , Primary Graft Dysfunction/mortality , Primary Graft Dysfunction/physiopathology , Retrospective Studies , Survival Analysis , Tomography, X-Ray Computed/methods
11.
PLoS One ; 12(5): e0177757, 2017.
Article in English | MEDLINE | ID: mdl-28542322

ABSTRACT

BACKGROUND: Transition to adult health services is a vulnerable phase in young persons with chronic disease. We describe how young persons with inflammatory bowel disease in Germany and Austria experience care during the transitional age, focusing on differences by type of provider (pediatric vs. adult specialist, no specialist). METHODS: This was a follow up survey in patients previously registered with a pediatric IBD registry. Patients aged 15 to 25 received a postal questionnaire, including a measure of health care experience and satisfaction. Descriptive analyses were stratified by age. Sub-analyses in the 18-20 year age group compared health care experience by type of provider. Determinants of early or late transfer were examined using multinomial logistic regression. RESULTS: 619 patients responded to the survey; 605 questionnaires were available for analysis. Usual age of completing transition was 18. Earlier transfer was more common with low parental SES (OR 1.8, 95% CI 0.7 to 4.6), and less common with advanced schooling (OR 0.5, 95% CI 0.2 to 1.2). Structured transition was uncommon. 48% of the respondents had not received any preceding transition advice. Overall satisfaction with IBD care was high, especially with respect to interpersonal aspects, but less so in aspects of continuity of care. CONCLUSIONS: Despite high overall patient satisfaction, relevant deficiencies in transitional care were documented. Some of these were associated with lower parental social status. Differences in health care satisfaction by type of provider (adult vs. pediatric) were small.


Subject(s)
Inflammatory Bowel Diseases , Surveys and Questionnaires , Transition to Adult Care , Adolescent , Adult , Female , Health Personnel , Humans , Inflammatory Bowel Diseases/therapy , Male , Patient Satisfaction , Young Adult
12.
Liver Int ; 37(2): 196-204, 2017 02.
Article in English | MEDLINE | ID: mdl-27428078

ABSTRACT

BACKGROUND & AIMS: Identifying chronic hepatitis delta patients who have advanced fibrosis and are thus in need of urgent treatment is crucial. To avoid liver biopsy, non-invasive fibrosis scores may be helpful but have not been evaluated for chronic hepatitis delta yet. METHODS: We evaluated eight non-invasive fibrosis scores in 100 HDV RNA-positive patients with available central histological reading. New cut-off values were calculated by using Receiver Operating Characteristics and Youden indexes. Predictors for the presence of ISHAK F3-6 were revealed by t-tests or Mann-Whitney tests. RESULTS: None of the tested scores had an area under the curve (AUROC) > 0.8 and performed according to our predefined requirements of a sensitivity of >80% and a positive predictive value (PPV) >90% - even after adaptation. However, the ELF score was able to identify advanced fibrosis with a high sensitivity (93%) and PPV (81%), but relies on expensive extracellular matrix markers with poor availability in many endemic regions of HDV. Thus, we developed a novel non-invasive approach and identified low cholinesterase (P=.002), low albumin (P=.041), higher gamma glutamyl transferase, as well as older age (P<.001) as predictors of fibrosis resulting in the Delta Fibrosis Score (DFS). The DFS performed with a sensitivity of 85% and PPV of 93% with an AUROC of 0.87. CONCLUSIONS: Existing non-invasive fibrosis scores are either impracticable or do not perform well in chronic hepatitis delta patients. However, the new Delta Fibrosis Score is the first non-invasive fibrosis score specifically developed for chronic hepatitis delta and requires only standard parameters.


Subject(s)
Hepatitis D, Chronic/complications , Liver Cirrhosis/diagnosis , Liver Cirrhosis/virology , Liver/pathology , Adult , Biomarkers/blood , Biopsy , Clinical Trials, Phase II as Topic , Disease Progression , Female , Germany , Hepatitis D, Chronic/pathology , Hepatitis Delta Virus , Humans , Liver Cirrhosis/pathology , Male , Middle Aged , Multicenter Studies as Topic , Predictive Value of Tests , ROC Curve , Randomized Controlled Trials as Topic , Severity of Illness Index , Young Adult , gamma-Glutamyltransferase/blood
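
The cut-off selection by Youden index mentioned in the abstract above can be sketched as follows. This is a generic illustration, not the authors' analysis code: the function name, the rule "score >= cutoff counts as positive", the tie-breaking towards the lowest cutoff and the toy data are all assumptions.

```python
def youden_cutoff(scores: list, labels: list) -> tuple:
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    J = sensitivity + specificity - 1. A score >= cutoff is classified as
    positive; labels are 1 (advanced fibrosis) or 0. Ties keep the lowest cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    best = None
    for cutoff in sorted(set(scores)):  # every observed score is a candidate
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Toy data (assumed): fibrosis score values and biopsy-based labels.
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, sens, spec = youden_cutoff(scores, labels)
print(cutoff, sens, spec)
```

A cut-off chosen this way is optimized on the same data it is evaluated on, so its apparent sensitivity and specificity tend to be optimistic until confirmed in independent patients.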
13.
Hepatology ; 65(2): 414-425, 2017 02.
Article in English | MEDLINE | ID: mdl-27770553

ABSTRACT

Hepatitis delta virus (HDV) is the most severe form of viral hepatitis. Pegylated interferon alfa (PEG-IFNα) is effective in only 25%-30% of patients and is associated with frequent side effects. The aim of this study was to analyze the clinical long-term outcome of hepatitis delta in relation to different antiviral treatment strategies. We studied 136 anti-HDV-positive patients who were followed for at least 6 months in a retrospective single-center cohort (mean time of follow-up, 5.2 years; range, 0.6-18.8). Liver cirrhosis was already present in 62 patients at first presentation. Twenty-nine percent of patients did not receive any antiviral treatment, 38% were treated with interferon alfa (IFNα)-based therapies, and 33% received nucleos(t)ide analogues (NAs) only. Clinical endpoints defined as hepatic decompensation (ascites, encephalopathy, and variceal bleeding), hepatocellular carcinoma, liver transplantation, and liver-related death developed in 55 patients (40%). Patients who received IFNα-based therapies developed clinical endpoints less frequently than those treated with NA (P = 0.02; HR, 4.0) or untreated patients (P = 0.05; HR, 2.2; 17%, 64%, and 44%), respectively, which was significant in both chi-square and Kaplan-Meier analysis. In addition, considering various clinical and virological parameters, IFNα therapy was independently associated with a more benign clinical long-term outcome in multivariate logistic regression analysis (P = 0.04; odds ratio, 0.25; 95% confidence interval, 0.07-0.9). Loss of HDV RNA during follow-up was more frequent in IFNα-treated patients and strongly linked with a lower likelihood to experience liver-related complications. CONCLUSION: IFNα-based antiviral therapy of hepatitis delta was independently associated with a lower likelihood for clinical disease progression. Durable undetectability of HDV RNA is a valid surrogate endpoint in the treatment of hepatitis delta. (Hepatology 2017;65:414-425).


Subject(s)
Antiviral Agents/therapeutic use , Hepatitis D/drug therapy , Hepatitis D/mortality , Interferon-alpha/therapeutic use , Liver Cirrhosis/pathology , Liver Neoplasms/pathology , Adolescent , Adult , Analysis of Variance , Antiviral Agents/adverse effects , Cause of Death , Chi-Square Distribution , Cohort Studies , Disease Progression , Dose-Response Relationship, Drug , Drug Administration Schedule , Drug Therapy, Combination , Female , Follow-Up Studies , Germany , Hepatitis D/diagnosis , Hepatitis Delta Virus/drug effects , Hepatitis Delta Virus/isolation & purification , Humans , Interferon-alpha/adverse effects , Kaplan-Meier Estimate , Liver/drug effects , Liver/pathology , Liver Cirrhosis/etiology , Liver Cirrhosis/mortality , Liver Neoplasms/etiology , Liver Neoplasms/mortality , Logistic Models , Male , Middle Aged , Multivariate Analysis , Reference Values , Retrospective Studies , Risk Assessment , Severity of Illness Index , Survival Analysis , Time Factors , Treatment Outcome , Young Adult