Results 1 - 8 of 8
1.
Cochrane Database Syst Rev ; 6: CD015890, 2024 Jun 11.
Article in English | MEDLINE | ID: mdl-38860538

ABSTRACT

BACKGROUND: Tuberculosis (TB) is a leading cause of mortality due to an infectious disease, with an estimated 1.6 million deaths due to TB in 2022. Approximately 25% of the global population has TB infection, giving rise to 10.6 million episodes of TB disease in 2022. Undernutrition is a key risk factor for TB and was linked to an estimated 2.2 million TB episodes in 2022, as outlined in the World Health Organization (WHO) Global Tuberculosis Report. OBJECTIVES: To determine the prognostic value of undernutrition in the general population of adults, adolescents, and children for predicting tuberculosis disease over any time period. SEARCH METHODS: We searched the literature databases MEDLINE (via PubMed) and WHO Global Index Medicus, as well as the WHO International Clinical Trials Registry Platform (ICTRP) on 3 May 2023 (date of last search for all databases). We placed no restrictions on the language of publication. SELECTION CRITERIA: We included retrospective and prospective cohort studies, irrespective of publication status or language. The target population comprised adults, adolescents, and children from diverse settings, encompassing outpatient and inpatient cohorts, with varying comorbidities and risk of exposure to tuberculosis. DATA COLLECTION AND ANALYSIS: We used standard Cochrane methodology and the Quality In Prognosis Studies (QUIPS) tool to assess the risk of bias of the studies. Prognostic factors included undernutrition, defined as wasting, stunting, and underweight, with specific measures such as body mass index (BMI) less than two standard deviations below the median for children and adolescents and low BMI scores (< 18.5) for adults and adolescents. Prognostication occurred at enrolment/baseline. The primary outcome was the incidence of TB disease. The secondary outcome was recurrent TB disease. 
We performed a random-effects meta-analysis for the adjusted hazard ratios (HR), risk ratios (RR), or odds ratios (OR), using restricted maximum likelihood estimation. We rated the certainty of the evidence using the GRADE approach. MAIN RESULTS: We included 51 cohort studies with over 27 million participants from the six WHO regions. Sixteen large population-based studies were conducted in China, Singapore, South Korea, and the USA, and 25 studies focused on people living with HIV, which were mainly conducted in the African region. Most studies were in adults, four in children, and three in children and adults. Undernutrition as an exposure was usually defined according to standard criteria; however, the diagnosis of TB did not include a confirmatory culture or molecular diagnosis using a WHO-approved rapid diagnostic test in eight studies. The median follow-up time was 3.5 years, and the studies primarily reported an adjusted hazard ratio from a multivariable Cox proportional hazards model. Hazard ratios (HR): The HR estimates represent the highest certainty of the evidence, explored through sensitivity analyses and excluding studies at high risk of bias. We present 95% confidence intervals (CI) and prediction intervals, which express between-study heterogeneity as the variability of effect sizes (i.e. the interval within which the effect size of a new study would fall considering the same population of studies included in the meta-analysis). Undernutrition may increase the risk of TB disease (HR 2.23, 95% CI 1.83 to 2.72; prediction interval 0.98 to 5.05; 23 studies; 2,883,266 participants). The certainty of the evidence is low due to a moderate risk of bias across studies and inconsistency. When stratified by follow-up time, the results are more consistent across < 10 years follow-up (HR 2.02, 95% CI 1.74 to 2.34; prediction interval 1.20 to 3.39; 22 studies; 2,869,077 participants). 
This results in a moderate certainty of evidence due to a moderate risk of bias across studies. However, at 10 or more years of follow-up, we found only one study with a wider CI and higher HR (HR 12.43, 95% CI 5.74 to 26.91; 14,189 participants). The certainty of the evidence is low due to the moderate risk of bias and indirectness. Odds ratio (OR): Undernutrition may increase the odds of TB disease, but the results are uncertain (OR 1.56, 95% CI 1.13 to 2.17; prediction interval 0.61 to 3.99; 8 studies; 173,497 participants). Stratification by follow-up was not possible as all studies had a follow-up of < 10 years. The certainty of the evidence is very low due to the high risk of bias and inconsistency. Contour-enhanced funnel plots were not reported due to the few studies included. Risk ratio (RR): Undernutrition may increase the risk of TB disease (RR 1.95, 95% CI 1.72 to 2.20; prediction interval 1.49 to 2.55; 4 studies; 1,475,867 participants). Stratification by follow-up was not possible as all studies had a follow-up of < 10 years. The certainty of the evidence is low due to the high risk of bias. Contour-enhanced funnel plots were not reported due to the few studies included. AUTHORS' CONCLUSIONS: Undernutrition probably increases the risk of TB two-fold in the short term (< 10 years) and may also increase the risk in the long term (> 10 years). Policies targeted towards the reduction of the burden of undernutrition are not only needed to alleviate human suffering due to undernutrition and its many adverse consequences, but are also an important part of the critical measures for ending the TB epidemic by 2030. Large population-based cohorts, including those derived from high-quality national registries of exposures (undernutrition) and outcomes (TB disease), are needed to provide high-certainty estimates of this risk across different settings and populations, including low and middle-income countries from different WHO regions. 
Moreover, studies including children and adolescents and state-of-the-art methods for diagnosing TB would provide more up-to-date information relevant to practice and policy. FUNDING: World Health Organization (203256442). REGISTRATION: PROSPERO registration: CRD42023408807 Protocol: https://doi.org/10.1002/14651858.CD015890.
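
The pooling and the two kinds of interval described in the abstract above can be sketched as follows. This is a minimal illustration only: it uses the DerSimonian-Laird estimator of between-study variance rather than the restricted maximum likelihood estimation the review reports, it assumes the common t-based prediction interval with k - 2 degrees of freedom, and the study values, function names and the hard-coded t quantile are all hypothetical.

```python
import math

def pool_random_effects(log_effects, std_errors):
    """Random-effects pooling on the log scale (DerSimonian-Laird, not REML)."""
    k = len(log_effects)
    if k < 2:
        raise ValueError("need at least two studies")
    w = [1.0 / se ** 2 for se in std_errors]            # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)                   # between-study variance
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_effects)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    return pooled, se_pooled, tau2

def intervals(pooled, se_pooled, tau2, t_crit, z=1.96):
    """Return (confidence interval, prediction interval) on the ratio scale."""
    ci_half = z * se_pooled
    # The prediction interval adds the between-study variance to the
    # uncertainty of the pooled mean, so it is never narrower than the CI
    # when t_crit >= z.
    pi_half = t_crit * math.sqrt(tau2 + se_pooled ** 2)
    ci = (math.exp(pooled - ci_half), math.exp(pooled + ci_half))
    pi = (math.exp(pooled - pi_half), math.exp(pooled + pi_half))
    return ci, pi

# Hypothetical adjusted hazard ratios and standard errors of their logs.
log_hr = [math.log(1.6), math.log(2.1), math.log(3.0), math.log(2.4)]
se = [0.15, 0.20, 0.25, 0.18]
pooled, se_pooled, tau2 = pool_random_effects(log_hr, se)
# 0.975 quantile of the t distribution with k - 2 = 2 degrees of freedom.
ci, pi = intervals(pooled, se_pooled, tau2, t_crit=4.303)
print(f"pooled HR {math.exp(pooled):.2f}, 95% CI {ci[0]:.2f} to {ci[1]:.2f}, "
      f"prediction interval {pi[0]:.2f} to {pi[1]:.2f}")
```

A prediction interval that crosses 1 while the confidence interval does not, as with the HR result reported above, indicates that a new study in a comparable setting could plausibly find no association even though the average effect is clearly above 1.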


Subject(s)
Malnutrition , Tuberculosis , Humans , Malnutrition/complications , Malnutrition/epidemiology , Risk Factors , Child , Adolescent , Tuberculosis/epidemiology , Adult , Prognosis , Retrospective Studies , Prospective Studies
3.
Cochrane Database Syst Rev ; 12: CD013139, 2021 12 21.
Article in English | MEDLINE | ID: mdl-34931303

ABSTRACT

BACKGROUND: The Revised Cardiac Risk Index (RCRI) is a widely acknowledged prognostic model to estimate preoperatively the probability of developing in-hospital major adverse cardiac events (MACE) in patients undergoing noncardiac surgery. However, the RCRI does not always make accurate predictions, so various studies have investigated whether biomarkers added to or compared with the RCRI could improve this. OBJECTIVES: Primary: To investigate the added predictive value of biomarkers to the RCRI to preoperatively predict in-hospital MACE and other adverse outcomes in patients undergoing noncardiac surgery. Secondary: To investigate the prognostic value of biomarkers compared to the RCRI to preoperatively predict in-hospital MACE and other adverse outcomes in patients undergoing noncardiac surgery. Tertiary: To investigate the prognostic value of other prediction models compared to the RCRI to preoperatively predict in-hospital MACE and other adverse outcomes in patients undergoing noncardiac surgery. SEARCH METHODS: We searched MEDLINE and Embase from 1 January 1999 (the year that the RCRI was published) until 25 June 2020. We also searched ISI Web of Science and SCOPUS for articles referring to the original RCRI development study in that period. SELECTION CRITERIA: We included studies among adults who underwent noncardiac surgery, reporting on (external) validation of the RCRI and: - the addition of biomarker(s) to the RCRI; or - the comparison of the predictive accuracy of biomarker(s) to the RCRI; or - the comparison of the predictive accuracy of the RCRI to other models. Besides MACE, all other adverse outcomes were considered for inclusion. DATA COLLECTION AND ANALYSIS: We developed a data extraction form based on the CHARMS checklist. Independent pairs of authors screened references, extracted data and assessed risk of bias and concerns regarding applicability according to PROBAST. 
For biomarkers and prediction models that were added or compared to the RCRI in ≥ 3 different articles, we described study characteristics and findings in further detail. We did not apply GRADE as no guidance is available for prognostic model reviews. MAIN RESULTS: We screened 3960 records and included 107 articles. Across all objectives we rated risk of bias as high in ≥ 1 domain in 90% of included studies, particularly in the analysis domain. Statistical pooling or meta-analysis of reported results was impossible due to heterogeneity in various aspects: outcomes used, scale by which the biomarker was added/compared to the RCRI, prediction horizons and studied populations. Added predictive value of biomarkers to the RCRI: Fifty-one studies reported on the added value of biomarkers to the RCRI. Sixty-nine different predictors were identified derived from blood (29%), imaging (33%) or other sources (38%). Addition of NT-proBNP, troponin or their combination improved the RCRI for predicting MACE (median delta c-statistics: 0.08, 0.14 and 0.12 for NT-proBNP, troponin and their combination, respectively). The median total net reclassification index (NRI) was 0.16 and 0.74 after addition of troponin and NT-proBNP to the RCRI, respectively. Calibration was not reported. To predict myocardial infarction, the median delta c-statistic when NT-proBNP was added to the RCRI was 0.09, and 0.06 for prediction of all-cause mortality and MACE combined. For BNP and copeptin, data were not sufficient to provide results on their added predictive performance, for any of the outcomes. Comparison of the predictive value of biomarkers to the RCRI: Fifty-one studies assessed the predictive performance of biomarkers alone compared to the RCRI. We identified 60 unique predictors derived from blood (38%), imaging (30%) or other sources, such as the American Society of Anesthesiologists (ASA) classification (32%). 
Predictions were similar between the ASA classification and the RCRI for all studied outcomes. In studies different from those identified in objective 1, the median delta c-statistic was 0.15 and 0.12 in favour of BNP and NT-proBNP alone, respectively, when compared to the RCRI, for the prediction of MACE. For C-reactive protein, the predictive performance was similar to the RCRI. For other biomarkers and outcomes, data were insufficient to provide summary results. One study reported on calibration and none on reclassification. Comparison of the predictive value of other prognostic models to the RCRI: Fifty-two articles compared the predictive ability of the RCRI to other prognostic models. Of these, 42% developed a new prediction model, 22% updated the RCRI, or another prediction model, and 37% validated an existing prediction model. None of the other prediction models showed better performance in predicting MACE than the RCRI. To predict myocardial infarction and cardiac arrest, ACS-NSQIP-MICA had a higher median delta c-statistic of 0.11 compared to the RCRI. To predict all-cause mortality, the median delta c-statistic was 0.15 higher in favour of ACS-NSQIP-SRS compared to the RCRI. Predictive performance was not better for CHADS2, CHA2DS2-VASc, R2CHADS2, Goldman index, Detsky index or VSG-CRI compared to the RCRI for any of the outcomes. Calibration and reclassification were reported in only one and three studies, respectively. AUTHORS' CONCLUSIONS: Studies included in this review suggest that the predictive performance of the RCRI in predicting MACE is improved when NT-proBNP, troponin or their combination are added. Other studies indicate that BNP and NT-proBNP, when used in isolation, may even have a higher discriminative performance than the RCRI. There was insufficient evidence of a difference between the predictive accuracy of the RCRI and other prediction models in predicting MACE. 
However, ACS-NSQIP-MICA and ACS-NSQIP-SRS outperformed the RCRI in predicting myocardial infarction and cardiac arrest combined, and all-cause mortality, respectively. Nevertheless, the results cannot be interpreted as conclusive due to high risks of bias in a majority of papers, and pooling was impossible due to heterogeneity in outcomes, prediction horizons, biomarkers and studied populations. Future research on the added prognostic value of biomarkers to existing prediction models should focus on biomarkers with good predictive accuracy in other settings (e.g. diagnosis of myocardial infarction) and identification of biomarkers from omics data. They should be compared to novel biomarkers with so far insufficient evidence compared to established ones, including NT-proBNP or troponins. Adherence to recent guidance for prediction model studies (e.g. TRIPOD; PROBAST) and use of standardised outcome definitions in primary studies is highly recommended to facilitate systematic review and meta-analyses in the future.


Subject(s)
Heart Arrest , Myocardial Infarction , Adult , Bias , Biomarkers , Humans , Peptide Fragments , Predictive Value of Tests , Prognosis , Risk Assessment
4.
Cochrane Database Syst Rev ; 3: CD013639, 2021 03 16.
Article in English | MEDLINE | ID: mdl-33724443

ABSTRACT

BACKGROUND: The respiratory illness caused by SARS-CoV-2 infection continues to present diagnostic challenges. Our 2020 edition of this review showed thoracic (chest) imaging to be sensitive and moderately specific in the diagnosis of coronavirus disease 2019 (COVID-19). In this update, we include new relevant studies, and have removed studies with case-control designs, and those not intended to be diagnostic test accuracy studies. OBJECTIVES: To evaluate the diagnostic accuracy of thoracic imaging (computed tomography (CT), X-ray and ultrasound) in people with suspected COVID-19. SEARCH METHODS: We searched the COVID-19 Living Evidence Database from the University of Bern, the Cochrane COVID-19 Study Register, The Stephen B. Thacker CDC Library, and repositories of COVID-19 publications through to 30 September 2020. We did not apply any language restrictions. SELECTION CRITERIA: We included studies of all designs, except for case-control, that recruited participants of any age group suspected to have COVID-19 and that reported estimates of test accuracy or provided data from which we could compute estimates. DATA COLLECTION AND ANALYSIS: The review authors independently and in duplicate screened articles, extracted data and assessed risk of bias and applicability concerns using the QUADAS-2 domain-list. We presented the results of estimated sensitivity and specificity using paired forest plots, and we summarised pooled estimates in tables. We used a bivariate meta-analysis model where appropriate. We presented the uncertainty of accuracy estimates using 95% confidence intervals (CIs). MAIN RESULTS: We included 51 studies with 19,775 participants suspected of having COVID-19, of whom 10,155 (51%) had a final diagnosis of COVID-19. Forty-seven studies evaluated one imaging modality each, and four studies evaluated two imaging modalities each. 
All studies used RT-PCR as the reference standard for the diagnosis of COVID-19, with 47 studies using only RT-PCR and four studies using a combination of RT-PCR and other criteria (such as clinical signs, imaging tests, positive contacts, and follow-up phone calls) as the reference standard. Studies were conducted in Europe (33), Asia (13), North America (3) and South America (2); including only adults (26), all ages (21), children only (1), adults over 70 years (1), and unclear (2); in inpatients (2), outpatients (32), and setting unclear (17). Risk of bias was high or unclear in thirty-two (63%) studies with respect to participant selection, 40 (78%) studies with respect to reference standard, 30 (59%) studies with respect to index test, and 24 (47%) studies with respect to participant flow. For chest CT (41 studies, 16,133 participants, 8110 (50%) cases), the sensitivity ranged from 56.3% to 100%, and specificity ranged from 25.4% to 97.4%. The pooled sensitivity of chest CT was 87.9% (95% CI 84.6 to 90.6) and the pooled specificity was 80.0% (95% CI 74.9 to 84.3). There was no statistical evidence indicating that reference standard conduct and definition for index test positivity were sources of heterogeneity for CT studies. Nine chest CT studies (2807 participants, 1139 (41%) cases) used the COVID-19 Reporting and Data System (CO-RADS) scoring system, which has five thresholds to define index test positivity. At a CO-RADS threshold of 5 (7 studies), the sensitivity ranged from 41.5% to 77.9% and the pooled sensitivity was 67.0% (95% CI 56.4 to 76.2); the specificity ranged from 83.5% to 96.2%; and the pooled specificity was 91.3% (95% CI 87.6 to 94.0). At a CO-RADS threshold of 4 (7 studies), the sensitivity ranged from 56.3% to 92.9% and the pooled sensitivity was 83.5% (95% CI 74.4 to 89.7); the specificity ranged from 77.2% to 90.4% and the pooled specificity was 83.6% (95% CI 80.5 to 86.4). 
For chest X-ray (9 studies, 3694 participants, 2111 (57%) cases) the sensitivity ranged from 51.9% to 94.4% and specificity ranged from 40.4% to 88.9%. The pooled sensitivity of chest X-ray was 80.6% (95% CI 69.1 to 88.6) and the pooled specificity was 71.5% (95% CI 59.8 to 80.8). For ultrasound of the lungs (5 studies, 446 participants, 211 (47%) cases) the sensitivity ranged from 68.2% to 96.8% and specificity ranged from 21.3% to 78.9%. The pooled sensitivity of ultrasound was 86.4% (95% CI 72.7 to 93.9) and the pooled specificity was 54.6% (95% CI 35.3 to 72.6). Based on an indirect comparison using all included studies, chest CT had a higher specificity than ultrasound. For indirect comparisons of chest CT and chest X-ray, or chest X-ray and ultrasound, the data did not show differences in specificity or sensitivity. AUTHORS' CONCLUSIONS: Our findings indicate that chest CT is sensitive and moderately specific for the diagnosis of COVID-19. Chest X-ray is moderately sensitive and moderately specific for the diagnosis of COVID-19. Ultrasound is sensitive but not specific for the diagnosis of COVID-19. Thus, chest CT and ultrasound may have more utility for excluding COVID-19 than for differentiating SARS-CoV-2 infection from other causes of respiratory illness. Future diagnostic accuracy studies should pre-define positive imaging findings, include direct comparisons of the various modalities of interest in the same participant population, and implement improved reporting practices.
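
The per-study sensitivity and specificity estimates that feed the paired forest plots described above come from each study's 2x2 table. The sketch below shows that step only, under stated assumptions: the counts are hypothetical, the Wilson score interval is one common choice of interval and is not named in the abstract, and the bivariate pooling model used by the review is not reproduced here.

```python
import math

def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (assumed choice of interval)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half

def accuracy(tp, fp, fn, tn):
    """Sensitivity and specificity with 95% intervals from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),             # positives among the diseased
        "sensitivity_ci": wilson_interval(tp, tp + fn),
        "specificity": tn / (tn + fp),             # negatives among the non-diseased
        "specificity_ci": wilson_interval(tn, tn + fp),
    }

# Hypothetical counts for a single imaging study against RT-PCR.
result = accuracy(tp=88, fp=20, fn=12, tn=80)
print(result)
```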


Subject(s)
COVID-19/diagnostic imaging , Radiography, Thoracic , Tomography, X-Ray Computed , Ultrasonography , Adolescent , Adult , Aged , Bias , COVID-19 Nucleic Acid Testing/standards , Child , Confidence Intervals , Humans , Lung/diagnostic imaging , Middle Aged , Radiography, Thoracic/standards , Radiography, Thoracic/statistics & numerical data , Reference Standards , Sensitivity and Specificity , Tomography, X-Ray Computed/standards , Tomography, X-Ray Computed/statistics & numerical data , Ultrasonography/standards , Ultrasonography/statistics & numerical data , Young Adult
5.
Cochrane Database Syst Rev ; 9: CD013639, 2020 09 30.
Article in English | MEDLINE | ID: mdl-32997361

ABSTRACT

BACKGROUND: The diagnosis of infection by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) presents major challenges. Reverse transcriptase polymerase chain reaction (RT-PCR) testing is used to diagnose a current infection, but its utility as a reference standard is constrained by sampling errors, limited sensitivity (71% to 98%), and dependence on the timing of specimen collection. Chest imaging tests are being used in the diagnosis of COVID-19 disease, or when RT-PCR testing is unavailable. OBJECTIVES: To determine the diagnostic accuracy of chest imaging (computed tomography (CT), X-ray and ultrasound) in people with suspected or confirmed COVID-19. SEARCH METHODS: We searched the COVID-19 Living Evidence Database from the University of Bern, the Cochrane COVID-19 Study Register, and The Stephen B. Thacker CDC Library. In addition, we checked repositories of COVID-19 publications. We did not apply any language restrictions. We conducted searches for this review iteration up to 5 May 2020. SELECTION CRITERIA: We included studies of all designs that produce estimates of test accuracy or provide data from which estimates can be computed. We included two types of cross-sectional designs: a) where all patients suspected of the target condition enter the study through the same route and b) where it is not clear up front who has and who does not have the target condition, or where the patients with the target condition are recruited in a different way or from a different population from the patients without the target condition. When studies used a variety of reference standards, we included all of them. DATA COLLECTION AND ANALYSIS: We screened studies and extracted data independently, in duplicate. We also assessed the risk of bias and applicability concerns independently, in duplicate, using the QUADAS-2 checklist and presented the results of estimated sensitivity and specificity, using paired forest plots, and summarised in tables. 
We used a hierarchical meta-analysis model where appropriate. We presented uncertainty of the accuracy estimates using 95% confidence intervals (CIs). MAIN RESULTS: We included 84 studies, falling into two categories: studies with participants with confirmed diagnoses of COVID-19 at the time of recruitment (71 studies with 6331 participants) and studies with participants suspected of COVID-19 (13 studies with 1948 participants, including three case-control studies with 549 cases and controls). Chest CT was evaluated in 78 studies (8105 participants), chest X-ray in nine studies (682 COVID-19 cases), and chest ultrasound in two studies (32 COVID-19 cases). All evaluations of chest X-ray and ultrasound were conducted in studies with confirmed diagnoses only. Twenty-five per cent (21/84) of all studies were available only as preprints, 15/71 studies in the confirmed cases group and 6/13 of the studies in the suspected group. Among 71 studies that included confirmed cases, 41 studies had included symptomatic cases only, 25 studies had included cases regardless of their symptoms, five studies had included asymptomatic cases only, three of which included a combination of confirmed and suspected cases. Seventy studies were conducted in Asia, two in Europe, two in North America and one in South America. Fifty-one studies included inpatients while the remaining 24 studies were conducted in mixed or unclear settings. Risk of bias was high in most studies, mainly due to concerns about selection of participants and applicability. Among the 13 studies that included suspected cases, nine studies were conducted in Asia, and one in Europe. Seven studies included inpatients while the remaining three studies were conducted in mixed or unclear settings. In studies that included confirmed cases the pooled sensitivity of chest CT was 93.1% (95% CI 90.2 to 95.0; 65 studies, 5759 cases), and for X-ray 82.1% (95% CI 62.5 to 92.7; 9 studies, 682 cases). 
Heterogeneity judged by visual assessment of the ROC plots was considerable. Two studies evaluated the diagnostic accuracy of point-of-care ultrasound and both reported zero false negatives (with 10 and 22 participants having undergone ultrasound, respectively). These studies only reported true positive and false negative data, so it was not possible to pool and derive estimates of specificity. In studies that included suspected cases, the pooled sensitivity of CT was 86.2% (95% CI 71.9 to 93.8; 13 studies, 2346 participants) and specificity was 18.1% (95% CI 3.71 to 55.8). Heterogeneity judged by visual assessment of the forest plots was high. Chest CT may give approximately the same proportion of positive results for patients with and without a SARS-CoV-2 infection: the chances of getting a positive CT result are 86% (95% CI 72 to 94) in patients with a SARS-CoV-2 infection and 82% (95% CI 44 to 96) in patients without. AUTHORS' CONCLUSIONS: The uncertainty resulting from the poor study quality and the heterogeneity of included studies limits our ability to confidently draw conclusions based on our results. Our findings indicate that chest CT is sensitive but not specific for the diagnosis of COVID-19 in suspected patients, meaning that CT may not be capable of differentiating SARS-CoV-2 infection from other causes of respiratory illness. This low specificity could also be the result of the poor sensitivity of the reference standard (RT-PCR), as CT could potentially be more sensitive than RT-PCR in some cases. Because of limited data, accuracy estimates of chest X-ray and ultrasound of the lungs for the diagnosis of COVID-19 should be carefully interpreted. Future diagnostic accuracy studies should avoid cases-only studies and pre-define positive imaging findings. 
Planned updates of this review will aim to: increase precision around the accuracy estimates for CT (ideally with low risk of bias studies); obtain further data to inform accuracy of chest X rays and ultrasound; and continue to search for studies that fulfil secondary objectives to inform the utility of imaging along different diagnostic pathways.
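
The statement in the abstract above that CT is positive in about 82% of suspected patients without infection follows directly from the pooled specificity of 18.1% (95% CI 3.71 to 55.8), because the positive rate among the uninfected is one minus specificity. A short sketch of that arithmetic, with a hypothetical function name:

```python
def positive_rate_without_disease(specificity):
    """Probability of a positive test in people without the condition."""
    return 1.0 - specificity

# Pooled specificity and its 95% CI bounds, taken from the abstract above.
point = positive_rate_without_disease(0.181)
# The interval flips: the upper specificity bound gives the lower rate bound.
lower = positive_rate_without_disease(0.558)
upper = positive_rate_without_disease(0.0371)
print(round(100 * point), round(100 * lower), round(100 * upper))
```

Rounded to whole percentages these values reproduce the figures quoted in the abstract (82%, 44% and 96%).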


Subject(s)
Betacoronavirus , Clinical Laboratory Techniques/methods , Coronavirus Infections/diagnostic imaging , Pneumonia, Viral/diagnostic imaging , Adult , COVID-19 , COVID-19 Testing , Child , Coronavirus Infections/diagnosis , Humans , Lung/diagnostic imaging , Pandemics , Radiography, Thoracic/statistics & numerical data , SARS-CoV-2 , Sensitivity and Specificity , Tomography, X-Ray Computed/statistics & numerical data , Ultrasonography/statistics & numerical data
6.
Cochrane Database Syst Rev ; 7: CD012022, 2020 07 31.
Article in English | MEDLINE | ID: mdl-32735048

ABSTRACT

BACKGROUND: Chronic lymphocytic leukaemia (CLL) is the most common cancer of the lymphatic system in Western countries. Several clinical and biological factors for CLL have been identified. However, it remains unclear which of the available prognostic models combining those factors can be used in clinical practice to predict long-term outcome in people newly-diagnosed with CLL. OBJECTIVES: To identify, describe and appraise all prognostic models developed to predict overall survival (OS), progression-free survival (PFS) or treatment-free survival (TFS) in newly-diagnosed (previously untreated) adults with CLL, and meta-analyse their predictive performances. SEARCH METHODS: We searched MEDLINE (from January 1950 to June 2019 via Ovid), Embase (from 1974 to June 2019) and registries of ongoing trials (to 5 March 2020) for development and validation studies of prognostic models for untreated adults with CLL. In addition, we screened the reference lists and citation indices of included studies. SELECTION CRITERIA: We included all prognostic models developed for CLL which predict OS, PFS, or TFS, provided they combined prognostic factors known before treatment initiation, and any studies that tested the performance of these models in individuals other than the ones included in model development (i.e. 'external model validation studies'). We included studies of adults with confirmed B-cell CLL who had not received treatment prior to the start of the study. We did not restrict the search based on study design. DATA COLLECTION AND ANALYSIS: We developed a data extraction form to collect information based on the Checklist for Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modelling Studies (CHARMS). Independent pairs of review authors screened references, extracted data and assessed risk of bias according to the Prediction model Risk Of Bias ASsessment Tool (PROBAST). 
For models that were externally validated at least three times, we aimed to perform a quantitative meta-analysis of their predictive performance, notably their calibration (proportion of people predicted to experience the outcome who do so) and discrimination (ability to differentiate between people with and without the event) using a random-effects model. When a model categorised individuals into risk categories, we pooled outcome frequencies per risk group (low, intermediate, high and very high). We did not apply GRADE as guidance is not yet available for reviews of prognostic models. MAIN RESULTS: From 52 eligible studies, we identified 12 externally validated models: six were developed for OS, one for PFS and five for TFS. In general, reporting of the studies was poor, especially predictive performance measures for calibration and discrimination; but also basic information, such as eligibility criteria and the recruitment period of participants was often missing. We rated almost all studies at high or unclear risk of bias according to PROBAST. Overall, the applicability of the models and their validation studies was low or unclear; the most common reasons were inappropriate handling of missing data and serious reporting deficiencies concerning eligibility criteria, recruitment period, observation time and prediction performance measures. We report the results for three models predicting OS, which had available data from more than three external validation studies. CLL International Prognostic Index (CLL-IPI): This score includes five prognostic factors: age, clinical stage, IgHV mutational status, B2-microglobulin and TP53 status. Calibration: for the low-, intermediate- and high-risk groups, the pooled five-year survival per risk group from validation studies corresponded to the frequencies observed in the model development study. In the very high-risk group, predicted survival from CLL-IPI was lower than observed from external validation studies. 
Discrimination: the pooled c-statistic of seven external validation studies (3307 participants, 917 events) was 0.72 (95% confidence interval (CI) 0.67 to 0.77). The 95% prediction interval (PI) of this model for the c-statistic, which describes the expected interval for the model's discriminative ability in a new external validation study, ranged from 0.59 to 0.83. Barcelona-Brno score: Aimed at simplifying the CLL-IPI, this score includes three prognostic factors: IgHV mutational status, del(17p) and del(11q). Calibration: for the low- and intermediate-risk group, the pooled survival per risk group corresponded to the frequencies observed in the model development study, although the score seems to overestimate survival for the high-risk group. Discrimination: the pooled c-statistic of four external validation studies (1755 participants, 416 events) was 0.64 (95% CI 0.60 to 0.67); 95% PI 0.59 to 0.68. MDACC 2007 index score: The authors presented two versions of this model including six prognostic factors to predict OS: age, B2-microglobulin, absolute lymphocyte count, gender, clinical stage and number of nodal groups. Only one validation study was available for the more comprehensive version of the model, a formula with a nomogram, while seven studies (5127 participants, 994 events) validated the simplified version of the model, the index score. Calibration: for the low- and intermediate-risk groups, the pooled survival per risk group corresponded to the frequencies observed in the model development study, although the score seems to overestimate survival for the high-risk group. Discrimination: the pooled c-statistic of the seven external validation studies for the index score was 0.65 (95% CI 0.60 to 0.70); 95% PI 0.51 to 0.77. 
AUTHORS' CONCLUSIONS: Despite the large number of published studies of prognostic models for OS, PFS or TFS for newly-diagnosed, untreated adults with CLL, only a minority of these (N = 12) have been externally validated for their respective primary outcome. Three models have undergone sufficient external validation to enable meta-analysis of the model's ability to predict survival outcomes. Lack of reporting prevented us from summarising calibration as recommended. Of the three models, the CLL-IPI shows the best discrimination, despite overestimation. However, performance of the models may change for individuals with CLL who receive improved treatment options, as the models included in this review were tested mostly on retrospective cohorts receiving a traditional treatment regimen. In conclusion, this review shows a clear need to improve the conducting and reporting of both prognostic model development and external validation studies. For prognostic models to be used as tools in clinical practice, the development of the models (and their subsequent validation studies) should adapt to include the latest therapy options to accurately predict performance. Adaptations should be timely.
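
Discrimination, as defined in the abstract above, can be illustrated with a small c-statistic calculation. This is a simplified sketch for a binary outcome with hypothetical data: the c-statistic is the share of event/non-event pairs in which the person with the event received the higher predicted risk, with ties counted as half. The survival models in the review would need a censoring-aware version, which is not shown.

```python
def c_statistic(predicted_risks, outcomes):
    """Concordance for a binary outcome (1 = event, 0 = no event)."""
    events = [r for r, y in zip(predicted_risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(predicted_risks, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0     # model ranks the pair correctly
            elif e == n:
                concordant += 0.5     # tie counts as half
    return concordant / (len(events) * len(non_events))

# Hypothetical predicted risks and observed outcomes for five people.
risks = [0.9, 0.4, 0.7, 0.2, 0.4]
observed = [1, 1, 0, 0, 0]
print(c_statistic(risks, observed))
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking, which gives context for the pooled values of 0.64 to 0.72 reported above.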


Subject(s)
Leukemia, Lymphocytic, Chronic, B-Cell/mortality , Models, Theoretical , Adult , Age Factors , Bias , Biomarkers, Tumor , Calibration , Confidence Intervals , Discriminant Analysis , Disease-Free Survival , Female , Genes, p53/genetics , Humans , Immunoglobulin Heavy Chains/genetics , Immunoglobulin Variable Region/genetics , Leukemia, Lymphocytic, Chronic, B-Cell/pathology , Male , Neoplasm Staging , Prognosis , Progression-Free Survival , Receptors, Antigen, B-Cell/genetics , Reproducibility of Results , Tumor Suppressor Protein p53/genetics
7.
Stat Methods Med Res ; 28(9): 2768-2786, 2019 09.
Article in English | MEDLINE | ID: mdl-30032705

ABSTRACT

It is widely recommended that any developed (diagnostic or prognostic) prediction model is externally validated in terms of its predictive performance measured by calibration and discrimination. When multiple validations have been performed, a systematic review followed by a formal meta-analysis helps to summarize overall performance across multiple settings, and reveals under which circumstances the model performs suboptimally (that is, more poorly) and may need adjustment. We discuss how to undertake meta-analysis of the performance of prediction models with either a binary or a time-to-event outcome. We address how to deal with incomplete availability of study-specific results (performance estimates and their precision), and how to produce summary estimates of the c-statistic, the observed:expected ratio and the calibration slope. Furthermore, we discuss the implementation of frequentist and Bayesian meta-analysis methods, and propose novel empirically-based prior distributions to improve estimation of between-study heterogeneity in small samples. Finally, we illustrate all methods using two examples: meta-analysis of the predictive performance of EuroSCORE II and of the Framingham Risk Score. All examples and meta-analysis models have been implemented in our newly developed R package "metamisc".
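
Two of the performance measures named in the abstract above can be prepared for pooling as sketched below. This is an illustration under assumptions, not the "metamisc" implementation: it assumes the c-statistic is pooled on the logit scale with a delta-method standard error (a common choice that the abstract does not spell out), the observed:expected ratio is computed as a plain quotient, and all function names and numbers are hypothetical.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def c_to_logit(c, se_c):
    """Move a c-statistic and its standard error to the logit scale.

    Delta method: d/dc logit(c) = 1 / (c * (1 - c)).
    """
    return logit(c), se_c / (c * (1.0 - c))

def total_oe_ratio(observed_events, expected_events):
    """Observed:expected ratio; 1.0 means calibration-in-the-large is exact."""
    return observed_events / expected_events

# Hypothetical validation study: c-statistic 0.72 with standard error 0.025,
# 90 observed events against 100 predicted by the model.
logit_c, se_logit_c = c_to_logit(0.72, 0.025)
low = inv_logit(logit_c - 1.96 * se_logit_c)
high = inv_logit(logit_c + 1.96 * se_logit_c)
print(f"c-statistic 95% CI on the original scale: {low:.3f} to {high:.3f}")
print(f"O:E ratio: {total_oe_ratio(90, 100):.2f}")
```

Working on the logit scale keeps back-transformed interval bounds inside 0 to 1, which a symmetric interval on the original scale does not guarantee.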


Subject(s)
Meta-Analysis as Topic , Models, Statistical , Research Design , Risk Assessment/methods , Bayes Theorem , Calibration , Humans , Prognosis , Systematic Reviews as Topic