RESUMEN
BACKGROUND@#This study aimed to develop a comprehensive instrument for evaluating and ranking clinical practice guidelines, named Scientific, Transparent and Applicable Rankings tool (STAR), and test its reliability, validity, and usability.@*METHODS@#This study set up a multidisciplinary working group including guideline methodologists, statisticians, journal editors, clinicians, and other experts. Scoping review, Delphi methods, and hierarchical analysis were used to develop the STAR tool. We evaluated the instrument's intrinsic and interrater reliability, content and criterion validity, and usability.@*RESULTS@#STAR contained 39 items grouped into 11 domains. The mean intrinsic reliability of the domains, indicated by Cronbach's α coefficient, was 0.588 (95% confidence interval [CI]: 0.414, 0.762). Interrater reliability as assessed with Cohen's kappa coefficient was 0.774 (95% CI: 0.740, 0.807) for methodological evaluators and 0.618 (95% CI: 0.587, 0.648) for clinical evaluators. The overall content validity index was 0.905. Pearson's r correlation for criterion validity was 0.885 (95% CI: 0.804, 0.932). The mean usability score of the items was 4.6 and the median time spent to evaluate each guideline was 20 min.@*CONCLUSION@#The instrument performed well in terms of reliability, validity, and efficiency, and can be used for comprehensively evaluating and ranking guidelines.