ABSTRACT
OBJECTIVE: We assessed multiple readers' positive predictive values (PPVs) for ACR BI-RADS category 3, 4a, 4b, 4c and 5 masses on ultrasound (US) before and after the introduction of proposed guidelines.
METHODS: This retrospective, IRB-approved study included four American and four non-American readers, who assigned BI-RADS categories to US images of 374 biopsy-proven masses. The readers were then given the guidelines and re-classified the masses. We assessed the readers' ability to achieve the ACR benchmarks for each BI-RADS category pre- and post-guidelines.
RESULTS: PPVs increased with BI-RADS category. The PPVs pre- and post-guidelines were 6.0% and 4.4% for category 3, 27.3% and 30.5% for category 4a, 49.9% and 51.5% for category 4b, 69.0% and 67.4% for category 4c, and 79.3% and 80.1% for category 5. Readers achieved the PPV benchmark for category 4c but not for categories 3, 4a, 4b and 5, with no significant improvement after the guidelines. Readers who regularly used BI-RADS 4 subcategories missed the benchmarks by a smaller margin than those who did not.
CONCLUSION: Both pre- and post-guidelines, readers' PPVs increased with BI-RADS category. ACR PPV benchmarks were achieved for category 4c but missed for the other categories, especially the critical 4a subcategory, where the PPV was too high. Regular users of BI-RADS 4 subcategories performed better than non-users.
KEY POINTS:
• Readers failed to achieve benchmarks for BI-RADS 4 subcategories, especially 4a.
• USA and Brazilian readers performed similarly in ACR BI-RADS 4 subcategorization.
• Proposed guidelines did not improve overall, USA or Brazilian reader performance.
• Regular BI-RADS 4 subcategory users performed better than non-users.
• US features distinguished benign from malignant masses but not BI-RADS 4 subcategories.