Results 1 - 12 of 12
1.
Radiology ; 311(1): e231991, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38687218

ABSTRACT

Background: Digital breast tomosynthesis (DBT) is often inadequate for screening women with a personal history of breast cancer (PHBC). The ongoing prospective Tomosynthesis or Contrast-Enhanced Mammography, or TOCEM, trial includes three annual screenings with both DBT and contrast-enhanced mammography (CEM). Purpose: To perform interim assessment of cancer yield, stage, and recall rate when CEM is added to DBT in women with PHBC. Materials and Methods: From October 2019 to December 2022, two radiologists interpreted both examinations: Observer 1 reviewed DBT first and then CEM, and observer 2 reviewed CEM first and then DBT. Effects of adding CEM to DBT on incremental cancer detection rate (ICDR), cancer type and node status, recall rate, and other performance characteristics of the primary radiologist decisions were assessed. Results: Among the participants (mean age at entry, 63.6 years ± 9.6 [SD]), 1273, 819, and 227 women with PHBC completed year 1, 2, and 3 screening, respectively. For observer 1, year 1 cancer yield was 20 of 1273 (15.7 per 1000 screenings) for DBT and 29 of 1273 (22.8 per 1000 screenings; ICDR, 7.1 per 1000 screenings [95% CI: 3.2, 13.4]) for DBT plus CEM (P < .001). Year 2 plus 3 cancer yield was four of 1046 (3.8 per 1000 screenings) for DBT and eight of 1046 (7.6 per 1000 screenings; ICDR, 3.8 per 1000 screenings [95% CI: 1.0, 7.6]) for DBT plus CEM (P = .001). Year 1 recall rate for observer 1 was 103 of 1273 (8.1%) for (incidence) DBT alone and 187 of 1273 (14.7%) for DBT plus CEM (difference = 84 of 1273, 6.6% [95% CI: 5.3, 8.1]; P < .001). Year 2 plus 3 recall rate was 40 of 1046 (3.8%) for DBT and 92 of 1046 (8.8%) for DBT plus CEM (difference = 52 of 1046, 5.0% [95% CI: 3.7, 6.3]; P < .001). In 18 breasts with cancer detected only at CEM after integration of both observers, 13 (72%) cancers were invasive (median tumor size, 0.6 cm) and eight of nine (88%) with staging were N0. Among 1883 screenings with adequate reference standard, there were three interval cancers (one at the scar, two in axillae). Conclusion: CEM added to DBT increased early breast cancer detection each year in women with PHBC, with an accompanying approximately 5.0%-6.6% recall rate increase. Clinical trial registration no. NCT04085510. © RSNA, 2024. Supplemental material is available for this article.
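The yield, ICDR, and recall figures quoted for observer 1 can be reproduced from the reported counts. The following is a minimal sketch, not the authors' analysis: the function and key names are hypothetical, the ICDR is assumed to be the simple difference in detected cancers per 1000 screenings (the same women received both examinations), and the confidence intervals are not recomputed.

```python
def per_1000(events: int, screenings: int) -> float:
    """Rate expressed per 1000 screenings."""
    return 1000.0 * events / screenings


def percent(events: int, total: int) -> float:
    """Rate expressed as a percentage."""
    return 100.0 * events / total


# Counts taken from the abstract (observer 1); key names are illustrative.
YEAR_1 = {"n": 1273, "dbt_cancers": 20, "cem_cancers": 29,
          "dbt_recalls": 103, "cem_recalls": 187}
YEAR_2_3 = {"n": 1046, "dbt_cancers": 4, "cem_cancers": 8,
            "dbt_recalls": 40, "cem_recalls": 92}


def summarize(c: dict) -> dict:
    """Cancer yield, ICDR, and recall rates for DBT alone vs DBT plus CEM."""
    return {
        "yield_dbt": per_1000(c["dbt_cancers"], c["n"]),
        "yield_dbt_cem": per_1000(c["cem_cancers"], c["n"]),
        # Assumed definition: additional cancers found per 1000 screenings.
        "icdr": per_1000(c["cem_cancers"] - c["dbt_cancers"], c["n"]),
        "recall_dbt": percent(c["dbt_recalls"], c["n"]),
        "recall_dbt_cem": percent(c["cem_recalls"], c["n"]),
        "recall_increase": percent(c["cem_recalls"] - c["dbt_recalls"], c["n"]),
    }


if __name__ == "__main__":
    for label, counts in (("Year 1", YEAR_1), ("Year 2 plus 3", YEAR_2_3)):
        rounded = {k: round(v, 1) for k, v in summarize(counts).items()}
        print(label, rounded)
```

Rounded to one decimal place, these values agree with the rates stated in the abstract.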


Subject(s)
Breast Neoplasms , Contrast Media , Mammography , Humans , Female , Breast Neoplasms/diagnostic imaging , Mammography/methods , Prospective Studies , Middle Aged , Early Detection of Cancer/methods , Aged , Radiographic Image Enhancement/methods , Breast/diagnostic imaging
4.
Acad Radiol ; 21(4): 445-9, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24314598

ABSTRACT

RATIONALE AND OBJECTIVES: To assess the interaction between the availability of prior examinations and digital breast tomosynthesis (DBT) in decisions to recall a woman during interpretation of mammograms. MATERIALS AND METHODS: Eight radiologists independently interpreted 36 mammography examinations twice, each of which had current and prior full-field digital mammography images (FFDM) and DBT, under a Health Insurance Portability and Accountability Act-compliant, institutional review board-approved protocol (written consent waived). During the first reading, three sequential ratings were provided using FFDM only, followed by FFDM + DBT, and then followed by FFDM + DBT + priors. The second reading included FFDM only, then FFDM + priors, and then FFDM + priors + DBT. Twenty-two benign cases clinically recalled, 12 negative/benign examinations (not recalled), and two verified cancer cases were included. Recall recommendations and interaction between the effect of priors and DBT on decisions were assessed (P = .05 significance level) using a generalized linear model (PROC GLIMMIX, SAS, version 9.3; SAS Institute, Cary, NC) accounting for case and reader variability. RESULTS: Average recall rates in noncancer cases were significantly reduced with the addition of DBT (51%; P < .001) and with the addition of priors (23%; P = .01). In absolute terms, the addition of DBT to FFDM reduced the recall rates from 0.67 to 0.42 and from 0.54 to 0.27 when DBT was available before and after priors, respectively. Recall reductions were from 0.64 to 0.54 and from 0.42 to 0.33 when priors were available before and after DBT, respectively. Regardless of the sequence in presentation, there were no statistically significant interactions between the effect of availability of DBT and priors (P = .80). CONCLUSIONS: The availability of priors and the availability of DBT are independent primary factors in reducing recall recommendations during mammographic interpretations.


Subject(s)
Breast Neoplasms/diagnosis , Diagnostic Errors/prevention & control , Mammography/methods , Radiographic Image Enhancement/methods , Tomography, X-Ray Computed/methods , Adult , Aged , Combined Modality Therapy/methods , False Negative Reactions , Female , Humans , Middle Aged , Observer Variation , Reproducibility of Results , Sensitivity and Specificity
5.
Acad Radiol ; 17(4): 450-5, 2010 Apr.
Article in English | MEDLINE | ID: mdl-20036584

ABSTRACT

RATIONALE AND OBJECTIVES: To compare time to interpretation and diagnostic performance levels during repeat readings of full-field digital mammography (FFDM) and digital breast tomosynthesis (DBT) in a retrospective study. MATERIALS AND METHODS: Three experienced radiologists twice interpreted 125 selected examinations, 35 with verified cancers and 90 negative for cancer, during a period of 22 months using FFDM alone followed by a combined FFDM + DBT mode. Changes in time to "review and rate" these examinations as well as in diagnostic performance levels were assessed. A fixed-effect analysis accounting for cross-correlation due to the review of the same examinations by the same readers was performed. RESULTS: The total (combined) time to review and rate an examination increased on average by 33% between the first and second readings of the same examinations (P < .001). Radiologists reduced their time to review FFDM before making the DBT available for viewing. However, they spent more time reviewing the combined FFDM + DBT mode. The recall rates for examinations depicting cancer remained largely unchanged. Among the groups of examinations with concordant and discordant recall recommendations during the two readings, only the group of examinations that were "newly recalled" during repeat reading took significantly longer (P < .01). CONCLUSION: DBT-based breast imaging may ultimately result in a substantial increase in performance; however, without efficiency improvements DBT may take longer to interpret. Addition of "false-positive recalls" was most strongly associated with increase in interpretation time while elimination of "false-positive recalls" did not require longer interpretation time.


Subject(s)
Breast Neoplasms/diagnostic imaging , Breast Neoplasms/epidemiology , Professional Competence/statistics & numerical data , Tomography, X-Ray Computed/methods , Workload/statistics & numerical data , Female , Humans , Male , Observer Variation , Pennsylvania/epidemiology , Reproducibility of Results , Sensitivity and Specificity , Time Factors
6.
Acad Radiol ; 15(12): 1567-73, 2008 Dec.
Article in English | MEDLINE | ID: mdl-19000873

ABSTRACT

RATIONALE AND OBJECTIVES: To investigate consistency of the orders of performance levels when interpreting mammograms under three different reading paradigms. MATERIALS AND METHODS: We performed a retrospective observer study in which nine experienced radiologists rated an enriched set of mammography examinations that they personally had read in the clinic ("individualized") mixed with a set that none of them had read in the clinic ("common set"). Examinations were interpreted under three different reading paradigms: binary using screening Breast Imaging Reporting and Data System (BI-RADS), receiver-operating characteristic (ROC), and free-response ROC (FROC). The performance in discriminating between cancer and noncancer findings under each of the paradigms was summarized using Youden's index/2+0.5 (Binary), nonparametric area under the ROC curve (AUC), and an overall FROC index (JAFROC-2). Pearson correlation coefficients were then computed to assess consistency in the ordering of observers' performance levels. Statistical significance of the computed correlation coefficients was assessed using bootstrap confidence intervals obtained by resampling sets of examination-specific observations. RESULTS: All but one of the computed pair-wise correlation coefficients were larger than 0.66 and were significantly different from zero. The correlation between the overall performance measures under the Binary and ROC paradigms was the lowest (0.43) and was not significantly different from zero (95% confidence interval -0.078 to 0.733). CONCLUSION: The use of different evaluation paradigms in the laboratory tends to lead to consistent ordering of the overall performance levels of observers. However, one should recognize that conceptually similar performance indexes resulting from different paradigms often measure different performance characteristics and thus disagreements are not only possible but frequently quite natural.
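Two of the summary indexes named above have simple closed forms, sketched here under stated assumptions. The binary index is Youden's index divided by 2 plus 0.5, which equals the mean of sensitivity and specificity. The nonparametric AUC is assumed to be the usual Mann-Whitney (trapezoidal) estimate. All function names are hypothetical, and JAFROC-2 and the bootstrap intervals are not reproduced.

```python
from math import sqrt


def binary_index(sensitivity: float, specificity: float) -> float:
    """Youden's index / 2 + 0.5, i.e. (sensitivity + specificity) / 2."""
    youden = sensitivity + specificity - 1.0
    return youden / 2.0 + 0.5


def nonparametric_auc(cancer_scores, noncancer_scores) -> float:
    """Mann-Whitney AUC: fraction of (cancer, noncancer) pairs ranked
    correctly, with ties counted as one half."""
    wins = 0.0
    for c in cancer_scores:
        for n in noncancer_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cancer_scores) * len(noncancer_scores))


def pearson(x, y) -> float:
    """Pearson correlation between two equal-length sequences,
    e.g. per-reader indexes under two reading paradigms."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)
```

Each reader would contribute one value per paradigm, and `pearson` would then be applied to the two lists of per-reader values to measure how consistently the paradigms order the readers.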


Subject(s)
Breast Neoplasms/diagnostic imaging , Data Interpretation, Statistical , Image Interpretation, Computer-Assisted/methods , Mammography/methods , Observer Variation , Professional Competence , Task Performance and Analysis , Female , Humans , ROC Curve , Reproducibility of Results , Sensitivity and Specificity
7.
Med Phys ; 35(10): 4404-9, 2008 Oct.
Article in English | MEDLINE | ID: mdl-18975686

ABSTRACT

The authors investigated radiologists' performances during retrospective interpretation of screening mammograms when using a binary decision whether to recall a woman for additional procedures or not and compared them with their receiver operating characteristic (ROC) type performance curves using a semi-continuous rating scale. Under an Institutional Review Board approved protocol, nine experienced radiologists independently rated an enriched set of 155 examinations that they had not personally read in the clinic, mixed with other enriched sets of examinations that they had individually read in the clinic, using both a screening BI-RADS rating scale (recall/not recall) and a semi-continuous ROC type rating scale (0 to 100). The vertical distance, namely the difference in sensitivity levels at the same specificity levels, between the empirical ROC curve and the binary operating point was computed for each reader. The vertical distance averaged over all readers was used to assess the proximity of the performance levels under the binary and ROC-type rating scales. There does not appear to be any systematic tendency of the readers towards a better performance when using either of the two rating approaches: four readers performed better using the semi-continuous rating scale, four readers performed better with the binary scale, and one reader had the point exactly on the empirical ROC curve. Only one of the nine readers had a binary "operating point" that was statistically distant from the same reader's empirical ROC curve. Reader-specific differences ranged from -0.046 to 0.128 with an average width of the corresponding 95% confidence intervals of 0.2 and p-values ranging for individual readers from 0.050 to 0.966. On average, radiologists performed similarly when using the two rating scales in that the average distance between the individual readers' binary operating points and their ROC curves was close to zero. The 95% confidence interval for the fixed-reader average (0.016) was (-0.0206, 0.0631) (two-sided p-value 0.35). In conclusion, the authors found that in retrospective observer performance studies the use of a binary response or a semi-continuous rating scale led to consistent results in terms of performance as measured by sensitivity-specificity operating points.
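The vertical distance described above can be sketched as follows. This is an illustration only. The function names are hypothetical, the empirical ROC curve is assumed to be linearly interpolated between operating points, and the sign convention (ROC sensitivity minus binary sensitivity) is an assumption, since the abstract does not state one.

```python
def empirical_roc(cancer_ratings, noncancer_ratings):
    """Empirical ROC points (FPF, TPF), ordered from (0, 0) to (1, 1).
    A case is called positive when its rating is >= the threshold."""
    thresholds = sorted(set(cancer_ratings) | set(noncancer_ratings),
                        reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        tpf = sum(r >= t for r in cancer_ratings) / len(cancer_ratings)
        fpf = sum(r >= t for r in noncancer_ratings) / len(noncancer_ratings)
        points.append((fpf, tpf))
    return points


def roc_tpf_at(points, fpf):
    """Sensitivity of the empirical curve at a given FPF (linear
    interpolation; on a vertical step the upper value is used)."""
    best = None
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= fpf <= x1:
            if x1 == x0:
                y = max(y0, y1)
            else:
                y = y0 + (y1 - y0) * (fpf - x0) / (x1 - x0)
            best = y if best is None else max(best, y)
    return best


def vertical_distance(cancer_ratings, noncancer_ratings,
                      binary_sensitivity, binary_specificity):
    """ROC sensitivity at the binary point's specificity, minus the
    binary sensitivity (assumed sign convention)."""
    points = empirical_roc(cancer_ratings, noncancer_ratings)
    return roc_tpf_at(points, 1.0 - binary_specificity) - binary_sensitivity
```

Under this convention a positive value means the reader's ROC curve lies above the binary operating point, and a value of zero means the point falls exactly on the curve.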


Subject(s)
Breast Neoplasms/diagnostic imaging , Breast Neoplasms/epidemiology , Mammography/statistics & numerical data , Radiographic Image Interpretation, Computer-Assisted/methods , Radiographic Image Interpretation, Computer-Assisted/statistics & numerical data , Task Performance and Analysis , Female , Humans , Laboratories , Observer Variation , Pennsylvania/epidemiology , Reproducibility of Results , Sensitivity and Specificity
8.
Radiology ; 249(1): 47-53, 2008 Oct.
Article in English | MEDLINE | ID: mdl-18682584

ABSTRACT

PURPOSE: To compare radiologists' performance during interpretation of screening mammograms in the clinic with their performance when reading the same mammograms in a retrospective laboratory study. MATERIALS AND METHODS: This study was conducted under an institutional review board-approved, HIPAA-compliant protocol; the need for informed consent was waived. Nine experienced radiologists rated an enriched set of mammograms that they had personally read in the clinic (the "reader-specific" set) mixed with an enriched "common" set of mammograms that none of the participants had previously read in the clinic by using a screening Breast Imaging Reporting and Data System (BI-RADS) rating scale. The original clinical recommendations to recall the women for a diagnostic work-up, for both reader-specific and common sets, were compared with their recommendations during the retrospective experiment. The results are presented in terms of reader-specific and group-averaged sensitivity and specificity levels and the dispersion (spread) of reader-specific performance estimates. RESULTS: On average, the radiologists' performance was significantly better in the clinic than in the laboratory (P = .035). Interreader dispersion of the computed performance levels was significantly lower during the clinical interpretations (P < .01). CONCLUSION: Retrospective laboratory experiments may not represent either expected performance levels or interreader variability during clinical interpretations of the same set of mammograms in the clinical environment well.


Subject(s)
Clinical Competence , Mammography , Female , Humans , Laboratories , Mammography/standards , Retrospective Studies , Sensitivity and Specificity
9.
Radiology ; 238(3): 793-800, 2006 Mar.
Article in English | MEDLINE | ID: mdl-16505392

ABSTRACT

PURPOSE: To prospectively survey women undergoing screening mammography to assess their attitudes toward and preference for the level of recall rates given the possibility that an increase in recall rates may result in earlier detection of cancer. MATERIALS AND METHODS: This HIPAA-compliant survey was performed with an institutional review board-approved protocol. Women who arrived for their routine screening mammographic examination from November 2004 to March 2005 were informed before they consented to participate. The distribution of responses for each survey question was summarized, and proportions for the entire group and different subgroups were computed. The z score statistic was used to assess significant differences between subgroups. RESULTS: Fifteen hundred seventy anonymized questionnaires were collected; 1171 (75%) were from women between 40 and 59 years of age. Of 1528 respondents, 1486 (97%) believed that a false-positive result would not deter them from continuing with regular screening, and most would have been willing to be recalled more often for either a noninvasive (86% [1308 of 1519 respondents]) or an invasive (82% [1248 of 1515 respondents]) procedure if it might increase the chance of detecting a cancer (if present) earlier. Compared with respondents undergoing their initial screening mammographic examination, women who had undergone at least one prior screening examination reported that they were more likely to continue with screening if they had received a previous false-positive result (P = .02). Women younger than 60 years and those previously recalled were more willing to be called back more often for a noninvasive or, when indicated, an invasive procedure (P < .05). CONCLUSION: A substantial fraction of women in this study would have preferred the inconvenience of and anxiety associated with a higher recall rate if it resulted in the possibility of detecting breast cancer earlier.
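The reported percentages follow directly from the response counts, and a subgroup comparison of the kind described can be sketched with a two-proportion z statistic. The abstract does not say which form of the z statistic was used, so the pooled-variance version below is an assumption, and the function name is hypothetical.

```python
from math import erfc, sqrt


def two_proportion_z(x1: int, n1: int, x2: int, n2: int):
    """Pooled two-proportion z statistic and two-sided p-value
    (normal approximation)."""
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    p_value = erfc(abs(z) / sqrt(2.0))
    return z, p_value


if __name__ == "__main__":
    # Proportions reported in the abstract.
    print(round(100 * 1486 / 1528))  # not deterred by a false-positive result
    print(round(100 * 1308 / 1519))  # willing: noninvasive procedure
    print(round(100 * 1248 / 1515))  # willing: invasive procedure
    # Hypothetical subgroup counts, for illustration only (not study data).
    print(two_proportion_z(90, 100, 80, 100))
```

Subgroup counts are not given in the abstract, so the last call uses made-up numbers purely to show how the function is called.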


Subject(s)
Attitude to Health , Breast Neoplasms/diagnostic imaging , Mammography , Patient Acceptance of Health Care/psychology , Adult , Anxiety/psychology , Breast Neoplasms/psychology , Early Diagnosis , Female , Humans , Mass Screening , Middle Aged , Prospective Studies , Surveys and Questionnaires
10.
Acad Radiol ; 12(3): 286-90, 2005 Mar.
Article in English | MEDLINE | ID: mdl-15766687

ABSTRACT

RATIONALE AND OBJECTIVE: To evaluate breast radiologists' recognition of mammograms showing cancers that they correctly detected or "missed" during clinical interpretations. MATERIALS AND METHODS: Two similar experiments were conducted. In the first, 33 bilateral screening mammograms were reviewed by four breast imagers. These included five cancers that each radiologist had detected, two cancers that each radiologist had "missed," and five mammograms recalled by other radiologists that were not cancer. Radiologists were asked if they had interpreted the mammogram in clinic and if the mammogram was suspicious for cancer. In the second experiment, four different breast imagers reviewed 48 mammograms that included five cancers that each radiologist had detected, two cancers that each radiologist had "missed," and five mammograms that were recalled by each radiologist but were not cancer. Using chi-square analysis, the performance of the radiologists on screening mammograms they had read in clinic was compared with their performance on mammograms read in clinic by other radiologists. RESULTS: Seven of eight radiologists did not remember interpreting any of the mammograms in clinic. One radiologist correctly remembered interpreting one mammogram in clinic, but interpreted it incorrectly. Average performance showed no significant difference (P = .60) between mammograms they had interpreted in clinic and those interpreted by others. CONCLUSION: Radiologists do not remember most mammograms showing cancer that they have interpreted, either correctly or incorrectly, after they are mixed with mammograms showing cancer that were interpreted by other radiologists. Screening mammograms can be used in observer performance studies in which the interpreting radiologist participates as an observer.


Subject(s)
Breast Neoplasms/diagnostic imaging , Mammography , Memory , Radiology/standards , Diagnostic Errors , Employee Performance Appraisal , False Negative Reactions , False Positive Reactions , Female , Humans , Male , Mammography/standards , Observer Variation , Research Design , Retrospective Studies
11.
Cancer ; 100(8): 1590-4, 2004 Apr 15.
Article in English | MEDLINE | ID: mdl-15073844

ABSTRACT

BACKGROUND: The authors investigated the correlation between recall and detection rates in a group of 10 radiologists who had read a high volume of screening mammograms in an academic institution. METHODS: Practice-related and outcome-related databases of verified cases were used to compute recall rates and tumor detection rates for a group of 10 Mammography Quality Standards Act (MQSA)-certified radiologists who interpreted a total of 98,668 screening mammograms during the years 2000, 2001, and 2002. The relation between recall and detection rates for these individuals was investigated using parametric Pearson (r) and nonparametric Spearman (rho) correlation coefficients. The effect of the volume of mammograms interpreted by individual radiologists was assessed using partial correlations controlling for total reading volumes. RESULTS: A wide variability of recall rates (range, 7.7-17.2%) and detection rates (range, 2.6-5.4 per 1000 mammograms) was observed in the current study. A statistically significant correlation (P < 0.05) between recall and detection rates was observed in this group of 10 experienced radiologists. The results remained significant (P < 0.05) after accounting for the volume of mammograms interpreted by each radiologist. CONCLUSIONS: Optimal performance in screening mammography should be evaluated quantitatively. The general pressure to reduce recall rates through "practice guidelines" to below a fixed level for all radiologists should be assessed carefully.
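The three correlation measures named in the methods can be sketched with the standard library alone. This is an illustration under assumptions: Spearman's rho is computed as Pearson's r on average ranks, the partial correlation uses the standard first-order formula, and all function names are hypothetical. No per-radiologist data appear in the abstract, so none are reproduced here.

```python
from math import sqrt


def pearson(x, y) -> float:
    """Parametric Pearson correlation coefficient (r)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y) -> float:
    """Nonparametric Spearman correlation (rho): Pearson's r on ranks."""
    return pearson(ranks(x), ranks(y))


def partial_correlation(x, y, z) -> float:
    """Correlation of x and y controlling for z, e.g. recall rate and
    detection rate controlling for reading volume."""
    rxy = pearson(x, y)
    rxz = pearson(x, z)
    ryz = pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1.0 - rxz ** 2) * (1.0 - ryz ** 2))
```

In the setting of this study, `x` and `y` would hold one recall rate and one detection rate per radiologist, and `z` the corresponding total reading volumes.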


Subject(s)
Breast Neoplasms/diagnostic imaging , Mammography , Mass Screening , Practice Guidelines as Topic , Databases, Factual , Female , Humans , Observer Variation , Practice Patterns, Physicians'/statistics & numerical data , Quality Assurance, Health Care , Radiology/statistics & numerical data , Sensitivity and Specificity
12.
Acad Radiol ; 10(3): 283-8, 2003 Mar.
Article in English | MEDLINE | ID: mdl-12643555

ABSTRACT

RATIONALE AND OBJECTIVES: The authors evaluated performance changes in the detection of masses on "current" (latest) and "prior" images by computer-aided diagnosis (CAD) schemes that had been optimized with databases of current and prior mammograms. MATERIALS AND METHODS: The authors selected 260 pairs of matched consecutive mammograms. Each current image depicted one or two verified masses. All prior images had been interpreted originally as negative or probably benign. A CAD scheme initially detected 261 mass regions and 465 false-positive regions on the current images, and 252 corresponding mass regions (early signs) and 471 false-positive regions on prior images. These regions were divided into two training and two testing databases. The current and prior training databases were used to optimize two CAD schemes with a genetic algorithm. These schemes were evaluated with two independent testing databases. RESULTS: The scheme optimized with current images produced areas under the receiver operating characteristic curve of 0.89 +/- 0.01 and 0.65 +/- 0.02 when tested with current images and prior images, respectively. The scheme optimized with prior images produced areas under the receiver operating characteristic curve of 0.81 +/- 0.02 and 0.71 +/- 0.02 when tested with current images and prior images, respectively. Performance changes for both current and prior testing databases were significant (P < .01) for the two schemes. CONCLUSION: CAD schemes trained with current images do not perform optimally in detecting masses depicted on prior images. To optimize CAD schemes for early detection, it may be important to include in the training database a large fraction of prior images originally reported as negative and later proven to be positive.


Subject(s)
Breast Diseases/diagnostic imaging , Mammography , Radiographic Image Interpretation, Computer-Assisted , Algorithms , Breast Neoplasms/diagnostic imaging , False Positive Reactions , Female , Humans , ROC Curve