Search | Portal Regional da BVS

1.

Defining tumor growth in vestibular schwannomas: a volumetric inter-observer variability study in contrast-enhanced T1-weighted MRI.

Cornelissen, Stefan; Schouten, Sammy M; Langenhuizen, Patrick P J H; Lie, Suan Te; Kunst, Henricus P M; de With, Peter H N; Verheul, Jeroen B.

Neuroradiology ; 2024 Jul 09.

Article in English | MEDLINE | ID: mdl-38980343

ABSTRACT

PURPOSE: For patients with vestibular schwannomas (VS), a conservative observational approach is increasingly used. Accurate and reliable volumetric tumor monitoring is therefore important. Currently, a volumetric cutoff of a 20% increase in tumor volume is widely used to define tumor growth in VS. This study investigates how the limits of agreement (LoA) for volumetric measurements of VS depend on tumor volume, by means of an inter-observer study. METHODS: This retrospective study included 100 VS patients who underwent contrast-enhanced T1-weighted MRI. Five observers volumetrically annotated the images. Observer agreement and reliability were measured using the LoA, estimated using the limits of agreement with the mean (LOAM) method, and the intraclass correlation coefficient (ICC). RESULTS: The 100 patients had a median average tumor volume of 903 mm³ (IQR: 193-3101). Patients were divided into four volumetric size categories based on tumor volume quartile. The smallest tumor volume quartile showed a LOAM relative to the mean of 26.8% (95% CI: 23.7-33.6), whereas for the largest tumor volume quartile this figure was 7.3% (95% CI: 6.5-9.7), and 4.8% (95% CI: 4.2-6.2) when peritumoral cysts were excluded. CONCLUSION: Agreement limits within volumetric annotation of VS are affected by tumor volume, since the LoA improves with increasing tumor volume. As a result, for tumors larger than 200 mm³, growth can reliably be detected at an earlier stage than with the currently widely used cutoff of 20%. However, for very small tumors, growth should be assessed with higher agreement limits than previously thought.

2.

Reliability of study endpoint adjudication in a pragmatic trial on brain arteriovenous malformations.

Darsaut, Tim E; Benomar, Anass; Magro, Elsa; Gentric, Jean-Christophe; Heppner, Jonathan; Lopez, Camille; Jabre, Roland; Roy, Daniel; Gevry, Guylaine; Raymond, Jean.

Neurochirurgie ; 70(4): 101566, 2024 May 14.

Article in English | MEDLINE | ID: mdl-38749318

ABSTRACT

BACKGROUND: The results of a clinical trial are given in terms of primary and secondary outcomes that are obtained for each patient. Just as an instrument should provide the same result when the same object is measured repeatedly, the agreement of the adjudication of a clinical outcome between various raters is fundamental to interpreting study results. The reliability of the adjudication of study endpoints determined by examination of the electronic case report forms of a pragmatic trial has not previously been tested. METHODS: The electronic case report forms of 62/434 (14%) patients selected to be observed in a study on brain AVMs were independently examined twice (4 weeks apart) by 8 raters who judged whether each patient had reached the following study endpoints: (1) new intracranial hemorrhage related to AVM or to treatment; (2) new non-hemorrhagic neurological event; (3) increase in mRS ≥1; (4) serious adverse events (SAE). Inter- and intra-rater reliability were assessed using Gwet's AC1 (κG) statistics, and correlations with mRS score using Cramer's V test. RESULTS: There was almost perfect agreement for intracranial hemorrhage (92% agreement; κG = 0.84 [95% CI: 0.76-0.93]), and substantial agreement for SAEs (88% agreement; κG = 0.77 [95% CI: 0.67-0.86]) and new non-hemorrhagic neurological event (80% agreement; κG = 0.61 [95% CI: 0.50-0.72]). Most endpoints correlated (V = 0.21-0.57) with an increase in mRS of ≥1, an endpoint which was itself moderately reliable (76% agreement; κG = 0.54 [95% CI: 0.43-0.64]). CONCLUSION: Study endpoints of a pragmatic trial were shown to be reliable. More studies on the reliability of pragmatic trial endpoints are needed.
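As an illustration of the agreement statistic named in this abstract, the following is a minimal sketch of Gwet's AC1 for the simplified case of two raters and nominal categories. It is not the study's analysis, which involved 8 raters; the function name `gwet_ac1` and the example ratings are hypothetical.

```python
from collections import Counter


def gwet_ac1(rater_a, rater_b):
    """Gwet's AC1 for two raters who rated the same items (nominal labels)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    q = len(categories)
    if q < 2:
        return 1.0  # only one category was ever used: agreement is trivial

    # Observed agreement: share of items on which both raters gave the same label.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: based on the average marginal proportion of each category.
    pooled = Counter(rater_a) + Counter(rater_b)
    p_chance = sum(
        (pooled[c] / (2 * n)) * (1 - pooled[c] / (2 * n)) for c in categories
    ) / (q - 1)

    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical example: 1 = endpoint reached, 0 = endpoint not reached.
first_rater = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1]
second_rater = [1, 1, 1, 1, 0, 0, 0, 1, 1, 1]
print(round(gwet_ac1(first_rater, second_rater), 3))
```

Unlike Cohen's kappa, the chance term here shrinks when one category dominates, which is one reason AC1 is often preferred for endpoints with low prevalence.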

3.

When Two Eyes Don't Suffice-Learning Difficult Hyperfluorescence Segmentations in Retinal Fundus Autofluorescence Images via Ensemble Learning.

Santarossa, Monty; Beyer, Tebbo Tassilo; Scharf, Amelie Bernadette Antonia; Tatli, Ayse; von der Burchard, Claus; Nazarenus, Jakob; Roider, Johann Baptist; Koch, Reinhard.

J Imaging ; 10(5)2024 May 09.

Article in English | MEDLINE | ID: mdl-38786570

ABSTRACT

Hyperfluorescence (HF) and reduced autofluorescence (RA) are important biomarkers in fundus autofluorescence images (FAF) for the assessment of health of the retinal pigment epithelium (RPE), an important indicator of disease progression in geographic atrophy (GA) or central serous chorioretinopathy (CSCR). Autofluorescence images have been annotated by human raters, but distinguishing biomarkers (whether signals are increased or decreased) from the normal background proves challenging, with borders being particularly open to interpretation. Consequently, significant variations emerge among different graders, and even within the same grader during repeated annotations. Tests on in-house FAF data show that even highly skilled medical experts, despite previously discussing and settling on precise annotation guidelines, reach a pair-wise agreement measured in a Dice score of no more than 63-80% for HF segmentations and only 14-52% for RA. The data further show that the agreement of our primary annotation expert with herself is a 72% Dice score for HF and 51% for RA. Given these numbers, the task of automated HF and RA segmentation cannot simply be refined to the improvement in a segmentation score. Instead, we propose the use of a segmentation ensemble. Learning from images with a single annotation, the ensemble reaches expert-like performance with an agreement of a 64-81% Dice score for HF and 21-41% for RA with all our experts. In addition, utilizing the mean predictions of the ensemble networks and their variance, we devise ternary segmentations where FAF image areas are labeled either as confident background, confident HF, or potential HF, ensuring that predictions are reliable where they are confident (97% Precision), while detecting all instances of HF (99% Recall) annotated by all experts.
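The two ideas in this abstract, Dice overlap between annotations and a ternary labeling derived from the ensemble mean and variance, can be sketched as follows. This is a simplified illustration on flat pixel lists, not the authors' implementation; the function names and the thresholds `low`, `high`, and `max_var` are assumptions chosen only for the example.

```python
from statistics import fmean, pvariance


def dice_score(mask_a, mask_b):
    """Dice overlap of two binary masks given as flat sequences of 0/1."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of pixels")
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    # Convention (an assumption): two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * overlap / total


def ternary_labels(member_probs, low=0.2, high=0.8, max_var=0.05):
    """Label each pixel from per-member HF probabilities.

    member_probs holds one list of per-pixel probabilities per ensemble member.
    The thresholds are hypothetical, not values from the paper.
    """
    labels = []
    for pixel_probs in zip(*member_probs):
        mean = fmean(pixel_probs)
        var = pvariance(pixel_probs)
        if var <= max_var and mean >= high:
            labels.append("HF")            # members agree on HF
        elif var <= max_var and mean <= low:
            labels.append("background")    # members agree on background
        else:
            labels.append("potential HF")  # members disagree or are unsure
    return labels


print(dice_score([1, 1, 0, 0], [1, 0, 0, 0]))
print(ternary_labels([[0.9, 0.1, 0.5], [0.95, 0.05, 0.9]]))
```

Keeping a separate "potential HF" class is what lets the confident classes stay precise while the union of "HF" and "potential HF" stays sensitive.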

4.

Inter- and intra-observer variability of qualitative visual breast-composition assessment in mammography among Japanese physicians: a first multi-institutional observer performance study in Japan.

Koyama, Yoichi; Nakashima, Kazuaki; Orihara, Shunichiro; Tsunoda, Hiroko; Kimura, Fuyo; Uenaka, Natsuki; Ban, Kanako; Michishita, Yukiko; Kanemaki, Yoshihide; Kurihara, Arisa; Tawaraya, Kanae; Taguri, Masataka; Ishikawa, Takashi; Uematsu, Takayoshi.

Breast Cancer ; 31(4): 671-683, 2024 Jul.

Article in English | MEDLINE | ID: mdl-38619787

ABSTRACT

BACKGROUND: Visual assessment of mammographic breast composition remains the most common approach worldwide, although subjective variability limits its reproducibility. This study aimed to investigate the inter- and intra-observer variability in qualitative visual assessment of mammographic breast composition through a multi-institutional observer performance study for the first time in Japan. METHODS: This study enrolled 10 Japanese physicians from five different institutions. They used the new Japanese breast-composition classification system 4th edition to subjectively evaluate the breast composition in 200 pairs of right and left normal mediolateral oblique mammograms (number determined using precise sample size calculations) twice, with a 1-month interval (median patient age: 59 years [range 40-69 years]). The primary endpoint of this study was the inter-observer variability using the kappa (κ) value. RESULTS: Inter-observer variability for the four and two classes of breast-composition assessment revealed moderate agreement (Fleiss' κ: first and second reading = 0.553 and 0.587, respectively) and substantial agreement (Fleiss' κ: first and second reading = 0.689 and 0.70, respectively). Intra-observer variability for the four and two classes of breast-composition assessment demonstrated substantial agreement (Cohen's κ, median = 0.758) and almost perfect agreement (Cohen's κ, median = 0.813). Assessments of consensus between the 10 physicians and the automated software Volpara® revealed slight agreement (Cohen's κ; first and second reading: 0.104 and 0.075, respectively). CONCLUSIONS: Qualitative visual assessment of mammographic breast composition using the new Japanese classification revealed excellent intra-observer reproducibility. However, inter-observer variability persists, presenting a challenge in establishing it as the gold standard in Japan.
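For readers unfamiliar with the multi-rater statistic reported above, here is a minimal sketch of Fleiss' kappa. It assumes every item was rated by the same number of raters; the function name `fleiss_kappa` and the small count table are hypothetical and unrelated to the study data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa.

    counts[i][j] is the number of raters who assigned item i to category j.
    Every item must have been rated by the same number of raters.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    # Per-item agreement: share of rater pairs that chose the same category.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement from the overall category proportions.
    n_categories = len(counts[0])
    proportions = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_chance = sum(p * p for p in proportions)

    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical table: 4 mammograms, 3 readers, 2 classes (non-dense, dense).
table = [[3, 0], [2, 1], [0, 3], [1, 2]]
print(round(fleiss_kappa(table), 3))
```

The input is a count table rather than raw labels, so the same function serves both the four-class and the two-class reading once the counts are collapsed accordingly.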

Subjects

Breast Neoplasms; Mammography; Observer Variation; Humans; Middle Aged; Female; Mammography/methods; Adult; Japan; Aged; Reproducibility of Results; Breast Neoplasms/diagnostic imaging; Breast/diagnostic imaging; Breast/pathology; Physicians; Breast Density

5.

Reproducibility of radiomics quality score: an intra- and inter-rater reliability study.

Akinci D'Antonoli, Tugba; Cavallo, Armando Ugo; Vernuccio, Federica; Stanzione, Arnaldo; Klontzas, Michail E; Cannella, Roberto; Ugga, Lorenzo; Baran, Agah; Fanni, Salvatore Claudio; Petrash, Ekaterina; Ambrosini, Ilaria; Cappellini, Luca Alessandro; van Ooijen, Peter; Kotter, Elmar; Pinto Dos Santos, Daniel; Cuocolo, Renato.

Eur Radiol ; 34(4): 2791-2804, 2024 Apr.

Article in English | MEDLINE | ID: mdl-37733025

ABSTRACT

OBJECTIVES: To investigate the intra- and inter-rater reliability of the total radiomics quality score (RQS) and the reproducibility of individual RQS items' score in a large multireader study. METHODS: Nine raters with different backgrounds were randomly assigned to three groups based on their proficiency with RQS utilization: Groups 1 and 2 represented the inter-rater reliability groups with or without prior training in RQS, respectively; group 3 represented the intra-rater reliability group. Thirty-three original research papers on radiomics were evaluated by raters of groups 1 and 2. Of the 33 papers, 17 were evaluated twice with an interval of 1 month by raters of group 3. The intraclass correlation coefficient (ICC) for continuous variables, and Fleiss' and Cohen's kappa (k) statistics for categorical variables were used. RESULTS: The inter-rater reliability was poor to moderate for total RQS (ICC 0.30-0.55, p < 0.001) and very low to good for item's reproducibility (k -0.12 to 0.75) within groups 1 and 2 for both inexperienced and experienced raters. The intra-rater reliability for total RQS was moderate for the less experienced rater (ICC 0.522, p = 0.009), whereas experienced raters showed excellent intra-rater reliability (ICC 0.91-0.99, p < 0.001) between the first and second read. Intra-rater reliability on RQS items' score reproducibility was higher and most of the items had moderate to good intra-rater reliability (k -0.40 to 1). CONCLUSIONS: Reproducibility of the total RQS and the score of individual RQS items is low. There is a need for a robust and reproducible assessment method to assess the quality of radiomics research. CLINICAL RELEVANCE STATEMENT: There is a need for reproducible scoring systems to improve quality of radiomics research and consecutively close the translational gap between research and clinical implementation. KEY POINTS: • Radiomics quality score has been widely used for the evaluation of radiomics studies. • Although the intra-rater reliability was moderate to excellent, intra- and inter-rater reliability of total score and point-by-point scores were low with radiomics quality score. • A robust, easy-to-use scoring system is needed for the evaluation of radiomics research.

Subjects

Radiomics; Reading; Humans; Observer Variation; Reproducibility of Results

6.

Organ-contour-driven auto-matching algorithm in image-guided radiotherapy.

Kishigami, Yukako; Nakamura, Mitsuhiro; Okamoto, Hiroyuki; Takahashi, Ayaka; Iramina, Hiraku; Sasaki, Makoto; Kawata, Kohei; Igaki, Hiroshi.

J Appl Clin Med Phys ; 25(1): e14220, 2024 Jan.

Article in English | MEDLINE | ID: mdl-37994694

ABSTRACT

PURPOSE: This study aimed to demonstrate the potential clinical applicability of an organ-contour-driven auto-matching algorithm in image-guided radiotherapy. METHODS: This study included eleven consecutive patients with cervical cancer who underwent radiotherapy in 23 or 25 fractions. Daily and reference magnetic resonance images were converted into mesh models. A weight-based algorithm was implemented to optimize the distance between the mesh model vertices and surface of the reference model during the positioning process. Within the cost function, weight parameters were employed to prioritize specific organs for positioning. In this study, three scenarios with different weight parameters were prepared. The optimal translation and rotation values for the cervix and uterus were determined based on the calculated translations alone or in combination with rotations, with a rotation limit of ±3°. Subsequently, the coverage probabilities of the following two planning target volumes (PTV), an isotropic 5 mm and anisotropic margins derived from a previous study, were evaluated. RESULTS: The percentage of translations exceeding 10 mm varied from 9% to 18% depending on the scenario. For small PTV sizes, more than 80% of all fractions had a coverage of 80% or higher. In contrast, for large PTV sizes, more than 90% of all fractions had a coverage of 95% or higher. The difference between the median coverage with translational positioning alone and that with both translational and rotational positioning was 1% or less. CONCLUSION: This algorithm facilitates quantitative positioning by utilizing a cost function that prioritizes organs for positioning. Consequently, consistent displacement values were algorithmically generated. This study also revealed that the impact of rotational corrections, limited to ±3°, on PTV coverage was minimal.

Subjects

Radiotherapy, Image-Guided; Radiotherapy, Intensity-Modulated; Female; Humans; Radiotherapy, Image-Guided/methods; Radiotherapy Dosage; Radiotherapy Planning, Computer-Assisted/methods; Radiotherapy, Intensity-Modulated/methods; Algorithms

7.

Utility of Indigenously Developed Square Grid Method for Evaluation of Tumor-Stroma Ratio and Stromal Tumor-Infiltrating Lymphocytes in Invasive Breast Carcinoma: A Pilot Study.

Kumarguru, B N; Ramaswamy, A S; Arathi, C A; Swathi, D.

Iran J Pathol ; 18(3): 335-346, 2023.

Article in English | MEDLINE | ID: mdl-37942205

ABSTRACT

Background & Objective: Invasive breast carcinoma (IBC) is the most commonly diagnosed cancer among women in India. The conventional visual method of evaluation of Tumor-Stroma Ratio (TSR) and Stromal Tumor-Infiltrating Lymphocytes (sTIL) appears to be subjective. The present study aims to evaluate the utility of the indigenously designed square grid method for the evaluation of tumor-stroma ratio and stromal tumor-infiltrating lymphocytes in invasive breast carcinoma by assessing the inter-observer variability. Methods: This was a retrospective study conducted at a rural tertiary care referral institute from July 2018 to June 2020. In each case, microphotographs were taken from 10 representative fields in H&E-stained sections for evaluating TSR in low-power and sTIL in high-power. Both parameters were evaluated employing an indigenously designed square grid applied onto microphotographs in PowerPoint slides by making use of principles of the Pythagorean theorem. Both parameters were separately evaluated by two pathologists. Cohen's kappa statistic was the statistical tool used to analyze inter-observer variability. Results: Thirty cases were analyzed. Invasive breast carcinoma of no special type (IBC-NST) was the most common histopathological type (26 cases (86.67%)). For TSR evaluation, a Kappa value of 0.78 suggested substantial agreement with an agreement of 91.67%. For sTIL evaluation, a Kappa value of 0.51 suggested moderate agreement with an agreement of 88.33%. The P-values were statistically highly significant (P<0.001). Conclusion: Square grid method is a novel technique for evaluating TSR and sTIL in invasive breast carcinoma. It can be considered an example of the application of Pythagoras' theorem in Pathology.

8.

Concordancia inter e intraobservador en la medida de los agujeros maculares por tomografía de coherencia óptica / Inter- and intra-observer agreement in the measurement of macular holes by optical coherence tomography

Gil-Hernández, I; Vidal-Olivera, L; Alarcón-Correcher, F; López-Montero, A; García-Ibor, F; Ruiz-del Río, N; Duch-Samper, A. M.

Arch. Soc. Esp. Oftalmol ; 98(11): 614-618, Nov. 2023. illus., tab.

Article in Spanish | IBECS | ID: ibc-227199

ABSTRACT

Background and objective: A full-thickness macular hole (FTMH) is a foveal lesion caused by a defect in the full thickness of the neurosensory retina. Its diagnosis and the indication for surgical treatment take into account the measurement of the hole according to the tool provided by the OCT. This measurement can be performed by several ophthalmologists during the follow-up of a patient. The aim of this study is to find out whether there is intra-individual and inter-individual variability in these measurements. Material and methods: Retrospective review of OCT b-scan images with a diagnosis of FTMH. Measurements of the minimum diameter of the FTMH were performed using the manual tool available on the DRI-Triton (Topcon, Japan) at 1:1 and 1:2 scales, on different days, by 2 retina specialists and 2 residents. These measurements were compared to assess inter-observer and intra-observer correspondence. Results: Thirty-four images were analysed. For intra-observer variability, a correlation index higher than 0.98 was obtained in all cases. For inter-observer variability, the intra-class correlation coefficient was 0.94 (95% CI: 0.91-0.97) for the 1:1 scale, and 0.94 (95% CI: 0.91-0.97) for the 1:2 scale. Conclusions: OCT-measured FTMH size values are reproducible between retina specialists and residents and are independent of the imaging scale at which the measurement is made (AU)

Subjects

Humans; Retinal Perforations/diagnostic imaging; Observer Variation; Tomography, Optical Coherence; Retrospective Studies

9.

Inter-observer variation of target volume delineation for CT-guided cervical cancer brachytherapy.

Elmali, Aysenur; Biltekin, Fatih; Sari, Sezin Yuce; Gultekin, Melis; Yuce, Deniz; Yildiz, Ferah.

J Contemp Brachytherapy ; 15(4): 253-260, 2023 Aug.

Article in English | MEDLINE | ID: mdl-37799120

ABSTRACT

Purpose: Delineation is a critical and challenging step in radiotherapy planning. Differences in delineation among observers are common, despite the presence of contouring guidelines. This study aimed to identify the inter-observer variability in the target volume delineation of computed tomography (CT)-guided brachytherapy for cervical cancer. Material and methods: Four radiation oncologists (ROs) with different expertise levels delineated high-risk (HR) and intermediate-risk (IR) clinical target volume (CTV) according to GYN GEC-ESTRO recommendations, in a blinded manner on every CT set of ten locally advanced cervical cancer cases. The most experienced RO's contours were determined as the index and used for comparison. Dice similarity coefficient (DSC) and pairwise Hausdorff distance (HD) metrics were applied to compare the overlap and gross deviations of all contours. Results: Median DSC for HR-CTV and IR-CTV were 0.73 and 0.76, respectively, and a good concordance was achieved for both in the majority of contours. While there was no difference in DSC measurements for HR-CTV among the three ROs, RO-3 provided improved DSC values for IR-CTV (p = 0.01). Median HD95 was 5.02 mm and 6.83 mm, and median HDave was 1.69 mm and 2.21 mm for HR-CTV and IR-CTV, respectively. There was no significant difference among ROs in HR-CTV for HD95 or HDave; however, IR-CTV values were significantly better for RO-3 (p = 0.01). Case-by-case HD analysis showed no significant inter-observer variations, except for two cases. Conclusions: The inter-observer agreement is generally high for target volumes in CT-guided brachytherapy for cervical cancer. The agreement is lower for IR-CTV than HR-CTV. The individual characteristics of each case and different expertise levels of the ROs may have caused the differences. Despite the good concordance for delineation, dosimetric consequences can still be clinically significant.
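The distance metrics named in this abstract can be illustrated with a short sketch on contour point sets. Definitions of HD95 and the average HD vary between tools; the version below (symmetric maximum of the directed 95th percentiles with a nearest-rank percentile, and the mean of the two directed averages) is one common convention and an assumption here, not necessarily what the study used. Function names and the example points are hypothetical.

```python
import math


def directed_distances(points_a, points_b):
    """For each point of contour A, the distance to its nearest point of B."""
    return [min(math.dist(p, q) for q in points_b) for p in points_a]


def percentile_95(values):
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(0.95 * len(ordered)))
    return ordered[rank - 1]


def hausdorff_metrics(points_a, points_b):
    """Maximum, 95th-percentile, and average Hausdorff distances (same units)."""
    d_ab = directed_distances(points_a, points_b)
    d_ba = directed_distances(points_b, points_a)
    return {
        "hd_max": max(max(d_ab), max(d_ba)),
        "hd95": max(percentile_95(d_ab), percentile_95(d_ba)),
        "hd_avg": (sum(d_ab) / len(d_ab) + sum(d_ba) / len(d_ba)) / 2,
    }


# Hypothetical contour points in millimetres.
index_contour = [(0.0, 0.0), (1.0, 0.0)]
other_contour = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
print(hausdorff_metrics(index_contour, other_contour))
```

The brute-force nearest-point search is quadratic in the number of points, which is acceptable for a sketch but would usually be replaced by a spatial index for full-resolution contours.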

10.

Inter-observer variability in library plan selection on iterative CBCT and synthetic CT images of cervical cancer patients.

de Hond, Yvonne J M; van Haaren, Paul M A; Verrijssen, An-Sofie E; Tijssen, Rob H N; Hurkmans, Coen W.

J Appl Clin Med Phys ; 24(11): e14170, 2023 Nov.

Article in English | MEDLINE | ID: mdl-37788333

ABSTRACT

INTRODUCTION: In the Library-of-Plans (LoP) approach, correct plan selection is essential for delivering radiotherapy treatment accurately. However, poor image quality of the cone-beam computed tomography (CBCT) may introduce inter-observer variability and thereby hamper accurate plan selection. In this study, we investigated whether new techniques to improve the CBCT image quality and improve consistency in plan selection affect the accuracy of LoP selection in cervical cancer patients. MATERIALS AND METHODS: CBCT images of 12 patients were used to investigate the inter-observer variability of plan selection based on different CBCT image types. Six observers were asked to individually select a plan based on clinical X-ray Volumetric Imaging (XVI) CBCT, iterative reconstructed CBCT (iCBCT) and synthetic CTs (sCT). Selections were performed before and after a consensus meeting with the entire group, in which guidelines were created. A scoring by all observers on the image quality and plan selection procedure was also included. For plan selection, Fleiss' kappa (κ) statistical test was used to determine the inter-observer variability within one image type. RESULTS: The agreement between observers was significantly higher on sCT compared to CBCT. The consensus meeting improved the duration and inter-observer variability. In this manuscript, the guidelines attributed the overall results in the plan selection. Before the meeting, the gold standard was selected in 76% of the cases on XVI CBCT, 74% on iCBCT, and 76% on sCT. After the meeting, the gold standard was selected in 83% of the cases on XVI CBCT, 81% on iCBCT, and 90% on sCT. CONCLUSION: The use of sCTs can increase the agreement of plan selection among observers and the gold standard was indicated to be selected more often. It is important that clear guidelines for plan selection are implemented in order to benefit from the increased image quality, accurate selection, and decreased inter-observer variability.

Subjects

Spiral Cone-Beam Computed Tomography; Uterine Cervical Neoplasms; Female; Humans; Uterine Cervical Neoplasms/diagnostic imaging; Uterine Cervical Neoplasms/radiotherapy; Observer Variation; Radiotherapy Planning, Computer-Assisted/methods; Cone-Beam Computed Tomography/methods

11.

Automated Breast Density Assessment in MRI Using Deep Learning and Radiomics: Strategies for Reducing Inter-Observer Variability.

Jing, Xueping; Wielema, Mirjam; Monroy-Gonzalez, Andrea G; Stams, Thom R G; Mahesh, Shekar V K; Oudkerk, Matthijs; Sijens, Paul E; Dorrius, Monique D; van Ooijen, Peter M A.

J Magn Reson Imaging ; 2023 Oct 17.

Article in English | MEDLINE | ID: mdl-37846440

ABSTRACT

BACKGROUND: Accurate breast density evaluation allows for more precise risk estimation but suffers from high inter-observer variability. PURPOSE: To evaluate the feasibility of reducing inter-observer variability of breast density assessment through artificial intelligence (AI) assisted interpretation. STUDY TYPE: Retrospective. POPULATION: Six hundred and twenty-one patients without breast prosthesis or reconstructions were randomly divided into training (N = 377), validation (N = 98), and independent test (N = 146) datasets. FIELD STRENGTH/SEQUENCE: 1.5 T and 3.0 T; T1-weighted spectral attenuated inversion recovery. ASSESSMENT: Five radiologists independently assessed each scan in the independent test set to establish the inter-observer variability baseline and to reach a reference standard. Deep learning and three radiomics models were developed for three classification tasks: (i) four Breast Imaging-Reporting and Data System (BI-RADS) breast composition categories (A-D), (ii) dense (categories C, D) vs. non-dense (categories A, B), and (iii) extremely dense (category D) vs. moderately dense (categories A-C). The models were tested against the reference standard on the independent test set. AI-assisted interpretation was performed by majority voting between the models and each radiologist's assessment. STATISTICAL TESTS: Inter-observer variability was assessed using linear-weighted kappa (κ) statistics. Kappa statistics, accuracy, and area under the receiver operating characteristic curve (AUC) were used to assess models against the reference standard. RESULTS: In the independent test set, five readers showed an overall substantial agreement on tasks (i) and (ii), but moderate agreement for task (iii). The best-performing model showed substantial agreement with the reference standard for tasks (i) and (ii), but moderate agreement for task (iii). With the assistance of the AI models, almost perfect inter-observer agreement was obtained for tasks (i) (mean κ = 0.86), (ii) (mean κ = 0.94), and (iii) (mean κ = 0.94). DATA CONCLUSION: Deep learning and radiomics models have the potential to help reduce inter-observer variability of breast density assessment. LEVEL OF EVIDENCE: 3. TECHNICAL EFFICACY: Stage 1.
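The two computational steps this abstract describes, majority voting over model and reader outputs and linear-weighted kappa between two readers, can be sketched as below. This is an illustration under stated assumptions, not the study's code: the abstract does not say how ties were resolved, so the `tie_breaker` argument (for example, the radiologist's own category) is hypothetical, as are the function names and example labels.

```python
from collections import Counter


def majority_vote(votes, tie_breaker):
    """Most frequent label among votes; fall back to tie_breaker on a tie."""
    counts = Counter(votes)
    top = max(counts.values())
    winners = [label for label, c in counts.items() if c == top]
    return winners[0] if len(winners) == 1 else tie_breaker


def linear_weighted_kappa(rater_a, rater_b, categories):
    """Linear-weighted Cohen's kappa for ordered categories (needs >= 2)."""
    index = {c: i for i, c in enumerate(categories)}
    k = len(categories)
    n = len(rater_a)

    # Joint distribution of the two raters' labels.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Disagreement weight grows linearly with the distance between categories.
    def weight(i, j):
        return abs(i - j) / (k - 1)

    disagreement = sum(weight(i, j) * observed[i][j] for i in range(k) for j in range(k))
    expected = sum(weight(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    # expected is zero only if both raters used one identical category throughout.
    return 1.0 - disagreement / expected


# Hypothetical BI-RADS composition labels for four scans.
reader_1 = ["A", "B", "C", "D"]
reader_2 = ["A", "B", "C", "C"]
print(round(linear_weighted_kappa(reader_1, reader_2, "ABCD"), 3))
print(majority_vote(["C", "C", "D", "C", "B"], tie_breaker="B"))
```

Linear weighting penalizes a C versus D disagreement less than an A versus D disagreement, which suits an ordered scale such as the four composition categories.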

12.

Impact of inter-observer variability on first axillary level dosimetry in breast cancer radiotherapy: An AIRO multi-institutional study.

Leonardi, Maria Cristina; Pepa, Matteo; Zaffaroni, Mattia; Vincini, Maria Giulia; Luraschi, Rosa; Vigorito, Sabrina; Morra, Anna; Dicuonzo, Samantha; Mazzola, Giovanni Carlo; Gerardi, Marianna Alessandra; Zerella, Maria Alessia; Cante, Domenico; Petrucci, Edoardo; Borzì, Giuseppina; Marrocco, Maristella; Chieregato, Matteo; Iadanza, Luciano; Lobefalo, Francesca; Valenti, Marco; Cavallo, Anna; Russo, Serenella; Guernieri, Marika; Malatesta, Tiziana; Meaglia, Ilaria; Liotta, Marco; Palumbo, Isabella; Marcantonini, Marta; Mezzenga, Emilio; Falivene, Sara; Arrichiello, Cecilia; Barbero, Maria Paola; Ivaldi, Giovanni Battista; Catalano, Gianpiero; Vidali, Cristiana; Giannitto, Caterina; Ciabattoni, Antonella; Meattini, Icro; Aristei, Cynthia; Orecchia, Roberto; Cattani, Federica; Jereczek-Fossa, Barbara Alicja.

Tumori ; 109(6): 570-575, 2023 Dec.

Article in English | MEDLINE | ID: mdl-37688419

ABSTRACT

This study quantified the incidental dose to the first axillary level (L1) in locoregional treatment plans for breast cancer. Eighteen radiotherapy centres contoured L1-L4 on three different patients (P1,2,3), created the L2-L4 planning target volume (single centre planning target volume, SC-PTV) and elaborated a locoregional treatment plan. The L2-L4 gold standard clinical target volume (CTV) along with the gold standard L1 contour (GS-L1) were created by an expert consensus. The SC-PTV was then replaced by the GS-PTV and the incidental dose to GS-L1 was measured. Dosimetric data were analysed with the Kruskal-Wallis test. Plans were intensity modulated radiotherapy (IMRT)-based. P3 with 90° arm setup had a statistically significantly higher L1 dose across the board than P1 and P2, with the mean dose (Dmean) reaching clinical significance. Dmean of P1 and P2 was consistent with the literature (77.4% and 74.7%, respectively). The incidental dose depended mostly on the L1 proportion included in the breast fields, underlining the importance of the setup, even in the case of IMRT.

Subjects

Breast Neoplasms; Radiotherapy, Intensity-Modulated; Humans; Female; Breast Neoplasms/radiotherapy; Radiotherapy Planning, Computer-Assisted; Radiotherapy Dosage; Observer Variation; Breast

13.

A comparative study of the inter-observer variability on Gleason grading against Deep Learning-based approaches for prostate cancer.

Marrón-Esquivel, José M; Duran-Lopez, L; Linares-Barranco, A; Dominguez-Morales, Juan P.

Comput Biol Med ; 159: 106856, 2023 Jun.

Article in English | MEDLINE | ID: mdl-37075600

ABSTRACT

BACKGROUND: Among all the cancers known today, prostate cancer is one of the most commonly diagnosed in men. With modern advances in medicine, its mortality has been considerably reduced. However, it is still a leading type of cancer in terms of deaths. The diagnosis of prostate cancer is mainly conducted by biopsy test. From this test, Whole Slide Images are obtained, from which pathologists diagnose the cancer according to the Gleason scale. Within this scale from 1 to 5, grade 3 and above is considered malignant tissue. Several studies have shown an inter-observer discrepancy between pathologists in assigning the value of the Gleason scale. Due to the recent advances in artificial intelligence, its application to the computational pathology field with the aim of supporting and providing a second opinion to the professional is of great interest. METHOD: In this work, the inter-observer variability of a local dataset of 80 whole-slide images annotated by a team of 5 pathologists from the same group was analyzed at both area and label level. Four approaches were followed to train six different Convolutional Neural Network architectures, which were evaluated on the same dataset on which the inter-observer variability was analyzed. RESULTS: An inter-observer variability of 0.6946 κ was obtained, with 46% discrepancy in terms of area size of the annotations performed by the pathologists. The best trained models achieved 0.826±0.014κ on the test set when trained with data from the same source. CONCLUSIONS: The obtained results show that deep learning-based automatic diagnosis systems could help reduce the widely-known inter-observer variability that is present among pathologists and support them in their decision, serving as a second opinion or as a triage tool for medical centers.

Subjects

Deep Learning, Prostatic Neoplasms, Male, Humans, Artificial Intelligence, Neoplasm Grading, Observer Variation, Reproducibility of Results

14.

Size measurement of lung nodules on CT: which diameter is most stable to inter-observer variability?

Bianconi, Francesco; Fravolini, Mario Luca; Palumbo, Barbara.

Clin Imaging ; 99: 38-40, 2023 Jul.

Article in English | MEDLINE | ID: mdl-37060680

ABSTRACT

Indeterminate lung nodules detected on CT are common findings in the clinical practice, and the correct assessment of their size is critical for patient evaluation and management. We compared the stability of three definitions of nodule diameter (Feret's mean diameter, Martin's mean diameter and area-equivalent diameter) to inter-observer variability on a population of 336 solid nodules from 207 subjects. We found that inter-observer agreement was highest with Martin's mean diameter (intra-class correlation coefficient = 0.977, 95% Confidence interval = 0.977-0.978), followed by area-equivalent diameter (0.972, 0.971-0.973) and Feret's mean diameter (0.965, 0.964-0.966). The differences were statistically significant. In conclusion, although all the three diameter definitions achieved very good inter-observer agreement (ICC > 0.96), Martin's mean diameter was significantly better than the others. Future guidelines may consider adopting Martin's mean diameter as an alternative to the currently used Feret's (caliper) diameter for assessing the size of lung nodules on CT.
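Of the three definitions compared above, the area-equivalent diameter has the simplest closed form: it is the diameter of the circle whose area equals that of the segmented nodule cross-section, d = 2·sqrt(A/π). The sketch below illustrates only that formula. It assumes the nodule is available as a 2D binary mask with a known pixel area; the function and parameter names are hypothetical, and nothing here is taken from the authors' implementation.

```python
import math


def area_equivalent_diameter(mask, pixel_area_mm2=1.0):
    """Return the diameter (mm) of the circle whose area equals the masked area.

    mask: 2D list of 0/1 values marking nodule pixels (hypothetical input format).
    pixel_area_mm2: area of a single pixel in mm^2 (assumed known from the scan).
    """
    # Count foreground pixels and convert the count to a physical area.
    n_pixels = sum(sum(1 for value in row if value) for row in mask)
    area_mm2 = n_pixels * pixel_area_mm2
    # Invert A = pi * (d / 2)^2 to obtain d.
    return 2.0 * math.sqrt(area_mm2 / math.pi)
```

Feret's and Martin's mean diameters require averaging caliper or bisecting-chord lengths over many orientations and are not sketched here.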

Subjects

Lung Neoplasms, Humans, Lung Neoplasms/diagnostic imaging, X-Ray Computed Tomography, Observer Variation, Lung

15.

A pair of deep learning auto-contouring models for prostate cancer patients injected with a radio-transparent versus radiopaque hydrogel spacer.

Wang, Yi; Boyd, Graham; Zieminski, Stephen; Kamran, Sophia C; Zietman, Anthony L; Miyamoto, David T; Kirk, Maxwell C; Efstathiou, Jason A.

Med Phys ; 50(6): 3324-3337, 2023 Jun.

Article in English | MEDLINE | ID: mdl-36940384

ABSTRACT

BACKGROUND: Absorbable hydrogel spacer injected between prostate and rectum is gaining popularity for rectal sparing. The spacer alters patient anatomy and thus requires new auto-contouring models. PURPOSE: To report the development and comprehensive evaluation of two deep-learning models for patients injected with a radio-transparent (model I) versus radiopaque (model II) spacer. METHODS AND MATERIALS: Model I was trained and cross-validated by 135 cases with transparent spacer and tested on 24 cases. Using refined training methods, model II was trained and cross-validated by the same dataset, but with the Hounsfield Unit distribution in the spacer overridden by that obtained from ten cases with opaque spacer. Model II was tested on 64 cases. The models auto-contour eight regions of interest (ROIs): spacer, prostate, proximal seminal vesicles (SVs), left and right femurs, bladder, rectum, and penile bulb. Qualitatively, each auto contour (AC), as well as the composite set, was assessed against manual contour (MC), by a radiation oncologist using a 1 (accepted directly or after minor editing), 2 (accepted after moderate editing), 3 (accepted after major editing), and 4 (rejected) scoring scale. The efficiency gain was characterized by the mean score as nearly complete [1-1.75], substantial (1.75-2.5], meaningful (2.5-3.25], and no (3.25-4.00]. Quantitatively, the geometric similarity between AC and MC was evaluated by dice similarity coefficient (DSC) and mean distance to agreement (MDA), using tolerance recommended by AAPM TG-132 Report. The results by the two models were compared to examine the outcome of the refined training methods. The large number of testing cases for model II allowed further investigation of inter-observer variability in clinical dataset. The correlation between score and DSC/MDA was studied on the ROIs with 10 or more counts of each acceptable score (1, 2, 3). 
RESULTS: For model I/model II: the mean score was 3.63/1.30 for transparent/opaque spacer, 2.71/2.16 for prostate, 3.25/2.44 for proximal SVs, 1.13/1.02 for both femurs, 2.25/1.25 for bladder, 3.00/2.06 for rectum, 3.38/2.42 for penile bulb, and 2.79/2.20 for the composite set; the mean DSC was 0.52/0.84 for spacer, 0.84/0.85 for prostate, 0.60/0.62 for proximal SVs, 0.94/0.96 for left femur, 0.95/0.96 for right femur, 0.91/0.95 for bladder, 0.81/0.84 for rectum, and 0.65/0.65 for penile bulb; and the mean MDA was 2.9/0.9 mm for spacer, 1.9/1.7 mm for prostate, 2.4/2.3 mm for proximal SVs, 0.8/0.5 mm for left femur, 0.7/0.5 mm for right femur, 1.5/0.9 mm for bladder, 2.3/1.9 mm for rectum, and 2.2/2.2 mm for penile bulb. Model II showed significantly improved scores for all ROIs, and metrics for spacer, femurs, bladder, and rectum. Significant inter-observer variability was only found for prostate. Highly linear correlation between the score and DSC was found for the two qualified ROIs (prostate and rectum). CONCLUSIONS: The overall efficiency gain was meaningful for model I and substantial for model II. The ROIs meeting the clinical deployment criteria (mean score below 3.25, DSC above 0.8, and MDA below 2.5 mm) included prostate, both femurs, bladder and rectum for both models, and spacer for model II.
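The scoring bands and deployment thresholds stated in this abstract can be written down as two small functions. This is a minimal sketch of the rules as the abstract words them (half-open score intervals; mean score below 3.25, DSC above 0.8, MDA below 2.5 mm); the function names are hypothetical and the code is not the authors' own.

```python
def efficiency_gain(mean_score):
    """Map a mean qualitative score (1 to 4) to the efficiency-gain label.

    Bands follow the abstract: [1, 1.75], (1.75, 2.5], (2.5, 3.25], (3.25, 4].
    """
    if not 1.0 <= mean_score <= 4.0:
        raise ValueError("mean score must lie in [1, 4]")
    if mean_score <= 1.75:
        return "nearly complete"
    if mean_score <= 2.5:
        return "substantial"
    if mean_score <= 3.25:
        return "meaningful"
    return "no"


def meets_deployment_criteria(mean_score, dsc, mda_mm):
    """Apply the three clinical deployment thresholds quoted in the conclusions."""
    return mean_score < 3.25 and dsc > 0.8 and mda_mm < 2.5
```

Applied to the reported composite scores, `efficiency_gain(2.79)` gives "meaningful" (model I) and `efficiency_gain(2.20)` gives "substantial" (model II), in line with the stated conclusions.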

Subjects

Deep Learning, Prostatic Neoplasms, Male, Humans, Hydrogels, Computer-Assisted Radiotherapy Planning/methods, Prostatic Neoplasms/diagnostic imaging, Prostatic Neoplasms/radiotherapy, Prostate/diagnostic imaging, Prostate/anatomy & histology

16.

Magnetic resonance imaging organ at risk delineation for nasopharyngeal radiotherapy: Measuring the effectiveness of an educational intervention.

Ryan, Olivia; Dundas, Kylie; Surjan, Yolanda; Elwadia, Doaa; Nguyen, Kimberley; Cardoso, Michael; Kumar, Shivani.

J Med Radiat Sci ; 70 Suppl 2: 59-69, 2023 Apr.

Article in English | MEDLINE | ID: mdl-36751021

ABSTRACT

INTRODUCTION: Magnetic resonance imaging (MRI) demonstrates superior soft tissue contrast and is increasingly being used in radiotherapy planning. This study evaluated the impact of an education workshop in minimising inter-observer variation (IOV) for nasopharyngeal organs at risk (OAR) delineation on MRI. METHODS: Ten observers delineated 14 OARs on 4 retrospective nasopharyngeal MRI data sets. Standard contouring guidelines were provided pre-workshop. Following an education workshop on MRI OAR delineation, observers blinded to their original contours repeated the 14 OAR delineations. For comparison, reference volumes were delineated by two head and neck radiation oncologists. IOV was evaluated using dice similarity coefficient (DSC), Hausdorff distance (HD) and relative volume. Location of largest deviations was evaluated with centroid values. Observer confidence pre- and post-workshop was also recorded using a 6-point Likert scale. The workshop was deemed beneficial for an OAR if ≥50% of observers mean scores improved in any metric and ≥50% of observers' confidence improved. RESULTS: All OARs had ≥50% of observers improve in at least one metric. Base of tongue, larynx, spinal cord and right temporal lobe were the only OARs achieving a mean DSC score of ≥0.7. Base of tongue, left and right lacrimal glands, larynx, left optic nerve and right parotid gland all exhibited statistically significant HD improvements post-workshop (P < 0.05). Brainstem and left and right temporal lobes all had statistically significant relative volume improvements post-workshop (P < 0.05). Post-workshop observer confidence improvement was observed for all OARs (P < 0.001). CONCLUSIONS: The educational workshop reduced IOV and improved observers' confidence when delineating nasopharyngeal OARs on MRI.
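For readers unfamiliar with the DSC used above, it is twice the overlap of two contours divided by the sum of their sizes. The sketch below assumes each contour has already been rasterised into a collection of voxel indices; that representation and the function name are illustrative assumptions, not details from the study.

```python
def dice_similarity(contour_a, contour_b):
    """Dice similarity coefficient between two collections of voxel indices.

    DSC = 2 * |A intersect B| / (|A| + |B|), ranging from 0 (disjoint) to 1 (identical).
    """
    a, b = set(contour_a), set(contour_b)
    if not a and not b:
        # Convention assumed here: two empty contours are treated as identical.
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))
```

Under this definition, the abstract's threshold of a mean DSC ≥ 0.7 corresponds to the shared voxels making up at least 70% of the average of the two contour sizes.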

Subjects

Magnetic Resonance Imaging, Radiation Oncology, Humans, Retrospective Studies, Neck, Organs at Risk, Computer-Assisted Radiotherapy Planning/methods, Observer Variation

17.

Interobserver variation in clinical target volume (CTV) delineation for stereotactic radiotherapy to non-spinal bone metastases in prostate cancer: CT, MRI and PET/CT fusion.

Chapman, Ewan Richard; Nicholls, Luke; Suh, Yae-Eun; Khoo, Vincent; Levine, Daniel; Ap Dafydd, Derfel; Van As, Nicholas.

Radiother Oncol ; 180: 109461, 2023 Mar.

Article in English | MEDLINE | ID: mdl-36634852

ABSTRACT

BACKGROUND AND PURPOSE: The use of SBRT for the treatment of oligometastatic prostate cancer is increasing rapidly. While consensus guidelines are available for non-spinal bone metastases, practice continues to vary widely. The aim of this study is to look at inter-observer variability in the contouring of prostate cancer non-spinal bone metastases with different imaging modalities. MATERIALS AND METHODS: 15 metastases from 13 patients treated at our centre were selected. 4 observers independently contoured clinical target volumes (CTV) on planning CT alone, planning CT with MRI fusion, planning CT with PET-CT fusion and planning CT with both MRI and PET-CT fusion combined. The mean inter-observer agreement on each modality was compared by measuring the delineated volume, generalized conformity index (CIgen), and the distance of the centre of mass (dCOM), calculated per metastasis and imaging modality. RESULTS: Mean CTV volume delineated on planning CT with MRI and PET-CT fusion combined was significantly larger compared to other imaging modalities (p = 0.0001). CIgen showed marked variation between modalities, with the highest agreement between planning CT + PET-CT (mean CIgen 0.55, range 0.32-0.73) and planning CT + MRI + PET-CT (mean CIgen 0.59, range 0.34-0.73). dCOM showed small variations between imaging modalities, but a significantly shorter distance was found on planning CT + PET-CT when compared with planning CT + PET-CT + MRI combined (p = 0.03). CONCLUSIONS: Highest consistency in CTV delineation between observers was seen with planning CT + PET-CT and planning CT + PET-CT + MRI combined.
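The abstract does not spell out how CIgen is computed. A commonly used definition of the generalized conformity index is the sum of pairwise intersection volumes divided by the sum of pairwise union volumes over all observer pairs; the sketch below implements that definition as an assumption, with contours represented as collections of voxel indices and a hypothetical function name.

```python
from itertools import combinations


def generalized_conformity_index(contours):
    """Generalized conformity index over two or more observers' contours.

    Assumed definition: sum over observer pairs of |Ai intersect Aj|,
    divided by the sum over the same pairs of |Ai union Aj|.
    contours: list of collections of voxel indices, one per observer.
    """
    voxel_sets = [set(contour) for contour in contours]
    if len(voxel_sets) < 2:
        raise ValueError("at least two contours are required")
    intersection_total = 0
    union_total = 0
    for a, b in combinations(voxel_sets, 2):
        intersection_total += len(a & b)
        union_total += len(a | b)
    if union_total == 0:
        raise ValueError("all contours are empty")
    return intersection_total / union_total
```

With two observers this reduces to the ordinary Jaccard index; with four observers, as in this study, six pairs contribute to each sum.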

Subjects

Bone Neoplasms, Prostatic Neoplasms, Radiosurgery, Computer-Assisted Radiotherapy Planning, Bone Neoplasms/diagnostic imaging, Bone Neoplasms/radiotherapy, Magnetic Resonance Imaging, Neoplasm Metastasis/diagnostic imaging, Neoplasm Metastasis/radiotherapy, Observer Variation, Positron Emission Tomography Computed Tomography, Prostatic Neoplasms/pathology, Prostatic Neoplasms/surgery, X-Ray Computed Tomography, Humans, Male

18.

Classification regularized dimensionality reduction improves ultrasound thyroid nodule diagnostic accuracy and inter-observer consistency.

Dai, Wenli; Cui, Yan; Wang, Peiyi; Wu, Hao; Zhang, Lei; Bian, Yeping; Li, Yingying; Li, Yutao; Hu, Hairong; Zhao, Jiaqi; Xu, Dong; Kong, Dexing; Wang, Yajuan; Xu, Lei.

Comput Biol Med ; 154: 106536, 2023 Mar.

Article in English | MEDLINE | ID: mdl-36708654

ABSTRACT

PROBLEM: Convolutional Neural Networks (CNNs) for medical image analysis usually only output a probability value, providing no further information about the original image or inter-relationships between different images. Dimensionality Reduction Techniques (DRTs) are used for visualization of high dimensional medical image data, but they are not intended for discriminative classification analysis. AIM: We develop an interactive phenotype distribution field visualization system for medical images to accurately reflect the pathological characteristics of lesions and their similarity to assist radiologists in diagnosis and medical research. METHODS: We propose a novel method, Classification Regularized Uniform Manifold Approximation and Projection (UMAP), referred to as CReUMAP, combining the advantages of CNN and DRT, to project the extracted feature vector fused with the malignant probability predicted by a CNN to a two-dimensional space, and then apply a spatial segmentation classifier trained on 2614 ultrasound images for prediction of thyroid nodule malignancy and guidance to radiologists. RESULTS: The CReUMAP embedding correlates well with the TI-RADS categories of thyroid nodules. The parametric version that embeds an external test dataset of 303 images in the presence of the training data with known pathological diagnosis significantly improves the benign and malignant nodule diagnostic accuracy (p-value = 0.016) and confidence (p-value = 1.902 × 10^-6) of eight radiologists of different experience levels, as well as their inter-observer agreements (kappa ≥ 0.75). CReUMAP achieves 90.8% accuracy, 92.1% sensitivity and 88.6% specificity on the test set. CONCLUSION: CReUMAP embedding is well correlated with the pathological diagnosis of thyroid nodules, and helps radiologists achieve more accurate, confident and consistent diagnosis. It allows a medical center to generate its locally adapted embedding using an already-trained classification model in an updateable manner on an ever-growing local database as long as the extracted feature vectors and predicted diagnostic probabilities of the correspondent classification model can be outputted.

Subjects

Thyroid Neoplasms, Thyroid Nodule, Humans, Thyroid Nodule/diagnostic imaging, Ultrasonography/methods, Computer Neural Networks, Thyroid Neoplasms/diagnostic imaging, Probability

19.

Inter- and Intra-Observer Agreement Between Embryologists for Cytoplasmic String Assessment in Day 5/6 Human Blastocysts.

Eastick, Jessica; Venetis, Christos; Cooke, Simon; Chapman, Michael.

Reprod Sci ; 30(6): 1917-1926, 2023 Jun.

Article in English | MEDLINE | ID: mdl-36538029

ABSTRACT

To investigate inter- and intra-observer agreement in the assessment of cytoplasmic string (CS) by embryologists on day 5/6 human blastocysts using the EmbryoViewer software. This was a prospective study involving five embryologists working between 2019 and 2020. Inter-observer agreement was calculated using assessments performed on 104 day 5/6 blastocysts regarding the presence, number, and location of CS and CS vesicle activity using timelapse videos. Intra-observer agreement was calculated when the same embryologists repeated the observations after a month's break. Inter- and intra-observer agreement was assessed using Fleiss' kappa coefficient and the intra-class correlation coefficient (ICC). The inter-observer agreement on the presence of CS (kappa: 0.477, 95% CI: 0.301-0.639) and their vesicles (kappa: 0.494, 95% CI: 0.345-0.643) was moderate, while the specific characteristics of CS assessment ranged from fair to moderate (kappa scores between: 0.157 and 0.563). The intra-observer agreement indicated an improvement on the level of agreement (kappa scores between: 0.162 and 0.795) compared to the inter-observer agreement. This study has shown a moderate level of inter- and intra-observer agreement when assessing day 5/6 human blastocysts for the presence of CS and their vesicles. When the specific characteristics of CS assessment occurred (such as the number of CS/vesicles) a slight to moderate level of agreement was seen among the embryologists. Agreement of specific characteristics of CS was not optimal, suggesting the need for further training using specifically designed CS quality assurance programme (QAP) modules, to determine if inter- and intra-observer agreement can be improved.
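Fleiss' kappa, used above for the five embryologists, extends chance-corrected agreement to more than two raters. The sketch below follows the standard textbook formula and assumes the ratings have already been tallied into a subjects-by-categories count table with the same number of raters per subject; the function name and input layout are illustrative choices, not details from the study.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of raters
    who assigned subject i to category j (same number of raters per subject).
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Observed agreement: share of agreeing rater pairs, averaged over subjects.
    per_subject = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_subject) / n_subjects

    # Chance agreement from the overall proportion of ratings in each category.
    proportions = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    p_expected = sum(p * p for p in proportions)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is ever used")
    return (p_observed - p_expected) / (1.0 - p_expected)
```

For example, a row `[3, 2]` would mean three of five raters marked CS as present and two as absent for that blastocyst.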

Subjects

Blastocyst, Humans, Prospective Studies, Observer Variation, Cytoplasm, Reproducibility of Results

20.

Intra- and interobserver agreement of rectal cancer staging with MRI.

Kilickap, Gulsum; Dolek, Betul Akdal; Ercan, Karabekir.

Acta Radiol ; 64(5): 1747-1754, 2023 May.

Article in English | MEDLINE | ID: mdl-36476121

ABSTRACT

BACKGROUND: Reliable preoperative staging of rectal cancers is crucial for treatment decision making. PURPOSE: To assess the intra- and inter-observer agreement of rectal cancer staging, including the sub-categories, with magnetic resonance imaging (MRI). MATERIAL AND METHODS: The study includes 85 patients (35.3% women; mean age = 62.2 ± 11.2 years) who underwent MRI for rectal cancer staging between August 2020 and April 2021. All the stored images were evaluated independently by two radiologists with 10-15 years of experience. For intra-observer agreement, the evaluations were done two months apart. Analyses were made using kappa, prevalence and bias-adjusted kappa (PABAK), and intraclass correlation coefficient (ICC), where appropriate. RESULTS: There was a substantial inter-observer agreement for tumor localization (kappa = 0.665, PABAK = 0.682), mesorectal fascia invasion (kappa = 0.663, PABAK = 0.822), internal and external sphincter involvement (kappa 0.804 and 0.751, PABAK 0.859 and 0.929, respectively), and moderate to substantial agreement for M-staging (kappa = 0.451, PABAK = 0.742) and extramural vascular invasion (kappa = 0.569, PABAK = 0.741). There was also a good inter-observer agreement for T staging and N staging (ICC = 0.862, 95% confidence interval [CI] = 0.788-0.911; and ICC = 0.841, 95% CI = 0.595-0.922, respectively). As expected, intra-observer agreement was better than inter-observer agreement. CONCLUSION: Intra- and inter-observer agreement for MRI staging of rectal cancers using the structured reporting template is good.
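The gap between kappa and PABAK reported for some items (for example M-staging) is what one would expect when one category is much more common than the other, because Cohen's kappa depends on the raters' marginal frequencies while PABAK depends only on observed agreement. The sketch below computes both for two raters using the standard formulas, with PABAK taken as (k·Po − 1)/(k − 1) for k categories (2·Po − 1 in the binary case). The function name and inputs are hypothetical, and the example data in the usage note are invented for illustration only.

```python
from collections import Counter


def cohen_kappa_and_pabak(ratings_1, ratings_2, categories=None):
    """Return (Cohen's kappa, PABAK) for two raters' paired categorical ratings."""
    if len(ratings_1) != len(ratings_2) or not ratings_1:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_1)
    if categories is None:
        categories = sorted(set(ratings_1) | set(ratings_2))
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")

    # Observed agreement: fraction of cases rated identically.
    p_observed = sum(a == b for a, b in zip(ratings_1, ratings_2)) / n

    # Chance agreement from each rater's own marginal frequencies.
    freq_1, freq_2 = Counter(ratings_1), Counter(ratings_2)
    p_expected = sum((freq_1[c] / n) * (freq_2[c] / n) for c in categories)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")

    kappa = (p_observed - p_expected) / (1.0 - p_expected)
    pabak = (k * p_observed - 1.0) / (k - 1.0)
    return kappa, pabak
```

With the made-up ratings `[1, 1, 1, 1, 0]` and `[1, 1, 1, 0, 0]`, observed agreement is 0.8, chance agreement is 0.56, kappa is about 0.545 and PABAK is 0.6.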

Subjects

Rectal Neoplasms, Humans, Female, Middle Aged, Aged, Male, Neoplasm Staging, Observer Variation, Rectal Neoplasms/diagnostic imaging, Rectal Neoplasms/pathology, Fascia/pathology, Magnetic Resonance Imaging/methods, Reproducibility of Results
