Results 1 - 20 of 443
1.
Pathol Res Pract ; 263: 155599, 2024 Sep 25.
Article in English | MEDLINE | ID: mdl-39362133

ABSTRACT

Extremely well-differentiated gastric-type adenocarcinoma (EWDGA) is a rare type of gastric cancer composed of deceptively bland-looking malignant cells resembling normal foveolar or pyloric epithelium. The histological features of this tumor have not been recognized by many pathologists, and inter-observer variation studies are lacking. Here, we report seven EWDGAs and inter-observer variation of six preoperative biopsies was evaluated by 11 pathologists in a single institute. Based on the pathological diagnosis of the endoscopic biopsy slides, the average rate of definite malignancy diagnosis was 15.2 %, and the overall diagnostic concordance rate was 34.9 % among 11 pathologists. Microscopically, the surface epithelium was preserved and only a few atypical tumor glands were scattered in most endoscopic biopsies. Structural atypia was minimal, and the tumor glands were barely distinguishable from normal glands. Although nuclear atypia was minimal, enlarged nuclei, relatively large glands with irregular shapes, and abundant cytoplasmic mucin were observed in gastric pinch biopsies. In preoperative biopsies, no cases showed p53 overexpression, and Ki-67 labeling index ranged from 3 % to 35 % and was higher compared to non-neoplastic glands in 3 cases. After gastrectomy, four (57.1 %) patients had advanced gastric cancer and three (42.9 %) had lymph node metastasis. Genomic profiling of the four patients revealed mutations of TP53, BRAF, KRAS, STK11, and MDM2/CCND1 amplification. Immunohistochemistry for p53 was not helpful while Ki-67 may be helpful when staining pattern is distinct from the non-neoplastic mucosa. In conclusion, it is challenging to diagnose EWDGA using biopsy specimens. Recognizing and addressing this rare entity will increase diagnostic accuracy to ensure the early diagnosis of cancer.

2.
Biomater Investig Dent ; 11: 41161, 2024.
Article in English | MEDLINE | ID: mdl-39228399

ABSTRACT

Objectives: To assess inter- and intrarater reliability and agreement for measurements of root lengths using multiplanar reconstruction (MPR) in cone beam computed tomography (CBCT) examinations. Furthermore, to determine whether using MPR from different CBCT machines was a reliable and reproducible method for assessment of root length during orthodontic treatment of adolescents. Materials and methods: A total of 40 CBCT examinations obtained before, during and after orthodontic treatment of 14 adolescents, with fixed appliances from a multicentre randomised controlled trial, were used. All roots from the incisors to the first molars were measured by two independent raters and in accordance with a protocol preceded by a multi-step calibration. Reliability was assessed by intraclass correlation (ICC). Agreement was assessed by measurement error according to the Dahlberg formula and Bland-Altman plot. Results: The number of repeated measurements varied from 436 to 474 for the different timepoints. Good to excellent inter- and intrarater reliability was shown for different tooth groups and timepoints. Measurement error for inter- and intrarater agreement varied between 0.41 mm and 0.77 mm. The Bland-Altman plot with 95% limits of agreement varied between +1.43 mm and -2.01 mm for different tooth groups and timepoints. Conclusions: The results of this study indicate that CBCT using MPR from different machines is a reproducible method for measuring root length during different phases of orthodontic treatment. When interpreting root shortening measurements in CBCT using MPR for clinical or research purposes, values below 2 mm should be approached with caution, as they may contain measurement errors.
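
The two agreement statistics named in this abstract can be sketched as follows. This is a minimal illustration, not the study's analysis code: the function names and sample values are hypothetical, the Dahlberg error is assumed to take its usual form sqrt(sum(d^2) / 2n) over the paired differences d, and the 95% limits of agreement are assumed to follow the common mean ± 1.96 SD convention.

```python
import math
from statistics import mean, stdev


def paired_differences(first, second):
    """Return element-wise differences between two equally long series."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two series of equal length with at least 2 pairs")
    return [a - b for a, b in zip(first, second)]


def dahlberg_error(first, second):
    """Dahlberg measurement error: sqrt(sum(d_i^2) / (2 * n))."""
    diffs = paired_differences(first, second)
    return math.sqrt(sum(d * d for d in diffs) / (2 * len(diffs)))


def bland_altman_limits(first, second):
    """Return (lower limit, mean difference, upper limit) for 95% limits of agreement."""
    diffs = paired_differences(first, second)
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample SD of the differences (assumption)
    return bias - spread, bias, bias + spread


# Hypothetical root lengths (mm) measured by two raters on the same teeth.
rater_1 = [10.0, 12.0]
rater_2 = [11.0, 11.0]
print(dahlberg_error(rater_1, rater_2))
print(bland_altman_limits(rater_1, rater_2))
```

The Dahlberg value is a single pooled error in the measurement unit, whereas the Bland-Altman limits show the range within which most between-rater differences are expected to fall.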

3.
Quant Imaging Med Surg ; 14(9): 6543-6555, 2024 Sep 01.
Article in English | MEDLINE | ID: mdl-39281119

ABSTRACT

Background: Follow-up management of pulmonary nodules is a crucial component of lung cancer screening. Consistency in follow-up recommendations is essential for effective lung cancer screening. This study aimed to assess inter-observer agreement on National Comprehensive Cancer Network (NCCN) guideline-based follow-up recommendation for subsolid nodules from low-dose computed tomography (LDCT) screening. Methods: A retrospective collection of LDCT reports from 2014 to 2017 for lung cancer screening was conducted using the Radiology Information System and keyword searches, focusing on subsolid nodules. A total of 110 LDCT cases containing subsolid nodules were identified. Two senior radiologists provided standardized follow-up recommendation. Follow-up recommendation was categorized into four groups (0-, 3-, 6-, and 12-month). To ensure overall balance and representativeness of the follow-up categories, 60 scans from 60 participants were included (distribution ratio 1:1:2:2). Cases were categorised into follow-up recommendation groups by five observers following NCCN guidelines. Fleiss' kappa statistic was used to evaluate inter-observer agreement. Results: Overall accuracy rate for follow-up recommendation among five observers was 72.3%. Chest radiologists' overall agreement was significantly higher than radiology residents (P<0.01). The overall agreement among the five observers was moderate, with a Fleiss' kappa of 0.437. For all paired readers, the mean Cohen's kappa value was 0.603, with 95% confidence interval (CI) from 0.489 to 0.716. Chest radiologists demonstrated substantial agreement, evidenced by a Cohen's kappa of 0.655 (95% CI: 0.503-0.807). In contrast, the mean Cohen's kappa among radiology residents was 0.533 (95% CI: 0.501-0.565). The majority of cases with discrepancies, accounting for 73.5%, were associated with the same risk-dominant nodules. A higher proportion of part-solid nodule was a risk factor for discrepancies. 
Of the 600 paired readings, major discrepancies and substantial discrepancies were observed in 27.5% and 4.8% (29/600) of the cases. Conclusions: In subsolid nodules, category evaluation of observer follow-up recommendation based on NCCN guidelines achieved moderate consistency. Disagreements were mainly caused by measurement and type disagreements of identical risk-dominant nodules. Part-solid nodule was a contributor for discrepancies in follow-up recommendation. Major and substantial management discrepancies were 27.5% and 4.8% in the paired evaluations.
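
Fleiss' kappa, the multi-rater statistic this abstract reports, can be sketched in a few lines. This is a hedged illustration using the standard textbook formula; the function name and the toy counts are hypothetical, not the study's data. Each row holds, for one case, how many observers chose each category.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who assigned subject i to category j."""
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_subjects == 0 or n_raters < 2:
        raise ValueError("need at least one subject and two raters")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject must be rated by the same number of raters")

    n_categories = len(counts[0])
    total = n_subjects * n_raters
    # Overall proportion of assignments falling in each category.
    p_cat = [sum(row[j] for row in counts) / total for j in range(n_categories)]
    # Observed agreement per subject, then averaged.
    p_subj = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_subj) / n_subjects
    p_exp = sum(p * p for p in p_cat)  # agreement expected by chance
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_bar - p_exp) / (1 - p_exp)


# Toy example: 2 cases, 5 raters, 2 categories.
print(fleiss_kappa([[5, 0], [0, 5]]))
print(fleiss_kappa([[3, 2], [2, 3]]))
```

For a design like the one described, with five observers and four follow-up categories, each row would hold four counts summing to five.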

4.
J Pathol Transl Med ; 2024 Sep 12.
Article in English | MEDLINE | ID: mdl-39257047

ABSTRACT

Background: The diagnosis of thyroid neoplasms necessitates the identification of distinct histological features. The variety of education and hospital centers located in cities across Indonesia likely results in discordance among pathologists when diagnosing thyroid neoplasms. Methods: This study examined the concordance among Indonesian pathologists in assessing nuclear features and capsular and vascular invasion of thyroid tumors. Fifteen pathologists from different centers independently assessed the same 14 digital slides of thyroid tumor specimens. All the specimens were thyroid neoplasms with known BRAFV600E and RAS mutational status, from a single center. We evaluated the pre- and post-training agreement using the Fleiss kappa. The significance of the training was evaluated using a paired t-test. Results: Baseline agreement on nuclear features was slight to fair based on a 3-point scoring system (κ = 0.14 to 0.28) and poor to fair based on an eight-point system (κ = -0.02 to 0.24). Agreements on vascular (κ = 0.35) and capsular invasion (κ = 0.27) were fair, whereas the estimated molecular type showed substantial agreement (κ = 0.74). Following the training, agreement using the eight-point system significantly improved (p = 0.001). Conclusions: The level of concordance among Indonesian pathologists in diagnosing thyroid neoplasm was relatively poor. Consensus in pathology assessment requires ongoing collaboration and education to refine diagnostic criteria.

6.
Article in English, Spanish | MEDLINE | ID: mdl-39128695

ABSTRACT

Osteoporotic vertebral compression fractures (OVF) are often a diagnostic problem and occur in the same age group as metastatic vertebral compression fractures (MVF). Although radiography is the first diagnostic technique, it is generally not accurate for depicting demineralization and soft tissue lesions. Magnetic resonance imaging (MRI) is the diagnostic technique of choice. The most relevant signs are intravertebral fluid collection or fluid signal, other vertebral deformities without oedema and older age. Among the most relevant findings for diagnosing MVF are soft tissue mass and pedicle signal intensity asymmetries. However, reproducibility of these findings in clinical practice is moderate.

7.
Arch Dermatol Res ; 316(8): 543, 2024 Aug 20.
Article in English | MEDLINE | ID: mdl-39162820

ABSTRACT

Actinic keratosis (AK) is a common precancerous skin condition predominantly affecting older males with fair skin and significant UV exposure. The clinical significance of AK is related to its potential for malignant transformation and progression to squamous cell carcinoma (SCC). Accurate diagnosis of AK is essential for adequate treatment, evaluation of therapeutic efficacy, and mitigating the risk of developing SCC. However, clinician variability due to the subjective nature of current diagnostic tools presents significant challenges to achieving consistent and reliable AK diagnoses. Thus, there is no universally accepted standard for measuring AK. This review evaluates current methods for evaluating and diagnosing AK, focusing on clinician variability through inter- and intraobserver agreement. Eight peer-reviewed studies investigating the reliability of various approaches for AK evaluation show substantial variability in interobserver or intraobserver agreement, with most methods demonstrating only slight to moderate reliability. Some suggest that consensus discussions and simplified rating scales can modestly improve diagnostic reliability. However, remaining variability and the lack of a universally accepted standard for measuring AK underscore the need for more robust and standardized diagnostic and evaluation methods. The review emphasizes the need for improved diagnostic tools and standardized methods to enhance the accuracy and reliability of AK assessments. It also proposes applying a novel examination approach using 1,3-dihydroxyacetone (DHA) staining which may improve the visualization and identification of AK lesions. Advancements in these areas have significant potential, promising better clinical practices and patient outcomes in AK management.


Subjects
Keratosis, Actinic , Skin Neoplasms , Humans , Keratosis, Actinic/diagnosis , Keratosis, Actinic/pathology , Keratosis, Actinic/therapy , Reproducibility of Results , Skin Neoplasms/diagnosis , Skin Neoplasms/pathology , Observer Variation , Carcinoma, Squamous Cell/diagnosis , Carcinoma, Squamous Cell/pathology , Skin/pathology , Precancerous Conditions/diagnosis , Precancerous Conditions/pathology
8.
Article in English, Spanish | MEDLINE | ID: mdl-38878884

ABSTRACT

Osteoporotic vertebral compression fractures (OVF) are often a diagnostic problem and occur in the same age group as metastatic vertebral compression fractures (MVF). Although radiography is the first diagnostic technique, it is generally not accurate for depicting demineralization and soft tissue lesions. Magnetic resonance imaging (MRI) is the diagnostic technique of choice. The most relevant signs are intravertebral fluid collection or fluid signal, other vertebral deformities without edema and older age. Among the most relevant findings for diagnosing MVF are soft tissue mass and pedicle signal intensity asymmetries. However, reproducibility of these findings in clinical practice is moderate.

9.
Article in English | MEDLINE | ID: mdl-38896105

ABSTRACT

BACKGROUND: Inter-observer agreement for the American Association of Gynecologic Laparoscopists (AAGL) 2021 Endometriosis Classification staging system has not been described. Its predecessor staging system, the revised American Society for Reproductive Medicine (rASRM), has historically demonstrated poor inter-observer agreement. AIMS: We aimed to determine the inter-observer agreement performance of the AAGL 2021 Endometriosis Classification staging system, and compare this with the rASRM staging system. MATERIALS AND METHODS: A database of 317 patients with coded surgical data was retrospectively analysed. Three independent observers allocated AAGL surgical stages (1-4), twice. Observers made their own interpretation of how to apply the tool in the first staging allocation. Consensus rules were then developed for a second staging allocation. RESULTS: First staging allocation: odds ratio (OR) (and 95% CI) for observer 1 to score higher than observer 2 was 8.08 (5.12-12.76). Observer 1 to score higher than observer 3 was 12.98 (7.99-21.11) and observer 2 to score higher than observer 3 was 1.61 (1.03-2.51). This represents poor agreement. Second staging allocation (after consensus): OR for observer 1 to score higher than observer 2 was 1.14 (0.64-2.03), observer 1 to score higher than observer 3 was 1.81 (0.99-3.28) and observer 2 to score higher than observer 3 was 1.59 (0.87-2.89). This represents good agreement. CONCLUSIONS: These findings suggest that in its current format the AAGL 2021 Endometriosis Classification staging system has poor inter-observer agreement, not superior to the rASRM staging system. However, performance improved when additional measures were taken to simplify and clarify areas of ambiguity in interpreting the staging system.

10.
Abdom Radiol (NY) ; 49(7): 2408-2415, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38896248

ABSTRACT

OBJECTIVES: Magnetic resonance (MR) imaging with secretin stimulation (MR-PFTs) is a non-invasive test for pancreatic exocrine function based on assessing the volume of secreted bowel fluid in vivo. Adoption of this methodology in clinical care and research is largely limited to qualitative assessment of secretion as current methods for secretory response quantification require manual thresholding and segmentation of MR images, which can be time-consuming and prone to interrater variability. We describe novel software (PFTquant) that preprocesses and thresholds MR images, performs heuristic detection of non-bowel fluid objects, and provides the user with intuitive semi-automated tools to segment and quantify bowel fluid in a fast and robust manner. We evaluate the performance of this software on a retrospective set of clinical MRIs. METHODS: Twenty MRIs performed in children (< 18 years) were processed independently by two observers using a manual technique and using PFTquant. Interrater agreement in measured secreted fluid volume was compared using intraclass correlation coefficients, Bland-Altman difference analysis, and Dice similarity coefficients. RESULTS: Interrater reliability of measured bowel fluid secretion using PFTquant was 0.90 (0.76-0.96 95% C.I.) with - 4.5 mL mean difference (-39.4-30.4 mL 95% limits of agreement) compared to 0.69 (0.36-0.86 95% C.I.) with - 0.9 mL mean difference (-77.3-75.5 mL 95% limits of agreement) for manual processing. Dice similarity coefficients were better using PFTquant (0.88 +/- 0.06) compared to manual processing (0.85 +/- 0.10) but not significantly (p = 0.11). Time to process was significantly (p < 0.001) faster using PFTquant (412 +/- 177 s) compared to manual processing (645 +/- 305 s). CONCLUSION: Novel software provides fast, reliable quantification of secreted fluid volume in children undergoing MR-PFTs. 
Use of the novel software could facilitate wider adoption of quantitative MR-PFTs in clinical care and research.
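
The Dice similarity coefficient used above to compare two observers' segmentations can be sketched as follows. This is an illustration only: representing each segmentation as a set of voxel indices is an assumption (the abstract does not describe the data layout), and the function name and sample values are hypothetical.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity: 2 * |A ∩ B| / (|A| + |B|) for two sets of voxel indices."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        # Two empty segmentations are treated as identical (a convention, not a rule).
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical voxel indices labelled as bowel fluid by two observers.
observer_1 = {1, 2, 3}
observer_2 = {2, 3, 4}
print(dice_coefficient(observer_1, observer_2))
```

A value of 1 means the two segmentations overlap completely and 0 means they share no voxels.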


Subjects
Magnetic Resonance Imaging , Software , Humans , Magnetic Resonance Imaging/methods , Child , Male , Retrospective Studies , Female , Reproducibility of Results , Adolescent , Child, Preschool , Image Interpretation, Computer-Assisted/methods , Pancreatic Function Tests/methods , Infant , Secretin , Observer Variation , Pancreas, Exocrine/diagnostic imaging
11.
Biochem Med (Zagreb) ; 34(2): 020803, 2024 Jun 15.
Article in English | MEDLINE | ID: mdl-38882588

ABSTRACT

Introduction: Due to high inter-observer variability the 2015 International Council for Standardization in Haematology (ICSH) recommendations state to count band neutrophils as segmented neutrophils in the white blood cell (WBC) differential. However, the inclusion of bands as a separate cell entity within the WBC differential is still widely used in hematology laboratories in Croatia. The aim of this multicentric study was to assess the degree of inter-observer variability in enumerating band neutrophils within the WBC differential among Croatian laboratories. Materials and methods: Seven large Croatian hospital laboratories from different parts of the country participated in the study. In each of 7 participating laboratories, one blood smear, that was flagged by the analyzer as possibly having bands, was evaluated by all personnel participating in the analysis of hematology samples. Between-observer manual smear reproducibility was expressed as coefficient of variation (CV) and calculated using the following formula: CV (%) = (standard deviation (SD)/mean value) x 100%. Results: The CVs (%) and relative band neutrophil counts in participating laboratories were as follows: 15.4% (16-24), 19.2% (16-32), 19.5% (17-40), 21.1% (17-44), 35.0% (8-26), 51.9% (3-29), and remarkably high 62.4% (12-59). For segmented neutrophils CVs were lower, ranging from 7.4% to 32.2%. The CVs did not correlate with the number of staff members in each hospital (P = 0.293). Conclusions: This study revealed very high variability in enumerating band neutrophil count in the blood smear differential among all participants, thus prompting a need for action on a national level.
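
The coefficient of variation formula quoted in this abstract, CV (%) = (SD / mean) x 100, can be written out directly. In this minimal sketch the function name and the sample counts are hypothetical, and the sample standard deviation (n - 1 denominator) is an assumption, since the abstract does not say which form was used.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV (%) = (standard deviation / mean) * 100, using the sample SD."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    average = mean(values)
    if average == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(values) / average * 100


# Hypothetical band neutrophil counts (%) reported by three observers for one smear.
band_counts = [16, 20, 24]
print(coefficient_of_variation(band_counts))
```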


Subjects
Neutrophils , Humans , Croatia , Pilot Projects , Leukocyte Count , Neutrophils/cytology , Observer Variation , Reproducibility of Results
12.
Brachytherapy ; 23(4): 421-432, 2024.
Article in English | MEDLINE | ID: mdl-38845268

ABSTRACT

PURPOSE: To investigate geometric and dosimetric inter-observer variability in needle reconstruction for temporary prostate brachytherapy. To assess the potential of registrations between transrectal ultrasound (TRUS) and cone-beam computed tomography (CBCT) to support implant reconstructions. METHODS AND MATERIALS: The needles implanted in 28 patients were reconstructed on TRUS by three physicists. Corresponding geometric deviations and associated dosimetric variations to prostate and organs at risk (urethra, bladder, rectum) were analyzed. To account for the found inter-observer variability, various approaches (template-based, probe-based, marker-based) for registrations of CBCT to TRUS were investigated regarding the respective needle transfer accuracy in a phantom study. Three patient cases were examined to assess registration accuracy in-vivo. RESULTS: Geometric inter-observer deviations >1 mm and >3 mm were found for 34.9% and 3.5% of all needles, respectively. Prostate dose coverage (changes up to 7.2%) and urethra dose (partly exceeding given dose constraints) were most affected by associated dosimetric changes. Marker-based and probe-based registrations resulted in the phantom study in high mean needle transfer accuracies of 0.73 mm and 0.12 mm, respectively. In the patient cases, the marker-based approach was the superior technique for CBCT-TRUS fusions. CONCLUSION: Inter-observer variability in needle reconstruction can substantially affect dosimetry for individual patients. Especially marker-based CBCT-TRUS registrations can help to ensure accurate reconstructions for improved treatment planning.


Subjects
Brachytherapy , Cone-Beam Computed Tomography , Needles , Observer Variation , Phantoms, Imaging , Prostatic Neoplasms , Radiotherapy Dosage , Humans , Male , Prostatic Neoplasms/radiotherapy , Prostatic Neoplasms/diagnostic imaging , Brachytherapy/methods , Cone-Beam Computed Tomography/methods , Radiotherapy Planning, Computer-Assisted/methods , Ultrasonography/methods , Prostate/diagnostic imaging , Organs at Risk/radiation effects , Radiotherapy, Image-Guided/methods , Rectum/diagnostic imaging
13.
Eur Radiol Exp ; 8(1): 55, 2024 May 06.
Article in English | MEDLINE | ID: mdl-38705940

ABSTRACT

BACKGROUND: To evaluate the reproducibility of a vessel-specific minimum cost path (MCP) technique used for lobar segmentation on noncontrast computed tomography (CT). METHODS: Sixteen Yorkshire swine (49.9 ± 4.7 kg, mean ± standard deviation) underwent a total of 46 noncontrast helical CT scans from November 2020 to May 2022 using a 320-slice scanner. A semiautomatic algorithm was employed by three readers to segment the lung tissue and pulmonary arterial tree. The centerline of the arterial tree was extracted and partitioned into six subtrees for lobar assignment. The MCP technique was implemented to assign lobar territories by assigning lung tissue voxels to the nearest arterial tree segment. MCP-derived lobar mass and volume were then compared between two acquisitions, using linear regression, root mean square error (RMSE), and paired sample t-tests. An interobserver and intraobserver analysis of the lobar measurements was also performed. RESULTS: The average whole lung mass and volume were 663.7 ± 103.7 g and 1,444.22 ± 309.1 mL, respectively. The lobar mass measurements from the initial (MLobe1) and subsequent (MLobe2) acquisitions were correlated by MLobe1 = 0.99 MLobe2 + 1.76 (r = 0.99, p = 0.120, RMSE = 7.99 g). The lobar volume measurements from the initial (VLobe1) and subsequent (VLobe2) acquisitions were correlated by VLobe1 = 0.98 VLobe2 + 2.66 (r = 0.99, p = 0.160, RMSE = 15.26 mL). CONCLUSIONS: The lobar mass and volume measurements showed excellent reproducibility through a vessel-specific assignment technique. This technique may serve for automated lung lobar segmentation, facilitating clinical regional pulmonary analysis. RELEVANCE STATEMENT: Assessment of lobar mass or volume in the lung lobes using noncontrast CT may allow for efficient region-specific treatment strategies for diseases such as pulmonary embolism and chronic thromboembolic pulmonary hypertension.
KEY POINTS: • Lobar segmentation is essential for precise disease assessment and treatment planning. • Current methods for segmentation using fissure lines are problematic. • The minimum-cost-path technique here is proposed and a swine model showed excellent reproducibility for lobar mass measurements. • Interobserver agreement was excellent, with intraclass correlation coefficients greater than 0.90.


Subjects
Lung , Animals , Swine , Lung/diagnostic imaging , Reproducibility of Results , Tomography, X-Ray Computed/methods , Models, Animal , Algorithms
14.
Neuropsychol Rehabil ; : 1-32, 2024 May 28.
Article in English | MEDLINE | ID: mdl-38805592

ABSTRACT

Goal Attainment Scaling (GAS) is a method for writing person-centred approach evaluation scales that can be used as an outcome measure in clinical or research settings in rehabilitation. To be used in a research setting, it requires a high methodological quality approach. The aim of this study was to explore the feasibility and reliability of the GAS quality rating system, to ensure that GAS scales used as outcome measures are valid and reliable. Secondary objectives were: (1) to compare goal attainment scores' reliability according to how many GAS levels are described in the scale; and (2) to explore if GAS scorings are influenced by who scores goal attainment. The GAS scales analysed here were set collaboratively by 57 cognitively impaired adult clients and their occupational therapist. Goals had to be achieved within an inpatient one-month stay, during which clients participated in an intervention aimed at improving planning skills in daily life. The GAS quality rating system proved to be feasible and reliable. Regarding GAS scores, interrater reliability was higher when only three of the five GAS levels were described, i.e., "three milestone GAS" (0.74-0.92), than when all five levels were described (0.5-0.88), especially when scored by the clients (0.5-0.88).

15.
Insights Imaging ; 15(1): 104, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-38589691

ABSTRACT

OBJECTIVE: The aim of this study was to evaluate and compare reliability, costs, and radiation dose of dual-energy X-ray absorptiometry (DXA) to MRI and CT in measuring muscle mass for the diagnosis of sarcopenia. METHODS: Thirty-four consecutive DXA scans performed in surgically menopausal women from November 2019 until March 2020 were analyzed by two observers. Observers analyzed muscle mass of the lower limbs in every scan twice. Reliability was assessed by calculating inter- and intra-observer variability. Reliability from CT and MRI as well as radiation dose from CT and DXA were collected from literature. Costs for each type of scan were calculated according to the guidelines for economic evaluation of the Dutch National Health Care Institute. RESULTS: The 34 participants had a median age of 58 years (IQR 53-65) and a median body mass index of 24.6 (IQR 21.7-29.7). Inter-observer variability had an intraclass correlation coefficient (ICC) of 0.997 (95% CI 0.994-0.998) with a relative variability of 0.037 ± 0.022%. Regarding intra-observer variability, observer 1 had an ICC of 0.998 (95% CI 0.996-0.999) with a relative variability of 0.019 ± 0.016% and observer 2 had an ICC of 0.997 (95% CI 0.993-0.998) with a relative variability of 0.016 ± 0.011%. DXA costs were €62, CT €77, and MRI €195. The estimated radiation dose of CT was 2.5-3.0 mSv, for DXA this was 2-4 µSv. CONCLUSIONS: DXA has lower costs and a lower radiation dose, with low inter- and intra-observer variability, compared to CT and MRI for assessing lower limb muscle mass. TRIAL REGISTRATION: Netherlands Trial Register; NL8068. CRITICAL RELEVANCE STATEMENT: DXA is a good alternative for CT and MRI in assessing lower limb muscle mass, with lower costs and lower radiation dose, while inter-observer and intra-observer variability are low. KEY POINTS: • Screening for sarcopenia should be optimized as the population ages. • DXA outperformed CT and MRI in the measured metrics. 
• DXA validity should be further evaluated as an alternative to CT and MRI for sarcopenia evaluation.

16.
Histopathology ; 85(1): 171-181, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38571446

ABSTRACT

AIMS: Following the increased use of neoadjuvant therapy for pancreatic cancer, grading of tumour regression (TR) has become part of routine diagnostics. However, it suffers from marked interobserver variation, which is mainly ascribed to the subjectivity of the defining criteria of the categories in TR grading systems. We hypothesized that a further cause for the interobserver variation is the use of divergent and nonspecific morphological criteria to identify tumour regression. METHODS AND RESULTS: Twenty treatment-naïve pancreatic cancers and 20 pancreatic cancers treated with neoadjuvant chemotherapy were reviewed by three experienced pancreatic pathologists who, blinded for treatment status, categorized each tumour as treatment-naïve or neoadjuvantly treated, and annotated all tissue areas they considered showing tumour regression. Only 50%-65% of the cases were categorized correctly, and the annotated tissue areas were highly discrepant (only 3%-41% overlap). When the prevalence of various morphological features deemed to indicate TR was compared between treatment-naïve and neoadjuvantly treated tumours, only one pattern, characterized by reduced cancer cell density and prominent stroma affecting a large area of the tumour bed, occurred significantly more frequently, but not exclusively, in the neoadjuvantly treated group. Finally, stromal features, both morphological and biological, were investigated as possible markers for tumour regression, but failed to distinguish TR from native tumour stroma. CONCLUSION: There is considerable divergence in opinion between pathologists when it comes to the identification of tumour regression. Reliable identification of TR is only possible if it is extensive, while lesser degrees of treatment effect cannot be recognized with certainty.


Subjects
Neoadjuvant Therapy , Pancreatic Neoplasms , Humans , Pancreatic Neoplasms/pathology , Pancreatic Neoplasms/diagnosis , Pancreatic Neoplasms/therapy , Male , Female , Aged , Middle Aged , Observer Variation , Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Neoplasm Grading
17.
BMC Med Res Methodol ; 24(1): 61, 2024 Mar 09.
Article in English | MEDLINE | ID: mdl-38461273

ABSTRACT

BACKGROUND: The provision of data sharing statements (DSS) for clinical trials has been made mandatory by different stakeholders. DSS are a device to clarify whether there is intention to share individual participant data (IPD). What is missing is a detailed assessment of whether DSS are providing clear and understandable information about the conditions for data sharing of IPD for secondary use. METHODS: A random sample of 200 COVID-19 clinical trials with explicit DSS was drawn from the ECRIN clinical research metadata repository. The DSS were assessed and classified, by two experienced experts and one assessor with less experience in data sharing (DS), into different categories (unclear, no sharing, no plans, yes but vague, yes on request, yes with specified storage location, yes but with complex conditions). RESULTS: Between the two experts the agreement was moderate to substantial (kappa=0.62, 95% CI [0.55, 0.70]). Agreement considerably decreased when these experts were compared with a third person who was less experienced and trained in data sharing ("assessor") (kappa=0.33, 95% CI [0.25, 0.41]; 0.35, 95% CI [0.27, 0.43]). Between the two experts and under supervision of an independent moderator, a consensus was achieved for those cases where the experts had disagreed, and the result was used as "gold standard" for further analysis. At least some degree of willingness of DS (data sharing) was expressed in 63.5% (127/200) of cases. Of these cases, around one quarter (31/127) were vague statements of support for data sharing but without useful detail. In around half of the cases (60/127) it was stated that IPD could be obtained by request. In only slightly more than 10% of the cases (15/127) was it stated that the IPD would be transferred to a specific data repository. In the remaining cases (21/127), a more complex regime was described or referenced, which could not be allocated to one of the three previous groups.
As a result of the consensus meetings, the classification system was updated. CONCLUSION: The study showed that the current DSS that imply possible data sharing are often not easy to interpret, even by relatively experienced staff. Machine based interpretation, which would be necessary for any practical application, is currently not possible. Machine learning and / or natural language processing techniques might improve machine actionability, but would represent a very substantial investment of research effort. The cheaper and easier option would be for data providers, data requestors, funders and platforms to adopt a clearer, more structured and more standardised approach to specifying, providing and collecting DSS. TRIAL REGISTRATION: The protocol for the study was pre-registered on ZENODO ( https://zenodo.org/record/7064624#.Y4DIAHbMJD8 ).
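
The pairwise kappa values reported in this abstract are chance-corrected agreement scores between two assessors. A minimal sketch of unweighted Cohen's kappa follows; the function name and the toy labels are hypothetical, and whether the study used the unweighted form is an assumption.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    # Observed proportion of items on which the raters agree.
    p_obs = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's category frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    p_exp = sum(freq_a[cat] * freq_b[cat] for cat in freq_a) / (n * n)
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (p_obs - p_exp) / (1 - p_exp)


# Toy classification of four statements by two assessors.
expert_1 = ["yes", "yes", "no", "no"]
expert_2 = ["yes", "no", "no", "no"]
print(cohen_kappa(expert_1, expert_2))
```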


Subjects
Information Dissemination , Research Design , Humans , Information Dissemination/methods , Consensus , Registries
18.
Eur Radiol ; 34(10): 6877-6884, 2024 Oct.
Article in English | MEDLINE | ID: mdl-38488970

ABSTRACT

BACKGROUND: The Paris classification categorises colorectal polyp morphology. Interobserver agreement for Paris classification has been assessed at optical colonoscopy (OC) but not CT colonography (CTC). We aimed to determine the following: (1) interobserver agreement for the Paris classification using CTC between radiologists; (2) if radiologist experience influenced classification, gross polyp morphology, or polyp size; and (3) the extent to which radiologist classifications agreed with (a) colonoscopy and (b) a combined reference standard. METHODS: Following ethical approval for this non-randomised prospective cohort study, seven radiologists from three hospitals classified 52 colonic polyps using the Paris system. We calculated interobserver agreement using Fleiss kappa and mean pairwise agreement (MPA). Absolute agreement was calculated between radiologists; between CTC and OC; and between CTC and a combined reference standard using all available imaging, colonoscopic, and histopathological data. RESULTS: Overall interobserver agreement between the seven readers was fair (Fleiss kappa 0.33; 95% CI 0.30-0.37; MPA 49.7%). Readers with < 1500 CTC experience had higher interobserver agreement (0.42 (95% CI 0.35-0.48) vs. 0.33 (95% CI 0.25-0.42)) and MPA (69.2% vs. 50.6%) than readers with ≥ 1500 experience. There was substantial overall agreement for flat vs. protuberant polyps (0.62 (95% CI 0.56-0.68)) with an MPA of 87.9%. Agreement between CTC and OC classifications was only 44%, and CTC agreement with the combined reference standard was 56%. CONCLUSION: Radiologist agreement when using the Paris classification at CT colonography is low, and radiologist classification agrees poorly with colonoscopy. Using the full Paris classification in routine CTC reporting is of questionable value.
CLINICAL RELEVANCE STATEMENT: Interobserver agreement for radiologists using the Paris classification to categorise colorectal polyp morphology is only fair; routine use of the full Paris classification at CT colonography is questionable. KEY POINTS: • Overall interobserver agreement for the Paris classification at CT colonography (CTC) was only fair, and lower than for colonoscopy. • Agreement was higher for radiologists with < 1500 CTC experience and for larger polyps. There was substantial agreement when classifying polyps as protuberant vs flat. • Agreement between CTC and colonoscopic polyp classification was low (44%).
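
Fleiss kappa generalises chance-corrected agreement to more than two readers. The sketch below is an illustration only, not the study's analysis: it computes Fleiss kappa from a table of per-item category counts and, under the assumption that MPA means the average proportion of agreeing reader pairs with complete ratings, returns that value as well (it then equals the observed-agreement term of Fleiss kappa). The function name and the example counts are hypothetical.

```python
def fleiss_kappa_and_mpa(counts):
    """Return (Fleiss kappa, mean pairwise agreement).

    counts[i][j] is the number of readers who put item i in category j.
    Every item must be rated by the same number of readers (at least 2).
    """
    if not counts:
        raise ValueError("counts must contain at least one item")
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Per-item agreement: fraction of reader pairs that chose the same category.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    observed = sum(per_item) / n_items  # equals MPA under the stated assumption

    # Chance agreement from the overall share of ratings in each category.
    total = n_items * n_raters
    shares = [sum(row[j] for row in counts) / total for j in range(n_categories)]
    expected = sum(p * p for p in shares)

    if expected == 1.0:
        return 1.0, observed  # only one category was ever used
    return (observed - expected) / (1.0 - expected), observed


# Hypothetical example: 4 polyps, 3 readers, 2 categories (flat, protuberant).
kappa, mpa = fleiss_kappa_and_mpa([[3, 0], [0, 3], [2, 1], [1, 2]])
print(round(kappa, 3), round(mpa, 3))
```

Returning both values from one pass makes the relationship explicit: kappa rescales the raw pairwise agreement by what would be expected from the category frequencies alone, which is why a seemingly moderate MPA can correspond to only fair kappa.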


Subjects
Colonic Polyps , Colonography, Computed Tomographic , Observer Variation , Humans , Colonic Polyps/diagnostic imaging , Colonography, Computed Tomographic/methods , Prospective Studies , Male , Female , Middle Aged , Aged , Colonoscopy/methods , Adult
19.
Radiother Oncol ; 194: 110196, 2024 May.
Article in English | MEDLINE | ID: mdl-38432311

ABSTRACT

BACKGROUND AND PURPOSE: Studies investigating the application of Artificial Intelligence (AI) in the field of radiotherapy exhibit substantial variations in terms of quality. The goal of this study was to assess the amount of transparency and bias in scoring articles with a specific focus on AI-based segmentation and treatment planning, using modified PROBAST and TRIPOD checklists, in order to provide recommendations for future guideline developers and reviewers. MATERIALS AND METHODS: The TRIPOD and PROBAST checklist items were discussed and modified using a Delphi process. After consensus was reached, 2 groups of 3 co-authors scored 2 articles to evaluate usability and further optimize the adapted checklists. Finally, 10 articles were scored by all co-authors. Fleiss' kappa was calculated to assess the reliability of agreement between observers. RESULTS: Three of the 37 TRIPOD items and 5 of the 32 PROBAST items were deemed irrelevant. General terminology in the items (e.g., multivariable prediction model, predictors) was modified to align with AI-specific terms. After the first scoring round, further improvements of the items were formulated, e.g., by preventing the use of sub-questions or subjective words and adding clarifications on how to score an item. Using the final consensus list to score the 10 articles, only 2 out of the 61 items resulted in a statistically significant kappa of 0.4 or more, demonstrating substantial agreement. For 41 items no statistically significant kappa was obtained, indicating that the level of agreement among multiple observers is due to chance alone. CONCLUSION: Our study showed low reliability scores with the adapted TRIPOD and PROBAST checklists. Although such checklists have shown great value during development and reporting, this raises concerns about the applicability of such checklists to objectively score scientific articles for AI applications. When developing or revising guidelines, it is essential to consider their applicability to score articles without introducing bias.


Subjects
Artificial Intelligence , Checklist , Delphi Technique , Radiotherapy Planning, Computer-Assisted , Humans , Radiotherapy Planning, Computer-Assisted/methods , Radiotherapy Planning, Computer-Assisted/standards , Practice Guidelines as Topic , Bias , Reproducibility of Results , Neoplasms/radiotherapy
20.
Stat Methods Med Res ; 33(3): 532-553, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38320802

ABSTRACT

Reliability of measurement instruments providing quantitative outcomes is usually assessed by an intraclass correlation coefficient. When participants are repeatedly measured by a single rater or device, or are each rated by a different group of raters, the intraclass correlation coefficient is based on a one-way analysis of variance model. When planning a reliability study, it is essential to determine the number of participants and measurements per participant (i.e. number of raters or number of repeated measurements). Three different sample size determination approaches under the one-way analysis of variance model were identified in the literature, all based on a confidence interval for the intraclass correlation coefficient. Although eight different confidence interval methods can be identified, the Wald confidence interval with Fisher's large sample variance approximation remains the most commonly used despite its well-known poor statistical properties. Therefore, a first objective of this work is to compare the statistical properties of all identified confidence interval methods, including those overlooked in previous studies. A second objective is to develop a general procedure to determine the sample size using all approaches, since a closed-form formula is not always available. This procedure is implemented in an R Shiny app. Finally, we provide advice for choosing an appropriate sample size determination method when planning a reliability study.
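
For orientation, the point estimate of the intraclass correlation coefficient under a balanced one-way analysis of variance model can be written as (MSB - MSW) / (MSB + (k - 1) MSW), where MSB and MSW are the between- and within-participant mean squares and k is the number of measurements per participant. The sketch below illustrates only this standard estimator; it is not the authors' R Shiny app, it does not implement any of the confidence interval or sample size methods discussed, and the function name and data are hypothetical.

```python
def icc_oneway(data):
    """ICC point estimate for a balanced one-way ANOVA design.

    data[i] holds the k repeated measurements of participant i
    (k >= 2, identical for every participant; at least 2 participants).
    """
    n = len(data)
    if n < 2:
        raise ValueError("need at least two participants")
    k = len(data[0])
    if k < 2 or any(len(row) != k for row in data):
        raise ValueError("need the same number (>= 2) of measurements per participant")

    grand_mean = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]

    # Between-participant mean square (n - 1 degrees of freedom).
    msb = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    # Within-participant mean square (n * (k - 1) degrees of freedom).
    msw = sum(
        (x - m) ** 2 for row, m in zip(data, row_means) for x in row
    ) / (n * (k - 1))

    denominator = msb + (k - 1) * msw
    if denominator == 0:
        raise ValueError("all measurements are identical; ICC is undefined")
    return (msb - msw) / denominator


# Hypothetical data: 3 participants, 2 measurements each, no within-participant spread.
print(icc_oneway([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]))  # 1.0
```

The estimate can be negative when within-participant variation exceeds between-participant variation, which is one reason interval estimates matter when planning a reliability study.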


Subjects
Sample Size , Humans , Reproducibility of Results , Observer Variation , Analysis of Variance