Results 1 - 20 of 46,347
1.
Langenbecks Arch Surg ; 409(1): 170, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38822883

ABSTRACT

PURPOSE: Perioperative decision making for large (> 2 cm) rectal polyps with ambiguous features is complex. The most common intraprocedural assessment is clinician judgement alone, while radiological imaging and endoscopic biopsy can provide periprocedural detail. Fluorescence-augmented machine learning (FA-ML) methods may optimise local treatment strategy. METHODS: Surgeons of varying grades, all performing colonoscopies independently, were asked to visually judge endoscopic videos of large benign and early-stage malignant (potentially suitable for local excision) rectal lesions on an interactive video platform (Mindstamp), with results compared with and between final pathology, radiology and a novel FA-ML classifier. Statistical analyses used Fleiss multi-rater kappa scoring, the Spearman coefficient and frequency tables. RESULTS: Thirty-two surgeons judged 14 ambiguous polyp videos (7 benign, 7 malignant). In all cancers, initial endoscopic biopsy had yielded false-negative results. Five of each lesion type had had a pre-excision MRI, with a 60% false-positive malignancy prediction in benign lesions and a 60% over-staging and 40% equivocal rate in cancers. Average clinical visual cancer judgement accuracy was 49% (with only 'fair' inter-rater agreement); many surgeons reported uncertainty, and higher reported decision confidence did not correspond to higher accuracy. This compared with 86% accuracy for the ML classifier. Size was misjudged visually by a mean of 20%, with polyp size underestimated in 4/6 and overestimated in 2/6. Subjective narratives regarding decision-making, requested for 7/14 lesions, revealed wide variation in rationale between participants. CONCLUSION: Currently available clinical means of assessing ambiguous rectal lesions are suboptimal, with wide inter-observer variation. Fluorescence-based AI augmentation may advance this field via objective, explainable ML methods.


Subject(s)
Colonoscopy , Rectal Neoplasms , Humans , Rectal Neoplasms/pathology , Rectal Neoplasms/surgery , Rectal Neoplasms/diagnostic imaging , Intestinal Polyps/pathology , Intestinal Polyps/surgery , Machine Learning , Male , Fluorescence , Female , Observer Variation
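Note on the statistic: the abstract above reports Fleiss multi-rater kappa for agreement among the 32 surgeons. As a rough illustration of how that statistic is usually computed, here is a minimal pure-Python sketch. The function name `fleiss_kappa` and the toy count tables are assumptions made for illustration; they are not the study's code or data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table of rating counts.

    counts[i][j] = number of raters who put subject i into category j.
    Every subject must be rated by the same number of raters.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Share of all ratings that fell into each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement per subject: proportion of agreeing rater pairs.
    p_subj = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_subj) / n_subjects
    p_expected = sum(p * p for p in p_cat)
    # Undefined (division by zero) if every rating is in a single category.
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical example: 4 videos, 3 raters, categories (benign, malignant).
toy_counts = [[3, 0], [0, 3], [2, 1], [1, 2]]
print(round(fleiss_kappa(toy_counts), 3))
```

Values near 0 indicate agreement no better than chance; by the commonly used Landis and Koch labels, 0.21 to 0.40 is described as 'fair'.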
2.
BMC Musculoskelet Disord ; 25(1): 388, 2024 May 18.
Article in English | MEDLINE | ID: mdl-38762738

ABSTRACT

BACKGROUND: A variety of measurement methods and imaging modalities are in use to quantify the morphology of lateral femoral condyle (LFC), but the most reliable method remains elusive in patients with lateral patellar dislocation (LPD). The purpose of this study was to determine the intra- and inter-observer reliability of different measurement methods for evaluating the morphology of LFC on different imaging modalities in patients with LPD. METHODS: Seventy-three patients with LPD were included. Four parameters for quantifying the morphology of LFC were retrospectively measured by three observers on MRI, sagittal CT image, conventional radiograph (CR), and three-dimensional CT (3D-CT). The intra-class correlation coefficient was calculated to determine the intra- and inter-observer reliability. Bland-Altman analysis was conducted to identify the bias between observers. RESULTS: The lateral femoral condyle index (LFCI) showed better intra- and inter-observer reliability on MRI and 3D-CT than on CR and sagittal CT images. The mean difference in the LFCI between observers was lowest on 3D-CT (0.047), higher on MRI (0.053), and highest on sagittal CT images (0.062). The LFCI was associated with the lateral femoral condyle ratio (ρ = 0.422, P = 0.022), lateral condyle index (r = 0.413, P = 0.037), and lateral femoral condyle distance (r = 0.459, P = 0.014). CONCLUSION: The LFCI could be reliably measured by MRI and 3D-CT. The LFCI was associated with both the height and length of LFC and could serve as a comprehensive parameter for quantifying the morphology of LFC in patients with LPD.


Subject(s)
Femur , Imaging, Three-Dimensional , Magnetic Resonance Imaging , Observer Variation , Patellar Dislocation , Tomography, X-Ray Computed , Humans , Female , Male , Reproducibility of Results , Patellar Dislocation/diagnostic imaging , Magnetic Resonance Imaging/methods , Femur/diagnostic imaging , Retrospective Studies , Young Adult , Adult , Imaging, Three-Dimensional/methods , Adolescent
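Note on the statistic: the abstract above uses Bland-Altman analysis to quantify bias between observers. A minimal sketch of the usual calculation follows: the bias is the mean of the paired differences, and the 95% limits of agreement are the bias plus or minus 1.96 standard deviations of those differences. The function name `bland_altman` and the sample measurements are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def bland_altman(observer_a, observer_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias   = mean of the paired differences (a - b)
    limits = bias +/- 1.96 * sample SD of the differences
    """
    if len(observer_a) != len(observer_b) or len(observer_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(observer_a, observer_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical index values measured by two observers on the same knees.
obs_a = [0.92, 0.88, 1.01, 0.95, 0.90]
obs_b = [0.90, 0.91, 0.97, 0.93, 0.86]
bias, lower, upper = bland_altman(obs_a, obs_b)
print(f"bias={bias:.3f}, limits=({lower:.3f}, {upper:.3f})")
```

The 1.96 multiplier assumes the differences are roughly normally distributed; a smaller absolute bias and narrower limits indicate closer agreement between observers.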
3.
Early Hum Dev ; 193: 106019, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38718464

ABSTRACT

BACKGROUND: Prechtl's General Movement Assessment (GMA) at fidgety age (3-5 months) is a widely used tool for early detection of cerebral palsy. Further to GMA classification, detailed assessment of movement patterns at fidgety age is conducted with the Motor Optimality Score-Revised (MOS-R). Inter-rater reliability and agreement are properties that inform test application and interpretation in clinical and research settings. This study aims to establish the inter-rater reliability and agreement of the GMA classification and MOS-R in a large population-based sample. METHODS: A cross-sectional study of 773 infants from a birth cohort in Perth, Western Australia. GMA was conducted on home-recorded videos collected between 12+0 and 16+6 weeks post-term age. Videos were independently scored by two masked experienced assessors. Inter-rater reliability and agreement were assessed using the intraclass correlation coefficient and limits of agreement, respectively, for continuous variables, and Cohen's kappa and Gwet's Agreement Coefficient, and percentage agreement, respectively, for discrete variables. RESULTS: The classification of GMA showed almost perfect reliability (AC1 = 0.999) and agreement (99.9%). Total MOS-R scores showed good-excellent reliability (ICC 0.857, 95% CI 0.838-0.876) and clinically acceptable agreement (95% limits of agreement of ±2.5 points). Substantial to almost perfect reliability and agreement were found for all MOS-R domain subscores. While MOS-R domains with higher redundancy in their categorisation have higher reliability and agreement, inter-rater reliability and agreement are substantial to almost perfect at the item level and are consistent across domains. CONCLUSION: GMA at fidgety age shows clinically acceptable inter-rater reliability and agreement for GMA classification and MOS-R for population-based cohorts assessed by experienced assessors.


Subject(s)
Cerebral Palsy , Observer Variation , Humans , Female , Cerebral Palsy/diagnosis , Cerebral Palsy/physiopathology , Male , Infant , Reproducibility of Results , Movement/physiology , Cross-Sectional Studies , Western Australia , Motor Skills/physiology
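Note on the statistic: the abstract above reports both Cohen's kappa and Gwet's AC1 for categorical ratings by two assessors. One common reason for reporting both is that kappa can collapse when one category is very rare, even if raw agreement is high. The sketch below illustrates that behaviour under stated assumptions: the function name `two_rater_agreement` and the ratings are hypothetical, and the number of categories is taken from the labels actually observed.

```python
from collections import Counter


def two_rater_agreement(rater1, rater2):
    """Return (percent_agreement, cohen_kappa, gwet_ac1) for two raters."""
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("ratings must be paired and non-empty")
    n = len(rater1)
    categories = sorted(set(rater1) | set(rater2))
    if len(categories) < 2:
        raise ValueError("need at least two observed categories")

    p_agree = sum(a == b for a, b in zip(rater1, rater2)) / n
    c1, c2 = Counter(rater1), Counter(rater2)

    # Cohen's kappa: chance agreement from the product of the marginals.
    pe_kappa = sum((c1[c] / n) * (c2[c] / n) for c in categories)
    kappa = (p_agree - pe_kappa) / (1 - pe_kappa)

    # Gwet's AC1: chance agreement from the average marginal per category.
    avg = [(c1[c] + c2[c]) / (2 * n) for c in categories]
    pe_ac1 = sum(p * (1 - p) for p in avg) / (len(categories) - 1)
    ac1 = (p_agree - pe_ac1) / (1 - pe_ac1)
    return p_agree, kappa, ac1


# Hypothetical skewed sample: 98 of 100 infants rated "normal" by both raters,
# and the raters disagree on the remaining two.
r1 = ["normal"] * 98 + ["abnormal", "normal"]
r2 = ["normal"] * 98 + ["normal", "abnormal"]
agree, kappa, ac1 = two_rater_agreement(r1, r2)
print(f"agreement={agree:.2f}, kappa={kappa:.3f}, AC1={ac1:.3f}")
```

In this made-up sample raw agreement is 98%, yet kappa is close to zero while AC1 stays high, which is why AC1 is often preferred when category prevalence is strongly skewed.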
4.
Clin Radiol ; 79(7): e957-e962, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38693034

ABSTRACT

AIM: The comparison between chest x-ray (CXR) and computed tomography (CT) images is commonly required in clinical practice to assess the evolution of chest pathological manifestations. Intrinsic differences between the two techniques, however, limit reader confidence in such a comparison. CT average intensity projection (AIP) reconstruction allows obtaining "synthetic" CXR (s-CXR) images, which are thought to have the potential to increase the accuracy of comparison between CXR and CT imaging. We aimed to assess the diagnostic performance of s-CXR imaging in detecting common pleuro-parenchymal abnormalities. MATERIALS AND METHODS: 142 patients who underwent chest CT examination and CXR within 24 hours were enrolled. CT was the standard of reference. Both conventional CXR (c-CXR) and s-CXR images were retrospectively reviewed for the presence of consolidation, nodule/mass, linear opacities, reticular opacities, and pleural effusion by 3 readers in two separate sessions. Sensitivity, specificity, accuracy and their 95% confidence intervals (CI) were calculated for each reader and setting and tested by McNemar test. Inter-observer agreement was tested by Cohen's K test and its 95% CI. RESULTS: Overall, s-CXR sensitivity ranged 45-67% for consolidation, 12-28% for nodule/mass, 17-33% for linear opacities, 2-61% for reticular opacities, and 33-58% for pleural effusion; specificity 65-83%, 83-94%, 94-98%, 93-100% and 79-86%; accuracy 66-68%, 74-79%, 89-91%, 61-65% and 68-72%, respectively. K values ranged 0.38 to 0.50, 0.05 to 0.25, -0.05 to 0.11, -0.01 to 0.15, and 0.40 to 0.66 for consolidation, nodule/mass, linear opacities, reticular opacities, and pleural effusion, respectively. CONCLUSION: S-CXR images, reconstructed with AIP technique, can be compared with conventional images in clinical practice and for educational purposes.


Subject(s)
Radiography, Thoracic , Sensitivity and Specificity , Tomography, X-Ray Computed , Humans , Male , Female , Tomography, X-Ray Computed/methods , Middle Aged , Retrospective Studies , Aged , Radiography, Thoracic/methods , Adult , Aged, 80 and over , Radiographic Image Interpretation, Computer-Assisted/methods , Pleural Diseases/diagnostic imaging , Reproducibility of Results , Observer Variation
5.
Sci Rep ; 14(1): 12133, 2024 05 27.
Article in English | MEDLINE | ID: mdl-38802436

ABSTRACT

Epithelial ovarian cancer is mostly discovered at the stage of peritoneal carcinomatosis. Complete cytoreductive surgery improves overall survival. The Fagotti score is a predictive score of resectability based on laparoscopic exploration of the peritoneum. Our aim was to study the inter-observer concordance in an external validation of the Fagotti score. An observational, prospective, multicenter study was conducted using the Francogyn research network. The primary outcome was inter-observer concordance of the Fagotti score. Fifteen patients in whom an ovarian mass was discovered were included. For each patient, the first exploratory laparoscopy before any treatment/chemotherapy was recorded. This bank of 15 videos was subject to blind review accompanied by a Fagotti score rating by 11 gynecological surgeons specializing in oncology. A total of 165 blind reviews were performed. Inter-observer concordance was very good for the Fagotti score with an intraclass correlation coefficient (ICC) of 0.83 [95% CI 0.71; 0.93]. Inter-observer concordance for the adjusted Fagotti score, which accounts for unexplorable areas with extensive carcinomatosis, resulted in an ICC of 0.64 [95% CI 0.46; 0.82]. According to the reviewers, the three least explorable parameters were mesentery involvement, stomach infiltration and liver damage. The ICC of the explorable Fagotti score, i.e. the score with deletion of the parameters most often unexplored by laparoscopy, was 0.86 [0.75-0.94]. This study confirms the reproducibility of the Fagotti score during first assessment laparoscopies in cases of advanced ovarian cancer. The explorable Fagotti score has an equivalent or better inter-observer concordance than the Fagotti score.


Subject(s)
Ovarian Neoplasms , Humans , Female , Ovarian Neoplasms/pathology , Ovarian Neoplasms/mortality , Ovarian Neoplasms/surgery , Middle Aged , Prospective Studies , Aged , Laparoscopy , Observer Variation , Cytoreduction Surgical Procedures , Carcinoma, Ovarian Epithelial/pathology , Carcinoma, Ovarian Epithelial/mortality , Adult , Reproducibility of Results
6.
Invest Ophthalmol Vis Sci ; 65(5): 38, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38787547

ABSTRACT

Purpose: Visual snow is the hallmark of the neurological condition visual snow syndrome (VSS) but the characteristics of the visual snow percept remain poorly defined. This study aimed to quantify its appearance, interobserver variability, and effect on measured visual performance and self-reported visual quality. Methods: Twenty-three participants with VSS estimated their visual snow dot size, separation, luminance, and flicker rate by matching to a simulation. To assess whether visual snow masks vision, we compared pattern discrimination thresholds for textures that were similar in spatial scale to visual snow as well as more coarse than visual snow, in participants with VSS, and with and without external noise simulating visual snow in 23 controls. Results: Mean and 95% confidence intervals for visual snow appearance were: size (6.0, 5.8-6.3 arcseconds), separation (2.0, 1.7-2.3 arcmin), luminance (72.4, 58.1-86.8 cd/m2), and flicker rate (25.8, 18.9-32.8 frames per image at 120 hertz [Hz]). Participants with finer dot spacing estimates also reported greater visibility of their visual snow (τb = -0.41, 95% confidence interval [CI] = -0.62 to -0.13, P = 0.01). In controls, adding simulated fine-scale visual snow to textures increased thresholds for fine but not coarse textures (F(1, 22) = 4.98, P = 0.036, ηp2 = 0.19). In VSS, thresholds for fine and coarse textures were similar (t(22) = 0.54, P = 0.60), suggesting that inherent visual snow does not act like external noise in controls. Conclusions: Our quantitative estimates of visual snow constrain its likely neural origins, may aid differential diagnosis, and inform future investigations of how it affects vision. Methods to quantify visual snow are needed for evaluation of potential treatments.


Subject(s)
Visual Acuity , Humans , Male , Female , Adult , Middle Aged , Visual Acuity/physiology , Young Adult , Sensory Thresholds/physiology , Vision Disorders/physiopathology , Vision Disorders/diagnosis , Aged , Visual Perception/physiology , Observer Variation , Pattern Recognition, Visual/physiology , Perceptual Disorders
7.
J Am Heart Assoc ; 13(11): e033723, 2024 Jun 04.
Article in English | MEDLINE | ID: mdl-38780180

ABSTRACT

BACKGROUND: Studies reporting on the incidence of sudden cardiac arrest and/or death (SCA/D) in athletes commonly lack methodological and reporting rigor, which has implications for screening and preventative policy in sport. To date, there are no tools designed for assessing study quality in studies investigating the incidence of SCA/D in athletes. METHODS AND RESULTS: The International Criteria for Reporting Study Quality for Sudden Cardiac Arrest/Death tool (IQ-SCA/D) was developed following a Delphi process. Sixteen international experts in sports cardiology were identified and invited. Experts voted on each domain with subsequent moderated discussion for successive rounds until consensus was reached for a final tool. Interobserver agreement between a novice, intermediate, and expert observer was then assessed from the scoring of 22 relevant studies using weighted and unweighted κ analyses. The final IQ-SCA/D tool comprises 8 domains with a summated score of a possible 22. Studies are categorized as low, intermediate, and high quality with summated IQ-SCA/D scores of ≤11, 12 to 16, and ≥17, respectively. Interrater agreement was "substantial" between all 3 observers for summated IQ-SCA/D scores and study categorization. CONCLUSIONS: The IQ-SCA/D is an expert consensus tool for assessing the study quality of research reporting the incidence of SCA/D in athletes. This tool may be used to assist researchers, reviewers, journal editors, and readers in contextualizing the methodological quality of different studies with varying athlete SCA/D incidence estimates. Importantly, the IQ-SCA/D also provides an expert-informed framework to support and guide appropriate design and reporting practices in future SCA/D incidence trials.


Subject(s)
Consensus , Death, Sudden, Cardiac , Delphi Technique , Humans , Death, Sudden, Cardiac/epidemiology , Death, Sudden, Cardiac/prevention & control , Death, Sudden, Cardiac/etiology , Incidence , Research Design/standards , Athletes , Sports Medicine/standards , Sports Medicine/methods , Observer Variation
8.
Early Hum Dev ; 193: 106021, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38701668

ABSTRACT

OBJECTIVE: Fetal face measurements in prenatal ultrasound can aid in identifying craniofacial abnormalities in the developing fetus. However, the accuracy and reliability of ultrasound measurements can be affected by factors such as fetal position, image quality, and the sonographer's expertise. This study assesses the accuracy and reliability of fetal facial measurements in prenatal ultrasound. Additionally, the temporal evolution of measurements is studied, comparing prenatal and postnatal measurements. METHODS: Three different experts located up to 23 facial landmarks in 49 prenatal 3D ultrasound scans from normal Caucasian fetuses at weeks 20, 26, and 35 of gestation. Intra- and inter-observer variability was obtained. Postnatal facial measurements were also obtained at 15 days and 1 month postpartum. RESULTS: Most facial landmarks exhibited low errors, with overall intra- and inter-observer errors of 1.01 mm and 1.60 mm, respectively. Landmarks on the nose were found to be the most reliable, while the most challenging ones were those located on the ears and eyes. Overall, scans obtained at 26 weeks of gestation presented the best trade-off between observer variability and landmark visibility. The temporal evolution of the measurements revealed that the lower face area had the highest rate of growth throughout the latest stages of pregnancy. CONCLUSIONS: Craniofacial landmarks can be evaluated using 3D fetal ultrasound, especially those located on the nose, mouth, and chin. Despite its limitations, this study provides valuable insights into prenatal and postnatal biometric changes over time, which could aid in developing predictive models for postnatal measurements based on prenatal data.


Subject(s)
Face , Ultrasonography, Prenatal , Humans , Female , Ultrasonography, Prenatal/methods , Ultrasonography, Prenatal/standards , Face/diagnostic imaging , Face/embryology , Face/anatomy & histology , Pregnancy , Imaging, Three-Dimensional/methods , Longitudinal Studies , Observer Variation , Reproducibility of Results , Adult
9.
Jt Dis Relat Surg ; 35(2): 324-329, 2024 Feb 13.
Article in English | MEDLINE | ID: mdl-38727111

ABSTRACT

OBJECTIVES: This study aims to evaluate the inter-observer reliability of fibula-condyle-patella angle measurements and to compare it with other measurement techniques. PATIENTS AND METHODS: A total of 108 patients (20 males, 88 females; mean age: 47.5±12.0 years; range, 18 to 72 years) who underwent knee X-rays between January 01, 2023 and January 31, 2023 were retrospectively analyzed using the fibula-condyle-patella angle, Insall-Salvati, Caton-Deschamps, Blackburne-Pell, and plateau-patella angle (PPA) methods. Knee lateral radiographs taken in at least 30 degrees of flexion and appropriate rotation were scanned. All measurements were made by two orthopedic surgeons who were blinded to measurement methods. RESULTS: Right knee patellar height measurements were conducted in 56 patients, while left knee patellar heights were assessed in 52 patients. The highest inter-observer concordance was found in the fibula-condyle-patella angle. The second highest concordance was found in the Insall-Salvati. The highest concordance correlation was found with PPA in the measurements of both researchers. CONCLUSION: The fibula-condyle-patella angle is a reliable technique with a good inter-observer reliability for measuring patellar height. We believe that this study will inspire future research to establish comprehensive reference values for clinical applications.


Subject(s)
Fibula , Observer Variation , Patella , Humans , Female , Male , Fibula/diagnostic imaging , Fibula/anatomy & histology , Adult , Patella/diagnostic imaging , Patella/anatomy & histology , Middle Aged , Aged , Retrospective Studies , Adolescent , Young Adult , Reproducibility of Results , Radiography/methods , Knee Joint/diagnostic imaging , Knee Joint/anatomy & histology
10.
Clin Transplant ; 38(6): e15335, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38804610

ABSTRACT

BACKGROUND: Antibody-mediated rejection (AMR) often leads to chronic kidney allograft damage and is a critical cause of allograft failure. The Banff classification, used to diagnose AMR, has become complex and challenging for clinicians. A Banff-based histologic chronicity index (CI) was recently proposed as a simplified prognostic indicator. Its reliability and reproducibility have not been externally validated. METHODS: This study investigated 71 kidney allograft biopsies diagnosed with AMR. Interobserver reproducibility of the recently proposed CI and its components (cg, cv, ct, and ci) were assessed. The association between CI and allograft failure was analyzed, and CI cut-off values were evaluated by Cox proportional hazards regression and Kaplan-Meier estimator with log-rank test. RESULTS: The study confirmed the association of CI with allograft failure, but also revealed that the assessment of CI varied between pathologists, impacting its reproducibility as a prognostic tool. Only 49 (69.0%) of the biopsies showed complete agreement on the proposed cut-off value of CI < 4 or CI ≥ 4. Furthermore, this cut-off did not reliably stratify allograft failure. Notably, the cg score, which carries significant weight in the CI calculation, had the lowest agreement between observers (kappa = .281). CONCLUSIONS: While a simplified prognostic indicator for AMR is needed, this study highlights the limitations of CI, particularly its poor interobserver reproducibility. Our findings suggest that clinicians should interpret CI cautiously and consider establishing their own cut-off values. This study underscores the need to address interobserver reproducibility before CI can be widely adopted for AMR management.


Subject(s)
Graft Rejection , Graft Survival , Kidney Transplantation , Observer Variation , Humans , Graft Rejection/pathology , Graft Rejection/etiology , Graft Rejection/diagnosis , Female , Male , Prognosis , Middle Aged , Follow-Up Studies , Reproducibility of Results , Adult , Risk Factors , Retrospective Studies , Glomerular Filtration Rate , Postoperative Complications , Kidney Function Tests
11.
West J Emerg Med ; 25(2): 264-267, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38596928

ABSTRACT

Introduction: The use of a reliable scoring system for quality assessment (QA) is imperative to limit inconsistencies in measuring ultrasound acquisition skills. The current grading scale used for QA endorsed by the American College of Emergency Physicians (ACEP) is non-specific, applies irrespective of the type of study performed, and has not been rigorously validated. Our goal in this study was to determine whether a succinct, organ-specific grading scale designed for lung-specific QA would be more precise with better interobserver agreement. Methods: This was a prospective validation study of an objective QA scale for lung ultrasound (LUS) in the emergency department. We identified the first 100 LUS performed in normal clinical practice in the year 2020. Four reviewers at an urban academic center who were either emergency ultrasound fellowship-trained or current fellows with at least six months of QA experience scored each study, resulting in a total of 400. The primary outcome was the level of agreement between the reviewers. Our secondary outcome was the variability of the scores given to the studies. For the agreement between reviewers, we computed the intraclass correlation coefficient (ICC) based on a two-way random-effect model with a single rater for each grading scale. We generated 10,000 bootstrapped ICCs to construct 95% confidence intervals (CI) for both grading systems. A two-sided one-sample t-test was used to determine whether there were differences in the bootstrapped ICCs between the two grading systems. Results: The ICC between reviewers was 0.552 (95% CI 0.40-0.68) for the ACEP grading scale and 0.703 (95% CI 0.59-0.79) for the novel grading scale (P < 0.001), indicating significantly more interobserver agreement using the novel scale compared to the ACEP scale. The variance of scores was similar (0.93 and 0.92 for the novel and ACEP scales, respectively). 
Conclusion: We found an increased interobserver agreement between reviewers when using the novel, organ-specific scale when compared with the ACEP grading scale. Increased consistency in feedback based on objective criteria directed to the specific, targeted organ provides an opportunity to enhance learner education and satisfaction with their ultrasound education.


Subject(s)
Emergency Service, Hospital , Lung , Humans , Lung/diagnostic imaging , Prospective Studies , Ultrasonography , Educational Status , Observer Variation , Reproducibility of Results
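Note on the statistic: the abstract above computes an intraclass correlation coefficient from a two-way random-effects model with a single rater, then bootstraps it to obtain 95% confidence intervals. The sketch below shows one plausible way to do this in pure Python, assuming the absolute-agreement form usually written ICC(2,1) and a percentile bootstrap that resamples studies with replacement. The function names, the resample count and the ratings table are illustrative assumptions, not the authors' code or data.

```python
import random


def icc2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings[i][j] = score given to subject i by rater j.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)               # between-subject mean square
    ms_cols = ss_cols / (k - 1)               # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square
    denom = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    return (ms_rows - ms_error) / denom


def bootstrap_icc_ci(ratings, n_boot=1000, seed=0):
    """Percentile 95% CI obtained by resampling subjects with replacement."""
    rng = random.Random(seed)
    n = len(ratings)
    estimates = []
    for _ in range(n_boot):
        sample = [ratings[rng.randrange(n)] for _ in range(n)]
        try:
            estimates.append(icc2_1(sample))
        except ZeroDivisionError:
            continue  # skip degenerate resamples with no variance at all
    estimates.sort()
    last = len(estimates) - 1
    return estimates[int(0.025 * last)], estimates[int(0.975 * last)]


# Illustrative table: 6 studies scored by 4 reviewers (made-up values).
scores = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc2_1(scores), 2), bootstrap_icc_ci(scores, n_boot=500, seed=1))
```

Resampling whole rows keeps each study's set of reviewer scores together, so the dependence between reviewers scoring the same study is preserved in every bootstrap replicate.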
12.
BMC Psychiatry ; 24(1): 303, 2024 Apr 23.
Article in English | MEDLINE | ID: mdl-38654194

ABSTRACT

BACKGROUND: Facilities providing health and social services for youth are commonly faced with the need for assessment and management of violent behavior. These providers often experience a shortage of resources, compromising the feasibility of conducting comprehensive violence risk assessments. The Violence Risk Assessment Checklist for Youth aged 12-18 (V-RISK-Y) is a 12-item violence risk screening instrument developed to rapidly identify youth at high risk for violent behavior in situations requiring expedient evaluation of violence risk. The V-RISK-Y instrument was piloted in acute psychiatric units for youth, yielding positive results for predictive validity. The aim of the present study was to assess the interrater reliability of V-RISK-Y in child and adolescent psychiatric units and acute child protective services institutions. METHODS: A case vignette study design was utilized to assess interrater reliability of V-RISK-Y. Staff at youth facilities (N = 163) in Norway and Sweden scored V-RISK-Y for three vignettes, and interrater reliability was assessed with the intraclass correlation coefficient (ICC). RESULTS: Results indicate good interrater reliability for the sum score and Low-Moderate-High risk level appraisal across staff from the different facilities and professions. For single items, interrater reliability ranged from poor to excellent. CONCLUSIONS: This study is an important step in establishing the psychometric properties of V-RISK-Y. Findings support the structured professional judgment tradition the instrument is based on, with high agreement on the overall risk assessment. This study had a case vignette design, and the next step is to assess the reliability and validity of V-RISK-Y in naturalistic settings.


Subject(s)
Checklist , Violence , Humans , Adolescent , Violence/psychology , Risk Assessment/methods , Child , Reproducibility of Results , Male , Female , Checklist/standards , Sweden , Observer Variation , Norway , Child Protective Services , Psychometrics
13.
Hum Pathol ; 146: 75-85, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38640986

ABSTRACT

INTRODUCTION: Semi-quantitative scoring of various parameters in renal biopsy is accepted as an important tool to assess disease activity and prognostication. There are concerns about the impact of interobserver variability on its prognostic utility, generating a need for computerized quantification. METHODS: We studied 94 patients with renal biopsies, 45 with native diseases and 49 transplant patients with index biopsies for Polyomavirus nephropathy. Chronicity scores were evaluated using two methods. A standard definition diagram was agreed after international consultation and four renal pathologists scored each parameter in a double-blinded manner. Interstitial fibrosis (IF) score was assessed with five different computerized and AI-based algorithms on trichrome and PAS stains. RESULTS: There was strong prognostic correlation with renal function and graft outcome at a median follow-up ranging from 24 to 42 months respectively, independent of moderate concordance for pathologists' scores. IF scores with two of the computerized algorithms showed significant correlation with estimated glomerular filtration rate (eGFR) at biopsy but not at the end of follow-up. There was poor concordance for AI-based platforms. CONCLUSION: Chronicity scores are robust prognostic tools despite only moderate interobserver reproducibility. AI algorithms have absolute precision but are limited by significant variation when different hardware and software algorithms are used for quantification.


Subject(s)
Artificial Intelligence , Kidney , Observer Variation , Humans , Biopsy , Reproducibility of Results , Kidney/pathology , Male , Female , Prognosis , Middle Aged , Microscopy/methods , Image Interpretation, Computer-Assisted/methods , Adult , Algorithms , Glomerular Filtration Rate , Fibrosis/pathology , Predictive Value of Tests , Kidney Diseases/pathology , Kidney Diseases/diagnosis , Kidney Transplantation , Aged , Polyomavirus Infections/pathology
14.
Codas ; 36(3): e20230175, 2024.
Article in English | MEDLINE | ID: mdl-38629682

ABSTRACT

PURPOSE: To assess the influence of listener experience, measurement scales and the type of speech task on the auditory-perceptual evaluation of the overall severity (OS) of voice deviation and the predominant type of voice (rough, breathy or strained). METHODS: Twenty-two listeners, divided into four groups, participated in the study: speech-language pathologists specialized in voice (SLP-V), SLPs not specialized in voice (SLP-NV), graduate students with auditory-perceptual analysis training (GS-T), and graduate students without auditory-perceptual analysis training (GS-U). The listeners rated the OS of voice deviation and the predominant type of voice of 44 voices using a visual analog scale (VAS) and a numerical scale (score "G" from GRBAS), across six speech tasks: sustained vowels /a/ and /ɛ/, sentences, number counting, running speech, and all five previous tasks together. RESULTS: Sentences obtained the best interrater reliability in each group, using both VAS and GRBAS. The SLP-NV group demonstrated the best interrater reliability in OS judgment in different speech tasks using VAS or GRBAS. Sustained vowels (/a/ and /ɛ/) and running speech obtained the best interrater reliability among the groups of listeners in judging the predominant vocal quality. The GS-T group obtained the best interrater reliability in judging the predominant vocal quality. CONCLUSION: The length of experience in the auditory-perceptual judgment of the voice, the type of training received, and the type of speech task influence the reliability of the auditory-perceptual evaluation of vocal quality.


Subject(s)
Dysphonia , Speech Perception , Humans , Speech , Reproducibility of Results , Speech Production Measurement , Observer Variation , Voice Quality , Speech Acoustics
15.
J Orthop Traumatol ; 25(1): 23, 2024 Apr 23.
Article in English | MEDLINE | ID: mdl-38653863

ABSTRACT

BACKGROUND: The exact positioning of the cephalomedullary (CM) nail entry point for managing femoral fractures remains debatable, with significant implications for fracture reduction and postoperative complications. This study aimed to explore the variability in the selection of the entry point among trauma surgeons, hypothesizing potential differences and their association with surgeon experience. METHODS: In this prospective multicenter study, 16 participants, ranging from residents to senior specialists, partook in a simulation wherein they determined the optimal entry point for the implantation of a proximal femoral nail antirotation (PFN-A; DePuy Synthes) in various femora. The inter- and intra-observer variability was calculated, along with comprehensive descriptive statistical analysis, to assess the variability in entry point selection and the impact of surgeon experience. RESULTS: In this study, the mean distance from the selected entry points to the calculated mean entry point was 3.98 mm, with a smaller distance observed among surgeons with more than 500 implantations (ANOVA, p = 0.050). Intra-surgeon variability for identical femora averaged at 5.14 mm, showing no significant differences across various levels of surgical experience or training. Notably, 13.6% of selected entry points would not allow a proper intramedullary positioning of the implant, thereby rendering anatomical repositioning unfeasible. Among these impossible entry points, a significant skew towards anterior placement was observed (70.6% of the impossible entry points), with a smaller fraction being overly lateral (27.5%) or medial (13.7%). On a patient level, the impossibility rate varied widely from 0 to 35% among the different femora examined, with a significantly higher rate seen in younger patients (mean age 55.02 versus 60.32; t-test for independent samples, p = 0.04). 
CONCLUSIONS: Significant variations exist in surgeons' selection of entry points for proximal femoral nailing, underscoring the task's complexity. Experience does not prevent the choice of unfeasible entry points, emphasizing the inadequacy of a universal approach and pointing towards the necessity for a patient-specific strategy for improved outcomes. TRIAL REGISTRATION NUMBER: DRKS00032465.


Subject(s)
Bone Nails , Femoral Fractures , Fracture Fixation, Intramedullary , Female , Humans , Male , Clinical Competence , Femoral Fractures/surgery , Fracture Fixation, Intramedullary/methods , Fracture Fixation, Intramedullary/instrumentation , Observer Variation , Prospective Studies
16.
J Nepal Health Res Counc ; 21(4): 543-549, 2024 Mar 31.
Article in English | MEDLINE | ID: mdl-38616581

ABSTRACT

BACKGROUND: The American Society of Anesthesiologists Physical Status classification is used by anaesthesiologists worldwide to classify surgical patients. Many studies have found a moderate degree of interrater variability among anaesthesiologists. The general objective of the study was to determine interrater variability among Nepalese anaesthesiologists using this classification system. The specific objectives were to determine the correctness of assignment and the inter-rater variability among anaesthesiologists based on their experience. METHODS: Ten clinical cases were distributed among 130 registered anaesthesiologist practitioners of Nepal after validation with experts. Respondents were asked to assign each of the ten cases to a specific physical status class. Anaesthesiologists were divided into two groups based on clinical experience: more or less than five years. RESULTS: We found substantial agreement in the < 5 years' experience group (0.66), in the > 5 years' experience group (0.753), and among all raters (0.736). The mean score of the group with less than 5 years of experience was higher, but the difference between the mean scores was not significant (p = 0.595). The overall mean score for both groups was 5.66 with SD 1.66. CONCLUSIONS: The study shows that there is very little variation among registered practising anaesthesiologists of Nepal using the American Society of Anesthesiologists Physical Status classification system.


Subject(s)
Anesthesiologists , Observer Variation , Physical Examination , Humans , Nepal , South Asian People , Physical Examination/classification
17.
Forensic Sci Int ; 358: 112009, 2024 May.
Article in English | MEDLINE | ID: mdl-38581823

ABSTRACT

Tire impression evidence can be a valuable tool during a crime scene investigation: it can link vehicles to scenes or secondary locations, and reveal information about the series of events surrounding a crime. The interpretation of tire impression evidence relies on the expertise of forensic tire examiners. To date, there have not been any studies published that empirically evaluate the accuracy and reproducibility of decisions made by tire impression examiners. This paper presents the results of a study in which 17 tire impression examiners and trainees conducted 238 comparisons on 77 distinct questioned impression-known tire comparison sets (QKsets). This study was conducted digitally and addressed examinations based solely upon the characteristics of the tire impression images provided. The quality and characteristics of the impressions were selected to be broadly representative of those encountered in casework. Participants reported their decisions using a multi-level conclusion scale: 68% of responses were class associations (Association of Class Characteristics or Limited Association of Class), 21% were definitive decisions (ID or Exclusion), 8% were probable decisions (High Degree of Association or Indications of Non-Association), and 3% were neutral responses (Not Suitable or Inconclusive). Although class associations were the most reported response type, when definitive decisions were reported, they were often correct: 96% of IDs and 89% of Exclusions were consistent with ground truth regarding the source of the known tire in the QKset. Overall, we observed 4 erroneous definitive decisions (3 Exclusions on mated QKsets; 1 ID on a nonmated QKset) and 1 incorrect probable decision (Indications of Non-Association on a mated QKset).
Decision rates were notably associated with both quality (lower quality questioned impressions were more likely to result in class associations) and dimensionality (2D questioned impressions were more likely to result in definitive decisions), which were correlated factors. Although the study size limits the precision of the measured rates, the results of this study remain valuable to the forensic science and legal communities and provide empirical data regarding examiner performance for a discipline that previously did not have any such estimates.


Subject(s)
Forensic Sciences , Humans , Reproducibility of Results , Forensic Sciences/methods , Decision Making , Observer Variation
18.
Ultrason Imaging ; 46(3): 178-185, 2024 May.
Article in English | MEDLINE | ID: mdl-38622911

ABSTRACT

To evaluate the inter-observer variability and the intra-observer repeatability of pulmonary transit time (PTT) measurement using contrast-enhanced ultrasound (CEUS) in healthy rabbits, and to assess the effects of the dilution concentration of ultrasound contrast agents (UCAs) on PTT. Thirteen healthy rabbits were selected, and UCAs at five concentrations (1:200, 1:100, 1:50, 1:10, and 1:1) were injected into the right ear vein. Five digital loops were obtained from the apical 4-chamber view. Four sonographers obtained PTT by plotting the time-intensity curve (TIC) of the right atrium (RA) and left atrium (LA) at two time points (T1 and T2). The frame counts of the first appearance of UCAs in RA and LA had excellent inter-observer agreement, with intra-class correlation coefficients (ICCs) of 0.996 and 0.988, respectively. Agreement on PTT among the four observers was good at all five concentrations, with ICCs of 0.758-0.873. The reproducibility of PTT obtained by the four observers between T1 and T2 was good, with ICCs of 0.888-0.961. The median inter-observer variability across the 13 rabbits was 6.5%, and the median variability within 14 days for the 4 observers was 1.9%, 1.7%, 2.2%, and 1.9%, respectively. The PTT of the 13 healthy rabbits was 1.01 ± 0.18 seconds. The difference in PTT between the five concentrations was statistically significant. The PTT obtained at concentrations of 1:200 and 1:100 was higher than that at 1:1, while there were no significant differences in PTT among concentrations of 1:1, 1:10, and 1:50. PTT measurement by CEUS in rabbits is feasible, with excellent inter-observer and intra-observer reliability and reproducibility, and the dilution concentration of UCAs influences PTT results.
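The abstract above reports intra-class correlation coefficients but does not say which ICC form was used. As an illustration only, the following minimal sketch computes one common variant, ICC(2,1) (two-way random effects, absolute agreement, single rater), from a subjects-by-raters table. The function name `icc_2_1` and the toy ratings are hypothetical and are not taken from the study.

```python
from statistics import mean


def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per subject and one column per rater.

    Assumption: two-way random effects, absolute agreement, single rater.
    The study above may have used a different ICC form.
    """
    n = len(ratings)          # number of subjects
    k = len(ratings[0])       # number of raters
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    # Mean squares
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Toy data (hypothetical): 3 subjects rated by 2 observers.
identical = [[1, 1], [2, 2], [3, 3]]   # observers agree exactly
offset = [[1, 2], [2, 3], [3, 4]]      # second observer is always 1 higher
print(icc_2_1(identical), icc_2_1(offset))
```

Because this form measures absolute agreement, a constant offset between observers lowers the coefficient even though the observers rank the subjects identically.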


Subject(s)
Contrast Media , Feasibility Studies , Observer Variation , Ultrasonography , Animals , Rabbits , Reproducibility of Results , Ultrasonography/methods , Sulfur Hexafluoride/pharmacokinetics , Pulmonary Circulation/physiology
19.
J Stroke Cerebrovasc Dis ; 33(6): 107700, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38570060

ABSTRACT

OBJECTIVES: With the rising global burden of stroke-related morbidity, and increased focus on patient-centered healthcare, patient reported outcome measures (PROMs) are increasingly used to inform healthcare decision-making. Some stroke patients with cognitive or motor impairments are unable to respond to PROMs, so proxies may respond on their behalf, although the reliability of proxy responses remains unclear. The aim of the study is to update a 2010 systematic review to investigate the inter-rater reliability of proxy respondents answering PROMs for stroke patients. MATERIALS AND METHODS: Studies on the reliability of proxy respondents in stroke were searched within CINAHL, Embase, PsycInfo, and WoS databases (01/07/22, 08/07/22). Fifteen studies were included for review. ICC and k-statistic were extracted for PROMs scales and categorized as poor (0.80). Bias was assessed using the CCAT. RESULTS: Five studies reported PROMs with inter-rater reliability scores ranging from 0.80. Two studies reported activities of daily living (ADLs) scores ranging from 0.41 to 0.80 and 8 studies reported quality of life (QoL) measures with scores ranging from 0.80. Subcategories of these scales included physical (ICC/k-statistic 0.41- >0.8), cognitive (ICC/k-statistic 0.40-0.80), communication (ICC/k-statistic <0.4-0.80), and psychological (ICC/k-statistic <0.40-0.60) measures. CONCLUSIONS: Proxy respondents are reliable sources for PROM reports on physical domains in ADLs, PROMs and QoL scales. Proxy reports for measures of communication and psychological domains had greater variability in reliability scores, ranging from poor to substantial; hence, caution should be applied when interpreting proxy reports for these domains.


Subject(s)
Activities of Daily Living , Patient Reported Outcome Measures , Proxy , Stroke , Humans , Disability Evaluation , Observer Variation , Predictive Value of Tests , Quality of Life , Reproducibility of Results , Stroke/diagnosis , Stroke/therapy , Stroke/psychology , Stroke/physiopathology , Treatment Outcome
20.
J Eval Clin Pract ; 30(4): 670-677, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38588276

ABSTRACT

AIM: The aim of this study was to examine the validity and reliability of the Sitting Assessment Scale (SAS) in individuals with cerebral palsy (CP). METHODS: The study included 34 individuals with a diagnosis of spastic CP. Individuals were evaluated with the Gross Motor Function Classification System (GMFCS) and the Manual Ability Classification System. SAS and the Trunk Control Measurement Scale (TCMS) were applied to the participants. The intraclass correlation coefficient (ICC) was calculated to determine the intraobserver and interobserver reliability of the scale, scored by three different physiotherapists at two different time intervals. Internal consistency was calculated with Cronbach's α coefficient. The fit between SAS and TCMS for criterion-dependent validity was evaluated using Pearson correlation analysis. RESULTS: According to GMFCS level, 79.41% of the children were mildly affected (Level I-II), 14.71% were moderately affected (Level III), and 5.88% were severely affected (Level IV). Intraobserver and interobserver reliability values of SAS were extremely high (ICC interrater > 0.923, ICC intrarater > 0.930). The internal consistency of SAS was high (Cronbach's α test > 0.822, Cronbach's α retest > 0.804). For criterion-dependent validity, positive medium correlations were found between SAS and Total TCMS Static Sitting Balance (r = 0.579, p < 0.001), TCMS Selective Movement Control (r = 0.597, p < 0.001), TCMS Dynamic Reaching (r = 0.609, p < 0.001), and TCMS Total (r = 0.619, p < 0.001). CONCLUSION: SAS was found to have high validity and reliability in children with CP. In addition, the test-retest reliability of the scale was also high. SAS is a practical tool that can be used to assess sitting balance in children with CP.
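For readers unfamiliar with the internal-consistency statistic named above, the following minimal sketch shows how Cronbach's α is conventionally computed from a participants-by-items score table. The function name `cronbach_alpha` and the toy scores are hypothetical. They do not reproduce the study's data or its software.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table with one row per participant
    and one column per scale item (sample variances are used)."""
    k = len(scores[0])  # number of items
    # Variance of each item across participants
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Variance of each participant's total score
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy data (hypothetical): 3 participants, 2 items.
toy_scores = [[1, 2], [2, 1], [3, 3]]
print(cronbach_alpha(toy_scores))
```

Alpha approaches 1 when the items rise and fall together across participants, and drops when item scores vary independently of one another.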


Subject(s)
Cerebral Palsy , Sitting Position , Humans , Cerebral Palsy/physiopathology , Reproducibility of Results , Female , Male , Child , Disability Evaluation , Adolescent , Observer Variation , Child, Preschool , Severity of Illness Index
...