Search | VHL Regional Portal

1.

Psychometric Properties of Entrustable Professional Activity-Based Objective Structured Clinical Examinations During Transition From Undergraduate to Graduate Medical Education: A Generalizability Study.

Suneja, Manish; Hanrahan, Kate DuChene; Kreiter, Clarence; Rowat, Jane.

Acad Med ; 2024 Mar 26.

Article in English | MEDLINE | ID: mdl-38534117

ABSTRACT

PURPOSE: The objective structured clinical examination (OSCE) assesses clinical competence in health sciences education. There is little research regarding the reliability and validity of using an OSCE during the transition from undergraduate to graduate medical education. The goal of this study was to measure the reliability of a unique 2-rater entrustable professional activity (EPA)-based OSCE format for transition to internship using generalizability theory for estimating reliability. METHOD: During the 2018 to 2022 academic years, 5 cohorts of interns (n = 230) at the University of Iowa Hospital and Clinics participated in a 6-station OSCE assessment delivered during orientation. A univariate and multivariate generalizability study (G study) was conducted on the scores generated by the 3 cases in the orientation OSCE that shared the 2-rater format. This analysis was supplemented with an associated decision study (D study). RESULTS: The univariate G study for the cases that used a simulated patient and a faculty rater demonstrated that this OSCE generated a moderately reliable score with 3 cases. The D study showed that increasing the OCSE to 12 cases yielded a mean score reliable enough (G = 0.76) for making high-stakes normative decisions regarding remediation and readiness to practice. The universe score correlation between 2 types of raters was 0.398. The faculty ratings displayed a larger proportion of universe (true) score variance and yielded a more reliable (G = 0.433) score compared with the standardized patient ratings (G = 0.337). CONCLUSIONS: This study provides insight into the development of an EPA-based OSCE. The univariate G study demonstrated that when using the 2 rater types, this assessment could generate a moderately reliable score with 3 cases. The multivariate G study showed that the 2 types of raters assessed different aspects of clinical skills and faculty raters were more reliable.

2.

Development and evaluation of novel tool to assess communication skills in adult triadic interviews.

Shibli-Rahhal, Amal; Kreiter, Clarence.

Patient Educ Couns ; 104(10): 2412-2417, 2021 10.

Article in English | MEDLINE | ID: mdl-34244034

ABSTRACT

OBJECTIVE: Develop and evaluate the reliability and validity of a novel assessment tool for triadic communication. METHODS: We developed the tool using published opinions of patients and companions regarding effective communication, and used it in a four-station Objective Standardized Clinical Examination (OSCE) with 140 medical students, including one triadic interview station. We conducted multitrait-multimethod (MTMM) and generalizability (G) analyses to assess its performance. RESULTS: MTMM analyses demonstrated the validity of the instrument in assessing two separate communication traits (with patient and companion), as showed by the high covariation of both traits based on patient and companion's ratings (average r = 0.78) compared to the inter-traits covariation within (average r = 0.50) and across raters (average r = 0.45). G analyses showed that the communication assessment of the single triadic station functioned similar to two independent stations, revealing the novel tool's ability to reliably measure medical students' ability to communicate with patient and companion. CONCLUSION: Triadic communication skills with patient and companion can be individually assessed within a single OSCE station using this novel tool. PRACTICE IMPLICATIONS: This tool fills a gap in communication assessments, allowing for reliable evaluation of triadic communication.

Subject(s)

Educational Measurement , Students, Medical , Adult , Clinical Competence , Communication , Humans , Reproducibility of Results

3.

Implementation and assessment of an effective, novel, student-centered, remotely accessible, dermatology education program: A prospective cohort study.

Poggemiller, Andrew M; Kreiter, Clarence; Naridze, Rachelle; Melrose, Ellen; Fenton, Paul; Wanat, Karolyn A; Liu, Vincent.

J Am Acad Dermatol ; 85(6): 1645-1647, 2021 12.

Article in English | MEDLINE | ID: mdl-33422627

Subject(s)

Dermatology , Curriculum , Dermatology/education , Humans , Prospective Studies , Students

4.

Teaching Patient-Related Communication to Surgical Residents in Brief Training Sessions.

Kapadia, Muneera R; White, Anna V; Peters, Lauren; Kreiter, Clarence; Koch, Kelsey E; Rosenbaum, Marcy E.

J Surg Educ ; 77(6): 1496-1502, 2020.

Article in English | MEDLINE | ID: mdl-32534941

ABSTRACT

OBJECTIVE: Effective provider-patient communication has several benefits; however, few surgical residency programs have communication training and surgical residents have limited time for education. We developed a communication curriculum with limited didactics and emphasis on practice. Our objective was to evaluate whether this time-limited intervention led to changes in surgical resident communication skills. DESIGN: A 4-module curriculum was implemented for surgical residents (PGY2-4). Each 30-minute module focused on specific communication micro-skills: empathy, concerns and expectations, chunking information and avoiding jargon, and teach-back. Modules included brief didactics, simulated patient interactions, feedback, and debriefing. Precurriculum, residents completed a 2-station objective structured clinical examination (OSCE) and a survey on communication confidence. Residents evaluated each module and postcurriculum, completed another 2-station OSCE, confidence survey, and overall curriculum evaluation. Using validated rating scales, OSCEs were scored by 2 independent raters. SETTING: Tertiary care, academic center with a 5-year surgical residency program. PARTICIPANTS: All 17 eligible residents completed both OSCEs and surveys, and 14 attended ≥3 modules. RESULTS: Following the curriculum, residents reported increased use of the targeted skills and increased confidence in responding to emotions, information sharing, and bad news telling (p < 0.004). There was no change in history taking. Residents rated the usefulness of each module modestly (2.5-3.1, scale 0-4), however, the likelihood of skill implementation was higher (3.2-3.6). The overall postcurriculum OSCE scores increased (versus precurriculum scores, p < 0.001). Postcurriculum scores increased for empathy, concerns and expectations, and teach-back. Chunking information and avoiding jargon was unchanged. Fifteen residents reported module length as appropriate, and 2 thought they were too short. CONCLUSIONS: The brief modules led to increased self-reported use of communication skills and were effective in improving resident communication in OSCEs. This may be a useful curricular model for both surgical and nonsurgical residency programs with limited availability for curricular time.

Subject(s)

Internship and Residency , Clinical Competence , Communication , Curriculum , Feedback , Humans

5.

Investigating the validity of web-enabled mechanistic case diagramming scores to assess students' integration of foundational and clinical sciences.

Ferguson, Kristi J; Kreiter, Clarence D; Franklin, Ellen; Haugen, Thomas H; Dee, Fred R.

Adv Health Sci Educ Theory Pract ; 25(3): 629-639, 2020 08.

Article in English | MEDLINE | ID: mdl-31720878

ABSTRACT

As medical schools have changed their curricula to address foundational and clinical sciences in a more integrated fashion, teaching methods such as concept mapping have been incorporated in small group learning settings. Methods that can assess students' ability to apply such integrated knowledge are not as developed, however. The purpose of this project was to assess the validity of scores on a focused version of concept maps called mechanistic case diagrams (MCDs), which are hypothesized to enhance existing tools for assessing integrated knowledge that supports clinical reasoning. The data were from the medical school graduating class of 2018 (N = 136 students). In 2014-2015 we implemented a total of 16 case diagrams in case analysis groups within the Mechanisms of Health and Disease (MOHD) strand of the pre-clinical curriculum. These cases were based on topics being taught during the lectures and small group sessions for MOHD. We created an overall score across all 16 cases for each student. We then correlated these scores with performance in the preclinical curriculum [as assessed by overall performance in MOHD integrated foundational basic science courses and overall performance in the Clinical and Professional Skills (CAPS) courses], and standardized licensing exam scores [United States Medical Licensing Exam (USMLE)] Step 1 (following core clerkships) and Step 2 Clinical Knowledge (at the beginning of the fourth year of medical school). MCD scores correlated with students' overall basic science scores (r = .46, p = .0002) and their overall performance in Clinical and Professional Skills courses (r = .49, p < .0001). In addition, they correlated significantly with standardized exam measures, including USMLE Step 1 (r = .33, p ≤ .0001), and USMLE Step 2 CK (r = .39, p < .0001). These results provide preliminary validity evidence that MCDs may be useful in identifying students who have difficulty in integrating foundational and clinical sciences.

Subject(s)

Concept Formation , Curriculum , Internet , Science/education , Systems Integration , Clinical Competence , Diagnosis, Differential , Pilot Projects

6.

Nurse anesthetists' evaluations of anesthesiologists' operating room performance are sensitive to anesthesiologists' years of postgraduate practice.

O'Brien, Mary K; Dexter, Franklin; Kreiter, Clarence D; Slater-Scott, Chad; Hindman, Bradley J.

J Clin Anesth ; 54: 102-110, 2019 May.

Article in English | MEDLINE | ID: mdl-30415149

ABSTRACT

STUDY OBJECTIVE: The first aim of this study was to test whether a 7 item evaluation scale developed by our department's certified registered nurse anesthetists (CRNAs) was psychometrically reliable. The second aim was to test whether anesthesiologists' performance changed with their years of postgraduate experience. DESIGN, SETTING, MEASUREMENTS: Sixty-two University of Iowa CRNAs evaluated 81 anesthesiologists during one weekend. Anesthesiologists' scores were adjusted for CRNA rater leniency. Anesthesiologists' scores were tested for sensitivity to CRNA-anesthesiologist case-specific variables. Scores also were tested against anesthesiologists' years of postgraduate experience. The latter association was tested for sensitivity to case-specific variables, anesthesiologists' clinical supervision scores provided by residents, and anesthesiologist clinical assignment variables. MAIN RESULTS: The 7 items demonstrated a single-factor structure, allowing calculation of mean score over the 7 items. Individual anesthesiologist scores were reliable when scores were provided by at least 10 different CRNAs. Anesthesiologists' scores (mean 3.34 [SD 0.41]) were not affected by the interval since last CRNA-anesthesiologist interaction, number of interactions, or case-specific variables. There was a negative association between leniency-adjusted anesthesiologist scores and years of anesthesiologist postgraduate practice (coefficient -0.20 per decade, tâ¯=â¯-19.39, Pâ¯<â¯0.0001). The association remained robust when accounting for case-specific variables, resident clinical supervision scores, and overall clinical assignment variables. CONCLUSIONS: Anesthesiologists' operating room performance can be evaluated reliably by non-physician anesthesia providers (CRNAs). The evaluation process can be done reliably and validly using an assessment scale consisting of only a few (<10) items and with evaluations by only a few individuals (≥10 CRNA raters). There is no indication evaluations provided by CRNAs were significantly influenced by the interval between interaction and evaluation, number of interactions, or other case-specific variables. From CRNAs' perspectives, on average, as anesthesiologists gain experience, anesthesiologists' behaviors in the operating room change, providing CRNAs with less direct assistance in patient care.

Subject(s)

Anesthesiologists/statistics & numerical data , Clinical Competence/statistics & numerical data , Employee Performance Appraisal/statistics & numerical data , Nurse Anesthetists/psychology , Physician-Nurse Relations , Anesthesiologists/psychology , Employee Performance Appraisal/methods , Humans , Operating Rooms , Psychometrics , Time Factors

7.

Developing an objective assessment of surgical performance from operating room video and surgical imagery.

Taylor, Leah K; Thomas, Geb W; Karam, Matthew D; Kreiter, Clarence D; Anderson, Donald D.

IISE Trans Healthc Syst Eng ; 88(2): 110-116, 2018.

Article in English | MEDLINE | ID: mdl-29963653

ABSTRACT

An unbiased, repeatable process for assessing operating room performance is an important step toward quantifying the relationship between surgical training and performance. Hip fracture surgeries offer a promising first target in orthopedic trauma because they are common and they offer quantitative performance metrics that can be assessed from video recordings and intraoperative fluoroscopic images. Hip fracture repair surgeries were recorded using a head-mounted point-of-view camera. Intraoperative fluoroscopic images were also saved. The following performance metrics were analyzed: duration of wire navigation, number of fluoroscopic images collected, degree of intervention by the surgeon's supervisor, and the tip-apex distance (TAD). Two orthopedic traumatologists graded surgical performance in each video independently using an Objective Structured Assessment of Technical Skill (OSATS). Wire navigation duration correlated with weeks into residency and prior cases logged. TAD correlated with cases logged. There was no significant correlation between the OSATS total score and experience metrics. Total OSATS score correlated with duration and number of fluoroscopic images. Our results indicate that two metrics of hip fracture wire navigation performance, duration and TAD, significantly differentiate surgical experience. The methods presented have the potential to provide truly objective assessment of resident technical performance in the OR.

8.

Generalizability of Competency Assessment Scores Across and Within Clerkships: How Students, Assessors, and Clerkships Matter.

Zaidi, Nikki L Bibler; Kreiter, Clarence D; Castaneda, Peris R; Schiller, Jocelyn H; Yang, Jun; Grum, Cyril M; Hammoud, Maya M; Gruppen, Larry D; Santen, Sally A.

Acad Med ; 93(8): 1212-1217, 2018 08.

Article in English | MEDLINE | ID: mdl-29697428

ABSTRACT

PURPOSE: Many factors influence the reliable assessment of medical students' competencies in the clerkships. The purpose of this study was to determine how many clerkship competency assessment scores were necessary to achieve an acceptable threshold of reliability. METHOD: Clerkship student assessment data were collected during the 2015-2016 academic year as part of the medical school assessment program at the University of Michigan Medical School. Faculty and residents assigned competency assessment scores for third-year core clerkship students. Generalizability (G) and decision (D) studies were conducted using balanced, stratified, and random samples to examine the extent to which overall assessment scores could reliably differentiate between students' competency levels both within and across clerkships. RESULTS: In the across-clerkship model, the residual error accounted for the largest proportion of variance (75%), whereas the variance attributed to the student and student-clerkship effects was much smaller (7% and 10.1%, respectively). D studies indicated that generalizability estimates for eight assessors within a clerkship varied across clerkships (G coefficients range = 0.000-0.795). Within clerkships, the number of assessors needed for optimal reliability varied from 4 to 17. CONCLUSIONS: Minimal reliability was found in competency assessment scores for half of clerkships. The variability in reliability estimates across clerkships may be attributable to differences in scoring processes and assessor training. Other medical schools face similar variation in assessments of clerkship students; therefore, the authors hope this study will serve as a model for other institutions that wish to examine the reliability of their clerkship assessment scores.

Subject(s)

Clinical Clerkship/standards , Clinical Competence/standards , Educational Measurement/standards , Clinical Clerkship/statistics & numerical data , Clinical Competence/statistics & numerical data , Educational Measurement/methods , Educational Measurement/statistics & numerical data , Educational Status , Humans , Reproducibility of Results , Students, Medical/statistics & numerical data

9.

Web-Enabled Mechanistic Case Diagramming: A Novel Tool for Assessing Students' Ability to Integrate Foundational and Clinical Sciences.

Ferguson, Kristi J; Kreiter, Clarence D; Haugen, Thomas H; Dee, Fred R.

Acad Med ; 93(8): 1146-1149, 2018 08.

Article in English | MEDLINE | ID: mdl-29465452

ABSTRACT

PROBLEM: As medical schools move from discipline-based courses to more integrated approaches, identifying assessment tools that parallel this change is an important goal. APPROACH: The authors describe the use of test item statistics to assess the reliability and validity of web-enabled mechanistic case diagrams (MCDs) as a potential tool to assess students' ability to integrate basic science and clinical information. Students review a narrative clinical case and construct an MCD using items provided by the case author. Students identify the relationships among underlying risk factors, etiology, pathogenesis and pathophysiology, and the patients' signs and symptoms. They receive one point for each correctly identified link. OUTCOMES: In 2014-2015 and 2015-2016, case diagrams were implemented in consecutive classes of 150 medical students. The alpha reliability coefficient for the overall score, constructed using each student's mean proportion correct across all cases, was 0.82. Discrimination indices for each of the case scores with the overall score ranged from 0.23 to 0.51. In a G study using those students with complete data (n = 251) on all 16 cases, 10% of the variance was true score variance, and systematic case variance was large. Using 16 cases generated a G coefficient (relative score reliability) equal to 0.72 and a Phi equal to 0.65. NEXT STEPS: The next phase of the project will involve deploying MCDs in higher-stakes settings to determine whether similar results can be achieved. Further analyses will determine whether these assessments correlate with other measures of higher-order thinking skills.

Subject(s)

Educational Measurement/standards , Students, Medical/psychology , Thinking , Clinical Competence/standards , Educational Measurement/methods , Humans , Reproducibility of Results

10.

Investigating a self-scoring interview simulation for learning and assessment in the medical consultation.

Bruen, Catherine; Kreiter, Clarence; Wade, Vincent; Pawlikowska, Teresa.

Adv Med Educ Pract ; 8: 353-358, 2017.

Article in English | MEDLINE | ID: mdl-28603434

ABSTRACT

Experience with simulated patients supports undergraduate learning of medical consultation skills. Adaptive simulations are being introduced into this environment. The authors investigate whether it can underpin valid and reliable assessment by conducting a generalizability analysis using IT data analytics from the interaction of medical students (in psychiatry) with adaptive simulations to explore the feasibility of adaptive simulations for supporting automated learning and assessment. The generalizability (G) study was focused on two clinically relevant variables: clinical decision points and communication skills. While the G study on the communication skills score yielded low levels of true score variance, the results produced by the decision points, indicating clinical decision-making and confirming user knowledge of the process of the Calgary-Cambridge model of consultation, produced reliability levels similar to what might be expected with rater-based scoring. The findings indicate that adaptive simulations have potential as a teaching and assessment tool for medical consultations.

11.

A research agenda for establishing the validity of non-academic assessments of medical school applicants.

Kreiter, Clarence Dennis.

Adv Health Sci Educ Theory Pract ; 22(2): 559-563, 2017 05.

Article in English | MEDLINE | ID: mdl-28341923

Subject(s)

Education, Medical/organization & administration , Personality Tests/standards , Reproducibility of Results , School Admission Criteria , Students, Medical/psychology , Adult , Female , Humans , Male , Young Adult

12.

Precepting Medical Students in the Patient's Presence: An Educational Randomized Trial in Family Medicine Clinic.

Power, David V; Rosenbaum, Marcy E; Hanson, Lindsey; Reynolds, Ian R; Brink, Darin; Prasad, Shailendra; Kreiter, Clarence D.

Fam Med ; 49(2): 97-105, 2017 Feb.

Article in English | MEDLINE | ID: mdl-28218934

ABSTRACT

BACKGROUND AND OBJECTIVES: Many medical student-patient encounters occur in the outpatient setting. Conference room staffing (CRS) of student presentations has been the norm in the United States in recent decades. However, this method may not be suitable for outpatient precepting, being inefficient and reducing valuable direct face time between physician and patient. Precepting in the Presence of the Patient (PIPP) has previously been found to be an effective educational model in the outpatient setting but has never been studied in family medicine clinics, nor with non-English speaking patients, nor patients from lower socioeconomic backgrounds with low literacy. METHODS: We used a randomized controlled trial of educational models comparing time spent using PIPP with CRS in two family medicine clinics. Patient, student, and physician satisfaction were also measured using a 5-point Likert scale; total encounter time and time spent precepting were also recorded. RESULTS: PIPP is strongly preferred by attending physicians while patients and students were equally satisfied with either precepting method. PIPP provides an additional 3 minutes of physician-patient face time (17.39 versus 14.08 minutes) in an encounter that is overall shortened by 2 minutes (17.39 versus 19.71 minutes). CONCLUSIONS: PIPP is an effective method for precepting medical students in family medicine clinics, even with non-English speaking patients and those with low literacy. Given the time constraints of family physicians, PIPP should be considered as a preferred, time-efficient method for training medical students that is well received by patients, students, and particularly by physicians.

Subject(s)

Family Practice/education , Preceptorship/methods , Students, Medical/psychology , Adult , Ambulatory Care , Female , Humans , Male , Middle Aged , Patient Satisfaction , Physician-Patient Relations , Physicians, Family/psychology , Time Factors , United States

13.

A Bayesian perspective on constructing a written assessment of probabilistic clinical reasoning in experienced clinicians.

Kreiter, Clarence D.

J Eval Clin Pract ; 23(1): 44-48, 2017 Feb.

Article in English | MEDLINE | ID: mdl-26486941

ABSTRACT

RATIONALE: Decision-making performance assessments have proven problematic for assessing clinical reasoning. AIMS AND OBJECTIVES: A Bayesian approach to designing an advanced clinical reasoning assessment is well grounded in mathematical and cognitive theory and may offer significant psychometric advantages. Probabilistic logic plays an important role in medical problem solving, and performances on Bayesian-type tasks appear to be causally-related to the ability to make sound clinical decisions. METHODS: A validity argument is used to guide the design of an assessment of medical reasoning using clinical probabilities. RESULTS/CONCLUSIONS: The practical advantage of using a Bayesian approach to item design relates to the fact that probability theory provides a rationally optimal method for managing uncertain information and provides the criteria for objective correct answer scoring. Potential item formats are discussed.

Subject(s)

Bayes Theorem , Clinical Competence/standards , Clinical Decision-Making/methods , Problem Solving , Humans , Logic , Psychometrics , Thinking , Uncertainty

14.

The Development and Evaluation of a Novel Instrument Assessing Residents' Discharge Summaries.

Hommos, Musab S; Kuperman, Ethan F; Kamath, Aparna; Kreiter, Clarence D.

Acad Med ; 92(4): 550-555, 2017 04.

Article in English | MEDLINE | ID: mdl-27805951

ABSTRACT

PURPOSE: To develop and determine the reliability of a novel measurement instrument assessing the quality of residents' discharge summaries. METHOD: In 2014, the authors created a discharge summary evaluation instrument based on consensus recommendations from national regulatory bodies and input from primary care providers at their institution. After a brief pilot, they used the instrument to evaluate discharge summaries written by first-year internal medicine residents (n = 24) at a single U.S. teaching hospital during the 2013-2014 academic year. They conducted a generalizability study to determine the reliability of the instrument and a series of decision studies to determine the number of discharge summaries and raters needed to achieve a reliable evaluation score. RESULTS: The generalizability study demonstrated that 37% of the variance reflected residents' ability to generate an adequate discharge summary (true score variance). The decision studies estimated that the mean score from six discharge summary reviews completed by a unique rater for each review would yield a reliability coefficient of 0.75. Because of high interrater reliability, multiple raters per discharge summary would not significantly enhance the reliability of the mean rating. CONCLUSIONS: This evaluation instrument reliably measured residents' performance writing discharge summaries. A single rating of six discharge summaries can achieve a reliable mean evaluation score. Using this instrument is feasible even for programs with a limited number of inpatient encounters and a small pool of faculty preceptors.

Subject(s)

Clinical Competence , Internal Medicine/education , Internship and Residency , Patient Discharge Summaries/standards , Educational Measurement/methods , Hospitals, Teaching , Humans , Pilot Projects , Reproducibility of Results , Retrospective Studies , United States

15.

Skill Assessment in the Interpretation of 3D Fracture Patterns from Radiographs.

Thomas, Geb W; Rojas-Murillo, Salvador; Hanley, Jessica M; Kreiter, Clarence D; Karam, Matthew D; Anderson, Donald D.

Iowa Orthop J ; 36: 1-6, 2016.

Article in English | MEDLINE | ID: mdl-27528827

ABSTRACT

BACKGROUND: Interpreting two-dimensional radiographs to ascertain the three-dimensional (3D) position and orientation of fracture planes and bone fragments is an important component of orthopedic diagnosis and clinical management. This skill, however, has not been thoroughly explored and measured. Our primary research question is to determine if 3D radiographic image interpretation can be reliably assessed, and whether this assessment varies by level of training. A test designed to measure this skill among orthopedic surgeons would provide a quantitative benchmark for skill assessment and training research. METHODS: Two tests consisting of a series of online exercises were developed to measure this skill. Each exercise displayed a pair of musculoskeletal radiographs. Participants selected one of three CT slices of the same or similar fracture patterns that best matched the radiographs. In experiment 1, 10 orthopedic residents and staff responded to nine questions. In experiment 2, 52 residents from both orthopedics and radiology responded to 12 questions. RESULTS: Experiment 1 yielded a Cronbach alpha of 0.47. Performance correlated with experience; r(8) = 0.87, p<0.01, suggesting that the test could be both valid and reliable with a slight increase in test length. In experiment 2, after removing three non-discriminating items, the Cronbach coefficient alpha was 0.28 and performance correlated with experience; r(50) = 0.25, p<0.10. CONCLUSIONS: Although evidence for reliability and validity was more compelling with the first experiment, the analyses suggest motivation and test duration are important determinants of test efficacy. The interpretation of radiographs to discern 3D information is a promising and a relatively unexplored area for surgical skill education and assessment. The online test was useful and reliable. Further test development is likely to increase test effectiveness. CLINICAL RELEVANCE: Accurately interpreting radiographic images is an essential clinical skill. Quantitative, repeatable techniques to measure this skill can improve resident training and improve patient safety.

Subject(s)

Clinical Competence , Fractures, Bone/diagnostic imaging , Orthopedics/education , Tomography, X-Ray Computed , Educational Measurement , Humans , Reproducibility of Results

16.

Assessing Wire Navigation Performance in the Operating Room.

Taylor, Leah K; Thomas, Geb W; Karam, Matthew D; Kreiter, Clarence D; Anderson, Donald D.

J Surg Educ ; 73(5): 780-7, 2016.

Article in English | MEDLINE | ID: mdl-27184177

ABSTRACT

OBJECTIVE: There are no widely accepted, objective, and reliable tools for measuring surgical skill in the operating room (OR). Ubiquitous video and imaging technology provide opportunities to develop metrics that meet this need. Hip fracture surgery is a promising area in which to develop these measures because hip fractures are common, the surgery is used as a milestone for residents, and it demands technical skill. The study objective is to develop meaningful, objective measures of wire navigation performance in the OR. DESIGN: Resident surgeons wore a head-mounted video camera while performing surgical open reduction and internal fixation using a dynamic hip screw. Data collected from video included: duration of wire navigation, number of fluoroscopic images, and the degree of intervention by the surgeon×³s supervisor. To determine reliability of these measurements, 4 independent raters performed them for 2 cases. Raters independently measured the tip-apex distance (TAD), which reflects the accuracy of the surgical placement of the wire, on all the 7 cases. SETTING: University of Iowa Hospitals and Clinics in Iowa City, IA-a public tertiary academic center. PARTICIPANTS: In total 7 surgeries were performed by 7 different orthopedic residents. All 10 raters were biomedical engineering graduate students. RESULTS: The standard deviations for anteroposterior, lateral, and combined TAD measurements of the 10 raters were 2.7, 1.9, and 3.7mm, respectively, and interrater reliability produced a Cronbach α of 0.97. The interrater reliability analysis for all 9 video-based measures produced a Cronbach α of 0.99. CONCLUSIONS: Several video-based metrics were consistent across the 4 video reviewers and are likely to be useful for performance assessment. The TAD measurement was less reliable than previous reports have suggested, but remains a valuable metric of performance. Nonexperts can reliably measure these values and they offer an objective assessment of OR performance.

Subject(s)

Bone Wires , Clinical Competence , Fracture Fixation, Internal/methods , Hip Fractures/surgery , Operating Rooms , Orthopedic Procedures/education , Orthopedic Procedures/instrumentation , Aged, 80 and over , Education, Medical, Graduate , Female , Fluoroscopy , Humans , Internship and Residency , Iowa , Male , Reproducibility of Results , Treatment Outcome , Video Recording

17.

An Investigation of the Generalizability of Medical School Grades.

Kreiter, Clarence D; Ferguson, Kristi J.

Teach Learn Med ; 28(3): 279-85, 2016.

Article in English | MEDLINE | ID: mdl-27092723

ABSTRACT

UNLABELLED: Construct/Background: Medical school grades are currently unstandardized, and their level of reliability is unknown. This means their usefulness for reporting on student achievement is also not well documented. This study investigates grade reliability within 1 medical school. APPROACH: Generalizability analyses are conducted on grades awarded. Grades from didactic and clerkship-based courses were treated as 2 levels of a fixed facet within a univariate mixed model. Grades from within the 2 levels (didactic and clerkship) were also entered in a multivariate generalizability study. RESULTS: Grades from didactic courses were shown to produce a highly reliable mean score (G = .79) when averaged over as few as 5 courses. Although the universe score correlation between didactic and clerkship courses was high (r = .80), the clerkship courses required almost twice as many grades to reach a comparable level of reliability. When grades were converted to a Pass/Fail metric, almost all information contained in the grades was lost. CONCLUSIONS: Although it has been suggested that the imprecision of medical school grades precludes their use as a reliable indicator of student achievement, these results suggest otherwise. While it is true that a Pass/Fail system of grading provides very little information about a student's level of performance, a multi-tiered grading system was shown to be a highly reliable indicator of student achievement within the medical school. Although grades awarded during the first 2 didactic years appear to be more reliable than clerkship grades, both yield useful information about student performance within the medical college.

Subject(s)

Education, Medical/standards , Educational Measurement/standards , Achievement , Humans , Iowa , Models, Statistical , Reproducibility of Results

18.

Examining rater and occasion influences in observational assessments obtained from within the clinical environment.

Kreiter, Clarence D; Wilson, Adam B; Humbert, Aloysius J; Wade, Patricia A.

Med Educ Online ; 21: 29279, 2016.

Article in English | MEDLINE | ID: mdl-26925540

ABSTRACT

BACKGROUND: When ratings of student performance within the clerkship consist of a variable number of ratings per clinical teacher (rater), an important measurement question arises regarding how to combine such ratings to accurately summarize performance. As previous G studies have not estimated the independent influence of occasion and rater facets in observational ratings within the clinic, this study was designed to provide estimates of these two sources of error. METHOD: During 2 years of an emergency medicine clerkship at a large midwestern university, 592 students were evaluated an average of 15.9 times. Ratings were performed at the end of clinical shifts, and students often received multiple ratings from the same rater. A completely nested G study model (occasion: rater: person) was used to analyze sampled rating data. RESULTS: The variance component (VC) related to occasion was small relative to the VC associated with rater. The D study clearly demonstrates that having a preceptor rate a student on multiple occasions does not substantially enhance the reliability of a clerkship performance summary score. CONCLUSIONS: Although further research is needed, it is clear that case-specific factors do not explain the low correlation between ratings and that having one or two raters repeatedly rate a student on different occasions/cases is unlikely to yield a reliable mean score. This research suggests that it may be more efficient to have a preceptor rate a student just once. However, when multiple ratings from a single preceptor are available for a student, it is recommended that a mean of the preceptor's ratings be used to calculate the student's overall mean performance score.

Subject(s)

Clinical Clerkship/standards , Educational Measurement/methods , Educational Measurement/standards , Clinical Competence , Emergency Medicine/education , Humans , Observer Variation , Reproducibility of Results

19.

A research agenda for establishing the validity of non-academic assessments of medical school applicants.

Kreiter, Clarence Dennis.

Adv Health Sci Educ Theory Pract ; 21(5): 1081-1085, 2016 Dec.

Article in English | MEDLINE | ID: mdl-26902234

Subject(s)

School Admission Criteria/trends , Schools, Medical , Students, Medical/classification , Students, Medical/psychology , Decision Making , Educational Measurement , Female , Humans , Male , Psychological Tests

20.

Constructing a more comprehensive validity argument for medical school admission testing: predicting long-term outcomes.

Kreiter, Clarence D; Otaki, Junji.

Teach Learn Med ; 27(2): 197-200, 2015.

Article in English | MEDLINE | ID: mdl-25893942

ABSTRACT

ISSUE: The research published outside of medical education journals provides an important source of validity evidence for using cognitive ability testing in medical school admissions. EVIDENCE: The cumulative body of validity research, consisting of thousands of studies and scores of meta-analyses, has conclusively demonstrated that a strong positive relationship exists between job performance and general mental ability. IMPLICATIONS: Recommendations for reducing the emphasis on or eliminating the role of general mental ability in the selection process for medical schools are not based on a consideration of the wider research evidence. Admission interventions that substantially reduce the level of academic aptitude are also likely to result in reduced professional performance.

Subject(s)

College Admission Test , Predictive Value of Tests , School Admission Criteria , Schools, Medical , Clinical Competence , Forecasting , Humans , Learning , United States

ABSTRACT

ABSTRACT

Subject(s)

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

Subject(s)

ABSTRACT

Subject(s)

SEND TO:

SELECTION OF CITATIONS

SEARCH DETAIL