Pesquisa | Portal Regional da BVS

1.

A realist evaluation of how, why and when objective structured clinical exams (OSCEs) are experienced as an authentic assessment of clinical preparedness.

Yeates, Peter; Maluf, Adriano; Kinston, Ruth; Cope, Natalie; Cullen, Kathy; Cole, Aidan; O'Neill, Vikki; Chung, Ching-Wa; Goodfellow, Rhian; Vallender, Rebecca; Ensaff, Sue; Goddard-Fuller, Rikki; McKinley, Robert; Wong, Geoff.

Med Teach ; : 1-9, 2024 Apr 18.

Artigo em Inglês | MEDLINE | ID: mdl-38635469

RESUMO

INTRODUCTION: Whilst rarely researched, the authenticity with which Objective Structured Clinical Exams (OSCEs) simulate practice is arguably critical to making valid judgements about candidates' preparedness to progress in their training. We studied how and why an OSCE gave rise to different experiences of authenticity for different participants under different circumstances. METHODS: We used Realist evaluation, collecting data through interviews/focus groups from participants across four UK medical schools who participated in an OSCE which aimed to enhance authenticity. RESULTS: Several features of OSCE stations (realistic, complex, complete cases, sufficient time, autonomy, props, guidelines, limited examiner interaction etc) combined to enable students to project into their future roles, judge and integrate information, consider their actions and act naturally. When this occurred, their performances felt like an authentic representation of their clinical practice. This didn't work all the time: focusing on unavoidable differences with practice, incongruous features, anxiety and preoccupation with examiners' expectations sometimes disrupted immersion, producing inauthenticity. CONCLUSIONS: The perception of authenticity in OSCEs appears to originate from an interaction of station design with individual preferences and contextual expectations. Whilst tentatively suggesting ways to promote authenticity, more understanding is needed of candidates' interaction with simulation and scenario immersion in summative assessment.

2.

Using video-based examiner score comparison and adjustment (VESCA) to compare the influence of examiners at different sites in a distributed objective structured clinical exam (OSCE).

Yeates, Peter; Maluf, Adriano; Cope, Natalie; McCray, Gareth; McBain, Stuart; Beardow, Dominic; Fuller, Richard; McKinley, Robert Bob.

BMC Med Educ ; 23(1): 803, 2023 Oct 26.

Artigo em Inglês | MEDLINE | ID: mdl-37885005

RESUMO

PURPOSE: Ensuring equivalence of examiners' judgements within distributed objective structured clinical exams (OSCEs) is key to both fairness and validity but is hampered by lack of cross-over in the performances which different groups of examiners observe. This study develops a novel method called Video-based Examiner Score Comparison and Adjustment (VESCA) using it to compare examiners scoring from different OSCE sites for the first time. MATERIALS/ METHODS: Within a summative 16 station OSCE, volunteer students were videoed on each station and all examiners invited to score station-specific comparator videos in addition to usual student scoring. Linkage provided through the video-scores enabled use of Many Facet Rasch Modelling (MFRM) to compare 1/ examiner-cohort and 2/ site effects on students' scores. RESULTS: Examiner-cohorts varied by 6.9% in the overall score allocated to students of the same ability. Whilst only a tiny difference was apparent between sites, examiner-cohort variability was greater in one site than the other. Adjusting student scores produced a median change in rank position of 6 places (0.48 deciles), however 26.9% of students changed their rank position by at least 1 decile. By contrast, only 1 student's pass/fail classification was altered by score adjustment. CONCLUSIONS: Whilst comparatively limited examiner participation rates may limit interpretation of score adjustment in this instance, this study demonstrates the feasibility of using VESCA for quality assurance purposes in large scale distributed OSCEs.

Assuntos

Avaliação Educacional , Estudantes de Medicina , Humanos , Avaliação Educacional/métodos , Competência Clínica

3.

Technology enhanced assessment: Ottawa consensus statement and recommendations.

Fuller, Richard; Goddard, Viktoria C T; Nadarajah, Vishna D; Treasure-Jones, Tamsin; Yeates, Peter; Scott, Karen; Webb, Alexandra; Valter, Krisztina; Pyorala, Eeva.

Med Teach ; 44(8): 836-850, 2022 08.

Artigo em Inglês | MEDLINE | ID: mdl-35771684

RESUMO

INTRODUCTION: In 2011, a consensus report was produced on technology-enhanced assessment (TEA), its good practices, and future perspectives. Since then, technological advances have enabled innovative practices and tools that have revolutionised how learners are assessed. In this updated consensus, we bring together the potential of technology and the ultimate goals of assessment on learner attainment, faculty development, and improved healthcare practices. METHODS: As a material for the report, we used the scholarly publications on TEA in both HPE and general higher education, feedback from 2020 Ottawa Conference workshops, and scholarly publications on assessment technology practices during the Covid-19 pandemic. RESULTS AND CONCLUSION: The group identified areas of consensus that remained to be resolved and issues that arose in the evolution of TEA. We adopted a three-stage approach (readiness to adopt technology, application of assessment technology, and evaluation/dissemination). The application stage adopted an assessment 'lifecycle' approach and targeted five key foci: (1) Advancing authenticity of assessment, (2) Engaging learners with assessment, (3) Enhancing design and scheduling, (4) Optimising assessment delivery and recording learner achievement, and (5) Tracking learner progress and faculty activity and thereby supporting longitudinal learning and continuous assessment.

Assuntos

COVID-19 , Pandemias , Currículo , Humanos , Aprendizagem , Tecnologia

4.

In Reply to Anto et al.

Yeates, Peter; Fuller, Richard; McKinley, Robert K Bob.

Acad Med ; 97(4): 475-476, 2022 04 01.

Artigo em Inglês | MEDLINE | ID: mdl-35353728

5.

Memory, credibility and insight: How video-based feedback promotes deeper reflection and learning in objective structured clinical exams.

Makrides, Alexandra; Yeates, Peter.

Med Teach ; 44(6): 664-671, 2022 06.

Artigo em Inglês | MEDLINE | ID: mdl-35000530

RESUMO

INTRODUCTION: Providing high-quality feedback from Objective Structured Clinical Exams (OSCEs) is important but challenging. Whilst prior research suggests that video-based feedback (VbF), where students review their own performances alongside usual examiner feedback, may usefully enhance verbal or written feedback, little is known about how students experience or interact with VbF or what mechanisms may underly any such benefits. METHODS: We used social constructive grounded theory to explore students' interaction with VbF. Within semi-structured interviews, students reviewed their verbal feedback from examiners before watching a video of the same performance, reflecting with the interviewer before and after the video. Transcribed interviews were analysed using grounded theory analysis methods. RESULT: Videos greatly enhanced students' memories of their performance, which increased their receptivity to and the credibility of examiners' feedback. Reflecting on video performances produced novel insights for students beyond the points described by examiners. Students triangulated these novel insights with their own self-assessment and experiences from practice to reflect deeply on their performance which led to the generation of additional, often patient-orientated, learning objectives. CONCLUSIONS: The array of beneficial mechanisms evoked by VbF suggests it may be a powerful means to richly support students' learning in both formative and summative contexts.

Assuntos

Educação de Graduação em Medicina , Estudantes de Medicina , Competência Clínica , Educação de Graduação em Medicina/métodos , Avaliação Educacional/métodos , Retroalimentação , Humanos

6.

Determining the influence of different linking patterns on the stability of students' score adjustments produced using Video-based Examiner Score Comparison and Adjustment (VESCA).

Yeates, Peter; McCray, Gareth; Moult, Alice; Cope, Natalie; Fuller, Richard; McKinley, Robert.

BMC Med Educ ; 22(1): 41, 2022 Jan 17.

Artigo em Inglês | MEDLINE | ID: mdl-35039023

RESUMO

BACKGROUND: Ensuring equivalence of examiners' judgements across different groups of examiners is a priority for large scale performance assessments in clinical education, both to enhance fairness and reassure the public. This study extends insight into an innovation called Video-based Examiner Score Comparison and Adjustment (VESCA) which uses video scoring to link otherwise unlinked groups of examiners. This linkage enables comparison of the influence of different examiner-groups within a common frame of reference and provision of adjusted "fair" scores to students. Whilst this innovation promises substantial benefit to quality assurance of distributed Objective Structured Clinical Exams (OSCEs), questions remain about how the resulting score adjustments might be influenced by the specific parameters used to operationalise VESCA. Research questions, How similar are estimates of students' score adjustments when the model is run with either: fewer comparison videos per participating examiner?; reduced numbers of participating examiners? METHODS: Using secondary analysis of recent research which used VESCA to compare scoring tendencies of different examiner groups, we made numerous copies of the original data then selectively deleted video scores to reduce the number of 1/ linking videos per examiner (4 versus several permutations of 3,2,or 1 videos) or 2/examiner participation rates (all participating examiners (76%) versus several permutations of 70%, 60% or 50% participation). After analysing all resulting datasets with Many Facet Rasch Modelling (MFRM) we calculated students' score adjustments for each dataset and compared these with score adjustments in the original data using Spearman's correlations. RESULTS: Students' score adjustments derived form 3 videos per examiner correlated highly with score adjustments derived from 4 linking videos (median Rho = 0.93,IQR0.90-0.95,p < 0.001), with 2 (median Rho 0.85,IQR0.81-0.87,p < 0.001) and 1 linking videos (median Rho = 0.52(IQR0.46-0.64,p < 0.001) producing progressively smaller correlations. Score adjustments were similar for 76% participating examiners and 70% (median Rho = 0.97,IQR0.95-0.98,p < 0.001), and 60% (median Rho = 0.95,IQR0.94-0.98,p < 0.001) participation, but were lower and more variable for 50% examiner participation (median Rho = 0.78,IQR0.65-0.83, some ns). CONCLUSIONS: Whilst VESCA showed some sensitivity to the examined parameters, modest reductions in examiner participation rates or video numbers produced highly similar results. Employing VESCA in distributed or national exams could enhance quality assurance or exam fairness.

Assuntos

Avaliação Educacional , Estudantes de Medicina , Competência Clínica , Humanos , Julgamento

7.

Determining influence, interaction and causality of contrast and sequence effects in objective structured clinical exams.

Yeates, Peter; Moult, Alice; Cope, Natalie; McCray, Gareth; Fuller, Richard; McKinley, Robert.

Med Educ ; 56(3): 292-302, 2022 Mar.

Artigo em Inglês | MEDLINE | ID: mdl-34893998

RESUMO

INTRODUCTION: Differential rater function over time (DRIFT) and contrast effects (examiners' scores biased away from the standard of preceding performances) both challenge the fairness of scoring in objective structured clinical exams (OSCEs). This is important as, under some circumstances, these effects could alter whether some candidates pass or fail assessments. Benefitting from experimental control, this study investigated the causality, operation and interaction of both effects simultaneously for the first time in an OSCE setting. METHODS: We used secondary analysis of data from an OSCE in which examiners scored embedded videos of student performances interspersed between live students. Embedded video position varied between examiners (early vs. late) whilst the standard of preceding performances naturally varied (previous high or low). We examined linear relationships suggestive of DRIFT and contrast effects in all within-OSCE data before comparing the influence and interaction of 'early' versus 'late' and 'previous high' versus 'previous low' conditions on embedded video scores. RESULTS: Linear relationships data did not support the presence of DRIFT or contrast effects. Embedded videos were scored higher early (19.9 [19.4-20.5]) versus late (18.6 [18.1-19.1], p < 0.001), but scores did not differ between previous high and previous low conditions. The interaction term was non-significant. CONCLUSIONS: In this instance, the small DRIFT effect we observed on embedded videos can be causally attributed to examiner behaviour. Contrast effects appear less ubiquitous than some prior research suggests. Possible mediators of these finding include the following: OSCE context, detail of task specification, examiners' cognitive load and the distribution of learners' ability. As the operation of these effects appears to vary across contexts, further research is needed to determine the prevalence and mechanisms of contrast and DRIFT effects, so that assessments may be designed in ways that are likely to avoid their occurrence. Quality assurance should monitor for these contextually variable effects in order to ensure OSCE equivalence.

Assuntos

Competência Clínica , Avaliação Educacional , Humanos

8.

Enhancing authenticity, diagnosticity and equivalence (AD-Equiv) in multicentre OSCE exams in health professionals education: protocol for a complex intervention study.

Yeates, Peter; Maluf, Adriano; Kinston, Ruth; Cope, Natalie; McCray, Gareth; Cullen, Kathy; O'Neill, Vikki; Cole, Aidan; Goodfellow, Rhian; Vallender, Rebecca; Chung, Ching-Wa; McKinley, Robert K; Fuller, Richard; Wong, Geoff.

BMJ Open ; 12(12): e064387, 2022 12 07.

Artigo em Inglês | MEDLINE | ID: mdl-36600366

RESUMO

INTRODUCTION: Objective structured clinical exams (OSCEs) are a cornerstone of assessing the competence of trainee healthcare professionals, but have been criticised for (1) lacking authenticity, (2) variability in examiners' judgements which can challenge assessment equivalence and (3) for limited diagnosticity of trainees' focal strengths and weaknesses. In response, this study aims to investigate whether (1) sharing integrated-task OSCE stations across institutions can increase perceived authenticity, while (2) enhancing assessment equivalence by enabling comparison of the standard of examiners' judgements between institutions using a novel methodology (video-based score comparison and adjustment (VESCA)) and (3) exploring the potential to develop more diagnostic signals from data on students' performances. METHODS AND ANALYSIS: The study will use a complex intervention design, developing, implementing and sharing an integrated-task (research) OSCE across four UK medical schools. It will use VESCA to compare examiner scoring differences between groups of examiners and different sites, while studying how, why and for whom the shared OSCE and VESCA operate across participating schools. Quantitative analysis will use Many Facet Rasch Modelling to compare the influence of different examiners groups and sites on students' scores, while the operation of the two interventions (shared integrated task OSCEs; VESCA) will be studied through the theory-driven method of Realist evaluation. Further exploratory analyses will examine diagnostic performance signals within data. ETHICS AND DISSEMINATION: The study will be extra to usual course requirements and all participation will be voluntary. We will uphold principles of informed consent, the right to withdraw, confidentiality with pseudonymity and strict data security. The study has received ethical approval from Keele University Research Ethics Committee. Findings will be academically published and will contribute to good practice guidance on (1) the use of VESCA and (2) sharing and use of integrated-task OSCE stations.

Assuntos

Educação de Graduação em Medicina , Estudantes de Medicina , Humanos , Avaliação Educacional/métodos , Educação de Graduação em Medicina/métodos , Competência Clínica , Faculdades de Medicina , Estudos Multicêntricos como Assunto

9.

Understanding patient involvement in judging students' communication skills in OSCEs.

Moult, Alice; McKinley, Robert K; Yeates, Peter.

Med Teach ; 43(9): 1070-1078, 2021 09.

Artigo em Inglês | MEDLINE | ID: mdl-34496725

RESUMO

INTRODUCTION: Communication skills are assessed by medically-enculturated examiners using consensus frameworks which were developed with limited patient involvement. Assessments consequently risk rewarding performance which incompletely serves patients' authentic communication needs. Whilst regulators require patient involvement in assessment, little is known about how this can be achieved. We aimed to explore patients' perceptions of students' communication skills, examiner feedback and potential roles for patients in assessment. METHODS: Using constructivist grounded theory we performed cognitive stimulated, semi-structured interviews with patients who watched videos of student performances in communication-focused OSCE stations and read corresponding examiner feedback. Data were analysed using grounded theory methods. RESULTS: A disconnect occurred between participants' and examiners' views of students' communication skills. Whilst patients frequently commented on students' use of medical terminology, examiners omitted to mention this in feedback. Patients' judgements of students' performances varied widely, reflecting different preferences and beliefs. Participants viewed variability as an opportunity for students to learn from diverse lived experiences. Participants perceived a variety of roles to enhance assessment authenticity. DISCUSSION: Integrating patients into communications skills assessments could help to highlight deficiencies in students' communication which medically-enculturated examiners may miss. Overcoming the challenges inherent to this is likely to enhance graduates' preparedness for practice.

Assuntos

Participação do Paciente , Estudantes de Medicina , Competência Clínica , Comunicação , Avaliação Educacional , Humanos

10.

Measuring the Effect of Examiner Variability in a Multiple-Circuit Objective Structured Clinical Examination (OSCE).

Yeates, Peter; Moult, Alice; Cope, Natalie; McCray, Gareth; Xilas, Eleftheria; Lovelock, Tom; Vaughan, Nicholas; Daw, Dan; Fuller, Richard; McKinley, Robert K Bob.

Acad Med ; 96(8): 1189-1196, 2021 08 01.

Artigo em Inglês | MEDLINE | ID: mdl-33656012

RESUMO

PURPOSE: Ensuring that examiners in different parallel circuits of objective structured clinical examinations (OSCEs) judge to the same standard is critical to the chain of validity. Recent work suggests examiner-cohort (i.e., the particular group of examiners) could significantly alter outcomes for some candidates. Despite this, examiner-cohort effects are rarely examined since fully nested data (i.e., no crossover between the students judged by different examiner groups) limit comparisons. In this study, the authors aim to replicate and further develop a novel method called Video-based Examiner Score Comparison and Adjustment (VESCA), so it can be used to enhance quality assurance of distributed or national OSCEs. METHOD: In 2019, 6 volunteer students were filmed on 12 stations in a summative OSCE. In addition to examining live student performances, examiners from 8 separate examiner-cohorts scored the pool of video performances. Examiners scored videos specific to their station. Video scores linked otherwise fully nested data, enabling comparisons by Many Facet Rasch Modeling. Authors compared and adjusted for examiner-cohort effects. They also compared examiners' scores when videos were embedded (interspersed between live students during the OSCE) or judged later via the Internet. RESULTS: Having accounted for differences in students' ability, different examiner-cohort scores for the same ability of student ranged from 18.57 of 27 (68.8%) to 20.49 (75.9%), Cohen's d = 1.3. Score adjustment changed the pass/fail classification for up to 16% of students depending on the modeled cut score. Internet and embedded video scoring showed no difference in mean scores or variability. Examiners' accuracy did not deteriorate over the 3-week Internet scoring period. CONCLUSIONS: Examiner-cohorts produced a replicable, significant influence on OSCE scores that was unaccounted for by typical assessment psychometrics. VESCA offers a promising means to enhance validity and fairness in distributed OSCEs or national exams. Internet-based scoring may enhance VESCA's feasibility.

Assuntos

Competência Clínica , Avaliação Educacional , Avaliação Educacional/métodos , Humanos , Exame Físico , Psicometria

11.

Understanding and developing procedures for video-based assessment in medical education.

Yeates, Peter; Moult, Alice; Lefroy, Janet; Walsh-House, Jacqualyn; Clews, Lorraine; McKinley, Robert; Fuller, Richard.

Med Teach ; 42(11): 1250-1260, 2020 11.

Artigo em Inglês | MEDLINE | ID: mdl-32749915

RESUMO

INTRODUCTION: Novel uses of video aim to enhance assessment in health-professionals education. Whilst these uses presume equivalence between video and live scoring, some research suggests that poorly understood variations could challenge validity. We aimed to understand examiners' and students' interaction with video whilst developing procedures to promote its optimal use. METHODS: Using design-based research we developed theory and procedures for video use in assessment, iteratively adapting conditions across simulated OSCE stations. We explored examiners' and students' perceptions using think-aloud, interviews and focus group. Data were analysed using constructivist grounded-theory methods. RESULTS: Video-based assessment produced detachment and reduced volitional control for examiners. Examiners ability to make valid video-based judgements was mediated by the interaction of station content and specifically selected filming parameters. Examiners displayed several judgemental tendencies which helped them manage videos' limitations but could also bias judgements in some circumstances. Students rarely found carefully-placed cameras intrusive and considered filming acceptable if adequately justified. DISCUSSION: Successful use of video-based assessment relies on balancing the need to ensure station-specific information adequacy; avoiding disruptive intrusion; and the degree of justification provided by video's educational purpose. Video has the potential to enhance assessment validity and students' learning when an appropriate balance is achieved.

Assuntos

Competência Clínica , Educação Médica , Avaliação Educacional , Humanos , Julgamento

12.

Comparing the influence of 'describing findings to the examiner' or 'examining as in usual practice' on the students' performance and assessors' judgements during physical examination skills assessment.

Stephenson, Catherine; Yeates, Peter; Lefroy, Janet.

MedEdPublish (2016) ; 9: 18, 2020.

Artigo em Inglês | MEDLINE | ID: mdl-38073781

RESUMO

This article was migrated. The article was marked as recommended. BACKGROUND: Within assessment of physical examination skills, two approaches are common: "Describing Findings" (students comment throughout); and examining as "Usual Practice" (students only report findings at the end). Despite numerous potential influences on both students' performances and assessors' judgements, no prior studies have investigated the influence of either approach on assessments. METHODS: Two group, randomised, crossover design. Within a 2-station simulated physical examination OSCE, we manipulated whether students "described findings" or examined as "usual practice", collecting 1/. performance scores; 2/. Students'/examiners' cognitive load ratings; ratings of the 3/. fluency and 4/. completeness of students' presentations and 5/. Students' task-finishing, comparing all 5 end-points across conditions. RESULTS: Neither students' performance scores nor examiners' cognitive load were influenced by experimental condition. Students reported higher cognitive load (7/9) when "describing findings" than "usual practice" (6/9, p=0.002), and were less likely to finish (4 vs 12, p=0.007). Presentation completeness was higher for "describing findings" (mean=2.40, (95CIs=2.05-2.74)) than "usual practice" (mean=1.92 (1.65-2.18),p=0.016), whilst fluency ratings showed a similar trend. CONCLUSIONS: The decision to "Describe Findings" or examine as "Usual Practice" does not appear neutral, potentially influencing students' efficiency, recall and (by inference) learning. Institutions should explicitly select one option based on assessment goals.

13.

The mental workload of conducting research in assessor cognition.

Gingerich, Andrea; Yeates, Peter.

Perspect Med Educ ; 8(6): 315-316, 2019 12.

Artigo em Inglês | MEDLINE | ID: mdl-31745709

Assuntos

Cognição , Carga de Trabalho

14.

Perils, pitfalls and potential for the use of reporting guidelines in experimental research in medical education.

Moult, Alice; Yeates, Peter.

Perspect Med Educ ; 8(4): 207-208, 2019 08.

Artigo em Inglês | MEDLINE | ID: mdl-31325045

Assuntos

Educação Médica , Emoções , Aprendizagem

15.

Exploring differences in individual and group judgements in standard setting.

Yeates, Peter; Cope, Natalie; Luksaite, Eva; Hassell, Andrew; Dikomitis, Lisa.

Med Educ ; 53(9): 941-952, 2019 09.

Artigo em Inglês | MEDLINE | ID: mdl-31264741

RESUMO

CONTEXT: Standard setting is critically important to assessment decisions in medical education. Recent research has demonstrated variations between medical schools in the standards set for shared items. Despite the centrality of judgement to criterion-referenced standard setting methods, little is known about the individual or group processes that underpin them. This study aimed to explore the operation and interaction of these processes in order to illuminate potential sources of variability. METHODS: Using qualitative research, we purposively sampled across UK medical schools that set a low, medium or high standard on nationally shared items, collecting data by observation of graduation-level standard-setting meetings and semi-structured interviews with standard-setting judges. Data were analysed using thematic analysis based on the principles of grounded theory. RESULTS: Standard setting occurred through the complex interaction of institutional context, judges' individual perspectives and group interactions. Schools' procedures, panel members and atmosphere produced unique contexts. Individual judges formed varied understandings of the clinical and technical features of each question, relating these to their differing (sometimes contradictory) conceptions of minimally competent students, by balancing information and making suppositions. Conceptions of minimal competence variously comprised: limited attendance; limited knowledge; poor knowledge application; emotional responses to questions; 'test-savviness', or a strategic focus on safety. Judges experienced tensions trying to situate these abstract conceptions in reality, revealing uncertainty. Groups constructively revised scores through debate, sharing information and often constructing detailed clinical representations of cases. Groups frequently displayed conformity, illustrating a belief that outlying judges were likely to be incorrect. Less frequently, judges resisted change, using emphatic language, bargaining or, rarely, 'polarisation' to influence colleagues. CONCLUSIONS: Despite careful conduct through well-established procedures, standard setting is judgementally complex and involves uncertainty. Understanding whether or how these varied processes produce the previously observed variations in outcomes may offer routes to enhance equivalence of criterion-referenced standards.

Assuntos

Competência Clínica/normas , Educação de Graduação em Medicina , Julgamento , Tomada de Decisões , Avaliação Educacional/métodos , Processos Grupais , Conhecimentos, Atitudes e Prática em Saúde , Humanos , Padrões de Referência , Faculdades de Medicina , Reino Unido

16.

Developing a video-based method to compare and adjust examiner effects in fully nested OSCEs.

Yeates, Peter; Cope, Natalie; Hawarden, Ashley; Bradshaw, Hannah; McCray, Gareth; Homer, Matt.

Med Educ ; 53(3): 250-263, 2019 03.

Artigo em Inglês | MEDLINE | ID: mdl-30575092

RESUMO

BACKGROUND: Although averaging across multiple examiners' judgements reduces unwanted overall score variability in objective structured clinical examinations (OSCE), designs involving several parallel circuits of the OSCE require that different examiner cohorts collectively judge performances to the same standard in order to avoid bias. Prior research suggests the potential for important examiner-cohort effects in distributed or national examinations that could compromise fairness or patient safety, but despite their importance, these effects are rarely investigated because fully nested assessment designs make them very difficult to study. We describe initial use of a new method to measure and adjust for examiner-cohort effects on students' scores. METHODS: We developed video-based examiner score comparison and adjustment (VESCA): volunteer students were filmed 'live' on 10 out of 12 OSCE stations. Following the examination, examiners additionally scored station-specific common-comparator videos, producing partial crossing between examiner cohorts. Many-facet Rasch modelling and linear mixed modelling were used to estimate and adjust for examiner-cohort effects on students' scores. RESULTS: After accounting for students' ability, examiner cohorts differed substantially in their stringency or leniency (maximal global score difference of 0.47 out of 7.0 [Cohen's d = 0.96]; maximal total percentage score difference of 5.7% [Cohen's d = 1.06] for the same student ability by different examiner cohorts). Corresponding adjustment of students' global and total percentage scores altered the theoretical classification of 6.0% of students for both measures (either pass to fail or fail to pass), whereas 8.6-9.5% students' scores were altered by at least 0.5 standard deviations of student ability. CONCLUSIONS: Despite typical reliability, the examiner cohort that students encountered had a potentially important influence on their score, emphasising the need for adequate sampling and examiner training. Development and validation of VESCA may offer a means to measure and adjust for potential systematic differences in scoring patterns that could exist between locations in distributed or national OSCE examinations, thereby ensuring equivalence and fairness.

Assuntos

Competência Clínica/normas , Educação de Graduação em Medicina/normas , Avaliação Educacional/métodos , Avaliação Educacional/normas , Variações Dependentes do Observador , Gravação de Videoteipe/métodos , Educação de Graduação em Medicina/métodos , Humanos , Reprodutibilidade dos Testes , Estudantes de Medicina

17.

Comparatively salient: examining the influence of preceding performances on assessors' focus and interpretations in written assessment comments.

Gingerich, Andrea; Schokking, Edward; Yeates, Peter.

Adv Health Sci Educ Theory Pract ; 23(5): 937-959, 2018 Dec.

Artigo em Inglês | MEDLINE | ID: mdl-29980956

RESUMO

Recent literature places more emphasis on assessment comments rather than relying solely on scores. Both are variable, however, emanating from assessment judgements. One established source of variability is "contrast effects": scores are shifted away from the depicted level of competence in a preceding encounter. The shift could arise from an effect on the range-frequency of assessors' internal scales or the salience of performance aspects within assessment judgments. As these suggest different potential interventions, we investigated assessors' cognition by using the insight provided by "clusters of consensus" to determine whether any change in the salience of performance aspects was induced by contrast effects. A dataset from a previous experiment contained scores and comments for 3 encounters: 2 with significant contrast effects and 1 without. Clusters of consensus were identified using F-sort and latent partition analysis both when contrast effects were significant and non-significant. The proportion of assessors making similar comments only significantly differed when contrast effects were significant with assessors more frequently commenting on aspects that were dissimilar with the standard of competence demonstrated in the preceding performance. Rather than simply influencing range-frequency of assessors' scales, preceding performances may affect salience of performance aspects through comparative distinctiveness: when juxtaposed with the context some aspects are more distinct and selectively draw attention. Research is needed to determine whether changes in salience indicate biased or improved assessment information. The potential should be explored to augment existing benchmarking procedures in assessor training by cueing assessors' attention through observation of reference performances immediately prior to assessment.

Assuntos

Avaliação Educacional/normas , Ocupações em Saúde/educação , Variações Dependentes do Observador , Competência Clínica , Cognição , Comunicação , Avaliação Educacional/métodos , Humanos , Julgamento , Anamnese , Relações Profissional-Paciente , Método Simples-Cego , Reino Unido

18.

Exploring the relationship between examiners' memories for performances, domain separation and score variability.

Cleaton, Natasha; Yeates, Peter; McCray, Gareth.

Med Teach ; 40(11): 1159-1165, 2018 11.

Artigo em Inglês | MEDLINE | ID: mdl-29703091

RESUMO

Background: OSCE examiners' scores are variable and may discriminate domains of performance poorly. Examiners must hold their observations of OSCE performances in "episodic memory" until performances end. We investigated whether examiners vary in their recollection of performances; and whether this relates to their score variability or ability to separate disparate performance domains. Methods: Secondary analysis was performed on data where examiners had: 1/scored videos of OSCE performances showing disparate student ability in different domains; and 2/performed a measure of recollection for an OSCE performance. We calculated measures of "overall-score variance" (the degree individual examiners' overall scores varied from the group mean) and "domain separation" (the degree to which examiners separated different performance domains). We related these variables to the measure of examiners' recollection. Results: Examiners varied considerably in their recollection accuracy (recognition beyond chance -5% to +75% for different examiners). Examiners' recollection accuracy was weakly inversely related to their overall score accuracy (R = -0.17, p < 0.001) and related to their ability to separate domains of performance (R = 0.25, p < 0.001). Conclusions: Examiners vary substantially in their memories for students' performances which may offer a useful point of difference to study processing and integration phases of judgment. Findings could have implication for the utility of feedback.

Assuntos

Avaliação Educacional/normas , Julgamento , Rememoração Mental , Variações Dependentes do Observador , Competência Clínica , Feminino , Humanos , Masculino , Análise de Regressão , Reino Unido

19.

A randomised trial of the influence of racial stereotype bias on examiners' scores, feedback and recollections in undergraduate clinical exams.

Yeates, Peter; Woolf, Katherine; Benbow, Emyr; Davies, Ben; Boohan, Mairhead; Eva, Kevin.

BMC Med ; 15(1): 179, 2017 10 25.

Artigo em Inglês | MEDLINE | ID: mdl-29065875

RESUMO

BACKGROUND: Asian medical students and doctors receive lower scores on average than their white counterparts in examinations in the UK and internationally (a phenomenon known as "differential attainment"). This could be due to examiner bias or to social, psychological or cultural influences on learning or performance. We investigated whether students' scores or feedback show influence of ethnicity-related bias; whether examiners unconsciously bring to mind (activate) stereotypes when judging Asian students' performance; whether activation depends on the stereotypicality of students' performances; and whether stereotypes influence examiner memories of performances. METHODS: This is a randomised, double-blinded, controlled, Internet-based trial. We created near-identical videos of medical student performances on a simulated Objective Structured Clinical Exam using British Asian and white British actors. Examiners were randomly assigned to watch performances from white and Asian students that were either consistent or inconsistent with a previously described stereotype of Asian students' performance. We compared the two examiner groups in terms of the following: the scores and feedback they gave white and Asian students; how much the Asian stereotype was activated in their minds (response times to Asian-stereotypical vs neutral words in a lexical decision task); and whether the stereotype influenced memories of student performances (recognition rates for real vs invented stereotype-consistent vs stereotype-inconsistent phrases from one of the videos). RESULTS: Examiners responded to Asian-stereotypical words (716 ms, 95% confidence interval (CI) 702-731 ms) faster than neutral words (769 ms, 95% CI 753-786 ms, p < 0.001), suggesting Asian stereotypes were activated (or at least active) in examiners' minds. This occurred regardless of whether examiners observed stereotype-consistent or stereotype-inconsistent performances. Despite this stereotype activation, student ethnicity had no influence on examiners' scores; on the feedback examiners gave; or on examiners' memories for one performance. CONCLUSIONS: Examiner bias does not appear to explain the differential attainment of Asian students in UK medical schools. Efforts to ensure equality should focus on social, psychological and cultural factors that may disadvantage learning or performance in Asian and other minority ethnic students.

Assuntos

Competência Clínica , Educação Médica/normas , Povo Asiático , Método Duplo-Cego , Feminino , Humanos , Masculino , Racismo , Estudantes de Medicina , População Branca

20.

Hawks, Doves and Rasch decisions: Understanding the influence of different cycles of an OSCE on students' scores using Many Facet Rasch Modeling.

Yeates, Peter; Sebok-Syer, Stefanie S.

Med Teach ; 39(1): 92-99, 2017 Jan.

Artigo em Inglês | MEDLINE | ID: mdl-27897083

RESUMO

INTRODUCTION: OSCEs are commonly conducted in multiple cycles (different circuits, times, and locations), yet the potential for students' allocation to different OSCE cycles is rarely considered as a source of variance-perhaps in part because conventional psychometrics provide limited insight. METHODS: We used Many Facet Rasch Modeling (MFRM) to estimate the influence of "examiner cohorts" (the combined influence of the examiners in the cycle to which each student was allocated) on students' scores within a fully nested multi-cycle OSCE. RESULTS: Observed average scores for examiners cycles varied by 8.6%, but model-adjusted estimates showed a smaller range of 4.4%. Most students' scores were only slightly altered by the model; the greatest score increase was 5.3%, and greatest score decrease was -3.6%, with 2 students passing who would have failed. DISCUSSION: Despite using 16 examiners per cycle, examiner variability did not completely counter-balance, resulting in an influence of OSCE cycles on students' scores. Assumptions were required for the MFRM analysis; innovative procedures to overcome these limitations and strengthen OSCEs are discussed. CONCLUSIONS: OSCE cycle allocation has the potential to exert a small but unfair influence on students' OSCE scores; these little-considered influences should challenge our assumptions and design of OSCEs.

Assuntos

Educação de Graduação em Medicina/métodos , Educação de Graduação em Medicina/normas , Avaliação Educacional/métodos , Avaliação Educacional/normas , Competência Clínica , Humanos , Variações Dependentes do Observador , Resolução de Problemas , Reprodutibilidade dos Testes , Fatores de Tempo

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA