Results 1 - 2 of 2
1.
Acad Med; 95(1): 151-156, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31335813

ABSTRACT

PURPOSE: Item analysis is an important quality-monitoring strategy for written exams. Authors urge caution, however, because the statistics may be unstable with small cohorts, making the application of guidelines potentially detrimental. Given the small cohorts common in health professions education, this study aimed to determine the impact of cohort size on the outcomes of applying item analysis guidelines.

METHOD: The authors performed a Monte Carlo simulation study in fall 2015 to examine the impact of applying 2 commonly used item analysis guidelines on the proportion of items removed and on overall exam reliability as a function of cohort size. Three variables were manipulated: cohort size (6 levels), exam length (6 levels), and exam difficulty (3 levels). Study parameters were chosen based on data provided by several Canadian medical schools.

RESULTS: The proportion of items removed increased as exam difficulty decreased and as cohort size decreased; exam length had no effect on this outcome. After the guidelines were applied, exam length had a greater impact on exam reliability than cohort size did: reliability decreased more with shorter exams than with smaller cohorts.

CONCLUSIONS: Although program directors and assessment creators have little control over their cohort sizes, they can control the length of their exams. Longer exams make it possible to remove items with less damage to the exam's reliability than shorter exams, thereby reducing the negative impact of small cohorts when item removal guidelines are applied.


Subjects
Curriculum/standards, Educational Measurement/standards, Health Occupations/education, Schools, Medical/statistics & numerical data, Canada/epidemiology, Cohort Studies, Educational Measurement/statistics & numerical data, Evaluation Studies as Topic, Guidelines as Topic, Health Occupations/standards, Humans, Monte Carlo Method, Psychometrics/methods, Reproducibility of Results, Time Factors
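The simulation pipeline in the first abstract (generate responses, apply item analysis guidelines, compare reliability before and after removal) can be sketched in Python. This is a minimal illustration, not the authors' code: the response model, the guideline cut-offs (item difficulty between 0.20 and 0.95, item-rest point-biserial ≥ 0.15), and all function names are assumptions chosen for the example; the abstract does not state the study's exact thresholds.

```python
import random
import statistics

def simulate_exam(n_students, n_items, difficulty=0.7, seed=0):
    """Simulate dichotomous (1 = correct) responses: each item has a
    target p-value near the exam difficulty, shifted by student ability."""
    rng = random.Random(seed)
    item_p = [min(0.95, max(0.05, rng.gauss(difficulty, 0.15)))
              for _ in range(n_items)]
    ability = [rng.gauss(0, 1) for _ in range(n_students)]
    return [[1 if rng.random() < min(0.99, max(0.01, p + 0.1 * a)) else 0
             for p in item_p] for a in ability]

def kr20(resp):
    """Kuder-Richardson 20 reliability for dichotomous items."""
    k = len(resp[0])
    totals = [sum(row) for row in resp]
    var_total = statistics.pvariance(totals)
    if var_total == 0:
        return 0.0
    pq = sum((p := sum(row[j] for row in resp) / len(resp)) * (1 - p)
             for j in range(k))
    return (k / (k - 1)) * (1 - pq / var_total)

def point_biserial(resp, j):
    """Correlation between item j and the rest-of-test score."""
    item = [row[j] for row in resp]
    rest = [sum(row) - row[j] for row in resp]
    n = len(item)
    mi, mr = sum(item) / n, sum(rest) / n
    cov = sum((x - mi) * (y - mr) for x, y in zip(item, rest)) / n
    si = (sum((x - mi) ** 2 for x in item) / n) ** 0.5
    sr = (sum((y - mr) ** 2 for y in rest) / n) ** 0.5
    return cov / (si * sr) if si > 0 and sr > 0 else 0.0

def apply_guidelines(resp, min_p=0.20, max_p=0.95, min_rpb=0.15):
    """Return indices of items that survive the rule-of-thumb cut-offs
    on difficulty (p-value) and discrimination (point-biserial)."""
    n = len(resp)
    keep = []
    for j in range(len(resp[0])):
        p = sum(row[j] for row in resp) / n
        if min_p <= p <= max_p and point_biserial(resp, j) >= min_rpb:
            keep.append(j)
    return keep
```

Running this over a grid of cohort sizes and exam lengths, and comparing `kr20` before and after filtering, reproduces the kind of design the abstract describes: removal rates driven by cohort size and difficulty, reliability loss driven mainly by exam length.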
2.
Perspect Med Educ; 7(2): 83-92, 2018 Apr.
Article in English | MEDLINE | ID: mdl-29294255

ABSTRACT

INTRODUCTION: With the Standards voicing concern about the appropriateness of response processes, we need strategies that allow us to identify inappropriate rater response processes. Although certain statistics can help detect rater bias, their use is complicated either by a lack of data about their actual power to detect rater bias or by the difficulty of applying them in the context of health professions education. This exploratory study aimed to establish whether the lz statistic is worth pursuing as a means of detecting rater bias.

METHODS: We conducted a Monte Carlo simulation study to investigate the power of a specific detection statistic: the standardized log-likelihood lz person-fit statistic (PFS). Our primary outcome was the detection rate of biased raters (raters whom we manipulated into being either stringent, giving lower scores, or lenient, giving higher scores) using the lz statistic, while controlling for the number of biased raters in a sample (6 levels) and the rate of bias per rater (6 levels).

RESULTS: Overall, stringent raters (M = 0.84, SD = 0.23) were easier to detect than lenient raters (M = 0.31, SD = 0.28). More biased raters were easier to detect than less biased raters (60% bias: M = 0.62, SD = 0.37; 10% bias: M = 0.43, SD = 0.36).

DISCUSSION: The lz PFS appears to offer interesting potential for identifying biased raters. We observed detection rates as high as 90% for stringent raters for whom we had manipulated more than half of the checklist. Despite these encouraging results, we cannot generalize them to the use of PFS with estimated item/station parameters or with real data; such studies should be conducted to assess the feasibility of using PFS to identify rater bias.


Subjects
Bias, Educational Measurement/standards, Observer Variation, Research Personnel/psychology, Analysis of Variance, Humans, Monte Carlo Method, Research Personnel/standards
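The lz statistic in the second abstract has a standard closed form: the observed response log-likelihood, centered by its expectation and scaled by its standard deviation under the fitted model. The sketch below is a generic illustration assuming a Rasch response model with known difficulties; the study's actual item/station parameters and scoring model are not given in the abstract, so the parameter values here are invented for the example. A strongly negative lz flags responses (here, a rater's scores) that fit the model poorly.

```python
import math

def rasch_p(theta, b):
    """Rasch-model probability of a 1-response for person parameter
    theta and item (checklist-point) difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def lz(responses, theta, b_params):
    """Standardized log-likelihood person-fit statistic:
    lz = (l0 - E[l0]) / sqrt(Var[l0]),
    where l0 is the log-likelihood of the observed 0/1 responses."""
    l0 = expected = variance = 0.0
    for u, b in zip(responses, b_params):
        p = rasch_p(theta, b)
        q = 1.0 - p
        l0 += u * math.log(p) + (1 - u) * math.log(q)
        expected += p * math.log(p) + q * math.log(q)
        variance += p * q * math.log(p / q) ** 2
    return (l0 - expected) / math.sqrt(variance)
```

For a rater whose scores follow the model, lz stays near zero; a rater who fails easy checklist points while crediting hard ones (the aberrant pattern a stringent or lenient manipulation produces) yields a large negative lz, which is what a detection cut-off would flag.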