Results 1 - 10 of 10
1.
Adv Health Sci Educ Theory Pract ; 20(5): 1263-89, 2015 Dec.
Article in English | MEDLINE | ID: mdl-25808311

ABSTRACT

The authors report final-year ward simulation data from the University of Dundee Medical School. Faculty who designed this assessment intend for the final score to represent an individual senior medical student's level of clinical performance. The results are included in each student's portfolio as one source of evidence of the student's capability as a practitioner, professional, and scholar. Our purpose in conducting this study was to illustrate how assessment designers who are creating assessments to evaluate clinical performance might develop propositions and then collect and examine various sources of evidence to construct and evaluate a validity argument. The data were from all 154 medical students who were in their final year of study at the University of Dundee Medical School in the 2010-2011 academic year. To the best of our knowledge, this is the first report on an analysis of senior medical students' clinical performance while they were taking responsibility for the management of a simulated ward. Using multi-facet Rasch measurement and a generalizability theory approach, we examined various sources of validity evidence that the medical school faculty have gathered for a set of six propositions needed to support their use of scores as measures of students' clinical ability. Based on our analysis of the evidence, we would conclude that, by and large, the propositions appear to be sound, and the evidence seems to support their proposed score interpretation. Given the body of evidence collected thus far, their intended interpretation seems defensible.


Subjects
Clinical Competence , Educational Measurement/methods , Educational Measurement/standards , Patient Simulation , Communication , Female , Humans , Interprofessional Relations , Male , Patient Safety , Physician-Patient Relations , Professionalism , Random Allocation , Reproducibility of Results
2.
Am J Pharm Educ ; 77(1): 5, 2013 Feb 12.
Article in English | MEDLINE | ID: mdl-23459441

ABSTRACT

Objective. To determine whether defined subgroups of pharmacists vary in their expectations for the competency of entry-level practitioners. Methods. Rating scale data collected from the 2009 National Pharmacy Practice Survey were analyzed to determine to what extent pharmacists' degree, practice setting, and experience as a preceptor were associated with the ratings they assigned to 43 competency statements for entry-level practitioners. The competency statements determine the content of the North American Pharmacist Licensure Examination (NAPLEX). Results. Pharmacists with a doctor of pharmacy (PharmD) degree rated the competency statements higher in terms of criticality to entry-level practice than did those with a bachelor of science (BS) degree (p < 0.05). Pharmacists working in inpatient settings gave slightly higher ratings to the competency statements than did pharmacists working in outpatient settings, pharmacists without direct patient care responsibilities, and those in academia. However, there were no significant differences among practitioner subgroups' criticality ratings with regard to practice setting. Preceptor pharmacists' criticality ratings of the competency statements were not significantly different from those of non-preceptor practitioners. Conclusion. Pharmacists exhibited a fair amount of agreement in their expectations for the competence of entry-level practitioners, independent of their practice sites and professional roles. As the pharmacy profession embraces patient-centered clinical practice, evaluating practicing pharmacists' expectations for entry-level practitioners will provide useful information to the practitioners and academicians involved in training future pharmacists. Stakeholders in pharmacy education and regulation have vested interests in the alignment of the education of future practitioners with the needs of the profession.


Subjects
Attitude of Health Personnel , Education, Pharmacy/standards , Health Knowledge, Attitudes, Practice , Pharmacists/psychology , Pharmacists/standards , Professional Competence/standards , Professional Role , Community Pharmacy Services/standards , Data Collection , Education, Pharmacy, Graduate/standards , Educational Measurement , Educational Status , Faculty/standards , Humans , Licensure, Pharmacy/standards , Linear Models , Pharmacy Service, Hospital/standards , Preceptorship/standards , Societies, Pharmaceutical , United States
3.
Acad Med ; 88(2): 216-23, 2013 Feb.
Article in English | MEDLINE | ID: mdl-23269299

ABSTRACT

PURPOSE: The authors report data from the multiple mini-interview (MMI) selection process at the University of Dundee Medical School, in which staff, students, and simulated patients served as examiners. They investigated how effective this process was in separating candidates for entry into medical school according to the attributes measured, whether the different groups of examiners exhibited systematic differences in their rating patterns, and what effect such differences might have on candidates' scores. METHOD: The 452 candidates assessed in 2009 rotated through the same 10-station MMI that measured six noncognitive attributes. Each candidate was rated by one examiner in each station. Scores were analyzed using Facets software, with candidates, examiners, and stations as facets. The computer program calculated fair average scores that adjusted for examiner severity/leniency and station difficulty. RESULTS: The MMI reliably (0.89) separated the candidates into four statistically distinct levels of noncognitive ability. The Rasch measures accounted for 31.69% of the total variance in the ratings (candidates 16.01%, examiners 11.32%, and stations 4.36%). Students rated more severely than staff and also had more unexpected ratings. Adjusting scores for examiner severity/leniency and station difficulty would have changed the selection outcomes for 9.6% of the candidates. CONCLUSIONS: The analyses highlighted the fact that quality control monitoring is essential to ensure fairness when ranking candidates according to scores obtained in the MMI. The results can be used to identify examiners needing further training, or who should not be included again, as well as stations needing review. "Fair average" scores should be used for ranking the candidates.
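The "fair average" adjustment mentioned above can be conveyed with a deliberately simplified sketch on invented data. Facets estimates examiner severity and station difficulty jointly on a logit scale; this raw-score version only illustrates the idea of removing an examiner's severity from a candidate's ratings.

```python
# Simplified "fair average" illustration (hypothetical data, not the Facets
# algorithm): estimate each examiner's severity as the deviation of their mean
# rating from the grand mean, then add it back to the ratings they gave.

ratings = [  # (candidate, examiner, score on a 1-5 scale)
    ("A", "ex1", 4), ("A", "ex2", 2),
    ("B", "ex1", 5), ("B", "ex2", 3),
    ("C", "ex1", 3), ("C", "ex2", 1),
]

grand_mean = sum(s for _, _, s in ratings) / len(ratings)

# Examiner severity = grand mean minus the examiner's own mean rating
# (positive severity means the examiner tends to rate low).
severity = {}
for e in {ex for _, ex, _ in ratings}:
    scores = [s for _, ex, s in ratings if ex == e]
    severity[e] = grand_mean - sum(scores) / len(scores)

def fair_average(candidate):
    """Mean of the candidate's ratings after compensating for examiner severity."""
    adj = [s + severity[e] for c, e, s in ratings if c == candidate]
    return sum(adj) / len(adj)

print(fair_average("A"), fair_average("B"), fair_average("C"))
```

On these invented ratings, the lenient examiner ex1 and severe examiner ex2 cancel out, so each candidate's fair average sits between the two raw scores.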


Subjects
Aptitude Tests , Interviews as Topic/methods , School Admission Criteria , Schools, Medical , Adolescent , Adult , Faculty, Medical , Female , Humans , Interviews as Topic/standards , Male , Models, Statistical , Observer Variation , Psychometrics , Scotland , Students, Medical/psychology , Young Adult
4.
Acad Med ; 84(11): 1603-9, 2009 Nov.
Article in English | MEDLINE | ID: mdl-19858824

ABSTRACT

PURPOSE: To determine (1) whether judges differed in the levels of severity they exercised when rating candidates' performance in an oral certification exam, (2) to what extent candidates' clinical competence ratings were related to their organization/communication ratings, and (3) to what extent clinical competence ratings could predict organization/communication ratings. METHOD: Six hundred eighty-four physicians participated in a medical specialty board's 2002 oral examination. Ninety-nine senior members of the medical specialty served as judges, rating candidates' performances. Candidates' clinical competence ratings were analyzed using multifaceted Rasch measurement to investigate judge severity. A Pearson correlation was calculated to examine the relationship between ratings of clinical competence and organization/communication. Logistic regression was used to determine to what extent clinical competence ratings predicted organization/communication ratings. RESULTS: There were about three statistically distinct strata of judge severity; judges were not interchangeable. There was a moderately strong relationship between the two sets of candidate ratings. Higher clinical competence ratings were associated with an organization/communication rating of acceptable, whereas lower clinical competence ratings were associated with an organization/communication rating of unacceptable. The judges' clinical competence ratings correctly predicted 61.9% of the acceptable and 88.3% of the unacceptable organization/communication ratings. Overall, the clinical competence ratings correctly predicted 80% of the organization/communication ratings. CONCLUSIONS: The close association between the two sets of ratings was possibly due to a "halo" effect. Several explanations for this relationship were explored, and the authors considered the implications for their understanding of how judges carry out this complex rating task.
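The kind of prediction reported above can be sketched with a minimal logistic regression on invented data (the study's actual ratings and model specifics are not reproduced here): a binary acceptable/unacceptable outcome is regressed on a single competence rating, fit by plain gradient ascent.

```python
import math

# Minimal logistic-regression sketch (hypothetical data): predict an
# acceptable (1) vs. unacceptable (0) organization/communication rating from
# a clinical competence rating, fit by gradient ascent on the log-likelihood.

competence = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
acceptable = [0,   0,   0,   1,   0,   1,   1,   1]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

b0, b1, lr = 0.0, 0.0, 0.1
for _ in range(20000):
    g0 = sum(y - sigmoid(b0 + b1 * x) for x, y in zip(competence, acceptable))
    g1 = sum((y - sigmoid(b0 + b1 * x)) * x for x, y in zip(competence, acceptable))
    b0 += lr * g0 / len(competence)
    b1 += lr * g1 / len(competence)

predicted = [int(sigmoid(b0 + b1 * x) >= 0.5) for x in competence]
accuracy = sum(p == y for p, y in zip(predicted, acceptable)) / len(acceptable)
print(f"slope={b1:.2f}, intercept={b0:.2f}, accuracy={accuracy:.0%}")
```

The positive fitted slope mirrors the finding that higher clinical competence ratings go with acceptable organization/communication ratings, and the classification accuracy plays the role of the percentage of correctly predicted ratings.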


Subjects
Certification , Clinical Competence/standards , Communication , Decision Making , Social Perception , Specialty Boards/standards , Educational Measurement/standards , Humans , Illinois , Logistic Models , Psychometrics
5.
J Appl Meas ; 10(1): 52-69, 2009.
Article in English | MEDLINE | ID: mdl-19299885

ABSTRACT

This article is based on a more extensive research report (Engelhard, Myford and Cline, 2000) prepared for the National Board for Professional Teaching Standards (NBPTS) concerning the Early Childhood/Generalist and Middle Childhood/Generalist assessment systems. The report is available from the Educational Testing Service (ETS). An earlier version of the article was presented at the American Educational Research Association Conference in New Orleans in 2000. We would like to acknowledge the helpful advice of Mike Linacre regarding the use of the FACETS computer program and the assistance of Fred Cline in analyzing these data. The material contained in this article is based on work supported by the NBPTS. Any opinions, findings, conclusions, and recommendations expressed herein are those of the authors and do not necessarily reflect the views of the NBPTS, Emory University, ETS, or the University of Illinois at Chicago.


Subjects
Evaluation Studies as Topic , Professional Competence/standards , Teaching/standards , Adult , Aged , Female , Humans , Male , Middle Aged , Models, Statistical , Professional Competence/statistics & numerical data
6.
Adv Health Sci Educ Theory Pract ; 14(4): 575-94, 2009 Oct.
Article in English | MEDLINE | ID: mdl-18985427

ABSTRACT

The investigators used evidence based on response processes to evaluate and improve the validity of scores on the Patient-Centered Communication and Interpersonal Skills (CIS) Scale for the assessment of residents' communication competence. The investigators retrospectively analyzed the communication skills ratings of 68 residents at the University of Illinois at Chicago (UIC). Each resident encountered six standardized patients (SPs) portraying six cases. SPs rated the performance of each resident using the CIS Scale--an 18-item rating instrument asking for level of agreement on a 5-category scale. A many-faceted Rasch measurement model was used to determine how effectively each item and scale on the rating instrument performed. The analyses revealed that items were too easy for the residents. The SPs underutilized the lowest rating category, making the scale function as a 4-category rating scale. Some SPs were inconsistent when assigning ratings in the middle categories. The investigators modified the rating instrument based on the findings, creating the Revised UIC Communication and Interpersonal Skills (RUCIS) Scale--a 13-item rating instrument that employs a 4-category behaviorally anchored rating scale for each item. The investigators implemented the RUCIS Scale in a subsequent communication skills OSCE for 85 residents. The analyses revealed that the RUCIS Scale functioned more effectively than the CIS Scale in several respects (e.g., a more uniform distribution of ratings across categories, and better fit of the items to the measurement model). However, SPs still rarely assigned ratings in the lowest rating category of each scale.
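The category diagnostics and collapsing step described above can be sketched on hypothetical data. The study itself used Rasch category statistics and behaviorally anchored revisions; this sketch simply counts category use and merges a rarely used bottom category into its neighbor.

```python
from collections import Counter

# Rating-category diagnostic sketch (hypothetical ratings on a 1-5 scale):
# count how often each category is used, then collapse an underused bottom
# category so the scale functions with four categories.

ratings = [5, 4, 4, 5, 3, 4, 2, 5, 4, 3, 5, 4, 5, 3, 4, 5, 2, 4, 5, 4]

usage = Counter(ratings)
total = len(ratings)
rare = [cat for cat in range(1, 6) if usage[cat] / total < 0.05]  # <5% use

# Collapse category 1 into category 2, then renumber 2..5 as 1..4.
collapsed = [max(r, 2) - 1 for r in ratings]

print("underused categories:", rare)
print("collapsed scale categories:", sorted(set(collapsed)))
```

Here the never-used category 1 is flagged, and the remapped ratings occupy a 4-category scale, mirroring the observation that the 5-category CIS instrument functioned as a 4-category one.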


Subjects
Communication , Internship and Residency , Interpersonal Relations , Patient-Centered Care , Physician-Patient Relations , Female , Health Status Indicators , Humans , Male , Research , Retrospective Studies
7.
Adv Health Sci Educ Theory Pract ; 13(4): 479-93, 2008 Nov.
Article in English | MEDLINE | ID: mdl-17310306

ABSTRACT

An Objective Structured Clinical Examination (OSCE) is an effective method for evaluating competencies. However, scores obtained from an OSCE are vulnerable to many potential measurement errors that cases, items, or standardized patients (SPs) can introduce. Monitoring these sources of errors is an important quality control mechanism to ensure valid interpretations of the scores. We describe how one can use generalizability theory (GT) and many-faceted Rasch measurement (MFRM) approaches in quality control monitoring of an OSCE. We examined the communication skills OSCE of 79 residents from one Midwestern university in the United States. Each resident performed six communication tasks with SPs, who rated the performance of each resident using 18 5-category rating scale items. We analyzed their ratings with generalizability and MFRM studies. The generalizability study revealed that the largest source of error variance besides the residual error variance was SPs/cases. The MFRM study identified specific SPs/cases and items that introduced measurement errors and suggested the nature of the errors. SPs/cases were significantly different in their levels of severity/difficulty. Two SPs gave inconsistent ratings, which suggested problems related to the ways they portrayed the case, their understanding of the rating scale, and/or the case content. SPs interpreted two of the items inconsistently, and the rating scales for two items did not function as 5-category scales. We concluded that generalizability and MFRM analyses provided useful complementary information for monitoring and improving the quality of an OSCE.
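The generalizability-study mechanics behind the variance decomposition above can be shown in a minimal one-facet sketch on invented scores. The OSCE study crossed more facets (SPs/cases and items); this persons-by-raters example only illustrates how variance components fall out of the mean squares.

```python
# One-facet G-study sketch (hypothetical persons x raters design): estimate
# variance components from the mean squares of a fully crossed layout.

scores = [  # rows = residents (persons), columns = raters
    [4, 3, 4],
    [5, 4, 5],
    [2, 2, 3],
    [3, 3, 4],
]
n_p, n_r = len(scores), len(scores[0])

grand = sum(sum(row) for row in scores) / (n_p * n_r)
p_means = [sum(row) / n_r for row in scores]
r_means = [sum(scores[i][j] for i in range(n_p)) / n_p for j in range(n_r)]

ss_p = n_r * sum((m - grand) ** 2 for m in p_means)
ss_r = n_p * sum((m - grand) ** 2 for m in r_means)
ss_tot = sum((scores[i][j] - grand) ** 2
             for i in range(n_p) for j in range(n_r))
ss_res = ss_tot - ss_p - ss_r

ms_p = ss_p / (n_p - 1)
ms_r = ss_r / (n_r - 1)
ms_res = ss_res / ((n_p - 1) * (n_r - 1))

var_res = ms_res                   # person-x-rater interaction + residual error
var_p = (ms_p - ms_res) / n_r      # universe-score (person) variance
var_r = (ms_r - ms_res) / n_p      # rater (severity) variance

# Generalizability coefficient for a mean over n_r raters
g_coef = var_p / (var_p + var_res / n_r)
print(round(var_p, 3), round(var_r, 3), round(g_coef, 3))
```

A large rater or residual component relative to the person component is exactly the kind of signal the study treated as a measurement-error source worth monitoring.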


Subjects
Communication , Educational Measurement/methods , Internal Medicine/education , Internship and Residency , Quality Control , Adult , Chi-Square Distribution , Clinical Competence , Education, Medical, Graduate , Female , Humans , Male , Patient Simulation
8.
J Appl Meas ; 5(2): 189-227, 2004.
Article in English | MEDLINE | ID: mdl-15064538

ABSTRACT

The purpose of this two-part paper is to introduce researchers to the many-facet Rasch measurement (MFRM) approach for detecting and measuring rater effects. In Part II of the paper, researchers will learn how to use the Facets (Linacre, 2001) computer program to study five effects: leniency/severity, central tendency, randomness, halo, and differential leniency/severity. As we introduce each effect, we operationally define it within the context of a MFRM approach, specify the particular measurement model(s) needed to detect it, identify group- and individual-level statistical indicators of the effect, and show output from a Facets analysis, pinpointing the various indicators and explaining how to interpret each one. At the close of the paper, we describe other statistical procedures that have been used to detect and measure rater effects to help researchers become aware of important and influential literature on the topic and to gain an appreciation for the diversity of psychometric perspectives that researchers bring to bear on their work. Finally, we consider future directions for research in the detection and measurement of rater effects.
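Two of the rater effects named above, leniency/severity and central tendency, can be screened for with crude raw-score statistics on hypothetical data. MFRM detects these effects with model-based logit estimates and fit statistics; these simple screens only illustrate the underlying concepts.

```python
import statistics

# Raw-score screens for two rater effects (hypothetical 1-5 ratings):
# leniency/severity = deviation of a rater's mean from the group mean;
# central tendency = an unusually small spread of ratings. Thresholds are
# invented for illustration.

ratings_by_rater = {
    "r1": [3, 4, 3, 4, 3, 4],   # typical
    "r2": [5, 5, 4, 5, 5, 5],   # lenient
    "r3": [3, 3, 3, 3, 3, 3],   # central tendency (no spread)
}

group_mean = statistics.mean(
    r for rs in ratings_by_rater.values() for r in rs
)

flags = {}
for rater, rs in ratings_by_rater.items():
    deviation = statistics.mean(rs) - group_mean
    spread = statistics.pstdev(rs)
    flags[rater] = {
        "lenient": deviation > 0.5,
        "severe": deviation < -0.5,
        "central_tendency": spread < 0.25,
    }

print(flags)
```

Randomness, halo, and differential leniency/severity require comparing ratings against model expectations across facets, which is why the paper's Facets-based indicators go beyond what raw-score screens like these can show.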


Subjects
Models, Psychological , Psychometrics/methods , Psychometrics/statistics & dados numéricos
9.
J Appl Meas ; 4(4): 386-422, 2003.
Article in English | MEDLINE | ID: mdl-14523257

ABSTRACT

The purpose of this two-part paper is to introduce researchers to the many-facet Rasch measurement (MFRM) approach for detecting and measuring rater effects. The researcher will learn how to use the Facets (Linacre, 2001) computer program to study five effects: leniency/severity, central tendency, randomness, halo, and differential leniency/severity. Part 1 of the paper provides critical background and context for studying MFRM. We present a catalog of rater effects, introducing effects that researchers have studied over the last three-quarters of a century in order to help readers gain a historical perspective on how those effects have been conceptualized. We define each effect and describe various ways the effect has been portrayed in the research literature. We then explain how researchers theorize that the effect impacts the quality of ratings, pinpoint various indices they have used to measure it, and describe various strategies that have been proposed to try to minimize its impact on the measurement of ratees. The second half of Part 1 provides conceptual and mathematical explanations of many-facet Rasch measurement, focusing on how researchers can use MFRM to study rater effects. First, we present the many-facet version of Andrich's (1978) rating scale model and identify questions about a rating operation that researchers can address using this model. We then introduce three hybrid MFRM models, explain the conceptual distinctions among them, describe how they differ from the rating scale model, and identify questions about a rating operation that researchers can address using these hybrid models.
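For reference, the many-facet version of Andrich's rating scale model discussed above is commonly written as follows (notation varies across sources):

```latex
% Log-odds of ratee n receiving category k rather than k-1
% from rater j on item i
\ln\!\left( \frac{P_{nijk}}{P_{nij(k-1)}} \right)
  = \theta_n - \delta_i - \alpha_j - \tau_k
```

where \(\theta_n\) is the proficiency of ratee \(n\), \(\delta_i\) the difficulty of item \(i\), \(\alpha_j\) the severity of rater \(j\), and \(\tau_k\) the threshold at which category \(k\) becomes more probable than category \(k-1\). The hybrid models the paper introduces modify which facets share a common set of thresholds.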


Subjects
Models, Statistical , Research/standards , Humans , Observer Variation , Reproducibility of Results , Research/statistics & numerical data
10.
J Appl Meas ; 3(3): 300-24, 2002.
Article in English | MEDLINE | ID: mdl-12147915

ABSTRACT

The purpose of this study was to examine a procedure for identifying and resolving discrepancies in ratings. We sought to determine to what extent the third-rater adjudication procedure employed in scoring the Test of Spoken English (TSE) successfully identified all anomalous ratings. We analyzed data from the April 1997 TSE scoring session using FACETS, a rating scale analysis computer program. The results suggest that, while it is important for an assessment program to identify cases in which there is obvious disagreement in the ratings assigned and to have a policy to resolve those disagreements, implementing a discrepancy resolution procedure is not sufficient in and of itself for quality control monitoring. Often, there are other anomalous ratings that discrepancy resolution procedures may miss. Fit analysis can provide a valuable adjunct to a discrepancy resolution procedure, flagging suspect rating profiles in need of expert review before a final score report is issued.
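The contrast drawn above between a discrepancy rule and a fit analysis can be sketched on invented data. The actual TSE procedure and FACETS fit statistics are not reproduced here; the data, score scale, and thresholds below are hypothetical.

```python
import statistics

# Two quality-control layers (hypothetical data on a 20-60-style score scale):
# a discrepancy rule catches rating pairs that openly disagree, while a crude
# residual screen (standing in for a full Rasch fit analysis) surfaces a
# consistent anomaly the rule never triggers on.

# (examinee, task, first rating, second rating)
data = [
    ("e1", 1, 50, 50),
    ("e2", 1, 40, 60),   # 20-point gap -> third-rater adjudication
    ("e3", 1, 50, 41),   # 9-point gaps slip past the rule...
    ("e3", 2, 50, 41),
    ("e4", 1, 50, 41),   # ...but the second rater's consistent shift shows up
]

# Layer 1: discrepancy rule - adjudicate when ratings differ by > 10 points.
adjudicate = [(e, t) for e, t, a, b in data if abs(a - b) > 10]

# Layer 2: residual screen - the median signed difference of the second rater
# relative to the first (robust to the one adjudicated outlier) reveals a
# systematic shift that never exceeds the discrepancy threshold on any paper.
shift = statistics.median(b - a for _, _, a, b in data)
suspect_second_rater = abs(shift) > 5

print(adjudicate, shift, suspect_second_rater)
```

Only e2 is adjudicated, yet the second rater's steady 9-point shift is exactly the kind of anomalous pattern the abstract argues a discrepancy rule alone will miss and a fit analysis can flag.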


Subjects
Educational Measurement/standards , Language Tests/statistics & numerical data , Adult , Educational Measurement/methods , Female , Humans , Linear Models , Male , Mathematical Computing , Observer Variation , Quality Control , Software/standards