1.
Appl Psychol Meas ; 42(2): 155-168, 2018 Mar.
Article in English | MEDLINE | ID: mdl-29881117

ABSTRACT

This study compares the kernel equating (KE) and test characteristic curve (TCC) equating methods using the nonequivalent anchor test equating design. In this Monte Carlo study, five independent variables were examined: sample size, test length, average form discrimination, anchor test reliability, and the percentage of anchor items. For each condition, there were 100 replications. To assess the performance of TCC equating and KE, the differences between examinees' parametric true scores and the equated estimated expected true scores were examined. The equated scores were based on the average across replications for each condition. Generally speaking, both KE and TCC equating produced accurate results, although KE tended to perform better than TCC on the parametric true score scale across conditions. Past research and the current study's results indicate that KE should be strongly considered for most equating situations, particularly in light of its flexibility.
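
A rough illustration of the accuracy criterion described above: equated expected true scores, averaged across replications, are compared with parametric true scores under a 2PL model. This is a hedged sketch, not the study's code; the item parameters are invented, and the equating step is replaced by a noisy placeholder where KE or TCC equating of simulated anchor-test data would run.

    import numpy as np

    def tcc(theta, a, b):
        """Test characteristic curve under a 2PL model: expected true score."""
        p = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b[None, :])))
        return p.sum(axis=1)  # sum of item response probabilities

    rng = np.random.default_rng(0)
    n_items, n_reps = 40, 100
    a = rng.lognormal(0.0, 0.2, n_items)   # item discriminations (invented)
    b = rng.normal(0.0, 1.0, n_items)      # item difficulties (invented)
    theta = np.linspace(-3, 3, 61)         # ability evaluation points

    true_score = tcc(theta, a, b)          # parametric true scores

    # Placeholder for per-replication equated scores; in the study these
    # would come from KE or TCC equating of simulated data.
    equated = true_score[None, :] + rng.normal(0, 0.5, (n_reps, theta.size))

    mean_equated = equated.mean(axis=0)    # average across replications
    bias = mean_equated - true_score
    rmse = np.sqrt(((equated - true_score) ** 2).mean(axis=0))
    print(f"max |bias| = {np.abs(bias).max():.3f}, mean RMSE = {rmse.mean():.3f}")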

2.
J Sch Psychol ; 60: 25-40, 2017 Feb.
Article in English | MEDLINE | ID: mdl-28164797

ABSTRACT

Mixture item response theory (IRT) allows one to address situations that involve a mixture of latent subpopulations that are qualitatively different but within which a measurement model based on a continuous latent variable holds. In this modeling framework, one can characterize students both by their location on a continuous latent variable and by their latent class membership. For example, in a study of risky youth behavior this approach would make it possible to estimate an individual's propensity to engage in risky behavior (i.e., on a continuous scale) and to use these estimates to identify youth who might be at the greatest risk given their class membership. Mixture IRT can be used with binary response data (e.g., true/false, agree/disagree, endorsement/non-endorsement, correct/incorrect, presence/absence of a behavior), Likert response scales, partial credit scoring, nominal scales, or rating scales. In the following, we present mixture IRT modeling and two examples of its use. Data needed to reproduce the analyses in this article are available as supplemental online materials at http://dx.doi.org/10.1016/j.jsp.2016.01.002.


Subject(s)
Models, Statistical , Psychometrics/methods , Adolescent , Humans
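
To make the mixture IRT idea concrete, here is a minimal, hypothetical simulation of binary responses from a two-class mixture Rasch model: each examinee has both a continuous trait value and a latent class membership, and the classes differ qualitatively in their item difficulties. All parameter values are invented for illustration and do not come from the article.

    import numpy as np

    rng = np.random.default_rng(1)
    n_persons, n_items = 500, 20

    # Class-specific item difficulties: the two latent classes are
    # qualitatively different measurement populations.
    b = np.stack([rng.normal(-0.5, 1.0, n_items),
                  rng.normal(0.5, 1.0, n_items)])
    cls = rng.integers(0, 2, n_persons)        # latent class membership
    theta = rng.normal(0.0, 1.0, n_persons)    # continuous latent trait

    # Rasch probabilities use the difficulties of each person's own class.
    p = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[cls])))
    responses = (rng.random((n_persons, n_items)) < p).astype(int)
    print(responses.shape, responses.mean())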
3.
Health Educ Behav ; 35(3): 346-60, 2008 Jun.
Article in English | MEDLINE | ID: mdl-17449632

ABSTRACT

The purpose of this study was to obtain validity evidence for the Physical Activity and Healthy Food Efficacy Scale for Children (PAHFE). Construct validity evidence identified four subscales: Goal-Setting for Physical Activity, Goal-Setting for Healthy Food Choices, Decision-Making for Physical Activity, and Decision-Making for Healthy Food Choices. The scores on each of these subscales show a moderate to high degree of internal consistency (coefficients of 0.59 or higher).

Subject(s)
Diet/psychology , Exercise/psychology , Self Efficacy , Surveys and Questionnaires , Adolescent , Child , Female , Goals , Health Education , Humans , Male , Reproducibility of Results , Socioeconomic Factors
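
Internal consistency of the kind reported above is most commonly quantified with Cronbach's alpha. A minimal sketch of that computation for one subscale, assuming alpha is the statistic in question and using fabricated Likert responses in place of the PAHFE data:

    import numpy as np

    def cronbach_alpha(items):
        """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
        items = np.asarray(items, dtype=float)
        k = items.shape[1]
        item_var = items.var(axis=0, ddof=1).sum()  # sum of item variances
        total_var = items.sum(axis=1).var(ddof=1)   # variance of total score
        return (k / (k - 1)) * (1.0 - item_var / total_var)

    rng = np.random.default_rng(2)
    subscale = rng.integers(1, 6, size=(200, 6))    # fabricated 5-point items
    print(f"alpha = {cronbach_alpha(subscale):.2f}")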
4.
J Appl Meas ; 7(3): 278-91, 2006.
Article in English | MEDLINE | ID: mdl-16807494

ABSTRACT

Certain assessment situations produce partial credit data. For instance, performance assessment items may use a rubric that assigns partial credit to some not completely correct responses. In some cases, examinees may choose not to answer every question. This study investigated the effect of various strategies for handling these missing responses on the estimation of a respondent's location. These methods included ignoring the omitted response, selecting the "midpoint" category score, treating the omitted response as incorrect, hot-decking, and a likelihood-based approach. A simulation study was performed to examine the efficacy of these methods with the partial credit and generalized partial credit models. Expected a posteriori (EAP) ability estimation was used. Results showed that the midpoint and likelihood procedures performed best of the methods examined. In contrast, omitted responses should be neither treated as incorrect nor ignored when estimating an examinee's proficiency using EAP. Implications for practitioners are discussed.


Subject(s)
Data Interpretation, Statistical , Models, Statistical , Surveys and Questionnaires , Data Collection/methods , Humans , United States
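
A sketch of how three of the strategies above (ignore, incorrect, midpoint) change an EAP estimate under a partial credit model. The step parameters, quadrature grid, and response vector are invented, and the hot-deck and likelihood-based strategies are omitted for brevity; this illustrates the mechanics, not the study's implementation.

    import numpy as np

    def pcm_probs(theta, deltas):
        """Partial credit model category probabilities for one item;
        deltas holds the step parameters (n_categories - 1 of them)."""
        steps = np.concatenate(([0.0], np.cumsum(theta - deltas)))
        expz = np.exp(steps - steps.max())
        return expz / expz.sum()

    def eap(responses, item_deltas, nq=61):
        """EAP ability estimate; None marks an omitted response."""
        nodes = np.linspace(-4, 4, nq)
        prior = np.exp(-0.5 * nodes ** 2)          # standard normal prior
        like = np.ones(nq)
        for x, deltas in zip(responses, item_deltas):
            if x is None:
                continue                           # 'ignore' strategy
            like *= np.array([pcm_probs(t, deltas)[x] for t in nodes])
        post = prior * like
        return (nodes * post).sum() / post.sum()

    deltas = [np.array([-0.5, 0.5])] * 10          # 10 three-category items
    resp = [2, 1, None, 2, 0, 1, None, 2, 1, 1]    # None = omit

    ignore = eap(resp, deltas)
    incorrect = eap([0 if x is None else x for x in resp], deltas)
    midpoint = eap([1 if x is None else x for x in resp], deltas)
    print(f"ignore={ignore:.2f} incorrect={incorrect:.2f} midpoint={midpoint:.2f}")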
5.
J Appl Meas ; 4(1): 1-9, 2003.
Article in English | MEDLINE | ID: mdl-12700427

ABSTRACT

In social science research a number of instruments use a rating scale, such as a Likert response scale. For a number of reasons, a respondent's response vector may not contain a response to every item. This study investigated the effect on a respondent's location estimate when a respondent is presented with an item, has ample time to answer it, but decides not to respond. For such situations, different strategies have been developed for handling missing data. In this study, four approaches for handling missing data were investigated for their ability to mitigate the effect of omitted responses on person location estimation: ignoring the omitted response, selecting the "midpoint" response category, hot-decking, and a likelihood-based approach. A Monte Carlo study was performed, and the effect of different levels of omission on the simulees' location estimates was determined. Results showed that the hot-decking procedure performed best of the methods examined. Implications for practitioners are discussed.


Subject(s)
Social Sciences/statistics & numerical data , Statistics as Topic , Humans , Monte Carlo Method
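
The hot-deck idea can be sketched as follows: an omitted Likert response is filled in with the same item's response from a "donor" chosen among complete respondents who look similar on the items the respondent did answer. The matching rule used here (mean of the observed items) is one simple choice and not necessarily the variant used in the study.

    import numpy as np

    def hot_deck(data, rng):
        """Impute NaN entries from the closest complete respondent;
        assumes at least one fully complete row exists."""
        filled = data.copy()
        complete = ~np.isnan(data).any(axis=1)
        donors = data[complete]
        for i in np.where(~complete)[0]:
            obs = ~np.isnan(data[i])
            dist = np.abs(donors[:, obs].mean(axis=1) - data[i, obs].mean())
            donor = donors[rng.choice(np.flatnonzero(dist == dist.min()))]
            filled[i, ~obs] = donor[~obs]
        return filled

    rng = np.random.default_rng(3)
    likert = rng.integers(1, 6, size=(100, 8)).astype(float)
    likert[rng.random(likert.shape) < 0.05] = np.nan   # 5% omitted responses
    print(np.isnan(hot_deck(likert, rng)).sum(), "missing values remain")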
6.
Multivariate Behav Res ; 26(4): 765-92, 1991 Oct 01.
Article in English | MEDLINE | ID: mdl-26751030

ABSTRACT

The application of item response theory (IRT) models requires identifying the dimensionality of the data. A popular method for determining the number of latent dimensions is factor analysis of a correlation matrix. Unlike factor analysis, which is based on a linear model, IRT assumes a nonlinear relationship between item performance and ability. Because multidimensional scaling (MDS) assumes only a monotonic relationship, it may be useful for assessing a data set's dimensionality for use with IRT models. This study compared MDS with exploratory and confirmatory factor analysis (EFA and CFA, respectively) in assessing the dimensionality of data sets that had been generated to be either one- or two-dimensional. In addition, the data sets differed in the degree of interdimensional correlation and in the number of items defining a dimension. Results showed that MDS and CFA correctly identified the number of latent dimensions for all data sets. In general, EFA also correctly identified the data's dimensionality, except for data whose interdimensional correlation was high.
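
A compact illustration of the kind of eigenvalue evidence such comparisons rest on, using simulated two-dimensional binary data with correlated dimensions. For simplicity this sketch uses phi (Pearson) inter-item correlations, where a faithful EFA of binary items would typically use tetrachorics, and classical (metric) MDS in place of nonmetric MDS; it is not a reconstruction of the study's procedure.

    import numpy as np

    rng = np.random.default_rng(4)
    n, k = 1000, 20

    # Two correlated latent dimensions, ten binary (Rasch-like) items each.
    theta = rng.multivariate_normal([0, 0], [[1, 0.3], [0.3, 1]], n)
    dim = np.repeat([0, 1], k // 2)            # which dimension each item loads on
    b = rng.normal(0, 1, k)                    # item difficulties (invented)
    p = 1 / (1 + np.exp(-(theta[:, dim] - b)))
    x = (rng.random((n, k)) < p).astype(float)

    # EFA-style evidence: eigenvalues of the inter-item correlation matrix.
    r = np.corrcoef(x, rowvar=False)
    print("correlation eigenvalues:", np.round(np.linalg.eigvalsh(r)[::-1][:4], 2))

    # Classical MDS on inter-item distances (1 - |r|) via double centering.
    d2 = (1 - np.abs(r)) ** 2
    j = np.eye(k) - np.ones((k, k)) / k
    print("MDS eigenvalues:", np.round(np.linalg.eigvalsh(-0.5 * j @ d2 @ j)[::-1][:4], 2))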
