Results 1 - 12 of 12
2.
Appl Psychol Meas ; 45(1): 71-73, 2021 Jan.
Article in English | MEDLINE | ID: mdl-33304022
3.
J Appl Meas ; 21(4): 400-419, 2020.
Article in English | MEDLINE | ID: mdl-33989197

ABSTRACT

Accurate parameter estimation in the Rasch model rests on the assumption of conditional independence, also termed local independence: conditional on ability, the responses to items A and B should be independent. Two types of conditional dependence are detailed in this pedagogical piece: trait dependency and response dependency. The bias in difficulty and reliability estimates, together with the fit statistics and correlated residuals that result from these dependencies, is compared and contrasted with results from models that account for the dependency. Contrasts with results from a 2-parameter item response theory model are also briefly noted.
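As a companion to the abstract, here is a minimal sketch of how response dependency can surface as correlated residuals under the Rasch model. The item parameters, the injected dependency, and the Q3-style residual check are illustrative assumptions, not the authors' design; the sketch also uses the generating parameters in place of estimates for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
n_persons, n_items = 2000, 10
theta = rng.normal(0, 1, n_persons)      # person abilities
b = np.linspace(-1.5, 1.5, n_items)      # item difficulties

# Rasch probability of success: P = 1 / (1 + exp(-(theta - b)))
p = 1 / (1 + np.exp(-(theta[:, None] - b[None, :])))
x = (rng.random((n_persons, n_items)) < p).astype(float)

# Inject response dependency: item 1 copies item 0's response for
# half the sample, violating local independence.
dep = rng.random(n_persons) < 0.5
x[dep, 1] = x[dep, 0]

# Yen's Q3-style check: correlate residuals (observed - expected)
# across item pairs; local independence implies near-zero values.
resid = x - p
q3 = np.corrcoef(resid, rowvar=False)
print(f"Q3 for dependent pair (0,1):   {q3[0, 1]:.2f}")
print(f"Q3 for independent pair (0,2): {q3[0, 2]:.2f}")
```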


Subject(s)
Psychometrics , Bias , Reproducibility of Results
4.
J Appl Meas ; 20(1): 1-12, 2019.
Article in English | MEDLINE | ID: mdl-30789829

ABSTRACT

This paper investigates a strategy for accounting for correct guessing with the Rasch model that we call the Guessing Adjustment. The strategy identifies all person/item encounters where the probability of a correct response falls below a specified threshold, converts those responses to missing data, and conducts the calibration a second time. This simulation study focuses on the effects of different probability thresholds across varying conditions of sample size, amount of correct guessing, and item difficulty. Bias, standard error, and root mean squared error (RMSE) were calculated within each condition. Larger probability thresholds were generally associated with reductions in bias and increases in standard errors. Across most conditions, the reduction in bias outweighed the loss of precision, as reflected by the RMSE. The Guessing Adjustment is thus an effective means of reducing the impact of correct guessing, and the choice of probability threshold matters.
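A sketch of the recoding step the abstract describes, under stated assumptions: the function name and the 0.2 threshold are illustrative, and the person and item parameters are taken from an initial Rasch calibration.

```python
import numpy as np

def guessing_adjustment(x, theta, b, threshold=0.2):
    """Recode responses as missing where the Rasch-model probability of
    success falls below `threshold` (0.2 is an illustrative choice).

    x     : (persons, items) 0/1 response matrix
    theta : person measures from an initial Rasch calibration
    b     : item difficulties from the same calibration
    """
    p = 1 / (1 + np.exp(-(theta[:, None] - b[None, :])))
    x_adj = x.astype(float)
    x_adj[p < threshold] = np.nan   # low-probability encounters -> missing
    return x_adj

# The Rasch calibration would then be run a second time on x_adj.
```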


Subject(s)
Models, Statistical , Bias , Probability , Psychometrics , Sample Size
5.
Educ Psychol Meas ; 79(1): 151-169, 2019 Feb.
Article in English | MEDLINE | ID: mdl-30636786

ABSTRACT

Previous work showing that revised parallel analysis can be effective with dichotomous items used a two-parameter model and normally distributed abilities. In this study, both two- and three-parameter models were used with normally distributed and skewed ability distributions. Relatively minor skew and kurtosis in the underlying ability distribution had almost no effect on Type I error for unidimensional data and slightly reduced power for two-dimensional data at the smaller sample size of 400. Fitting a two-parameter model to three-parameter data dramatically increased rejection rates for the unidimensional data. Fitting the correct three-parameter model reduced the unidimensional rejection rates but yielded lower power than for two-parameter data in some conditions.
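For readers unfamiliar with the technique, a simplified Horn-style parallel analysis for dichotomous data is sketched below. It is a stand-in for the revised parallel analysis studied in the article, which resamples in a more sophisticated way; the function name and the 95th-percentile criterion are assumptions.

```python
import numpy as np

def parallel_analysis(x, n_reps=100, percentile=95, seed=0):
    """Compare observed eigenvalues of the inter-item correlation matrix
    with eigenvalues from random 0/1 data that preserves each item's
    proportion correct; retain dimensions whose observed eigenvalue
    exceeds the chosen percentile of the reference distribution."""
    rng = np.random.default_rng(seed)
    n, k = x.shape
    obs_eig = np.linalg.eigvalsh(np.corrcoef(x, rowvar=False))[::-1]

    p_correct = x.mean(axis=0)
    ref = np.empty((n_reps, k))
    for r in range(n_reps):
        sim = (rng.random((n, k)) < p_correct).astype(float)
        ref[r] = np.linalg.eigvalsh(np.corrcoef(sim, rowvar=False))[::-1]

    crit = np.percentile(ref, percentile, axis=0)
    return int(np.sum(obs_eig > crit))  # number of retained dimensions
```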

6.
J Appl Meas ; 18(2): 163-177, 2017.
Article in English | MEDLINE | ID: mdl-28961152

ABSTRACT

In many areas of statistics it is common practice to present both a statistical significance test and an effect size. In contrast, for the Infit and Outfit indices of item misfit, it has historically been common to focus on either the mean square (MS; an index of the magnitude of misfit) or the statistical significance, but not both. If statistical significance and effect size are to be used together, it is important not only that the Type I error rate match the nominal alpha level, but also that, for any given magnitude of misfit, the expected value of the MS be independent of sample size. This study confirmed that the average MS for several simulated misfitting items was nearly the same for large and small samples, although the variance necessarily depended on sample size. Thus, if the item fit is statistically significant, the MS appears to be a reasonable index for judging the magnitude of the misfit in the sample, although one must recognize that the estimate of the magnitude will be less stable in small samples, as is true for all effect sizes.
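The standard dichotomous-Rasch definitions of the two mean squares, computed directly (a minimal sketch; the parameter names are placeholders, and in practice theta and b would be estimates from a calibration):

```python
import numpy as np

def infit_outfit(x, theta, b):
    """Mean-square fit statistics for dichotomous Rasch data.

    Outfit MS_i = mean over persons of z^2, where z = (x - P) / sqrt(W);
    Infit  MS_i = sum(W * z^2) / sum(W), with variance W = P(1 - P).
    Both have expectation near 1; their sampling variance shrinks with N.
    """
    p = 1 / (1 + np.exp(-(theta[:, None] - b[None, :])))
    w = p * (1 - p)                 # binomial variance per encounter
    z2 = (x - p) ** 2 / w           # squared standardized residuals
    outfit = z2.mean(axis=0)
    infit = (w * z2).sum(axis=0) / w.sum(axis=0)
    return infit, outfit
```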


Subject(s)
Data Collection/statistics & numerical data , Data Interpretation, Statistical , Models, Statistical , Humans , Reproducibility of Results
7.
Appl Psychol Meas ; 41(5): 323-337, 2017 Jul.
Article in English | MEDLINE | ID: mdl-29881095

ABSTRACT

The purpose of this study was to examine the performance of the Metropolis-Hastings Robbins-Monro (MH-RM) algorithm in the estimation of multilevel multidimensional item response theory (ML-MIRT) models. The accuracy and efficiency of MH-RM in recovering item parameters, latent variances and covariances, as well as ability estimates within and between clusters (e.g., schools) were investigated in a simulation study, varying the number of dimensions, the intraclass correlation coefficient, the number of clusters, and cluster size, for a total of 24 conditions. Overall, MH-RM performed well in recovering the item, person, and group-level parameters of the model. Ratios of the empirical to analytical standard errors indicated that the analytical standard errors reported in flexMIRT were somewhat overestimated for the cluster-level ability estimates, a little too large for the person-level ability estimates, and essentially accurate for the other parameters. Limitations of the study, implications for educational measurement practice, and directions for future research are offered.
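For context, a sketch of the two-level ability structure such a model typically assumes; the exact parameterization in the article may differ. Each ability dimension splits into a between-cluster and a within-cluster component, and the intraclass correlation coefficient varied in the simulation is the between-cluster share of the variance:

```latex
% Person n in cluster j, ability dimension k (assumed parameterization):
\theta_{njk} = \theta^{(B)}_{jk} + \theta^{(W)}_{njk},
\qquad
\theta^{(B)}_{jk} \sim N\!\left(0, \sigma^2_{B,k}\right),\quad
\theta^{(W)}_{njk} \sim N\!\left(0, \sigma^2_{W,k}\right),
% with intraclass correlation coefficient
\mathrm{ICC}_k = \frac{\sigma^2_{B,k}}{\sigma^2_{B,k} + \sigma^2_{W,k}} .
```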

8.
Educ Psychol Meas ; 76(2): 231-257, 2016 Apr.
Article in English | MEDLINE | ID: mdl-29795864

ABSTRACT

Partially compensatory models may represent the cognitive skills needed to answer test items more realistically than compensatory models do, but estimating their parameters can be a challenge. Data were simulated to follow two different partially compensatory models: a model with an interaction term and a product model. The parameters of both models, and of the compensatory model, were then estimated. Either the model used to simulate the data or the compensatory model generally fit best, as indexed by information criteria. Interfactor correlations were estimated well by both the correct model and the compensatory model. The predicted response probabilities were most accurate from the model used to simulate the data. Regarding item parameters, root mean square errors seemed reasonable for the interaction model but were quite large for some items under the product model. Ability estimates (thetas) were recovered similarly by all models, regardless of which model generated the data.
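A sketch of the two-dimensional forms involved, under assumed parameterizations (the article's exact slopes, intercepts, and guessing terms may differ). In the compensatory model the dimensions trade off inside a single logit; in the product model success requires every skill; the interaction model sits in between:

```latex
% Compensatory: dimensions trade off inside one logit,
P(x = 1 \mid \boldsymbol{\theta})
  = \frac{1}{1 + e^{-(a_1\theta_1 + a_2\theta_2 + d)}} .
% Product (fully non-compensatory): success requires every skill,
P(x = 1 \mid \boldsymbol{\theta})
  = \prod_{k=1}^{2} \frac{1}{1 + e^{-a_k(\theta_k - b_k)}} .
% Interaction: a \theta_1\theta_2 term allows partial compensation,
P(x = 1 \mid \boldsymbol{\theta})
  = \frac{1}{1 + e^{-(a_1\theta_1 + a_2\theta_2 + a_3\theta_1\theta_2 + d)}} .
```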

9.
Educ Psychol Meas ; 75(4): 610-633, 2015 Aug.
Article in English | MEDLINE | ID: mdl-29795835

ABSTRACT

In educational testing, differential item functioning (DIF) statistics must be estimated accurately to ensure that the appropriate items are flagged for inspection or removal. This study shows how using the Rasch model to estimate DIF may introduce considerable bias in the results when group differences in ability (impact) are large and the data follow a three-parameter logistic model. With large group ability differences, difficult non-DIF items appeared to favor the focal group and easy non-DIF items appeared to favor the reference group; correspondingly, the effect sizes for DIF items were biased. These effects were mitigated when data were coded as missing for item-examinee encounters in which the person measure was considerably lower than the item location. The results are explained by illustrating how the item response function is differentially distorted by guessing depending on the groups' ability distributions. In terms of practical implications, measurement practitioners should not trust DIF estimates from the Rasch model when there is a large difference in ability and examinees can potentially answer items correctly by guessing, unless data from examinees poorly matched to the item difficulty are coded as missing.
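The mitigation step lends itself to a short sketch. The abstract does not state a cutoff, so the 1-logit gap below is an illustrative assumption, as are the function and parameter names:

```python
import numpy as np

def drop_mismatched(x, theta, b, gap=1.0):
    """Recode encounters as missing when a person's measure sits far
    below the item location, the region where guessing dominates the
    Rasch residual. The 1-logit gap is illustrative, not the article's
    value.
    """
    x_adj = x.astype(float)
    mismatch = (b[None, :] - theta[:, None]) > gap
    x_adj[mismatch] = np.nan
    return x_adj

# DIF statistics would then be re-estimated from x_adj.
```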

10.
J Appl Meas ; 14(2): 179-96, 2013.
Article in English | MEDLINE | ID: mdl-23816595

ABSTRACT

Using a scale of test-taking motivation designed to have multiple factors, results are compared from a confirmatory factor analysis (CFA) using LISREL and a multidimensional Rasch partial credit model using ConQuest. Both analyses work with latent factors and allow the comparison of nested models. CFA typically specifies a linear relationship between observed and latent variables, whereas Rasch models specify a non-linear one. The CFA software provides many more measures of overall fit than ConQuest, which focuses on the fit of individual items. Despite the conceptual differences between the techniques, the results were similar: the data fit a three-dimensional model better than the one- and two-dimensional models also hypothesized, although some misfit remained.
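The linear-versus-nonlinear contrast the abstract draws can be written out explicitly (a sketch under standard textbook parameterizations; the article's notation may differ):

```latex
% CFA: a linear link between observed item score x_i and latent factor eta,
x_i = \lambda_i \eta + \varepsilon_i .
% Rasch partial credit model: a nonlinear (logistic) link; the probability
% that person n scores category k on item i with M_i + 1 categories is
P(X_{ni} = k) =
  \frac{\exp \sum_{j=0}^{k} (\theta_n - \delta_{ij})}
       {\sum_{m=0}^{M_i} \exp \sum_{j=0}^{m} (\theta_n - \delta_{ij})},
\qquad \sum_{j=0}^{0} (\theta_n - \delta_{i0}) \equiv 0 .
```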


Subject(s)
Algorithms , Data Interpretation, Statistical , Factor Analysis, Statistical , Models, Statistical , Psychometrics/methods , Surveys and Questionnaires , Computer Simulation , Sample Size
11.
Behav Ther ; 42(3): 462-74, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21658528

ABSTRACT

There are no empirically supported psychosocial treatments for adolescents with attention-deficit hyperactivity disorder (ADHD). This study examined the treatment benefits of the Challenging Horizons Program (CHP), a psychosocial treatment program designed to address the impairment and symptoms associated with this disorder in young adolescents. In addition to evaluating social and academic functioning outcomes, two critical questions from previous studies, pertaining to the timing and duration of treatment and to family involvement, were addressed. Forty-nine students recruited in two cohorts were randomly assigned to receive either the CHP or a community care condition. Outcomes suggested that students who received the CHP improved relative to students in the control condition on measures of symptoms and impairment. Implications related to timing, duration, and family involvement are reported, as well as recommendations for future studies.


Subject(s)
Attention Deficit Disorder with Hyperactivity/psychology , Attention Deficit Disorder with Hyperactivity/therapy , Educational Status , Psychotherapy/methods , Social Adjustment , Adolescent , Child , Family/psychology , Female , Humans , Male , Psychiatric Status Rating Scales , Schools , Severity of Illness Index , Time Factors , Wechsler Scales/statistics & numerical data
12.
J Appl Meas ; 5(4): 350-61, 2004.
Article in English | MEDLINE | ID: mdl-15496743

ABSTRACT

A multidimensional Rasch model was applied to two instruments measuring abilities in two related areas of a university general education curriculum. Grades from related courses were also calibrated using the Rasch model. Thus, course grades, test items, and persons were all placed on the same metric. Incorporating grades within the metric provided additional meaning to the measures; instructors could see which items were matched to students in a particular grade range for a course. This could help not only in interpreting items but also in interpreting grades. Test items and grades fit the model reasonably well, with adequate person separation reliability.


Subject(s)
Curriculum , Models, Educational , Aptitude Tests , Calibration , Educational Status , Humans , Psychometrics , Reproducibility of Results , Treatment Outcome , Universities