Results 1 - 6 of 6
1.
Appl Psychol Meas; 42(2): 136-154, 2018 Mar.
Article in English | MEDLINE | ID: mdl-29882542

ABSTRACT

The current study investigated the consequences of ignoring a multilevel structure for a mixture item response model, to show when a multilevel mixture item response model is needed. Study 1 focused on the consequences of ignoring dependency for within-level latent classes. Simulation conditions that may affect model selection and parameter recovery in the context of a multilevel data structure were manipulated: class-specific ICC, cluster size, and number of clusters. The accuracy of model selection (based on information criteria) and the quality of parameter recovery were used to evaluate the impact of ignoring a multilevel structure. Simulation results indicated that, for the range of class-specific ICCs examined here (.1 to .3), mixture item response models that ignored a higher-level nesting structure produced less accurate estimates and standard errors (SEs) of item discrimination parameters when the number of clusters was larger than 24 and the cluster size was larger than six. Class-varying ICCs can have compensatory effects on bias. The results also suggested that a mixture item response model that ignored the multilevel structure was not selected over the multilevel mixture item response model by the Bayesian information criterion (BIC) when the number of clusters and the cluster size were at least 50. In Study 2, the consequences of unnecessarily fitting a multilevel mixture item response model to single-level data were examined. Reassuringly, in the context of single-level data, a multilevel mixture item response model was not selected by BIC, and its use would not distort the within-level item parameter estimates or SEs when the cluster size was at least 20. Based on these findings, it is concluded that, for the class-specific ICC conditions examined here, a multilevel mixture item response model is recommended over a single-level item response model for a clustered dataset with a cluster size greater than 20 and more than 50 clusters.
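As a rough illustration of the data structure these simulations manipulate, the sketch below generates clustered binary item responses from a two-class Rasch-type mixture in which the between-cluster variance is chosen to match a target class-specific ICC. All function names, parameter values, and modeling shortcuts (for example, item difficulties shared across classes) are illustrative assumptions for this sketch, not the authors' simulation design.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_clustered_mixture(n_clusters=50, cluster_size=20, n_items=10,
                               class_icc=(0.1, 0.3), class_probs=(0.5, 0.5)):
    """Simulate binary item responses from a two-class Rasch-type mixture with
    cluster-level random effects whose variance matches a target ICC per class."""
    # With the within-cluster ability variance fixed at 1, setting the
    # between-cluster variance tau2 = icc / (1 - icc) gives ICC = tau2 / (tau2 + 1).
    tau2 = np.array([icc / (1.0 - icc) for icc in class_icc])
    item_difficulty = np.linspace(-1.5, 1.5, n_items)  # shared across classes for simplicity
    class_shift = np.array([-0.5, 0.5])                # class-specific mean ability

    responses, classes = [], []
    for _ in range(n_clusters):
        u = rng.normal(0.0, np.sqrt(tau2))             # one cluster effect per latent class
        k = rng.choice(len(class_probs), size=cluster_size, p=class_probs)
        theta = class_shift[k] + u[k] + rng.normal(0.0, 1.0, cluster_size)
        prob = 1.0 / (1.0 + np.exp(-(theta[:, None] - item_difficulty[None, :])))
        responses.append(rng.binomial(1, prob))
        classes.append(k)
    return np.vstack(responses), np.concatenate(classes)

Y, latent_class = simulate_clustered_mixture()
print(Y.shape, np.bincount(latent_class))
```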

2.
Psychometrika; 83(3): 751-771, 2018 Sep.
Article in English | MEDLINE | ID: mdl-29417454

ABSTRACT

As a method to ascertain person and item effects in psycholinguistics, a generalized linear mixed effect model (GLMM) with crossed random effects has met limitations in handling serial dependence across persons and items. This paper presents an autoregressive GLMM with crossed random effects that accounts for variability in lag effects across persons and items. The model is shown to be applicable to intensive binary time-series eye-tracking data when researchers are interested in detecting experimental condition effects while controlling for previous responses. In addition, a simulation study shows that ignoring lag effects can lead to biased estimates and underestimated standard errors for the experimental condition effects.
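To make the model concrete, here is a minimal simulation sketch of binary time-series responses with crossed person and item random intercepts and a lag-1 effect that varies over persons and items. The dimensions and parameter values are arbitrary assumptions chosen for illustration; this is a data-generating sketch, not the authors' estimation code, and fitting such a model would normally rely on specialized mixed-model or Bayesian software.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical dimensions and parameter values, chosen purely for illustration.
n_persons, n_items, n_times = 30, 20, 50
beta_condition = 0.8                    # experimental condition effect (log-odds)
mean_lag = 0.6                          # average lag-1 (autoregressive) effect
person_sd, item_sd = 0.7, 0.5           # SDs of the crossed random intercepts
person_lag_sd, item_lag_sd = 0.2, 0.2   # variability of the lag effect over persons and items

person_int = rng.normal(0.0, person_sd, n_persons)
item_int = rng.normal(0.0, item_sd, n_items)
person_lag = rng.normal(0.0, person_lag_sd, n_persons)
item_lag = rng.normal(0.0, item_lag_sd, n_items)
condition = rng.binomial(1, 0.5, (n_persons, n_items))  # 0/1 condition per person-item pair

y = np.zeros((n_persons, n_items, n_times), dtype=int)
for t in range(n_times):
    prev = y[:, :, t - 1] if t > 0 else np.zeros((n_persons, n_items), dtype=int)
    lag_effect = mean_lag + person_lag[:, None] + item_lag[None, :]
    eta = (beta_condition * condition
           + person_int[:, None] + item_int[None, :]
           + lag_effect * prev)
    y[:, :, t] = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))

print(f"overall response rate: {y.mean():.3f}")
```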


Subjects
Linear Models, Computer Simulation, Eye Movement Measurements, Humans, Psycholinguistics, Psychometrics, Time Factors
3.
J Vis; 18(1): 2, 2018 Jan 1.
Article in English | MEDLINE | ID: mdl-29305600

ABSTRACT

The presence of differential item functioning (DIF) in a test suggests bias that could disadvantage members of a certain group. Previous work with tests of visual learning abilities found significant DIF related to age groups in a car test (Lee, Cho, McGugin, Van Gulick, & Gauthier, 2015), but not in a face test (Cho et al., 2015). The presence of age DIF is a threat to the validity of the test even for studies where aging is not of interest. Here, we assessed whether this pattern of age DIF for cars and not faces would also apply to new tests targeting the same abilities with a new matching task that uses two studied items per trial. We found evidence for DIF in matching tests for faces and for cars, though with encouragingly small effect sizes. Even though the age DIF was small enough at the test level to be acceptable for most uses, we also asked whether the specific format of our matching tasks may induce some age-related DIF regardless of domain. We decomposed the face matching task into its components, and using new data from subjects performing these simpler tasks, found evidence that the age DIF was driven by the similarity of the two faces presented at study on each trial. Overall, our results suggest that using a matching format, especially for cars, reduces age-related DIF, and that a simpler matching task with only one study item per trial could reduce age DIF further.
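As a minimal illustration of what uniform age-related DIF means at the item level, the sketch below compares two-parameter logistic (2PL) response probabilities for a hypothetical item whose difficulty is shifted for the older group at matched ability levels. The 2PL form and all parameter values are assumptions made for illustration and are unrelated to the actual matching tests.

```python
import numpy as np

def p_2pl(theta, a, b):
    """Two-parameter logistic item response function."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

# Hypothetical item: same discrimination in both groups, but a 0.5-logit
# harder difficulty for the older group (uniform DIF).
theta = np.linspace(-3, 3, 7)          # matched ability levels
a, b_younger, b_older = 1.2, 0.0, 0.5

p_younger = p_2pl(theta, a, b_younger)
p_older = p_2pl(theta, a, b_older)
for t, py, po in zip(theta, p_younger, p_older):
    print(f"theta={t:+.1f}  P(younger)={py:.2f}  P(older)={po:.2f}  diff={py - po:+.2f}")
```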


Subjects
Aging/physiology, Automobiles, Facial Recognition/physiology, Form Perception/physiology, Adult, Aged, Female, Humans, Male, Middle Aged, Psychometrics/methods, Surveys and Questionnaires, Young Adult
4.
Appl Psychol Meas; 41(5): 353-371, 2017 Jul.
Article in English | MEDLINE | ID: mdl-29881097

ABSTRACT

The linear logistic test model (LLTM) has been widely applied to investigate the effects of item covariates on item difficulty. The LLTM has been extended with random item residuals to account for item differences not explained by the item covariates; this extended model is called the LLTM-R. In this article, statistical inference methods are investigated for these two models, and Type I error rates and power are compared via Monte Carlo studies. Based on the simulation results, the likelihood ratio test (LRT) is recommended over the paired-sample t test based on sum scores, the Wald z test, and information criteria, and it is recommended over the profile likelihood confidence interval because of its simplicity. In addition, it is concluded that the LLTM-R is the better general model approach: inferences based on the LLTM when the LLTM-R is the true model appear to be largely biased in a liberal direction, whereas inferences based on the LLTM-R when the LLTM is the true model are biased only in a very minor and conservative way. Furthermore, in the absence of residual variance, the Type I error rate and power were acceptable, except for power when the number of items was small (10 items) and the number of persons was also small (200 persons). In the presence of residual variance, however, the number of items needs to be large (80 items) to avoid an inflated Type I error rate and to reach a power level of .90 for a moderate effect.
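The sketch below illustrates, under assumed values, how the LLTM and LLTM-R build item difficulties from a Q-matrix of item covariates, and how a likelihood ratio test for a single covariate effect would be computed from the log-likelihoods of two fitted models. The Q-matrix, covariate weights, and log-likelihood values are placeholders invented for illustration, not quantities reported in the article.

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(3)

# Hypothetical Q-matrix (items x covariates) and covariate weights.
n_items = 10
Q = rng.binomial(1, 0.5, (n_items, 3))
eta = np.array([0.8, -0.4, 0.3])

beta_lltm = Q @ eta                                      # LLTM: difficulties fully explained by covariates
beta_lltm_r = beta_lltm + rng.normal(0.0, 0.3, n_items)  # LLTM-R: adds random item residuals

# Likelihood ratio test for one covariate effect: twice the log-likelihood
# difference between the models with and without that covariate, referred to a
# chi-square distribution with 1 degree of freedom. The log-likelihood values
# below are placeholders for what actual model fits would return.
loglik_restricted, loglik_full = -4321.0, -4314.2
lrt = 2.0 * (loglik_full - loglik_restricted)
print(f"LRT = {lrt:.1f}, p = {chi2.sf(lrt, df=1):.4f}")
```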

5.
Appl Psychol Meas; 40(8): 573-591, 2016 Nov.
Article in English | MEDLINE | ID: mdl-29881071

ABSTRACT

Researchers are commonly interested in group comparisons such as comparisons of group means, called impact, or comparisons of individual scores across groups. A meaningful comparison can be made between groups when there is no differential item functioning (DIF) or differential test functioning (DTF). During the past three decades, much progress has been made in detecting DIF and DTF; however, little research has been conducted on what researchers can do after such detection. This study presents and evaluates a confirmatory multigroup multidimensional item response model to obtain purified item parameter estimates, person scores, and impact estimates on the primary dimension, controlling for a secondary dimension due to DIF. In addition, through simulation studies, the item response model approach was compared with current practices of DIF treatment, such as deleting DIF items, ignoring DIF items, and using multigroup item response models. The authors suggested guidelines for DIF treatment based on the simulation study results.
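One common way to formalize the idea of a secondary dimension that absorbs DIF is a two-dimensional logistic item response function in which only focal-group responses to DIF items load on the secondary dimension. The sketch below uses that parameterization with invented values; it is an illustrative assumption, not necessarily the exact confirmatory model specified by the authors.

```python
import numpy as np

def p_mirt(theta_primary, theta_secondary, a1, a2, b, is_focal):
    """Two-dimensional logistic item response function in which the secondary
    dimension absorbs DIF: a2 is nonzero only for DIF items, and only focal-group
    responses depend on theta_secondary (an illustrative parameterization)."""
    eta = a1 * theta_primary + a2 * theta_secondary * is_focal - b
    return 1.0 / (1.0 + np.exp(-eta))

# Hypothetical parameters for one DIF item.
a1, a2, b = 1.1, 0.6, 0.2
for is_focal in (0, 1):
    p = p_mirt(theta_primary=0.5, theta_secondary=0.3,
               a1=a1, a2=a2, b=b, is_focal=is_focal)
    print(f"focal group = {is_focal}: P(correct) = {p:.3f}")
```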

6.
J Vis; 15(13): 23, 2015.
Article in English | MEDLINE | ID: mdl-26418499

ABSTRACT

The Vanderbilt Expertise Test for cars (VETcar) is a test of visual learning for contemporary car models. We used item response theory to assess the VETcar and in particular used differential item functioning (DIF) analysis to ask if the test functions the same way in laboratory versus online settings and for different groups based on age and gender. An exploratory factor analysis found evidence of multidimensionality in the VETcar, although a single dimension was deemed sufficient to capture the recognition ability measured by the test. We selected a unidimensional three-parameter logistic item response model to examine item characteristics and subject abilities. The VETcar had satisfactory internal consistency. A substantial number of items showed DIF at a medium effect size for test setting and for age group, whereas gender DIF was negligible. Because online subjects were on average older than those tested in the lab, we focused on the age groups to conduct a multigroup item response theory analysis. This revealed that most items on the test favored the younger group. DIF could be more the rule than the exception when measuring performance with familiar object categories, therefore posing a challenge for the measurement of either domain-general visual abilities or category-specific knowledge.
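As a small numerical illustration of the kind of model involved, the sketch below evaluates a three-parameter logistic (3PL) item in two age groups that differ only in difficulty, and summarizes the DIF as the expected difference in the probability of a correct response over a standard-normal ability distribution. All parameter values are hypothetical, and this simple summary is a stand-in for, not the specific DIF effect-size measure used in, the article.

```python
import numpy as np

def p_3pl(theta, a, b, c):
    """Three-parameter logistic item response function
    (discrimination a, difficulty b, pseudo-guessing c)."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

# Hypothetical parameters for one item in two age groups; only the difficulty
# differs, i.e. uniform DIF favoring the younger group.
theta = np.linspace(-4, 4, 401)
weights = np.exp(-0.5 * theta**2)
weights /= weights.sum()               # standard-normal weights over the ability grid

p_younger = p_3pl(theta, a=1.4, b=-0.2, c=0.25)
p_older = p_3pl(theta, a=1.4, b=0.3, c=0.25)

# A simple signed DIF summary: the expected difference in the probability of a
# correct response across the ability distribution.
signed_dif = np.sum(weights * (p_younger - p_older))
print(f"expected probability difference (younger - older) = {signed_dif:.3f}")
```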


Subjects
Automobiles, Learning/physiology, Visual Perception/physiology, Adult, Female, Humans, Logistic Models, Male, Middle Aged, Neuropsychological Tests, Psychometrics/methods, Young Adult