1.
Educ Psychol Meas ; 84(2): 271-288, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38898876

ABSTRACT

This note demonstrates that the widely used Bayesian Information Criterion (BIC) should not generally be viewed as a routinely dependable index for model selection when the bifactor and second-order factor models are examined as rival means for data description and explanation. To this end, we use an empirically relevant setting with multidimensional measuring instrument components, where the bifactor model is consistently found inferior to the second-order model in terms of the BIC, even though the data for a large number of replications at different sample sizes were generated following the bifactor model. We therefore caution researchers that routine reliance on the BIC for the purpose of discriminating between these two widely used models may not always lead to correct decisions with respect to model choice.
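The BIC comparison the abstract describes can be sketched in a few lines. This is an illustrative computation with made-up log-likelihoods and parameter counts, not the authors' data; it only shows the mechanism by which a more parsimonious model can win on BIC:

```python
import numpy as np

def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian Information Criterion; lower values indicate the preferred model."""
    return n_params * np.log(n_obs) - 2.0 * log_likelihood

# Hypothetical fit results for two rival models of the same data set:
bic_bifactor = bic(log_likelihood=-4521.3, n_params=30, n_obs=500)
bic_second_order = bic(log_likelihood=-4530.8, n_params=24, n_obs=500)

# The parsimony penalty k*ln(n) can let the second-order model attain the
# lower BIC even when its log-likelihood is worse, which is the pattern
# the note above cautions about.
print(bic_bifactor, bic_second_order)
```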

2.
Educ Psychol Meas ; 83(4): 766-781, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37398845

ABSTRACT

The population relationship between coefficient alpha and scale reliability is studied in the widely used setting of unidimensional multicomponent measuring instruments. It is demonstrated that for any set of component loadings on the common factor, regardless of the extent of their inequality, the discrepancy between alpha and reliability can be arbitrarily small in any considered population and hence practically ignorable. In addition, the set of parameter values where this discrepancy is negligible is shown to possess the same dimensionality as that of the underlying model parameter space. The article contributes to the measurement and related literature by pointing out that (a) approximate or strict loading identity is not a necessary condition for the utility of alpha as a trustworthy index of scale reliability, and (b) coefficient alpha can be a dependable reliability measure with any extent of inequality in the component loadings.
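The population quantities contrasted in this abstract can be computed directly from a single-factor model's parameters. The sketch below assumes a congeneric model with unit factor variance; the loading and error-variance values are illustrative, not taken from the article:

```python
import numpy as np

def population_alpha_and_reliability(loadings, error_vars):
    """Population coefficient alpha and true composite reliability for a
    single-factor (congeneric) model with unit factor variance."""
    lam = np.asarray(loadings, dtype=float)
    theta = np.asarray(error_vars, dtype=float)
    k = lam.size
    # Model-implied covariance matrix: lam lam' + diag(theta)
    sigma = np.outer(lam, lam) + np.diag(theta)
    total_var = sigma.sum()
    alpha = (k / (k - 1)) * (1.0 - np.trace(sigma) / total_var)
    reliability = lam.sum() ** 2 / total_var
    return alpha, reliability

# Clearly unequal loadings, yet the alpha-reliability discrepancy stays
# modest (hypothetical values, for illustration only):
a, w = population_alpha_and_reliability([0.9, 0.7, 0.5, 0.3], [0.4, 0.4, 0.4, 0.4])
print(a, w)
```

For equal loadings (essential tau-equivalence) the two coefficients coincide; otherwise alpha is a lower bound on reliability, with a gap that need not be practically meaningful.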

3.
Med Decis Making ; 43(6): 680-691, 2023 08.
Article in English | MEDLINE | ID: mdl-37401184

ABSTRACT

BACKGROUND: For the representative problem of prostate cancer grading, we sought to simultaneously model both the continuous nature of the case spectrum and the decision thresholds of individual pathologists, allowing quantitative comparison of how they handle cases at the borderline between diagnostic categories. METHODS: Experts and pathology residents each rated a standardized set of prostate cancer histopathological images on the International Society of Urological Pathologists (ISUP) scale used in clinical practice. They diagnosed 50 histologic cases with a range of malignancy, including intermediate cases in which clear distinction was difficult. We report a statistical model showing the degree to which each individual participant can separate the cases along the latent decision spectrum. RESULTS: The slides were rated by 36 physicians in total: 23 ISUP pathologists and 13 residents. As anticipated, the cases showed a full continuous range of diagnostic severity. Cases ranged along a logit scale consistent with the consensus rating (Consensus ISUP 1: -0.93 logits [95% confidence interval (CI) -1.10 to -0.78]; ISUP 2: -0.19 logits [-0.27 to -0.12]; ISUP 3: 0.56 logits [0.06 to 1.06]; ISUP 4: 1.24 logits [1.10 to 1.38]; ISUP 5: 1.92 logits [1.80 to 2.04]). The best raters were able to meaningfully discriminate between all 5 ISUP categories, showing intercategory thresholds that were quantifiably precise and meaningful. CONCLUSIONS: We present a method that allows simultaneous quantification of both the confusability of a particular case and the skill with which raters can distinguish the cases. IMPLICATIONS: The technique generalizes beyond the current example to other clinical situations in which a diagnostician must impose an ordinal rating on a biological spectrum.
HIGHLIGHTS: Question: How can we quantify skill in visual diagnosis for cases that sit at the border between 2 ordinal categories, cases that are inherently difficult to diagnose? Findings: In this analysis of pathologists and residents rating prostate biopsy specimens, decision-aligned response models are calculated that show how pathologists would be likely to classify any given case on the diagnostic spectrum. Decision thresholds are shown to vary in their location and precision. Significance: Improving on traditional measures such as kappa and receiver-operating characteristic curves, this specialization of item response models allows better individual feedback to both trainees and pathologists, including better quantification of acceptable decision variation.


Subject(s)
Prostatic Neoplasms, Male, Humans, Neoplasm Grading, Uncertainty, Prostatic Neoplasms/diagnosis, Prostatic Neoplasms/pathology, Models, Statistical, Pathologists
4.
Educ Psychol Meas ; 83(3): 630-641, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37187691

ABSTRACT

This note is concerned with evaluation of location parameters for polytomous items in multiple-component measuring instruments. A point and interval estimation procedure for these parameters is outlined that is developed within the framework of latent variable modeling. The method permits educational, behavioral, biomedical, and marketing researchers to quantify important aspects of the functioning of items with ordered multiple response options, which follow the popular graded response model. The procedure is routinely and readily applicable in empirical studies using widely circulated software and is illustrated with empirical data.

5.
Educ Psychol Meas ; 82(6): 1225-1246, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36325123

ABSTRACT

A class of effect size indices is discussed that evaluate the degree to which two nested confirmatory factor analysis models differ from each other in fit to a set of observed variables. These descriptive effect measures can be used to quantify the impact of parameter restrictions imposed on an initially considered model and are free of an explicit relationship to sample size. The described indices represent the extent to which respective linear combinations of the proportions of explained variance in the manifest variables change as a result of introducing the constraints. The indices reflect corresponding aspects of the impact of the restrictions and are independent of their statistical significance or lack thereof. The discussed effect size measures are readily point and interval estimated using popular software, and their application is illustrated with numerical examples.

6.
Educ Psychol Meas ; 82(5): 1020-1030, 2022 Oct.
Article in English | MEDLINE | ID: mdl-35989726

ABSTRACT

A latent variable modeling-based procedure is discussed that permits one to readily point and interval estimate the design effect index in multilevel settings using widely circulated software. The method provides useful information about the relationship between important parameter standard errors when accounting for clustering effects and those obtained from single-level analyses. The approach can also be employed as an addendum to point and interval estimation of the intraclass correlation coefficient in empirical research. The discussed procedure makes it straightforward to evaluate the design effect in two-level studies by utilizing the popular latent variable modeling methodology and is illustrated with an example.
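For equal cluster sizes, the design effect index referred to above has a simple closed form (Kish's formula). The cluster size and intraclass correlation below are hypothetical, chosen only to illustrate the computation:

```python
def design_effect(cluster_size: float, icc: float) -> float:
    """Kish design effect for two-level data with equal cluster sizes:
    the factor by which clustering inflates the sampling variance
    relative to a simple random sample."""
    return 1.0 + (cluster_size - 1.0) * icc

# With, say, 25 students per classroom and an intraclass correlation of .10,
# single-level standard errors understate sampling variance by this factor:
deff = design_effect(cluster_size=25, icc=0.10)
effective_n = 1000 / deff  # effective sample size for a nominal n = 1000
print(deff, effective_n)
```

Here the design effect is about 3.4, so a nominal sample of 1,000 clustered observations carries roughly the information of 294 independent ones.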

7.
Educ Psychol Meas ; 82(3): 568-579, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35444342

ABSTRACT

Two- and three-level designs in educational and psychological research can involve entire populations of Level-3 and possibly Level-2 units, such as schools and educational districts nested within a given state, or neighborhoods and counties in a state. Such a design is of increasing relevance in empirical research owing to the growing popularity of large-scale studies in these and cognate disciplines. The present note discusses a readily applicable procedure for point and interval estimation of the proportions of second- and third-level variances in such multilevel settings, which may also be employed in model choice considerations regarding ensuing analyses for response variables of interest. The method is developed within the framework of the latent variable modeling methodology, is readily utilized with widely used software, and is illustrated with an example.
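The proportions of second- and third-level variance discussed above are ratios of variance components to the total variance. A minimal sketch with hypothetical variance components for students (Level 1) within schools (Level 2) within districts (Level 3):

```python
def variance_proportions(var_l1: float, var_l2: float, var_l3: float):
    """Proportions of total variance residing at Levels 2 and 3 of a
    three-level model (level-specific intraclass correlations)."""
    total = var_l1 + var_l2 + var_l3
    return var_l2 / total, var_l3 / total

# Illustrative variance components only; not data from the article:
p2, p3 = variance_proportions(var_l1=70.0, var_l2=20.0, var_l3=10.0)
print(p2, p3)
```

Small proportions at the upper levels can support simplifying the model for subsequent analyses, which is the model-choice use the abstract mentions.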

8.
Educ Psychol Meas ; 82(2): 356-375, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35185163

ABSTRACT

The possible dependency of criterion validity on item formulation in a multicomponent measuring instrument is examined. The discussion is concerned with evaluation of the differences in criterion validity between two or more groups (populations/subpopulations) that have been administered instruments whose items have differently formulated item stems. The case of complex item stems involving two stimulus description sentences (double-barreled questions) is thereby compared with the setting where items contain a single sentence. Using empirical data, the latent criterion validity differences are evaluated across three groups that are randomly assigned to conditions characterized by item stems with differing numbers of stimuli. The results indicate that the validity of an instrument can be influenced by the specific way an item stem is formulated. Implications for empirical educational, behavioral, and social science research are discussed.

9.
Educ Psychol Meas ; 81(5): 980-995, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34565814

ABSTRACT

The frequent practice of overall fit evaluation for latent variable models in educational and behavioral research is reconsidered. It is argued that since overall plausibility does not imply local plausibility and is only necessary for the latter, local misfit should be considered a sufficient condition for model rejection, even in the case of omnibus model tenability. The argument is exemplified with a comparison of the widely used one-parameter and two-parameter logistic models. A theoretically and practically relevant setting illustrates how discounting local fit and concentrating instead on overall model fit may lead to incorrect model selection, even if a popular information criterion is also employed. The article concludes with the recommendation for routine examination of particular parameter constraints within latent variable models as part of their fit evaluation.

10.
Educ Psychol Meas ; 81(6): 1203-1220, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34565821

ABSTRACT

A procedure for evaluating the average R-squared index for a given set of observed variables in an exploratory factor analysis model is discussed. The method can be used as an effective aid in the process of model choice with respect to the number of factors underlying the interrelationships among studied measures. The approach is developed within the framework of exploratory structural equation modeling and is readily applicable with popular statistical software. The outlined procedure is illustrated using a numerical example.
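The average R-squared index for an exploratory factor model equals the mean communality of the observed variables. A sketch for an orthogonal solution with standardized variables, using a hypothetical loading matrix rather than any data from the article:

```python
import numpy as np

def average_r_squared(loadings) -> float:
    """Average proportion of explained variance across observed variables
    for an orthogonal factor solution with standardized variables: each
    row of `loadings` holds one variable's loadings, and its sum of
    squares is that variable's communality (R-squared)."""
    lam = np.asarray(loadings, dtype=float)
    communalities = (lam ** 2).sum(axis=1)
    return float(communalities.mean())

# Hypothetical 4-variable, 2-factor solution:
L = [[0.8, 0.1],
     [0.7, 0.2],
     [0.1, 0.6],
     [0.2, 0.7]]
avg_r2 = average_r_squared(L)
print(avg_r2)
```

Comparing this index across solutions with different numbers of factors is the model-choice use the abstract describes.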

11.
Educ Psychol Meas ; 81(4): 791-810, 2021 Aug.
Article in English | MEDLINE | ID: mdl-34267401

ABSTRACT

The population discrepancy between unstandardized and standardized reliability of homogeneous multicomponent measuring instruments is examined. Within a latent variable modeling framework, it is shown that the standardized reliability coefficient for unidimensional scales can be markedly higher than the corresponding unstandardized reliability coefficient, or alternatively substantially lower than the latter. Based on these findings, it is recommended that scholars avoid estimating, reporting, interpreting, or using standardized scale reliability coefficients in empirical research, unless they have strong reasons to consider standardizing the original components of utilized scales.

12.
Educ Psychol Meas ; 81(2): 319-339, 2021 Apr.
Article in English | MEDLINE | ID: mdl-37929257

ABSTRACT

A widely applicable procedure for examining proximity to unidimensionality in multicomponent measuring instruments with multidimensional structure is discussed. The method is developed within the framework of latent variable modeling and allows one to point and interval estimate an explained-variance-proportion-based index that may be considered a measure of proximity to unidimensional structure. The approach is readily utilized in educational, behavioral, and social research when it is of interest to evaluate whether a scale, test, or measuring instrument with a more general structure could be treated as associated with an approximately unidimensional latent structure for some empirical purposes.

13.
Educ Psychol Meas ; 80(3): 604-612, 2020 Jun.
Article in English | MEDLINE | ID: mdl-32425221

ABSTRACT

This note raises the caution that a finding of a marked pseudo-guessing parameter for an item within a three-parameter item response model could be spurious in a population with substantial unobserved heterogeneity. A numerical example is presented wherein, in each of two latent classes, the two-parameter logistic model is used to generate the data on a multi-item measuring instrument, while the three-parameter logistic model is found to be associated with a considerable pseudo-guessing parameter estimate for one of the items. The implications of the reported results for empirical educational research are subsequently discussed.
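The mechanism behind the note's caution can be shown numerically: pooling two 2PL classes can yield a marginal item response curve with a nonzero lower asymptote, which a 3PL fit would absorb into its pseudo-guessing parameter. The mixing proportion and item parameters below are illustrative, not those of the article:

```python
import math

def p_2pl(theta: float, a: float, b: float) -> float:
    """Two-parameter logistic model: probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def p_3pl(theta: float, a: float, b: float, c: float) -> float:
    """Three-parameter logistic model with pseudo-guessing parameter c
    (the lower asymptote of the item response curve)."""
    return c + (1.0 - c) * p_2pl(theta, a, b)

# At low ability, a single 2PL class gives a near-zero success probability,
# but a 50/50 mixture with an easier class leaves a clearly elevated floor:
low = p_2pl(theta=-3.0, a=1.0, b=0.0)
mixed = 0.5 * p_2pl(-3.0, 1.0, 0.0) + 0.5 * p_2pl(-3.0, 1.0, -4.0)
print(low, mixed)
```

A 3PL model fitted to such pooled data can report a sizable c estimate even though neither generating class involves guessing.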

14.
Educ Psychol Meas ; 80(2): 389-398, 2020 Apr.
Article in English | MEDLINE | ID: mdl-32158027

ABSTRACT

A procedure for evaluation of validity-related coefficients and their differences is discussed, which is applicable when one or more frequently made assumptions in empirical educational, behavioral, and social research are violated. The method is developed within the framework of the latent variable modeling methodology and accomplishes point and interval estimation of convergent and discriminant correlations, as well as differences between them, in cases of incomplete data sets with data not missing at random, nonnormality, and clustering effects. The procedure uses the full information maximum likelihood approach to model fitting and parameter estimation, does not assume availability of multiple indicators for underlying latent constructs, includes auxiliary variables, and accounts for within-group correlations on main response variables resulting from nesting effects involving studied respondents. The outlined procedure is illustrated on empirical data from a study using tertiary education entrance examination measures.

15.
Educ Psychol Meas ; 80(1): 199-209, 2020 Feb.
Article in English | MEDLINE | ID: mdl-31933499

ABSTRACT

Equating of psychometric scales and tests is frequently required and conducted in educational, behavioral, and clinical research. Construct comparability or equivalence between measuring instruments is a necessary condition for making decisions about linking and equating resulting scores. This article is concerned with a widely applicable method for examining if two scales or tests cannot be equated. A latent variable modeling method is discussed that can be used to evaluate whether the tests or parts thereof measure latent constructs that are distinct from each other. The approach can be routinely used before an equating procedure is undertaken, in order to assess whether equating could be meaningfully carried out to begin with. The procedure is readily applicable in empirical research using popular software. The method is illustrated with data from dementia screening test batteries administered as part of two studies designed to evaluate a wide range of biomarkers throughout the process of normal aging to dementia or Alzheimer's disease.

16.
Educ Psychol Meas ; 79(6): 1198-1209, 2019 Dec.
Article in English | MEDLINE | ID: mdl-31619845

ABSTRACT

This note highlights and illustrates the links between item response theory and classical test theory in the context of polytomous items. An item response modeling procedure is discussed that can be used for point and interval estimation of the individual true score on any item in a measuring instrument or item set following the popular and widely applicable graded response model. The method contributes to the body of research on the relationships between classical test theory and item response theory and is illustrated on empirical data.
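Under the graded response model the abstract refers to, category probabilities are differences of adjacent cumulative logistic curves. A minimal sketch with hypothetical item parameters:

```python
import math

def grm_category_probs(theta: float, a: float, thresholds):
    """Graded response model: probability of each ordered response
    category, computed as differences of the cumulative probabilities
    P(X >= k), which are logistic in theta."""
    def cum(b):
        return 1.0 / (1.0 + math.exp(-a * (theta - b)))
    # Bracket the cumulative curves with 1 (lowest category or above)
    # and 0 (beyond the highest category):
    cums = [1.0] + [cum(b) for b in thresholds] + [0.0]
    return [cums[k] - cums[k + 1] for k in range(len(cums) - 1)]

# Hypothetical 4-category item with discrimination 1.5 and ordered thresholds:
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.0, 0.0, 1.0])
print(probs)  # probabilities over the four categories; they sum to 1
```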

17.
Educ Psychol Meas ; 79(5): 874-882, 2019 Oct.
Article in English | MEDLINE | ID: mdl-31488917

ABSTRACT

A procedure that can be used to evaluate the variance inflation factors and tolerance indices in linear regression models is discussed. The method permits both point and interval estimation of these factors and indices associated with explanatory variables considered for inclusion in a regression model. The approach makes use of popular latent variable modeling software to obtain these point and interval estimates. The procedure allows more informed evaluation of these quantities when addressing multicollinearity-related issues in empirical research using regression models. The method is illustrated on an empirical example using the popular software Mplus. Results of a simulation study investigating the capabilities of the procedure are also presented.
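The variance inflation factor for a predictor is 1/(1 - R^2) from regressing that predictor on the remaining explanatory variables, and the tolerance index is its reciprocal. A sketch of the point estimates using ordinary least squares in NumPy, on simulated data rather than the article's example:

```python
import numpy as np

def vif(X: np.ndarray, j: int) -> float:
    """Variance inflation factor for column j of the predictor matrix X:
    1 / (1 - R^2) from regressing X[:, j] on the other columns, with an
    intercept. The tolerance index is 1 / vif."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    Z = np.column_stack([np.ones(len(y)), others])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    r2 = 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
    return 1.0 / (1.0 - r2)

# Two nearly collinear predictors inflate each other's VIF, while an
# unrelated third predictor stays near the minimum value of 1:
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
X = np.column_stack([x1, x1 + 0.1 * rng.normal(size=200), rng.normal(size=200)])
print([vif(X, j) for j in range(3)])
```

Interval estimation of these quantities, as in the article, would come from the latent variable modeling setup rather than this plain OLS computation.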

18.
Educ Psychol Meas ; 79(3): 598-609, 2019 Jun.
Article in English | MEDLINE | ID: mdl-31105325

ABSTRACT

Longitudinal studies have steadily grown in popularity across the educational and behavioral sciences, particularly with the increased availability of technological devices that allow the easy collection of repeated measures on multiple dimensions of substantive relevance. This article discusses a procedure that can be used to evaluate population differences in within-person (intraindividual) variability in such longitudinal investigations. The method is based on an application of the latent variable modeling methodology within a two-level modeling framework. The approach is used to obtain point and interval estimates of the differences in within-person variance and in the strength of correlative effects of repeated measures between normal and very mildly demented persons in a longitudinal study of a diagnostic cognitive test assessing verbal episodic memory.

19.
Educ Psychol Meas ; 79(2): 399-412, 2019 Apr.
Article in English | MEDLINE | ID: mdl-30911199

ABSTRACT

This note confronts the common use of a single coefficient alpha as an index of the reliability of a multicomponent measuring instrument in a heterogeneous population. Two or more alpha coefficients could instead be meaningfully associated with a given instrument in finite mixture settings, and this is increasingly likely to be the case in empirical educational and psychological research. It is argued that in such situations explicit examination of class invariance in the alpha coefficient must precede any statements about its possible value in the studied population. The approach also permits evaluation of between-class alpha differences as well as point and interval estimation of the within-class alpha coefficients. The method can be used both (a) in situations with known class membership, where distinct (sub)populations whose number is known beforehand are investigated and membership in them is observed for the studied persons, and (b) in settings where only the number of latent classes in the population under investigation is known. The outlined procedure is illustrated with numerical data.

20.
Educ Psychol Meas ; 79(1): 200-210, 2019 Feb.
Article in English | MEDLINE | ID: mdl-30636788

ABSTRACT

This note discusses the merits of coefficient alpha and the conditions under which they hold, in light of recent critical publications that overlook significant research findings from the past several decades. That earlier research has demonstrated the empirical relevance and utility of coefficient alpha under certain circumstances. The article highlights the fact that, as an index aimed at informing about the reliability of a multiple-component measuring instrument, coefficient alpha is under those circumstances a dependable reliability estimator. Therefore, alpha should remain in service when these conditions are fulfilled, rather than be abandoned.
