Search | VHL Regional Portal

1.

The InterModel Vigorish as a Lens for Understanding (and Quantifying) the Value of Item Response Models for Dichotomously Coded Items.

Domingue, Benjamin W; Kanopka, Klint; Kapoor, Radhika; Pohl, Steffi; Chalmers, R Philip; Rahal, Charles; Rhemtulla, Mijke.

Psychometrika ; 2024 Jun 03.

Article in English | MEDLINE | ID: mdl-38829495

ABSTRACT

The deployment of statistical models-such as those used in item response theory-necessitates the use of indices that are informative about the degree to which a given model is appropriate for a specific data context. We introduce the InterModel Vigorish (IMV) as an index that can be used to quantify accuracy for models of dichotomous item responses based on the improvement across two sets of predictions (i.e., predictions from two item response models or predictions from a single such model relative to prediction based on the mean). This index has a range of desirable features: It can be used for the comparison of non-nested models and its values are highly portable and generalizable. We use this fact to compare predictive performance across a variety of simulated data contexts and also demonstrate qualitative differences in behavior between the IMV and other common indices (e.g., the AIC and RMSEA). We also illustrate the utility of the IMV in empirical applications with data from 89 dichotomous item response datasets. These empirical applications help illustrate how the IMV can be used in practice and substantiate our claims regarding various aspects of model performance. These findings indicate that the IMV may be a useful indicator in psychometrics, especially as it allows for easy comparison of predictions across a variety of contexts.

2.

Interpretation of course conceptual structure and student self-efficacy: an integrated strategy of knowledge graphs with item response modeling.

Cao, Zhen-Yu; Lin, Feng; Feng, Chun.

BMC Med Educ ; 24(1): 563, 2024 May 23.

Article in English | MEDLINE | ID: mdl-38783267

ABSTRACT

BACKGROUND: There is a scarcity of studies that quantitatively assess the difficulty and importance of knowledge points (KPs) depending on students' self-efficacy for learning (SEL). This study aims to validate the practical application of psychological measurement tools in physical therapy education by analyzing student SEL and course conceptual structure. METHODS: From the "Therapeutic Exercise" course curriculum, we extracted 100 KPs and administered a difficulty rating questionnaire to 218 students post-final exam. The pipeline of the non-parametric Item Response Theory (IRT) and parametric IRT modeling was employed to estimate student SEL and describe the hierarchy of KPs in terms of item difficulty. Additionally, Gaussian Graphical Models with Non-Convex Penalties were deployed to create a Knowledge Graph (KG) and identify the main components. A visual analytics approach was then proposed to understand the correlation and difficulty level of KPs. RESULTS: We identified 50 KPs to create the Mokken scale, which exhibited high reliability (Cronbach's alpha = 0.9675) with no gender bias at the overall or at each item level (p > 0.05). The three-parameter logistic model (3PLM) demonstrated good fitness with questionnaire data, whose Root Mean Square Error Approximation was < 0.05. Also, item-model fitness unveiled good fitness, as indicated by each item with non-significant p-values for chi-square tests. The Wright map revealed item difficulty relative to SEL levels. SEL estimated by the 3PLM correlated significantly with the high-ability range of average Grade-Point Average (p < 0.05). The KG backbone structure consisted of 58 KPs, with 29 KPs overlapping with the Mokken scale. Visual analysis of the KG backbone structure revealed that the difficulty level of KPs in the IRT could not replace their position parameters in the KG. CONCLUSION: The IRT and KG methods utilized in this study offer distinct perspectives for visualizing hierarchical relationships and correlations among the KPs. Based on real-world teaching empirical data, this study helps to provide a research foundation for updating course contents and customizing learning objectives. TRIAL REGISTRATION: Not applicable.

Subject(s)

Curriculum , Educational Measurement , Self Efficacy , Humans , Female , Male , Surveys and Questionnaires , Physical Therapy Specialty/education , Reproducibility of Results

3.

Development and validation of health-oriented personal evaluation for the community-dwelling older person based on the International Classification of Functioning, Disability and Health.

Zhou, Liang; Feng, Chun; Lu, Yue; Zhong, Li-Juan; Gao, Jing; Liu, Na; Lin, Feng; Jiang, Zhong-Li.

Int J Older People Nurs ; 19(3): e12609, 2024 May.

Article in English | MEDLINE | ID: mdl-38622947

ABSTRACT

BACKGROUND: The International Classification of Functioning, Disability and Health (ICF) offers a standardized international terminology to operationalize function management across multiple domains, but the summary score of the ICF qualifier scale provides limited information on the comparison of personal abilities and functioning difficulties. OBJECTIVES: To enhance the interpretative power of the ICF-based Health-oriented Personal Evaluation for the community-dwelling older person (iHOPE-OP) scale through the implementation of the item response theory (IRT) modelling. METHODS: This cross-sectional, multi-centre study administrated 161 ICF categories (58 on body functions, 15 on body structures, 60 on activities or participation and 28 on environmental factors) to evaluate the functional level of 338 older citizens (female = 158, male = 180) residing in community or supportive living facilities. The validation process encompassed assessing the IRT model fitness and evaluating the psychometric properties of the IRT-derived iHOPE-OP scale. RESULTS: The age of participants ranged from 60 to 94.57, with the mean age of 70. The analysis of non-parametric and parametric models revealed that the three-parameter logistic IRT model, with a dichotomous scoring principle, exhibited the best fit. The 53-item iHOPE-OP scale demonstrated high reliability (Cronbach's α = 0.9729, Guttman's lambda-2 = 0.9749, Molenaar-Sijtsma Statistic = 0.9803, latent class reliability coefficient = 0.9882). There was a good validity between person abilities and the Barthel Index (p < .001, r = .83), as well as instrumental activities of daily living (p < .001, r = .84). CONCLUSIONS: IRT methods generate the reliable and valid iHOPE-OP scale with the most discriminable and minimal items to represent the older person's functional performance at a comprehensive level. The use of the Wright map can aid in presby-functioning management by visualizing item difficulties and person abilities. IMPLICATIONS FOR PRACTICE: Considering the intricate and heterogeneous health status of older persons, a single functional assessment tool might not fulfil the need to fully understand the multifaceted health status. For use in conjunction with the IRT and ICF framework, the reliable and valid iHOPE-OP scale was developed and can be applied to capture presby-functioning. The Wright map depicts the distribution of item difficulties and person abilities on the same scale that facilitates person-centred goal setting and tailors intervention.

Subject(s)

Activities of Daily Living , Independent Living , Humans , Male , Female , Aged , Aged, 80 and over , International Classification of Functioning, Disability and Health , Disability Evaluation , Cross-Sectional Studies , Reproducibility of Results

4.

Measuring mental health action competencies in school teachers: internal and external validity evidence.

Kerry, Matthew J; Robin, Dominik; Albermann, Kurt; Dratva, Julia.

Front Digit Health ; 6: 1257392, 2024.

Article in English | MEDLINE | ID: mdl-38414714

ABSTRACT

Introduction: Mental health literacy is receiving increasing research attention due to growing concerns for mental health globally. Among children, teachers have recently been recognized as playing a vital role in the recognition and reporting of potential mental health issues. Methods: A nationally sampled cross-section of teachers was surveyed to examine the discriminant validity of the mental health literacy measure across levels of teaching. A survey collected a total of n = 369 teacher responses in Switzerland (Kindergarten = 76, Primary = 210, Secondary = 83). Item response theory (IRT) analyses were conducted. Results: Inspection of psychometric properties indicated removal of two weak performing items. The 15-item measure exhibited a significant mean difference, such that class-responsibility function scored higher (M = 2.86, SD = .45) than non-responsible function (M = 2.68, SD = .45) teachers [t(309) = -2.20, p = .01]. It also exhibited a significant mean difference, such that more subjective experienced scored higher (M = 2.86, SD = .45) than less subjective experienced (M = 2.68, SD = .45) teachers [t(210) = -8.66, p < .01]. Discussion: Hypotheses regarding age and role tenure were in the expected direction, but non-significant. The MHL measure for teachers demonstrated sound measurement properties supporting usage across teaching levels.

5.

Factor structure of psychosis screening questionnaire in Ugandan adults.

Kwagala, Claire; Ametaj, Amantia; Kim, Hannah H; Kyebuzibwa, Joseph; Okura, Rogers; Stevenson, Anne; Gelaye, Bizu; Akena, Dickens.

BMC Psychiatry ; 24(1): 36, 2024 01 09.

Article in English | MEDLINE | ID: mdl-38195440

ABSTRACT

BACKGROUND: Psychotic disorders are common and contribute significantly to morbidity and mortality of people with psychiatric diseases. Therefore, early screening and detection may facilitate early intervention and reduce adverse outcomes. Screening tools that lay persons can administer are particularly beneficial in low resource settings. However, there is limited research evaluating the validity of psychosis screening instruments in Uganda. We aimed to assess the construct validity and psychometric properties of the Psychosis Screening Questionnaire (PSQ) in Uganda in a population with no history of a psychotic disorder. METHODS: The sample consisted of 2101 Ugandan adults participating as controls in a larger multi-country case-control study on psychiatric genetics who were recruited between February 2018 and March 2020. Participants were individuals seeking outpatient general medical care, caretakers of individuals seeking care, and staff or students recruited from five medical facilities that were age 18 years or older and able to provide consent. Individuals were excluded who had acute levels of alcohol or substance use, including being under inpatient hospitalization or acute medical care for one of these conditions. We used confirmatory factor analysis (CFA) and item response theory (IRT) to evaluate the factor structure and item properties of the PSQ. RESULTS: The overall prevalence screening positive for psychotic symptoms was 13.9% 95% CI (12.4,15.4). "Strange experiences" were the most endorsed symptoms 6.6% 95% CI (5.6,7.8). A unidimensional model seemed to be a good model or well-fitting based on fit indices including the root mean square error of approximation (RMSEA of 0.00), comparative fit index (CFI of 1.000), and Tucker-Lewis Index (TLI of 1.000). The most discriminating items along the latent construct of psychosis were items assessing thought disturbance followed by items assessing paranoia, with a parameter (discrimination) value of 2.53 and 2.40, respectively. CONCLUSION: The PSQ works well in Uganda as an initial screening tool for moderate to high-level of psychotic symptoms.

Subject(s)

Psychotic Disorders , Adult , Humans , Adolescent , Uganda , Case-Control Studies , Psychotic Disorders/diagnosis , Paranoid Disorders , Surveys and Questionnaires

6.

An item response theory approach to the measurement of working memory capacity.

Navarro, Ester; Hao, Han; Rosales, Kevin P; Conway, Andrew R A.

Behav Res Methods ; 56(3): 1697-1714, 2024 Mar.

Article in English | MEDLINE | ID: mdl-37170060

ABSTRACT

Complex span tasks are perhaps the most widely used paradigm to measure working memory capacity (WMC). Researchers assume that all types of complex span tasks assess domain-general WM. However, most research supporting this claim comes from factor analysis approaches that do not examine task performance at the item level, thus not allowing comparison of the characteristics of verbal and spatial complex span tasks. Item response theory (IRT) can help determine the extent to which different complex span tasks assess domain-general WM. In the current study, spatial and verbal complex span tasks were examined using IRT. The results revealed differences between verbal and spatial tasks in terms of item difficulty and block difficulty, and showed that most subjects with below-average ability were able to answer most items correctly across all tasks. In line with previous research, the findings suggest that examining domain-general WM by using only one task might elicit skewed scores based on task domain. Further, visuospatial complex span tasks should be prioritized as a measure of WMC if resources are limited.

Subject(s)

Memory, Short-Term , Humans , Memory, Short-Term/physiology , Factor Analysis, Statistical

7.

Development and validation of the clinical information literacy questionnaire.

Maleki, Elahe; Soleymani, Mohammad R; Ashrafi-Rizi, Hasan; Heidari, Zahra; Nasr-Esfahani, Mohammad.

J Educ Health Promot ; 12: 346, 2023.

Article in English | MEDLINE | ID: mdl-38144008

ABSTRACT

BACKGROUND: Clinical Information Literacy (CIL) seems to be a prerequisite for physicians to implement Evidence-Based Medicine (EBM) effectively. This study endeavors to develop and validate a CIL questionnaire for medical residents of Isfahan University of Medical Sciences. MATERIALS AND METHODS: This study employs sequential-exploratory mixed methods in 2019. The participants were 200 medical residents in different specialties; they are selected through the convenience sampling method. In the first (qualitative) phase, an early CIL questionnaire was designed by reviewing literature and performing complementary interviews with health professionals. In the second (validation) phase, the questionnaire's face validity and content validity were confirmed. In the third (quantitative) phase, the construct validity was examined via Item-Response Theory (IRT) model, and the factor loading was computed. The gathered data were analyzed using descriptive statistics, t-test, two-way ANOVA, as well as two-parameter IRT model in R software. RESULTS: In the qualitative phase, the concept of CIL is initially described in seven main categories and 22 subcategories, and the items were formulated. An initial 125-item questionnaire was analyzed by the research team, leading to a 43-item. Through the content validity and face validity examination, we removed 11 and 4 items in the Content Validity Ratio (CVR) and Content Validity Index (CVI), respectively. Throughout the face validity analysis, none of the items were removed. According to the construct validity results, difficulty coefficient, discriminant coefficient, and factor loading were confirmed, most of the other questions achieved a proper factor loading value that is higher than 0.30, and a value of 0.66 was achieved for the reliability via the Kuder-Richardson method. Ultimately, the real-assessment 28-item CIL questionnaire was developed with four components. CONCLUSIONS: The CIL questionnaire could be employed to examine the actual CIL basic knowledge. Because of using the real-assessment approach rather than self-assessment in the design, it can be claimed that this instrument can provide a more accurate assessment of the information literacy status of medical residents. This valid questionnaire is used to measure and train the skills needed by healthcare professionals in the effective implementation of EBM.

8.

Use of the Rasch Model for Fit Statistics and Rating Scale Diagnosis for the Student Perception Appraisal-Revised.

Hawkins, Robert J; Hawkins, Janice; Tremblay, Beth; Wiles, Lynn; Higgins, Karen.

J Nurs Meas ; 2023 Nov 21.

Article in English | MEDLINE | ID: mdl-37989504

ABSTRACT

Background and Purpose: Nursing student retention is essential to meet workforce demands. Jeffrey's Nursing Student Retention Student Perception Appraisal-Revised (SPA-R1) has been used extensively to understand factors that impact retention. Psychometric testing of the SPA-R1 contributes to greater confidence in the instrument's reliability and validity. Methods: Item response theory and specifically, the single parameter polytomous Rasch model was used as a framework for fit statistic testing and rating scale diagnostics of the SPA-R1. This was a secondary analysis of a convenience sample of undergraduate prelicensure nursing students. The setting for the previous study was virtual, and the study period was 2022. Results: The model item characteristic curves for the 27 items of the SPA-R1 have similar shapes and are clustered in proximity. Overall, there are three clusters of items evident in the Rasch standardized residual contrast. The Rasch scale diagnostics indicated that the scale appropriately monotonically increases. However, there is a greater than 5 logit distance between does not apply and severely restricts, between severely restricts and moderately restricts, and between does not restrict or support and moderately supports. These large threshold distances indicate that additional steps in the scale may be warranted. The items cover the mid-range of the amount of retention perceptions; however, there are no items that represent the highest magnitude of the perceived amount of influence on retention. Conclusions: This study contributes further evidence to support the validity and reliability of the SPA-R1. We recommend adding steps to the scale, removing the does not apply response option, and considering scoring by three domains or clusters.

9.

Determining intra-standard-setter inconsistency in the Angoff method using the three-parameter item response theory.

Tavakol, Mohsen; O'Brien, David; Stewart, Claire.

Int J Med Educ ; 14: 123-130, 2023 09 07.

Article in English | MEDLINE | ID: mdl-37678838

ABSTRACT

Objectives: To measure intra-standard-setter variability and assess the variations between the pass marks obtained from Angoff ratings, guided by the latent trait theory as the theoretical model. Methods: A non-experimental cross-sectional study was conducted to achieve the purpose of the study. Two knowledge-based tests were administered to 358 final-year medical students (223 females and 135 males) as part of their normal summative programme of assessments. The results of judgmental standard-setting using the Angoff method, which is widely used in medical schools, were used to determine intra-standard-setter inconsistency using the three-parameter item response theory (IRT). Permission for this study was granted by the local Research Ethics Committee of the University of Nottingham. To ensure anonymity and confidentiality, all identifiers at the student level were removed before the data were analysed. Results: The results of this study confirm that the three-parameter IRT can be used to analyse the results of individual judgmental standard setters. Overall, standard-setters behaved fairly consistently in both tests. The mean Angoff ratings and conditional probability were strongly positively correlated, which is a matter of inter-standard-setter validity. Conclusions: We recommend that assessment providers adopt the methodology used in this study to help determine inter and intra-judgmental inconsistencies across standard setters to minimise the number of false positive and false negative decisions.

Subject(s)

Academic Performance , Education, Medical , Program Evaluation , Humans , Male , Female , Students, Medical , Education, Medical/standards , Cross-Sectional Studies , Models, Theoretical

10.

Estimating the Multidimensional Generalized Graded Unfolding Model with Covariates Using a Bayesian Approach.

Tu, Naidan; Zhang, Bo; Angrave, Lawrence; Sun, Tianjun; Neuman, Mathew.

J Intell ; 11(8)2023 Aug 14.

Article in English | MEDLINE | ID: mdl-37623546

ABSTRACT

Noncognitive constructs are commonly assessed in educational and organizational research. They are often measured by summing scores across items, which implicitly assumes a dominance item response process. However, research has shown that the unfolding response process may better characterize how people respond to noncognitive items. The Generalized Graded Unfolding Model (GGUM) representing the unfolding response process has therefore become increasingly popular. However, the current implementation of the GGUM is limited to unidimensional cases, while most noncognitive constructs are multidimensional. Fitting a unidimensional GGUM separately for each dimension and ignoring the multidimensional nature of noncognitive data may result in suboptimal parameter estimation. Recently, an R package bmggum was developed that enables the estimation of the Multidimensional Generalized Graded Unfolding Model (MGGUM) with covariates using a Bayesian algorithm. However, no simulation evidence is available to support the accuracy of the Bayesian algorithm implemented in bmggum. In this research, two simulation studies were conducted to examine the performance of bmggum. Results showed that bmggum can estimate MGGUM parameters accurately, and that multidimensional estimation and incorporating relevant covariates into the estimation process improved estimation accuracy. The effectiveness of two Bayesian model selection indices, WAIC and LOO, were also investigated and found to be satisfactory for model selection. Empirical data were used to demonstrate the use of bmggum and its performance was compared with three other GGUM software programs: GGUM2004, GGUM, and mirt.

11.

Assessing subjective cognitive decline in older adults attending primary health care centers: what question should be asked?

Molina-Donoso, Matías; Parrao, Teresa; Meillon, Céline; Thumala, Daniela; Lillo, Patricia; Villagra, Roque; Ibañez, Agustín; Cerda, Mauricio; Zitko, Pedro; Amieva, Hélène; Slachevsky, Andrea.

J Clin Exp Neuropsychol ; 45(3): 313-320, 2023 05.

Article in English | MEDLINE | ID: mdl-37403327

ABSTRACT

INTRODUCTION: Subjective Cognitive Decline (SCD) refers to a self-perceived experience of decreased cognitive function without objective signs of cognitive impairment in neuropsychological tests or daily living activities. Despite the abundance of instruments addressing SCD, there is no consensus on the methods to be used. Our study is founded on 11 questions selected due to their recurrence in most instruments. The objective was to determine which one of these questions could be used as a simple screening tool. METHODS: 189 participants aged 65 and over selected from Primary Care centers in Santiago de Chile responded to these 11 questions and were evaluated with the MiniMental State Examination (MMSE), the Free and Cued Selective Reminding Test (FCSRT), the Pfeffer functional scale, and the Geriatric Depression Scale (GDS). An Item ResponseTheory (IRT) method was performed to assess the contribution of each of the 11 questions to the SCD latent trait and its discrimination ability. RESULTS: Based on the results of the exploratory factor analysis showing very high/low saturation of several questions on the factors, and the high residual correlation between some questions, the IRT methods led to select one question ("Do you feel like your memory has become worse?") which revealed to be the most contributive and discriminant. Participants who answered yes had a higher GDS score. There was no association with MMSE, FCSRT, and Pfeffer scores. CONCLUSION: The question "Do you feel like your memory has become worse?" may be a good proxy of SCD and could be included in routine medical checkups.

Subject(s)

Cognitive Dysfunction , Humans , Aged , Cognitive Dysfunction/diagnosis , Cognitive Dysfunction/psychology , Cognition , Neuropsychological Tests , Cues , Primary Health Care

12.

Psychometric Properties of the German Version of the Quality of Life after Brain Injury Scale for Kids and Adolescents (QOLIBRI-KID/ADO) Using Item Response Theory Framework: Results from the Pilot Study.

Zeldovich, Marina; Cunitz, Katrin; Greving, Sven; Muehlan, Holger; Bockhop, Fabian; Krenz, Ugne; Timmermann, Dagmar; Koerte, Inga K; Rojczyk, Philine; Roediger, Maike; Lendt, Michael; von Steinbuechel, Nicole.

J Clin Med ; 12(11)2023 May 27.

Article in English | MEDLINE | ID: mdl-37297911

ABSTRACT

Health-related quality of life (HRQOL) is an important indicator for recovery after pediatric TBI. To date, there are a few questionnaires available for assessing generic HRQOL in children and adolescents, but there are not yet any TBI-specific measures of HRQOL that are applicable to pediatric populations. The aim of the present study was to examine psychometric characteristics of the newly developed Quality of Life After Brain Injury Scale for Kids and Adolescents (QOLIBRI-KID/ADO) questionnaire capturing TBI-specific HRQOL in children and adolescents using an item response theory (IRT) framework. Children (8-12 years; n = 152) and adolescents (13-17 years; n = 148) participated in the study. The final version of the QOLIBRI-KID/ADO, comprising 35 items forming 6 scales, was investigated using the partial credit model (PCM). A scale-wise examination for unidimensionality, monotonicity, item infit and outfit, person homogeneity, and local independency was conducted. The questionnaire widely fulfilled the predefined assumptions, with a few restrictions. The newly developed QOLIBRI-KID/ADO instrument shows at least satisfactory psychometric properties according to the results of both classical test theoretical and IRT analyses. Further evidence of its applicability should be explored in the ongoing validation study by performing multidimensional IRT analyses.

13.

Construct Validity of the Psychosis Screening Questionnaire in Ugandan Adults.

Kwagala, Claire; Ametaj, Amantia; Kim, Hannah H; Kyebuzibwa, Joseph; Rogers, Okura; Stevenson, Anne; Gelaye, Bizu; Akena, Dickens.

Res Sq ; 2023 Jan 31.

Article in English | MEDLINE | ID: mdl-36778438

ABSTRACT

Background: Psychotic disorders are common and contribute significantly to morbidity and mortality of people with psychiatric diseases. Therefore, early screening and detection may facilitate early intervention and reduce adverse outcomes. Screening tools that lay persons can administer are particularly beneficial in low resource settings. However, there is limited research evaluating the validity of psychosis screening instruments in Uganda. We aimed to assess the construct validity and psychometric properties of the Psychosis Screening Questionnaire (PSQ) in Uganda in a population with no history of a psychotic disorder. Methods: The sample consisted of 2101 Ugandan adults participating as controls in a larger multi-country case-control study on psychiatric genetics. We used confirmatory factor analysis (CFA) and item response theory (IRT) to evaluate the factor structure and item properties of the PSQ. Results: The overall prevalence screening positive for psychotic symptoms was 13.9%. "Strange experiences" were the most endorsed symptoms (6.6%). A unidimensional factor was the best fitting model based on the fit indices including the root mean square error of approximation (RMSEA of 0.00), comparative fit index (CFI of 1.000), and Tucker-Lewis Index (TLI of 1.000). The most discriminating items along the latent construct of psychosis were items assessing thought disturbance followed by items assessing paranoia, with a parameter (discrimination) value of 2.53 and 2.40, respectively. Conclusion: The PSQ works well in Uganda as an initial screening tool for moderate to high-level of psychotic symptoms.

14.

Estimating meaningful thresholds for multi-item questionnaires using item response theory.

Terluin, Berend; Koopman, Jaimy E; Hoogendam, Lisa; Griffiths, Pip; Terwee, Caroline B; Bjorner, Jakob B.

Qual Life Res ; 32(6): 1819-1830, 2023 Jun.

Article in English | MEDLINE | ID: mdl-36780033

ABSTRACT

PURPOSE: Meaningful thresholds are needed to interpret patient-reported outcome measure (PROM) results. This paper introduces a new method, based on item response theory (IRT), to estimate such thresholds. The performance of the method is examined in simulated datasets and two real datasets, and compared with other methods. METHODS: The IRT method involves fitting an IRT model to the PROM items and an anchor item indicating the criterion state of interest. The difficulty parameter of the anchor item represents the meaningful threshold on the latent trait. The latent threshold is then linked to the corresponding expected PROM score. We simulated 4500 item response datasets to a 10-item PROM, and an anchor item. The datasets varied with respect to the mean and standard deviation of the latent trait, and the reliability of the anchor item. The real datasets consisted of a depression scale with a clinical depression diagnosis as anchor variable and a pain scale with a patient acceptable symptom state (PASS) question as anchor variable. RESULTS: The new IRT method recovered the true thresholds accurately across the simulated datasets. The other methods, except one, produced biased threshold estimates if the state prevalence was smaller or greater than 0.5. The adjusted predictive modeling method matched the new IRT method (also in the real datasets) but showed some residual bias if the prevalence was smaller than 0.3 or greater than 0.7. CONCLUSIONS: The new IRT method perfectly recovers meaningful (interpretational) thresholds for multi-item questionnaires, provided that the data satisfy the assumptions for IRT analysis.

Subject(s)

Quality of Life , Humans , Reproducibility of Results , Quality of Life/psychology , Surveys and Questionnaires , Psychometrics/methods

15.

ICF-Based simple scale for children with cerebral palsy: Application of Mokken scale analysis and Rasch modeling.

Jiang, Yu-Er; Zhang, Dong-Mei; Jiang, Zhong-Li; Tao, Xue-Jiao; Dai, Min-Jun; Lin, Feng.

Dev Neurorehabil ; 26(2): 71-88, 2023 Feb.

Article in English | MEDLINE | ID: mdl-36659872

ABSTRACT

Children with cerebral palsy (CP) are faced with long-term dysfunction. The International Classification of Functioning, Disability and Health for Children and Youth (ICF-CY) has been proposed but the complicated procedure limits the feasibility of clinical application and the exploration of health degrees. This study was aimed to establish a Mokken scale based on the ICF-CY for CP, and then to estimate psychometric properties through the derived Rasch model. 150 children with CP were assessed by the categories of "b" and "d" components in the core set. The binarized data was screened by the Mokken scale analysis and utilized for generating a reliable Rasch model. The validity of the final model was checked by the correlation between person ability, Gross Motor Function Classification System (GMFCS) and total scores. Using the Mokken scale to guide Rasch modeling, we can parameterize the properties of ICF-CY and realize the simple assessment of person abilities for children with CP.

Subject(s)

Cerebral Palsy , Disabled Persons , Adolescent , Child , Humans , Disability Evaluation , International Classification of Functioning, Disability and Health , Psychometrics

16.

The psychometric properties of PSYCHLOPS, an individualized patient-reported outcome measure of personal distress.

Sales, Celia; Faísca, Luis; Ashworth, Mark; Ayis, Salma.

J Clin Psychol ; 79(3): 622-640, 2023 03.

Article in English | MEDLINE | ID: mdl-34800336

ABSTRACT

OBJECTIVE: Few studies report the psychometric properties of individualized patient-reported outcome measures (I-PROMs) combining traditional analysis and Item Response Theory (IRT). METHODS: Pre- and posttreatment PSYCHLOPS data derived from six clinical samples (n = 939) were analyzed for validity, reliability, and responsiveness; caseness cutoffs and reliable change index were calculated. Exploratory and confirmatory factor analyses were used to determine whether items represented a unidimensional construct; IRT examined item properties of this construct. RESULTS: Values for internal consistency, construct validity, convergent and discriminant validity, and structural validity were satisfactory. Responsiveness was high: Cohen's d, 1.48. Caseness cutoff and reliable clinical change scores were 6.41 and 4.63, respectively. IRT analysis confirmed that item scores possess strong properties in assessing the underlying trait measured by PSYCHLOPS. CONCLUSION: PSYCHLOPS met the criteria for norm-referenced measurement of patient psychological distress. PSYCHLOPS functioned as a measure of a single latent trait, which we describe as "personal distress."

Subject(s)

Patient Reported Outcome Measures , Humans , Psychometrics , Reproducibility of Results , Factor Analysis, Statistical , Surveys and Questionnaires

17.

Harmonizing DSM-IV and DSM-5 Versions of ADHD "A Criteria": An Item Response Theory Analysis.

Coxe, Stefany; Sibley, Margaret H.

Assessment ; 30(3): 606-617, 2023 04.

Article in English | MEDLINE | ID: mdl-34905981

ABSTRACT

The transition from Diagnostic and Statistical Manual of Mental Disorders (4th ed., text rev.; DSM-IV-TR) to Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5) attention deficit/hyperactivity disorder (ADHD) checklists included item wording changes that require psychometric validation. A large sample of 854 adolescents across four randomized trials of psychosocial ADHD treatments was used to evaluate the comparability of the DSM-IV-TR and DSM-5 versions of the ADHD symptom checklist. Item response theory (IRT) was used to evaluate item characteristics and determine differences across versions and studies. Item characteristics varied across items. No consistent differences in item characteristics were found across versions. Some differences emerged between studies. IRT models were used to create continuous, harmonized scores that take item, study, and version differences into account and are therefore comparable. DSM-IV-TR ADHD checklists will generalize to the DSM-5 era. Researchers should consider using modern measurement methods (such as IRT) to better understand items and create continuous variables that better reflect the variability in their samples.

Subject(s)

Attention Deficit Disorder with Hyperactivity , Adolescent , Humans , Diagnostic and Statistical Manual of Mental Disorders , Attention Deficit Disorder with Hyperactivity/diagnosis , Checklist

18.

An Examination of the Latent Constructs in Risk Tools for Individuals Who Sexually Offend: Applying Multidimensional Item Response Theory to the Static-2002R.

Brouillette-Alarie, Sébastien; Lee, Seung C; Longpré, Nicholas; Babchishin, Kelly M.

Assessment ; 30(4): 1249-1264, 2023 06.

Article in English | MEDLINE | ID: mdl-35176903

ABSTRACT

Multidimensional item response theory (MIRT) was used to study the construct validity of the Static-2002R, an actuarial scale for the assessment of reoffending among adult men who sexually offended. Using a sample of 2,569 individuals with a history of sexual crime, exploratory factor analysis (EFA) extracted three factors: Persistence/Paraphilia, General Criminality, and Youthful Stranger Aggression. MIRT confirmed the factor structure identified in the EFA model and provided item-level data on discrimination and difficulty. All Static-2002R items showed moderate to very high discrimination and covered a wide range of risk levels (i.e., difficulty). MIRT analyses attested to the construct validity of the scale, as no items were identified as problematic and the resulting factor structure was consistent with that of earlier studies. Considering the stability of results pertaining to the factor structure of the Static-2002R and the advantages of dimensional scoring, we recommend the integration of dimensional scores in the scale.

Subject(s)

Sex Offenses , Adult , Male , Humans , Sexual Behavior , Aggression , Factor Analysis, Statistical

19.

Harmonizing Depression Measures Across Studies: a Tutorial for Data Harmonization.

Zhao, Xin; Coxe, Stefany; Sibley, Margaret H; Zulauf-McCurdy, Courtney; Pettit, Jeremy W.

Prev Sci ; 24(8): 1569-1580, 2023 Nov.

Article in English | MEDLINE | ID: mdl-35798992

ABSTRACT

There has been increasing interest in applying integrative data analysis (IDA) to analyze data across multiple studies to increase sample size and statistical power. Measures of a construct are frequently not consistent across studies. This article provides a tutorial on the complex decisions that occur when conducting harmonization of measures for an IDA, including item selection, response coding, and modeling decisions. We analyzed caregivers' self-reported data from the ADHD Teen Integrative Data Analysis Longitudinal (ADHD TIDAL) dataset; data from 621 of 854 caregivers were available. We used moderated nonlinear factor analysis (MNLFA) to harmonize items reflecting depressive symptoms. Items were drawn from the Symptom Checklist 90-Revised, the Patient Health Questionnaire-9, and the World Health Organization Quality of Life questionnaire. Conducting IDA often requires more programming skills (e.g., Mplus), statistical knowledge (e.g., IRT framework), and complex decision-making processes than single-study analyses and meta-analyses. Through this paper, we described how we evaluated item characteristics, determined differences across studies, and created a single harmonized factor score that can be used to analyze data across all four studies. We also presented our questions, challenges, and decision-making processes; for example, we explained the thought process and course of actions when models did not converge. This tutorial provides a resource to support prevention scientists to generate harmonized variables accounting for sample and study differences.

Subject(s)

Depression , Quality of Life , Adolescent , Humans , Surveys and Questionnaires , Self Report , Factor Analysis, Statistical

20.

Comparison of parameter estimation approaches for multi-unidimensional pairwise preference tests.

Tu, Naidan; Joo, Sean; Lee, Philseok; Stark, Stephen.

Behav Res Methods ; 55(6): 2764-2786, 2023 09.

Article in English | MEDLINE | ID: mdl-35931936

ABSTRACT

Multidimensional forced-choice (MFC) testing has been proposed as a way of reducing response biases in noncognitive measurement. Although early item response theory (IRT) research focused on illustrating that person parameter estimates with normative properties could be obtained using various MFC models and formats, more recent attention has been devoted to exploring the processes involved in test construction and how that influences MFC scores. This research compared two approaches for estimating multi-unidimensional pairwise preference model (MUPP; Stark et al., 2005) parameters based on the generalized graded unfolding model (GGUM; Roberts et al., 2000). More specifically, we compared the efficacy of statement and person parameter estimation based on a "two-step" process, developed by Stark et al. (2005), with a more recently developed "direct" estimation approach (Lee et al., 2019) in a Monte Carlo study that also manipulated test length, test dimensionality, sample size, and the correlations between generating person parameters for each dimension. Results indicated that the two approaches had similar scoring accuracy, although the two-step approach had better statement parameter recovery than the direct approach. Limitations, implications for MFC test construction and scoring, and recommendations for future MFC research and practice are discussed.

Subject(s)

Monte Carlo Method , Humans

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

SEND TO:

SELECTION OF CITATIONS

SEARCH DETAIL