Results 1 - 14 of 14
1.
Psychometrika ; 85(2): 301-321, 2020 06.
Article in English | MEDLINE | ID: mdl-32556745

ABSTRACT

A shadow-test approach to the calibration of field-test items embedded in adaptive testing is presented. The objective function used in the shadow-test model selects both the operational and field-test items adaptively using a Bayesian version of the criterion of D-optimality. The constraint set for the model can be used to hide the field-test items completely in the content of the test as well as to deal with such practical issues as random control of their exposure rates. The approach runs on efficient implementations of the Gibbs sampler for the real-time updating of the ability and field-test parameters. Optimal settings for the proposed algorithms were found and used to demonstrate item calibration with smaller than traditional sample sizes in runtimes fully comparable with conventional adaptive testing.


Subject(s)
Algorithms , Psychometrics , Bayes Theorem , Calibration , Computer Simulation , Humans , Markov Chains , Models, Statistical
2.
Eur Urol Oncol ; 2(3): 333-336, 2019 05.
Article in English | MEDLINE | ID: mdl-31200849

ABSTRACT

Within the Movember Foundation's Global Action Plan Prostate Cancer Active Surveillance (GAP3) initiative, 25 centers across the globe collaborate to standardize active surveillance (AS) protocols for men with low-risk prostate cancer (PCa). A centralized PCa AS database, comprising data of more than 15000 patients worldwide, was created. Comparability of the histopathology between the different cohorts was assessed by a centralized pathology review of 445 biopsies from 15 GAP3 centers. Grade group 1 (Gleason score 6) in 85% and grade group ≥2 (Gleason score ≥7) in 15% showed 89% concordance at review with moderate agreement (κ=0.56). Average biopsy core length was similar among the analyzed cohorts. Recently established highly adverse pathologies, including cribriform and/or intraductal carcinoma, were observed in 3.6% of the reviewed biopsies. In conclusion, the centralized pathology review of 445 biopsies revealed comparable histopathology among the 15 GAP3 centers with a low frequency of high-risk features. This enables further data analyses, without correction, toward uniform global AS guidelines for men with low-risk PCa. PATIENT SUMMARY: Movember Foundation's Global Action Plan Prostate Cancer Active Surveillance (GAP3) initiative combines data from 15000 men with low-risk prostate cancer (PCa) across the globe to standardize active surveillance protocols. Histopathology review confirmed that the histopathology was consistent with low-risk PCa in most men and comparable between different centers.


Subject(s)
Prostatic Neoplasms/pathology , Watchful Waiting/standards , Biopsy/standards , Biopsy/statistics & numerical data , Humans , Male , Neoplasm Grading , Quality of Health Care , Watchful Waiting/organization & administration , Watchful Waiting/statistics & numerical data
3.
Eur Urol ; 75(3): 523-531, 2019 03.
Article in English | MEDLINE | ID: mdl-30385049

ABSTRACT

BACKGROUND: Careful assessment of the reasons for discontinuation of active surveillance (AS) is required for men with prostate cancer (PCa). OBJECTIVE: Using Movember's Global Action Plan Prostate Cancer Active Surveillance initiative (GAP3) database, we report on reasons for AS discontinuation. DESIGN, SETTING, AND PARTICIPANTS: We compared data from 10296 men on AS from 21 centres across 12 countries. OUTCOME MEASUREMENTS AND STATISTICAL ANALYSIS: Cumulative incidence methods were used to estimate the cumulative incidence rates of AS discontinuation. RESULTS AND LIMITATIONS: During 5-yr follow-up, 27.5% (95% confidence interval [CI]: 26.4-28.6%) of men showed signs of disease progression, 12.8% (95% CI: 12.0-13.6%) converted to active treatment without evidence of progression, 1.7% (95% CI: 1.5-2.0%) continued to watchful waiting, and 1.7% (95% CI: 1.4-2.1%) died from other causes. Of the 7049 men who remained on AS, 2339 had follow-up for >5 yr, 4561 had follow-up for <5 yr, and 149 were lost to follow-up. Cumulative incidence of progression was 27.5% (95% CI: 26.4-28.6%) at 5 yr and 38.2% (95% CI: 36.7-39.9%) at 10 yr. A limitation is that not all centres were included due to limited information on the reason for discontinuation and limited follow-up. CONCLUSIONS: Our descriptive analyses of current AS practices worldwide showed that 43.6% of men drop out of AS during 5-yr follow-up, mainly due to signs of disease progression. Improvements in selection tools for AS are thus needed to correctly allocate men with PCa to AS, which will also reduce discontinuation due to conversion to active treatment without evidence of disease progression. PATIENT SUMMARY: Our assessment of a worldwide database of men with prostate cancer (PCa) on active surveillance (AS) shows that 43.6% drop out of AS within 5 yr, mainly due to signs of disease progression. Better tools are needed to select and monitor men with PCa as part of AS.


Subject(s)
Early Detection of Cancer , Kallikreins/blood , Patient Dropouts , Prostate-Specific Antigen/blood , Prostatic Neoplasms/therapy , Watchful Waiting , Aged , Asia/epidemiology , Australia/epidemiology , Biopsy , Cause of Death , Clinical Decision-Making , Databases, Factual , Disease Progression , Early Detection of Cancer/methods , Europe/epidemiology , Humans , Male , Middle Aged , North America/epidemiology , Predictive Value of Tests , Prostatic Neoplasms/blood , Prostatic Neoplasms/mortality , Prostatic Neoplasms/pathology , Risk Assessment , Risk Factors , Time Factors
4.
Qual Life Res ; 27(7): 1683-1693, 2018 07.
Article in English | MEDLINE | ID: mdl-28710673

ABSTRACT

PURPOSE: Most computerized adaptive testing (CAT) applications in patient-reported outcomes (PRO) measurement to date are reliability-centric, with a primary objective of maximizing measurement efficiency. A key concern and a potential threat to validity is that, when left unconstrained, individual CAT administrations could have items with systematically different attributes, e.g., sub-domain coverage. This paper aims to provide a solution to the problem from an optimal test design framework using the shadow-test approach to CAT. METHODS: Following the approach, a case study was conducted using the PROMIS® (Patient-Reported Outcomes Measurement Information System) fatigue item bank both with empirical and simulated response data. Comparisons between CAT administrations without and with the enforcement of content and item pool usage constraints were examined. RESULTS: The unconstrained CAT exhibited a high degree of variation in items selected from different substrata of the item bank. Contrastingly, the shadow-test approach delivered CAT administrations conforming to all specifications with a minimal loss in measurement efficiency. CONCLUSIONS: The optimal test design and shadow-test approach to CAT provide a flexible framework for solving complex test-assembly problems with better control of their domain coverage than for the conventional use of CAT in PRO measurement. Applications in a wide array of PRO domains are expected to lead to more controlled and balanced use of CAT in the field.


Subject(s)
Patient Reported Outcome Measures , Reproducibility of Results , Fatigue/physiopathology , Fatigue/psychology , Humans , Psychometrics , Quality of Life , Software , Surveys and Questionnaires
5.
Psychometrika ; 82(2): 498-522, 2017 06.
Article in English | MEDLINE | ID: mdl-28290109

ABSTRACT

Parameter recovery and item utilization were investigated for different designs for online test item calibration. The design was adaptive in a double sense: it assumed both adaptive testing of examinees from an operational pool of previously calibrated items and adaptive assignment of field-test items to the examinees. Four criteria of optimality for the assignment of the field-test items were used, each of them based on the information in the posterior distributions of the examinee's ability parameter during adaptive testing as well as the sequentially updated posterior distributions of the field-test item parameters. In addition, different stopping rules based on target values for the posterior standard deviations of the field-test parameters and the size of the calibration sample were used. The impact of each of the criteria and stopping rules on the statistical efficiency of the estimates of the field-test parameters and on the time spent by the items in the calibration procedure was investigated. Recommendations as to the practical use of the designs are given.


Subject(s)
Calibration , Psychometrics , Humans
6.
Psychometrika ; 82(1): 273, 2017 03.
Article in English | MEDLINE | ID: mdl-28116569
7.
Psychometrika ; 81(3): 650-73, 2016 09.
Article in English | MEDLINE | ID: mdl-26155754

ABSTRACT

With a few exceptions, the problem of linking item response model parameters from different item calibrations has been conceptualized as an instance of the problem of equating scores on different test forms. This paper argues, however, that the use of item response models does not require any test score equating. Instead, it involves the necessity of parameter linking due to a fundamental problem inherent in the formal nature of these models: their general lack of identifiability. More specifically, item response model parameters need to be linked to adjust for the different effects of the identifiability restrictions used in separate item calibrations. Our main theorems characterize the formal nature of these linking functions for monotone, continuous response models, derive their specific shapes for different parameterizations of the 3PL model, and show how to identify them from the parameter values of the common items or persons in different linking designs.
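As an illustration of the identifiability problem described in this abstract, the sketch below uses the familiar logistic 3PL parameterization with slope a, difficulty b, and guessing parameter c. That parameterization, the function names `p3pl` and `relink`, and the linear transformation with constants `u` and `v` are assumptions chosen for illustration; the paper itself derives linking functions for several parameterizations. The point is only that a linearly rescaled set of parameters yields exactly the same response probabilities, so the data alone cannot distinguish the two scales.

```python
import math

def p3pl(theta, a, b, c):
    """Response probability under the logistic 3PL model (assumed form)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def relink(theta, a, b, c, u, v):
    """Move all parameters to a scale defined by theta* = u * theta + v.

    The slope is divided by u, the difficulty follows the ability scale,
    and the guessing parameter is unchanged.
    """
    return u * theta + v, a / u, u * b + v, c

# Same probability before and after the transformation.
original = p3pl(0.3, 1.2, -0.5, 0.2)
transformed = p3pl(*relink(0.3, 1.2, -0.5, 0.2, u=2.0, v=1.0))
```
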


Subject(s)
Educational Measurement , Models, Statistical , Models, Theoretical , Calibration , Humans
8.
Appl Psychol Meas ; 40(7): 469-485, 2016 Oct.
Article in English | MEDLINE | ID: mdl-29881064

ABSTRACT

Even in the age of abundant and fast computing resources, concurrency requirements for large-scale online testing programs still put an uninterrupted delivery of computer-adaptive tests at risk. In this study, to increase the concurrency for operational programs that use the shadow-test approach to adaptive testing, we explored various strategies aimed at reducing the number of reassembled shadow tests without compromising the measurement quality. Strategies requiring fixed intervals between reassemblies, a certain minimal change in the interim ability estimate since the last assembly before triggering a reassembly, and a hybrid of the two strategies yielded substantial reductions in the number of reassemblies without degradation in the measurement accuracy. The strategies effectively prevented unnecessary reassemblies due to adapting to the noise in the early test stages. They also highlighted the practicality of the shadow-test approach by minimizing the computational load involved in its use of mixed-integer programming.
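The three reassembly strategies named in this abstract can be sketched as a single trigger rule. This is a minimal illustration, not the authors' implementation: the function name, the parameter names, and in particular the way the hybrid combines the two conditions (here, reassemble when either one fires) are assumptions, since the abstract does not specify them.

```python
def should_reassemble(items_since_last, theta_now, theta_at_last,
                      interval=None, min_change=None):
    """Decide whether to reassemble the shadow test before the next item.

    interval:   reassemble after this many items since the last assembly.
    min_change: reassemble once the interim ability estimate has moved
                at least this far since the last assembly.
    Passing both gives a hybrid rule (assumed here to be a logical OR).
    """
    by_interval = interval is not None and items_since_last >= interval
    by_change = (min_change is not None
                 and abs(theta_now - theta_at_last) >= min_change)
    return by_interval or by_change
```

With neither threshold set, the rule never triggers, which corresponds to administering items from a single shadow test without reassembly.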

9.
Appl Psychol Meas ; 40(8): 641-649, 2016 Nov.
Article in English | MEDLINE | ID: mdl-29881074

ABSTRACT

A recent article in this journal addressed the choice between specialized heuristics and mixed-integer programming (MIP) solvers for automated test assembly. This reaction is to comment on the mischaracterization of the general nature of MIP solvers in this article, highlight the quite inefficient modeling of the test-assembly problems used in its empirical examples, and counter these examples by presenting the MIP solutions for a set of 35 real-world multiple-form assembly problems.

10.
Psychometrika ; 80(2): 263-88, 2015 Jun.
Article in English | MEDLINE | ID: mdl-24407735

ABSTRACT

An optimal adaptive design for test-item calibration based on Bayesian optimality criteria is presented. The design adapts the choice of field-test items to the examinees taking an operational adaptive test using both the information in the posterior distributions of their ability parameters and the current posterior distributions of the field-test parameters. Different criteria of optimality based on the two types of posterior distributions are possible. The design can be implemented using an MCMC scheme with alternating stages of sampling from the posterior distributions of the test takers' ability parameters and the parameters of the field-test items while reusing samples from earlier posterior distributions of the other parameters. Results from a simulation study demonstrated the feasibility of the proposed MCMC implementation for operational item calibration. A comparison of performances for different optimality criteria showed faster calibration of substantial numbers of items for the criterion of D-optimality relative to A-optimality, a special case of c-optimality, and random assignment of items to the test takers.


Subject(s)
Bayes Theorem , Psychometrics/methods , Algorithms , Humans , Mathematical Concepts , Research Design
11.
Psychometrika ; 80(3): 689-706, 2015 Sep.
Article in English | MEDLINE | ID: mdl-24915988

ABSTRACT

Posterior odds of cheating on achievement tests are presented as an alternative to p values reported for statistical hypothesis testing for several of the probabilistic models in the literature on the detection of cheating. It is shown how to calculate their combinatorial expressions with the help of a reformulation of the simple recursive algorithm for the calculation of number-correct score distributions used throughout the testing industry. Using the odds avoids the arbitrary choice between statistical tests of answer copying that do and do not condition on the responses the test taker is suspected to have copied and allows the testing agency to account for existing circumstantial evidence of cheating through the specification of prior odds.
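The recursive algorithm for number-correct score distributions mentioned in this abstract is usually attributed to Lord and Wingersky. A minimal sketch of the standard recursion follows; it is not the paper's reformulation, and the function name and the input list `probs` of per-item success probabilities are assumptions for illustration.

```python
def number_correct_distribution(probs):
    """Return [P(X = 0), ..., P(X = n)] for n independent items.

    probs holds each item's probability of a correct response.
    """
    dist = [1.0]  # with zero items, a score of 0 is certain
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for score, mass in enumerate(dist):
            new[score] += mass * (1.0 - p)   # item answered incorrectly
            new[score + 1] += mass * p       # item answered correctly
        dist = new
    return dist

two_items = number_correct_distribution([0.5, 0.5])
```

Each pass adds one item, so the cost grows quadratically in test length rather than exponentially in the number of response patterns.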


Subject(s)
Algorithms , Bayes Theorem , Deception , Educational Measurement/methods , Models, Statistical , Binomial Distribution , Humans , Psychometrics
12.
PLoS One ; 7(4): e34491, 2012.
Article in English | MEDLINE | ID: mdl-22496816

ABSTRACT

INTRODUCTION: Inadequate flow enhancement on the one hand, and excessive flow enhancement on the other hand, remain frequent complications of arteriovenous fistula (AVF) creation, and hamper hemodialysis therapy in patients with end-stage renal disease. In an effort to reduce these, a patient-specific computational model, capable of predicting postoperative flow, has been developed. The purpose of this study was to determine the accuracy of the patient-specific model and to investigate its feasibility to support decision-making in AVF surgery. METHODS: Patient-specific pulse wave propagation models were created for 25 patients awaiting AVF creation. Model input parameters were obtained from clinical measurements and literature. For every patient, a radiocephalic AVF, a brachiocephalic AVF, and a brachiobasilic AVF configuration were simulated and analyzed for their postoperative flow. The most distal configuration with a predicted flow between 400 and 1500 ml/min was considered the preferred location for AVF surgery. The suggestion of the model was compared to the choice of an experienced vascular surgeon. Furthermore, predicted flows were compared to measured postoperative flows. RESULTS: Taking into account the confidence interval (25th to 75th percentile interval), overlap between predicted and measured postoperative flows was observed in 70% of the patients. Differentiation between upper and lower arm configuration was similar in 76% of the patients, whereas discrimination between two upper arm AVF configurations was more difficult. In 3 patients the surgeon created an upper arm AVF, while model-based predictions allowed for lower arm AVF creation, thereby preserving proximal vessels. In one patient early thrombosis in a radiocephalic AVF was observed, which might have been indicated by the low predicted postoperative flow. CONCLUSIONS: Postoperative flow can be predicted relatively accurately for multiple AVF configurations by using computational modeling. This model may therefore be considered a valuable additional tool in the preoperative work-up of patients awaiting AVF creation.
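The selection rule stated in the methods (the most distal configuration with a predicted flow between 400 and 1500 ml/min) can be written as a short lookup. This sketch covers only that decision rule, not the pulse wave propagation model. The function name, the inclusive bounds, the example flows, and the distal-to-proximal ordering of the two upper arm configurations are assumptions made for illustration.

```python
def preferred_configuration(predicted_flows, distal_to_proximal,
                            low=400.0, high=1500.0):
    """Return the most distal configuration whose predicted flow
    (ml/min) lies within [low, high], or None if none qualifies."""
    for name in distal_to_proximal:
        flow = predicted_flows.get(name)
        if flow is not None and low <= flow <= high:
            return name
    return None

# Hypothetical predicted flows for one patient (not data from the study).
order = ["radiocephalic", "brachiocephalic", "brachiobasilic"]
flows = {"radiocephalic": 300.0, "brachiocephalic": 900.0,
         "brachiobasilic": 1200.0}
choice = preferred_configuration(flows, order)
```
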


Subject(s)
Arteriovenous Shunt, Surgical , Computational Biology , Decision Making , Upper Extremity/blood supply , Blood Circulation , Feasibility Studies , Humans , Postoperative Period , Preoperative Period , Prospective Studies , Vascular Patency
13.
Br J Math Stat Psychol ; 63(Pt 3): 603-26, 2010 Nov.
Article in English | MEDLINE | ID: mdl-20109271

ABSTRACT

Marginal maximum-likelihood procedures for parameter estimation and testing the fit of a hierarchical model for speed and accuracy on test items are presented. The model is a composition of two first-level models for dichotomous responses and response times along with multivariate normal models for their item and person parameters. It is shown how the item parameters can easily be estimated using Fisher's identity. To test the fit of the model, Lagrange multiplier tests of the assumptions of subpopulation invariance of the item parameters (i.e., no differential item functioning), the shape of the response functions, and three different types of conditional independence were derived. Simulation studies were used to show the feasibility of the estimation and testing procedures and to estimate the power and Type I error rate of the latter. In addition, the procedures were applied to an empirical data set from a computerized adaptive test of language comprehension.


Subject(s)
Data Collection/statistics & numerical data , Likelihood Functions , Models, Statistical , Psychological Tests/statistics & numerical data , Reaction Time , Algorithms , Comprehension , Computer Simulation , Educational Measurement/statistics & numerical data , Feasibility Studies , Humans , Mathematical Computing , Multilingualism , Multivariate Analysis , Probability , Psychometrics/statistics & numerical data , Reproducibility of Results , Software
14.
Psychometrika ; 74(2): 273-296, 2009 Jun.
Article in English | MEDLINE | ID: mdl-20119511

ABSTRACT

Several criteria from the optimal design literature are examined for use with item selection in multidimensional adaptive testing. In particular, it is examined what criteria are appropriate for adaptive testing in which all abilities are intentional, some should be considered as a nuisance, or the interest is in the testing of a composite of the abilities. Both the theoretical analyses and the studies of simulated data in this paper suggest that the criteria of A-optimality and D-optimality lead to the most accurate estimates when all abilities are intentional, with the former slightly outperforming the latter. The criterion of E-optimality showed occasional erratic behavior for this case of adaptive testing, and its use is not recommended. If some of the abilities are nuisances, application of the criterion of A(s)-optimality (or D(s)-optimality), which focuses on the subset of intentional abilities, is recommended. For the measurement of a linear combination of abilities, the criterion of c-optimality yielded the best results. The preferences of each of these criteria for items with specific patterns of parameter values were also assessed. It was found that the criteria differed mainly in their preferences for items with different patterns of values for their discrimination parameters.
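To make the A- and D-optimality criteria in this abstract concrete, the sketch below selects the next item for a two-dimensional test. It assumes a compensatory two-dimensional 2PL model evaluated at a point estimate of ability, which is a simplification chosen for illustration and not necessarily the model used in the paper. All function and variable names are hypothetical. D-optimality maximizes the determinant of the accumulated information matrix; A-optimality minimizes the trace of its inverse.

```python
import math

def item_information(a, d, theta):
    """2x2 Fisher information of an item with discrimination vector a
    and intercept d at ability theta (assumed 2D logistic model)."""
    z = a[0] * theta[0] + a[1] * theta[1] + d
    p = 1.0 / (1.0 + math.exp(-z))
    w = p * (1.0 - p)
    return [[w * a[0] * a[0], w * a[0] * a[1]],
            [w * a[1] * a[0], w * a[1] * a[1]]]

def select_item(current_info, items, theta, criterion="D"):
    """Return the index of the best candidate under 'D' or 'A'."""
    best_index, best_value = None, None
    for index, (a, d) in enumerate(items):
        info = item_information(a, d, theta)
        total = [[current_info[r][c] + info[r][c] for c in range(2)]
                 for r in range(2)]
        det = total[0][0] * total[1][1] - total[0][1] * total[1][0]
        if det <= 0.0:
            continue  # singular information matrix, criterion undefined
        if criterion == "D":
            value = det
        else:
            # trace of the inverse of a 2x2 matrix is trace / determinant
            value = -(total[0][0] + total[1][1]) / det
        if best_value is None or value > best_value:
            best_index, best_value = index, value
    return best_index

# Ability 1 is already well measured, ability 2 is not.
info_so_far = [[1.0, 0.0], [0.0, 0.1]]
pool = [((1.5, 0.0), 0.0), ((0.0, 1.5), 0.0)]
```

In this small example both criteria prefer the item that loads on the poorly measured second ability.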
