Results 1 - 20 of 1,349
1.
Phys Med Biol ; 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-38981591

ABSTRACT

Objective: We propose a nonparametric figure of merit, the contrast equivalent distance (CED), to measure contrast directly from clinical images. Approach: A relative brightness distance δ is calculated using the order statistic of the pixel values. Multiplying δ by the grey value range R yields the mean brightness distance (MBD). From the MBD, the CED and the distance-to-noise ratio (DNR) can be derived; the latter is the ratio of the MBD to a previously suggested nonparametric noise measure τ. Since the order statistic is independent of the spatial arrangement of the pixel values, the measures can be obtained directly from clinical images. We apply the new measures to mammography images of an anthropomorphic phantom and of a phantom with a step wedge, as well as to CT images of a head phantom. Main Results: For low-noise images of a step wedge, the MBD is equivalent to the conventional grey value distance. While this measure permits the evaluation of clinical images, it is sensitive to noise; noise therefore has to be quantified at the same time. When the ratio σ/τ of the noise standard deviation σ to τ is available, validity limits for the CED as a measure of contrast can be established. The new figures of merit can be calculated for entire images as well as for regions of interest (ROIs) with an edge length of no less than 32 px. Significance: The new figures of merit are suited to quantifying the quality of clinical images without relying on the assumption of a linear, shift-invariant system. They can be used for any kind of greyscale image, provided the ratio σ/τ can be estimated. This should help achieve the optimisation of image quality versus dose required by radioprotection laws.

2.
J Environ Manage ; 365: 121641, 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-38959764

ABSTRACT

Urban areas contribute 85% of China's CO2 emissions. Green finance is an important means to support green energy development and achieve the low-carbon transformation of high-energy-consuming industries. The motivation of this article is to investigate the impact and mechanism of green finance on urban carbon intensity. Most existing literature uses linear models to investigate urban carbon intensity, ignoring the nonlinear relationships between economic variables. Nonparametric models can overcome the inherent shortcomings of linear models and effectively capture the nonlinear nexus between economic variables. Based on 2011-2021 panel data for 237 cities in China, this paper applies a nonparametric additive model to examine the influence of green finance on urban carbon intensity. Empirical findings show that green finance exerts an inverted U-shaped effect on urban carbon intensity, indicating that the carbon reduction effect of green finance has gradually shifted from inconspicuous in the early stages to prominent in the later stages. This article then conducts heterogeneity analyses from the perspectives of region, city size, and carbon intensity. The results show that the impact of green finance on the various carbon intensities exhibits clear nonlinear features throughout. Furthermore, this article employs a mediation effect model to conduct mechanism analysis. The results show that technological progress and industrial structure are two important mediating variables, both of which produce an inverted U-shaped nonlinear impact on urban carbon intensity.
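The abstract names a nonparametric additive model but does not spell out how such a model is fit. A minimal backfitting sketch with Nadaraya-Watson (kernel) smoothers and synthetic data standing in for the city panel (the variable names, bandwidth, and data are illustrative assumptions, not the authors' specification):

```python
import numpy as np

def kernel_smooth(x, y, grid, h):
    """Nadaraya-Watson smoother: local Gaussian-weighted average of y at each grid point."""
    w = np.exp(-0.5 * ((grid[:, None] - x[None, :]) / h) ** 2)
    return (w @ y) / w.sum(axis=1)

def backfit_additive(X, y, h=0.3, n_iter=50):
    """Fit y ~ alpha + f1(x1) + ... + fp(xp) by backfitting."""
    n, p = X.shape
    alpha = y.mean()
    f = np.zeros((n, p))                      # component functions evaluated at the data
    for _ in range(n_iter):
        for j in range(p):
            resid = y - alpha - f[:, np.arange(p) != j].sum(axis=1)
            fj = kernel_smooth(X[:, j], resid, X[:, j], h)
            f[:, j] = fj - fj.mean()          # center each component for identifiability
    return alpha, f

# Synthetic stand-in: a "green finance" index and one control, with nonlinear true effects.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(500, 2))
y = 2.0 - 4.0 * (X[:, 0] - 0.5) ** 2 + np.sin(2 * np.pi * X[:, 1]) + rng.normal(0, 0.2, 500)
alpha, f = backfit_additive(X, y)
print(alpha, f[:5, 0])                        # fitted intercept and first component values
```

Plotting each fitted component against its covariate is what reveals U-shaped or inverted U-shaped patterns of the kind the study reports.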

3.
Sensors (Basel) ; 24(11)2024 May 24.
Article in English | MEDLINE | ID: mdl-38894148

ABSTRACT

Birth asphyxia is a potential cause of death that is also associated with acute and chronic morbidities. The traditional and immediate approach for monitoring birth asphyxia (i.e., arterial blood gas analysis) is highly invasive and intermittent. Additionally, alternative noninvasive approaches such as pulse oximeters can be problematic, due to the possibility of false and erroneous measurements. Therefore, further research is needed to explore alternative noninvasive and accurate monitoring methods for asphyxiated neonates. This study aims to investigate the prominent ECG features based on pH estimation that could potentially be used to explore the noninvasive, accurate, and continuous monitoring of asphyxiated neonates. The dataset used contained 274 segments of ECG and pH values recorded simultaneously. After preprocessing the data, principal component analysis and the Pan-Tompkins algorithm were used for each segment to determine the most significant ECG cycle and to compute the ECG features. Descriptive statistics were performed to describe the main properties of the processed dataset. A Kruskal-Wallis nonparametric test was then used to analyze differences between the asphyxiated and non-asphyxiated groups. Finally, a Dunn-Sidák post hoc test was used for individual comparison among the mean ranks of all groups. The findings of this study showed that ECG features (T/QRS, T Amplitude, Tslope, Tslope/T, Tslope/|T|, HR, QT, and QTc) based on pH estimation differed significantly (p < 0.05) in asphyxiated neonates. All these key ECG features were also found to be significantly different between the two groups.
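A minimal sketch of the group comparison step described above, using SciPy's Kruskal-Wallis test on a single ECG feature; the arrays are synthetic placeholders for pH-defined groups, and the pairwise follow-up is only an approximation of the Dunn-Sidák post hoc procedure used in the study:

```python
import numpy as np
from scipy.stats import kruskal, mannwhitneyu

rng = np.random.default_rng(1)
# Hypothetical T/QRS ratios for three pH-defined groups (placeholder values).
normal   = rng.normal(0.10, 0.03, 120)
moderate = rng.normal(0.14, 0.04, 90)
severe   = rng.normal(0.20, 0.05, 64)

H, p = kruskal(normal, moderate, severe)
print(f"Kruskal-Wallis H = {H:.2f}, p = {p:.4g}")

# Pairwise follow-up with a Sidak-style correction (approximating Dunn-Sidak).
pairs = [("normal", normal, "moderate", moderate),
         ("normal", normal, "severe", severe),
         ("moderate", moderate, "severe", severe)]
m = len(pairs)
for name_a, a, name_b, b in pairs:
    _, p_raw = mannwhitneyu(a, b, alternative="two-sided")
    p_adj = 1 - (1 - p_raw) ** m              # Sidak adjustment for m comparisons
    print(f"{name_a} vs {name_b}: adjusted p = {p_adj:.4g}")
```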


Subject(s)
Asphyxia Neonatorum , Electrocardiography , Humans , Electrocardiography/methods , Infant, Newborn , Hydrogen-Ion Concentration , Asphyxia Neonatorum/diagnosis , Asphyxia Neonatorum/physiopathology , Algorithms , Feasibility Studies , Blood Gas Analysis/methods , Principal Component Analysis , Female , Male
4.
Stat Med ; 2024 Jun 19.
Article in English | MEDLINE | ID: mdl-38897921

ABSTRACT

Biomarkers are often measured in bulk to diagnose patients, monitor patient conditions, and research novel drug pathways. The measurement of these biomarkers often suffers from detection limits that result in missing and untrustworthy measurements. Frequently, missing biomarkers are imputed so that downstream analysis can be conducted with modern statistical methods that cannot normally handle data subject to informative censoring. This work develops an empirical Bayes g-modeling method for imputing and denoising biomarker measurements. We establish superior estimation properties compared to popular methods in simulations and with real data, providing useful biomarker measurement estimates for downstream analysis.
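The abstract does not give model details; a minimal g-modeling sketch under the simplifying assumptions of Gaussian measurement noise with known σ and a single lower detection limit (the grid, EM loop, and simulated data are illustrative, not the authors' implementation):

```python
import numpy as np
from scipy.stats import norm

def g_model_impute(x, censored, sigma, grid, n_iter=200):
    """Empirical Bayes g-modeling sketch: estimate a discrete prior g on `grid` by EM,
    then return the posterior mean of the latent biomarker for every observation.
    censored[i] is True when only `value <= limit` is known; x[i] then stores the limit."""
    pi = np.full(len(grid), 1.0 / len(grid))            # prior weights on the grid
    # Likelihood of each observation under each candidate latent level.
    lik = np.where(censored[:, None],
                   norm.cdf((x[:, None] - grid[None, :]) / sigma),   # P(obs below limit)
                   norm.pdf((x[:, None] - grid[None, :]) / sigma))
    for _ in range(n_iter):                             # EM for the prior g
        post = lik * pi[None, :]
        post /= post.sum(axis=1, keepdims=True)         # E-step: per-observation posteriors
        pi = post.mean(axis=0)                          # M-step: update prior weights
    post = lik * pi[None, :]
    post /= post.sum(axis=1, keepdims=True)             # posteriors under the final prior
    return post @ grid                                  # posterior-mean imputations/denoising

rng = np.random.default_rng(2)
theta = rng.gamma(2.0, 1.0, 300)                        # latent biomarker levels
x = theta + rng.normal(0, 0.5, 300)                     # noisy measurements
limit = 1.0
censored = x < limit
x = np.where(censored, limit, x)                        # censored entries store the limit
grid = np.linspace(0, 10, 200)
print(g_model_impute(x, censored, sigma=0.5, grid=grid)[:5])
```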

5.
Stat Med ; 2024 Jun 17.
Article in English | MEDLINE | ID: mdl-38885953

ABSTRACT

Recent advances in engineering technologies have enabled the collection of a large number of longitudinal features. This wealth of information presents unique opportunities for researchers to investigate the complex nature of diseases and uncover underlying disease mechanisms. However, analyzing such data can be difficult due to its high dimensionality and heterogeneity and the computational challenges it raises. In this article, we propose a Bayesian nonparametric mixture model for clustering high-dimensional mixed-type (e.g., continuous, discrete, and categorical) longitudinal features. We employ a sparse factor model on the joint distribution of random effects, and the key idea is to induce clustering at the latent factor level instead of the original data to escape the curse of dimensionality. The number of clusters is estimated through a Dirichlet process prior. An efficient Gibbs sampler is developed to estimate the posterior distribution of the model parameters. Analysis of real and simulated data is presented and discussed. Our study demonstrates that the proposed model serves as a useful analytical tool for clustering high-dimensional longitudinal data.

6.
Restor Dent Endod ; 49(2): e21, 2024 May.
Article in English | MEDLINE | ID: mdl-38841381

ABSTRACT

Objectives: This paper aims to serve as a useful guide to sample size determination for various correlation analyses, based on effect sizes and confidence interval width. Materials and Methods: Sample size determinations are calculated for Pearson's correlation, Spearman's rank correlation, and Kendall's Tau-b correlation. Examples of sample size statements and their justification are also included. Results: Using the same effect sizes, there are differences between the sample size determinations of the 3 statistical tests. Based on an empirical calculation, a minimum sample size of 149 is usually adequate for performing both parametric and non-parametric correlation analysis to detect at least a moderate to an excellent degree of correlation with acceptable confidence interval width. Conclusions: Determining the data assumption(s) is one of the challenges in offering a valid technique to estimate the required sample size for correlation analyses. Sample size tables are provided, and these will help researchers to estimate a minimum sample size requirement based on correlation analyses.
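A small sketch of one common way to derive such a sample size, via the Fisher z transformation and a target confidence interval width; the target correlation and width below are arbitrary illustrations, not the paper's tabled values:

```python
import math
from scipy.stats import norm

def n_for_correlation_ci(r, ci_width, alpha=0.05):
    """Approximate n so the (1 - alpha) CI for a Pearson correlation of size r
    has at most the requested total width, using the Fisher z transformation."""
    z_crit = norm.ppf(1 - alpha / 2)
    lo = math.atanh(r - ci_width / 2)          # CI endpoints mapped to the z scale
    hi = math.atanh(r + ci_width / 2)
    half_width_z = (hi - lo) / 2
    n = (z_crit / half_width_z) ** 2 + 3       # half-width on the z scale is z_crit / sqrt(n - 3)
    return math.ceil(n)

# Example: a moderate correlation of 0.5 with a total CI width of 0.2.
print(n_for_correlation_ci(r=0.5, ci_width=0.2))
```

Rank-based coefficients (Spearman, Kendall) need inflation factors on top of this, which is one reason the three tests give different sample sizes for the same effect size.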

7.
Front Neurosci ; 18: 1344114, 2024.
Article in English | MEDLINE | ID: mdl-38933813

ABSTRACT

One-shot learning, the ability to learn a new concept from a single instance, is a distinctive brain function that has garnered substantial interest in machine learning. While modeling physiological mechanisms poses challenges, advancements in artificial neural networks have led to performance in specific tasks that rivals human capabilities. Proposing one-shot learning methods with these advancements, especially those involving simple mechanisms, not only enhances technological development but also contributes to neuroscience by offering functionally valid hypotheses. Among the simplest methods for one-shot class addition with deep learning image classifiers is "weight imprinting," which uses the neural activity evoked by image data from a new class as the corresponding new synaptic weights. Despite its simplicity, its relevance to neuroscience is ambiguous, and it often interferes with the original image classification, which is a significant drawback in practical applications. This study introduces a novel interpretation in which part of the weight imprinting process aligns with the Hebbian rule. We show that a single Hebbian-like process enables pre-trained deep learning image classifiers to perform one-shot class addition without any modification to the original classifier's backbone. Using non-parametric normalization to mimic the brain's fast Hebbian plasticity significantly reduces the interference observed in previous methods. Our method is one of the simplest and most practical for one-shot class addition tasks, and its reliance on a single fast Hebbian-like process contributes valuable insights to neuroscience hypotheses.
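A minimal NumPy sketch of the imprinting step described above: the L2-normalized embedding of a single new-class image becomes the new output-layer weight vector. The embedding dimension, the cosine-similarity readout, and the normalization choice are illustrative assumptions, not the paper's exact pipeline:

```python
import numpy as np

def imprint_new_class(W, embedding):
    """Append one new class to a classifier head by weight imprinting.
    W: (n_classes, d) existing output-layer weights; embedding: (d,) feature
    vector of a single example from the new class."""
    w_new = embedding / np.linalg.norm(embedding)     # Hebbian-like: activity becomes weights
    return np.vstack([W, w_new])

def predict(W, embedding):
    """Cosine-similarity classification against the (possibly imprinted) weights."""
    z = embedding / np.linalg.norm(embedding)
    scores = (W / np.linalg.norm(W, axis=1, keepdims=True)) @ z
    return int(np.argmax(scores))

rng = np.random.default_rng(3)
W = rng.normal(size=(10, 64))               # pretrained head for 10 classes (placeholder)
new_example = rng.normal(size=64)           # feature vector of one new-class image
W = imprint_new_class(W, new_example)       # classifier now covers 11 classes
print(predict(W, new_example))              # the imprinted example maps to class index 10
```

The backbone that produces the embeddings is untouched; only the head gains one row, which is what makes the approach attractive for on-device class addition.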

8.
Ultrasonics ; 142: 107391, 2024 Jun 24.
Article in English | MEDLINE | ID: mdl-38936287

ABSTRACT

Diagnosis of early hepatic steatosis would allow timely intervention. The ability of B-mode ultrasound imaging to detect early steatosis has been questioned, especially in the presence of various concomitant parenchymal diseases. This study aimed to use surgical specimens as a reference standard to elucidate the clinical performance of ultrasonic echogenicity and of parametric and nonparametric backscatter statistics in real-world scenarios. Ultrasound radio-frequency (RF) signals of the right liver lobe and patient data were collected preoperatively. Surgical specimens were then used to histologically determine the staging of steatosis. A backscatter nonparametric statistic (h), a known backscatter parametric statistic, i.e., the Nakagami parameter (m), and a quantitative echo intensity (env) were calculated. Among the 236 patients included in the study, 93 were grade 0 (<5% fat) and 143 had steatosis. The env, m, and h statistics all showed significant discriminatory power across steatosis grades (AUC = 0.643-0.907 with p-value < 0.001). Mann-Whitney U tests, however, revealed that only the backscatter statistics m and h were significantly different between the grade 0 and grade 1 groups. A two-way ANOVA showed a significant confounding effect of elevated ALT on env (p-value = 0.028), but no effect on m or h. Additionally, severe fibrosis was found to be a significant covariate for m and h. Ultrasonic signals acquired from different scanners were found to be linearly comparable.
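A short sketch of the standard moment-based (inverse normalized variance) estimate of the Nakagami parameter m from envelope samples in an ROI; the study's nonparametric backscatter statistic h is specific to that work and is not reproduced here, and the simulated envelope below is only a placeholder:

```python
import numpy as np

def nakagami_m(envelope):
    """Moment-based Nakagami shape estimate: m = E[R^2]^2 / Var(R^2)."""
    r2 = np.asarray(envelope, dtype=float) ** 2
    return r2.mean() ** 2 / r2.var()

rng = np.random.default_rng(4)
# A Rayleigh-distributed envelope (fully developed speckle) should give m close to 1.
envelope = rng.rayleigh(scale=1.0, size=50_000)
print(round(nakagami_m(envelope), 3))
```

In fatty liver, deviations of m from the fully developed speckle value are what carry the diagnostic signal.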

9.
Entropy (Basel) ; 26(6)2024 May 31.
Article in English | MEDLINE | ID: mdl-38920496

ABSTRACT

The joint probability density function of wind speed and wind direction serves as the mathematical basis for directional wind energy assessment. In this study, a nonparametric joint probability estimation system for wind velocity and direction based on copulas is proposed and empirically investigated in Inner Mongolia, China. Optimal bandwidth algorithms and transformation techniques are used to determine the nonparametric copula method. Various parametric copula models, as well as models that do not consider dependency relationships, are introduced and compared with this approach. The results indicate a significant advantage of employing the nonparametric copula model for fitting joint probability distributions of wind speed and wind direction, as well as for conducting correlation analyses. By utilizing the proposed KDE-COP-CV model, it becomes possible to accurately and reliably analyze how wind power density fluctuates in relation to wind direction. This study reveals that the researched region possesses abundant wind resources, with the highest wind power density being highly dependent on wind direction at maximum speeds. Wind resources in the selected regions of Inner Mongolia are predominantly concentrated in the northwest and west directions. These findings can contribute to improving the accuracy of micro-siting for wind farms, as well as to optimizing the design and capacity of wind turbine generators.

10.
J Environ Manage ; 365: 121553, 2024 Jun 21.
Article in English | MEDLINE | ID: mdl-38908148

ABSTRACT

Carbon dioxide (CO2) emissions are the primary contributors to climate change. Addressing and mitigating climate change necessitates the effective management and utilization of renewable energy consumption, which poses a substantial challenge for the forthcoming decades. This study explores the dynamic effects of service value added (SVA) and renewable energy on environmental quality, focusing particularly on CO2 emissions. Unlike previous studies, we employ a non-parametric modeling approach to uncover the time-varying influence of service sector growth on CO2 emissions. Specifically, we apply the local linear dummy variable estimation (LLDVE) method to a panel of the 17 highest-emitting nations over the period 1980-2021. Our study uncovers a non-linear relationship between CO2 emissions and SVA. From 1980 to 2003, we observe a negative correlation; from 2005 to 2020, however, we observe a shift towards a positive correlation, indicating a rise in energy consumption within the service sector. The results indicate that the major emitter economies have yet to achieve sustainability, with the service sector continuing to contribute to pollution. Addressing this issue necessitates more robust climate change policies and increased investment in clean energy, specifically targeting the service sector, including buildings and transport.

11.
Sci Rep ; 14(1): 13561, 2024 Jun 12.
Article in English | MEDLINE | ID: mdl-38866892

ABSTRACT

In various practical situations, information about the process distribution is sometimes partially or completely unavailable. In these instances, practitioners prefer to use nonparametric charts, as these do not require the assumption of normality or of any specific distribution. In this article, a nonparametric double homogeneously weighted moving average control chart based on the Wilcoxon signed-rank statistic is developed for monitoring the location parameter of a process. The run-length profiles of the newly developed chart are obtained using Monte Carlo simulations. Comparisons are made between the proposed chart and existing nonparametric counterpart charts based on various performance metrics of the run-length distribution. The extra quadratic loss is used to evaluate the overall performance of the proposed and existing charts. The newly developed scheme shows comparatively better results than its existing counterparts. For practical implementation of the suggested scheme, a real-world dataset on the inside diameter of automobile piston rings is also used.
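A simplified sketch of the chart's building blocks: the Wilcoxon signed-rank statistic per subgroup and a single homogeneously weighted moving average of those statistics. The double-smoothing stage and the control limits of the proposed chart are omitted, and the target median, subgroup size, and λ are illustrative assumptions:

```python
import numpy as np

def signed_rank(sample, target):
    """Wilcoxon signed-rank statistic of a subgroup against a target median."""
    d = np.asarray(sample, dtype=float) - target
    ranks = np.argsort(np.argsort(np.abs(d))) + 1      # ranks of |d|; ties not handled here
    return float(np.sum(np.sign(d) * ranks))

def hwma(stats, lam=0.1):
    """Homogeneously weighted moving average of a sequence of plotting statistics:
    H_t = lam * X_t + (1 - lam) * mean(X_1, ..., X_{t-1})."""
    out = []
    for t, x in enumerate(stats):
        prev_mean = np.mean(stats[:t]) if t > 0 else 0.0   # in-control mean of SR is 0
        out.append(lam * x + (1 - lam) * prev_mean)
    return out

rng = np.random.default_rng(5)
subgroups = [rng.normal(0.0, 1.0, 10) for _ in range(20)]       # in-control subgroups
subgroups += [rng.normal(0.5, 1.0, 10) for _ in range(10)]      # shifted process
sr = [signed_rank(g, target=0.0) for g in subgroups]
print(np.round(hwma(sr), 2))                                    # drifts upward after the shift
```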

12.
Nan Fang Yi Ke Da Xue Xue Bao ; 44(4): 689-696, 2024 Apr 20.
Article in Chinese | MEDLINE | ID: mdl-38708502

ABSTRACT

OBJECTIVE: To construct a nonparametric proportional hazards (PH) model for mixed informative interval-censored failure time data for predicting the risks of heart transplantation surgery. METHODS: Given the complexity of mixed informative interval-censored failure time data, we considered the interdependence between the failure time process and the observation time process, constructed a nonparametric proportional hazards (PH) model to describe the nonlinear relationship between the risk factors and heart transplant surgery risks, and proposed a two-step sieve maximum likelihood estimation algorithm. An estimating equation was established to estimate the frailty variables using the observation process model. I-splines and B-splines were used to approximate the unknown baseline hazard function and the nonparametric function, respectively, to obtain the working likelihood function in the sieve space. The partial derivatives with respect to the model parameters were used to obtain the score equation. The maximum likelihood estimates of the parameters were obtained by solving the score equation, and function curves of the impact of the risk factors on the risk of heart transplantation surgery were drawn. RESULTS: Simulation experiments suggested that the estimates obtained by the proposed method were consistent and asymptotically efficient under various settings, with good fitting performance. Analysis of heart transplant surgery data showed that the donor's age had a positive linear relationship with the surgical risk. The impact of the recipient's age at disease onset increased at first, then stabilized, and increased again at older ages. The donor-recipient age difference had a positive linear relationship with the surgical risk of heart transplantation. CONCLUSION: The nonparametric PH model established in this study can be used for predicting the risks of heart transplantation surgery and exploring the functional relationship between the surgical risks and the risk factors.


Subject(s)
Heart Transplantation , Proportional Hazards Models , Humans , Risk Factors , Algorithms , Likelihood Functions
13.
J Am Stat Assoc ; 119(545): 297-307, 2024.
Article in English | MEDLINE | ID: mdl-38716406

ABSTRACT

The weighted nearest neighbors (WNN) estimator has been popularly used as a flexible and easy-to-implement nonparametric tool for mean regression estimation. The bagging technique is an elegant way to form WNN estimators with weights automatically generated for the nearest neighbors (Steele, 2009; Biau et al., 2010); we refer to the resulting estimator as the distributional nearest neighbors (DNN) estimator for easy reference. Yet, there is a lack of distributional results for such an estimator, limiting its application to statistical inference. Moreover, when the mean regression function has higher-order smoothness, DNN does not achieve the optimal nonparametric convergence rate, mainly because of the bias issue. In this work, we provide an in-depth technical analysis of the DNN, based on which we suggest a bias reduction approach for the DNN estimator by linearly combining two DNN estimators with different subsampling scales, resulting in the novel two-scale DNN (TDNN) estimator. The two-scale DNN estimator has an equivalent representation as a WNN estimator with weights admitting explicit forms, some of which are negative. We prove that, thanks to the use of negative weights, the two-scale DNN estimator enjoys the optimal nonparametric rate of convergence in estimating the regression function under the fourth-order smoothness condition. We further go beyond estimation and establish that the DNN and two-scale DNN are both asymptotically normal as the subsampling scales and sample size diverge to infinity. For practical implementation, we also provide variance estimators and a distribution estimator using the jackknife and bootstrap techniques for the two-scale DNN. These estimators can be exploited for constructing valid confidence intervals for nonparametric inference of the regression function. The theoretical results and appealing finite-sample performance of the suggested two-scale DNN method are illustrated with several simulation examples and a real data application.
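A Monte Carlo sketch of the idea: a DNN estimate at subsampling scale s is approximated by averaging 1-NN predictions over random subsamples of size s, and two scales are combined with weights chosen so that leading bias terms assumed proportional to s^(-2/d) cancel. The Monte Carlo approximation and the weight formula below are illustrative; the paper gives exact finite-sample weights:

```python
import numpy as np

def dnn_estimate(X, y, x0, s, n_sub=2000, rng=None):
    """Monte Carlo distributional nearest neighbors: average the 1-NN response
    over random subsamples of size s (bagged 1-NN)."""
    rng = rng or np.random.default_rng()
    n = len(X)
    preds = []
    for _ in range(n_sub):
        idx = rng.choice(n, size=s, replace=False)
        d = np.linalg.norm(X[idx] - x0, axis=1)
        preds.append(y[idx[np.argmin(d)]])
    return float(np.mean(preds))

def tdnn_estimate(X, y, x0, s1, s2, rng=None):
    """Two-scale DNN sketch: weights solve w1 + w2 = 1 and
    w1 * s1**(-2/d) + w2 * s2**(-2/d) = 0, so the assumed leading bias cancels."""
    d = X.shape[1]
    w1 = 1.0 / (1.0 - (s2 / s1) ** (2.0 / d))   # negative weight on the smaller scale
    w2 = 1.0 - w1
    return (w1 * dnn_estimate(X, y, x0, s1, rng=rng)
            + w2 * dnn_estimate(X, y, x0, s2, rng=rng))

rng = np.random.default_rng(6)
X = rng.uniform(-1, 1, size=(2000, 2))
y = np.sin(np.pi * X[:, 0]) * np.cos(np.pi * X[:, 1]) + rng.normal(0, 0.1, 2000)
x0 = np.array([0.3, -0.2])
print(tdnn_estimate(X, y, x0, s1=20, s2=80, rng=rng))   # true value is about 0.654
```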

14.
Trials ; 25(1): 312, 2024 May 09.
Article in English | MEDLINE | ID: mdl-38725072

ABSTRACT

BACKGROUND: Clinical trials often involve some form of interim monitoring to determine futility before planned trial completion. While many options for interim monitoring exist (e.g., alpha-spending, conditional power), nonparametric interim monitoring methods are also needed to account for more complex trial designs and analyses. The upstrap is one recently proposed nonparametric method that may be applied for interim monitoring. METHODS: Upstrapping is motivated by the case resampling bootstrap and involves repeatedly sampling with replacement from the interim data to simulate thousands of fully enrolled trials. The p-value is calculated for each upstrapped trial and the proportion of upstrapped trials for which the p-value criteria are met is compared with a pre-specified decision threshold. To evaluate the potential utility of upstrapping as a form of interim futility monitoring, we conducted a simulation study considering different sample sizes with several different proposed calibration strategies for the upstrap. We first compared trial rejection rates across a selection of threshold combinations to validate the upstrapping method. Then, we applied upstrapping methods to simulated clinical trial data, directly comparing their performance with more traditional alpha-spending and conditional power interim monitoring methods for futility. RESULTS: The method validation demonstrated that upstrapping is much more likely to find evidence of futility in the null scenario than in the alternative across a variety of simulation settings. Our three proposed approaches for calibration of the upstrap had different strengths depending on the stopping rules used. Compared to O'Brien-Fleming group sequential methods, upstrapped approaches had type I error rates that differed by at most 1.7% and expected sample size was 2-22% lower in the null scenario, while in the alternative scenario power fluctuated between 15.7% lower and 0.2% higher and expected sample size was 0-15% lower. CONCLUSIONS: In this proof-of-concept simulation study, we evaluated the potential for upstrapping as a resampling-based method for futility monitoring in clinical trials. The trade-offs in expected sample size, power, and type I error rate control indicate that the upstrap can be calibrated to implement futility monitoring with varying degrees of aggressiveness and that performance similarities can be identified relative to the considered alpha-spending and conditional power futility monitoring methods.
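A proof-of-concept sketch of the upstrap step described above for a two-arm trial with a continuous outcome: resample the interim data with replacement up to the planned full size, compute a p-value per upstrapped trial, and compare the proportion of "successful" upstrapped trials with a decision threshold. The t-test, planned size, and thresholds are illustrative assumptions, not the trial designs studied in the paper:

```python
import numpy as np
from scipy.stats import ttest_ind

def upstrap_futility(interim_a, interim_b, n_planned_per_arm,
                     n_upstrap=2000, alpha=0.05, success_threshold=0.20, rng=None):
    """Return (proportion of upstrapped trials with p < alpha, stop-for-futility flag)."""
    rng = rng or np.random.default_rng()
    hits = 0
    for _ in range(n_upstrap):
        a = rng.choice(interim_a, size=n_planned_per_arm, replace=True)
        b = rng.choice(interim_b, size=n_planned_per_arm, replace=True)
        if ttest_ind(a, b).pvalue < alpha:
            hits += 1
    prop = hits / n_upstrap
    return prop, prop < success_threshold

rng = np.random.default_rng(7)
# Interim data (half of a planned 100-per-arm trial) with essentially no treatment effect.
interim_a = rng.normal(0.0, 1.0, 50)
interim_b = rng.normal(0.05, 1.0, 50)
prop, stop = upstrap_futility(interim_a, interim_b, n_planned_per_arm=100, rng=rng)
print(f"proportion significant = {prop:.3f}, stop for futility = {stop}")
```

Calibration amounts to choosing `alpha` and `success_threshold` so that the chart of stop decisions has acceptable type I error and power, which is what the simulation study compares against alpha-spending and conditional power rules.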


Subject(s)
Clinical Trials as Topic , Computer Simulation , Medical Futility , Research Design , Humans , Clinical Trials as Topic/methods , Sample Size , Data Interpretation, Statistical , Models, Statistical , Treatment Outcome
15.
Healthcare (Basel) ; 12(9)2024 May 02.
Article in English | MEDLINE | ID: mdl-38727496

ABSTRACT

Understanding the intricate relationships between diseases is critical for both prevention and recovery. However, there is a lack of suitable methodologies for exploring the precedence relationships within multiple censored time-to-event data, resulting in decreased analytical accuracy. This study introduces Censored Event Precedence Analysis (CEPA), a nonparametric Bayesian approach suitable for understanding precedence relationships in censored multivariate events. CEPA aims to analyze the precedence relationships between events to predict subsequent occurrences effectively. We applied CEPA to neonatal data from the National Health Insurance Service, identifying the precedence relationships among the seven most commonly diagnosed diseases categorized by the International Classification of Diseases. This analysis revealed a typical diagnostic sequence, starting with respiratory diseases, followed by skin, infectious, digestive, ear, eye, and injury-related diseases. Furthermore, simulation studies were conducted to demonstrate the suitability of CEPA for censored multivariate datasets compared to traditional models. The performance accuracy reached 76% for the uniform distribution and 65% for the exponential distribution, showing superior performance in all four tested environments. Therefore, the statistical approach based on CEPA enhances our understanding of disease interrelationships beyond competing methodologies. By identifying disease precedence with CEPA, we can preempt subsequent disease occurrences and propose a healthcare system based on these relationships.

16.
Environ Sci Pollut Res Int ; 31(25): 36796-36813, 2024 May.
Article in English | MEDLINE | ID: mdl-38755475

ABSTRACT

The purpose of this article is to investigate the new driving forces behind China's green energy and to further assess the impact of green energy on climate change. The existing literature has used linear methods to investigate green energy, ignoring the non-linear relationships between economic variables. Nonparametric models can accurately capture nonlinear relationships between economic variables. This paper constructs a nonparametric additive model and uses it to explore green energy. The empirical results show that the impact of green finance on green energy is more prominent in the later stage (a U-shaped impact). Fiscal decentralization also exerts a positive U-shaped impact, meaning that expanding local fiscal autonomy has contributed to green energy growth in the later stage. Similarly, the impact of oil prices and foreign direct investment demonstrates a positive U-shaped pattern. However, the nonlinear impact of environmental pressure displays an inverted U-shaped pattern. Furthermore, this article explores the impact of green energy on climate change and its impact mechanisms. The results show that green energy generates a positive U-shaped impact on climate change, meaning that the role of green energy in mitigating climate change gradually becomes prominent over time. Mechanism analysis shows that industrial structure and energy structure both exert a nonlinear influence on climate change.


Subject(s)
Climate Change , China
17.
Pain Med ; 2024 May 22.
Article in English | MEDLINE | ID: mdl-38775642

ABSTRACT

OBJECTIVE: The statistical analysis typically employed to compare pain both before and after interventions assumes that scores are normally distributed. The present study evaluates whether Numeric Rating Scale (NRS) scores, specifically NRS-11 scores, are indeed normally distributed in a clinically relevant cohort of adults with chronic axial spine pain pre- and post-analgesic intervention. METHODS: Retrospective review from four academic medical centers of prospectively collected data from a uniform pain diary administered to consecutive patients after undergoing medial branch blocks. The pain diary assessed NRS-11 scores immediately pre-injection and at 12 different time points post-injection, up to 48 hours. D'Agostino-Pearson tests were used to test normality at all time points. RESULTS: One hundred fifty pain diaries were reviewed, and despite normally distributed pre-injection NRS-11 scores (K2 = 0.655, p = 0.72), none of the post-injection NRS-11 scores were normally distributed (K2 = 9.70-17.62, p = 0.0001-0.008). CONCLUSIONS: Although the results of parametric analyses of NRS-11 scores are commonly reported in pain research, some properties of the NRS-11 do not satisfy the assumptions required for these analyses. The data demonstrate non-normal distributions in post-intervention NRS-11 scores, thereby violating a key requisite for parametric analysis. We urge pain researchers to consider appropriate statistical analysis and reporting for non-normally distributed NRS-11 scores to ensure accurate interpretation and communication of these data. Practicing pain physicians should similarly recognize that parametric post-intervention pain score statistics may not accurately describe the data and should expect manuscripts to use measures of normality to justify the selected statistical methods.
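The D'Agostino-Pearson check reported above is available directly in SciPy; a minimal sketch on simulated 0-10 pain scores (the simulated scores are placeholders, not the study's diaries):

```python
import numpy as np
from scipy.stats import normaltest

rng = np.random.default_rng(8)
pre_scores = np.clip(np.round(rng.normal(6, 2, 150)), 0, 10)        # roughly symmetric
post_scores = np.clip(np.round(rng.exponential(2, 150)), 0, 10)     # floor-skewed after a block

for label, scores in [("pre-injection", pre_scores), ("post-injection", post_scores)]:
    k2, p = normaltest(scores)                  # D'Agostino-Pearson K^2 omnibus test
    print(f"{label}: K2 = {k2:.2f}, p = {p:.4g}")
```

A small p-value rejects normality, which is the justification for switching to rank-based summaries and tests for the post-injection scores.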

18.
Stat Methods Med Res ; : 9622802241254196, 2024 May 20.
Article in English | MEDLINE | ID: mdl-38767219

ABSTRACT

In many cluster-correlated data analyses, informative cluster size poses a challenge that can potentially introduce bias in statistical analyses. Different methodologies have been introduced in statistical literature to address this bias. In this study, we consider a complex form of informativeness where the number of observations corresponding to latent levels of a unit-level continuous covariate within a cluster is associated with the response variable. This type of informativeness has not been explored in prior research. We present a novel test statistic designed to evaluate the effect of the continuous covariate while accounting for the presence of informativeness. The covariate induces a continuum of latent subgroups within the clusters, and our test statistic is formulated by aggregating values from an established statistic that accounts for informative subgroup sizes when comparing group-specific marginal distributions. Through carefully designed simulations, we compare our test with four traditional methods commonly employed in the analysis of cluster-correlated data. Only our test maintains the size across all data-generating scenarios with informativeness. We illustrate the proposed method to test for marginal associations in periodontal data with this distinctive form of informativeness.

19.
Entropy (Basel) ; 26(5)2024 Apr 30.
Article in English | MEDLINE | ID: mdl-38785636

ABSTRACT

Using information-theoretic quantities in practical applications with continuous data is often hindered by the fact that probability density functions need to be estimated in higher dimensions, which can become unreliable or even computationally unfeasible. To make these useful quantities more accessible, alternative approaches such as binned frequencies using histograms and k-nearest neighbors (k-NN) have been proposed. However, a systematic comparison of the applicability of these methods has been lacking. We wish to fill this gap by comparing kernel-density-based estimation (KDE) with these two alternatives in carefully designed synthetic test cases. Specifically, we wish to estimate the information-theoretic quantities: entropy, Kullback-Leibler divergence, and mutual information, from sample data. As a reference, the results are compared to closed-form solutions or numerical integrals. We generate samples from distributions of various shapes in dimensions ranging from one to ten. We evaluate the estimators' performance as a function of sample size, distribution characteristics, and chosen hyperparameters. We further compare the required computation time and specific implementation challenges. Notably, k-NN estimation tends to outperform other methods, considering algorithmic implementation, computational efficiency, and estimation accuracy, especially with sufficient data. This study provides valuable insights into the strengths and limitations of the different estimation methods for information-theoretic quantities. It also highlights the significance of considering the characteristics of the data, as well as the targeted information-theoretic quantity when selecting an appropriate estimation technique. These findings will assist scientists and practitioners in choosing the most suitable method, considering their specific application and available data. We have collected the compared estimation methods in a ready-to-use open-source Python 3 toolbox and, thereby, hope to promote the use of information-theoretic quantities by researchers and practitioners to evaluate the information in data and models in various disciplines.
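A compact sketch of the k-NN (Kozachenko-Leonenko) entropy estimator covered by the comparison, checked against the closed-form entropy of a multivariate normal; the sample size and k are illustrative, and the open-source toolbox referenced in the abstract is not reproduced here:

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma, gammaln

def knn_entropy(X, k=4):
    """Kozachenko-Leonenko estimate of differential entropy (in nats)."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    tree = cKDTree(X)
    eps = tree.query(X, k=k + 1)[0][:, -1]                    # distance to k-th neighbor (self excluded)
    log_ball = (d / 2) * np.log(np.pi) - gammaln(d / 2 + 1)   # log volume of the unit d-ball
    return digamma(n) - digamma(k) + log_ball + d * np.mean(np.log(eps))

rng = np.random.default_rng(9)
d = 3
X = rng.normal(size=(5000, d))
analytic = 0.5 * d * np.log(2 * np.pi * np.e)                 # entropy of a standard normal in d dims
print(knn_entropy(X), analytic)
```

The same neighbor-distance machinery underlies the k-NN estimators of Kullback-Leibler divergence and mutual information compared in the study, which is why it scales better with dimension than histogram binning.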

20.
Article in English | MEDLINE | ID: mdl-38794963

ABSTRACT

Computerized adaptive testing for cognitive diagnosis (CD-CAT) achieves remarkable estimation efficiency and accuracy by adaptively selecting and then administering items tailored to each examinee. The process of item selection stands as a pivotal component of a CD-CAT algorithm, with various methods having been developed for binary responses. However, multiple-choice (MC) items, an important item type that allows for the extraction of richer diagnostic information from incorrect answers, have been underemphasized. Currently, the Jensen-Shannon divergence (JSD) index introduced by Yigit et al. (Applied Psychological Measurement, 2019, 43, 388) is the only item selection method exclusively designed for MC items. However, the JSD index requires a large sample to calibrate item parameters, which may be infeasible when there is only a small calibration sample or none at all. To bridge this gap, this study first proposes a nonparametric item selection method for MC items (MC-NPS) by introducing a novel discrimination power measure that quantifies an item's ability to effectively distinguish among different attribute profiles. A Q-optimal procedure for MC items is also developed to improve classification during the initial phase of a CD-CAT algorithm. The effectiveness and efficiency of the two proposed algorithms were confirmed by simulation studies.
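The JSD index mentioned above builds on the Jensen-Shannon divergence between predicted response distributions; a minimal function for the underlying divergence (the two example option-probability vectors are arbitrary placeholders, and the full item selection index from Yigit et al. is not reproduced):

```python
import numpy as np

def jensen_shannon(p, q, eps=1e-12):
    """Jensen-Shannon divergence (in nats) between two discrete distributions."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log(a / b))   # Kullback-Leibler divergence
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical probabilities of choosing each MC option under two attribute profiles.
profile_1 = [0.70, 0.15, 0.10, 0.05]
profile_2 = [0.25, 0.40, 0.20, 0.15]
print(round(jensen_shannon(profile_1, profile_2), 4))
```

Items whose option distributions diverge strongly across attribute profiles are the ones most informative to administer next, which is the intuition both the JSD index and the nonparametric discrimination power exploit.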
