Search | VHL Regional Portal

1.

Generalized Confidence Intervals for Ratios of Standard Deviations Based on Log-Normal Distribution when Times Follow Weibull Distributions.

Chen, Pei-Fu; Dexter, Franklin.

J Med Syst ; 48(1): 58, 2024 Jun 01.

Article in English | MEDLINE | ID: mdl-38822876

ABSTRACT

Modern anesthetic drugs ensure the efficacy of general anesthesia. Goals include reducing variability in surgical, tracheal extubation, post-anesthesia care unit, or intraoperative response recovery times. Generalized confidence intervals based on the log-normal distribution compare variability between groups, specifically ratios of standard deviations. The alternative statistical approaches, performing robust variance comparison tests, give P-values, not point estimates nor confidence intervals for the ratios of the standard deviations. We performed Monte-Carlo simulations to learn what happens to confidence intervals for ratios of standard deviations of anesthesia-associated times when analyses are based on the log-normal, but the true distributions are Weibull. We used simulation conditions comparable to meta-analyses of most randomized trials in anesthesia, n ≈ 25 and coefficients of variation ≈ 0.30 . The estimates of the ratios of standard deviations were positively biased, but slightly, the ratios being 0.11% to 0.33% greater than nominal. In contrast, the 95% confidence intervals were very wide (i.e., > 95% of P ≥ 0.05). Although substantive inferentially, the differences in the confidence limits were small from a clinical or managerial perspective, with a maximum absolute difference in ratios of 0.016. Thus, P < 0.05 is reliable, but investigators should plan for Type II errors at greater than nominal rates.

Subject(s)

Monte Carlo Method , Humans , Confidence Intervals , Anesthesia, General , Time Factors , Models, Statistical

2.

Forecasting the onset of depression with limited baseline data only: A comparison of a person-specific and a multilevel modeling based exponentially weighted moving average approach.

Schat, Evelien; Tuerlinckx, Francis; Schreuder, Marieke J; De Ketelaere, Bart; Ceulemans, Eva.

Psychol Assess ; 36(6-7): 379-394, 2024.

Article in English | MEDLINE | ID: mdl-38829348

ABSTRACT

The onset of depressive episodes is preceded by changes in mean levels of affective experiences, which can be detected using the exponentially weighted moving average procedure on experience sampling method (ESM) data. Applying the exponentially weighted moving average procedure requires sufficient baseline data from the person under study in healthy times, which is needed to calculate a control limit for monitoring incoming ESM data. It is, however, not trivial to obtain sufficient baseline data from a single person. We therefore investigate whether historical ESM data from healthy individuals can help establish an adequate control limit for the person under study via multilevel modeling. Specifically, we focus on the case in which there is very little baseline data available of the person under study (i.e., up to 7 days). This multilevel approach is compared with the traditional, person-specific approach, where estimates are obtained using the person's available baseline data. Predictive performance in terms of Matthews correlation coefficient did not differ much between the approaches; however, the multilevel approach was more sensitive at detecting mean changes. This implies that for low-cost and nonharmful interventions, the multilevel approach may prove particularly beneficial. (PsycInfo Database Record (c) 2024 APA, all rights reserved).

Subject(s)

Ecological Momentary Assessment , Multilevel Analysis , Humans , Adult , Female , Male , Depression/psychology , Depression/diagnosis , Models, Statistical , Young Adult , Middle Aged

3.

Estimating classification consistency of machine learning models for screening measures.

Gonzalez, Oscar; Georgeson, A R; Pelham, William E.

Psychol Assess ; 36(6-7): 395-406, 2024.

Article in English | MEDLINE | ID: mdl-38829349

ABSTRACT

This article illustrates novel quantitative methods to estimate classification consistency in machine learning models used for screening measures. Screening measures are used in psychology and medicine to classify individuals into diagnostic classifications. In addition to achieving high accuracy, it is ideal for the screening process to have high classification consistency, which means that respondents would be classified into the same group every time if the assessment was repeated. Although machine learning models are increasingly being used to predict a screening classification based on individual item responses, methods to describe the classification consistency of machine learning models have not yet been developed. This article addresses this gap by describing methods to estimate classification inconsistency in machine learning models arising from two different sources: sampling error during model fitting and measurement error in the item responses. These methods use data resampling techniques such as the bootstrap and Monte Carlo sampling. These methods are illustrated using three empirical examples predicting a health condition/diagnosis from item responses. R code is provided to facilitate the implementation of the methods. This article highlights the importance of considering classification consistency alongside accuracy when studying screening measures and provides the tools and guidance necessary for applied researchers to obtain classification consistency indices in their machine learning research on diagnostic assessments. (PsycInfo Database Record (c) 2024 APA, all rights reserved).

Subject(s)

Machine Learning , Humans , Models, Statistical , Mass Screening

4.

Using a Leroux-prior-based conditional autoregression-based strategy to map the short-term association between temperature and bacillary dysentery and its attributable burden in China.

Wang, Jianping; Lu, Kai; Wei, Yuxin; Wang, Wei; Zhou, Yongming; Zeng, Jing; Deng, Ying; Zhang, Tao; Yin, Fei; Ma, Yue; Shui, Tiejun.

Front Public Health ; 12: 1297635, 2024.

Article in English | MEDLINE | ID: mdl-38827625

ABSTRACT

Background: In China, bacillary dysentery (BD) is the third most frequently reported infectious disease, with the greatest annual incidence rate of 38.03 cases per 10,000 person-years. It is well acknowledged that temperature is associated with BD and the previous studies of temperature-BD association in different provinces of China present a considerable heterogeneity, which may lead to an inaccurate estimation for a region-specific association and incorrect attributable burdens. Meanwhile, the common methods for multi-city studies, such as stratified strategy and meta-analysis, have their own limitations in handling the heterogeneity. Therefore, it is necessary to adopt an appropriate method considering the spatial autocorrelation to accurately characterize the spatial distribution of temperature-BD association and obtain its attributable burden in 31 provinces of China. Methods: A novel three-stage strategy was adopted. In the first stage, we used the generalized additive model (GAM) model to independently estimate the province-specific association between monthly average temperature (MAT) and BD. In the second stage, the Leroux-prior-based conditional autoregression (LCAR) was used to spatially smooth the association and characterize its spatial distribution. In the third stage, we calculate the attribute BD cases based on a more accurate estimation of association. Results: The smoothed association curves generally show a higher relative risk with a higher MAT, but some of them have an inverted "V" shape. Meanwhile, the spatial distribution of association indicates that western provinces have a higher relative risk of MAT than eastern provinces with 0.695 and 0.645 on average, respectively. The maximum and minimum total attributable number of cases are 224,257 in Beijing and 88,906 in Hainan, respectively. The average values of each province in the eastern, western, and central areas are approximately 40,991, 42,025, and 26,947, respectively. Conclusion: Based on the LCAR-based three-stage strategy, we can obtain a more accurate spatial distribution of temperature-BD association and attributable BD cases. Furthermore, the results can help relevant institutions to prevent and control the epidemic of BD efficiently.

Subject(s)

Dysentery, Bacillary , Temperature , China/epidemiology , Humans , Dysentery, Bacillary/epidemiology , Incidence , Spatial Analysis , Models, Statistical

5.

The performance of restricted AIC for irregular histogram models.

Gokmen, Sahika; Lyhagen, Johan.

PLoS One ; 19(5): e0289822, 2024.

Article in English | MEDLINE | ID: mdl-38691561

ABSTRACT

Histograms are frequently used to perform a preliminary study of data, such as finding outliers and determining the distribution's shape. It is common knowledge that choosing an appropriate number of bins is crucial to revealing the right information. It's also well known that using bins of different widths, which called unequal bin width, is preferable to using bins of equal width if the bin width is selected carefully. However this is a much difficult issue. In this research, a novel approach to AIC for histograms with unequal bin widths was proposed. We demonstrate the advantage of the suggested approach in comparison to others using both extensive Monte Carlo simulations and empirical examples.

Subject(s)

Monte Carlo Method , Models, Statistical , Computer Simulation , Algorithms , Humans

6.

Uncertainty Computation at Finite Distance in Nonlinear Mixed Effects Models-a New Method Based on Metropolis-Hastings Algorithm.

Guhl, Mélanie; Bertrand, Julie; Fayette, Lucie; Mercier, François; Comets, Emmanuelle.

AAPS J ; 26(3): 53, 2024 Apr 23.

Article in English | MEDLINE | ID: mdl-38722435

ABSTRACT

The standard errors (SE) of the maximum likelihood estimates (MLE) of the population parameter vector in nonlinear mixed effect models (NLMEM) are usually estimated using the inverse of the Fisher information matrix (FIM). However, at a finite distance, i.e. far from the asymptotic, the FIM can underestimate the SE of NLMEM parameters. Alternatively, the standard deviation of the posterior distribution, obtained in Stan via the Hamiltonian Monte Carlo algorithm, has been shown to be a proxy for the SE, since, under some regularity conditions on the prior, the limiting distributions of the MLE and of the maximum a posterior estimator in a Bayesian framework are equivalent. In this work, we develop a similar method using the Metropolis-Hastings (MH) algorithm in parallel to the stochastic approximation expectation maximisation (SAEM) algorithm, implemented in the saemix R package. We assess this method on different simulation scenarios and data from a real case study, comparing it to other SE computation methods. The simulation study shows that our method improves the results obtained with frequentist methods at finite distance. However, it performed poorly in a scenario with the high variability and correlations observed in the real case study, stressing the need for calibration.

Subject(s)

Algorithms , Computer Simulation , Monte Carlo Method , Nonlinear Dynamics , Uncertainty , Likelihood Functions , Bayes Theorem , Humans , Models, Statistical

7.

Causal inference for time-to-event data with a cured subpopulation.

Wang, Yi; Deng, Yuhao; Zhou, Xiao-Hua.

Biometrics ; 80(2)2024 Mar 27.

Article in English | MEDLINE | ID: mdl-38708764

ABSTRACT

When studying the treatment effect on time-to-event outcomes, it is common that some individuals never experience failure events, which suggests that they have been cured. However, the cure status may not be observed due to censoring which makes it challenging to define treatment effects. Current methods mainly focus on estimating model parameters in various cure models, ultimately leading to a lack of causal interpretations. To address this issue, we propose 2 causal estimands, the timewise risk difference and mean survival time difference, in the always-uncured based on principal stratification as a complement to the treatment effect on cure rates. These estimands allow us to study the treatment effects on failure times in the always-uncured subpopulation. We show the identifiability using a substitutional variable for the potential cure status under ignorable treatment assignment mechanism, these 2 estimands are identifiable. We also provide estimation methods using mixture cure models. We applied our approach to an observational study that compared the leukemia-free survival rates of different transplantation types to cure acute lymphoblastic leukemia. Our proposed approach yielded insightful results that can be used to inform future treatment decisions.

Subject(s)

Models, Statistical , Precursor Cell Lymphoblastic Leukemia-Lymphoma , Humans , Precursor Cell Lymphoblastic Leukemia-Lymphoma/mortality , Precursor Cell Lymphoblastic Leukemia-Lymphoma/therapy , Precursor Cell Lymphoblastic Leukemia-Lymphoma/drug therapy , Causality , Biometry/methods , Treatment Outcome , Computer Simulation , Disease-Free Survival , Survival Analysis

8.

Identifying temporal pathways using biomarkers in the presence of latent non-Gaussian components.

Xie, Shanghong; Zeng, Donglin; Wang, Yuanjia.

Biometrics ; 80(2)2024 Mar 27.

Article in English | MEDLINE | ID: mdl-38708763

ABSTRACT

Time-series data collected from a network of random variables are useful for identifying temporal pathways among the network nodes. Observed measurements may contain multiple sources of signals and noises, including Gaussian signals of interest and non-Gaussian noises, including artifacts, structured noise, and other unobserved factors (eg, genetic risk factors, disease susceptibility). Existing methods, including vector autoregression (VAR) and dynamic causal modeling do not account for unobserved non-Gaussian components. Furthermore, existing methods cannot effectively distinguish contemporaneous relationships from temporal relations. In this work, we propose a novel method to identify latent temporal pathways using time-series biomarker data collected from multiple subjects. The model adjusts for the non-Gaussian components and separates the temporal network from the contemporaneous network. Specifically, an independent component analysis (ICA) is used to extract the unobserved non-Gaussian components, and residuals are used to estimate the contemporaneous and temporal networks among the node variables based on method of moments. The algorithm is fast and can easily scale up. We derive the identifiability and the asymptotic properties of the temporal and contemporaneous networks. We demonstrate superior performance of our method by extensive simulations and an application to a study of attention-deficit/hyperactivity disorder (ADHD), where we analyze the temporal relationships between brain regional biomarkers. We find that temporal network edges were across different brain regions, while most contemporaneous network edges were bilateral between the same regions and belong to a subset of the functional connectivity network.

Subject(s)

Algorithms , Biomarkers , Computer Simulation , Models, Statistical , Humans , Biomarkers/analysis , Normal Distribution , Attention Deficit Disorder with Hyperactivity , Time Factors , Biometry/methods

9.

Using Bayesian statistics in confirmatory clinical trials in the regulatory setting: a tutorial review.

Lee, Se Yoon.

BMC Med Res Methodol ; 24(1): 110, 2024 May 07.

Article in English | MEDLINE | ID: mdl-38714936

ABSTRACT

Bayesian statistics plays a pivotal role in advancing medical science by enabling healthcare companies, regulators, and stakeholders to assess the safety and efficacy of new treatments, interventions, and medical procedures. The Bayesian framework offers a unique advantage over the classical framework, especially when incorporating prior information into a new trial with quality external data, such as historical data or another source of co-data. In recent years, there has been a significant increase in regulatory submissions using Bayesian statistics due to its flexibility and ability to provide valuable insights for decision-making, addressing the modern complexity of clinical trials where frequentist trials are inadequate. For regulatory submissions, companies often need to consider the frequentist operating characteristics of the Bayesian analysis strategy, regardless of the design complexity. In particular, the focus is on the frequentist type I error rate and power for all realistic alternatives. This tutorial review aims to provide a comprehensive overview of the use of Bayesian statistics in sample size determination, control of type I error rate, multiplicity adjustments, external data borrowing, etc., in the regulatory environment of clinical trials. Fundamental concepts of Bayesian sample size determination and illustrative examples are provided to serve as a valuable resource for researchers, clinicians, and statisticians seeking to develop more complex and innovative designs.

Subject(s)

Bayes Theorem , Clinical Trials as Topic , Humans , Clinical Trials as Topic/methods , Clinical Trials as Topic/statistics & numerical data , Research Design/standards , Sample Size , Data Interpretation, Statistical , Models, Statistical

10.

The development of a road safety policy index and its application in evaluating the effects of road safety policy.

Elvik, Rune.

Accid Anal Prev ; 202: 107612, 2024 Jul.

Article in English | MEDLINE | ID: mdl-38703590

ABSTRACT

The paper presents an exploratory study of a road safety policy index developed for Norway. The index consists of ten road safety measures for which data on their use from 1980 to 2021 are available. The ten measures were combined into an index which had an initial value of 50 in 1980 and increased to a value of 185 in 2021. To assess the application of the index in evaluating the effects of road safety policy, negative binomial regression models and multivariate time series models were developed for traffic fatalities, fatalities and serious injuries, and all injuries. The coefficient for the policy index was negative, indicating the road safety policy has contributed to reducing the number of fatalities and injuries. The size of this contribution can be estimated by means of at least three estimators that do not always produce identical values. There is little doubt about the sign of the relationship: a stronger road safety policy (as indicated by index values) is associated with a larger decline in fatalities and injuries. A precise quantification is, however, not possible. Different estimators of effect, all of which can be regarded as plausible, yield different results.

Subject(s)

Accidents, Traffic , Safety , Accidents, Traffic/mortality , Accidents, Traffic/prevention & control , Accidents, Traffic/statistics & numerical data , Humans , Norway , Wounds and Injuries/prevention & control , Wounds and Injuries/mortality , Wounds and Injuries/epidemiology , Public Policy , Models, Statistical , Regression Analysis , Automobile Driving/legislation & jurisprudence , Automobile Driving/statistics & numerical data

11.

On the sensitivity of centrality metrics.

Cavallaro, Lucia; De Meo, Pasquale; Fiumara, Giacomo; Liotta, Antonio.

PLoS One ; 19(5): e0299255, 2024.

Article in English | MEDLINE | ID: mdl-38722923

ABSTRACT

Despite the huge importance that the centrality metrics have in understanding the topology of a network, too little is known about the effects that small alterations in the topology of the input graph induce in the norm of the vector that stores the node centralities. If so, then it could be possible to avoid re-calculating the vector of centrality metrics if some minimal changes occur in the network topology, which would allow for significant computational savings. Hence, after formalising the notion of centrality, three of the most basic metrics were herein considered (i.e., Degree, Eigenvector, and Katz centrality). To perform the simulations, two probabilistic failure models were used to describe alterations in network topology: Uniform (i.e., all nodes can be independently deleted from the network with a fixed probability) and Best Connected (i.e., the probability a node is removed depends on its degree). Our analysis suggests that, in the case of degree, small variations in the topology of the input graph determine small variations in Degree centrality, independently of the topological features of the input graph; conversely, both Eigenvector and Katz centralities can be extremely sensitive to changes in the topology of the input graph. In other words, if the input graph has some specific features, even small changes in the topology of the input graph can have catastrophic effects on the Eigenvector or Katz centrality.

Subject(s)

Algorithms , Computer Simulation , Models, Theoretical , Models, Statistical , Probability

12.

Developing Bayesian EWMA chart for change detection in the shape parameter of Inverse Gaussian process.

Javed, Amara; Abbas, Tahir; Abbas, Nasir.

PLoS One ; 19(5): e0301259, 2024.

Article in English | MEDLINE | ID: mdl-38709733

ABSTRACT

Bayesian Control charts are emerging as the most efficient statistical tools for monitoring manufacturing processes and providing effective control over process variability. The Bayesian approach is particularly suitable for addressing parametric uncertainty in the manufacturing industry. In this study, we determine the monitoring threshold for the shape parameter of the Inverse Gaussian distribution (IGD) and design different exponentially-weighted-moving-average (EWMA) control charts based on different loss functions (LFs). The impact of hyperparameters is investigated on Bayes estimates (BEs) and posterior risks (PRs). The performance measures such as average run length (ARL), standard deviation of run length (SDRL), and median of run length (MRL) are employed to evaluate the suggested approach. The designed Bayesian charts are evaluated for different settings of smoothing constant of the EWMA chart, different sample sizes, and pre-specified false alarm rates. The simulative study demonstrates the effectiveness of the suggested Bayesian method-based EWMA charts as compared to the conventional classical setup-based EWMA charts. The proposed techniques of EWMA charts are highly efficient in detecting shifts in the shape parameter and outperform their classical counterpart in detecting faults quickly. The proposed technique is also applied to real-data case studies from the aerospace manufacturing industry. The quality characteristic of interest was selected as the monthly industrial production index of aircraft from January 1980 to December 2022. The real-data-based findings also validate the conclusions based on the simulative results.

Subject(s)

Bayes Theorem , Normal Distribution , Algorithms , Humans , Models, Statistical

13.

A joint penalized spline smoothing model for the number of positive and negative COVID-19 tests.

De Witte, Dries; Abad, Ariel Alonso; Neyens, Thomas; Verbeke, Geert; Molenberghs, Geert.

PLoS One ; 19(5): e0303254, 2024.

Article in English | MEDLINE | ID: mdl-38709776

ABSTRACT

One of the key tools to understand and reduce the spread of the SARS-CoV-2 virus is testing. The total number of tests, the number of positive tests, the number of negative tests, and the positivity rate are interconnected indicators and vary with time. To better understand the relationship between these indicators, against the background of an evolving pandemic, the association between the number of positive tests and the number of negative tests is studied using a joint modeling approach. All countries in the European Union, Switzerland, the United Kingdom, and Norway are included in the analysis. We propose a joint penalized spline model in which the penalized spline is reparameterized as a linear mixed model. The model allows for flexible trajectories by smoothing the country-specific deviations from the overall penalized spline and accounts for heteroscedasticity by allowing the autocorrelation parameters and residual variances to vary among countries. The association between the number of positive tests and the number of negative tests is derived from the joint distribution for the random intercepts and slopes. The correlation between the random intercepts and the correlation between the random slopes were both positive. This suggests that, when countries increase their testing capacity, both the number of positive tests and negative tests will increase. A significant correlation was found between the random intercepts, but the correlation between the random slopes was not significant due to a wide credible interval.

Subject(s)

COVID-19 Testing , COVID-19 , SARS-CoV-2 , Humans , COVID-19/epidemiology , COVID-19/virology , SARS-CoV-2/isolation & purification , United Kingdom/epidemiology , COVID-19 Testing/methods , Norway/epidemiology , Models, Statistical , Switzerland/epidemiology , Pandemics , European Union

14.

Estimating causal effects from panel data with dynamic multivariate panel models.

Helske, Jouni; Tikka, Santtu.

Adv Life Course Res ; 60: 100617, 2024 Jun.

Article in English | MEDLINE | ID: mdl-38759570

ABSTRACT

Panel data are ubiquitous in scientific fields such as social sciences. Various modeling approaches have been presented for observational causal inference based on such data. Existing approaches typically impose restrictive assumptions on the data-generating process such as Gaussian responses or time-invariant effects, or they can only consider short-term causal effects. To surmount these restrictions, we present the dynamic multivariate panel model (DMPM) that supports time-varying, time-invariant, and individual-specific effects, multiple responses across a wide variety of distributions, and arbitrary dependency structures of lagged responses of any order. We formally demonstrate how DMPM facilitates causal inference within the structural causal modeling framework and we take a Bayesian approach for the estimation of the posterior distributions of the model parameters and causal effects of interest. We demonstrate the use of DMPM by applying the approach to both real and synthetic data.

Subject(s)

Bayes Theorem , Causality , Models, Statistical , Humans , Multivariate Analysis

15.

Statistical design and analysis of controlled human malaria infection trials.

Tian, Xiaowen; Janes, Holly E; Kublin, James G.

Malar J ; 23(1): 133, 2024 May 03.

Article in English | MEDLINE | ID: mdl-38702775

ABSTRACT

BACKGROUND: Malaria is a potentially life-threatening disease caused by Plasmodium protozoa transmitted by infected Anopheles mosquitoes. Controlled human malaria infection (CHMI) trials are used to assess the efficacy of interventions for malaria elimination. The operating characteristics of statistical methods for assessing the ability of interventions to protect individuals from malaria is uncertain in small CHMI studies. This paper presents simulation studies comparing the performance of a variety of statistical methods for assessing efficacy of intervention in CHMI trials. METHODS: Two types of CHMI designs were investigated: the commonly used single high-dose design (SHD) and the repeated low-dose design (RLD), motivated by simian immunodeficiency virus (SIV) challenge studies. In the context of SHD, the primary efficacy endpoint is typically time to infection. Using a continuous time survival model, five statistical tests for assessing the extent to which an intervention confers partial or full protection under single dose CHMI designs were evaluated. For RLD, the primary efficacy endpoint is typically the binary infection status after a specific number of challenges. A discrete time survival model was used to study the characteristics of RLD versus SHD challenge studies. RESULTS: In a SHD study with the continuous time survival model, log-rank test and t-test are the most powerful and provide more interpretable results than Wilcoxon rank-sum tests and Lachenbruch tests, while the likelihood ratio test is uniformly most powerful but requires knowledge of the underlying probability model. In the discrete time survival model setting, SHDs are more powerful for assessing the efficacy of an intervention to prevent infection than RLDs. However, additional information can be inferred from RLD challenge designs, particularly using a likelihood ratio test. CONCLUSIONS: Different statistical methods can be used to analyze controlled human malaria infection (CHMI) experiments, and the choice of method depends on the specific characteristics of the experiment, such as the sample size allocation between the control and intervention groups, and the nature of the intervention. The simulation results provide guidance for the trade off in statistical power when choosing between different statistical methods and study designs.

Subject(s)

Malaria , Humans , Malaria/prevention & control , Animals , Research Design , Controlled Clinical Trials as Topic , Models, Statistical , Anopheles/parasitology

16.

A new and unified method for regression analysis of interval-censored failure time data under semiparametric transformation models with missing covariates.

Lou, Yichen; Ma, Yuqing; Du, Mingyue.

Stat Med ; 43(11): 2062-2082, 2024 May 20.

Article in English | MEDLINE | ID: mdl-38757695

ABSTRACT

This paper discusses regression analysis of interval-censored failure time data arising from semiparametric transformation models in the presence of missing covariates. Although some methods have been developed for the problem, they either apply only to limited situations or may have some computational issues. Corresponding to these, we propose a new and unified two-step inference procedure that can be easily implemented using the existing or standard software. The proposed method makes use of a set of working models to extract partial information from incomplete observations and yields a consistent estimator of regression parameters assuming missing at random. An extensive simulation study is conducted and indicates that it performs well in practical situations. Finally, we apply the proposed approach to an Alzheimer's Disease study that motivated this study.

Subject(s)

Alzheimer Disease , Computer Simulation , Models, Statistical , Humans , Regression Analysis , Data Interpretation, Statistical

17.

Flexible model-based non-negative matrix factorization with application to mutational signatures.

Laursen, Ragnhild; Maretty, Lasse; Hobolth, Asger.

Stat Appl Genet Mol Biol ; 23(1)2024 Jan 01.

Article in English | MEDLINE | ID: mdl-38753402

ABSTRACT

Somatic mutations in cancer can be viewed as a mixture distribution of several mutational signatures, which can be inferred using non-negative matrix factorization (NMF). Mutational signatures have previously been parametrized using either simple mono-nucleotide interaction models or general tri-nucleotide interaction models. We describe a flexible and novel framework for identifying biologically plausible parametrizations of mutational signatures, and in particular for estimating di-nucleotide interaction models. Our novel estimation procedure is based on the expectation-maximization (EM) algorithm and regression in the log-linear quasi-Poisson model. We show that di-nucleotide interaction signatures are statistically stable and sufficiently complex to fit the mutational patterns. Di-nucleotide interaction signatures often strike the right balance between appropriately fitting the data and avoiding over-fitting. They provide a better fit to data and are biologically more plausible than mono-nucleotide interaction signatures, and the parametrization is more stable than the parameter-rich tri-nucleotide interaction signatures. We illustrate our framework in a large simulation study where we compare to state of the art methods, and show results for three data sets of somatic mutation counts from patients with cancer in the breast, Liver and urinary tract.

Subject(s)

Algorithms , Mutation , Neoplasms , Humans , Neoplasms/genetics , Models, Genetic , Computer Simulation , Models, Statistical

18.

Direct and indirect treatment effects in the presence of semicompeting risks.

Deng, Yuhao; Wang, Yi; Zhou, Xiao-Hua.

Biometrics ; 80(2)2024 Mar 27.

Article in English | MEDLINE | ID: mdl-38742906

ABSTRACT

Semicompeting risks refer to the phenomenon that the terminal event (such as death) can censor the nonterminal event (such as disease progression) but not vice versa. The treatment effect on the terminal event can be delivered either directly following the treatment or indirectly through the nonterminal event. We consider 2 strategies to decompose the total effect into a direct effect and an indirect effect under the framework of mediation analysis in completely randomized experiments by adjusting the prevalence and hazard of nonterminal events, respectively. They require slightly different assumptions on cross-world quantities to achieve identifiability. We establish asymptotic properties for the estimated counterfactual cumulative incidences and decomposed treatment effects. We illustrate the subtle difference between these 2 decompositions through simulation studies and two real-data applications in the Supplementary Materials.

Subject(s)

Computer Simulation , Humans , Models, Statistical , Risk , Randomized Controlled Trials as Topic/statistics & numerical data , Mediation Analysis , Treatment Outcome , Biometry/methods

19.

Testing conditional quantile independence with functional covariate.

Feng, Yongzhen; Li, Jie; Song, Xiaojun.

Biometrics ; 80(2)2024 Mar 27.

Article in English | MEDLINE | ID: mdl-38742907

ABSTRACT

We propose a new non-parametric conditional independence test for a scalar response and a functional covariate over a continuum of quantile levels. We build a Cramer-von Mises type test statistic based on an empirical process indexed by random projections of the functional covariate, effectively avoiding the "curse of dimensionality" under the projected hypothesis, which is almost surely equivalent to the null hypothesis. The asymptotic null distribution of the proposed test statistic is obtained under some mild assumptions. The asymptotic global and local power properties of our test statistic are then investigated. We specifically demonstrate that the statistic is able to detect a broad class of local alternatives converging to the null at the parametric rate. Additionally, we recommend a simple multiplier bootstrap approach for estimating the critical values. The finite-sample performance of our statistic is examined through several Monte Carlo simulation experiments. Finally, an analysis of an EEG data set is used to show the utility and versatility of our proposed test statistic.

Subject(s)

Computer Simulation , Models, Statistical , Monte Carlo Method , Humans , Electroencephalography/statistics & numerical data , Data Interpretation, Statistical , Biometry/methods , Statistics, Nonparametric

20.

Weibull parametric model for survival analysis in women with endometrial cancer using clinical and T2-weighted MRI radiomic features.

Li, Xingfeng; Marcus, Diana; Russell, James; Aboagye, Eric O; Ellis, Laura Burney; Sheeka, Alexander; Park, Won-Ho Edward; Bharwani, Nishat; Ghaem-Maghami, Sadaf; Rockall, Andrea G.

BMC Med Res Methodol ; 24(1): 107, 2024 May 09.

Article in English | MEDLINE | ID: mdl-38724889

ABSTRACT

BACKGROUND: Semiparametric survival analysis such as the Cox proportional hazards (CPH) regression model is commonly employed in endometrial cancer (EC) study. Although this method does not need to know the baseline hazard function, it cannot estimate event time ratio (ETR) which measures relative increase or decrease in survival time. To estimate ETR, the Weibull parametric model needs to be applied. The objective of this study is to develop and evaluate the Weibull parametric model for EC patients' survival analysis. METHODS: Training (n = 411) and testing (n = 80) datasets from EC patients were retrospectively collected to investigate this problem. To determine the optimal CPH model from the training dataset, a bi-level model selection with minimax concave penalty was applied to select clinical and radiomic features which were obtained from T2-weighted MRI images. After the CPH model was built, model diagnostic was carried out to evaluate the proportional hazard assumption with Schoenfeld test. Survival data were fitted into a Weibull model and hazard ratio (HR) and ETR were calculated from the model. Brier score and time-dependent area under the receiver operating characteristic curve (AUC) were compared between CPH and Weibull models. Goodness of the fit was measured with Kolmogorov-Smirnov (KS) statistic. RESULTS: Although the proportional hazard assumption holds for fitting EC survival data, the linearity of the model assumption is suspicious as there are trends in the age and cancer grade predictors. The result also showed that there was a significant relation between the EC survival data and the Weibull distribution. Finally, it showed that Weibull model has a larger AUC value than CPH model in general, and it also has smaller Brier score value for EC survival prediction using both training and testing datasets, suggesting that it is more accurate to use the Weibull model for EC survival analysis. CONCLUSIONS: The Weibull parametric model for EC survival analysis allows simultaneous characterization of the treatment effect in terms of the hazard ratio and the event time ratio (ETR), which is likely to be better understood. This method can be extended to study progression free survival and disease specific survival. TRIAL REGISTRATION: ClinicalTrials.gov NCT03543215, https://clinicaltrials.gov/ , date of registration: 30th June 2017.

Subject(s)

Endometrial Neoplasms , Magnetic Resonance Imaging , Proportional Hazards Models , Humans , Female , Endometrial Neoplasms/mortality , Endometrial Neoplasms/diagnostic imaging , Middle Aged , Magnetic Resonance Imaging/methods , Retrospective Studies , Survival Analysis , Aged , ROC Curve , Adult , Models, Statistical , Radiomics

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

SEND TO:

SELECTION OF CITATIONS

SEARCH DETAIL