Search | VHL Regional Portal

1.

Survival Analysis Without Sharing of Individual Patient Data by Using a Gaussian Copula.

Bonofiglio, Federico.

Pharm Stat ; 2024 Jul 07.

Article in English | MEDLINE | ID: mdl-38973072

ABSTRACT

Cox regression and Kaplan-Meier estimation are often needed in clinical research, and both require access to individual patient data (IPD). However, IPD cannot always be shared because of privacy or proprietary restrictions, which complicates such estimation. We propose a method that generates pseudodata to replace the IPD by sharing only non-disclosive aggregates, such as IPD marginal moments and a correlation matrix. These aggregates are collected by a central computer and input as parameters to a Gaussian copula (GC) that generates the pseudodata. Survival inferences are computed on the pseudodata as if they were the IPD. Using practical examples, we demonstrate the utility of the method in terms of the amount of IPD inferential content recoverable by the GC. We compare GC to a summary-based meta-analysis and an IPD bootstrap distributed across several centers. Other pseudodata approaches are also considered. In the empirical results, GC approximates the utility of the IPD bootstrap, although it might yield more conservative inferences and might have limitations in subgroup analyses. Overall, GC avoids many legal problems related to IPD privacy or property while enabling approximation of common IPD survival analyses that are otherwise difficult to conduct. Sharing more IPD aggregates than is currently practiced could facilitate "second purpose" research and relax concerns regarding IPD access.
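As an illustration of the kind of survival estimate the abstract says is computed on the pseudodata "as if it were the IPD", here is a minimal Kaplan-Meier sketch in Python. It is not taken from the article: the function name `kaplan_meier` and the input layout (parallel lists of follow-up times and event indicators) are assumptions made for this example, and the same function could be applied to either real or generated records.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times  -- observed follow-up times (event or censoring)
    events -- 1 if the event was observed at that time, 0 if censored
    (Hypothetical helper for illustration; not the article's code.)
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)      # subjects leaving the risk set at each time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d > 0:
            # Product-limit step: multiply by the conditional survival at t.
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        # Events and censorings at t both leave the risk set afterwards.
        at_risk -= leaving[t]
    return curve


if __name__ == "__main__":
    for t, s in kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]):
        print(t, round(s, 3))
```

Censored subjects contribute to the risk set up to their censoring time but never trigger a drop in the curve, which is why the estimator needs record-level data (or pseudodata) rather than a single summary per study.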

2.

Long-term High Resolution Image Dataset of Antarctic Coastal Benthic Fauna.

Marini, Simone; Bonofiglio, Federico; Corgnati, Lorenzo Paolo; Bordone, Andrea; Schiaparelli, Stefano; Peirano, Andrea.

Sci Data ; 9(1): 750, 2022 12 03.

Article in English | MEDLINE | ID: mdl-36463241

ABSTRACT

Antarctica is a remote place: the continent is covered by ice, and its surrounding coastal areas are frozen for the majority of the year. Because of these peculiarities, the observation of underwater organisms is particularly difficult and is further complicated by logistic factors. We present a long-term dataset consisting of 755 images acquired with a non-invasive, autonomous imaging device and encompassing both the Antarctic daylight and dark periods, including the corresponding transition phases. All images have the same field of view, showing the benthic fauna and part of the water column above, including fishes present in the monitored period. All the images were manually annotated after a visual inspection performed by expert biologists. The extended monitoring period and the annotated images make the dataset a valuable benchmark for studying the long-term dynamics of Antarctic underwater fauna, as well as for developing and testing algorithms for automated image analysis focused on the recognition and classification of Antarctic organisms and the automated analysis of their long-term dynamics.

3.

The Hierarchic Treatment of Marine Ecological Information from Spatial Networks of Benthic Platforms.

Aguzzi, Jacopo; Chatzievangelou, Damianos; Francescangeli, Marco; Marini, Simone; Bonofiglio, Federico; Del Rio, Joaquin; Danovaro, Roberto.

Sensors (Basel) ; 20(6), 2020 Mar 21.

Article in English | MEDLINE | ID: mdl-32245204

ABSTRACT

Measuring biodiversity simultaneously in different locations, at different temporal scales, and over wide spatial scales is of strategic importance for improving our understanding of the functioning of marine ecosystems and for the conservation of their biodiversity. Monitoring networks of cabled observatories, along with other docked autonomous systems (e.g., Remotely Operated Vehicles [ROVs], Autonomous Underwater Vehicles [AUVs], and crawlers), are being conceived and established at a spatial scale capable of tracking energy fluxes across benthic and pelagic compartments, as well as across geographic ecotones. At the same time, optoacoustic imaging is undergoing an unprecedented expansion in marine ecological monitoring, enabling the acquisition of new biological and environmental data at an appropriate spatiotemporal scale. At this stage, one of the main problems for an effective application of these technologies is the processing, storage, and treatment of the acquired complex ecological information. Here, we provide a conceptual overview of the technological developments in the multiparametric generation, storage, and automated hierarchic treatment of biological and environmental information required to capture the spatiotemporal complexity of a marine ecosystem. In doing so, we present a pipeline of ecological data acquisition and processing organized in steps that are amenable to automation. We also give an example of the computation of population biomass, community richness, and biodiversity data (as indicators of ecosystem functionality) with an Internet Operated Vehicle (a mobile crawler). Finally, we discuss the software requirements for such automated data processing at the level of cyber-infrastructures, with sensor calibration and control, data banking, and ingestion into large data portals.

Subject(s)

Marine Biology/methods , Artificial Intelligence , Conservation of Natural Resources/methods , Environmental Monitoring/methods
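The abstract of entry 3 mentions computing community richness and biodiversity indicators from crawler observations. The Python sketch below shows what such a computation could look like for a single set of per-taxon counts. It is an assumption-laden illustration, not the authors' pipeline: the abstract does not name a specific biodiversity index, so the Shannon index is used here as a common choice, and the function names and the dictionary input format are hypothetical.

```python
import math


def richness(counts):
    """Number of taxa with at least one observed individual."""
    return sum(1 for c in counts.values() if c > 0)


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over observed taxa.

    counts -- mapping from taxon label to number of individuals counted
    (Assumed index; the abstract does not specify which one is used.)
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum(
        (c / total) * math.log(c / total)
        for c in counts.values()
        if c > 0
    )


if __name__ == "__main__":
    # Hypothetical counts from one imaging time window.
    window = {"taxon_a": 12, "taxon_b": 3, "taxon_c": 0}
    print(richness(window), round(shannon_index(window), 3))
```

Applying these functions per time window and per platform would give the kind of time series of indicators that a hierarchic processing pipeline could aggregate across a network.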

4.

Recovery of original individual person data (IPD) inferences from empirical IPD summaries only: Applications to distributed computing under disclosure constraints.

Bonofiglio, Federico; Schumacher, Martin; Binder, Harald.

Stat Med ; 39(8): 1183-1198, 2020 04 15.

Article in English | MEDLINE | ID: mdl-31944335

ABSTRACT

There are many settings where individual person data (IPD) are not available, due to privacy or technical reasons, and one must work with IPD proxies, such as summary statistics, to approximate original IPD inferences, that is, the results of statistical analyses that would ideally have been performed on individual-level data. For instance, in a distributed computing setting, as implemented in the DataSHIELD software framework, different centers can only share IPD proxies to obtain pooled IPD inferences. Such privacy requirements limit the scope of statistical investigation. For example, it can be challenging to perform between-center random-effect regression models. To increase modeling freedom we propose a method that only uses simple nondisclosive summaries of the original IPD as input, such as empirical marginal moments and correlation matrices, and generates artificial data compatible with those summary features. Specifically, data are generated from a Gaussian copula with marginal and joint components specified by the above summaries. The goal is to reproduce original IPD features in the artificial data, such that original IPD inferences are recovered from the artificial data. In an application example, and through simulations, we show that we can recover estimates of a multivariable IPD random-effect logistic regression, from artificial data generated via the Gaussian copula using the above IPD summaries, suggesting the proposed approach provides a generally applicable strategy for distributed computing settings with data protection constraints.

Subject(s)

Disclosure , Research Design , Computer Security , Data Interpretation, Statistical , Humans , Logistic Models
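The Gaussian copula step described in the abstract of entry 4 can be sketched as follows, using only the Python standard library. This is a simplified illustration under stated assumptions, not the authors' implementation: the marginals are assumed to be exponential and Bernoulli, each fixed by its mean alone; the shared correlation matrix is used directly as the latent normal correlation, which is only an approximation of the correlation on the original scale; and all function names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def cholesky(a):
    """Lower-triangular L with L * L^T = a (a must be positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def exponential_quantile(mean):
    """Inverse CDF of an exponential marginal with the given mean (assumed)."""
    return lambda u: -mean * math.log(1.0 - u)


def bernoulli_quantile(p):
    """Inverse CDF of a binary marginal with P(X = 1) = p (assumed)."""
    return lambda u: 1 if u > 1.0 - p else 0


def gaussian_copula_sample(n, corr, quantiles, seed=0):
    """Draw n pseudo-records whose dependence comes from a Gaussian copula."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    low = cholesky(corr)
    d = len(corr)
    rows = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        # Correlate the independent normals via the Cholesky factor.
        y = [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(d)]
        # Map to uniforms, clamped away from 0 and 1 for numerical safety.
        u = [min(max(std_normal.cdf(v), 1e-12), 1.0 - 1e-12) for v in y]
        # Push each uniform through its marginal inverse CDF.
        rows.append([q(ui) for q, ui in zip(quantiles, u)])
    return rows


if __name__ == "__main__":
    pseudo = gaussian_copula_sample(
        5, [[1.0, 0.5], [0.5, 1.0]],
        [exponential_quantile(2.0), bernoulli_quantile(0.3)],
    )
    for row in pseudo:
        print(row)
```

Only the marginal means and the correlation matrix enter the generator, which mirrors the abstract's point that simple nondisclosive summaries are the sole inputs. A fuller version would match higher marginal moments as well.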

5.

Meta-analysis for aggregated survival data with competing risks: a parametric approach using cumulative incidence functions.

Bonofiglio, Federico; Beyersmann, Jan; Schumacher, Martin; Koller, Michael; Schwarzer, Guido.

Res Synth Methods ; 7(3): 282-93, 2016 Sep.

Article in English | MEDLINE | ID: mdl-26387882

ABSTRACT

Meta-analysis of a survival endpoint is typically based on the pooling of hazard ratios (HRs). If competing risks occur, the HRs may no longer translate into changes in survival probability. The cumulative incidence functions (CIFs), the expected proportion of cause-specific events over time, re-connect the cause-specific hazards (CSHs) to the probability of each event type. We use CIF ratios to measure the treatment effect on each event type. To retrieve information from aggregated, typically poorly reported, competing risks data, we assume constant CSHs. Next, we develop methods to pool CIF ratios across studies. The procedure also computes pooled HRs and checks the influence of follow-up time on the analysis. We apply the method to a medical example, showing that follow-up duration is relevant both for pooled cause-specific HRs and CIF ratios. Moreover, if all-cause hazard and follow-up time are large enough, CIF ratios may reveal additional information about the effect of treatment on the cumulative probability of each event type. Finally, to improve the usefulness of such analysis, better reporting of competing risks data is needed. Copyright © 2015 John Wiley & Sons, Ltd.

Subject(s)

Meta-Analysis as Topic , Survival Analysis , Algorithms , Bayes Theorem , Clinical Trials as Topic , Computer Simulation , Data Collection , Humans , Incidence , Probability , Programming Languages , Proportional Hazards Models , Reproducibility of Results , Research Design , Risk Factors , Statistics as Topic
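Under the constant cause-specific hazards assumption stated in the abstract of entry 5, the cumulative incidence function for cause k has the closed form CIF_k(t) = (h_k / h) * (1 - exp(-h * t)), where h is the sum of all cause-specific hazards. The Python sketch below evaluates this formula and a CIF ratio between two arms. The function names, the dictionary layout, and the hazard values are hypothetical and chosen only for illustration; the pooling across studies described in the abstract is not reproduced here.

```python
import math


def cif(t, hazards, cause):
    """Cumulative incidence of `cause` by time t under constant CSHs.

    hazards -- mapping from cause label to its constant cause-specific hazard
    """
    total = sum(hazards.values())
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for small x.
    return hazards[cause] / total * -math.expm1(-total * t)


def cif_ratio(t, hazards_treated, hazards_control, cause):
    """Ratio of cumulative incidences (treated over control) at time t."""
    return cif(t, hazards_treated, cause) / cif(t, hazards_control, cause)


if __name__ == "__main__":
    # Hypothetical hazards: treatment halves the hazard of cause "a" only.
    treated = {"a": 0.05, "b": 0.10}
    control = {"a": 0.10, "b": 0.10}
    for t in (0.001, 1.0, 10.0, 1000.0):
        print(t, cif_ratio(t, treated, control, "a"))
```

In this example the cause-specific HR for cause "a" is 0.5 at all times, while the CIF ratio starts near 0.5 and drifts toward (0.05/0.15)/(0.10/0.20) = 2/3 as follow-up grows. This is consistent with the abstract's remark that follow-up duration matters for CIF ratios.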
