Results 1 - 10 of 10
1.
Brief Bioinform ; 25(4), 2024 May 23.
Article in English | MEDLINE | ID: mdl-38836701

ABSTRACT

Biomedical data are generated and collected from various sources, including medical imaging, laboratory tests and genome sequencing. Sharing these data for research can help address unmet health needs, contribute to scientific breakthroughs, accelerate the development of more effective treatments and inform public health policy. Due to the potential sensitivity of such data, however, privacy concerns have led to policies that restrict data sharing. In addition, sharing sensitive data requires a secure and robust infrastructure with appropriate storage solutions. Here, we examine and compare the centralized and federated data sharing models through the prism of five large-scale and real-world use cases of strategic significance within the European data sharing landscape: the French Health Data Hub, the BBMRI-ERIC Colorectal Cancer Cohort, the federated European Genome-phenome Archive, the Observational Medical Outcomes Partnership/OHDSI network and the EBRAINS Medical Informatics Platform. Our analysis indicates that centralized models facilitate data linkage, harmonization and interoperability, while federated models facilitate scaling up and legal compliance, as the data typically reside on the data generator's premises, allowing for better control of how data are shared. This comparative study thus offers guidance on the selection of the most appropriate sharing strategy for sensitive datasets and provides key insights for informed decision-making in data sharing efforts.


Subjects
Biological Science Disciplines , Information Dissemination , Humans , Medical Informatics/methods
2.
Ann Pathol ; 42(2): 119-128, 2022 Mar.
Article in French | MEDLINE | ID: mdl-35012784

ABSTRACT

The French Society of Pathology (SFP) organized its first data challenge in 2020 with the help of the Health Data Hub (HDH). Organizing the event first involved collecting almost 5000 slides of uterine cervical biopsies from 20 pathology centers. Once it had been verified that patients had not objected to the inclusion of their slides in the project, the slides were anonymized, digitized and annotated by expert pathologists, then uploaded to a data challenge platform open to competitors around the world. Competing teams had to develop algorithms that could distinguish among four diagnostic classes of epithelial lesions of the uterine cervix. Among the many submissions, the best algorithms obtained an overall score close to 95%. The top three teams shared €25,000 in prizes at a special session held during the national congress of the SFP. The final part of the competition lasted only 6 weeks, and the goal of the SFP and the HDH is now to publish the collection in open access. This final step will allow data scientists and pathologists to further develop artificial intelligence algorithms in this medical area.


Subjects
Algorithms , Artificial Intelligence , Biopsy , Cervix Uteri , Female , Humans , Pathologists
3.
J Biomed Inform ; 110: 103531, 2020 10.
Article in English | MEDLINE | ID: mdl-32818667

ABSTRACT

This paper considers the problems of modeling and predicting a long-term and "blurry" relapse that occurs after a medical act, such as a surgery. We do not consider a short-term complication related to the act itself, but a long-term relapse that clinicians cannot explain easily, since it depends on unknown sets or sequences of past events that occurred before the act. The relapse is observed only indirectly, in a "blurry" fashion, through longitudinal prescriptions of drugs over a long period of time after the medical act. We introduce a new model, called ZiMM (Zero-inflated Mixture of Multinomial distributions), in order to capture long-term and blurry relapses. On top of it, we build an end-to-end deep-learning architecture called ZiMM Encoder-Decoder (ZiMM ED) that can learn from the complex, irregular, highly heterogeneous and sparse patterns of health events that are observed through a claims-only database. ZiMM ED is applied to a "non-clinical" claims database that contains only timestamped reimbursement codes for drug purchases, medical procedures and hospital diagnoses, the only available clinical feature being the age of the patient. This setting is more challenging than one where bedside clinical signals are available. Our motivation for using such a non-clinical claims database is its population-wide exhaustiveness, compared with clinical electronic health records coming from a single hospital or a small set of hospitals. Indeed, we consider a dataset containing the claims of almost all French citizens who had surgery for prostatic problems, with a history between 1.5 and 5 years. We consider a long-term (18 months) relapse (urination problems still occur despite surgery), which is blurry since it is observed only through the reimbursement of a specific set of drugs for urination problems. Our experiments show that ZiMM ED improves on several baselines, including non-deep-learning and deep-learning approaches, and that it allows working on such a dataset with minimal preprocessing work.


Subjects
Deep Learning , Databases, Factual , Electronic Health Records , Humans , Recurrence
4.
Int J Med Inform ; 141: 104203, 2020 09.
Article in English | MEDLINE | ID: mdl-32485553

ABSTRACT

OBJECTIVE: This article introduces SCALPEL3 (Scalable Pipeline for Health Data), a scalable open-source framework for studies involving Large Observational Databases (LODs). It focuses on scalable medical concept extraction, easy interactive analysis, and helpers for data flow analysis to accelerate studies performed on LODs. MATERIALS AND METHODS: Inspired by web analytics, SCALPEL3 relies on distributed computing, data denormalization and columnar storage. It was compared with the existing SAS-Oracle SNDS infrastructure by running several queries on a dataset containing a three-year history of healthcare claims for 13.7 million patients. RESULTS AND DISCUSSION: SCALPEL3's horizontal scalability allows it to handle large tasks faster than the existing infrastructure, while its performance is comparable when only a few executors are used. SCALPEL3 provides fine-grained interactive control of data processing through legible code, which helps to build fully reproducible studies and improves the maintainability and auditability of studies performed on LODs. CONCLUSION: SCALPEL3 makes studies based on SNDS much easier and more scalable than the existing framework [1]. It is now used at the agency collecting SNDS data and at the French Ministry of Health, and will soon be used at the National Health Data Hub in France [2].


Subjects
Delivery of Health Care , Databases, Factual , France , Humans , Reproducibility of Results
5.
Biostatistics ; 21(4): 758-774, 2020 10 01.
Article in English | MEDLINE | ID: mdl-30851046

ABSTRACT

With the increased availability of large electronic health records databases comes the chance of enhancing health risk screening. Most post-marketing detection of adverse drug reactions (ADRs) relies on physicians' spontaneous reports, leading to under-reporting. To take up this challenge, we develop a scalable model to estimate the effect of multiple longitudinal features (drug exposures) on a rare longitudinal outcome. Our procedure is based on a conditional Poisson regression model, also known as self-controlled case series (SCCS). To remove the need to specify risk periods precisely, we model the intensity of outcomes using a convolution between exposures and step functions, which are penalized using a combination of group-Lasso and total variation. To our knowledge, this is the first SCCS model with flexible intensity able to handle multiple longitudinal features in a single model. We show that this approach improves on the state of the art in terms of mean absolute error and computation time for the estimation of relative risks on simulated data. We apply this method to an ADR detection problem, using a cohort of diabetic patients extracted from the large French national health insurance database (SNIIRAM), a claims database containing medical reimbursements of more than 53 million people. This work was done in the context of a research partnership between Ecole Polytechnique and CNAMTS (in charge of SNIIRAM).
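
The idea of an intensity built as a convolution between exposures and a step function can be illustrated with a minimal discrete-time sketch. This is not the authors' implementation: the daily time grid, the single drug, the function names and the one-event-per-case conditional likelihood are all assumptions made for illustration.

```python
import math


def log_relative_incidence(exposure_starts, step_heights, bin_width, n_days):
    """Convolve exposure start days with a piecewise-constant (step) function.

    step_heights[b] is the log relative incidence during lag bin b, i.e. for
    lags b * bin_width .. (b + 1) * bin_width - 1 days after an exposure start.
    (Hypothetical parametrization, chosen for illustration only.)
    """
    support = len(step_heights) * bin_width
    psi = [0.0] * n_days
    for start in exposure_starts:
        for lag in range(support):
            day = start + lag
            if day >= n_days:
                break
            psi[day] += step_heights[lag // bin_width]
    return psi


def sccs_log_likelihood(psi, event_days):
    """Conditional log-likelihood of one case: each observed event day is
    compared with all days of that person's own observation period."""
    log_norm = math.log(sum(math.exp(v) for v in psi))
    return sum(psi[day] - log_norm for day in event_days)


def total_variation(step_heights):
    """Total-variation penalty: sum of absolute jumps between adjacent steps."""
    return sum(abs(b - a) for a, b in zip(step_heights, step_heights[1:]))


# One exposure starting on day 2; risk doubled for 2 days, then back to baseline.
heights = [math.log(2.0), 0.0]
psi = log_relative_incidence([2], heights, bin_width=2, n_days=6)
loglik = sccs_log_likelihood(psi, event_days=[2])
```

Because each person serves as their own control, only the shape of `psi` within the observation period matters. The total-variation term discourages spurious jumps between adjacent steps, which is how the risk period can be learned rather than specified in advance.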


Subjects
Drug-Related Side Effects and Adverse Reactions , Cohort Studies , Databases, Factual , Drug-Related Side Effects and Adverse Reactions/epidemiology , Electronic Health Records , Humans , Research Design
6.
Article in English | MEDLINE | ID: mdl-31717923

ABSTRACT

Background: No comparative data are available on the effect of online self-exclusion. The aim of this study was to assess the effect of self-exclusion in online poker gambling as compared to matched controls, after the end of the self-exclusion period. Methods: We included all gamblers who were first-time self-excluders over a 7-year period (n = 4887) on a poker website, and gamblers matched for gender, age and account duration (n = 4451). We report the effects over time of self-exclusion after it ended, on money (net losses) and time spent (session duration), using an analysis of variance procedure between mixed models with and without the interaction of time and self-exclusion. Analyses were performed on the whole sample, on the sub-groups that were the most heavily involved in terms of time or money (higher quartiles) and among short-duration self-excluders (<3 months). Results: Significant effects of self-exclusion and short-duration self-exclusion were found for money and time spent over 12 months. Among the gamblers who were the most heavily involved financially, no significant effect on the amount spent was found. Among the gamblers who were the most heavily involved in terms of time, a significant effect was found on time spent. Short-duration self-exclusions showed no significant effect on the most heavily involved gamblers. Conclusions: Self-exclusion seems effective in the long term. However, the effect on money spent of self-exclusions and of short-duration self-exclusions should be further explored among the most heavily involved gamblers.


Subjects
Behavior, Addictive , Financing, Personal , Gambling/psychology , Adult , Case-Control Studies , Female , Humans , Male , Time and Motion Studies , Young Adult
7.
BMJ Open ; 8(12): e022541, 2018 12 22.
Article in English | MEDLINE | ID: mdl-30580263

ABSTRACT

OBJECTIVE: Self-exclusion is one of the main responsible gambling tools. The aim of this study was to assess the reliability of self-exclusion motives in self-reports to the gambling service provider. SETTINGS: This is a retrospective cohort using prospective account-based gambling data obtained from a poker gambling provider. PARTICIPANTS: Over a period of 7 years we included all poker gamblers self-excluding for the first time, and reporting a motive for their self-exclusion (n=1996). We explored two groups: self-excluders who self-reported a motive related to addiction and those who reported a commercial motive. RESULTS: No between-group adjusted difference was found on gambling summary variables. Sessions in the two groups were poorly discriminated one from another on four different machine-learning models. More than two-thirds of the gamblers resumed poker gambling after a first self-exclusion (n=1368), half of them within the first month. No between-group difference was found for the course of gambling after the first self-exclusion. 60.1% of first-time self-excluders self-excluded again (n=822). Losses in the previous month were greater before second self-exclusions than before the first. CONCLUSIONS: Reported motives for self-exclusion appear non-informative, and could be misleading. Multiple self-exclusions seem to be more the rule than the exception. The process of self-exclusion should therefore be optimised from the first occurrence to protect heavy gamblers.


Subjects
Behavior Control , Behavior, Addictive/psychology , Gambling/psychology , Trust/psychology , Adaptation, Psychological , Adult , Cohort Studies , Databases, Factual , Female , Gambling/epidemiology , Humans , Male , Middle Aged , Motivation , Retrospective Studies , Self Report
8.
Article in English | MEDLINE | ID: mdl-25974473

ABSTRACT

In this work we investigate the generic properties of a stochastic linear model in the regime of high dimensionality. We consider in particular the vector autoregressive (VAR) model and the multivariate Hawkes process. We analyze both deterministic and random versions of these models, showing the existence of a stable phase and an unstable phase. We find that along the transition region separating the two regimes the correlations of the process decay slowly, and we characterize the conditions under which these slow correlations are expected to become power laws. We check our findings with numerical simulations showing remarkable agreement with our predictions. We finally argue that real systems with a strong degree of self-interaction are naturally characterized by this type of slow relaxation of the correlations.
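
For a first-order VAR model x_t = A x_{t-1} + noise, the standard textbook criterion is that the process is stable when the spectral radius of A is below 1. The sketch below is an illustration, not the paper's analysis: it estimates the spectral radius with Gelfand's formula, rho(A) = lim ||A^k||^(1/k), using repeated squaring. The function names and the example matrices are assumptions.

```python
import math


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def frobenius(a):
    """Frobenius norm of a matrix."""
    return math.sqrt(sum(v * v for row in a for v in row))


def spectral_radius(a, squarings=30):
    """Estimate rho(A) as ||A^k||^(1/k) with k = 2**squarings.

    The matrix is renormalized after every squaring and the logarithm of the
    true norm is tracked separately, so nothing overflows or underflows.
    """
    norm = frobenius(a)
    if norm == 0.0:
        return 0.0
    m = [[v / norm for v in row] for row in a]
    log_norm = math.log(norm)  # log ||A^k|| for k = 1
    power = 1
    for _ in range(squarings):
        m = matmul(m, m)
        norm = frobenius(m)
        if norm == 0.0:
            return 0.0  # some power of A vanishes (nilpotent matrix)
        m = [[v / norm for v in row] for row in m]
        log_norm = 2.0 * log_norm + math.log(norm)
        power *= 2
    return math.exp(log_norm / power)


def is_stable(a):
    """Stability criterion for x_t = A x_{t-1} + noise."""
    return spectral_radius(a) < 1.0


stable_matrix = [[0.5, 0.2], [0.1, 0.4]]    # eigenvalues 0.6 and 0.3
unstable_matrix = [[1.1, 0.0], [0.0, 0.2]]  # eigenvalues 1.1 and 0.2
```

As the spectral radius approaches 1 from below, shocks take longer and longer to die out, which gives an intuition for the slowly decaying correlations near the transition between the two phases.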

9.
Article in English | MEDLINE | ID: mdl-23679479

ABSTRACT

In this paper we propose a new model for volatility fluctuations in financial time series. This model relies on a nonstationary Gaussian process that exhibits aging behavior. It turns out that its properties, over any finite time interval, are very close to those of continuous cascade models. These latter models are well known to reproduce faithfully the main stylized facts of financial time series. However, they involve a large-scale parameter (the so-called "integral scale" where the cascade is initiated) that is hard to interpret in finance. Moreover, the empirical value of the integral scale is in general strongly correlated with the overall length of the sample. This feature is precisely predicted by our model, which, as illustrated by various examples from daily stock index data, quantitatively reproduces the empirical observations.

10.
Phys Rev E Stat Nonlin Soft Matter Phys ; 66(5 Pt 2): 056121, 2002 Nov.
Article in English | MEDLINE | ID: mdl-12513570

ABSTRACT

We define a large class of continuous-time multifractal random measures and processes with arbitrary log-infinitely divisible exact or asymptotic scaling law. These processes generalize within a unified framework both the recently defined log-normal multifractal random walk [J.F. Muzy, J. Delour, and E. Bacry, Eur. Phys. J. B 17, 537 (2000), E. Bacry, J. Delour, and J.F. Muzy, Phys. Rev. E 64, 026103 (2001)] and the log-Poisson "product of cylindrical pulses" [J. Barral and B.B. Mandelbrot, Cowles Foundation Discussion Paper No. 1287, 2001 (unpublished)]. Our construction is based on some "continuous stochastic multiplication" [as introduced in F. Schmitt and D. Marsan, Eur. Phys. J. B 20, 3 (2001)] from coarse to fine scales that can be seen as a continuous interpolation of discrete multiplicative cascades. We prove the stochastic convergence of the defined processes and study their main statistical properties. The question of genericity (universality) of limit multifractal processes is addressed within this new framework. We finally provide a method for numerical simulations and discuss some specific examples.
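
For orientation, the classical discrete multiplicative cascade that such constructions interpolate can be sketched in a few lines. This is the textbook dyadic cascade with log-normal weights, not the continuous construction of the paper; the function name and the parameters `depth` and `sigma` are assumptions.

```python
import math
import random


def lognormal_cascade(depth, sigma, rng):
    """Masses of the 2**depth dyadic cells of [0, 1] after `depth` levels.

    At each level every cell is split in two, and each half receives half of
    the parent mass times an independent log-normal weight with mean 1.
    """
    masses = [1.0]
    for _ in range(depth):
        children = []
        for mass in masses:
            for _ in range(2):
                # exp(sigma * g - sigma**2 / 2) has expectation exactly 1
                weight = math.exp(sigma * rng.gauss(0.0, 1.0) - 0.5 * sigma * sigma)
                children.append(0.5 * mass * weight)
        masses = children
    return masses


uniform = lognormal_cascade(depth=4, sigma=0.0, rng=random.Random(0))
rough = lognormal_cascade(depth=5, sigma=0.3, rng=random.Random(0))
```

With `sigma = 0` the mass is spread uniformly. Increasing `sigma` makes the measure increasingly intermittent. The fixed dyadic grid is the arbitrary ingredient of this discrete scheme, which a continuous interpolation from coarse to fine scales avoids.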
