1.
Sci Data ; 11(1): 358, 2024 Apr 09.
Article in English | MEDLINE | ID: mdl-38594314

ABSTRACT

This paper presents a standardised dataset versioning framework for improved reusability, recognition and data version tracking, facilitating comparisons and informed decision-making for data usability and workflow integration. The framework adopts a software engineering-like data versioning nomenclature ("major.minor.patch") and incorporates data schema principles to promote reproducibility and collaboration. To quantify changes in statistical properties over time, the concept of data drift metrics (d) is introduced. Three metrics (d_P, d_{E,PCA}, and d_{E,AE}) based on unsupervised Machine Learning techniques (Principal Component Analysis and Autoencoders) are evaluated for dataset creation, update, and deletion. The optimal choice is the d_{E,PCA} metric, which combines PCA models with splines. It is computationally efficient, with values below 50 for new dataset batches and values consistent with seasonal or trend variations. Major updates (i.e., values of 100) occur when scaling transformations are applied to over 30% of variables. The metric also handles information loss efficiently, yielding values close to 0 in that case. Overall, this metric achieved a favourable trade-off between interpretability, robustness against information loss, and computation time.
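
As an illustration of how a drift value could drive a "major.minor.patch" version, the following is a minimal sketch, not the authors' implementation. It assumes a drift value already scaled to the range 0 to 100. The names `DatasetVersion` and `bump_version` and the two thresholds are hypothetical; the abstract states only that a value of 100 corresponds to a major update and that new batches score below 50, so the minor/patch boundary here is an assumption.

```python
from dataclasses import dataclass

# Hypothetical cut-offs on a 0-100 drift scale. The abstract ties a value of
# 100 to major updates; the minor/patch boundary at 50 is an assumption.
MINOR_THRESHOLD = 50.0
MAJOR_THRESHOLD = 100.0


@dataclass(frozen=True)
class DatasetVersion:
    """A dataset version in major.minor.patch form."""
    major: int
    minor: int
    patch: int

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


def bump_version(current: DatasetVersion, drift: float) -> DatasetVersion:
    """Return the next version for a drift value in [0, 100]."""
    if not 0.0 <= drift <= 100.0:
        raise ValueError("drift must lie in [0, 100]")
    if drift >= MAJOR_THRESHOLD:
        # Largest drift: increment major and reset the lower components.
        return DatasetVersion(current.major + 1, 0, 0)
    if drift >= MINOR_THRESHOLD:
        # Intermediate drift: increment minor and reset patch.
        return DatasetVersion(current.major, current.minor + 1, 0)
    # Small drift: increment patch only.
    return DatasetVersion(current.major, current.minor, current.patch + 1)
```

For example, `print(bump_version(DatasetVersion(1, 2, 3), 100.0))` prints `2.0.0`, while a drift of `10.0` gives `1.2.4`.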


Subjects
Datasets as Topic, Software, Principal Component Analysis, Reproducibility of Results, Workflow, Datasets as Topic/standards, Machine Learning
2.
Stud Health Technol Inform ; 294: 755-759, 2022 May 25.
Article in English | MEDLINE | ID: mdl-35612198

ABSTRACT

The pharmaceutical industry is a data-intensive and heavily regulated sector, where exhaustive audits and inspections are performed to ensure the safety of drugs. In this context, processing and evaluating the data generated in the manufacturing lines is a challenge, since it requires compliance with pharma regulations. This work combines data integrity metrics and blockchain technology to evaluate the degree of compliance with ALCOA+ principles across different levels of drug manufacturing data. We propose the DIALCOA tool, software that assesses the degree of compliance with each ALCOA+ principle based on data from manufacturing batch reports and their different levels of information.
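
One simple way to read "degree of compliance" per principle is as the fraction of checks passed for that principle. The sketch below illustrates that reading only. It is not the DIALCOA tool's method, which the abstract does not describe, and the function name `compliance_degree` and its input layout are hypothetical. The nine principle names are the standard ALCOA+ set.

```python
from typing import Dict, List, Optional

# The standard ALCOA+ data integrity principles.
ALCOA_PLUS = (
    "Attributable", "Legible", "Contemporaneous", "Original", "Accurate",
    "Complete", "Consistent", "Enduring", "Available",
)


def compliance_degree(
    checks: Dict[str, List[bool]]
) -> Dict[str, Optional[float]]:
    """Return the fraction of passed checks per ALCOA+ principle.

    `checks` maps a principle name to the pass/fail outcomes of whatever
    checks were run against batch-report data (hypothetical input format).
    Principles with no checks are reported as None rather than 0.
    """
    degrees: Dict[str, Optional[float]] = {}
    for principle in ALCOA_PLUS:
        results = checks.get(principle, [])
        degrees[principle] = sum(results) / len(results) if results else None
    return degrees
```

With three of four "Attributable" checks passing, `compliance_degree({"Attributable": [True, True, False, True]})["Attributable"]` evaluates to `0.75`.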


Subjects
Blockchain, Drug Industry, Commerce, Technology
...