Results 1 - 15 of 15
1.
Sci Rep ; 14(1): 14217, 2024 06 20.
Article in English | MEDLINE | ID: mdl-38902282

ABSTRACT

As interest in using machine learning models to support clinical decision-making grows, explainability is an unequivocal priority for clinicians, researchers and regulators to understand and trust model outputs. With many clinical datasets containing a range of modalities, from the free text of clinician notes to structured tabular data entries, there is a need for frameworks capable of providing comprehensive explanation values across diverse modalities. Here, we present a multimodal masking framework that extends SHapley Additive exPlanations (SHAP) to text and tabular datasets in order to identify risk factors for companion animal mortality in first-opinion veterinary electronic health records (EHRs) from across the United Kingdom. The framework treats each modality consistently, ensuring uniform handling of features and thereby fostering predictability in unimodal and multimodal contexts. We present five multimodality approaches, with the best-performing method utilising PetBERT, a language model pre-trained on a veterinary dataset. Using our framework, we shed light for the first time on the reasons behind each model's decisions, and find that PetBERT engages more strongly with free-text narratives whereas BERT-base relies predominantly on tabular data. We also examine important features at a more granular level, identifying distinct words and phrases that substantially influenced an animal's life-status prediction. PetBERT showed a heightened ability to grasp phrases drawn from veterinary clinical nomenclature, signalling the value of additional domain-specific pre-training of language models.


Subjects
Electronic Health Records, Pets, Animals, Machine Learning, United Kingdom/epidemiology, Risk Factors, Cats, Dogs
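As a rough illustration of the masking idea described in the abstract, the sketch below scores text tokens and tabular features by the drop in a model's output when each is masked, treating both modalities uniformly. Everything here is hypothetical: `toy_predict`, the feature names, and the single-feature masking (full SHAP averages over many masked coalitions; one-at-a-time masking is used for brevity).

```python
# Masking-based attribution applied uniformly across modalities: text tokens
# are replaced with a [MASK] string, tabular features with a baseline value,
# and each feature is scored by the drop in the model's output when masked.

MASK_TOKEN = "[MASK]"

def masked_attributions(predict, tokens, tabular, baseline):
    """Score each text token and tabular feature by masking it out."""
    base = predict(tokens, tabular)
    scores = {}
    for i, tok in enumerate(tokens):                 # mask one token at a time
        masked = tokens[:i] + [MASK_TOKEN] + tokens[i + 1:]
        scores[f"text:{tok}"] = base - predict(masked, tabular)
    for name in tabular:                             # mask one column at a time
        perturbed = {**tabular, name: baseline[name]}
        scores[f"tab:{name}"] = base - predict(tokens, perturbed)
    return scores

# Toy stand-in model: risk rises with the word "collapse" and with age.
def toy_predict(tokens, tabular):
    return 0.6 * ("collapse" in tokens) + 0.01 * tabular["age"]

print(masked_attributions(toy_predict,
                          ["collapse", "overnight"],
                          {"age": 12, "weight_kg": 30},
                          {"age": 0, "weight_kg": 0}))
```
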
2.
Front Vet Sci ; 11: 1352239, 2024.
Article in English | MEDLINE | ID: mdl-38322169

ABSTRACT

The development of natural language processing techniques for deriving useful information from unstructured clinical narratives is a fast-paced and rapidly evolving area of machine learning research. Large volumes of veterinary clinical narratives now exist, curated by projects such as the Small Animal Veterinary Surveillance Network (SAVSNET) and VetCompass, and the application of such techniques to these datasets is already improving (and will continue to improve) our understanding of disease and disease patterns within veterinary medicine. In part one of this two-part article series, we discuss the importance of understanding the lexical structure of clinical records, and cover the use of basic tools for filtering records based on keywords as well as more complex rule-based pattern-matching approaches. We discuss the strengths and weaknesses of these approaches, highlighting the ongoing potential value of these "traditional" methods while ultimately recognising that they constrain how effectively information retrieval can be automated. This sets the scene for the introduction of machine-learning methodologies, and the plethora of opportunities they present for automating information extraction, discussed in part two of the series.
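A minimal sketch of the keyword filtering and rule-based pattern matching discussed above, using invented example narratives; the regular expressions are illustrative, not SAVSNET's or VetCompass's actual rules.

```python
import re

records = [
    "V+E since yesterday, eating grass. Advised bland diet.",
    "Booster vaccination given. No concerns.",
    "Pruritus and hair loss on flanks; suspect flea allergy.",
]

# Simple keyword filter: any record mentioning vomiting-related terms.
keyword = re.compile(r"\b(vomit\w*|v\+e)\b", re.IGNORECASE)

# Rule-based refinement: a negation window, so "no vomiting" is excluded.
negated = re.compile(r"\b(no|not|without)\s+(\w+\s+){0,2}(vomit\w*|v\+e)\b",
                     re.IGNORECASE)

hits = [r for r in records if keyword.search(r) and not negated.search(r)]
print(hits)  # -> the first record only
```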

3.
Sci Rep ; 13(1): 18015, 2023 10 21.
Article in English | MEDLINE | ID: mdl-37865683

ABSTRACT

Effective public health surveillance requires consistent monitoring of disease signals so that researchers and decision-makers can react dynamically to changes in disease occurrence. However, whilst surveillance initiatives exist in production-animal veterinary medicine, comparable frameworks for companion animals are lacking. First-opinion veterinary electronic health records (EHRs) have the potential to reveal disease signals and often represent the initial reporting of clinical syndromes in animals presenting for medical attention, highlighting their possible significance in early disease detection. Yet despite their availability, their free-text nature limits their use, inhibiting the production of national-level mortality and morbidity statistics. This paper presents PetBERT, a large language model trained on over 500 million words from 5.1 million EHRs across the UK. PetBERT-ICD extends PetBERT with additional training as a multi-label classifier for the automated coding of veterinary clinical EHRs against the International Classification of Diseases 11 (ICD-11) framework, achieving F1 scores exceeding 83% across 20 disease codings with minimal annotations. PetBERT-ICD effectively identifies disease outbreaks, detecting them up to 3 weeks earlier than current clinician-assigned point-of-care labelling strategies. The potential for PetBERT-ICD to enhance disease surveillance in veterinary medicine represents a promising avenue for advancing animal health and improving public health outcomes.


Subjects
Electronic Health Records, International Classification of Diseases, Animals, Disease Outbreaks/veterinary, Public Health Surveillance
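The sketch below shows the general shape of multi-label clinical coding with a BERT-style encoder, as described in the abstract. The `bert-base-uncased` checkpoint stands in for a PetBERT-style model, and the 20 labels, the example note, and the 0.5 threshold are assumptions; the classification head here is untrained, so its outputs are meaningless until fine-tuned.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "bert-base-uncased"  # placeholder for a domain pre-trained checkpoint
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(
    name, num_labels=20, problem_type="multi_label_classification"
)

note = "Acute vomiting and diarrhoea, suspected dietary indiscretion."
inputs = tokenizer(note, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits          # shape: (1, 20)

# Sigmoid per label: each of the 20 codes is an independent yes/no decision.
probs = torch.sigmoid(logits)[0]
predicted_codes = (probs > 0.5).nonzero(as_tuple=True)[0].tolist()
print(predicted_codes)
```
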
4.
Cancer Med ; 12(17): 17856-17865, 2023 09.
Article in English | MEDLINE | ID: mdl-37610318

ABSTRACT

BACKGROUND: In patients receiving chemotherapy, renal and hepatic dysfunction can increase the risk of toxicity and should therefore be monitored. We aimed to develop a machine learning model to identify those patients who need closer monitoring, enabling a safer and more efficient service. METHODS: We used retrospective data from a large academic hospital, for patients treated with chemotherapy for breast cancer, colorectal cancer and diffuse large B-cell lymphoma, to train and validate a multilayer perceptron (MLP) model to predict unacceptable rises in bilirubin or creatinine. To assess the performance of the model, validation was performed using patient data from a separate, independent hospital with the same variables. Using this dataset, we evaluated the sensitivity and specificity of the model. RESULTS: In total, 1214 patients were identified. The training set had almost perfect sensitivity and specificity of >0.95; the area under the curve (AUC) was 0.99 (95% CI: 0.98-1.00) for creatinine and 0.97 (95% CI: 0.95-0.99) for bilirubin. The validation set had good sensitivity (creatinine: 0.60, 95% CI: 0.55-0.64; bilirubin: 0.54, 95% CI: 0.52-0.56), specificity (creatinine: 0.98, 95% CI: 0.96-0.99; bilirubin: 0.90, 95% CI: 0.87-0.94) and AUC (creatinine: 0.76, 95% CI: 0.70-0.82; bilirubin: 0.72, 95% CI: 0.68-0.76). CONCLUSIONS: We have demonstrated that an MLP model can be used to reduce the number of blood tests required for some patients at low risk of organ dysfunction, whilst improving safety for others at high risk.


Subjects
Bilirubin, Machine Learning, Humans, Retrospective Studies, Creatinine, Sensitivity and Specificity
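A hedged sketch of the study's general recipe: an MLP classifier evaluated by sensitivity, specificity, and AUC. The synthetic features below merely stand in for the (non-public) chemotherapy blood-test data.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 8))                    # 8 invented features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=600) > 0).astype(int)

X_train, X_test, y_train, y_test = X[:400], X[400:], y[:400], y[400:]

clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=1000, random_state=0)
clf.fit(X_train, y_train)

tn, fp, fn, tp = confusion_matrix(y_test, clf.predict(X_test)).ravel()
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
auc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])
print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f} auc={auc:.2f}")
```
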
5.
Sci Rep ; 13(1): 13563, 2023 08 21.
Article in English | MEDLINE | ID: mdl-37604974

ABSTRACT

The emergency department (ED) is a fast-paced environment responsible for large volumes of patients with varied disease acuity. Operational pressures on EDs are increasing, which creates the imperative to efficiently identify patients at imminent risk of acute deterioration. The aim of this study is to systematically compare the performance of machine learning algorithms based on logistic regression, gradient boosted decision trees, and support vector machines for predicting imminent clinical deterioration, using cross-sectional patient data extracted from electronic patient records (EPR) at the point of entry to the hospital. We apply state-of-the-art machine learning methods to predict early patient deterioration from the first recorded vital signs, observations, laboratory results, and other predictors documented in the EPR. Clinical deterioration in this study is measured by in-hospital mortality and/or admission to critical care. We build on prior work by incorporating interpretable machine learning and fairness-aware modelling, and use a dataset comprising 118,886 unplanned admissions to Salford Royal Hospital, UK, to systematically compare model variations for predicting mortality and critical care utilisation within 24 hours of admission. We compare model performance to the National Early Warning Score 2 (NEWS2) and obtain up to a 0.366 increase in average precision, up to a [Formula: see text] reduction in daily alert rate, and a median 0.599 reduction in differential bias amplification across the protected demographics of age and sex. We use SHapley Additive exPlanations to justify the models' outputs, verify that the captured data associations align with domain knowledge, and pair predictions with the causal context of each patient's most influential characteristics. Introducing our modelling to clinical practice has the potential to reduce alert fatigue and identify high-risk patients with a lower NEWS2 who might currently be missed, but further work is needed to trial the models in clinical practice. We encourage future research to follow a systematised approach to data-driven risk modelling to obtain clinically applicable support tools.


Subjects
Clinical Deterioration, Emergency Medical Services, Humans, Cross-Sectional Studies, Machine Learning, Decision Making
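The sketch below mirrors the comparison described above (logistic regression, gradient boosted trees, and an SVM scored by average precision) on synthetic stand-in data, since the Salford Royal EPR data is not public.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(42)
X = rng.normal(size=(2000, 10))            # invented vital-sign features
y = (X[:, 0] - X[:, 1] + rng.normal(size=2000) > 1.5).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)

models = {
    "logistic": LogisticRegression(max_iter=1000),
    "gbdt": GradientBoostingClassifier(random_state=0),
    "svm": SVC(probability=True, random_state=0),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    ap = average_precision_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: average precision = {ap:.3f}")
```
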
7.
Sci Rep ; 12(1): 19899, 2022 11 18.
Article in English | MEDLINE | ID: mdl-36400825

ABSTRACT

It has been shown that identical deep learning (DL) architectures will produce distinct explanations when trained with different hyperparameters that are orthogonal to the task (e.g. random seed, training set order). In domains such as healthcare and finance, where transparency and explainability are paramount, this can be a significant barrier to DL adoption. In this study we present a further analysis of explanation (in)consistency on 6 tabular datasets/tasks, with a focus on Electronic Health Records data. We propose a novel deep learning ensemble architecture that trains its sub-models to produce consistent explanations, improving explanation consistency by as much as 315% (e.g. from 0.02433 to 0.1011 on MIMIC-IV), and on average by 124% (e.g. from 0.12282 to 0.4450 on the BCW dataset). We evaluate the effectiveness of our proposed technique and discuss the implications of our results for industrial applications of DL, for explainability, and for future methodological work.


Subjects
Deep Learning, Electronic Health Records, Forecasting
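A small sketch of how explanation (in)consistency can be quantified: train the same architecture under different random seeds and compare feature attributions pairwise. Cosine similarity and permutation importance are illustrative stand-ins for the paper's metric and attribution method; the BCW (breast cancer Wisconsin) data matches one of the datasets named in the abstract.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.inspection import permutation_importance
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

attributions = []
for seed in range(3):                      # identical model, different seeds
    clf = make_pipeline(StandardScaler(),
                        MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000,
                                      random_state=seed)).fit(X, y)
    imp = permutation_importance(clf, X, y, n_repeats=5, random_state=0)
    attributions.append(imp.importances_mean)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

scores = [cosine(attributions[i], attributions[j])
          for i, j in [(0, 1), (0, 2), (1, 2)]]
print("pairwise explanation similarity:", [round(s, 3) for s in scores])
```
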
8.
Sci Rep ; 12(1): 13468, 2022 08 05.
Article in English | MEDLINE | ID: mdl-35931710

ABSTRACT

We approach the task of detecting the illicit movement of cultural heritage from a machine learning perspective by presenting a framework for detecting a known artefact in a new and unseen image. To this end, we explore the machine learning problem of instance classification for large archaeological image datasets, i.e. where each individual object (instance) is itself a class to which all of the multiple images of that object belong. We focus on a wide variety of objects in the Durham Oriental Museum, from which we build a dataset of 24,502 images of 4332 unique object instances. We experiment with state-of-the-art convolutional neural network models, the smaller variants of which are suitable for deployment in mobile applications. We find the exact object instance of a given image can be predicted from among 4332 others with ~72% accuracy, showing how effectively machine learning can detect a known object from a new image. We demonstrate that accuracy significantly improves as the number of images per object instance increases (up to ~83%), with an ensemble of classifiers scoring as high as 84%. We find that the correct instance is found in the top 3, 5, or 10 predictions of our best models ~91%, ~93%, or ~95% of the time respectively. Our findings contribute to the emerging overlap of machine learning and cultural heritage, and highlight the potential available to future applications and research.


Subjects
Deep Learning, Artifacts, Machine Learning, Neural Networks (Computer)
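A brief sketch of the instance-classification setup: a mobile-friendly CNN with one output class per museum object, plus the top-k evaluation the abstract reports. The network here is untrained and the input is random; data loading and training are omitted.

```python
import torch
from torch import nn
from torchvision import models

num_instances = 4332                       # one class per object, as above
net = models.mobilenet_v3_small(weights=None)
net.classifier[3] = nn.Linear(net.classifier[3].in_features, num_instances)

def top_k_hit(logits, target, k):
    """True if the target instance is among the k highest-scoring classes."""
    return target in logits.topk(k).indices

image = torch.randn(1, 3, 224, 224)        # stand-in for a photograph
logits = net(image)[0]
for k in (1, 3, 5, 10):                    # the cut-offs reported above
    print(k, top_k_hit(logits, target=7, k=k))
```
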
9.
PeerJ Comput Sci ; 8: e974, 2022.
Article in English | MEDLINE | ID: mdl-35721409

ABSTRACT

Bilinear pooling (BLP) refers to a family of operations recently developed for fusing features from different modalities, predominantly for visual question answering (VQA) models. Successive BLP techniques have yielded higher performance with lower computational expense, yet at the same time they have drifted further from the original motivational justification of bilinear models, instead becoming empirically motivated by task performance. Furthermore, despite significant success in text-image fusion for VQA, BLP has not yet gained such prominence in video question answering (video-QA). Though BLP methods have continued to perform well on video tasks when fusing vision and non-textual features, BLP has recently been overshadowed by other vision-text fusion techniques in video-QA. We aim to add a new perspective on this empirical and motivational drift in BLP. We take a step back and discuss the motivational origins of BLP, highlighting the often-overlooked parallels to neurological theories (Dual Coding Theory and the Two-Stream Model of Vision). We seek to carefully and experimentally ascertain the empirical strengths and limitations of BLP as a multimodal text-vision fusion technique in video-QA, using two models (the TVQA baseline and the heterogeneous-memory-enhanced 'HME' model) and four datasets (TVQA, TGif-QA, MSVD-QA, and EgoVQA). We examine the impact both of simply replacing feature concatenation in the existing models with BLP, and of a modified version of the TVQA baseline, which we name the 'dual-stream' model, designed to accommodate BLP. We find that our relatively simple integration of BLP does not increase, and mostly harms, performance on these video-QA benchmarks. Using our insights from recent work on BLP for video-QA and recently proposed theoretical multimodal fusion taxonomies, we offer insight into why BLP-driven performance gains for video-QA benchmarks may be more difficult to achieve than in earlier VQA models. We share our perspective on, and suggest solutions for, the key issues we identify with BLP techniques for multimodal fusion in video-QA. We look beyond the empirical justification of BLP techniques and propose both alternatives and improvements to multimodal fusion by drawing neurological inspiration from Dual Coding Theory and the Two-Stream Model of Vision. We qualitatively highlight the potential for neurological inspiration in video-QA by identifying the relative abundance of psycholinguistically 'concrete' words in the vocabularies of each of the text components (e.g., questions and answers) of the four video-QA datasets we experiment with.
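For readers unfamiliar with BLP, the sketch below implements one common low-rank variant (MFB-style): project both modalities into a shared space, take an elementwise product, and sum-pool over the rank dimension. The dimensions are illustrative, not those of the TVQA or HME models.

```python
import torch
from torch import nn

class LowRankBilinearPooling(nn.Module):
    def __init__(self, text_dim, vision_dim, rank, out_dim):
        super().__init__()
        # Project both modalities into a shared (rank * out_dim) space.
        self.text_proj = nn.Linear(text_dim, rank * out_dim)
        self.vision_proj = nn.Linear(vision_dim, rank * out_dim)
        self.rank, self.out_dim = rank, out_dim

    def forward(self, text, vision):
        joint = self.text_proj(text) * self.vision_proj(vision)  # bilinear term
        # Sum-pool over the rank dimension to obtain the fused feature.
        return joint.view(-1, self.out_dim, self.rank).sum(dim=2)

fuse = LowRankBilinearPooling(text_dim=768, vision_dim=2048, rank=5, out_dim=256)
fused = fuse(torch.randn(4, 768), torch.randn(4, 2048))
print(fused.shape)  # torch.Size([4, 256])
```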

10.
Evol Comput ; 30(4): 479-501, 2022 Dec 01.
Article in English | MEDLINE | ID: mdl-35289840

ABSTRACT

Evolution-in-Materio is a computational paradigm in which an algorithm reconfigures a material's properties to achieve a specific computational function. This article addresses the question of how successful, well-performing Evolution-in-Materio processors can be designed through the selection of nanomaterials and an evolutionary algorithm for a target application. A physical model of a nanomaterial network is developed that allows for both the randomness and the possibility of Ohmic and non-Ohmic conduction that are characteristic of such materials. These differing networks are then exploited by differential evolution, which optimises several configuration parameters (e.g., configuration voltages, weights, etc.) to solve different classification problems. We show that the ideal nanomaterial choice depends upon problem complexity, with more complex problems favoured by a complex voltage dependence of conductivity, and vice versa. Furthermore, we highlight how intrinsic nanomaterial electrical properties can be exploited by differing configuration parameters, clarifying the role and limitations of these techniques. These findings provide guidance for the rational design of nanomaterials and algorithms for future Evolution-in-Materio processors.


Subjects
Algorithms
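A toy sketch of the Evolution-in-Materio loop described above: differential evolution searches configuration voltages that make an invented nonlinear "material" response solve a classification problem. The real work evolves parameters of a physical nanomaterial network model; `material_response` here is purely illustrative.

```python
import numpy as np
from scipy.optimize import differential_evolution

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(200, 2))          # toy 2-feature inputs
y = (X[:, 0] * X[:, 1] > 0).astype(int)        # XOR-like labels

def material_response(inputs, voltages):
    """Invented stand-in for the material's input-output mapping."""
    v1, v2, bias = voltages
    return np.tanh(v1 * inputs[:, 0]) * np.tanh(v2 * inputs[:, 1]) + bias

def error(voltages):
    pred = (material_response(X, voltages) > 0).astype(int)
    return np.mean(pred != y)                  # classification error rate

result = differential_evolution(error, bounds=[(-5, 5)] * 3, seed=0)
print(result.x, 1 - result.fun)                # best voltages and accuracy
```
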
11.
PeerJ Comput Sci ; 7: e759, 2021.
Article in English | MEDLINE | ID: mdl-34805510

ABSTRACT

Multitask learning has led to significant advances in Natural Language Processing, including the decaNLP benchmark, where question answering is used to frame 10 natural language understanding tasks in a single model. In this work we show how models trained to solve decaNLP fail under simple paraphrasing of the question. We contribute a crowd-sourced corpus of paraphrased questions (PQ-decaNLP), annotated with paraphrase phenomena. This enables analysis of how transformations such as swapping class labels and changing sentence modality lead to a large performance degradation. Training both MQAN and the newer T5 model on PQ-decaNLP improves their robustness, and for some tasks improves performance on the original questions, demonstrating the benefits of a model that is more robust to paraphrasing. Additionally, we explore how paraphrasing knowledge is transferred between tasks, with the aim of exploiting the multitask property to improve model robustness. We explore the addition of paraphrase detection and paraphrase generation tasks, and find that while both models are able to learn these new tasks, knowledge about paraphrasing does not transfer to other decaNLP tasks.
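The sketch below shows the shape of the robustness check the abstract describes: score a model on original questions and on their paraphrases, then compare. The question pairs and the `answer` stub are invented; the stub deliberately keys on surface form, mimicking the brittleness the paper reports.

```python
pairs = [
    # (original question, paraphrase, gold answer) - invented examples
    ("Is the review positive or negative?",
     "Is this a negative or positive review?", "positive"),
    ("What is the summary?",
     "Summarise the passage.", "a short summary"),
]

def answer(question):                # placeholder for an MQAN/T5-style model
    return "positive" if "positive or negative" in question else "?"

def accuracy(get_question):
    hits = sum(answer(get_question(p)) == p[2] for p in pairs)
    return hits / len(pairs)

orig_acc = accuracy(lambda p: p[0])
para_acc = accuracy(lambda p: p[1])
print(f"original={orig_acc:.2f} paraphrased={para_acc:.2f}")  # drop on paraphrases
```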

12.
JMIR Med Inform ; 9(10): e29871, 2021 Oct 15.
Article in English | MEDLINE | ID: mdl-34652278

ABSTRACT

BACKGROUND: With recent advances in health care, data science offers an unparalleled opportunity to identify new insights into many aspects of human life. Using data science in digital health raises significant challenges regarding data privacy, transparency, and trustworthiness. Recent regulations enforce the need for a clear legal basis for collecting, processing, and sharing data, for example, the European Union's General Data Protection Regulation (2016) and the United Kingdom's Data Protection Act (2018). For health care providers, legal use of the electronic health record (EHR) is permitted only in clinical care cases. Any other use of the data requires thoughtful consideration of the legal context and direct patient consent. Identifiable personal and sensitive information must be sufficiently anonymized. Raw data are commonly anonymized for research purposes, with risk assessment for reidentification and utility. Although health care organizations have internal policies defined for information governance, there is a significant lack of practical tools and intuitive guidance about the use of data for research and modeling. Off-the-shelf data anonymization tools are developed frequently, but their privacy-related functionalities are often incomparable across problem domains. In addition, tools exist to weigh the reidentification risk of anonymized data against its usefulness, but there are question marks over their efficacy. OBJECTIVE: In this systematic literature mapping study, we aim to alleviate the aforementioned issues by reviewing the landscape of data anonymization for digital health care. METHODS: We used Google Scholar, Web of Science, Elsevier Scopus, and PubMed to retrieve academic studies published in English up to June 2020. Noteworthy gray literature was also used to initialize the search. We focused on review questions covering 5 bottom-up aspects: basic anonymization operations, privacy models, reidentification risk and usability metrics, off-the-shelf anonymization tools, and the lawful basis for EHR data anonymization. RESULTS: We identified 239 eligible studies, of which 60 were chosen for general background information; 16 were selected for 7 basic anonymization operations; 104 covered 72 conventional and machine learning-based privacy models; 4 and 19 papers included 7 and 15 metrics, respectively, for measuring the reidentification risk and degree of usability; and 36 explored 20 data anonymization software tools. In addition, we evaluated the practical feasibility of performing anonymization on EHR data with reference to its usability in medical decision-making. Furthermore, we summarized the lawful basis for delivering guidance on practical EHR data anonymization. CONCLUSIONS: This systematic literature mapping study indicates that anonymization of EHR data is theoretically achievable; yet, it requires more research effort in practical implementations to balance privacy preservation and usability and so ensure more reliable health care applications.
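As a concrete taste of two of the "basic anonymization operations" the review covers, the sketch below generalises quasi-identifiers (age bands, postcode district) and checks k-anonymity. The records and generalisation rules are invented for illustration.

```python
from collections import Counter

records = [
    {"age": 34, "postcode": "L7 8XP", "diagnosis": "asthma"},
    {"age": 37, "postcode": "L7 9QA", "diagnosis": "diabetes"},
    {"age": 52, "postcode": "M1 4BT", "diagnosis": "asthma"},
    {"age": 58, "postcode": "M1 7DN", "diagnosis": "flu"},
]

def generalise(rec):
    """Coarsen quasi-identifiers: 10-year age bands, postcode district."""
    low = rec["age"] // 10 * 10
    return {"age": f"{low}-{low + 9}",
            "postcode": rec["postcode"].split()[0],
            "diagnosis": rec["diagnosis"]}

def is_k_anonymous(rows, quasi_ids, k):
    """Every quasi-identifier combination must occur at least k times."""
    groups = Counter(tuple(r[q] for q in quasi_ids) for r in rows)
    return min(groups.values()) >= k

anon = [generalise(r) for r in records]
print(is_k_anonymous(anon, ["age", "postcode"], k=2))  # -> True
```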

13.
JMIR Med Inform ; 9(5): e25237, 2021 May 24.
Article in English | MEDLINE | ID: mdl-34028357

ABSTRACT

BACKGROUND: Predicting the risk of glycated hemoglobin (HbA1c) elevation can help identify patients with the potential for developing serious chronic health problems, such as diabetes. Early preventive interventions based upon advanced predictive models using electronic health record data to identify such patients can ultimately help provide better health outcomes. OBJECTIVE: Our study investigated the performance of predictive models for forecasting HbA1c elevation by employing several machine learning models. We also examined the contribution of longitudinal data from patients' electronic health records to the performance of the predictive models. Explainable methods were employed to interpret the decisions made by the black-box models. METHODS: This study employed multiple logistic regression, random forest, support vector machine, and logistic regression models, as well as a deep learning model (multilayer perceptron), to classify patients with normal (<5.7%) and elevated (≥5.7%) levels of HbA1c. We also integrated current visit data with historical (longitudinal) data from previous visits. Explainable machine learning methods were used to interrogate the models and provide an understanding of the reasons behind their decisions. All models were trained and tested using a large dataset from Saudi Arabia with 18,844 unique patient records. RESULTS: The machine learning models achieved promising results for predicting current HbA1c elevation risk. When coupled with longitudinal data, the machine learning models outperformed the multiple logistic regression model used in the comparative study. The multilayer perceptron model achieved 83.22% for the area under the receiver operating characteristic curve when used with historical data. All models showed a close level of agreement on the contribution of the random blood sugar and age variables, with and without longitudinal data. CONCLUSIONS: This study shows that machine learning models can provide promising results for the task of predicting whether current HbA1c levels are elevated (≥5.7%) or normal (<5.7%). Using patients' longitudinal data improved performance and affected the relative importance of the predictors used. The models showed results that are consistent with comparable studies.
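A hedged sketch of the study's central comparison: predicting elevated HbA1c from current-visit features alone versus current plus longitudinal features. The synthetic data and feature names are stand-ins for the Saudi EHR dataset, which is not public.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(7)
n = 1000
current = rng.normal(size=(n, 3))     # e.g. age, random blood sugar, BMI
history = rng.normal(size=(n, 2))     # e.g. previous HbA1c, previous RBS
y = (current[:, 1] + 0.8 * history[:, 0] + rng.normal(size=n) > 0).astype(int)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
for name, X in [("current only", current),
                ("current + longitudinal", np.hstack([current, history]))]:
    auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean()
    print(f"{name}: mean AUC = {auc:.3f}")   # longitudinal features help here
```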

14.
JMIR Med Inform ; 8(7): e18963, 2020 Jul 03.
Article in English | MEDLINE | ID: mdl-32618575

ABSTRACT

BACKGROUND: Electronic health record (EHR) systems generate large datasets that can significantly enrich the development of medical predictive models. Several attempts have been made to investigate the effect of glycated hemoglobin (HbA1c) elevation on the prediction of diabetes onset. However, there is still a need to validate these models using EHR data collected from different populations. OBJECTIVE: The aim of this study is to perform a replication study to validate, evaluate, and identify the strengths and weaknesses of replicating a predictive model that employed multiple logistic regression with EHR data to forecast levels of HbA1c. The original study used data from a population in the United States, whereas this differentiated replication used a population in Saudi Arabia. METHODS: A total of 3 models were developed and compared with the model created in the original study. The models were trained and tested using a larger dataset from Saudi Arabia with 36,378 records. A 10-fold cross-validation approach was used for measuring the performance of the models. RESULTS: Applying the method employed in the original study achieved an accuracy of 74% to 75% with the dataset collected from Saudi Arabia, compared with the 77% obtained with the United States population. The results also show a different ranking of importance for the predictors between the original study and the replication. The order of importance for the predictors with our population, from most to least important, is age, random blood sugar, estimated glomerular filtration rate, total cholesterol, non-high-density lipoprotein, and body mass index. CONCLUSIONS: This replication study shows that direct use of the models (calculators) created using multiple logistic regression to predict the level of HbA1c may not be appropriate for all populations. This study reveals that the weighting of the predictors needs to be calibrated to the population used. However, the study does confirm that replicating the original study with a different population can help predict levels of HbA1c by using predictors that are routinely collected and stored in hospital EHR systems.
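The sketch below mirrors the replication protocol: a multiple logistic regression evaluated with 10-fold cross-validation, then refitted to inspect the population-specific predictor weights whose ranking the replication found to differ. The data is synthetic; only the six predictor names follow the abstract.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

predictors = ["age", "random_blood_sugar", "eGFR",
              "total_cholesterol", "non_HDL", "BMI"]
rng = np.random.default_rng(3)
X = rng.normal(size=(2000, len(predictors)))
y = (0.9 * X[:, 0] + 0.7 * X[:, 1] + rng.normal(size=2000) > 0).astype(int)

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
acc = cross_val_score(model, X, y, cv=10, scoring="accuracy")
print(f"10-fold accuracy: {acc.mean():.2f} +/- {acc.std():.2f}")

# Refit on the full sample to rank the population-specific weights.
model.fit(X, y)
coefs = model.named_steps["logisticregression"].coef_[0]
for name, c in sorted(zip(predictors, coefs), key=lambda t: -abs(t[1])):
    print(f"{name}: {c:+.2f}")
```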

15.
PeerJ Comput Sci ; 6: e252, 2020.
Article in English | MEDLINE | ID: mdl-33816904

ABSTRACT

This article presents a discriminative approach to complement the unsupervised probabilistic nature of topic modelling. The framework transforms the per-document topic probabilities into class-dependent deep learning models that extract highly discriminatory features suitable for classification. The framework is then used for sentiment analysis with minimal feature engineering. The approach transforms the sentiment analysis problem from the word/document domain to the topic domain, making it more robust to noise and incorporating complex contextual information that is not otherwise represented. A stacked denoising autoencoder (SDA) is then used to model the complex relationships among the topics per sentiment with minimal assumptions. To achieve this, a distinct topic model and SDA are built per sentiment polarity, with an additional decision layer for classification. The framework is tested on a comprehensive collection of benchmark datasets that vary in sample size, class bias and classification task. A significant improvement over the state of the art is achieved without the need for sentiment lexica or over-engineered features. A further analysis is carried out to explain the observed improvement in accuracy.
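A compressed sketch of the word-domain-to-topic-domain move the article describes: documents are mapped to per-document topic probabilities, and a classifier operates on those topic features. For brevity, a single logistic decision layer stands in for the per-polarity topic models and stacked denoising autoencoders, and 20newsgroups (downloaded on first run) stands in for the sentiment benchmarks.

```python
from sklearn.datasets import fetch_20newsgroups
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

cats = ["rec.autos", "sci.med"]          # two-class stand-in task
data = fetch_20newsgroups(subset="train", categories=cats,
                          remove=("headers", "footers", "quotes"))
X_tr, X_te, y_tr, y_te = train_test_split(data.data, data.target,
                                          random_state=0)

model = make_pipeline(
    CountVectorizer(max_features=5000, stop_words="english"),
    LatentDirichletAllocation(n_components=20, random_state=0),  # topic domain
    LogisticRegression(max_iter=1000),   # decision layer over topic features
)
model.fit(X_tr, y_tr)
print(f"topic-domain accuracy: {model.score(X_te, y_te):.2f}")
```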
