Search | VHL Regional Portal

1.

Physical Adversarial Attacks for Surveillance: A Survey.

Nguyen, Kien; Fernando, Tharindu; Fookes, Clinton; Sridharan, Sridha.

IEEE Trans Neural Netw Learn Syst ; PP2023 Oct 12.

Article in English | MEDLINE | ID: mdl-37824320

ABSTRACT

Modern automated surveillance techniques are heavily reliant on deep learning methods. Despite the superior performance, these learning systems are inherently vulnerable to adversarial attacks-maliciously crafted inputs that are designed to mislead, or trick, models into making incorrect predictions. An adversary can physically change their appearance by wearing adversarial t-shirts, glasses, or hats or by specific behavior, to potentially avoid various forms of detection, tracking, and recognition of surveillance systems; and obtain unauthorized access to secure properties and assets. This poses a severe threat to the security and safety of modern surveillance systems. This article reviews recent attempts and findings in learning and designing physical adversarial attacks for surveillance applications. In particular, we propose a framework to analyze physical adversarial attacks and provide a comprehensive survey of physical adversarial attacks on four key surveillance tasks: detection, identification, tracking, and action recognition under this framework. Furthermore, we review and analyze strategies to defend against physical adversarial attacks and the methods for evaluating the strengths of the defense. The insights in this article present an important step in building resilience within surveillance systems to physical adversarial attacks.

2.

Complex-Valued Iris Recognition Network.

Nguyen, Kien; Fookes, Clinton; Sridharan, Sridha; Ross, Arun.

IEEE Trans Pattern Anal Mach Intell ; 45(1): 182-196, 2023 Jan.

Article in English | MEDLINE | ID: mdl-35201979

ABSTRACT

In this work, we design a fully complex-valued neural network for the task of iris recognition. Unlike the problem of general object recognition, where real-valued neural networks can be used to extract pertinent features, iris recognition depends on the extraction of both phase and magnitude information from the input iris texture in order to better represent its biometric content. This necessitates the extraction and processing of phase information that cannot be effectively handled by a real-valued neural network. In this regard, we design a fully complex-valued neural network that can better capture the multi-scale, multi-resolution, and multi-orientation phase and amplitude features of the iris texture. We show a strong correspondence of the proposed complex-valued iris recognition network with Gabor wavelets that are used to generate the classical IrisCode; however, the proposed method enables a new capability of automatic complex-valued feature learning that is tailored for iris recognition. We conduct experiments on three benchmark datasets - ND-CrossSensor-2013, CASIA-Iris-Thousand and UBIRIS.v2 - and show the benefit of the proposed network for the task of iris recognition. We exploit visualization schemes to convey how the complex-valued network, when compared to standard real-valued networks, extracts fundamentally different features from the iris texture.

3.

Generalized Generative Deep Learning Models for Biosignal Synthesis and Modality Transfer.

Dissanayake, Theekshana; Fernando, Tharindu; Denman, Simon; Sridharan, Sridha; Fookes, Clinton.

IEEE J Biomed Health Inform ; 27(2): 968-979, 2023 02.

Article in English | MEDLINE | ID: mdl-36409802

ABSTRACT

Generative Adversarial Networks (GANs) are a revolutionary innovation in machine learning that enables the generation of artificial data. Artificial data synthesis is valuable especially in the medical field where it is difficult to collect and annotate real data due to privacy issues, limited access to experts, and cost. While adversarial training has led to significant breakthroughs in the computer vision field, biomedical research has not yet fully exploited the capabilities of generative models for data generation, and for more complex tasks such as biosignal modality transfer. We present a broad analysis on adversarial learning on biosignal data. Our study is the first in the machine learning community to focus on synthesizing 1D biosignal data using adversarial models. We consider three types of deep generative adversarial networks: a classical GAN, an adversarial AE, and a modality transfer GAN; individually designed for biosignal synthesis and modality transfer purposes. We evaluate these methods on multiple datasets for different biosignal modalites, including phonocardiogram (PCG), electrocardiogram (ECG), vectorcardiogram and 12-lead electrocardiogram. We follow subject-independent evaluation protocols, by evaluating the proposed models' performance on completely unseen data to demonstrate generalizability. We achieve superior results in generating biosignals, specifically in conditional generation, by synthesizing realistic samples while preserving domain-relevant characteristics. We also demonstrate insightful results in biosignal modality transfer that can generate expanded representations from fewer input-leads, ultimately making the clinical monitoring setting more convenient for the patient. Furthermore our longer duration ECGs generated, maintain clear ECG rhythmic regions, which has been proven using ad-hoc segmentation models.

Subject(s)

Biomedical Research , Deep Learning , Humans , Electrocardiography , Machine Learning , Privacy , Image Processing, Computer-Assisted

4.

Accurate 3D hand mesh recovery from a single RGB image.

Pemasiri, Akila; Nguyen, Kien; Sridharan, Sridha; Fookes, Clinton.

Sci Rep ; 12(1): 11043, 2022 06 30.

Article in English | MEDLINE | ID: mdl-35773266

ABSTRACT

This work addresses hand mesh recovery from a single RGB image. In contrast to most of the existing approaches where parametric hand models are employed as the prior, we show that the hand mesh can be learned directly from the input image. We propose a new type of GAN called Im2Mesh GAN to learn the mesh through end-to-end adversarial training. By interpreting the mesh as a graph, our model is able to capture the topological relationship among the mesh vertices. We also introduce a 3D surface descriptor into the GAN architecture to further capture the associated 3D features. We conduct experiments with the proposed Im2Mesh GAN architecture in two settings: one where we can reap the benefits of coupled groundtruth data availability of the images and the corresponding meshes; and the other which combats the more challenging problem of mesh estimation without the corresponding groundtruth. Through extensive evaluations we demonstrate that even without using any hand priors the proposed method performs on par or better than the state-of-the-art.

Subject(s)

Image Processing, Computer-Assisted , Surgical Mesh , Hand , Image Processing, Computer-Assisted/methods

5.

Robust and Interpretable Temporal Convolution Network for Event Detection in Lung Sound Recordings.

Fernando, Tharindu; Sridharan, Sridha; Denman, Simon; Ghaemmaghami, Houman; Fookes, Clinton.

IEEE J Biomed Health Inform ; 26(7): 2898-2908, 2022 07.

Article in English | MEDLINE | ID: mdl-35061595

ABSTRACT

OBJECTIVE: This paper proposes a novel framework for lung sound event detection, segmenting continuous lung sound recordings into discrete events and performing recognition of each event. METHODS: We propose the use of a multi-branch TCN architecture and exploit a novel fusion strategy to combine the resultant features from these branches. This not only allows the network to retain the most salient information across different temporal granularities and disregards irrelevant information, but also allows our network to process recordings of arbitrary length. RESULTS: The proposed method is evaluated on multiple public and in-house benchmarks, containing irregular and noisy recordings of the respiratory auscultation process for the identification of auscultation events including inhalation, crackles, and rhonchi. Moreover, we provide an end-to-end model interpretation pipeline. CONCLUSION: Our analysis of different feature fusion strategies shows that the proposed feature concatenation method leads to better suppression of non-informative features, which drastically reduces the classifier overhead resulting in a robust lightweight network. SIGNIFICANCE: Lung sound event detection is a primary diagnostic step for numerous respiratory diseases. The proposed method provides a cost-effective and efficient alternative to exhaustive manual segmentation, and provides more accurate segmentation than existing methods. The end-to-end model interpretability helps to build the required trust in the system for use in clinical settings.

Subject(s)

Respiratory Sounds , Sound Recordings , Algorithms , Auscultation/methods , Humans , Lung

6.

Geometric Deep Learning for Subject Independent Epileptic Seizure Prediction Using Scalp EEG Signals.

Dissanayake, Theekshana; Fernando, Tharindu; Denman, Simon; Sridharan, Sridha; Fookes, Clinton.

IEEE J Biomed Health Inform ; 26(2): 527-538, 2022 02.

Article in English | MEDLINE | ID: mdl-34314363

ABSTRACT

Recently, researchers in the biomedical community have introduced deep learning-based epileptic seizure prediction models using electroencephalograms (EEGs) that can anticipate an epileptic seizure by differentiating between the pre-ictal and interictal stages of the subject's brain. Despite having the appearance of a typical anomaly detection task, this problem is complicated by subject-specific characteristics in EEG data. Therefore, studies that investigate seizure prediction widely employ subject-specific models. However, this approach is not suitable in situations where a target subject has limited (or no) data for training. Subject-independent models can address this issue by learning to predict seizures from multiple subjects, and therefore are of greater value in practice. In this study, we propose a subject-independent seizure predictor using Geometric Deep Learning (GDL). In the first stage of our GDL-based method we use graphs derived from physical connections in the EEG grid. We subsequently seek to synthesize subject-specific graphs using deep learning. The models proposed in both stages achieve state-of-the-art performance using a one-hour early seizure prediction window on two benchmark datasets (CHB-MIT-EEG: 95.38% with 23 subjects and Siena-EEG: 96.05% with 15 subjects). To the best of our knowledge, this is the first study that proposes synthesizing subject-specific graphs for seizure prediction. Furthermore, through model interpretation we outline how this method can potentially contribute towards Scalp EEG-based seizure localization.

Subject(s)

Deep Learning , Algorithms , Electroencephalography/methods , Humans , Scalp , Seizures/diagnosis

7.

TMMF: Temporal Multi-Modal Fusion for Single-Stage Continuous Gesture Recognition.

Gammulle, Harshala; Denman, Simon; Sridharan, Sridha; Fookes, Clinton.

IEEE Trans Image Process ; 30: 7689-7701, 2021.

Article in English | MEDLINE | ID: mdl-34478365

ABSTRACT

Gesture recognition is a much studied research area which has myriad real-world applications including robotics and human-machine interaction. Current gesture recognition methods have focused on recognising isolated gestures, and existing continuous gesture recognition methods are limited to two-stage approaches where independent models are required for detection and classification, with the performance of the latter being constrained by detection performance. In contrast, we introduce a single-stage continuous gesture recognition framework, called Temporal Multi-Modal Fusion (TMMF), that can detect and classify multiple gestures in a video via a single model. This approach learns the natural transitions between gestures and non-gestures without the need for a pre-processing segmentation step to detect individual gestures. To achieve this, we introduce a multi-modal fusion mechanism to support the integration of important information that flows from multi-modal inputs, and is scalable to any number of modes. Additionally, we propose Unimodal Feature Mapping (UFM) and Multi-modal Feature Mapping (MFM) models to map uni-modal features and the fused multi-modal features respectively. To further enhance performance, we propose a mid-point based loss function that encourages smooth alignment between the ground truth and the prediction, helping the model to learn natural gesture transitions. We demonstrate the utility of our proposed framework, which can handle variable-length input videos, and outperforms the state-of-the-art on three challenging datasets: EgoGesture, IPN hand and ChaLearn LAP Continuous Gesture Dataset (ConGD). Furthermore, ablation experiments show the importance of different components of the proposed framework.

Subject(s)

Gestures , Pattern Recognition, Automated , Algorithms , Hand , Humans

8.

Domain Generalization in Biosignal Classification.

Dissanayake, Theekshana; Fernando, Tharindu; Denman, Simon; Ghaemmaghami, Houman; Sridharan, Sridha; Fookes, Clinton.

IEEE Trans Biomed Eng ; 68(6): 1978-1989, 2021 06.

Article in English | MEDLINE | ID: mdl-33338009

ABSTRACT

OBJECTIVE: When training machine learning models, we often assume that the training data and evaluation data are sampled from the same distribution. However, this assumption is violated when the model is evaluated on another unseen but similar database, even if that database contains the same classes. This problem is caused by domain-shift and can be solved using two approaches: domain adaptation and domain generalization. Simply, domain adaptation methods can access data from unseen domains during training; whereas in domain generalization, the unseen data is not available during training. Hence, domain generalization concerns models that perform well on inaccessible, domain-shifted data. METHOD: Our proposed domain generalization method represents an unseen domain using a set of known basis domains, afterwhich we classify the unseen domain using classifier fusion. To demonstrate our system, we employ a collection of heart sound databases that contain normal and abnormal sounds (classes). RESULTS: Our proposed classifier fusion method achieves accuracy gains of up to 16% for four completely unseen domains. CONCLUSION: Recognizing the complexity induced by the inherent temporal nature of biosignal data, the two-stage method proposed in this study is able to effectively simplify the whole process of domain generalization while demonstrating good results on unseen domains and the adopted basis domains. SIGNIFICANCE: To our best knowledge, this is the first study that investigates domain generalization for biosignal data. Our proposed learning strategy can be used to effectively learn domain-relevant features while being aware of the class differences in the data.

Subject(s)

Heart Sounds , Machine Learning , Databases, Factual

9.

Identification of Children at Risk of Schizophrenia via Deep Learning and EEG Responses.

Ahmedt-Aristizabal, David; Fernando, Tharindu; Denman, Simon; Robinson, Jonathan Edward; Sridharan, Sridha; Johnston, Patrick J; Laurens, Kristin R; Fookes, Clinton.

IEEE J Biomed Health Inform ; 25(1): 69-76, 2021 01.

Article in English | MEDLINE | ID: mdl-32310808

ABSTRACT

The prospective identification of children likely to develop schizophrenia is a vital tool to support early interventions that can mitigate the risk of progression to clinical psychosis. Electroencephalographic (EEG) patterns from brain activity and deep learning techniques are valuable resources in achieving this identification. We propose automated techniques that can process raw EEG waveforms to identify children who may have an increased risk of schizophrenia compared to typically developing children. We also analyse abnormal features that remain during developmental follow-up over a period of â¼ 4 years in children with a vulnerability to schizophrenia initially assessed when aged 9 to 12 years. EEG data from participants were captured during the recording of a passive auditory oddball paradigm. We undertake a holistic study to identify brain abnormalities, first by exploring traditional machine learning algorithms using classification methods applied to hand-engineered features (event-related potential components). Then, we compare the performance of these methods with end-to-end deep learning techniques applied to raw data. We demonstrate via average cross-validation performance measures that recurrent deep convolutional neural networks can outperform traditional machine learning methods for sequence modeling. We illustrate the intuitive salient information of the model with the location of the most relevant attributes of a post-stimulus window. This baseline identification system in the area of mental illness supports the evidence of developmental and disease effects in a pre-prodromal phase of psychosis. These results reinforce the benefits of deep learning to support psychiatric classification and neuroscientific research more broadly.

Subject(s)

Deep Learning , Schizophrenia , Child , Electroencephalography , Humans , Neural Networks, Computer , Prospective Studies , Schizophrenia/diagnosis

10.

A Robust Interpretable Deep Learning Classifier for Heart Anomaly Detection Without Segmentation.

Dissanayake, Theekshana; Fernando, Tharindu; Denman, Simon; Sridharan, Sridha; Ghaemmaghami, Houman; Fookes, Clinton.

IEEE J Biomed Health Inform ; 25(6): 2162-2171, 2021 06.

Article in English | MEDLINE | ID: mdl-32997637

ABSTRACT

Traditionally, abnormal heart sound classification is framed as a three-stage process. The first stage involves segmenting the phonocardiogram to detect fundamental heart sounds; after which features are extracted and classification is performed. Some researchers in the field argue the segmentation step is an unwanted computational burden, whereas others embrace it as a prior step to feature extraction. When comparing accuracies achieved by studies that have segmented heart sounds before analysis with those who have overlooked that step, the question of whether to segment heart sounds before feature extraction is still open. In this study, we explicitly examine the importance of heart sound segmentation as a prior step for heart sound classification, and then seek to apply the obtained insights to propose a robust classifier for abnormal heart sound detection. Furthermore, recognizing the pressing need for explainable Artificial Intelligence (AI) models in the medical domain, we also unveil hidden representations learned by the classifier using model interpretation techniques. Experimental results demonstrate that the segmentation which can be learned by the model plays an essential role in abnormal heart sound classification. Our new classifier is also shown to be robust, stable and most importantly, explainable, with an accuracy of almost 100% on the widely used PhysioNet dataset.

Subject(s)

Deep Learning , Signal Processing, Computer-Assisted , Algorithms , Artificial Intelligence , Phonocardiography

11.

Neural memory plasticity for medical anomaly detection.

Fernando, Tharindu; Denman, Simon; Ahmedt-Aristizabal, David; Sridharan, Sridha; Laurens, Kristin R; Johnston, Patrick; Fookes, Clinton.

Neural Netw ; 127: 67-81, 2020 Jul.

Article in English | MEDLINE | ID: mdl-32334342

ABSTRACT

In the domain of machine learning, Neural Memory Networks (NMNs) have recently achieved impressive results in a variety of application areas including visual question answering, trajectory prediction, object tracking, and language modelling. However, we observe that the attention based knowledge retrieval mechanisms used in current NMNs restrict them from achieving their full potential as the attention process retrieves information based on a set of static connection weights. This is suboptimal in a setting where there are vast differences among samples in the data domain; such as anomaly detection where there is no consistent criteria for what constitutes an anomaly. In this paper, we propose a plastic neural memory access mechanism which exploits both static and dynamic connection weights in the memory read, write and output generation procedures. We demonstrate the effectiveness and flexibility of the proposed memory model in three challenging anomaly detection tasks in the medical domain: abnormal EEG identification, MRI tumour type classification and schizophrenia risk detection in children. In all settings, the proposed approach outperforms the current state-of-the-art. Furthermore, we perform an in-depth analysis demonstrating the utility of neural plasticity for the knowledge retrieval process and provide evidence on how the proposed memory model generates sparse yet informative memory outputs.

Subject(s)

Electroencephalography/methods , Machine Learning , Magnetic Resonance Imaging/methods , Neural Networks, Computer , Neuronal Plasticity , Attention/physiology , Brain Neoplasms/diagnostic imaging , Databases, Factual/trends , Electroencephalography/trends , Humans , Machine Learning/trends , Magnetic Resonance Imaging/trends , Memory/physiology , Neuronal Plasticity/physiology

12.

Heart Sound Segmentation Using Bidirectional LSTMs With Attention.

Fernando, Tharindu; Ghaemmaghami, Houman; Denman, Simon; Sridharan, Sridha; Hussain, Nayyar; Fookes, Clinton.

IEEE J Biomed Health Inform ; 24(6): 1601-1609, 2020 06.

Article in English | MEDLINE | ID: mdl-31670683

ABSTRACT

OBJECTIVE: This paper proposes a novel framework for the segmentation of phonocardiogram (PCG) signals into heart states, exploiting the temporal evolution of the PCG as well as considering the salient information that it provides for the detection of the heart state. METHODS: We propose the use of recurrent neural networks and exploit recent advancements in attention based learning to segment the PCG signal. This allows the network to identify the most salient aspects of the signal and disregard uninformative information. RESULTS: The proposed method attains state-of-the-art performance on multiple benchmarks including both human and animal heart recordings. Furthermore, we empirically analyse different feature combinations including envelop features, wavelet and Mel Frequency Cepstral Coefficients (MFCC), and provide quantitative measurements that explore the importance of different features in the proposed approach. CONCLUSION: We demonstrate that a recurrent neural network coupled with attention mechanisms can effectively learn from irregular and noisy PCG recordings. Our analysis of different feature combinations shows that MFCC features and their derivatives offer the best performance compared to classical wavelet and envelop features. SIGNIFICANCE: Heart sound segmentation is a crucial pre-processing step for many diagnostic applications. The proposed method provides a cost effective alternative to labour extensive manual segmentation, and provides a more accurate segmentation than existing methods. As such, it can improve the performance of further analysis including the detection of murmurs and ejection clicks. The proposed method is also applicable for detection and segmentation of other one dimensional biomedical signals.

Subject(s)

Heart Sounds/physiology , Neural Networks, Computer , Phonocardiography/methods , Signal Processing, Computer-Assisted , Animals , Deep Learning , Female , Humans , Male , Phonocardiography/classification

13.

Understanding Patients' Behavior: Vision-Based Analysis of Seizure Disorders.

Ahmedt-Aristizabal, David; Denman, Simon; Nguyen, Kien; Sridharan, Sridha; Dionisio, Sasha; Fookes, Clinton.

IEEE J Biomed Health Inform ; 23(6): 2583-2591, 2019 11.

Article in English | MEDLINE | ID: mdl-30714935

ABSTRACT

A substantial proportion of patients with functional neurological disorders (FND) are being incorrectly diagnosed with epilepsy because their semiology resembles that of epileptic seizures (ES). Misdiagnosis may lead to unnecessary treatment and its associated complications. Diagnostic errors often result from an overreliance on specific clinical features. Furthermore, the lack of electrophysiological changes in patients with FND can also be seen in some forms of epilepsy, making diagnosis extremely challenging. Therefore, understanding semiology is an essential step for differentiating between ES and FND. Existing sensor-based and marker-based systems require physical contact with the body and are vulnerable to clinical situations such as patient positions, illumination changes, and motion discontinuities. Computer vision and deep learning are advancing to overcome these limitations encountered in the assessment of diseases and patient monitoring; however, they have not been investigated for seizure disorder scenarios. Here, we propose and compare two marker-free deep learning models, a landmark-based and a region-based model, both of which are capable of distinguishing between seizures from video recordings. We quantify semiology by using either a fusion of reference points and flow fields, or through the complete analysis of the body. Average leave-one-subject-out cross-validation accuracies for the landmark-based and region-based approaches of 68.1% and 79.6% in our dataset collected from 35 patients, reveal the benefit of video analytics to support automated identification of semiology in the challenging conditions of a hospital setting.

Subject(s)

Epilepsy/diagnostic imaging , Image Interpretation, Computer-Assisted/methods , Monitoring, Physiologic/methods , Video Recording/methods , Deep Learning , Humans

14.

Aberrant epileptic seizure identification: A computer vision perspective.

Ahmedt-Aristizabal, David; Fookes, Clinton; Denman, Simon; Nguyen, Kien; Sridharan, Sridha; Dionisio, Sasha.

Seizure ; 65: 65-71, 2019 Feb.

Article in English | MEDLINE | ID: mdl-30616221

ABSTRACT

PURPOSE: The recent explosion of artificial intelligence techniques in video analytics has highlighted the clinical relevance in capturing and quantifying semiology during epileptic seizures; however, we lack an automated anomaly identification system for aberrant behaviors. In this paper, we describe a novel system that is trained with known clinical manifestations from patients with mesial temporal and extra-temporal lobe epilepsy and presents aberrant semiology to physicians. METHODS: We propose a simple end-to-end-architecture based on convolutional and recurrent neural networks to extract spatiotemporal representations and to create motion capture libraries from 119 seizures of 28 patients. The cosine similarity distance between a test representation and the libraries from five aberrant seizures separate to the main dataset is subsequently used to identify test seizures with unusual patterns that do not conform to known behavior. RESULTS: Cross-validation evaluations are performed to validate the quantification of motion features and to demonstrate the robustness of the motion capture libraries for identifying epilepsy types. The system to identify unusual epileptic seizures successfully detects out of the five seizures categorized as aberrant cases. CONCLUSIONS: The proposed approach is capable of modeling clinical manifestations of known behaviors in natural clinical settings, and effectively identify aberrant seizures using a simple strategy based on motion capture libraries of spatiotemporal representations and similarities between hidden states. Detecting anomalies is essential to alert clinicians to the occurrence of unusual events, and we show how this can be achieved using pre-learned database of semiology stored in health records.

Subject(s)

Brain/physiopathology , Diagnosis, Computer-Assisted/methods , Epilepsy, Temporal Lobe/diagnosis , Epilepsy, Temporal Lobe/physiopathology , Seizures/diagnosis , Electroencephalography , Female , Humans , Male , Neural Networks, Computer , Reproducibility of Results , Seizures/physiopathology , Video Recording

15.

Vision-Based Mouth Motion Analysis in Epilepsy: A 3D Perspective.

Ahmedt-Aristizabal, David; Nguyen, Kien; Denman, Simon; Sarfraz, M Saquib; Sridharan, Sridha; Dionisio, Sasha; Fookes, Clinton.

Annu Int Conf IEEE Eng Med Biol Soc ; 2019: 1625-1629, 2019 Jul.

Article in English | MEDLINE | ID: mdl-31946208

ABSTRACT

Epilepsy monitoring involves the study of videos to assess clinical signs (semiology) to assist with the diagnosis of seizures. Recent advances in the application of vision-based approaches to epilepsy analysis have demonstrated significant potential to automate this assessment. Nevertheless, current proposed computer vision based techniques are unable to accurately quantify specific facial modifications, e.g. mouth motions, which are examined by neurologists to distinguish between seizure types. 2D approaches that analyse facial landmarks have been proposed to quantify mouth motions, however, they are unable to fully represent motions in the mouth and cheeks (ictal pouting) due to a lack of landmarks in the the cheek regions. Additionally, 2D region-based techniques based on the detection of the mouth have limitations when dealing with large pose variations, and thus make a fair comparison between samples difficult due to the variety of poses present. 3D approaches, on the other hand, retain rich information about the shape and appearance of faces, simplifying alignment for comparison between sequences. In this paper, we propose a novel network method based on a 3D reconstruction of the face and deep learning to detect and quantify mouth semiology in our video dataset of 20 seizures, recorded from patients with mesial temporal and extra-temporal lobe epilepsy. The proposed network is capable of distinguishing between seizures of both types of epilepsy. An average classification accuracy of 89% demonstrates the benefits of computer vision and deep learning for clinical applications of non-contact systems to identify semiology commonly encountered in a natural clinical setting.

Subject(s)

Epilepsy , Electroencephalography , Face , Humans , Mouth , Seizures

16.

Deep Motion Analysis for Epileptic Seizure Classification.

Ahmedt-Aristizabal, David; Nguyen, Kien; Denman, Simon; Sridharan, Sridha; Dionisio, Sasha; Fookes, Clinton.

Annu Int Conf IEEE Eng Med Biol Soc ; 2018: 3578-3581, 2018 Jul.

Article in English | MEDLINE | ID: mdl-30441151

ABSTRACT

Visual motion clues such as facial expression and pose are natural semiology features which an epileptologist observes to identify epileptic seizures. However, these cues have not been effectively exploited for automatic detection due to the diverse variations in seizure appearance within and between patients. Here we present a multi-modal analysis approach to quantitatively classify patients with mesial temporal lobe (MTLE) and extra-temporal lobe (ETLE) epilepsy, relying on the fusion of facial expressions and pose dynamics. We propose a new deep learning approach that leverages recent advances in Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks to automatically extract spatiotemporal features from facial and pose semiology using recorded videos. A video dataset from 12 patients with MTLE and 6 patients with ETLEin an Australian hospital has been collected for experiments. Our experiments show that facial semiology and body movements can be effectively recognized and tracked, and that they provide useful evidence to identify the type of epilepsy. A multi-fold cross-validation of the fusion model exhibited an average test accuracy of 92.10%, while a leave-one-subject-out cross-validation scheme, which is the first in the literature, achieves an accuracy of 58.49%. The proposed approach is capable of modelling semiology features which effectively discriminate between seizures arising from temporal and extra-temporal brain areas. Our approach can be used as a virtual assistant, which will save time, improve patient safety and provide objective clinical analysis to assist with clinical decision making.

Subject(s)

Epilepsy , Seizures , Humans

17.

Deep Classification of Epileptic Signals.

Ahmedt-Aristizabal, David; Fookes, Clinton; Nguyen, Kien; Sridharan, Sridha.

Annu Int Conf IEEE Eng Med Biol Soc ; 2018: 332-335, 2018 Jul.

Article in English | MEDLINE | ID: mdl-30440405

ABSTRACT

Electrophysiological observation plays a major role in epilepsy evaluation. However, human interpretation of brain signals is subjective and prone to misdiagnosis. Automating this process, especially seizure detection relying on scalpbased Electroencephalography (EEG) and intracranial EEG, has been the focus of research over recent decades. Nevertheless, its numerous challenges have inhibited a definitive solution. Inspired by recent advances in deep learning, here we describe a new classification approach for EEG time series based on Recurrent Neural Networks (RNNs) via the use of Long- Short Term Memory (LSTM) networks. The proposed deep network effectively learns and models discriminative temporal patterns from EEG sequential data. Especially, the features are automatically discovered from the raw EEG data without any pre-processing step, eliminating humans from laborious feature design task. Our light-weight system has a low computational complexity and reduced memory requirement for large training datasets. On a public dataset, a multi-fold cross-validation scheme of the proposed architecture exhibited an average validation accuracy of 95.54% and an average AUC of 0.9582 of the ROC curve among all sets defined in the experiment. This work reinforces the benefits of deep learning to be further attended in clinical applications and neuroscientific research.

Subject(s)

Epilepsy , Brain , Electroencephalography , Humans , Neural Networks, Computer , Seizures

18.

Soft + Hardwired attention: An LSTM framework for human trajectory prediction and abnormal event detection.

Fernando, Tharindu; Denman, Simon; Sridharan, Sridha; Fookes, Clinton.

Neural Netw ; 108: 466-478, 2018 Dec.

Article in English | MEDLINE | ID: mdl-30317132

ABSTRACT

As humans we possess an intuitive ability for navigation which we master through years of practice; however existing approaches to model this trait for diverse tasks including monitoring pedestrian flow and detecting abnormal events have been limited by using a variety of hand-crafted features. Recent research in the area of deep-learning has demonstrated the power of learning features directly from the data; and related research in recurrent neural networks has shown exemplary results in sequence-to-sequence problems such as neural machine translation and neural image caption generation. Motivated by these approaches, we propose a novel method to predict the future motion of a pedestrian given a short history of their, and their neighbours, past behaviour. The novelty of the proposed method is the combined attention model which utilises both "soft attention" as well as "hard-wired" attention in order to map the trajectory information from the local neighbourhood to the future positions of the pedestrian of interest. We illustrate how a simple approximation of attention weights (i.e. hard-wired) can be merged together with soft attention weights in order to make our model applicable for challenging real world scenarios with hundreds of neighbours. The navigational capability of the proposed method is tested on two challenging publicly available surveillance databases where our model outperforms the current-state-of-the-art methods. Additionally, we illustrate how the proposed architecture can be directly applied for the task of abnormal event detection without handcrafting the features.

Subject(s)

Attention , Deep Learning , Neural Networks, Computer , Databases, Factual , Deep Learning/trends , Forecasting , Humans , Machine Learning/trends , Motion

19.

A hierarchical multimodal system for motion analysis in patients with epilepsy.

Ahmedt-Aristizabal, David; Fookes, Clinton; Denman, Simon; Nguyen, Kien; Fernando, Tharindu; Sridharan, Sridha; Dionisio, Sasha.

Epilepsy Behav ; 87: 46-58, 2018 10.

Article in English | MEDLINE | ID: mdl-30173017

ABSTRACT

During seizures, a myriad of clinical manifestations may occur. The analysis of these signs, known as seizure semiology, gives clues to the underlying cerebral networks involved. When patients with drug-resistant epilepsy are monitored to assess their suitability for epilepsy surgery, semiology is a vital component to the presurgical evaluation. Specific patterns of facial movements, head motions, limb posturing and articulations, and hand and finger automatisms may be useful in distinguishing between mesial temporal lobe epilepsy (MTLE) and extratemporal lobe epilepsy (ETLE). However, this analysis is time-consuming and dependent on clinical experience and training. Given this limitation, an automated analysis of semiological patterns, i.e., detection, quantification, and recognition of body movement patterns, has the potential to help increase the diagnostic precision of localization. While a few single modal quantitative approaches are available to assess seizure semiology, the automated quantification of patients' behavior across multiple modalities has seen limited advances in the literature. This is largely due to multiple complicated variables commonly encountered in the clinical setting, such as analyzing subtle physical movements when the patient is covered or room lighting is inadequate. Semiology encompasses the stepwise/temporal progression of signs that is reflective of the integration of connected neuronal networks. Thus, single signs in isolation are far less informative. Taking this into account, here, we describe a novel modular, hierarchical, multimodal system that aims to detect and quantify semiologic signs recorded in 2D monitoring videos. Our approach can jointly learn semiologic features from facial, body, and hand motions based on computer vision and deep learning architectures. A dataset collected from an Australian quaternary referral epilepsy unit analyzing 161 seizures arising from the temporal (nâ¯=â¯90) and extratemporal (nâ¯=â¯71) brain regions has been used in our system to quantitatively classify these types of epilepsy according to the semiology detected. A leave-one-subject-out (LOSO) cross-validation of semiological patterns from the face, body, and hands reached classification accuracies ranging between 12% and 83.4%, 41.2% and 80.1%, and 32.8% and 69.3%, respectively. The proposed hierarchical multimodal system is a potential stepping-stone towards developing a fully automated semiology analysis system to support the assessment of epilepsy.

Subject(s)

Automatism/physiopathology , Deep Learning , Epilepsy, Temporal Lobe/diagnosis , Epilepsy/diagnosis , Face/physiopathology , Hand/physiopathology , Movement/physiology , Neurophysiological Monitoring/methods , Seizures/diagnosis , Biomechanical Phenomena , Datasets as Topic , Humans

20.

Deep facial analysis: A new phase I epilepsy evaluation using computer vision.

Ahmedt-Aristizabal, David; Fookes, Clinton; Nguyen, Kien; Denman, Simon; Sridharan, Sridha; Dionisio, Sasha.

Epilepsy Behav ; 82: 17-24, 2018 05.

Article in English | MEDLINE | ID: mdl-29574299

ABSTRACT

Semiology observation and characterization play a major role in the presurgical evaluation of epilepsy. However, the interpretation of patient movements has subjective and intrinsic challenges. In this paper, we develop approaches to attempt to automatically extract and classify semiological patterns from facial expressions. We address limitations of existing computer-based analytical approaches of epilepsy monitoring, where facial movements have largely been ignored. This is an area that has seen limited advances in the literature. Inspired by recent advances in deep learning, we propose two deep learning models, landmark-based and region-based, to quantitatively identify changes in facial semiology in patients with mesial temporal lobe epilepsy (MTLE) from spontaneous expressions during phase I monitoring. A dataset has been collected from the Mater Advanced Epilepsy Unit (Brisbane, Australia) and is used to evaluate our proposed approach. Our experiments show that a landmark-based approach achieves promising results in analyzing facial semiology, where movements can be effectively marked and tracked when there is a frontal face on visualization. However, the region-based counterpart with spatiotemporal features achieves more accurate results when confronted with extreme head positions. A multifold cross-validation of the region-based approach exhibited an average test accuracy of 95.19% and an average AUC of 0.98 of the ROC curve. Conversely, a leave-one-subject-out cross-validation scheme for the same approach reveals a reduction in accuracy for the model as it is affected by data limitations and achieves an average test accuracy of 50.85%. Overall, the proposed deep learning models have shown promise in quantifying ictal facial movements in patients with MTLE. In turn, this may serve to enhance the automated presurgical epilepsy evaluation by allowing for standardization, mitigating bias, and assessing key features. The computer-aided diagnosis may help to support clinical decision-making and prevent erroneous localization and surgery.

Subject(s)

Biometric Identification/methods , Diagnosis, Computer-Assisted/methods , Epilepsy/diagnosis , Video Recording/methods , Australia/epidemiology , Biometric Identification/standards , Diagnosis, Computer-Assisted/standards , Epilepsy/epidemiology , Epilepsy/physiopathology , Face/anatomy & histology , Face/physiology , Humans , Male , Movement/physiology , Neurologic Examination/methods , Neurologic Examination/standards , Reproducibility of Results , Video Recording/standards

ABSTRACT

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

SEND TO:

SELECTION OF CITATIONS

SEARCH DETAIL