Results 1 - 7 of 7
1.
Sensors (Basel) ; 23(23)2023 Nov 23.
Article in English | MEDLINE | ID: mdl-38067738

ABSTRACT

This paper proposes, analyzes, and evaluates a deep learning architecture based on transformers for generating sign language motion from sign phonemes (represented using HamNoSys, a notation system developed at the University of Hamburg). The sign phonemes provide information about sign characteristics such as hand configuration, localization, or movement. The use of sign phonemes is crucial for generating sign motion with a high level of detail (including finger extensions and flexions). The transformer-based approach also includes a stop detection module for predicting the end of the generation process. Both aspects, motion generation and stop detection, are evaluated in detail. For motion generation, the dynamic time warping (DTW) distance is used to compute the similarity between two landmark sequences (ground truth and generated). The stop detection module is evaluated considering detection accuracy and ROC (receiver operating characteristic) curves. The paper proposes and evaluates several strategies to obtain the system configuration with the best performance, including different padding schemes, interpolation approaches, and data augmentation techniques. The best configuration of a fully automatic system obtains an average DTW distance per frame of 0.1057 and an area under the ROC curve (AUC) higher than 0.94.


Subject(s)
Algorithms , Sign Language , Humans , Motion , Movement , Hand
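The per-frame DTW figure reported above can be reproduced schematically. A minimal sketch, assuming a plain Euclidean distance between flattened landmark frames (the paper's exact landmark set and frame distance are not specified here):

```python
import numpy as np

def dtw_distance(seq_a, seq_b):
    """Classic dynamic-programming DTW between two sequences of landmark
    frames, each frame a flat vector of coordinates."""
    n, m = len(seq_a), len(seq_b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]

# Synthetic ground-truth vs. generated landmark sequences (illustrative)
rng = np.random.default_rng(0)
gt = rng.normal(size=(40, 10))
gen = gt + 0.01
per_frame = dtw_distance(gt, gen) / len(gen)
```

Dividing the accumulated DTW cost by the number of generated frames yields the per-frame distance quoted in the abstract.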
2.
J Imaging ; 9(12)2023 Nov 27.
Article in English | MEDLINE | ID: mdl-38132680

ABSTRACT

Several sign language datasets are available in the literature. Most of them are designed for sign language recognition and translation. This paper presents a new sign language dataset for automatic motion generation. This dataset includes phonemes for each sign (specified in HamNoSys, a transcription system developed at the University of Hamburg, Germany) and the corresponding motion information. The motion information includes sign videos and the sequence of extracted landmarks associated with relevant points of the skeleton (including face, arms, hands, and fingers). The dataset includes signs from three different subjects in three different positions, performing 754 signs including the entire alphabet, numbers from 0 to 100, numbers for hour specification, months, weekdays, and the most frequent signs used in Spanish Sign Language (LSE). In total, there are 6786 videos and their corresponding phonemes (HamNoSys annotations). From each video, a sequence of landmarks was extracted using MediaPipe. The dataset allows training an automatic system for motion generation from sign language phonemes. This paper also presents preliminary results in motion generation from sign phonemes, obtaining a dynamic time warping distance per frame of 0.37.

3.
Sensors (Basel) ; 23(13)2023 Jun 23.
Article in English | MEDLINE | ID: mdl-37447695

ABSTRACT

Sensor orientation is a critical aspect in a Human Activity Recognition (HAR) system based on tri-axial signals (such as accelerations); different sensor orientations introduce important errors into the activity recognition process. This paper proposes a new preprocessing module to reduce the negative impact of sensor-orientation variability in HAR. First, this module estimates a consistent reference system; then, the tri-axial signals recorded from sensors with different orientations are transformed into this consistent reference system. This preprocessing has been evaluated to mitigate the effect of different sensor orientations on classification accuracy in several state-of-the-art HAR systems. The experiments were carried out using a subject-wise cross-validation methodology over six different datasets, including movements and postures. The new preprocessing module provided robust HAR performance even when sudden sensor orientation changes were included during data collection. For example, for the WISDM dataset, sensors with different orientations provoked a significant reduction in the classification accuracy of the state-of-the-art system (from 91.57 ± 0.23% to 89.19 ± 0.26%). This reduction was recovered with the proposed algorithm, increasing the accuracy to 91.46 ± 0.30%, i.e., the same result obtained when all sensors had the same orientation.


Subject(s)
Algorithms , Human Activities , Humans , Acceleration , Movement , Posture
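One simplified stand-in for the consistent-reference-system step: assume the only stable direction is gravity, estimate it from the slow (mean) component of the accelerometer signal, and rotate every sample so gravity lands on the z-axis. This is a hedged sketch, not the paper's actual estimation procedure:

```python
import numpy as np

def rotation_to_axis(v, target=np.array([0.0, 0.0, 1.0])):
    """Rotation matrix (Rodrigues' formula) taking the direction of v onto target."""
    v = v / np.linalg.norm(v)
    axis = np.cross(v, target)
    s = np.linalg.norm(axis)          # sin of the angle between v and target
    c = float(np.dot(v, target))      # cos of the angle
    if s < 1e-12:                     # already aligned, or exactly opposite
        return np.eye(3) if c > 0 else np.diag([1.0, -1.0, -1.0])
    k = axis / s                      # unit rotation axis
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + s * K + (1.0 - c) * (K @ K)

def reorient(acc):
    """Transform tri-axial accelerations into a gravity-aligned reference system."""
    g_est = acc.mean(axis=0)          # slow component of acceleration ~ gravity
    return acc @ rotation_to_axis(g_est).T
```

After `reorient`, two recordings of the same movement made with differently mounted sensors share (approximately) one axis convention, which is the property the preprocessing module exploits.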
4.
Article in English | MEDLINE | ID: mdl-36078600

ABSTRACT

Parkinson's disease (PD) is an incurable neurodegenerative disorder which affects over 10 million people worldwide. Early detection and correct evaluation of the disease are critical for appropriate medication and for slowing the advance of the symptoms. In this scenario, it is critical to develop clinical decision support systems contributing to an early, efficient, and reliable diagnosis of this illness. In this paper we present a feasibility study of a clinical decision support system for the diagnosis of PD based on the acoustic characteristics of laughter. Our decision support system is based on laugh analysis with speech recognition methods and automatic classification techniques. We evaluated different cepstral coefficients to identify laugh characteristics of healthy and ill subjects, combined with machine learning classification models. The decision support system reached an accuracy of 83% with an AUC value of 0.86 for PD vs. healthy laugh classification in a database of 20,000 samples randomly generated from a pool of 120 laughs from healthy and PD subjects. Laughter could therefore be employed for the efficient and reliable detection of PD, and a clinical decision support system can be built on such speech recognition and automatic classification techniques. Significance: PD clinical decision support systems for the early detection of the disease will help to improve the efficiency of available and upcoming therapeutic treatments which, in turn, would improve the life conditions of affected people and decrease costs and efforts in public and private healthcare systems.


Subject(s)
Decision Support Systems, Clinical , Laughter , Parkinson Disease , Speech Perception , Feasibility Studies , Humans , Parkinson Disease/diagnosis
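The cepstral analysis can be illustrated with a real-cepstrum computation (inverse FFT of the log magnitude spectrum). This is a generic sketch: the window length, hop, and coefficient count are illustrative choices, not the paper's configuration:

```python
import numpy as np

def real_cepstrum(frame, n_coeffs=13):
    """First n_coeffs real-cepstrum coefficients of one audio frame:
    the inverse FFT of the log magnitude spectrum."""
    spectrum = np.abs(np.fft.rfft(frame * np.hamming(len(frame))))
    return np.fft.irfft(np.log(spectrum + 1e-10))[:n_coeffs]

def cepstral_features(signal, frame_len=400, hop=160, n_coeffs=13):
    """Frame the signal and stack per-frame coefficients into a feature matrix."""
    frames = (signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, hop))
    return np.array([real_cepstrum(f, n_coeffs) for f in frames])

# One second of a synthetic 220 Hz tone at 16 kHz, as a stand-in for a laugh
t = np.arange(16000) / 16000.0
feats = cepstral_features(np.sin(2 * np.pi * 220.0 * t))
```

Feature matrices of this shape (frames × coefficients) are what a downstream machine-learning classifier would consume.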
5.
Sensors (Basel) ; 21(21)2021 Oct 26.
Article in English | MEDLINE | ID: mdl-34770417

ABSTRACT

The Y Balance Test (YBT) is a dynamic balance assessment typically used in sports medicine. This work proposes a deep learning approach to automatically score the YBT by estimating the normalized reach distance (NRD) using a wearable sensor that registers inertial signals during the movement. This paper evaluates several signal processing techniques for extracting relevant information to feed the deep neural network. This evaluation was performed using a state-of-the-art human activity recognition system based on recurrent neural networks (RNNs). The deep neural network includes long short-term memory (LSTM) layers to learn features from time series by modeling temporal patterns, and an additional fully connected layer to estimate the NRD (normalized by leg length). All analyses were carried out using a dataset with YBT assessments from 407 subjects, including young and middle-aged volunteers and athletes from different sports. This dataset allowed the development of a global and robust solution for scoring the YBT in a wide range of applications. The experimental setup considered a 10-fold subject-wise cross-validation using training, validation, and testing subsets. The mean absolute percentage error (MAPE) obtained was 7.88 ± 0.20%. Moreover, this work proposes specific regression systems to estimate the NRD for each direction separately, obtaining an average MAPE of 7.33 ± 0.26%. This deep learning approach was compared to previous work using dynamic time warping and k-NN algorithms, obtaining a relative MAPE reduction of 10%.


Subject(s)
Deep Learning , Algorithms , Humans , Middle Aged , Movement , Neural Networks, Computer , Signal Processing, Computer-Assisted
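The reported scores are MAPE values over the estimated normalized reach distances. A minimal sketch of the metric on hypothetical NRD values (the numbers below are illustrative, not from the dataset):

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, the metric used to score NRD estimates."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

# Hypothetical normalized reach distances (fractions of leg length)
truth = np.array([0.95, 1.02, 0.88, 0.99])
estimate = np.array([0.90, 1.05, 0.85, 1.01])
error = mape(truth, estimate)
```

Because NRD is normalized by leg length, a percentage error is comparable across subjects of different sizes, which is presumably why MAPE is the headline figure.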
6.
Sensors (Basel) ; 20(20)2020 Oct 14.
Article in English | MEDLINE | ID: mdl-33066691

ABSTRACT

Continuous in-home monitoring of Parkinson's disease (PD) symptoms might allow improvements in the assessment of disease progression and treatment effects. As a first step towards this goal, we evaluate the feasibility of a wrist-worn wearable accelerometer system to detect PD tremor in the wild (uncontrolled scenarios). We evaluate the performance of several feature sets and classification algorithms for robust PD tremor detection in laboratory and wild settings. We report results for both laboratory data with accurate labels and wild data with weak labels. The best performance was obtained using a combination of a preprocessing module to extract information from the tremor spectrum (based on non-negative factorization) and a deep neural network for learning relevant features and detecting tremor segments. We show how the proposed method is able to predict patient self-report measures, and we propose a new metric for monitoring PD tremor (the percentage of time with tremor over long periods), which may be easier to estimate than the start and end time points of each tremor event while still providing clinically useful information.


Subject(s)
Accelerometry/instrumentation , Neural Networks, Computer , Parkinson Disease , Tremor , Wearable Electronic Devices , Deep Learning , Humans , Parkinson Disease/diagnosis , Tremor/diagnosis
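The proposed long-term metric, percentage of tremor over a monitoring period, can be sketched from per-window detector outputs. The window labels below are hypothetical:

```python
import numpy as np

def tremor_percentage(window_preds):
    """Percentage of analysis windows classified as tremor: a long-term
    monitoring figure that sidesteps localizing the exact start and end
    time points of each tremor event."""
    return 100.0 * np.asarray(window_preds).mean()

# Hypothetical binary detector output, one label per analysis window
preds = [0, 0, 1, 1, 1, 0, 1, 0, 0, 0]
pct = tremor_percentage(preds)
```

Aggregating over many windows makes the figure robust to individual misclassified windows, which matters for weakly labeled in-the-wild data.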
7.
Comput Biol Med ; 109: 148-158, 2019 06.
Article in English | MEDLINE | ID: mdl-31055181

ABSTRACT

This paper describes the analysis of a deep neural network for the classification of epileptic EEG signals. The deep learning architecture is made up of two convolutional layers for feature extraction and three fully-connected layers for classification. We evaluated several EEG signal transforms for generating the inputs to the deep neural network: Fourier, wavelet and empirical mode decomposition. This analysis was carried out using two public datasets (Bern-Barcelona EEG and Epileptic Seizure Recognition datasets) obtaining significant improvements in accuracy. For the Bern-Barcelona EEG, we obtained an increase in accuracy from 92.3% to 98.9% when classifying between focal and non-focal signals using the empirical mode decomposition. For the Epileptic Seizure Recognition dataset, we evaluated several scenarios for seizure detection obtaining the best results when using the Fourier transform. The accuracy increased from 99.0% to 99.5% for classifying non-seizure vs. seizure recordings, from 91.7% to 96.5% when differentiating between healthy, non-focal and seizure recordings, and from 89.0% to 95.7% when considering healthy, focal and seizure recordings.


Subject(s)
Deep Learning , Electroencephalography , Epilepsy/physiopathology , Signal Processing, Computer-Assisted , Humans
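The Fourier-transform input representation can be sketched as follows. The bin count, segment length, and sampling rate are illustrative assumptions, and the convolutional network itself is omitted:

```python
import numpy as np

def fourier_input(segment, n_bins=64):
    """Log-magnitude spectrum of one EEG segment, truncated to the lowest
    n_bins frequency bins, used as the input vector for a classifier."""
    spectrum = np.abs(np.fft.rfft(segment - segment.mean()))
    return np.log1p(spectrum[:n_bins])

# Illustrative single-channel segment: synthetic 10 Hz rhythm plus noise,
# 178 samples at 178 Hz so each FFT bin spans 1 Hz
rng = np.random.default_rng(42)
fs = 178.0
t = np.arange(int(fs)) / fs
segment = np.sin(2 * np.pi * 10.0 * t) + 0.3 * rng.normal(size=t.size)
features = fourier_input(segment)
```

The same pipeline applies to the wavelet or empirical-mode-decomposition variants mentioned above; only the transform feeding the network changes.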