Results 1 - 5 of 5
1.
Sensors (Basel) ; 23(23)2023 Nov 23.
Article in English | MEDLINE | ID: mdl-38067738

ABSTRACT

This paper proposes, analyzes, and evaluates a deep learning architecture based on transformers for generating sign language motion from sign phonemes (represented using HamNoSys, a notation system developed at the University of Hamburg). The sign phonemes provide information about sign characteristics such as hand configuration, localization, or movement. The use of sign phonemes is crucial for generating sign motion with a high level of detail (including finger extensions and flexions). The transformer-based approach also includes a stop detection module for predicting the end of the generation process. Both aspects, motion generation and stop detection, are evaluated in detail. For motion generation, the dynamic time warping (DTW) distance is used to compute the similarity between two landmark sequences (ground truth and generated). The stop detection module is evaluated considering detection accuracy and ROC (receiver operating characteristic) curves. The paper proposes and evaluates several strategies to obtain the system configuration with the best performance, including different padding strategies, interpolation approaches, and data augmentation techniques. The best configuration of a fully automatic system obtains an average DTW distance per frame of 0.1057 and an area under the ROC curve (AUC) higher than 0.94.


Subject(s)
Algorithms , Sign Language , Humans , Motion , Movement , Hand
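The evaluation metric above, an average DTW distance per frame between ground-truth and generated landmark sequences, can be sketched in plain NumPy. Normalizing by the length of the longer sequence is an assumption here; the abstract does not specify the exact per-frame normalization.

```python
import numpy as np

def dtw_distance_per_frame(seq_a, seq_b):
    """Average DTW alignment cost per frame between two landmark
    sequences, each of shape (frames, landmarks * coords)."""
    n, m = len(seq_a), len(seq_b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Euclidean distance between the two frames' landmark vectors
            d = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],
                                 cost[i, j - 1],
                                 cost[i - 1, j - 1])
    # Per-frame normalization: divide by the longer sequence length
    # (one plausible choice, not confirmed by the abstract).
    return cost[n, m] / max(n, m)
```

Identical sequences yield a distance of 0; larger values indicate greater dissimilarity between the generated and ground-truth motion.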
2.
J Imaging ; 9(12)2023 Nov 27.
Article in English | MEDLINE | ID: mdl-38132680

ABSTRACT

Several sign language datasets are available in the literature, most of them designed for sign language recognition and translation. This paper presents a new sign language dataset for automatic motion generation. This dataset includes the phonemes for each sign (specified in HamNoSys, a transcription system developed at the University of Hamburg, Germany) and the corresponding motion information. The motion information includes sign videos and the sequences of extracted landmarks associated with relevant points of the skeleton (including face, arms, hands, and fingers). The dataset includes signs from three different subjects in three different positions, performing 754 signs that cover the entire alphabet, numbers from 0 to 100, numbers for hour specification, months, weekdays, and the most frequent signs used in Spanish Sign Language (LSE). In total, there are 6786 videos and their corresponding phonemes (HamNoSys annotations). From each video, a sequence of landmarks was extracted using MediaPipe. The dataset allows training an automatic system for motion generation from sign language phonemes. This paper also presents preliminary results in motion generation from sign phonemes, obtaining a dynamic time warping (DTW) distance per frame of 0.37.
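A landmark extractor such as MediaPipe returns, for each video frame, lists of (x, y, z) points for the face, pose, and hands. A minimal sketch of assembling those per-frame lists into the (frames, features) sequence representation used for training might look as follows; the function names are illustrative, not from the paper.

```python
import numpy as np

def landmarks_to_vector(frame_landmarks):
    """Flatten one frame's landmarks (a list of (x, y, z) tuples, as
    produced by a pose/hand estimator such as MediaPipe) into a
    single feature vector."""
    return np.asarray(frame_landmarks, dtype=np.float32).reshape(-1)

def video_to_sequence(per_frame_landmarks):
    """Stack per-frame feature vectors into a (frames, features) array,
    assuming every frame reports the same number of landmarks."""
    return np.stack([landmarks_to_vector(f) for f in per_frame_landmarks])
```

In practice, frames with missing detections would need interpolation or masking before stacking, which this sketch omits.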

3.
Sensors (Basel) ; 23(13)2023 Jun 23.
Article in English | MEDLINE | ID: mdl-37447695

ABSTRACT

Sensor orientation is a critical aspect in a Human Activity Recognition (HAR) system based on tri-axial signals (such as accelerations): different sensor orientations introduce important errors into the activity recognition process. This paper proposes a new preprocessing module to reduce the negative impact of sensor-orientation variability in HAR. Firstly, this module estimates a consistent reference system; then, the tri-axial signals recorded from sensors with different orientations are transformed into this consistent reference system. The new preprocessing was evaluated for its ability to mitigate the effect of different sensor orientations on the classification accuracy of several state-of-the-art HAR systems. The experiments were carried out using a subject-wise cross-validation methodology over six different datasets, including movements and postures. The new preprocessing module provided robust HAR performance even when sudden sensor orientation changes occurred during data collection in the six datasets. As an example, for the WISDM dataset, sensors with different orientations caused a significant reduction in the classification accuracy of the state-of-the-art system (from 91.57 ± 0.23% to 89.19 ± 0.26%). This reduction was recovered with the proposed algorithm, increasing the accuracy to 91.46 ± 0.30%, i.e., essentially the same result obtained when all sensors shared the same orientation.


Subject(s)
Algorithms , Human Activities , Humans , Acceleration , Movement , Posture
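The core preprocessing idea, re-expressing tri-axial signals in a consistent reference frame, can be illustrated with a simple gravity-alignment sketch. Estimating gravity as the mean acceleration and rotating it onto the z-axis via Rodrigues' formula is one plausible realization of a consistent reference system, not necessarily the paper's exact method.

```python
import numpy as np

def rotation_to_vertical(acc):
    """Rotation matrix aligning the mean acceleration (a crude gravity
    estimate) with the z-axis, built from Rodrigues' formula."""
    g = acc.mean(axis=0)
    g = g / np.linalg.norm(g)
    z = np.array([0.0, 0.0, 1.0])
    v = np.cross(g, z)          # rotation axis (unnormalized)
    s = np.linalg.norm(v)       # sin of the rotation angle
    c = np.dot(g, z)            # cos of the rotation angle
    if s < 1e-8:                # already aligned, or exactly opposite
        return np.eye(3) if c > 0 else np.diag([1.0, -1.0, -1.0])
    vx = np.array([[0.0, -v[2], v[1]],
                   [v[2], 0.0, -v[0]],
                   [-v[1], v[0], 0.0]])
    return np.eye(3) + vx + vx @ vx * ((1.0 - c) / s**2)

def normalize_orientation(acc):
    """Re-express tri-axial samples of shape (n, 3) in the
    gravity-aligned reference frame."""
    R = rotation_to_vertical(acc)
    return acc @ R.T
```

After this transform, the gravity component lies along z regardless of how the sensor was worn; the remaining horizontal ambiguity (rotation about gravity) would need an additional cue, e.g. the dominant direction of motion.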
4.
Sensors (Basel) ; 21(21)2021 Oct 26.
Article in English | MEDLINE | ID: mdl-34770417

ABSTRACT

The Y Balance Test (YBT) is a dynamic balance assessment typically used in sports medicine. This work proposes a deep learning approach to automatically score the YBT by estimating the normalized reach distance (NRD) from inertial signals registered with a wearable sensor during the movement. The paper evaluates several signal processing techniques for extracting relevant information to feed the deep neural network. This evaluation was performed using a state-of-the-art human activity recognition system based on recurrent neural networks (RNNs). The deep neural network includes long short-term memory (LSTM) layers, which learn features from time series by modeling temporal patterns, and an additional fully connected layer to estimate the NRD (normalized by the leg length). All analyses were carried out using a dataset with YBT assessments from 407 subjects, including young and middle-aged volunteers and athletes from different sports. This dataset allowed the development of a global and robust solution for scoring the YBT in a wide range of applications. The experimental setup considered a 10-fold subject-wise cross-validation using training, validation, and testing subsets. The mean absolute percentage error (MAPE) obtained was 7.88 ± 0.20%. Moreover, this work proposes specific regression systems to estimate the NRD for each direction separately, obtaining an average MAPE of 7.33 ± 0.26%. This deep learning approach was compared to a previous work based on dynamic time warping and k-NN algorithms, obtaining a relative MAPE reduction of 10%.


Subject(s)
Deep Learning , Algorithms , Humans , Middle Aged , Movement , Neural Networks, Computer , Signal Processing, Computer-Assisted
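The MAPE metric used above to evaluate the NRD regression is straightforward; a minimal sketch (the reduction over directions and folds reported in the abstract is not included):

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error between reference and estimated
    values (e.g. measured vs. predicted normalized reach distances)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))
```

Note that MAPE is undefined for zero reference values, which is not an issue for NRD scores since reach distances are strictly positive.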
5.
Comput Biol Med ; 109: 148-158, 2019 06.
Article in English | MEDLINE | ID: mdl-31055181

ABSTRACT

This paper describes the analysis of a deep neural network for the classification of epileptic EEG signals. The deep learning architecture is made up of two convolutional layers for feature extraction and three fully connected layers for classification. We evaluated several EEG signal transforms for generating the inputs to the deep neural network: Fourier, wavelet, and empirical mode decomposition. This analysis was carried out using two public datasets (the Bern-Barcelona EEG and Epileptic Seizure Recognition datasets), obtaining significant improvements in accuracy. For the Bern-Barcelona EEG dataset, accuracy increased from 92.3% to 98.9% when classifying between focal and non-focal signals using the empirical mode decomposition. For the Epileptic Seizure Recognition dataset, we evaluated several seizure detection scenarios, obtaining the best results with the Fourier transform. The accuracy increased from 99.0% to 99.5% when classifying non-seizure vs. seizure recordings, from 91.7% to 96.5% when differentiating between healthy, non-focal, and seizure recordings, and from 89.0% to 95.7% when considering healthy, focal, and seizure recordings.


Subject(s)
Deep Learning , Electroencephalography , Epilepsy/physiopathology , Signal Processing, Computer-Assisted , Humans
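As an illustration of the Fourier input representation mentioned above, the magnitude spectrum of a single-channel EEG segment can be computed with NumPy's real FFT; feeding such spectra (rather than raw samples) to the convolutional layers is the general idea, though the exact preprocessing in the paper may differ.

```python
import numpy as np

def fourier_features(eeg, fs):
    """Magnitude spectrum of a 1-D EEG segment, one candidate input
    representation for the convolutional network.

    eeg -- 1-D array of samples; fs -- sampling rate in Hz
    (dataset-dependent; not specified in the abstract)."""
    spectrum = np.abs(np.fft.rfft(eeg))
    freqs = np.fft.rfftfreq(len(eeg), d=1.0 / fs)
    return freqs, spectrum
```

The resulting (frequency, magnitude) pairs can be binned or truncated to the clinically relevant band before being stacked into the network's input tensor.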