Results 1 - 20 of 18,112
1.
Sci Rep ; 14(1): 11744, 2024 05 23.
Article in English | MEDLINE | ID: mdl-38778042

ABSTRACT

Sensorimotor impairments, resulting from conditions like stroke and amputations, can profoundly impact an individual's functional abilities and overall quality of life. Assistive and rehabilitation devices such as prostheses, exoskeletons, and serious gaming in virtual environments can help restore some degree of function and alleviate pain after sensorimotor impairments. Myoelectric pattern recognition (MPR) has gained popularity in the past decades as it provides superior control over said devices, and therefore efforts to facilitate and improve performance in MPR can result in better rehabilitation outcomes. One possibility to enhance MPR is to employ transcranial direct current stimulation (tDCS) to facilitate motor learning. Twelve healthy able-bodied individuals participated in this crossover study to determine the effect of tDCS on MPR performance. Baseline training was followed by two sessions of either sham or anodal tDCS using the dominant and non-dominant arms. Assignments were randomized, and the MPR task consisted of 11 different hand/wrist movements, including rest or no movement. Surface electrodes were used to record EMG, and the open-source MPR platform BioPatRec was used to decode motor volition in real time. The motion test was used to evaluate performance. We hypothesized that using anodal tDCS to increase the excitability of the primary motor cortex associated with the non-dominant side in able-bodied individuals would improve motor learning and thus MPR performance. Overall, we found that tDCS enhanced MPR performance, particularly on the non-dominant side. We were able to reject the null hypothesis, and improvements in the motion test's completion rate during tDCS (28% change, p = 0.023) indicate its potential as an adjunctive tool to enhance MPR and motor learning. tDCS appears promising as a tool to enhance the learning phase of using MPR-based assistive devices, such as myoelectric prostheses.
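
For illustration, a minimal sketch of the kind of paired analysis a motion test supports in a sham-versus-anodal crossover design; the completion-rate values and the choice of a Wilcoxon signed-rank test are assumptions, not the study's actual data or analysis.

```python
# Hypothetical sketch: paired comparison of motion-test completion rates
# in a sham-vs-anodal crossover design. The values below are made up;
# the study's actual data and statistical test may differ.
import numpy as np
from scipy import stats

# Completion rate per participant, one value per condition (n = 12,
# matching the study's sample size).
sham   = np.array([0.55, 0.61, 0.48, 0.70, 0.52, 0.66,
                   0.59, 0.45, 0.63, 0.58, 0.50, 0.67])
anodal = np.array([0.72, 0.69, 0.60, 0.81, 0.65, 0.74,
                   0.70, 0.58, 0.77, 0.66, 0.62, 0.79])

# Paired non-parametric test, appropriate for small-n crossover data.
stat, p = stats.wilcoxon(anodal, sham)
change = (anodal.mean() - sham.mean()) / sham.mean() * 100
print(f"mean change: {change:.1f}%  p = {p:.3f}")
```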


Subject(s)
Electromyography , Transcranial Direct Current Stimulation , Humans , Transcranial Direct Current Stimulation/methods , Male , Female , Adult , Electromyography/methods , Young Adult , Cross-Over Studies , Motor Cortex/physiology , Pattern Recognition, Automated/methods
2.
Sci Adv ; 10(21): eadl2882, 2024 May 24.
Article in English | MEDLINE | ID: mdl-38781346

ABSTRACT

Neuromorphic sensors, designed to emulate natural sensory systems, hold the promise of revolutionizing data extraction by facilitating rapid and energy-efficient analysis of extensive datasets. However, a challenge lies in accurately distinguishing specific analytes within mixtures of chemically similar compounds using existing neuromorphic chemical sensors. In this study, we present an artificial olfactory system (AOS), developed through the integration of human olfactory receptors (hORs) and artificial synapses. This AOS is engineered by interfacing an hOR-functionalized extended gate with an organic synaptic device. The AOS generates distinct patterns for odorants and mixtures thereof, at the molecular chain length level, attributed to specific hOR-odorant binding affinities. This approach enables precise pattern recognition via training and inference simulations. These findings establish a foundation for the development of high-performance sensor platforms and artificial sensory systems, which are ideal for applications in wearable and implantable devices.


Subject(s)
Odorants , Receptors, Odorant , Humans , Receptors, Odorant/metabolism , Odorants/analysis , Smell/physiology , Synapses/metabolism , Pattern Recognition, Automated/methods , Olfactory Receptor Neurons/metabolism , Olfactory Receptor Neurons/physiology , Biosensing Techniques/methods
3.
PLoS One ; 19(5): e0298373, 2024.
Article in English | MEDLINE | ID: mdl-38691542

ABSTRACT

Pulse repetition interval modulation (PRIM) is integral to radar identification in modern electronic support measure (ESM) and electronic intelligence (ELINT) systems. Various distortions, including missing pulses, spurious pulses, unintended jitters, and noise from radar antenna scans, often hinder the accurate recognition of PRIM. This research introduces a novel three-stage approach for PRIM recognition, emphasizing the innovative use of PRI sound. A transfer learning-aided deep convolutional neural network (DCNN) is initially used for feature extraction. This is followed by an extreme learning machine (ELM) for real-time PRIM classification. Finally, a gray wolf optimizer (GWO) refines the network's robustness. To evaluate the proposed method, we developed a real experimental dataset consisting of the sounds of six common PRI patterns. We utilized eight pre-trained DCNN architectures for evaluation, with VGG16 and ResNet50V2 notably achieving recognition accuracies of 97.53% and 96.92%. Integrating ELM and GWO further optimized the accuracy rates to 98.80% and 97.58%, respectively. This research advances radar identification by offering an enhanced method for PRIM recognition, emphasizing the potential of PRI sound to address real-world distortions in ESM and ELINT systems.
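
A minimal sketch of the second-stage idea, assuming features have already been extracted by a pre-trained DCNN: an extreme learning machine with random, untrained hidden weights and closed-form output weights. The GWO refinement step and the real feature extraction are omitted; shapes are illustrative.

```python
# Sketch of a transfer-learning + ELM pipeline: a pre-trained DCNN yields
# feature vectors (faked here with random data), and an ELM classifies them.
import numpy as np

rng = np.random.default_rng(0)
X_train = rng.normal(size=(600, 512))    # stand-in for DCNN features
y_train = rng.integers(0, 6, size=600)   # six PRI pattern classes

n_hidden = 200
W = rng.normal(size=(512, n_hidden))     # random input weights (never trained)
b = rng.normal(size=n_hidden)

def elm_hidden(X):
    return np.tanh(X @ W + b)

H = elm_hidden(X_train)
T = np.eye(6)[y_train]                   # one-hot targets
beta, *_ = np.linalg.lstsq(H, T, rcond=None)  # closed-form output weights

def predict(X):
    return elm_hidden(X) @ beta

pred = predict(X_train).argmax(axis=1)
print("train accuracy:", (pred == y_train).mean())
```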


Subject(s)
Deep Learning , Neural Networks, Computer , Sound , Radar , Algorithms , Pattern Recognition, Automated/methods
4.
J Neural Eng ; 21(3)2024 May 17.
Article in English | MEDLINE | ID: mdl-38722304

ABSTRACT

Discrete myoelectric control-based gesture recognition has recently gained interest as a possible input modality for many emerging ubiquitous computing applications. Unlike the continuous control commonly employed in powered prostheses, discrete systems seek to recognize the dynamic sequences associated with gestures to generate event-based inputs. More akin to those used in general-purpose human-computer interaction, these could include, for example, a flick of the wrist to dismiss a phone call or a double tap of the index finger and thumb to silence an alarm. Myoelectric control systems have been shown to achieve near-perfect classification accuracy, but only in highly constrained offline settings. Real-world, online systems are subject to 'confounding factors' (i.e. factors that hinder the real-world robustness of myoelectric control that are not accounted for during typical offline analyses), which inevitably degrade system performance, limiting their practical use. Although these factors have been widely studied in continuous prosthesis control, there has been little exploration of their impacts on discrete myoelectric control systems for emerging applications and use cases. Correspondingly, this work examines, for the first time, three confounding factors and their effect on the robustness of discrete myoelectric control: (1) limb position variability, (2) cross-day use, and (3) gesture elicitation speed, a newly identified confound faced by discrete systems. Results from four different discrete myoelectric control architectures (majority-vote LDA, dynamic time warping, an LSTM network trained with cross-entropy loss, and an LSTM network trained with contrastive learning) show that classification accuracy is significantly degraded (p<0.05) by each of these confounds. This work establishes that confounding factors are a critical barrier that must be addressed to enable the real-world adoption of discrete myoelectric control for robust and reliable gesture recognition.
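
As a sketch of the first architecture, a majority-vote LDA pipeline under the assumption that windowed EMG features are classified frame by frame and the gesture label is decided by a vote over the sequence; the data here are random placeholders.

```python
# Sketch of majority-vote LDA for discrete gesture recognition: classify
# each EMG window, then take the majority label over the gesture sequence.
import numpy as np
from collections import Counter
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(1)
X_train = rng.normal(size=(1000, 32))    # per-window EMG feature vectors
y_train = rng.integers(0, 5, size=1000)  # five gesture classes

lda = LinearDiscriminantAnalysis().fit(X_train, y_train)

def classify_gesture(windows):
    """Classify each window, then majority-vote the gesture label."""
    frame_labels = lda.predict(windows)
    return Counter(frame_labels).most_common(1)[0][0]

gesture_windows = rng.normal(size=(25, 32))  # one gesture = 25 windows
print("predicted gesture:", classify_gesture(gesture_windows))
```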


Subject(s)
Electromyography , Gestures , Pattern Recognition, Automated , Humans , Electromyography/methods , Male , Pattern Recognition, Automated/methods , Female , Adult , Young Adult , Artificial Limbs
5.
J Neural Eng ; 21(3)2024 May 17.
Article in English | MEDLINE | ID: mdl-38757187

ABSTRACT

Objective. For brain-computer interface (BCI) research, it is crucial to design an MI-EEG recognition model that possesses high classification accuracy and strong generalization ability and does not rely on a large number of labeled training samples. Approach. In this paper, we propose a self-supervised MI-EEG recognition method based on one-dimensional multi-task convolutional neural networks and long short-term memory (1-D MTCNN-LSTM). The model is divided into two stages: a signal-transform identification stage and a pattern recognition stage. In the signal-transform identification stage, the signal-transform dataset is recognized by the upstream 1-D MTCNN-LSTM network. The backbone network from this stage is then transferred to the pattern recognition stage, where it is fine-tuned using a small amount of labeled data to obtain the final motion recognition model. Main results. The upstream stage achieves more than 95% recognition accuracy for EEG signal transforms, reaching up to 100%. For MI-EEG pattern recognition, the model obtained recognition accuracies of 82.04% and 87.14%, with F1 scores of 0.7856 and 0.839, on the BCIC-IV-2b and BCIC-IV-2a datasets. Significance. The improved accuracy demonstrates the superiority of the proposed method, which is a promising approach for accurate MI-EEG classification in BCI systems.
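
A sketch of how the self-supervised pretext task could be set up: each unlabeled EEG window is tagged with the transform applied to it, giving the upstream network a supervised task without motor-imagery labels. The transform set below is an assumption; the paper's exact transforms may differ.

```python
# Sketch of a signal-transform pretext dataset for self-supervised EEG
# pre-training: the label is which transform was applied, not the MI class.
import numpy as np

rng = np.random.default_rng(2)

def make_pretext_dataset(eeg_windows):
    transforms = [
        lambda x: x,                                # 0: identity
        lambda x: x[::-1].copy(),                   # 1: time reversal
        lambda x: -x,                               # 2: sign flip
        lambda x: x + rng.normal(0, 0.1, x.shape),  # 3: additive noise
        lambda x: np.roll(x, x.shape[0] // 4),      # 4: circular time shift
    ]
    X, y = [], []
    for w in eeg_windows:
        for label, t in enumerate(transforms):
            X.append(t(w))
            y.append(label)
    return np.stack(X), np.array(y)

windows = rng.normal(size=(100, 250))  # 100 single-channel 1 s windows @ 250 Hz
X, y = make_pretext_dataset(windows)
print(X.shape, y.shape)                # (500, 250) (500,)
```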


Subject(s)
Brain-Computer Interfaces , Electroencephalography , Imagination , Neural Networks, Computer , Electroencephalography/methods , Humans , Imagination/physiology , Supervised Machine Learning , Pattern Recognition, Automated/methods
6.
PLoS One ; 19(5): e0302590, 2024.
Article in English | MEDLINE | ID: mdl-38758731

ABSTRACT

Automatic Urdu handwritten text recognition is a challenging task in the OCR industry. Unlike printed text, Urdu handwriting lacks a uniform font and structure. This lack of uniformity causes data inconsistencies and recognition issues. Different writing styles, cursive scripts, and limited data make Urdu text recognition a complicated task. Major languages, such as English, have experienced advances in automated recognition, whereas low-resource languages, such as Urdu, still lag. Transformer-based models are promising for automated recognition in both high- and low-resource languages. This paper presents a transformer-based method called ET-Network that integrates self-attention into EfficientNet for feature extraction and uses a transformer for language modeling. The self-attention layers in EfficientNet help extract global and local features that capture long-range dependencies. These features are then passed to a vanilla transformer to generate text, and prefix beam search is used to select the best output. NUST-UHWR, UPTI2.0, and MMU-OCR-21 are the three datasets used to train and test the ET-Network on handwritten Urdu script. The ET-Network improved the character error rate by 4% and the word error rate by 1.55%, establishing a new state-of-the-art character error rate of 5.27% and word error rate of 19.09% for Urdu handwritten text.
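
For reference, the character and word error rates reported here are edit-distance ratios; a self-contained sketch:

```python
# Character error rate (CER) and word error rate (WER) as Levenshtein
# distance divided by reference length; WER is the same computation
# over word tokens instead of characters.
def edit_distance(ref, hyp):
    """Classic dynamic-programming Levenshtein distance."""
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,        # deletion
                                     dp[j - 1] + 1,    # insertion
                                     prev + (r != h))  # substitution
    return dp[-1]

def cer(ref, hyp):
    return edit_distance(ref, hyp) / max(len(ref), 1)

def wer(ref, hyp):
    return edit_distance(ref.split(), hyp.split()) / max(len(ref.split()), 1)

print(cer("recognition", "recogniton"))  # one deleted char -> ~0.09
```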


Subject(s)
Deep Learning , Handwriting , Humans , Language , Pattern Recognition, Automated/methods , Algorithms
7.
PLoS One ; 19(5): e0301862, 2024.
Article in English | MEDLINE | ID: mdl-38753628

ABSTRACT

Recognition of the key text of Chinese seals can speed up document approval and improve the office efficiency of enterprises and government administrative departments. Due to image blurring and occlusion, the accuracy of Chinese seal recognition is low, and real datasets are very limited. To solve these problems, we improve the differentiable binarization detection algorithm (DBnet) to construct a text region detection model, DB-ECA, and propose a text recognition model named LSTR (Lightweight Seal Text Recognition). An efficient channel attention module is added to the differentiable binarization network to resolve the feature pyramid conflict, and the convolutional layer structure is improved to delay downsampling and reduce semantic feature loss. LSTR uses a lightweight CNN better suited to small-sample generalization and dynamically fuses positional and visual information through a self-attention-based inference layer to predict the label distributions of feature sequences in parallel. The inference layer not only compensates for the weak discriminative power of shallow CNN layers but also helps CTC (Connectionist Temporal Classification) accurately align feature regions with target characters. In experiments on the homemade dataset presented in this paper, DB-ECA achieved the best precision, recall, and F-measure (90.29%, 85.17%, and 87.65%, respectively) compared with five other commonly used detection models, and LSTR achieved the highest accuracy, 91.29%, compared with five recognition models from the last three years, with the additional advantages of a small number of parameters and fast inference. The experimental results fully demonstrate the innovation and effectiveness of our models.
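
A minimal sketch of an efficient channel attention (ECA) block of the kind added to the detection network: global average pooling followed by a 1-D convolution across the channel dimension. This is the generic ECA design, not the paper's exact implementation.

```python
# Generic ECA block in PyTorch: per-channel gates from a local 1-D conv
# over pooled channel descriptors (no dimensionality-reducing FC layers).
import torch
import torch.nn as nn

class ECA(nn.Module):
    def __init__(self, kernel_size: int = 3):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size,
                              padding=kernel_size // 2, bias=False)

    def forward(self, x):                    # x: (N, C, H, W)
        w = self.pool(x)                     # (N, C, 1, 1)
        w = w.squeeze(-1).transpose(1, 2)    # (N, 1, C): channels as a sequence
        w = self.conv(w)                     # local cross-channel interaction
        w = torch.sigmoid(w.transpose(1, 2).unsqueeze(-1))
        return x * w                         # re-weight channels

x = torch.randn(2, 64, 32, 32)
print(ECA()(x).shape)                        # torch.Size([2, 64, 32, 32])
```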


Subject(s)
Algorithms , Neural Networks, Computer , Pattern Recognition, Automated/methods
8.
Sensors (Basel) ; 24(9)2024 Apr 24.
Article in English | MEDLINE | ID: mdl-38732808

ABSTRACT

Currently, surface EMG signals have a wide range of applications in human-computer interaction systems. However, selecting features for gesture recognition models based on traditional machine learning can be challenging and may not yield satisfactory results. Considering the strong nonlinear generalization ability of neural networks, this paper proposes a two-stream residual network model with an attention mechanism for gesture recognition. One branch processes surface EMG signals, while the other processes hand acceleration signals. The separate branches are utilized to fully extract the physiological and kinematic features of the hand. To enhance the model's capacity to learn crucial information, we introduce an attention mechanism after global average pooling. This mechanism strengthens relevant features and weakens irrelevant ones. Finally, the deep features obtained from the two branches are fused to further improve the accuracy of multi-gesture recognition. The experiments conducted on the NinaPro DB2 public dataset resulted in a recognition accuracy of 88.25% for 49 gestures. This demonstrates that our network model can effectively capture gesture features, enhancing accuracy and robustness across various gestures. This approach to multi-source information fusion is expected to provide more accurate and real-time commands for exoskeleton robots and myoelectric prosthetic control systems, thereby enhancing the user experience and the naturalness of robot operation.
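
A sketch of the "attention after global average pooling" step, written here as a squeeze-and-excitation style gate; the exact mechanism used in the paper is an assumption.

```python
# Channel attention after global average pooling (SE-style), applied to one
# branch's feature maps; a sketch, not the paper's exact module.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool1d(1)          # global average pooling
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                            # per-channel gates in (0, 1)
        )

    def forward(self, x):            # x: (N, C, T) features from one branch
        w = self.pool(x).squeeze(-1) # (N, C)
        w = self.fc(w).unsqueeze(-1) # (N, C, 1)
        return x * w                 # strengthen relevant channels, weaken others

emg_feat = torch.randn(4, 128, 50)   # (batch, channels, time)
print(ChannelAttention(128)(emg_feat).shape)
```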


Subject(s)
Electromyography , Gestures , Neural Networks, Computer , Humans , Electromyography/methods , Signal Processing, Computer-Assisted , Pattern Recognition, Automated/methods , Acceleration , Algorithms , Hand/physiology , Machine Learning , Biomechanical Phenomena/physiology
9.
Sensors (Basel) ; 24(9)2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38732843

ABSTRACT

As the number of electronic gadgets in our daily lives increases, and most of them require some kind of human interaction, innovative and convenient input methods are in demand. State-of-the-art (SotA) ultrasound-based hand gesture recognition (HGR) systems have limitations in terms of robustness and accuracy. This research presents a novel machine learning (ML)-based end-to-end solution for hand gesture recognition with low-cost micro-electromechanical system (MEMS) ultrasonic transducers. In contrast to prior methods, our ML model processes the raw echo samples directly instead of using pre-processed data. Consequently, the processing flow presented in this work leaves it to the ML model to extract the important information from the echo data. The success of this approach is demonstrated as follows. Four MEMS ultrasonic transducers are placed in three different geometrical arrangements. For each arrangement, different types of ML models are optimized and benchmarked on datasets acquired with the presented custom hardware (HW): convolutional neural networks (CNNs), gated recurrent units (GRUs), long short-term memory (LSTM), vision transformer (ViT), and cross-attention multi-scale vision transformer (CrossViT). The last three models reached more than 88% accuracy. The most important innovation described in this research paper is the demonstration that little pre-processing is necessary to obtain high accuracy in ultrasonic HGR for several arrangements of cost-effective and low-power MEMS ultrasonic transducer arrays; even the computationally intensive Fourier transform can be omitted. The presented approach is further compared to HGR systems using other sensor types such as vision, WiFi, radar, and state-of-the-art ultrasound-based HGR systems. Direct processing of the sensor signals by a compact model makes ultrasonic hand gesture recognition a true low-cost and power-efficient input method.
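
A minimal sketch of the end-to-end idea, assuming raw echo samples from the four transducers are fed to a small recurrent classifier with no hand-crafted pre-processing; dimensions and hyperparameters are illustrative, not the paper's.

```python
# Raw echo in, gesture logits out: a small GRU consuming unprocessed echo
# samples from four ultrasonic transducers.
import torch
import torch.nn as nn

class RawEchoGRU(nn.Module):
    def __init__(self, n_transducers=4, hidden=64, n_gestures=8):
        super().__init__()
        self.gru = nn.GRU(input_size=n_transducers, hidden_size=hidden,
                          num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, n_gestures)

    def forward(self, echo):          # echo: (batch, samples, transducers)
        _, h = self.gru(echo)         # h: (layers, batch, hidden)
        return self.head(h[-1])       # logits from the last layer's state

model = RawEchoGRU()
echo = torch.randn(2, 2000, 4)        # two raw echo recordings
print(model(echo).shape)              # torch.Size([2, 8])
```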


Subject(s)
Gestures , Hand , Machine Learning , Neural Networks, Computer , Humans , Hand/physiology , Pattern Recognition, Automated/methods , Ultrasonography/methods , Ultrasonography/instrumentation , Ultrasonics/instrumentation , Algorithms
10.
Sensors (Basel) ; 24(9)2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38732846

ABSTRACT

Brain-computer interfaces (BCIs) allow information to be transmitted directly from the human brain to a computer, enhancing the ability of human brain activity to interact with the environment. In particular, BCI-based control systems are highly desirable because they can control equipment used by people with disabilities, such as wheelchairs and prosthetic legs. BCIs make use of electroencephalograms (EEGs) to decode the human brain's status. This paper presents an EEG-based facial gesture recognition method based on a self-organizing map (SOM). The proposed method uses the α, β, and θ power bands of the EEG signals as the features of the gesture. The SOM-Hebb classifier is utilized to classify the feature vectors. We utilized the proposed method to develop an online facial gesture recognition system. The facial gestures were defined by combining facial movements that are easy to detect in EEG signals. The recognition accuracy of the system was examined through experiments and ranged from 76.90% to 97.57%, depending on the number of gestures recognized. The lowest accuracy (76.90%) occurred when recognizing seven gestures, though this is still quite accurate compared with other EEG-based recognition systems. The online recognition system was implemented in MATLAB and took 5.7 s to complete the recognition flow.
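
A sketch of the feature extraction step, assuming conventional band edges and Welch's method; the paper's exact band definitions and estimator may differ.

```python
# θ/α/β band powers from a single EEG channel via Welch's method.
import numpy as np
from scipy.signal import welch

BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}  # Hz

def band_powers(eeg, fs=250):
    """Return mean power in each band for a 1-D EEG segment."""
    freqs, psd = welch(eeg, fs=fs, nperseg=fs)  # 1 s windows
    return {name: psd[(freqs >= lo) & (freqs < hi)].mean()
            for name, (lo, hi) in BANDS.items()}

rng = np.random.default_rng(3)
segment = rng.normal(size=2 * 250)              # 2 s of synthetic EEG @ 250 Hz
print(band_powers(segment))
```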


Subject(s)
Brain-Computer Interfaces , Electroencephalography , Gestures , Humans , Electroencephalography/methods , Face/physiology , Algorithms , Pattern Recognition, Automated/methods , Signal Processing, Computer-Assisted , Brain/physiology , Male
11.
Sensors (Basel) ; 24(9)2024 May 05.
Article in English | MEDLINE | ID: mdl-38733038

ABSTRACT

With the continuous advancement of autonomous driving and monitoring technologies, non-intrusive target monitoring and recognition are receiving increasing attention. This paper proposes an ArcFace SE-attention model-agnostic meta-learning approach (AS-MAML) that integrates attention mechanisms into residual networks for pedestrian gait recognition with frequency-modulated continuous-wave (FMCW) millimeter-wave radar. We enhance the feature extraction capability of the base network using channel attention mechanisms and integrate the additive angular margin loss function (ArcFace loss) into the inner loop of MAML to constrain inner-loop optimization and improve radar discrimination. The network is then used to classify small-sample micro-Doppler images obtained from the millimeter-wave radar, which serve as the data source for pose recognition. Experimental tests were conducted on pose estimation and image classification tasks. The results demonstrate significant detection and recognition performance, with an accuracy of 94.5%, accompanied by a 95% confidence interval. Additionally, on the open-source dataset DIAT-µRadHAR, which is specially processed to increase classification difficulty, the network achieves a classification accuracy of 85.9%.
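
For illustration, a generic PyTorch version of the additive angular margin (ArcFace) loss mentioned above; its integration into the MAML inner loop is omitted here.

```python
# ArcFace loss: L2-normalize embeddings and class weights, add a margin m
# to the target-class angle, scale by s, then apply cross-entropy.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ArcFaceLoss(nn.Module):
    def __init__(self, embed_dim, n_classes, s=30.0, m=0.50):
        super().__init__()
        self.W = nn.Parameter(torch.randn(n_classes, embed_dim))
        self.s, self.m = s, m

    def forward(self, emb, target):
        cos = F.linear(F.normalize(emb), F.normalize(self.W))   # cos(theta)
        theta = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))
        target_mask = F.one_hot(target, cos.size(1)).bool()
        logits = torch.where(target_mask, torch.cos(theta + self.m), cos)
        return F.cross_entropy(self.s * logits, target)

loss_fn = ArcFaceLoss(embed_dim=128, n_classes=6)
emb = torch.randn(8, 128)
y = torch.randint(0, 6, (8,))
print(loss_fn(emb, y).item())
```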


Subject(s)
Pedestrians , Radar , Humans , Algorithms , Gait/physiology , Pattern Recognition, Automated/methods , Machine Learning
12.
Sci Rep ; 14(1): 12002, 2024 05 25.
Article in English | MEDLINE | ID: mdl-38796559

ABSTRACT

To address several common problems in finger vein recognition, this study proposes a lightweight finger vein recognition algorithm for small samples. First, a Gabor filter is applied to the images so that the processed images simulate the appearance of finger veins at low temperature, which improves the generalization ability of the model. A lightweight network is then obtained by reducing the number of convolutional and fully connected layers in VGG-19, and the activation function of some convolutional layers is replaced so that the network weights can be updated successfully. A multi-attention mechanism is then introduced into the modified network architecture to improve the extraction of important features. Finally, a transfer learning strategy is used to reduce the training time in the model training phase. The proposed finger vein recognition algorithm performs well in terms of recognition accuracy, robustness, and speed. The experimental results show that the recognition accuracy reaches about 98.45%, outperforming several existing algorithms.
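
A sketch of the Gabor pre-processing step, using OpenCV with illustrative kernel parameters (the paper's values are not given in the abstract):

```python
# Filter a grayscale vein image with a small bank of Gabor kernels at
# several orientations; parameters below are placeholders.
import cv2
import numpy as np

def gabor_augment(image, orientations=(0, 45, 90, 135)):
    """Return one Gabor-filtered copy of the image per orientation."""
    out = []
    for theta_deg in orientations:
        kernel = cv2.getGaborKernel(
            ksize=(21, 21), sigma=4.0, theta=np.deg2rad(theta_deg),
            lambd=10.0, gamma=0.5, psi=0)
        out.append(cv2.filter2D(image, cv2.CV_32F, kernel))
    return out

img = np.random.randint(0, 256, (128, 64), dtype=np.uint8)  # stand-in image
filtered = gabor_augment(img)
print(len(filtered), filtered[0].shape)                     # 4 (128, 64)
```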


Subject(s)
Algorithms , Fingers , Veins , Humans , Fingers/blood supply , Image Processing, Computer-Assisted/methods , Neural Networks, Computer , Pattern Recognition, Automated/methods
13.
Sci Rep ; 14(1): 10560, 2024 05 08.
Article in English | MEDLINE | ID: mdl-38720020

ABSTRACT

Research on video analytics, especially in the area of human behavior recognition, has recently become increasingly popular. It is widely applied in virtual reality, video surveillance, and video retrieval. With the advancement of deep learning algorithms and computer hardware, the conventional two-dimensional convolution technique for training video models has been replaced by three-dimensional convolution, which enables the extraction of spatio-temporal features, and the use of 3D convolution in human behavior recognition has been the subject of growing interest. However, the increased dimensionality has led to challenges such as a dramatic increase in the number of parameters, increased time complexity, and a strong dependence on GPUs for effective spatio-temporal feature extraction; training can be considerably slow without powerful GPU hardware. To address these issues, this study proposes an Adaptive Time Compression (ATC) module. Functioning as an independent component, ATC can be seamlessly integrated into existing architectures and achieves data compression by eliminating redundant frames within video data. The ATC module effectively reduces GPU computing load and time complexity with negligible loss of accuracy, thereby facilitating real-time human behavior recognition.
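
A minimal sketch of the frame-elimination idea, using a simple mean-absolute-difference threshold as a stand-in for the module's adaptive criterion, which is not detailed in the abstract:

```python
# Drop frames whose content barely differs from the previous kept frame.
import numpy as np

def compress_frames(frames, threshold=8.0):
    """Keep a frame only if it differs enough from the last kept frame."""
    kept = [frames[0]]
    for f in frames[1:]:
        diff = np.abs(f.astype(np.float32) - kept[-1].astype(np.float32)).mean()
        if diff > threshold:
            kept.append(f)
    return np.stack(kept)

rng = np.random.default_rng(4)
base = rng.integers(0, 256, size=(8, 112, 112, 3), dtype=np.uint8)
video = np.repeat(base, 8, axis=0)    # 64 frames with heavy redundancy
print("kept", len(compress_frames(video)), "of", len(video), "frames")  # 8 of 64
```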


Subject(s)
Algorithms , Data Compression , Video Recording , Humans , Data Compression/methods , Human Activities , Deep Learning , Image Processing, Computer-Assisted/methods , Pattern Recognition, Automated/methods
14.
Article in English | MEDLINE | ID: mdl-38771682

ABSTRACT

Gesture recognition has emerged as a significant research domain in computer vision and human-computer interaction. One of the key challenges in gesture recognition is selecting the most useful channels to effectively represent gesture movements. In this study, we developed a channel selection algorithm that determines the number and placement of sensors critical to gesture classification. To validate this algorithm, we constructed a Force Myography (FMG)-based signal acquisition system. The algorithm considers each sensor as a distinct channel and determines the most effective channel combinations and recognition accuracy by assessing the correlation between each channel and the target gesture, as well as the redundant correlation between different channels. The database was created by collecting experimental data from 10 healthy individuals who wore 16 sensors while performing 13 unique hand gestures. The results indicate that the average number of channels across the 10 participants was 3, corresponding to a 75% decrease from the initial channel count, with an average recognition accuracy of 94.46%. This outperforms four widely adopted feature selection algorithms: Relief-F, mRMR, CFS, and ILFS. Moreover, we established a universal model for the position of gesture measurement points and verified it with an additional five participants, achieving an average recognition accuracy of 96.3%. This study provides a sound basis for identifying the optimal and minimum number and location of channels on the forearm and for designing specialized arm rings with unique shapes.
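
A sketch of the relevance-redundancy idea described above, written as a greedy mRMR-style selection; the exact scoring used in the paper is an assumption.

```python
# Greedily pick channels highly correlated with the gesture label but
# weakly correlated with already-selected channels.
import numpy as np

def select_channels(X, y, k=3):
    """X: (samples, channels), y: labels. Returns k channel indices."""
    n_ch = X.shape[1]
    relevance = np.array([abs(np.corrcoef(X[:, c], y)[0, 1])
                          for c in range(n_ch)])
    selected = [int(relevance.argmax())]
    while len(selected) < k:
        best, best_score = None, -np.inf
        for c in range(n_ch):
            if c in selected:
                continue
            redundancy = np.mean([abs(np.corrcoef(X[:, c], X[:, s])[0, 1])
                                  for s in selected])
            score = relevance[c] - redundancy    # relevance minus redundancy
            if score > best_score:
                best, best_score = c, score
        selected.append(best)
    return selected

rng = np.random.default_rng(5)
X = rng.normal(size=(200, 16))                   # 16 candidate FMG channels
y = (X[:, 3] + 0.5 * X[:, 7] > 0).astype(float)  # label driven by ch 3 and 7
print(select_channels(X, y))                     # channels 3 and 7 rank highly
```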


Subject(s)
Algorithms , Gestures , Pattern Recognition, Automated , Humans , Male , Female , Adult , Pattern Recognition, Automated/methods , Young Adult , Myography/methods , Hand/physiology , Healthy Volunteers , Reproducibility of Results
15.
Article in English | MEDLINE | ID: mdl-38683719

ABSTRACT

To overcome the challenges posed by the complex structure and large parameter requirements of existing classification models, the authors propose an improved extreme learning machine (ELM) classifier for human locomotion intent recognition in this study, resulting in enhanced classification accuracy. The structure of the ELM algorithm is enhanced using the logistic regression (LR) algorithm, significantly reducing the number of hidden layer nodes. Hence, this algorithm can be adopted for real-time human locomotion intent recognition on portable devices with only 234 parameters to store. Additionally, a hybrid grey wolf optimization and slime mould algorithm (GWO-SMA) is proposed to optimize the hidden layer bias of the improved ELM classifier. Numerical results demonstrate that the proposed model successfully recognizes nine daily motion modes including low-, mid-, and fast-speed level ground walking, ramp ascent/descent, sit/stand, and stair ascent/descent. Specifically, it achieves 96.75% accuracy with 5-fold cross-validation while maintaining a real-time prediction time of only 2 ms. These promising findings highlight the potential of onboard real-time recognition of continuous locomotion modes based on our model for the high-level control of powered knee prostheses.


Subject(s)
Algorithms , Amputees , Intention , Knee Prosthesis , Machine Learning , Humans , Amputees/rehabilitation , Male , Logistic Models , Locomotion/physiology , Walking , Femur , Pattern Recognition, Automated/methods , Adult
16.
Sensors (Basel) ; 24(8)2024 Apr 10.
Article in English | MEDLINE | ID: mdl-38676024

ABSTRACT

In recent decades, technological advancements have transformed the industry, highlighting the efficiency of automation and safety. The integration of augmented reality (AR) and gesture recognition has emerged as an innovative approach to create interactive environments for industrial equipment. Gesture recognition enhances AR applications by allowing intuitive interactions. This study presents a web-based architecture for the integration of AR and gesture recognition, designed to interact with industrial equipment. Emphasizing hardware-agnostic compatibility, the proposed structure offers an intuitive interaction with equipment control systems through natural gestures. Experimental validation, conducted using Google Glass, demonstrated the practical viability and potential of this approach in industrial operations. The development focused on optimizing the system's software and implementing techniques such as normalization, clamping, conversion, and filtering to achieve accurate and reliable gesture recognition under different usage conditions. The proposed approach promotes safer and more efficient industrial operations, contributing to research in AR and gesture recognition. Future work will include improving the gesture recognition accuracy, exploring alternative gestures, and expanding the platform integration to improve the user experience.


Subject(s)
Augmented Reality , Gestures , Humans , Industry , Software , Pattern Recognition, Automated/methods , User-Computer Interface
17.
Sensors (Basel) ; 24(8)2024 Apr 12.
Article in English | MEDLINE | ID: mdl-38676108

ABSTRACT

Egocentric activity recognition is a prominent computer vision task that is based on the use of wearable cameras. Since egocentric videos are captured through the perspective of the person wearing the camera, her/his body motions severely complicate the video content, imposing several challenges. In this work we propose a novel approach for domain-generalized egocentric human activity recognition. Typical approaches use a large amount of training data, aiming to cover all possible variants of each action. Moreover, several recent approaches have attempted to handle discrepancies between domains with a variety of costly and mostly unsupervised domain adaptation methods. In our approach we show that through simple manipulation of available source domain data and with minor involvement from the target domain, we are able to produce robust models, able to adequately predict human activity in egocentric video sequences. To this end, we introduce a novel three-stream deep neural network architecture combining elements of vision transformers and residual neural networks which are trained using multi-modal data. We evaluate the proposed approach using a challenging, egocentric video dataset and demonstrate its superiority over recent, state-of-the-art research works.


Subject(s)
Neural Networks, Computer , Video Recording , Humans , Video Recording/methods , Algorithms , Pattern Recognition, Automated/methods , Image Processing, Computer-Assisted/methods , Human Activities , Wearable Electronic Devices
18.
Sensors (Basel) ; 24(8)2024 Apr 14.
Article in English | MEDLINE | ID: mdl-38676137

ABSTRACT

Human action recognition (HAR) is a growing area of machine learning with a wide range of applications. One challenging aspect of HAR is recognizing human actions while playing music, further complicated by the need to recognize the musical notes being played. This paper proposes a deep learning-based method for simultaneous HAR and musical note recognition in music performances. We conducted experiments on performances on the Morin khuur, a traditional Mongolian instrument. The proposed method consists of two stages. First, we created a new dataset of Morin khuur performances. We used motion capture systems and depth sensors to collect data that includes hand keypoints, instrument segmentation information, and detailed movement information. We then analyzed RGB images, depth images, and motion data to determine which type of data provides the most valuable features for recognizing actions and notes in music performances. The second stage utilizes a Spatial Temporal Attention Graph Convolutional Network (STA-GCN) to recognize musical notes as continuous gestures. The STA-GCN model is designed to learn the relationships between hand keypoints and instrument segmentation information, which are crucial for accurate recognition. Evaluation on our dataset demonstrates that our model outperforms the traditional ST-GCN model, achieving an accuracy of 81.4%.


Subject(s)
Deep Learning , Music , Humans , Neural Networks, Computer , Human Activities , Pattern Recognition, Automated/methods , Gestures , Algorithms , Movement/physiology
19.
Sensors (Basel) ; 24(8)2024 Apr 18.
Article in English | MEDLINE | ID: mdl-38676207

ABSTRACT

Teaching gesture recognition is a technique used to recognize the hand movements of teachers in classroom teaching scenarios. This technology is widely used in education, including for classroom teaching evaluation, enhancing online teaching, and assisting special education. However, current research on gesture recognition in teaching mainly focuses on detecting the static gestures of individual students and analyzing their classroom behavior. To analyze the teacher's gestures and mitigate the difficulty of single-target dynamic gesture recognition in multi-person teaching scenarios, this paper proposes skeleton-based teaching gesture recognition (ST-TGR), which learns through spatio-temporal representation. This method mainly uses the human pose estimation technique RTMPose to extract the coordinates of the keypoints of the teacher's skeleton and then inputs the recognized sequence of the teacher's skeleton into the MoGRU action recognition network for classifying gesture actions. The MoGRU action recognition module mainly learns the spatio-temporal representation of target actions by stacking a multi-scale bidirectional gated recurrent unit (BiGRU) and using improved attention mechanism modules. To validate the generalization of the action recognition network model, we conducted comparative experiments on datasets including NTU RGB+D 60, UT-Kinect Action3D, SBU Kinect Interaction, and Florence 3D. The results indicate that, compared with most existing baseline models, the model proposed in this article exhibits better performance in recognition accuracy and speed.


Subject(s)
Gestures , Humans , Pattern Recognition, Automated/methods , Algorithms , Teaching
20.
Asian Pac J Cancer Prev ; 25(4): 1265-1270, 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38679986

ABSTRACT

PURPOSE: This study aims to compare the accuracy of the ADNEX MR scoring system and the pattern recognition system for evaluating adnexal lesions that are indeterminate on ultrasound (US) examination. METHODS: In this cross-sectional retrospective study, pelvic DCE-MRI of 245 patients with 340 adnexal masses was studied based on the ADNEX MR scoring system and the pattern recognition system. RESULTS: The ADNEX MR scoring system, with a sensitivity of 96.6% and a specificity of 91%, has an accuracy of 92.9%. The pattern recognition system's sensitivity, specificity, and accuracy are 95.8%, 93.3%, and 94.7%, respectively. PPV and NPV for the ADNEX MR scoring system were 85.1% and 98.1%, respectively; PPV and NPV for the pattern recognition system were 89.7% and 97.7%, respectively. The areas under the ROC curve for the ADNEX MR scoring system and the pattern recognition system are 0.938 (95% CI, 0.909-0.967) and 0.950 (95% CI, 0.922-0.977). Pairwise comparison of these AUCs showed no significant difference (p = 0.052). CONCLUSION: The pattern recognition system is less sensitive than the ADNEX MR scoring system, yet more specific.
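
For reference, these diagnostic metrics follow directly from a 2x2 confusion matrix. The counts below are a back-calculation chosen to be consistent with the reported ADNEX figures for 340 masses, not data taken from the paper:

```python
# Sensitivity, specificity, accuracy, PPV, and NPV from confusion counts.
def diagnostics(tp, fp, fn, tn):
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy":    (tp + tn) / (tp + fp + fn + tn),
        "PPV":         tp / (tp + fp),
        "NPV":         tn / (tn + fn),
    }

# Hypothetical counts (tp=114, fp=20, fn=4, tn=202; total 340) that
# reproduce sensitivity 96.6%, specificity 91%, accuracy 92.9%,
# PPV 85.1%, NPV 98.1%.
for name, value in diagnostics(114, 20, 4, 202).items():
    print(f"{name}: {value:.1%}")
```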


Subject(s)
Adnexal Diseases , Magnetic Resonance Imaging , Humans , Female , Cross-Sectional Studies , Retrospective Studies , Middle Aged , Adnexal Diseases/diagnostic imaging , Adnexal Diseases/pathology , Adnexal Diseases/diagnosis , Adult , Magnetic Resonance Imaging/methods , Aged , Prognosis , ROC Curve , Follow-Up Studies , Adolescent , Young Adult , Pattern Recognition, Automated/methods , Adnexa Uteri/pathology , Adnexa Uteri/diagnostic imaging