Results 1 - 20 of 22
1.
Neural Netw ; 179: 106573, 2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39096753

ABSTRACT

Recognizing expressions from dynamic facial videos can reveal more natural human affect states, but the task becomes more challenging in real-world scenes because of facial pose variations, partial occlusions and subtle dynamic changes in emotion sequences. Existing transformer-based methods often rely on self-attention to model the global relations among spatial features or temporal features, and so cannot focus well on the important expression-related local structures in both the spatial and temporal features of in-the-wild expression videos. To this end, we incorporate diverse graph structures into transformers and propose a CDGT method to construct diverse graph transformers for efficient emotion recognition from in-the-wild videos. Specifically, our method contains a spatial dual-graph transformer and a temporal hyperbolic-graph transformer. The former deploys a dual-graph constrained attention to capture latent emotion-related graph geometry structures among local spatial tokens for efficient feature representation, especially for video frames with pose variations and partial occlusions. The latter adopts a hyperbolic-graph constrained self-attention that explores important temporal graph structure information in hyperbolic space to model more subtle changes of dynamic emotion. Extensive experimental results on in-the-wild video-based facial expression databases show that our proposed CDGT outperforms other state-of-the-art methods.

2.
IEEE J Biomed Health Inform ; 27(9): 4385-4396, 2023 09.
Article in English | MEDLINE | ID: mdl-37467088

ABSTRACT

Medical images such as facial and tongue images have been widely used for intelligence-assisted diagnosis, which can be regarded as the multi-label classification task for disease location (DL) and disease nature (DN) of biomedical images. Compared with complicated convolutional neural networks and Transformers for this task, recent MLP-like architectures are not only simple and less computationally expensive, but also have stronger generalization capabilities. However, MLP-like models require better input features from the image. Thus, this study proposes a novel convolution complex transformation MLP-like (CCT-MLP) model for the multi-label DL and DN recognition task for facial and tongue images. Notably, the convolutional Tokenizer and multiple convolutional layers are first used to extract the better shallow features from input biomedical images to make up for the loss of spatial information obtained by the simple MLP structure. Subsequently, the Channel-MLP architecture with complex transformations is used to extract deep-level contextual features. In this way, multi-channel features are extracted and mixed to perform the multi-label classification of the input biomedical images. Experimental results on our constructed multi-label facial and tongue image datasets demonstrate that our method outperforms existing methods in terms of both accuracy (Acc) and mean average precision (mAP).


Subject(s)
Diagnostic Imaging , Neural Networks, Computer , Humans
3.
Math Biosci Eng ; 20(5): 8085-8102, 2023 02 24.
Article in English | MEDLINE | ID: mdl-37161187

ABSTRACT

Currently, machine learning methods have been utilized to realize the early detection of Parkinson's disease (PD) by using voice signals. Because the vocal system of each person is unique, and the same person's pronunciation can be different at different times, the training samples used in machine learning become very different from the speech signal of the patient to be diagnosed, frequently resulting in poor diagnostic performance. On this account, this paper presents a new intelligent personalized diagnosis method (PDM) for Parkinson's disease. The method was designed to begin with constructing new training data by assigning the best classifier to each training sample composed of features from the speech signals of patients. Subsequently, a meta-classifier was trained on the new training data. Finally, for the signal of each test patient, the method used the meta-classifier to select the most appropriate classifier, followed by adopting the selected classifier to classify the signal so that the more accurate diagnosis result of the test patient can be obtained. The novelty of the proposed method is that the proposed method uses different classifiers to perform the diagnosis of PD for diversified patients, whereas the current method uses the same classifier to diagnose all patients to be tested. Results of a large number of experiments show that PDM not only improves the performance but also exceeds the existing methods in speed.


Subject(s)
Parkinson Disease , Humans , Parkinson Disease/diagnosis , Machine Learning
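
The three steps described in the abstract above (assign a best classifier to each training sample, train a meta-classifier on those assignments, then let the meta-classifier pick the classifier for each test patient) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the two threshold classifiers, the "first classifier that is correct" assignment rule, the 1-nearest-neighbour meta-classifier, and the toy feature vectors are all hypothetical.

```python
# Sketch only: the base classifiers, the "first correct classifier" rule and the
# 1-nearest-neighbour meta-classifier are illustrative assumptions, not the paper's method.
from typing import Callable, List, Sequence

Vector = Sequence[float]
Classifier = Callable[[Vector], int]


def assign_best_classifier(x: Vector, y: int, classifiers: List[Classifier]) -> int:
    """Index of the first base classifier that labels (x, y) correctly, else 0."""
    for index, classifier in enumerate(classifiers):
        if classifier(x) == y:
            return index
    return 0


def build_meta_classifier(train_x: List[Vector], choices: List[int]) -> Callable[[Vector], int]:
    """1-nearest-neighbour selector mapping a feature vector to a classifier index."""
    def select(x: Vector) -> int:
        def squared_distance(other: Vector) -> float:
            return sum((a - b) ** 2 for a, b in zip(other, x))
        nearest = min(range(len(train_x)), key=lambda i: squared_distance(train_x[i]))
        return choices[nearest]
    return select


def diagnose(x: Vector, classifiers: List[Classifier], select: Callable[[Vector], int]) -> int:
    """Let the meta-classifier pick a base classifier, then apply it to x."""
    return classifiers[select(x)](x)


# Hypothetical toy data: two speech features, label 1 = PD, label 0 = healthy.
classifiers = [lambda x: int(x[0] > 0.5), lambda x: int(x[1] > 0.5)]
train_x = [[0.9, 0.1], [0.1, 0.9], [0.2, 0.1]]
train_y = [1, 1, 0]

# Step 1: new training data (features -> index of the best classifier).
choices = [assign_best_classifier(x, y, classifiers) for x, y in zip(train_x, train_y)]
# Step 2: train the meta-classifier on the new training data.
select = build_meta_classifier(train_x, choices)
# Step 3: per-patient classifier selection followed by classification.
prediction = diagnose([0.15, 0.85], classifiers, select)
```

The point of the design is that the selection step is learned per sample, so two patients with different voice characteristics may be routed to different classifiers.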
4.
BMJ Open ; 12(12): e063442, 2022 12 30.
Article in English | MEDLINE | ID: mdl-36585134

ABSTRACT

INTRODUCTION: Insomnia affects physical and mental health due to the lack of continuous and complete sleep architecture. Polysomnograms (PSGs) are used to record electrical information, from which sleep architecture can be analysed using deep learning. Acupuncture combined with cognitive-behavioural therapy for insomnia (CBT-I) could not only improve sleep quality and relieve anxiety and depression but also ameliorate poor sleep habits and detrimental cognition. Therefore, this study will focus on the effects of electroacupuncture combined with CBT-I on sleep architecture with deep learning. METHODS AND ANALYSIS: This randomised controlled trial will evaluate the efficacy and effectiveness of electroacupuncture combined with CBT-I in patients with insomnia. Participants will be randomised to receive either electroacupuncture combined with CBT-I or sham acupuncture combined with CBT-I and followed up for 4 weeks. The primary outcome is sleep quality, which is evaluated by the Pittsburgh Sleep Quality Index. The secondary outcome measures include a measurement of depression severity, anxiety, maladaptive cognitions associated with sleep and adverse events. Sleep architecture will be assessed using deep learning on PSGs. ETHICS AND DISSEMINATION: This trial has been approved by the institutional review boards and ethics committees of the First Affiliated Hospital of Sun Yat-sen University (2021763). The results of this trial will be disseminated through peer-reviewed publications and conference abstracts or posters. TRIAL REGISTRATION NUMBER: CTR2100052502.


Subject(s)
Acupuncture Therapy , Cognitive Behavioral Therapy , Sleep Initiation and Maintenance Disorders , Humans , Sleep Initiation and Maintenance Disorders/therapy , Treatment Outcome , Sleep , Cognitive Behavioral Therapy/methods , Randomized Controlled Trials as Topic
5.
J Healthc Eng ; 2022: 6553017, 2022.
Article in English | MEDLINE | ID: mdl-36389107

ABSTRACT

Traditional Chinese Medicine (TCM) is one of the oldest medical systems in the world, and inquiry is an essential part of TCM diagnosis. The development of artificial intelligence has led to the proposal of several computational TCM diagnostic methods. However, such studies are few, and they have the following flaws: (1) insufficient engagement with the patient, (2) barren TCM consultation philosophy, and (3) inadequate validation of the method. As TCM inquiry knowledge is abstract and there are few relevant datasets, we devise a novel knowledge representation technique. The mapping of symptoms and syndromes is constructed based on the diagnostics of traditional Chinese medicine. The inquiry knowledge base is constructed using the "Ten Brief Inquiries," TCM's domain knowledge, as a guide. Subsequently, a corresponding assessment approach is proposed for an intelligent consultation model for syndrome differentiation. We establish three criteria: the quality of the generated question-answer pairs, the accuracy of model identification, and the average number of questions. Three TCM specialists are asked to undertake a manual evaluation of the model separately. The results reveal that our approach is capable of fairly accurate syndrome differentiation. Furthermore, the model's question and answer pairs for simulated consultations are relevant, accurate, and efficient.


Subject(s)
Artificial Intelligence , Medicine, Chinese Traditional , Humans , Medicine, Chinese Traditional/methods , Syndrome , Philosophy , Referral and Consultation
6.
IEEE J Biomed Health Inform ; 26(2): 626-637, 2022 02.
Article in English | MEDLINE | ID: mdl-34428166

ABSTRACT

Physical signs of patients provide crucial evidence for diagnosing both the location and the nature of the disease, where there is a sequential relationship between the two tasks. Thus their joint learning can utilize intrinsic association by transferring related knowledge across relevant tasks. Choosing the right time to transfer is a critical problem for joint learning. However, how to dynamically adjust when tasks interact to capture the right time for transferring related knowledge is still an open issue. To this end, we propose a Task-Coupling Elastic Learning (TCEL) framework to model the task relatedness for classifying disease-location and disease-nature based on physical sign images. The main idea is to dynamically transfer relevant knowledge by progressively shifting task-coupling from loose to tight during the multi-stage training. In the early stage of training, we relax the constraints of modeling relations to focus more on learning the generic task-common features. In the later stage, the semantic guidance is strengthened to learn the task-specific features. Specifically, a dynamic sequential module (DSM) is proposed to explicitly model the sequential relationship and enable multi-stage training. Moreover, to address the side effect of DSM, a new loss regularization is proposed. Extensive experiments on two clinical datasets show the superiority of the proposed method over the baselines, and demonstrate the effectiveness of the proposed task-coupling elastic mechanism.


Subject(s)
Machine Learning , Humans
7.
IEEE Trans Cybern ; 52(8): 8547-8560, 2022 Aug.
Article in English | MEDLINE | ID: mdl-34398768

ABSTRACT

Despite their tremendous success in computer vision, deep convolutional networks suffer from serious computation costs and redundancies. Although previous works address this by enhancing the diversity of filters, they have not considered the complementarity and the completeness of the internal convolutional structure. To respond to this problem, we propose a novel inner-imaging (InI) architecture, which allows relationships between channels to meet the above requirement. Specifically, we organize the channel signal points in groups using convolutional kernels to model both the intragroup and intergroup relationships simultaneously. A convolutional filter is a powerful tool for modeling spatial relations and organizing grouped signals, so the proposed methods map the channel signals onto a pseudoimage, like putting a lens into the internal convolution structure. Consequently, not only is the diversity of channels increased but also the complementarity and completeness can be explicitly enhanced. The proposed architecture is lightweight and easy to implement. It provides an efficient self-organization strategy for convolutional networks to improve their performance. Extensive experiments are conducted on multiple benchmark datasets, including CIFAR, SVHN, and ImageNet. Experimental results verify the effectiveness of the InI mechanism with the most popular convolutional networks as the backbones.


Subject(s)
Algorithms , Neural Networks, Computer
8.
Mitochondrial DNA B Resour ; 6(11): 3269-3270, 2021.
Article in English | MEDLINE | ID: mdl-34712807

ABSTRACT

Arcangelisia gusanlung H.S.Lo is widely used as a folk medicine by the Dai and Li peoples. Here, we report the first complete chloroplast (cp) genome sequence for this species based on Illumina paired-end sequencing data. The cp genome was 162,509 bp in length, with a small single-copy (SSC) region of 20,852 bp, a large single-copy (LSC) region of 91,449 bp, and two separated inverted regions of 25,104 bp each. In total, 129 unique genes were identified in this genome, including 84 protein-coding genes, 37 tRNA genes, and eight rRNA genes. The GC content of this genome is 37.8%. Phylogenetic analysis based on 13 complete cp genomes showed a strong sister relationship with Tinospora cordifolia (Willd.) Miers and Tinospora sinensis (Lour.) Merr. This complete genome of A. gusanlung will provide valuable information to elucidate the mechanism of speciation of Arcangelisia Becc.
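
As a quick consistency check on the figures in the abstract above, the region lengths and gene counts can be summed. The short sketch below assumes that 25,104 bp is the length of each of the two inverted regions; the function name is made up for illustration.

```python
# Figures copied from the abstract; lengths are in base pairs (bp).
LSC = 91_449       # large single-copy region
SSC = 20_852       # small single-copy region
IR = 25_104        # assumed to be the length of EACH of the two inverted regions
TOTAL = 162_509    # reported genome length


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a genome made of one LSC, one SSC and two inverted regions."""
    return lsc + ssc + 2 * ir


genome_length = quadripartite_length(LSC, SSC, IR)
gene_total = 84 + 37 + 8   # protein-coding + tRNA + rRNA genes

print(genome_length == TOTAL)   # the four regions account for the whole genome
print(gene_total == 129)        # the three gene classes account for all unique genes
```

Both checks hold for the reported numbers, which supports reading 25,104 bp as a per-region length.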

9.
Artif Intell Med ; 118: 102110, 2021 08.
Article in English | MEDLINE | ID: mdl-34412836

ABSTRACT

OBJECTIVE: To use a deep learning model for tongue image-based disease location recognition, focusing on two problems: (1) general convolutional networks are weak at modeling detailed regional tongue features; (2) the group relationship between convolution channels is ignored, which causes high model redundancy. METHODS: To enhance convolutional neural networks, a stochastic region pooling method is proposed to gain detailed regional features. Also, an inner-imaging channel relationship modeling method is proposed to model multi-region relations on all channels. Moreover, we combine it with the spatial attention mechanism. RESULTS: A tongue image dataset with clinical disease-location labels is established, and extensive experiments are carried out on it. The experimental results show that the proposed method can effectively model the regional details of tongue images and improve the performance of disease location recognition. CONCLUSION: In this paper, we construct a tongue image dataset with disease-location labels to mine the relationship between tongue images and disease locations. A novel fully-channel regional attention network is proposed to model the local detailed tongue features and improve the modeling efficiency. SIGNIFICANCE: The applications of deep learning in tongue image disease-location recognition and the proposed innovative models have guiding significance for other assistant diagnostic tasks. The proposed model provides an example of efficient modeling of detailed tongue features, which is of great guiding significance for other auxiliary diagnosis applications.


Subject(s)
Neural Networks, Computer , Tongue , Image Processing, Computer-Assisted , Tongue/diagnostic imaging
10.
J Biomed Inform ; 117: 103727, 2021 05.
Article in English | MEDLINE | ID: mdl-33713854

ABSTRACT

Online healthcare consultation offers people a convenient way to consult doctors. In this paper, we aim at building a generative dialog system for Chinese healthcare consultation. As the original Seq2seq architecture tends to suffer the issue of generating low-quality responses, the multi-source Seq2seq architecture generating more informative responses is much more preferred in this task. The multi-source Seq2seq architecture takes advantage of retrieval techniques to obtain responses from the database, and then takes these responses alongside the user-issued question as input. However, some of the retrieved responses might be not much related to the user-issued question, resulting in the generation of unsatisfying responses that are not correct in diagnosis or instead provide inappropriate advice on prevention or treatment. Therefore, this paper proposes multi-source Seq2seq guided by knowledge (MSSGK) to handle this problem. MSSGK differs from the multi-source Seq2seq architecture in that domain knowledge, including disease labels and topic labels about prevention and treatment, is introduced into the response generation via a multi-task learning framework. To better exploit the domain knowledge, we propose three attention mechanisms to provide more appropriate guidance for response generation. Experimental results on a dataset of real-world healthcare consultation show the effectiveness of the proposed method.


Subject(s)
Learning , Machine Learning , China , Delivery of Health Care , Humans , Referral and Consultation
11.
IEEE Trans Cybern ; 51(2): 708-721, 2021 Feb.
Article in English | MEDLINE | ID: mdl-31059462

ABSTRACT

The tongue image provides important physical information about humans. It is of great importance for diagnoses and treatments in clinical medicine. Herbal prescriptions are simple, noninvasive, and have low side effects. Thus, they are widely applied in China. Studies on the automatic construction of herbal prescriptions from tongue images are of great significance for using deep learning to explore the relevance of tongue images to herbal prescriptions, and the technology can be applied to healthcare services in mobile medical systems. In order to adapt to tongue images taken in a variety of photographic environments and construct herbal prescriptions, a neural network framework for prescription construction is designed. It includes single/double convolution channels and fully connected layers. Furthermore, an auxiliary therapy topic loss mechanism is proposed to model the therapy of Chinese doctors and alleviate the interference of sparse output labels on the diversity of results. The experiments use real-world tongue images and the corresponding prescriptions, and the results show that the framework can generate prescriptions that are close to the real samples, which verifies the feasibility of the proposed method for the automatic construction of herbal prescriptions from tongue images. It also provides a reference for automatic herbal prescription construction from more physical information.


Subject(s)
Drugs, Chinese Herbal , Image Interpretation, Computer-Assisted/methods , Medicine, Chinese Traditional/methods , Neural Networks, Computer , Tongue/diagnostic imaging , Drugs, Chinese Herbal/administration & dosage , Drugs, Chinese Herbal/therapeutic use , Humans , Physical Examination/methods
12.
J Healthc Eng ; 2020: 8834465, 2020.
Article in English | MEDLINE | ID: mdl-33274038

ABSTRACT

Background: Body constitution (BC) is the abstract concept indicating the state of a person's health in Traditional Chinese Medicine (TCM). The doctor identifies the body constitution of the patient through inspection and inquiry. Previous research simulates doctors to identify BC types according to a patient's objective physical indicators. However, the lack of subjective feeling information can reduce the accuracy of the machine to imitate the doctor's diagnosis. The Constitution in Chinese Medicine Questionnaire (CCMQ) is used to collect subjective information but suffers from low acquisition efficiency. Methods: This paper presents a personalized body constitution inquiry method based on a machine learning technique. It employs a random generator, a feature extractor, and a classifier to simulate the doctor inquiry and generate a personalized questionnaire. Specifically, the feature extractor evaluates and sorts the question of the constitution in the CCMQ based on the recognition results of the tongue coating image of patients. The sorted questions and relevant BC label are inputted into the classifier; the best questions are screened out for patients. Results: The experimental results show that our method can select personalized questions from the CCMQ for the patients, significantly reducing the time and the number of questions to answer. It also improves the accuracy of recognizing BC. Compared with the CCMQ, patients had 68.3% fewer questions to answer and the time occupied by answering is reduced by 80.3%. Conclusions: The proposed method can simulate the doctor's inquiry and pick out personalized questions for patients. It can act as auxiliary diagnosis tools to collect subjective patient feelings and help make further judgments on the patient's BC types.


Subject(s)
Body Constitution , Physicians , Humans , Machine Learning , Medicine, Chinese Traditional , Surveys and Questionnaires
13.
J Biomed Inform ; 112: 103608, 2020 12.
Article in English | MEDLINE | ID: mdl-33132138

ABSTRACT

Deep learning methods have been applied to Chinese named entity recognition for online medical consultation. They require a large number of labelled samples. However, no such database is available at present. This paper begins by constructing a larger labelled Chinese text database for online medical consultation. Second, a basic framework unit is proposed, which is pre-trained by transfer learning from both a bidirectional language model and a masked language model trained on the larger unlabelled data. Finally, cross domains adversarial learning (CDAL) for Chinese named entity recognition is proposed to further improve the performance, which not only uses the pre-trained basic framework unit, but also uses adversarial multi-task learning on both electronic medical record texts and online medical consultation texts. Experimental results validate the effectiveness of CDAL.


Subject(s)
Language , Natural Language Processing , China , Electronic Health Records , Referral and Consultation
14.
Artif Intell Med ; 109: 101951, 2020 09.
Article in English | MEDLINE | ID: mdl-34756217

ABSTRACT

Traditional Chinese Medicine (TCM) considers that the personal constitution determines the occurrence trend and therapeutic effects of certain diseases, and that it can be recognized by machine learning through tongue images. However, current machine learning methods are confronted with two challenges. First, no large tongue image databases are available. Second, they do not use the domain knowledge of TCM, so the imbalance of constitution categories cannot be solved. Therefore, this paper proposes a new constitution recognition method based on zero-shot learning with the knowledge of TCM. To further improve the performance, a new zero-shot learning method is proposed by grouping attributes and learning discriminant latent features, which can better solve the imbalance problem of constitution categories. Experimental results on our constructed databases validate the proposed methods.


Subject(s)
Machine Learning , Tongue , Databases, Factual , Medicine, Chinese Traditional
15.
Nanoscale ; 11(24): 11765-11773, 2019 Jun 20.
Article in English | MEDLINE | ID: mdl-31184359

ABSTRACT

Effective oxygen evolution reaction (OER) catalysts composed of Earth-abundant transition metals are crucial for sustainable energy conversion and storage. Metal-organic frameworks (MOFs) with tunable compositions are promising precursors for the fabrication of hollow and porous electrocatalysts. However, pulverous MOFs usually suffer from agglomeration during pyrolysis, greatly reducing the activity of their derived catalysts. In this work, Prussian blue analogue (PBA) arrays with hierarchical multidimensional architecture were directly grown on nickel foam (NF) using a template-oriented method. The subsequent calcination in air yielded NiₓCo₃₋ₓO₄ nanoplate arrays consisting of porous and hollow nanocubes. The derived bimetallic NiₓCo₃₋ₓO₄/NF required an overpotential of only 287 mV to achieve a current density of 10 mA cm⁻² in 1.0 M KOH solution, which is much lower than that of the monometallic NiO and the RuO₂ benchmark. The 3D intersectional architecture of the NiₓCo₃₋ₓO₄ nanoplates and the porous and hollow nanocube subunits contributed to the large specific surface area and reduced charge-transfer resistance of the NiₓCo₃₋ₓO₄/NF electrode. Density functional theory (DFT) calculations and post-OER characterization revealed that the incorporated Co provided the active sites and that electrochemically active CoOOH intermediates were formed in situ during the OER. Our study provides a facile and efficient strategy for the rational design of MOF-derived materials towards effective and low-cost electrocatalysis.

16.
Artif Intell Med ; 96: 123-133, 2019 05.
Article in English | MEDLINE | ID: mdl-31164206

ABSTRACT

The body constitution is closely related to diseases and the corresponding treatment programs in Traditional Chinese Medicine. It can be recognized by tongue image diagnosis, so it is essentially regarded as a problem of tongue image classification, where each tongue image is classified into one of nine constitution types. This paper first presents a system framework to automatically identify the constitution through natural tongue images, where deep convolutional neural networks are carefully designed for tongue coating detection, tongue coating calibration, and constitution recognition. Under the system framework, a novel complexity perception (CP) classification method is proposed to perform the constitution recognition, which can better deal with the adverse influence of variations in environmental conditions and the uneven distribution of the tongue images on constitution recognition performance. CP performs the constitution recognition based on the complexity of individual tongue images by selecting the classifier with the corresponding complexity. To evaluate the performance of the proposed method, experiments are conducted on three sizes of clinical tongue image datasets from hospitals. The experimental results illustrate that CP is effective in improving the accuracy of body constitution recognition.


Subject(s)
Deep Learning , Image Processing, Computer-Assisted/methods , Tongue/diagnostic imaging , Humans , Medicine, Chinese Traditional , Neural Networks, Computer
17.
Comput Math Methods Med ; 2019: 1258782, 2019.
Article in English | MEDLINE | ID: mdl-31933675

ABSTRACT

Constitution classification is the basis and core content of TCM constitution research. In order to improve the accuracy of constitution classification, this paper proposes a multilevel and multiscale features aggregation method within the convolutional neural network, which consists of four steps. First, it uses the pretrained VGG16 as the basic network and then refines the network structure through supervised feature learning so as to capture local image features. Second, it extracts the image features of different layers from the fine-tuned VGG16 model, which are then dimensionally reduced by principal component analysis (PCA). Third, it uses another pretrained NASNetMobile network for supervised feature learning, where the previous layer features of the global average pooling layer are outputted. Similarly, these features are dimensionally reduced by PCA and then are fused with the features of different layers in VGG16 after the PCA. Finally, all features are aggregated with the fully connected layers of the fine-tuned VGG16, and then the constitution classification is performed. The conducted experiments show that using the multilevel and multiscale feature aggregation is very effective in the constitution classification, and the accuracy on the test dataset reaches 69.61%.


Subject(s)
Algorithms , Face , Image Processing, Computer-Assisted/methods , Machine Learning , Neural Networks, Computer , Pattern Recognition, Automated/methods , China , Humans , Principal Component Analysis
18.
J Biomed Inform ; 82: 154-168, 2018 06.
Article in English | MEDLINE | ID: mdl-29705197

ABSTRACT

BACKGROUND: Current Chinese medicine has an urgent demand for convenient medical services. When facing a large number of patients, understanding patients' questions automatically and precisely is useful. Unlike highly professional medical texts, patients' questions contain only a small amount of description regarding the symptoms, and the questions are slightly professional and colloquial. OBJECTIVE: The aim of this paper is to implement a department classification system for patient questions. Patients' questions will be classified into 11 departments, such as surgery and others. METHODS: This paper presents a morpheme growth model that enhances the memories of key elements in questions, and later extracts the "label-indicators" and germinates the expansion vectors around them. Finally, the model inputs the expansion vectors into a neural network to assign department labels for patients' questions. RESULTS: All compared methods are validated by experiments on three datasets that are composed of real patient questions. The proposed method has some ability to improve the performance of the classification. CONCLUSIONS: The proposed method is effective for the department classification of patients' questions and serves as a useful system for the automatic understanding of patient questions.


Subject(s)
Communication , Medical Informatics/methods , Neural Networks, Computer , Physician-Patient Relations , China , Databases, Factual , Deep Learning , Delivery of Health Care , Humans , Natural Language Processing , Patient Participation , Reproducibility of Results , Workflow
19.
Comput Math Methods Med ; 2017: 9846707, 2017.
Article in English | MEDLINE | ID: mdl-29181087

ABSTRACT

Body constitution classification is the basis and core content of traditional Chinese medicine constitution research. It aims to extract the relevant laws from the complex constitution phenomenon and finally build the constitution classification system. Traditional identification methods, such as questionnaires, have the disadvantages of inefficiency and low accuracy. This paper proposes a body constitution recognition algorithm based on a deep convolutional neural network, which can classify individual constitution types according to face images. The proposed model first uses the convolutional neural network to extract the features of the face image and then combines the extracted features with the color features. Finally, the fused features are input to the Softmax classifier to get the classification result. Different comparison experiments show that the algorithm proposed in this paper achieves an accuracy of 65.29% in constitution classification. Its performance was accepted by Chinese medicine practitioners.


Subject(s)
Body Constitution , Face , Neural Networks, Computer , Algorithms , China , Color , Facial Recognition , Humans , Medicine, Chinese Traditional , Models, Statistical , Reproducibility of Results , Software , Surveys and Questionnaires
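
The last two steps in the abstract above (combining network features with color features, then feeding the fused vector to a Softmax classifier) can be illustrated with a small sketch. Everything concrete here is an assumption for illustration: the feature values, the three toy classes, the identity weight matrix, and the function names. A real model would learn the weights and use far larger feature vectors.

```python
# Sketch only: feature values, weights and the three toy classes are hypothetical.
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax: subtract the max logit before exponentiating."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(cnn_features: Sequence[float], color_features: Sequence[float]) -> List[float]:
    """Feature fusion by simple concatenation."""
    return list(cnn_features) + list(color_features)


def classify(fused: Sequence[float],
             weights: Sequence[Sequence[float]],
             biases: Sequence[float]) -> Tuple[int, List[float]]:
    """Linear layer followed by softmax; returns (predicted class index, probabilities)."""
    logits = [sum(w * f for w, f in zip(row, fused)) + b for row, b in zip(weights, biases)]
    probs = softmax(logits)
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs


cnn_features = [0.2, 0.7]    # stand-in for features extracted by the network
color_features = [0.5]       # stand-in for hand-crafted color features
fused = fuse(cnn_features, color_features)

weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # one row per toy class
biases = [0.0, 0.0, 0.0]
label, probs = classify(fused, weights, biases)
```

Concatenation is only one possible fusion rule; the abstract does not say which one the authors used.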
20.
Comput Intell Neurosci ; 2017: 1945630, 2017.
Article in English | MEDLINE | ID: mdl-28356908

ABSTRACT

Human emotions can now be recognized from speech signals using machine learning methods; however, these methods are challenged by lower recognition accuracies in real applications due to their lack of rich representation ability. Deep belief networks (DBN) can automatically discover multiple levels of representations in speech signals. To make full use of their advantages, this paper presents an ensemble of random deep belief networks (RDBN) method for speech emotion recognition. It first extracts the low-level features of the input speech signal and then uses them to construct many random subspaces. Each random subspace is then provided to a DBN to yield the higher-level features as the input of the classifier to output an emotion label. All output emotion labels are then fused through majority voting to decide the final emotion label for the input speech signal. Experimental results on benchmark speech emotion databases show that RDBN has better accuracy than the compared methods for speech emotion recognition.


Subject(s)
Algorithms , Emotions/physiology , Nerve Net/physiology , Pattern Recognition, Automated , Speech/physiology , Humans , Pattern Recognition, Automated/methods , Task Performance and Analysis
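
The ensemble structure in the abstract above (random feature subspaces, one classifier per subspace, majority voting over the resulting labels) can be sketched as below. This is a hedged illustration, not the RDBN method itself: the deep belief network is replaced by a trivial hypothetical `toy_classifier`, and the function names, label strings, and feature values are all made up.

```python
# Sketch only: toy_classifier stands in for the DBN plus classifier of the paper.
import random
from collections import Counter
from typing import Callable, List, Sequence


def random_subspaces(n_features: int, subspace_size: int,
                     n_subspaces: int, seed: int = 0) -> List[List[int]]:
    """Draw feature-index subsets without replacement, one per ensemble member."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_features), subspace_size))
            for _ in range(n_subspaces)]


def project(x: Sequence[float], subspace: Sequence[int]) -> List[float]:
    """Keep only the features whose indices belong to the subspace."""
    return [x[i] for i in subspace]


def majority_vote(labels: Sequence[str]) -> str:
    """Return the most frequent label."""
    return Counter(labels).most_common(1)[0][0]


def ensemble_predict(x: Sequence[float], subspaces: List[List[int]],
                     classifier: Callable[[List[float]], str]) -> str:
    """Classify x in every subspace and fuse the labels by majority voting."""
    votes = [classifier(project(x, s)) for s in subspaces]
    return majority_vote(votes)


def toy_classifier(features: List[float]) -> str:
    """Hypothetical stand-in: thresholds the mean of the projected features."""
    return "high_arousal" if sum(features) / len(features) > 0.5 else "low_arousal"


subspaces = random_subspaces(n_features=6, subspace_size=3, n_subspaces=5)
final_label = ensemble_predict([0.9] * 6, subspaces, toy_classifier)
```

Using an odd number of ensemble members, as here, avoids ties when there are only two labels.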