Search | BVS Regional Portal (test)

ACG-EmoCluster: A Novel Framework to Capture Spatial and Temporal Information from Emotional Speech Enhanced by DeepCluster.

Zhao, Huan; Li, Lixuan; Zha, Xupeng; Wang, Yujiang; Xie, Zhaoxin; Zhang, Zixing.

Sensors (Basel); 23(10), 2023 May 16.

Article in English | MEDLINE | ID: mdl-37430691

ABSTRACT

Speech emotion recognition (SER) is the task of learning a matching function between speech features and emotion labels. Speech data have higher information saturation than images and stronger temporal coherence than text. This makes it challenging to learn speech features fully and effectively with feature extractors designed for images or text. In this paper, we propose a novel semi-supervised framework for extracting spatial and temporal features from speech, called ACG-EmoCluster. The framework is equipped with a feature extractor that extracts spatial and temporal features simultaneously, as well as a clustering classifier that enhances the speech representations through unsupervised learning. Specifically, the feature extractor combines an Attn-Convolution neural network and a Bidirectional Gated Recurrent Unit (BiGRU). The Attn-Convolution network has a global spatial receptive field and can be generalized to the convolution block of any neural network according to the data scale. The BiGRU is conducive to learning temporal information on a small-scale dataset, thereby alleviating data dependence. The experimental results on MSP-Podcast demonstrate that ACG-EmoCluster can capture effective speech representations and outperforms all baselines in both supervised and semi-supervised SER tasks.
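The abstract does not describe how the clustering classifier works internally. As a rough illustration only, the sketch below shows the general DeepCluster-style idea of turning unlabeled feature vectors into pseudo-labels with k-means. The function name `kmeans_pseudo_labels`, the toy 2-D embeddings, and the first-k-points initialisation are all assumptions made for this example; none of them comes from the paper, and the real pipeline would cluster features produced by the Attn-Convolution and BiGRU extractor.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pseudo_labels(features, k, iters=20):
    """Cluster feature vectors and return (labels, centroids).

    The cluster index of each vector can serve as a pseudo-label
    for a subsequent supervised training step.
    """
    # Assumption for this sketch: deterministic init from the first k points.
    centroids = [list(f) for f in features[:k]]
    labels = [0] * len(features)
    for _ in range(iters):
        # Assignment step: each vector goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(f, centroids[c]))
            for f in features
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [f for f, lab in zip(features, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical 2-D "utterance embeddings", invented for illustration.
embeddings = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9), (0.1, 0.2)]
pseudo_labels, centroids = kmeans_pseudo_labels(embeddings, k=2)
print(pseudo_labels)
```

In a semi-supervised setting, pseudo-labels of this kind would typically be recomputed as the feature extractor improves, so that clustering and representation learning alternate.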

Subjects

Emotions; Speech; Cluster Analysis; Neural Networks, Computer

Adaptive Multi-Type Fingerprint Indoor Positioning and Localization Method Based on Multi-Task Learning and Weight Coefficients K-Nearest Neighbor.

Yuan, Zhengwu; Zha, Xupeng; Zhang, Xiaojian.

Sensors (Basel); 20(18), 2020 Sep 21.

Article in English | MEDLINE | ID: mdl-32967320

ABSTRACT

Complex indoor environments make received fingerprints unreliable, which limits indoor positioning and localization methods based on fingerprint data. This paper proposes an adaptive multi-type fingerprint indoor positioning and localization method based on multi-task learning (MTL) and Weight Coefficients K-Nearest Neighbor (WCKNN), which integrates magnetic field, Wi-Fi and Bluetooth fingerprints for positioning and localization. The MTL fuses the features of the different fingerprint types to search for potential relationships among them. It also exploits the synergy between the tasks, which can boost positioning and localization performance. The WCKNN then predicts another position of the fingerprints within the class determined by the obtained location. The final position is obtained by fusing the predicted positions using a weighted average method whose weights are the positioning errors provided by positioning error prediction models. Experimental results indicated that the proposed method achieved 98.58% accuracy in classifying locations with a mean positioning error of 1.95 m.
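The two numeric steps mentioned in the abstract, a weighted k-nearest-neighbor position estimate and a weighted-average fusion of candidate positions, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' WCKNN: the inverse-distance neighbor weights, the inverse-error fusion weights (a smaller predicted error gives a larger weight), the function names, and the sample values are all hypothetical, since the abstract does not give the exact weighting formulas.

```python
def euclidean(a, b):
    """Euclidean distance between two equal-length signal vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def weighted_knn_position(query, fingerprints, k=3, eps=1e-9):
    """Estimate (x, y) as the inverse-distance-weighted mean of the
    positions of the k reference fingerprints closest to `query`.

    fingerprints: list of (signal_vector, (x, y)) pairs.
    """
    nearest = sorted(fingerprints, key=lambda fp: euclidean(query, fp[0]))[:k]
    # Assumption: closer fingerprints get larger weights (1 / distance).
    weights = [1.0 / (euclidean(query, sig) + eps) for sig, _ in nearest]
    total = sum(weights)
    x = sum(w * pos[0] for w, (_, pos) in zip(weights, nearest)) / total
    y = sum(w * pos[1] for w, (_, pos) in zip(weights, nearest)) / total
    return (x, y)


def fuse_positions(positions, predicted_errors, eps=1e-9):
    """Weighted average of candidate positions.

    Assumption: the weight of each candidate is the inverse of its
    predicted positioning error.
    """
    weights = [1.0 / (e + eps) for e in predicted_errors]
    total = sum(weights)
    return tuple(
        sum(w * p[i] for w, p in zip(weights, positions)) / total
        for i in range(2)
    )


# Hypothetical one-dimensional signal readings with known positions in metres.
reference = [((-50.0,), (0.0, 0.0)), ((-60.0,), (10.0, 0.0)), ((-90.0,), (100.0, 100.0))]
knn_estimate = weighted_knn_position((-55.0,), reference, k=2)
fused = fuse_positions([(0.0, 0.0), (4.0, 0.0)], [1.0, 3.0])
print(knn_estimate, fused)
```

In the usage lines, the query lies midway between the two nearest reference signals, so the estimate falls halfway between their positions; in the fusion call, the candidate with the smaller predicted error pulls the result toward itself.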
