Results 1 - 3 of 3
1.
Article in English | MEDLINE | ID: mdl-37327101

ABSTRACT

Multimodal sentiment analysis (MSA) is important for quickly and accurately understanding people's attitudes and opinions about an event. However, existing sentiment analysis methods suffer from the dominant contribution of the text modality in the dataset, a problem known as text dominance. In this context, we emphasize that weakening the dominant role of the text modality is important for MSA tasks. To address this problem, from the perspective of datasets, we first propose the Chinese multimodal opinion-level sentiment intensity (CMOSI) dataset. Three versions of the dataset were constructed: one with manually proofread subtitles, one with subtitles generated by machine speech transcription, and one with subtitles generated by human cross-language translation. The latter two versions radically weaken the dominant role of the text modality. We randomly collected 144 real videos from the Bilibili video site and manually edited 2557 emotion-bearing clips from them. From the perspective of network modeling, we propose a multimodal semantic enhancement network (MSEN) based on a multiheaded attention mechanism that takes advantage of the multiple versions of the CMOSI dataset. Experiments on CMOSI show that the network performs best on the text-unweakened version of the dataset, and that the loss of performance is minimal on both text-weakened versions, indicating that our network can fully exploit the latent semantics in nontext modalities. In addition, we conducted model generalization experiments with MSEN on the MOSI, MOSEI, and CH-SIMS datasets, and the results show that our approach is also very competitive and has good cross-language robustness.
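The abstract does not give implementation details of MSEN, so the following is only a minimal sketch, assuming PyTorch: a multi-head cross-modal attention block in which audio and visual sequences attend to the text sequence before fusion, which is one plausible way a network could exploit nontext semantics when the text channel is weakened. The class name, dimensions, and fusion strategy are hypothetical, not the authors' code.

```python
import torch
import torch.nn as nn

class CrossModalAttentionFusion(nn.Module):
    """Hypothetical fusion block: audio/visual features attend to text via multi-head attention."""
    def __init__(self, dim=128, num_heads=4):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.fuse = nn.Linear(3 * dim, dim)
        self.regressor = nn.Linear(dim, 1)  # sentiment intensity score

    def forward(self, text, audio, visual):
        # text, audio, visual: (batch, seq_len, dim) outputs of per-modality encoders
        audio_enh, _ = self.cross_attn(audio, text, text)    # audio queries the text sequence
        visual_enh, _ = self.cross_attn(visual, text, text)  # visual queries the text sequence
        pooled = torch.cat([text.mean(1), audio_enh.mean(1), visual_enh.mean(1)], dim=-1)
        return self.regressor(torch.tanh(self.fuse(pooled))) # (batch, 1) intensity

# Example with random 20-step, 128-dim features for two clips.
model = CrossModalAttentionFusion()
score = model(torch.randn(2, 20, 128), torch.randn(2, 20, 128), torch.randn(2, 20, 128))
```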

2.
Sensors (Basel); 23(4), 2023 Feb 17.
Article in English | MEDLINE | ID: mdl-36850883

ABSTRACT

Movie scene event extraction is a practical task in media analysis that aims to extract structured events from unstructured movie scripts. However, while there have been many studies on open-domain event extraction, only a few have focused on movie scene event extraction. Motivated by instances where different argument roles share the same characteristics in a movie scene, we propose utilizing the correlation between different argument roles, which benefits both movie scene trigger extraction (trigger identification and classification) and movie scene argument extraction (argument identification and classification). To model the correlation between different argument roles, we propose the superior role concept (SRC), a high-level role concept built upon the ordinary argument roles. In this paper, we introduce a new movie scene event extraction model with two main features: (1) an attentive high-level argument role module to capture SRC information and (2) an SRC-based graph attention network (GAT) to fuse argument role correlation information into the semantic embeddings. To evaluate the performance of our model, we constructed a movie scene event extraction dataset named MovieSceneEvent and also conducted experiments on a widely used dataset to compare the results with other models. The experimental results show that our model outperforms competitive models and that the correlation information of argument roles helps to improve the performance of movie scene event extraction.
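The abstract names an SRC-based graph attention network but not its exact formulation. The sketch below, assuming PyTorch, shows a single generic graph attention layer over candidate-argument nodes whose edges connect arguments sharing a superior role concept, with the attended neighbour context added back into the semantic embeddings. The node features, adjacency construction, and residual fusion are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RoleGraphAttention(nn.Module):
    """Hypothetical single-head GAT layer over argument nodes linked by a shared SRC."""
    def __init__(self, dim=256):
        super().__init__()
        self.proj = nn.Linear(dim, dim, bias=False)
        self.score = nn.Linear(2 * dim, 1, bias=False)

    def forward(self, node_feats, adj):
        # node_feats: (num_nodes, dim) semantic embeddings of candidate arguments
        # adj: (num_nodes, num_nodes) 0/1 matrix of SRC-based edges; assumed to include
        #      self-loops so every row has a neighbour (otherwise the softmax yields NaNs)
        h = self.proj(node_feats)
        n = h.size(0)
        pairs = torch.cat([h.unsqueeze(1).expand(n, n, -1),
                           h.unsqueeze(0).expand(n, n, -1)], dim=-1)
        scores = F.leaky_relu(self.score(pairs)).squeeze(-1)  # (n, n) attention logits
        alpha = torch.softmax(scores.masked_fill(adj == 0, float('-inf')), dim=-1)
        return node_feats + alpha @ h                         # fuse neighbour context back in
```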

3.
Comput Intell Neurosci; 2021: 6110885, 2021.
Article in English | MEDLINE | ID: mdl-34759966

ABSTRACT

Distant supervision is an effective method for automatically collecting large-scale datasets for relation extraction (RE). However, automatically constructed datasets usually contain two types of noise: intrasentence noise and wrongly labeled sentences. To address the issues caused by these two types of noise and improve distantly supervised relation extraction, this paper proposes a novel distantly supervised relation extraction model consisting of an entity-based gated convolution sentence encoder and a multilevel sentence selective attention (Matt) module. Specifically, we first apply an entity-based gated convolution operation to force the sentence encoder to extract entity-pair-related features and filter out useless intrasentence noise. Furthermore, the multilevel attention scheme fuses bag information to obtain a fine-grained bag-specific query vector, which can better identify valid sentences and reduce the influence of wrongly labeled sentences. Experimental results on a large-scale benchmark dataset show that our model effectively reduces the influence of both types of noise and achieves state-of-the-art performance in relation extraction.
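The abstract does not describe the encoder in detail. As a hedged illustration, assuming PyTorch, the sketch below shows one way an entity-based gated convolution could work: a 1-D convolution whose sigmoid gate is conditioned on the entity-pair embedding, so positions unrelated to the entity pair are suppressed before max-pooling into a sentence vector. The class name, dimensions, and gating formula are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class EntityGatedConvEncoder(nn.Module):
    """Hypothetical entity-conditioned gated convolution sentence encoder."""
    def __init__(self, dim=128, kernel_size=3):
        super().__init__()
        pad = kernel_size // 2
        self.conv = nn.Conv1d(dim, dim, kernel_size, padding=pad)
        self.gate_conv = nn.Conv1d(dim, dim, kernel_size, padding=pad)
        self.entity_proj = nn.Linear(2 * dim, dim)

    def forward(self, tokens, head_ent, tail_ent):
        # tokens: (batch, seq_len, dim) token embeddings; head_ent/tail_ent: (batch, dim)
        x = tokens.transpose(1, 2)                                   # (batch, dim, seq_len)
        ent = self.entity_proj(torch.cat([head_ent, tail_ent], -1))  # (batch, dim)
        gate = torch.sigmoid(self.gate_conv(x) + ent.unsqueeze(-1))  # entity-conditioned gate
        feats = torch.tanh(self.conv(x)) * gate                      # suppress unrelated positions
        return feats.max(dim=-1).values                              # (batch, dim) sentence vector
```

A bag-level selective attention module would then weight these sentence vectors within each entity-pair bag before relation classification, which is where the Matt module described above would operate.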

