Search | VHL Regional Portal

A syntactic evidence network model for fact verification.

Chen, Zhendong; Hui, Siu Cheung; Zhuang, Fuzhen; Liao, Lejian; Jia, Meihuizi; Li, Jiaqi; Huang, Heyan.

Neural Netw ; 178: 106424, 2024 Jun 01.

Article in English | MEDLINE | ID: mdl-38875934

ABSTRACT

In natural language processing, fact verification is a very challenging task, which requires retrieving multiple evidence sentences from a reliable corpus to verify the authenticity of a claim. Although most of the current deep learning methods use the attention mechanism for fact verification, they have not considered imposing attentional constraints on important related words in the claim and evidence sentences, resulting in inaccurate attention for some irrelevant words. In this paper, we propose a syntactic evidence network (SENet) model which incorporates entity keywords, syntactic information and sentence attention for fact verification. The SENet model extracts entity keywords from claim and evidence sentences, and uses a pre-trained syntactic dependency parser to extract the corresponding syntactic sentence structures and incorporates the extracted syntactic information into the attention mechanism for language-driven word representation. In addition, the sentence attention mechanism is applied to obtain a richer semantic representation. We have conducted experiments on the FEVER and UKP Snopes datasets for performance evaluation. Our SENet model has achieved 78.69% in Label Accuracy and 75.63% in FEVER Score on the FEVER dataset. In addition, our SENet model also has achieved 65.0% in precision and 61.2% in macro F1 on the UKP Snopes dataset. The experimental results have shown that our proposed SENet model has outperformed the baseline models and achieved the state-of-the-art performance for fact verification.
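The idea of constraining attention to important related words can be illustrated with a masked softmax. This is only a minimal sketch: the abstract does not specify how SENet injects syntactic information into attention, so the function name `masked_attention_weights` and the boolean `allowed` mask (standing in for "word pairs linked by the dependency parse or by entity keywords") are assumptions made for illustration.

```python
import math


def masked_attention_weights(scores, allowed):
    """Row-wise softmax over raw attention scores, restricted to allowed pairs.

    scores[i][j]  -- raw attention score from word i to word j
    allowed[i][j] -- True if word j is considered related to word i
                     (hypothetical stand-in for a syntactic constraint)
    Disallowed positions receive weight 0.0.
    """
    weights = []
    for row_scores, row_allowed in zip(scores, allowed):
        kept = [s for s, ok in zip(row_scores, row_allowed) if ok]
        if not kept:
            # No related word for this row: return all-zero weights.
            weights.append([0.0] * len(row_scores))
            continue
        peak = max(kept)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) if ok else 0.0
                for s, ok in zip(row_scores, row_allowed)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights
```

In this sketch a high-scoring but unrelated word contributes nothing to the weighted sum, which is the kind of "inaccurate attention for some irrelevant words" the abstract aims to avoid.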

Coarse-to-Fine Contrastive Learning on Graphs.

Zhao, Peiyao; Pan, Yuangang; Li, Xin; Chen, Xu; Tsang, Ivor W; Liao, Lejian.

IEEE Trans Neural Netw Learn Syst ; 35(4): 4622-4634, 2024 Apr.

Article in English | MEDLINE | ID: mdl-37018665

ABSTRACT

Inspired by the impressive success of contrastive learning (CL), a variety of graph augmentation strategies have been employed to learn node representations in a self-supervised manner. Existing methods construct the contrastive samples by adding perturbations to the graph structure or node attributes. Although impressive results are achieved, these methods are rather blind to the wealth of prior information available: as the degree of perturbation applied to the original graph increases, 1) the similarity between the original graph and the generated augmented graph gradually decreases and 2) the discrimination between all nodes within each augmented view gradually increases. In this article, we argue that both kinds of prior information can be incorporated (differently) into the CL paradigm following our general ranking framework. In particular, we first interpret CL as a special case of learning to rank (L2R), which inspires us to leverage the ranking order among positive augmented views. Meanwhile, we introduce a self-ranking paradigm to ensure that the discriminative information among different nodes is maintained and is less affected by perturbations of different degrees. Experimental results on various benchmark datasets verify the effectiveness of our algorithm compared with supervised and unsupervised models.
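The first prior (similarity to the original graph should fall as perturbation grows) can be written as a ranking constraint over augmented views. The sketch below uses a generic pairwise hinge loss, which is an assumption chosen for illustration and not the loss defined in the article; `view_ranking_loss` and `margin` are hypothetical names.

```python
def view_ranking_loss(similarities, margin=0.1):
    """Pairwise hinge ranking loss over augmented views.

    similarities -- similarity of each augmented view to the original graph,
                    ordered by increasing perturbation degree
    margin       -- required gap between consecutive views (assumed value)

    The loss is zero when each less-perturbed view is at least `margin`
    more similar to the original than the next, more-perturbed view.
    """
    loss = 0.0
    for less_perturbed, more_perturbed in zip(similarities, similarities[1:]):
        loss += max(0.0, margin - (less_perturbed - more_perturbed))
    return loss
```

For example, similarities of 0.9, 0.7 and 0.5 respect the expected order and incur no loss, while a view that becomes more similar after stronger perturbation is penalized.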

TopicBERT: A Topic-Enhanced Neural Language Model Fine-Tuned for Sentiment Classification.

Zhou, Yuxiang; Liao, Lejian; Gao, Yang; Wang, Rui; Huang, Heyan.

IEEE Trans Neural Netw Learn Syst ; 34(1): 380-393, 2023 01.

Article in English | MEDLINE | ID: mdl-34357867

ABSTRACT

Sentiment classification is a form of data analytics where people's feelings and attitudes toward a topic are mined from data. This tantalizing power to "predict the zeitgeist" means that sentiment classification has long attracted interest, but with mixed results. However, the recent development of the BERT framework and its pretrained neural language models is seeing new-found success for sentiment classification. BERT models are trained to capture word-level information via masked language modeling and sentence-level contexts via next sentence prediction tasks. Out of the box, they are adequate models for some natural language processing tasks. However, most models are further fine-tuned with domain-specific information to increase accuracy and usefulness. Motivated by the idea that a further fine-tuning step would improve the performance for downstream sentiment classification tasks, we developed TopicBERT, a BERT model fine-tuned to recognize topics at the corpus level in addition to the word and sentence levels. TopicBERT comprises two variants: TopicBERT-ATP (aspect topic prediction), which captures topic information via an auxiliary training task, and TopicBERT-TA, where topic representation is directly injected into a topic augmentation layer for sentiment classification. With TopicBERT-ATP, the topics are predetermined by an LDA mechanism and collapsed Gibbs sampling. With TopicBERT-TA, the topics can change dynamically during training. Experimental results show that both approaches deliver state-of-the-art performance in two different domains with SemEval 2014 Task 4. However, in a test of methods, direct augmentation outperforms further training. Comprehensive analyses in the form of ablation, parameter, and complexity studies accompany the results.

Subjects

Neural Networks, Computer; Sentiment Analysis; Humans; Language; Natural Language Processing; Adenosine Triphosphate

Analysis of the nonperfused volume ratio of adenomyosis from MRI images based on fewshot learning.

Li, Jiaqi; Wang, Wei; Liao, Lejian; Liu, Xin.

Phys Med Biol ; 66(4): 045019, 2021 02 09.

Article in English | MEDLINE | ID: mdl-33361557

ABSTRACT

The nonperfused volume (NPV) ratio is the key to the success of high intensity focused ultrasound (HIFU) ablation treatment of adenomyosis. However, there are no qualitative interpretation standards for predicting the NPV ratio of adenomyosis using magnetic resonance imaging (MRI) before HIFU ablation treatment, which leads to inter-reader variability. Convolutional neural networks (CNNs) have achieved state-of-the-art performance in the automatic disease diagnosis of MRI. Since the use of HIFU to treat adenomyosis is a novel treatment, there is not enough MRI data to support CNNs. We proposed a novel few-shot learning framework that extends CNNs to predict the NPV ratio of HIFU ablation treatment for adenomyosis. We collected a dataset from 208 patients with adenomyosis who underwent MRI examination before and after HIFU treatment. Our proposed method was trained and evaluated by fourfold cross validation. This framework obtained sensitivities of 85.6%, 89.6% and 92.8% at 0.799, 0.980 and 1.180 FPs per patient. In the receiver operating characteristics analysis for the NPV ratio of adenomyosis, our proposed method achieved areas under the curve of 0.8233, 0.8289, 0.8412, 0.8319, 0.7010, 0.7637, 0.8375, 0.8219, 0.8207, 0.9812 for the classifications of NPV ratio intervals [0%-10%), [10%-20%), ..., [90%-100%], respectively. The present study demonstrated that few-shot learning on NPV ratio prediction of HIFU ablation treatment for adenomyosis may contribute to the selection of eligible patients and the pre-judgment of clinical efficacy.
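The ten AUC values correspond to ten NPV ratio classes, each covering a 10-point interval with only the last one closed at 100%. A small sketch of that binning follows; the function names are hypothetical and the code only illustrates the interval convention stated in the abstract, not the authors' preprocessing.

```python
def npv_interval_index(npv_ratio_percent):
    """Map an NPV ratio in percent to its class index 0..9.

    Intervals are [0, 10), [10, 20), ..., [80, 90), [90, 100];
    the final interval includes 100.
    """
    if not 0.0 <= npv_ratio_percent <= 100.0:
        raise ValueError("NPV ratio must lie between 0 and 100 percent")
    # Floor division gives 10 for exactly 100, so clamp to the last class.
    return min(int(npv_ratio_percent // 10), 9)


def npv_interval_label(index):
    """Human-readable label for a class index, in the abstract's notation."""
    if not 0 <= index <= 9:
        raise ValueError("class index must lie between 0 and 9")
    low, high = index * 10, (index + 1) * 10
    closing = "]" if index == 9 else ")"
    return f"[{low}%-{high}%{closing}"
```

Under this convention a ratio of exactly 10% belongs to the second class, and 100% belongs to the tenth.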

Subjects

Adenomyosis/diagnostic imaging; Image Processing, Computer-Assisted/methods; Machine Learning; Magnetic Resonance Imaging; Adenomyosis/surgery; Adult; Female; High-Intensity Focused Ultrasound Ablation; Humans; Middle Aged

Efficient State Management for Scaling Out Stateful Operators in Stream Processing Systems.

Mudassar, Muhammad; Zhai, Yanlong; Liao, Lejian.

Big Data ; 7(3): 192-206, 2019 09.

Article in English | MEDLINE | ID: mdl-30994383

ABSTRACT

Many big data applications require real-time analysis of continuous data streams. Stream Processing Systems (SPSs) are designed to act on real-time streaming data using continuous queries consisting of interconnected operators. The dynamic nature of data streams, for example, fluctuation in data arrival rates and uneven data distribution, can cause an operator to become a bottleneck. Scalability is an important factor in SPSs, but correctly detecting a bottleneck operator and scaling it without affecting application execution are challenging. A stateful operator such as aggregation or join makes the scaling operation more difficult because it involves state management. Current research does not address the issue of scaling stateful operators efficiently, as most approaches stop the application to handle state, which results in significant performance overhead. In this article, the key idea is to detect the bottleneck operator correctly using a runtime bottleneck detection approach and then scale out this operator and manage its internal state in a way that achieves almost zero latency. For the bottleneck detection process, we define two parameters: alarming_threshold, which marks operators that may become bottlenecks in the future, and scale_out_threshold, which marks the point at which the operator is a bottleneck. To scale out, we present two techniques, active backup and checkpointing. The former starts a Secondary Execution (SE) in the back end by partitioning state and input streams across multiple nodes at alarming_threshold; this SE replaces the primary node at scale_out_threshold. In the latter technique, a State Manager (SM) module starts checkpointing state to an external store at alarming_threshold and performs the scale-out by managing state and input streams at scale_out_threshold. The first approach helps achieve the almost-zero-latency goal, while the latter is a resource-efficient technique. Our results show that both techniques work and meet the desired goals of reducing overall latency during scale-out and improving resource utilization.
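The two-threshold scheme can be sketched as a small state classification. The abstract names alarming_threshold and scale_out_threshold but does not say which runtime metric they are compared against, so the `load` argument, the `OperatorState` enum and the `classify_operator` function are assumptions made for illustration.

```python
from enum import Enum


class OperatorState(Enum):
    NORMAL = "normal"          # no action needed
    ALARMING = "alarming"      # prepare: start secondary execution or checkpointing
    BOTTLENECK = "bottleneck"  # act: switch to the prepared scale-out


def classify_operator(load, alarming_threshold, scale_out_threshold):
    """Classify an operator by comparing a load metric with the two thresholds.

    `load` is a hypothetical runtime metric (for example, input queue
    occupancy); the source does not specify the metric used.
    """
    if not 0 <= alarming_threshold < scale_out_threshold:
        raise ValueError("expected 0 <= alarming_threshold < scale_out_threshold")
    if load >= scale_out_threshold:
        return OperatorState.BOTTLENECK
    if load >= alarming_threshold:
        return OperatorState.ALARMING
    return OperatorState.NORMAL
```

Separating the alarming state from the bottleneck state is what lets the preparation work (state partitioning or checkpointing) happen before the scale-out is actually triggered.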

Subjects

Electronic Data Processing; Algorithms; Big Data; Computer Simulation; Computing Methodologies

An automatic and efficient pipeline for disease gene identification through utilizing family-based sequencing data.

Song, Dandan; Li, Ning; Liao, Lejian.

Biomed Mater Eng ; 26 Suppl 1: S1797-803, 2015.

Article in English | MEDLINE | ID: mdl-26405949

ABSTRACT

Because they generate enormous amounts of data at lower cost and in less time, whole-exome sequencing technologies provide dramatic opportunities for identifying disease genes implicated in Mendelian disorders. Since upwards of thousands of genomic variants can be sequenced in each exome, it is challenging to filter pathogenic variants in protein coding regions and reduce the number of missing true variants. Therefore, an automatic and efficient pipeline for finding disease variants in Mendelian disorders is designed by exploiting a combination of variant filtering steps to analyze family-based exome sequencing data. Recent studies on the Freeman-Sheldon disease are revisited and show that the proposed method outperforms other existing candidate gene identification methods.

Subjects

Chromosome Mapping/methods; Exome/genetics; Genetic Diseases, Inborn/diagnosis; Genetic Diseases, Inborn/genetics; Genetic Predisposition to Disease/genetics; High-Throughput Nucleotide Sequencing/methods; Algorithms; Base Sequence; Humans; Molecular Sequence Data; Pattern Recognition, Automated/methods; Sequence Analysis, DNA/methods
