Search | BVS Regional Portal (test)

Exploring Misclassification Information for Fine-Grained Image Classification.

Wang, Da-Han; Zhou, Wei; Li, Jianmin; Wu, Yun; Zhu, Shunzhi.

Sensors (Basel) ; 21(12), 2021 Jun 18.

Article in English | MEDLINE | ID: mdl-34206995

ABSTRACT

Fine-grained image classification has been widely studied in recent years. Many fine-grained image classification methods ignore misclassification information, which can help improve classification accuracy. To make use of this information, we propose a novel fine-grained image classification method that explores the misclassification information (FGMI) of prelearned models. For each class, we harvest confusion information from several prelearned fine-grained image classification models and select a number of classes that are likely to be confused with it. The images of the selected classes are then used to train classifiers, which reduces the influence of irrelevant images to some extent. We use the misclassification information for all classes by training a number of confusion classifiers. The outputs of these trained classifiers are combined to represent images and produce classifications. To evaluate the effectiveness of the proposed FGMI method, we conduct fine-grained classification experiments on several public image datasets. Experimental results demonstrate the usefulness of the proposed method.
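The class-selection step described in this abstract can be illustrated with a small sketch. The abstract does not say how confusion information from the prelearned models is aggregated, so the symmetric sum used here, the function name `top_confusable_classes`, and the toy confusion matrices are all assumptions made for illustration, not the authors' procedure.

```python
def top_confusable_classes(confusions, target, k):
    """Return the k classes most often confused with `target`.

    `confusions` is a list of square confusion matrices, one per
    prelearned model, where confusions[m][i][j] counts images of true
    class i that model m predicted as class j.
    """
    n_classes = len(confusions[0])
    scores = []
    for other in range(n_classes):
        if other == target:
            continue
        # Assumed aggregation: target mistaken for `other` plus the
        # reverse error, summed over all models.
        total = sum(c[target][other] + c[other][target] for c in confusions)
        scores.append((total, other))
    # Highest confusion first; ties broken by class index for determinism.
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scores[:k]]


# Hypothetical confusion matrices from two prelearned models (3 classes).
model_a = [[8, 2, 0], [1, 9, 0], [0, 1, 9]]
model_b = [[7, 2, 1], [3, 7, 0], [0, 0, 10]]

selected = top_confusable_classes([model_a, model_b], target=0, k=1)
```

In this toy setting, the images of class 0 and of the classes in `selected` would form the training set for the confusion classifier associated with class 0.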

Generative adversarial networks with decoder-encoder output noises.

Zhong, Guoqiang; Gao, Wei; Liu, Yongbin; Yang, Youzhao; Wang, Da-Han; Huang, Kaizhu.

Neural Netw ; 127: 19-28, 2020 Jul.

Article in English | MEDLINE | ID: mdl-32315932

ABSTRACT

In recent years, research on image generation has developed rapidly. The generative adversarial network (GAN) has emerged as a promising framework, which uses adversarial training to improve the generative ability of its generator. However, since GAN and most of its variants use randomly sampled noise as the input of their generators, they have to learn a mapping function from a whole random distribution to the image manifold. As the structures of the random distribution and the image manifold are generally different, GAN and its variants are difficult to train and to converge. In this paper, we propose a novel deep model called generative adversarial networks with decoder-encoder output noises (DE-GANs), which takes advantage of both adversarial training and variational Bayesian inference to improve the image generation performance of GAN and its variants. DE-GANs use a pre-trained decoder-encoder architecture to map the random noise vectors to informative ones and feed them to the generator of the adversarial networks. Since the decoder-encoder architecture is trained on the same data set as the generator, its output vectors, as the inputs of the generator, can carry the intrinsic distribution information of the training images, which greatly improves the learnability of the generator and the quality of the generated images. Extensive experiments demonstrate the effectiveness of the proposed model, DE-GANs.
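The data flow described in this abstract (random noise, then decoder, then encoder, then generator) can be sketched as follows. This is only an illustration of the wiring: the random linear maps stand in for trained deep networks, and the names `informative_noise` and `generate`, the dimensions, and the choice to make the encoder output the same size as the noise vector are all assumptions not taken from the paper.

```python
import random


def linear(weights, vec):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def random_matrix(rows, cols, rng):
    """Build a rows x cols matrix of Gaussian entries (untrained stand-in)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
NOISE_DIM, IMAGE_DIM = 4, 16  # illustrative sizes, not from the paper

# Stand-ins for the pre-trained decoder-encoder and the generator.
decoder_w = random_matrix(IMAGE_DIM, NOISE_DIM, rng)    # noise -> image space
encoder_w = random_matrix(NOISE_DIM, IMAGE_DIM, rng)    # image space -> code
generator_w = random_matrix(IMAGE_DIM, NOISE_DIM, rng)  # code -> image


def informative_noise(z):
    """Pass raw noise through the decoder, then the encoder."""
    return linear(encoder_w, linear(decoder_w, z))


def generate(z):
    """Feed the decoder-encoder output, not the raw noise, to the generator."""
    return linear(generator_w, informative_noise(z))


z = [rng.gauss(0.0, 1.0) for _ in range(NOISE_DIM)]
image = generate(z)
```

The point of the sketch is the ordering in `generate`: the generator never sees `z` directly, only the vector produced by the decoder-encoder pair.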

Subjects

Neural Networks, Computer; Pattern Recognition, Automated/methods; Bayes Theorem; Humans; Image Processing, Computer-Assisted/methods; Image Processing, Computer-Assisted/trends; Pattern Recognition, Automated/trends; Random Allocation

Handwritten Chinese/Japanese text recognition using semi-Markov conditional random fields.

Zhou, Xiang-Dong; Wang, Da-Han; Tian, Feng; Liu, Cheng-Lin; Nakagawa, Masaki.

IEEE Trans Pattern Anal Mach Intell ; 35(10): 2413-26, 2013 Oct.

Article in English | MEDLINE | ID: mdl-23969386

ABSTRACT

This paper proposes a method for handwritten Chinese/Japanese text (character string) recognition based on semi-Markov conditional random fields (semi-CRFs). The high-order semi-CRF model is defined on a lattice containing all possible segmentation-recognition hypotheses of a string to elegantly fuse the scores of candidate character recognition and the compatibilities of geometric and linguistic contexts by representing them in the feature functions. Based on given models of character recognition and compatibilities, the fusion parameters are optimized by minimizing the negative log-likelihood loss with a margin term on a training string sample set. A forward-backward lattice pruning algorithm is proposed to reduce the computation in training when trigram language models are used, and beam search techniques are investigated to accelerate decoding. We evaluate the performance of the proposed method on unconstrained online handwritten text lines from three databases. On the test sets of the databases CASIA-OLHWDB (Chinese) and TUAT Kondate (Japanese), the character-level correct rates are 95.20 and 95.44 percent, and the accurate rates are 94.54 and 94.55 percent, respectively. On the test set (online handwritten texts) of the ICDAR 2011 Chinese handwriting recognition competition, the proposed method outperforms the best system in the competition.
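Decoding over a segmentation-recognition lattice of the kind described in this abstract can be sketched with a simple dynamic program. This is a deliberately reduced illustration: it uses exact Viterbi-style search rather than the beam search mentioned in the abstract, it has no language-model or geometric context terms, and it assumes each candidate span already carries a single fused score. The function name `best_segmentation` and the toy lattice are hypothetical.

```python
def best_segmentation(n_segments, candidates):
    """Find the highest-scoring path through a segmentation lattice.

    `candidates` maps a (start, end) span over primitive segments to a
    (label, score) pair: the best character hypothesis for that span and
    its (assumed already fused) score. Returns (labels, total_score).
    """
    neg_inf = float("-inf")
    best = [neg_inf] * (n_segments + 1)  # best[i]: best score ending at i
    back = [None] * (n_segments + 1)     # back[i]: (previous node, label)
    best[0] = 0.0
    for end in range(1, n_segments + 1):
        for (start, stop), (label, score) in candidates.items():
            if stop != end or best[start] == neg_inf:
                continue
            total = best[start] + score
            if total > best[end]:
                best[end] = total
                back[end] = (start, label)
    if best[n_segments] == neg_inf:
        return None, neg_inf  # no complete path covers the string
    # Follow back-pointers from the last node to recover the labels.
    labels, pos = [], n_segments
    while pos > 0:
        start, label = back[pos]
        labels.append(label)
        pos = start
    labels.reverse()
    return labels, best[n_segments]


# Toy lattice over 3 primitive segments: segments 0-1 can be read as two
# characters ("A", "B") or merged into one ("C").
toy_candidates = {
    (0, 1): ("A", -1.0),
    (1, 2): ("B", -1.5),
    (0, 2): ("C", -0.5),
    (2, 3): ("D", -0.25),
}
labels, score = best_segmentation(3, toy_candidates)
```

Because each lattice edge spans a variable number of primitive segments, the search chooses the segmentation and the character labels jointly, which is the semi-Markov aspect of the model.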

Subjects

Algorithms; Artificial Intelligence; Handwriting; Image Interpretation, Computer-Assisted/methods; Language; Pattern Recognition, Automated/methods; Subtraction Technique; Asian People; Computer Simulation; Humans; Image Enhancement/methods; Markov Chains; Models, Statistical; Reproducibility of Results; Sensitivity and Specificity
