
SLOGAN: Handwriting Style Synthesis for Arbitrary-Length and Out-of-Vocabulary Text.

Luo, Canjie; Zhu, Yuanzhi; Jin, Lianwen; Li, Zhe; Peng, Dezhi.

IEEE Trans Neural Netw Learn Syst ; 34(11): 8503-8515, 2023 Nov.

Article in English | MEDLINE | ID: mdl-35226609

ABSTRACT

Large amounts of labeled data are urgently required for the training of robust text recognizers. However, collecting handwriting data of diverse styles, along with an immense lexicon, is considerably expensive. Although data synthesis is a promising way to relieve data hunger, two key issues of handwriting synthesis, namely, style representation and content embedding, remain unsolved. To this end, we propose a novel method that can synthesize parameterized and controllable handwriting Styles for arbitrary-Length and Out-of-vocabulary text based on a Generative Adversarial Network (GAN), termed SLOGAN. Specifically, we propose a style bank to parameterize specific handwriting styles as latent vectors, which are input to a generator as style priors to achieve the corresponding handwritten styles. The training of the style bank requires only writer identification of the source images, rather than attribute annotations. Moreover, we embed the text content by providing an easily obtainable printed style image, so that the diversity of the content can be flexibly achieved by changing the input printed image. Finally, the generator is guided by dual discriminators to handle both the handwriting characteristics that appear as separated characters and in a series of cursive joins. Our method can synthesize words that are not included in the training vocabulary and with various new styles. Extensive experiments have shown that high-quality text images with great style diversity and rich vocabulary can be synthesized using our method, thereby enhancing the robustness of the recognizer.
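As a rough illustration of the style bank idea described in the abstract, the sketch below keeps one latent vector per writer ID and hands it to a generator together with content features. It is not the authors' implementation: the class `StyleBank`, the function `generator_input`, the vector dimension, the random initialization, and the use of concatenation for conditioning are all assumptions made for this example. In a real model the vectors would be learned jointly with the generator.

```python
import random


class StyleBank:
    """Hypothetical lookup table: one latent style vector per writer ID."""

    def __init__(self, writer_ids, dim=8, seed=0):
        rng = random.Random(seed)
        # Assumption: vectors start as Gaussian noise; training would update them.
        self.vectors = {
            writer_id: [rng.gauss(0.0, 1.0) for _ in range(dim)]
            for writer_id in writer_ids
        }

    def lookup(self, writer_id):
        """Return the latent style vector stored for the given writer."""
        return self.vectors[writer_id]


def generator_input(style_vector, content_features):
    """Assumed conditioning scheme: concatenate style prior and content features."""
    return list(style_vector) + list(content_features)


bank = StyleBank(["writer_a", "writer_b"], dim=4)
# Placeholder for features that would be extracted from a printed-style image.
content = [0.0, 0.0, 0.0]
conditioned = generator_input(bank.lookup("writer_a"), content)
```

Only the writer ID is needed to select a style here, which mirrors the abstract's point that no attribute annotations are required.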

EraseNet: End-to-End Text Removal in the Wild.

Liu, Chongyu; Liu, Yuliang; Jin, Lianwen; Zhang, Shuaitao; Luo, Canjie; Wang, Yongpan.

IEEE Trans Image Process ; PP, 2020 Aug 28.

Article in English | MEDLINE | ID: mdl-32857697

ABSTRACT

Scene text removal has attracted increasing research interest owing to its valuable applications in privacy protection, camera-based virtual reality translation, and image editing. However, existing approaches fall short in real applications, mainly because they were evaluated on synthetic or unrepresentative datasets. To fill this gap and facilitate this research direction, this paper proposes a real-world dataset called SCUT-EnsText that consists of 3,562 diverse images selected from public scene text reading benchmarks, and each image is scrupulously annotated to provide visually plausible erasure targets. With SCUT-EnsText, we design a novel GAN-based model termed EraseNet that can automatically remove text from natural images. The model is a two-stage network that consists of a coarse-erasure sub-network and a refinement sub-network. The refinement sub-network targets improvement in the feature representation and refinement of the coarse outputs to enhance the removal performance. Additionally, EraseNet contains a segmentation head for text perception and a local-global SN-Patch-GAN with spectral normalization (SN) on both the generator and discriminator for maintaining the training stability and the congruity of the erased regions. Extensive experiments are conducted on both the previous public dataset and the brand-new SCUT-EnsText. Our EraseNet significantly outperforms the existing state-of-the-art methods in terms of all metrics, with remarkably higher-quality results. The dataset and code will be made available at https://github.com/HCIILAB/SCUT-EnsText.
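The abstract mentions spectral normalization (SN) without describing it. In general terms, SN rescales a weight matrix by an estimate of its largest singular value, usually obtained by power iteration. The sketch below shows that general technique in plain Python on a small matrix. It is not taken from EraseNet: the function names, the iteration count, and the fixed random seed are choices made for this example.

```python
import math
import random


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def transpose(W):
    return [list(col) for col in zip(*W)]


def l2_normalize(v, eps=1e-12):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]


def spectral_norm(W, n_iters=50, seed=0):
    """Estimate the largest singular value of W by power iteration."""
    rng = random.Random(seed)
    u = l2_normalize([rng.gauss(0.0, 1.0) for _ in W])
    Wt = transpose(W)
    for _ in range(n_iters):
        v = l2_normalize(matvec(Wt, u))  # approximate right singular vector
        u = l2_normalize(matvec(W, v))   # approximate left singular vector
    # sigma is approximately u^T W v
    return sum(a * b for a, b in zip(u, matvec(W, v)))


def spectrally_normalize(W, n_iters=50):
    """Return W divided by its estimated spectral norm."""
    sigma = spectral_norm(W, n_iters)
    return [[w / sigma for w in row] for row in W]


W = [[3.0, 0.0], [0.0, 1.0]]
W_sn = spectrally_normalize(W)
```

After normalization the largest singular value of the matrix is close to 1, which bounds how much the layer can amplify its input. Deep learning frameworks typically run a single power-iteration step per training update and carry the vector `u` over between updates.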
