A Generative Approach for Face Mask Removal Using Audio and Appearance

Coelho, L. E. L.; Prates, R.; Schwartz, W. R.; IEEE Computer Society

34th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI), pp. 239-246, 2021.

Article in English | Web of Science | ID: covidwho-1691667

ABSTRACT

Since the onset of the COVID-19 pandemic, wearing face masks in public spaces and at gatherings has become common. As a result, journalists, reporters, and interviewees frequently wear masks, in line with public health measures to contain the pandemic. However, watching someone speak or give a presentation while masked can be uncomfortable for viewers. Furthermore, a mask prevents lip reading, which can impair speech comprehension for people with hearing impairment. This work therefore aims to artificially remove masks in videos while recovering the lip movements from the audio and the uncovered facial features. We use the audio to infer lip movement that matches the uttered phrase: from the audio, we estimate landmarks representing the mouth structure. Finally, the landmarks (both those from the uncovered face region and those estimated from the audio) are fed into a generative adversarial network (GAN) that reconstructs the full face image with the mouth in the correct shape. We present quantitative results in the form of evaluation metrics and qualitative results in the form of visual examples.

Full text: Available | Collection: Databases of international organizations | Database: Web of Science | Language: English | Journal: 34th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI) | Year: 2021 | Document Type: Article
