A Generative Approach for Face Mask Removal Using Audio and Appearance
34th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI)
pp. 239-246, 2021.
Article in English | Web of Science | ID: covidwho-1691667
ABSTRACT
Since the COVID-19 pandemic began, wearing face masks in public spaces and at gatherings has become common. Journalists, reporters, and interviewees therefore frequently wear masks, following the public health measures adopted to contain the pandemic. However, a mask worn while speaking or giving a presentation can be uncomfortable for viewers. A mask also prevents lip reading, which can impair speech comprehension for people with hearing impairment. This work therefore aims to artificially remove masks in videos while recovering the lip movements from the audio and the uncovered face features. We use the audio to infer lip movement that matches the uttered phrase: from the audio, we estimate landmarks representing the mouth structure. Finally, the landmarks (i.e., both uncovered and estimated) are the input to a generative adversarial network (GAN) that reconstructs the full face image with the mouth in the correct shape. We present quantitative results in the form of evaluation metrics and qualitative results in the form of visual examples.
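The data flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The 68-point landmark layout with the mouth at indices 48 to 67, the function names, and the idea that the generator takes only the merged landmarks are all assumptions made for illustration; the three callables are hypothetical stand-ins for a landmark detector, an audio-to-landmark model, and a trained GAN generator.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]

# Assumption: a 68-point face layout whose indices 48..67 cover the mouth.
NUM_LANDMARKS = 68
MOUTH = range(48, 68)


def merge_landmarks(visible: Sequence[Point], mouth: Sequence[Point]) -> List[Point]:
    """Combine landmarks from the uncovered face with mouth landmarks estimated from audio."""
    if len(visible) != NUM_LANDMARKS:
        raise ValueError("expected one entry per face landmark")
    if len(mouth) != len(MOUTH):
        raise ValueError("expected one estimated point per mouth landmark")
    merged = list(visible)
    for index, point in zip(MOUTH, mouth):
        merged[index] = point  # overwrite the region hidden by the mask
    return merged


def unmask_frame(
    frame: object,
    audio_window: object,
    detect: Callable[[object], Sequence[Point]],
    audio_to_mouth: Callable[[object], Sequence[Point]],
    generator: Callable[[List[Point]], object],
) -> object:
    """Run one frame through the audio -> landmarks -> generator flow."""
    visible = detect(frame)               # hypothetical landmark detector
    mouth = audio_to_mouth(audio_window)  # hypothetical audio-to-landmark model
    return generator(merge_landmarks(visible, mouth))  # hypothetical GAN generator
```

Passing the models in as callables keeps the sketch independent of any particular detector or network, so only the merging step is fixed here.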
Full text: Available
Collection: Databases of international organizations
Database: Web of Science
Language: English
Journal: 34th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI)
Year: 2021
Document Type: Article