ABSTRACT
Adversarial training (AT) is considered one of the most reliable defenses against adversarial attacks. However, models trained with AT sacrifice standard accuracy and do not generalize well to unseen attacks. Recent works show that generalization improves with adversarial samples generated under unseen threat models, such as the on-manifold threat model or the neural perceptual threat model. However, the former requires exact manifold information, while the latter requires algorithm relaxation. Motivated by these considerations, we propose a novel threat model called the Joint Space Threat Model (JSTM), which exploits the underlying manifold information with Normalizing Flow, ensuring that the exact manifold assumption holds. Under JSTM, we develop novel adversarial attacks and defenses. Specifically, we propose the Robust Mixup strategy, in which we maximize the adversity of the interpolated images to gain robustness and prevent overfitting. Our experiments show that Interpolated Joint Space Adversarial Training (IJSAT) achieves good performance in standard accuracy, robustness, and generalization. IJSAT is also flexible: it can be used as a data augmentation method to improve standard accuracy, and it can be combined with many existing AT approaches to improve robustness. We demonstrate the effectiveness of our approach on three benchmark datasets: CIFAR-10/100, OM-ImageNet, and CIFAR-10-C.
ABSTRACT
With the commercialization of deep learning (DL) models, precise daily dietary records based on smartphone images have become possible. This study took advantage of DL techniques for visual recognition tasks and proposed a suite of big-data-driven DL models that regress from food images to nutrient estimates. We established and made publicly available the first food image database from the Chinese market, named ChinaMartFood-109. It contains 10,921 images with 23 nutrient contents, covering 18 main food groups. Inception V3 was optimized using other state-of-the-art deep convolutional neural networks, achieving up to 78% top-1 accuracy and 94% top-5 accuracy. In addition, this research compared three nutrient estimation algorithms; normalization + AM achieved the best regression coefficient (R²), ahead of arithmetic mean and harmonic mean, validating its applicability in practice as well as in theory. These encouraging results provide further evidence supporting artificial intelligence in the field of food analysis.
Subject(s)
Artificial Intelligence; Deep Learning; China; Neural Networks, Computer; Nutrients
ABSTRACT
Food image recognition systems facilitate dietary assessment and in turn track users' dietary behaviors. However, due to the diversity of Chinese food, quick and accurate food image recognition is a particularly challenging task. The success of deep learning in computer vision inspired us to investigate its potential for this task. To satisfy its requirement for large-scale data, we established the first open-access image database for Chinese dishes, named ChinaFood-100, with quantitative nutrient annotations. We collected 10,074 images covering 100 food categories, including staples, meat, seafood, and vegetables. Based on this dataset, we trained four state-of-the-art deep neural network architectures for image recognition and showed that Inception V3 gave the best recognition performance, with 78.26% top-1 accuracy and 96.62% top-5 accuracy. Based on this image recognition posterior, we further compared three algorithms for food nutrient estimation. The results showed that the top-5 Arithmetic Mean (AM) algorithm achieved the highest regression coefficient (R²), up to 0.73 for protein estimation, which validated its applicability in practice. In addition, we analyzed our algorithm in terms of precision-recall and Grad-CAM. The results achieved by deep learning for food nutrient estimation may encourage the application of artificial intelligence to the field of food and shed light on future improvements.
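The abstract does not spell out how the top-5 Arithmetic Mean estimate is computed. The sketch below shows one plausible reading, assuming it is the unweighted mean of the nutrient values of the five classes with the highest posterior probability. The function name, the class names, and the protein values are all hypothetical and are not taken from ChinaFood-100.

```python
def top_k_arithmetic_mean(posterior, nutrient_per_class, k=5):
    """Estimate a nutrient value from a classifier posterior.

    Assumption (not confirmed by the abstract): the estimate is the
    unweighted mean of the nutrient values of the k most probable classes.
    """
    if not posterior:
        raise ValueError("posterior must contain at least one class")
    # Rank class labels by predicted probability, highest first.
    ranked = sorted(posterior, key=posterior.get, reverse=True)[:k]
    return sum(nutrient_per_class[label] for label in ranked) / len(ranked)


# Hypothetical posterior over six classes and hypothetical protein values
# in grams per 100 g; both are illustrative only.
posterior = {"tofu": 0.50, "egg": 0.20, "rice": 0.15,
             "pork": 0.08, "fish": 0.05, "kale": 0.02}
protein_g = {"tofu": 8.0, "egg": 13.0, "rice": 3.0,
             "pork": 20.0, "fish": 18.0, "kale": 4.0}

estimate = top_k_arithmetic_mean(posterior, protein_g)
print(estimate)  # mean over tofu, egg, rice, pork, fish
```

Averaging over several candidate classes rather than trusting only the top-1 prediction is one way to benefit from the gap between top-1 and top-5 accuracy: the correct dish is far more likely to be among the five candidates than to be ranked first.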
Subject(s)
Deep Learning; Artificial Intelligence; China; Neural Networks, Computer; Nutrients
ABSTRACT
Image retargeting aims to resize an image to one with a prescribed aspect ratio. Simple scaling inevitably introduces unnatural geometric distortions in the important content of the image. In this paper, we propose a simple yet effective method to resize an image that preserves the geometry of the important content, using the Beltrami representation. Our algorithm allows users to interactively label content regions as well as line structures. Image resizing can then be achieved by warping the image with an orientation-preserving bijective warping map with controlled distortion. The warping map is represented by its Beltrami representation, which captures the local geometric distortion of the map. By carefully prescribing the values of the Beltrami representation, images of different complexity can be effectively resized. Our method does not require solving any optimization problems or tuning any parameters throughout the process. This results in a simple and efficient algorithm for the image retargeting problem. Extensive experiments have been carried out, which demonstrate the efficacy of our proposed method.
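To make concrete how a Beltrami coefficient captures local geometric distortion, the sketch below computes it for a single affine map, using the standard definition mu = f_zbar / f_z. It illustrates the quantity only and is not the paper's retargeting algorithm; the function names are hypothetical.

```python
def beltrami_coefficient(a, b, c, d):
    """Beltrami coefficient of the affine map (x, y) -> (a*x + b*y, c*x + d*y).

    Writing f = u + i*v and z = x + i*y, the Wirtinger derivatives are
    f_z = (f_x - i*f_y) / 2 and f_zbar = (f_x + i*f_y) / 2.
    """
    f_z = complex(a + d, c - b) / 2
    f_zbar = complex(a - d, c + b) / 2
    if f_z == 0:
        raise ValueError("f_z vanishes; the Beltrami coefficient is undefined")
    return f_zbar / f_z


def dilatation(mu):
    """Ratio of major to minor axis of the ellipse an infinitesimal circle maps to."""
    return (1 + abs(mu)) / (1 - abs(mu))


# Simple horizontal scaling to half width: the naive way to change aspect ratio.
mu_scale = beltrami_coefficient(0.5, 0.0, 0.0, 1.0)
# A 90-degree rotation: shapes are preserved, so there is no distortion.
mu_rot = beltrami_coefficient(0.0, -1.0, 1.0, 0.0)

print(mu_scale, dilatation(mu_scale))
print(abs(mu_rot))
```

An orientation-preserving map has |mu| < 1, and mu = 0 means the map is locally shape-preserving. Halving the width gives |mu| = 1/3 and a dilatation of 2, so every small circle becomes an ellipse with a 2:1 axis ratio, which is the uniform distortion that simple scaling imposes on important content.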