Results 1 - 20 of 26
1.
J Opt Soc Am A Opt Image Sci Vis ; 41(2): 185-194, 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38437331

ABSTRACT

Multispectral imaging is a technique that captures data across several bands of the light spectrum, and it can be useful in many computer vision fields, including color constancy. We propose a method that exploits multispectral imaging for illuminant estimation and then applies illuminant correction in the raw RGB domain to achieve computational color constancy. Our proposed method is composed of two steps: first, a selected number of existing camera-independent algorithms for illuminant estimation, originally designed for RGB data, are applied in generalized form to work with multispectral data. We demonstrate that the multispectral extension of such algorithms alone is not sufficient to achieve color constancy, and thus we introduce a second step in which we re-elaborate the multispectral estimations before converting them into raw RGB with the use of the camera response function. Our results on the NUS dataset show that our method yields a 60% improvement in color constancy performance, measured in terms of reproduction angular error, compared to the traditional raw RGB pipeline.
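The two angular error metrics used throughout these color constancy papers can be sketched in a few lines. The following is an illustrative implementation of the standard definitions (recovery error: angle between estimated and ground-truth illuminant vectors; reproduction error: angle between the per-channel correction ratio and the ideal achromatic white), not code from the paper:

```python
import numpy as np

def recovery_angular_error(est, gt):
    """Angle in degrees between estimated and ground-truth illuminant vectors."""
    est, gt = np.asarray(est, float), np.asarray(gt, float)
    cos = np.dot(est, gt) / (np.linalg.norm(est) * np.linalg.norm(gt))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

def reproduction_angular_error(est, gt):
    """Angle between the per-channel ratio gt/est and the ideal white (1,1,1)."""
    ratio = np.asarray(gt, float) / np.asarray(est, float)
    cos = ratio.sum() / (np.linalg.norm(ratio) * np.sqrt(3.0))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

# A perfect estimate yields zero error under both metrics.
print(recovery_angular_error([0.3, 0.5, 0.2], [0.3, 0.5, 0.2]))  # → 0.0
```

The reproduction error penalizes how far the corrected white deviates from neutral, which is why it can rank estimates differently from the recovery error.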

2.
J Opt Soc Am A Opt Image Sci Vis ; 41(3): 516-526, 2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38437443

ABSTRACT

We introduce a method that enhances RGB color constancy accuracy by combining neural network and k-means clustering techniques. Our approach stands out from previous works because we combine multispectral and color information to estimate illuminants. Furthermore, we investigate the combination of illuminant estimation in the RGB color and spectral domains as a strategy to provide a refined estimation in the RGB color domain. Our investigation covers three main points: (1) identifying the spatial resolution for sampling the input image, in terms of RGB color and spectral information, that yields the highest performance; (2) determining whether it is more effective to predict the illuminant in the spectral or in the RGB color domain; and finally, (3) assuming that the illuminant is predicted in the spectral domain, investigating whether it is better to define the loss function in the RGB color or spectral domain. Experiments are carried out on NUS, a standard dataset of multispectral radiance images with an annotated spectral global illuminant. Among the several considered options, the best results are obtained with a model trained to predict the illuminant in the spectral domain using an RGB color loss function. Compared with the state of the art, this solution improves the recovery angular error metric by 66% over the best tested spectral method and by 41% over the best tested RGB method.
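Defining a loss in the RGB domain for a spectral prediction, as in point (3) above, requires projecting the spectral estimate through the camera response function. Here is a minimal sketch of that projection; the Gaussian sensitivity curves are hypothetical stand-ins, since real response curves are camera-specific:

```python
import numpy as np

# Hypothetical 3 x N camera response matrix (rows: R, G, B sensitivities
# sampled at N wavelengths); real curves must be measured per camera.
wavelengths = np.linspace(400, 700, 31)
response = np.stack([
    np.exp(-0.5 * ((wavelengths - mu) / 30.0) ** 2)  # Gaussian stand-ins
    for mu in (600, 540, 460)                        # R, G, B peaks in nm
])

def spectral_to_rgb(illuminant_spd, response):
    """Project a spectral illuminant estimate into the raw RGB domain."""
    rgb = response @ illuminant_spd
    return rgb / rgb.sum()  # normalize to unit sum, as is common

flat_spd = np.ones(31)      # equal-energy illuminant
print(spectral_to_rgb(flat_spd, response))
```

An RGB-domain loss would then compare `spectral_to_rgb(prediction, response)` against the ground-truth illuminant chromaticity rather than the full spectra.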

3.
J Imaging ; 9(10)2023 Oct 20.
Article in English | MEDLINE | ID: mdl-37888340

ABSTRACT

Data augmentation is a fundamental technique in machine learning that plays a crucial role in expanding the size of training datasets. By applying various transformations or modifications to existing data, data augmentation enhances the generalization and robustness of machine learning models. In recent years, the development of several libraries has simplified the utilization of diverse data augmentation strategies across different tasks. This paper focuses on the most widely adopted libraries specifically designed for data augmentation in computer vision tasks. Here, we aim to provide a comprehensive survey of publicly available data augmentation libraries, helping practitioners navigate these resources effectively. Through a curated taxonomy, we present an organized classification of the different approaches employed by these libraries, along with accompanying application examples. By examining the techniques of each library, practitioners can make informed decisions in selecting the most suitable augmentation techniques for their computer vision projects. To ensure the accessibility of this information, a dedicated public website named DALib has been created. This website serves as a centralized repository where the taxonomy, methods, and examples associated with the surveyed data augmentation libraries can be explored. By offering this comprehensive resource, we aim to empower practitioners and contribute to the advancement of computer vision research and applications through the effective utilization of data augmentation techniques.
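The core idea shared by the surveyed libraries can be illustrated with a minimal, library-free sketch: a stochastic transform that turns one training sample into several distinct ones. This is a hypothetical example, not code from any of the surveyed libraries:

```python
import numpy as np

def augment(image, rng):
    """Apply simple geometric augmentations: random horizontal flip
    followed by a random 90-degree rotation."""
    if rng.random() < 0.5:
        image = np.fliplr(image)   # horizontal flip
    k = int(rng.integers(0, 4))
    return np.rot90(image, k)      # rotate by k * 90 degrees

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))
# Expand one sample into four augmented variants.
augmented = [augment(image, rng) for _ in range(4)]
```

Real libraries compose many such transforms (photometric, geometric, mixing-based) into configurable pipelines; the taxonomy in the paper organizes exactly those families.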

4.
Sensors (Basel) ; 23(18)2023 Sep 14.
Article in English | MEDLINE | ID: mdl-37765950

ABSTRACT

Defect segmentation of apples is an important task in the agriculture industry for quality control and food safety. In this paper, we propose a deep learning approach for the automated segmentation of apple defects using convolutional neural networks (CNNs) based on a U-shaped architecture with skip connections only within the noise reduction block. An ad hoc data synthesis technique has been designed to increase the number of samples and, at the same time, to reduce neural network overfitting. We evaluate our model on a dataset of multispectral apple images with pixel-wise annotations for several types of defects. We show that our proposal outperforms, in terms of segmentation accuracy, general-purpose deep learning architectures commonly used for segmentation tasks, and that it improves on previous methods for apple defect segmentation. A measure of the computational cost shows that our proposal can be employed in real-time (about 100 frames per second on GPU) and quasi-real-time (about 7-8 frames per second on CPU) visual-based apple inspection. To further improve the applicability of the method, we investigate the potential of using only RGB images instead of multispectral images as input. The results show that the accuracy in this case is almost comparable with the multispectral case.
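Pixel-wise segmentation accuracy of the kind reported above is commonly measured with intersection over union (IoU) between predicted and annotated defect masks. A minimal sketch of that metric (an illustration of the standard definition, not the paper's evaluation code):

```python
import numpy as np

def iou(pred, target):
    """Intersection over union between two binary defect masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return inter / union if union else 1.0  # two empty masks count as a match

pred = np.zeros((8, 8), int); pred[2:6, 2:6] = 1  # predicted defect region
gt   = np.zeros((8, 8), int); gt[3:7, 3:7] = 1    # annotated defect region
print(iou(pred, gt))  # overlap 3x3=9, union 16+16-9=23 → ≈0.391
```

Averaging IoU over defect classes (mean IoU) is the usual way to compare architectures on multi-class defect datasets.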

5.
Article in English | MEDLINE | ID: mdl-37027541

ABSTRACT

Full-reference image quality measures are a fundamental tool to approximate the human visual system in various applications for digital data management: from retrieval to compression to detection of unauthorized uses. Inspired by both the effectiveness and the simplicity of the hand-crafted Structural Similarity Index Measure (SSIM), in this work we present a framework for the formulation of SSIM-like image quality measures through genetic programming. We explore different terminal sets, defined from the building blocks of structural similarity at different levels of abstraction, and we propose a two-stage genetic optimization that exploits hoist mutation to constrain the complexity of the solutions. Our optimized measures are selected through a cross-dataset validation procedure, which results in superior performance compared with different versions of structural similarity, measured as correlation with human mean opinion scores. We also demonstrate how, by tuning on specific datasets, it is possible to obtain solutions that are competitive with (or even outperform) more complex image quality measures.
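The hand-crafted SSIM that serves as the starting point above combines luminance, contrast, and structure terms. A single-window (whole-image) sketch of the standard SSIM formula, for illustration only — practical SSIM uses local sliding windows:

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    """SSIM computed over the whole image as one window (no sliding
    window), using the usual stabilizing constants C1 and C2."""
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2) /
            ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2)))

rng = np.random.default_rng(1)
img = rng.random((16, 16))
print(global_ssim(img, img))  # identical images → 1.0
```

Genetic programming, as described in the abstract, would treat terms like the means, variances, and covariance above as terminals and evolve new ways of combining them.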

6.
Sensors (Basel) ; 23(8)2023 Apr 07.
Article in English | MEDLINE | ID: mdl-37112129

ABSTRACT

Precision agriculture has emerged as a promising approach to improve crop productivity and reduce environmental impact. However, effective decision making in precision agriculture relies on accurate and timely data acquisition, management, and analysis. The collection of multisource and heterogeneous data for soil characteristics estimation is a critical component of precision agriculture, as it provides insights into key factors such as soil nutrient levels, moisture content, and texture. To address these needs, this work proposes a software platform that facilitates the collection, visualization, management, and analysis of soil data. The platform is designed to handle data from various sources, including proximity, airborne, and spaceborne data, to enable precision agriculture. The proposed software allows for the integration of new data, including data that can be collected directly on board the acquisition device, and it also allows for the incorporation of custom predictive systems for soil digital mapping. The usability experiments conducted on the proposed software platform demonstrate that it is easy to use and effective. Overall, this work highlights the importance of decision support systems in the field of precision agriculture and the potential benefits of using such systems for soil data management and analysis.

7.
J Opt Soc Am A Opt Image Sci Vis ; 39(6): B1-B10, 2022 Jun 01.
Article in English | MEDLINE | ID: mdl-36215522

ABSTRACT

Blind image quality assessment (BIQA) of authentically distorted images is a challenging problem due to the lack of a reference image and the coexistence of blends of distortions with unknown characteristics. In this article, we present a convolutional neural network based BIQA model. It encodes the input image into multi-level features to estimate the perceptual quality score. The proposed model is designed to predict the image quality score but is trained for jointly treating the image quality assessment as a classification, regression, and pairwise ranking problem. Experimental results on three different datasets of authentically distorted images show that the proposed method achieves comparable results with state-of-the-art methods in intra-dataset experiments and is more effective in cross-dataset experiments.
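Of the three joint training objectives mentioned above, the pairwise ranking term is the least standard. A minimal sketch of a margin-based ranking loss over predicted scores (an illustration of the general technique, not the authors' loss):

```python
import numpy as np

def pairwise_ranking_loss(pred, mos, margin=0.5):
    """Hinge loss over all image pairs: predicted scores should preserve
    the ordering of the subjective scores (mean opinion scores)."""
    loss, n = 0.0, 0
    for i in range(len(pred)):
        for j in range(len(pred)):
            if mos[i] > mos[j]:  # image i should rank above image j
                loss += max(0.0, margin - (pred[i] - pred[j]))
                n += 1
    return loss / max(n, 1)

pred = np.array([3.0, 1.0, 2.0])
mos  = np.array([5.0, 1.0, 3.0])  # hypothetical mean opinion scores
print(pairwise_ranking_loss(pred, mos))  # ordering preserved with margin → 0.0
```

In a joint setup, this term is added to the regression (and classification) losses so the network learns both absolute scores and correct relative orderings.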


Subject(s)
Algorithms , Image Processing, Computer-Assisted , Image Processing, Computer-Assisted/methods , Neural Networks, Computer
8.
Sensors (Basel) ; 21(22)2021 Nov 09.
Article in English | MEDLINE | ID: mdl-34833529

ABSTRACT

Smart mirrors are devices that can display many kinds of information and can interact with the user through touch and voice commands. Different kinds of smart mirrors exist: general purpose, medical, fashion, and other task-specific ones. General-purpose smart mirrors are suitable for home environments, but the existing ones offer similar, limited functionalities. In this paper, we present a general-purpose smart mirror that integrates several functionalities, both standard and advanced, to support users in their everyday life. Among the advanced functionalities are the capability of detecting a person's emotions, the short- and long-term monitoring and analysis of those emotions, a double authentication protocol to preserve privacy, and the integration of Alexa Skills to extend the applications of the smart mirror. We exploit deep learning techniques to develop most of the smart functionalities. The effectiveness of the device is demonstrated by the performance of the implemented functionalities and by a usability evaluation with real users.


Subject(s)
Emotions , Voice , Humans , Privacy
9.
J Imaging ; 7(3)2021 Mar 13.
Article in English | MEDLINE | ID: mdl-34460711

ABSTRACT

Methods for No-Reference Video Quality Assessment (NR-VQA) of consumer-produced video content have been widely investigated due to the spread of databases containing videos affected by natural distortions. In this work, we design an effective and efficient method for NR-VQA. The proposed method exploits a novel sampling module capable of selecting a predetermined number of frames from the whole video sequence on which to base the quality assessment. It encodes both the quality attributes and the semantic content of video frames using two lightweight Convolutional Neural Networks (CNNs). Then, it estimates the quality score of the entire video using a Support Vector Regressor (SVR). We compare the proposed method against several relevant state-of-the-art methods on four benchmark databases containing user-generated videos (CVD2014, KoNViD-1k, LIVE-Qualcomm, and LIVE-VQC). The results show that the proposed method predicts subjective video quality in line with state-of-the-art methods on individual databases, at a substantially lower computational cost, and generalizes better than existing methods in a cross-database setup.
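The simplest version of a fixed-budget frame sampling module, as described above, picks indices spread uniformly over the sequence. This sketch illustrates the idea (the paper's actual sampling module is more elaborate):

```python
import numpy as np

def sample_frames(num_frames, num_samples=8):
    """Pick a fixed number of frame indices spread uniformly over the
    video; duplicates (for very short videos) collapse via np.unique."""
    idx = np.linspace(0, num_frames - 1, num_samples)
    return np.unique(idx.round().astype(int))

print(sample_frames(300))  # → [  0  43  85 128 171 214 256 299]
```

Only the selected frames are passed to the two CNN encoders, which is what keeps the per-video computational cost low.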

10.
Sensors (Basel) ; 21(4)2021 Feb 12.
Article in English | MEDLINE | ID: mdl-33673052

ABSTRACT

The automatic assessment of the aesthetic quality of a photo is a challenging and extensively studied problem. Most of the existing works focus on the aesthetic quality assessment of photos regardless of the depicted subject and mainly use features extracted from the entire image. It has been observed that the performance of generic-content aesthetic assessment methods significantly decreases when it comes to images depicting faces. This paper introduces a method for evaluating the aesthetic quality of images with faces by encoding both the properties of the entire image and specific aspects of the face. Three different convolutional neural networks are exploited to encode information regarding perceptual quality, global image aesthetics, and facial attributes; then, a model is trained to combine these features to explicitly predict the aesthetics of images containing faces. Experimental results show that our approach outperforms existing methods for both binary (i.e., low/high) and continuous aesthetic score prediction on four different state-of-the-art image databases.

11.
Sensors (Basel) ; 21(3)2021 Feb 02.
Article in English | MEDLINE | ID: mdl-33540652

ABSTRACT

We propose an anomaly detection based image quality assessment method which exploits the correlations between feature maps from a pre-trained Convolutional Neural Network (CNN). The proposed method encodes the intra-layer correlation through the Gram matrix and then estimates the quality score combining the average of the correlation and the output from an anomaly detection method. The latter evaluates the degree of abnormality of an image by computing a correlation similarity with respect to a dictionary of pristine images. The effectiveness of the method is tested on different benchmarking datasets (LIVE-itW, KONIQ, and SPAQ).
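The Gram matrix used above to encode intra-layer correlations is straightforward to compute from a stack of feature maps. An illustrative sketch (the normalization choice is an assumption; the paper may normalize differently):

```python
import numpy as np

def gram_matrix(features):
    """Intra-layer correlations of CNN feature maps: `features` has shape
    (channels, height, width); the Gram matrix is channels x channels."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return flat @ flat.T / (h * w)  # normalized channel-pair correlations

rng = np.random.default_rng(2)
fmap = rng.random((4, 8, 8))  # hypothetical feature maps from one layer
g = gram_matrix(fmap)
print(g.shape)  # (4, 4)
```

Because the Gram matrix discards spatial layout and keeps only channel co-activations, it provides a compact signature that can be compared against a dictionary of pristine-image signatures.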

12.
Sensors (Basel) ; 21(3)2021 Feb 02.
Article in English | MEDLINE | ID: mdl-33540828

ABSTRACT

In this paper we present T1K+, a very large, heterogeneous database of high-quality texture images acquired under variable conditions. T1K+ contains 1129 classes of textures ranging from natural subjects to food, textile samples, construction materials, etc. T1K+ allows the design of experiments especially aimed at understanding the specific issues related to texture classification and retrieval. To help the exploration of the database, all the 1129 classes are hierarchically organized in 5 thematic categories and 266 sub-categories. To complete our study, we present an evaluation of hand-crafted and learned visual descriptors in supervised texture classification tasks.

13.
J Opt Soc Am A Opt Image Sci Vis ; 37(11): 1721-1730, 2020 Nov 01.
Article in English | MEDLINE | ID: mdl-33175748

ABSTRACT

Color constancy algorithms are typically evaluated with a statistical analysis of the recovery angular error and the reproduction angular error between the estimated and ground truth illuminants. Such analysis provides information about only the magnitude of the errors, and not about their chromatic properties. We propose an Angle-Retaining Chromaticity diagram (ARC) for the visual analysis of the estimated illuminants and the corresponding errors. We provide both quantitative and qualitative proof of the superiority of ARC in preserving angular distances compared to other chromaticity diagrams, making it possible to quantify the reproduction and recovery errors in terms of Euclidean distances on a plane. We present two case studies for the application of the ARC diagram in the visualization of the ground truth illuminants of color constancy datasets, and the visual analysis of error distributions of color constancy algorithms.

14.
Article in English | MEDLINE | ID: mdl-32365026

ABSTRACT

In this work we present SpliNet, a novel CNN-based method that estimates a global color transform for the enhancement of raw images. The method is designed to improve the perceived quality of the images by reproducing the ability of an expert in the field of photo editing. The transformation applied to the input image is found by a convolutional neural network specifically trained for this purpose. More precisely, the network takes as input a raw image and produces as output one set of control points for each of the three color channels. Then, the control points are interpolated with natural cubic splines and the resulting functions are globally applied to the values of the input pixels to produce the output image. Experimental results compare favorably against recent methods in the state of the art on the MIT-Adobe FiveK dataset. Furthermore, we propose an extension of SpliNet in which a single neural network is used to model the style of multiple reference retouchers by embedding them into a user space. The style of new users can be reproduced without retraining the network, after a quick modeling stage in which they are positioned in the user space on the basis of their preferences on a very small set of retouched images.
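Once the network has produced per-channel control points, applying the global transform is a per-pixel curve lookup. This sketch uses np.interp (piecewise-linear) as a stand-in for the natural cubic spline interpolation the method actually uses, and the control points are hypothetical:

```python
import numpy as np

def apply_curve(channel, control_x, control_y):
    """Apply a global tone curve defined by control points to one color
    channel; np.interp is a piecewise-linear stand-in for the natural
    cubic splines described in the paper."""
    return np.interp(channel, control_x, control_y)

rng = np.random.default_rng(3)
raw = rng.random((4, 4, 3))  # toy raw image in [0, 1)
# Hypothetical per-channel control points: brightened R, identity G and B.
curves = {0: ([0.0, 0.5, 1.0], [0.0, 0.7, 1.0]),
          1: ([0.0, 1.0], [0.0, 1.0]),
          2: ([0.0, 1.0], [0.0, 1.0])}
out = np.stack([apply_curve(raw[..., c], *curves[c]) for c in range(3)],
               axis=-1)
```

Because only the handful of control points depends on the image, the transform itself is cheap to apply at full resolution.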

15.
J Imaging ; 6(8)2020 Jul 30.
Article in English | MEDLINE | ID: mdl-34460689

ABSTRACT

We introduce a no-reference method for the assessment of the quality of videos affected by in-capture distortions due to camera hardware and processing software. The proposed method encodes both quality attributes and semantic content of each video frame by using two Convolutional Neural Networks (CNNs) and then estimates the quality score of the whole video by using a Recurrent Neural Network (RNN), which models the temporal information. The extensive experiments conducted on four benchmark databases (CVD2014, KoNViD-1k, LIVE-Qualcomm, and LIVE-VQC) containing in-capture distortions demonstrate the effectiveness of the proposed method and its ability to generalize in a cross-database setup.

16.
Sensors (Basel) ; 18(8)2018 Aug 14.
Article in English | MEDLINE | ID: mdl-30110891

ABSTRACT

We present a multi-task learning-based convolutional neural network (MTL-CNN) able to estimate multiple tags describing face images simultaneously. In total, the model is able to estimate up to 74 different face attributes belonging to three distinct recognition tasks: age group, gender and visual attributes (such as hair color, face shape and the presence of makeup). The proposed model shares all the CNN's parameters among tasks and deals with task-specific estimation through the introduction of two components: (i) a gating mechanism to control activations' sharing and to adaptively route them across different face attributes; (ii) a module to post-process the predictions in order to take into account the correlation among face attributes. The model is trained by fusing multiple databases for increasing the number of face attributes that can be estimated and using a center loss for disentangling representations among face attributes in the embedding space. Extensive experiments validate the effectiveness of the proposed approach.


Subject(s)
Face , Neural Networks, Computer , Databases, Factual , Deep Learning , Face/anatomy & histology
17.
Sensors (Basel) ; 18(1)2018 Jan 12.
Article in English | MEDLINE | ID: mdl-29329268

ABSTRACT

Automatic detection and localization of anomalies in nanofibrous materials help to reduce the cost of the production process and the time of the post-production visual inspection process. Amongst all the monitoring methods, those exploiting Scanning Electron Microscope (SEM) imaging are the most effective. In this paper, we propose a region-based method for the detection and localization of anomalies in SEM images, based on Convolutional Neural Networks (CNNs) and self-similarity. The method evaluates the degree of abnormality of each subregion of an image under consideration by computing a CNN-based visual similarity with respect to a dictionary of anomaly-free subregions belonging to a training set. The proposed method outperforms the state of the art.
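The dictionary-based abnormality score described above can be sketched as a nearest-neighbor similarity search in feature space. This is an illustrative stand-in using cosine similarity on generic feature vectors, not the paper's CNN-based similarity:

```python
import numpy as np

def abnormality(feature, dictionary):
    """Degree of abnormality of a subregion: one minus its best cosine
    similarity against a dictionary of anomaly-free subregion features."""
    f = feature / np.linalg.norm(feature)
    d = dictionary / np.linalg.norm(dictionary, axis=1, keepdims=True)
    return 1.0 - float(np.max(d @ f))

rng = np.random.default_rng(4)
dictionary = rng.random((10, 64))        # features of normal subregions
normal = dictionary[0].copy()            # a subregion seen during training
print(abnormality(normal, dictionary))   # ≈ 0 (perfect dictionary match)
```

Subregions whose score exceeds a threshold are flagged and localized as anomalies; the quality of the localization hinges on how discriminative the learned features are.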

18.
IEEE Trans Image Process ; 26(9): 4347-4362, 2017 Sep.
Article in English | MEDLINE | ID: mdl-28600246

ABSTRACT

In this paper, we present a three-stage method for the estimation of the color of the illuminant in RAW images. The first stage uses a convolutional neural network that has been specially designed to produce multiple local estimates of the illuminant. The second stage, given the local estimates, determines the number of illuminants in the scene. Finally, local illuminant estimates are refined by non-linear local aggregation, resulting in a global estimate in the case of a single illuminant. An extensive comparison with both local and global illuminant estimation methods in the state of the art, on standard datasets with single and multiple illuminants, proves the effectiveness of our method.

19.
IEEE J Biomed Health Inform ; 21(3): 588-598, 2017 05.
Article in English | MEDLINE | ID: mdl-28114043

ABSTRACT

We propose a new dataset for the evaluation of food recognition algorithms that can be used in dietary monitoring applications. Each image depicts a real canteen tray with dishes and foods arranged in different ways. Each tray contains multiple instances of food classes. The dataset contains 1027 canteen trays for a total of 3616 food instances belonging to 73 food classes. The food on the tray images has been manually segmented using carefully drawn polygonal boundaries. We have benchmarked the dataset by designing an automatic tray analysis pipeline that takes a tray image as input, finds the regions of interest, and predicts for each region the corresponding food class. We have experimented with three different classification strategies, also using several visual descriptors. We achieve about 79% accuracy in food and tray recognition using convolutional-neural-network-based features. The dataset and the benchmark framework are available to the research community.


Subject(s)
Databases, Factual , Food/classification , Image Processing, Computer-Assisted/methods , Pattern Recognition, Automated/methods , Algorithms , Humans , Neural Networks, Computer
20.
J Opt Soc Am A Opt Image Sci Vis ; 33(1): 17-30, 2016 Jan 01.
Article in English | MEDLINE | ID: mdl-26831581

ABSTRACT

The recognition of color texture under varying lighting conditions remains an open issue. Several features have been proposed for this purpose, ranging from traditional statistical descriptors to features extracted with neural networks. Still, it is not completely clear under what circumstances one feature performs better than others. In this paper, we report an extensive comparison of old and new texture features, with and without a color normalization step, with a particular focus on how these features are affected by small and large variations in the lighting conditions. The evaluation is performed on a new texture database, which includes 68 samples of raw food acquired under 46 conditions that present single and combined variations of light color, direction, and intensity. The database allows us to systematically investigate the robustness of texture descriptors across large variations of imaging conditions.
