Your browser doesn't support javascript.
loading
Show: 20 | 50 | 100
Results 1 - 7 de 7
Filter
Add more filters










Database
Language
Publication year range
1.
Article in English | MEDLINE | ID: mdl-38713563

ABSTRACT

Deep learning models dealing with image understanding in real-world settings must be able to adapt to a wide variety of tasks across different domains. Domain adaptation and class incremental learning deal with domain and task variability separately, whereas their unified solution is still an open problem. We tackle both facets of the problem together, taking into account the semantic shift within both input and label spaces. We start by formally introducing continual learning under task and domain shift. Then, we address the proposed setup by using style transfer techniques to extend knowledge across domains when learning incremental tasks and a robust distillation framework to effectively recollect task knowledge under incremental domain shift. The devised framework (LwS, Learning with Style) is able to generalize incrementally acquired task knowledge across all the domains encountered, proving to be robust against catastrophic forgetting. Extensive experimental evaluation on multiple autonomous driving datasets shows how the proposed method outperforms existing approaches, which prove to be ill-equipped to deal with continual semantic segmentation under both task and domain shift. The code is available at https://lttm.dei.unipd.it/paper data/LwS.

2.
Neural Netw ; 153: 303-313, 2022 Sep.
Article in English | MEDLINE | ID: mdl-35772251

ABSTRACT

Training procedures for deep networks require the setting of several hyper-parameters that strongly affect the obtained results. The problem is even worse in adversarial learning strategies used for image generation where a proper balancing of the discriminative and generative networks is fundamental for an effective training. In this work we propose a novel hyper-parameters optimization strategy based on the use of Proportional-Integral (PI) and Proportional-Integral-Derivative (PID) controllers. Both open loop and closed loop schemes for the tuning of a single parameter or of multiple parameters together are proposed allowing an efficient parameter tuning without resorting to computationally demanding trial-and-error schemes. We applied the proposed strategies to the widely used BEGAN and CycleGAN models: They allowed to achieve a more stable training that converges faster. The obtained images are also sharper with a slightly better quality both visually and according to the FID and FCN metrics. Image translation results also showed better background preservation and less color artifacts with respect to CycleGAN.


Subject(s)
Artifacts , Image Processing, Computer-Assisted , Image Processing, Computer-Assisted/methods
3.
IEEE Trans Pattern Anal Mach Intell ; 44(12): 9195-9208, 2022 Dec.
Article in English | MEDLINE | ID: mdl-34714740

ABSTRACT

Depth maps acquired with ToF cameras have a limited accuracy due to the high noise level and to the multi-path interference. Deep networks can be used for refining ToF depth, but their training requires real world acquisitions with ground truth, which is complex and expensive to collect. A possible workaround is to train networks on synthetic data, but the domain shift between the real and synthetic data reduces the performances. In this paper, we propose three approaches to perform unsupervised domain adaptation of a depth denoising network from synthetic to real data. These approaches are respectively acting at the input, at the feature and at the output level of the network. The first approach uses domain translation networks to transform labeled synthetic ToF data into a representation closer to real data, that is then used to train the denoiser. The second approach tries to align the network internal features related to synthetic and real data. The third approach uses an adversarial loss, implemented with a discriminator trained to recognize the ground truth statistic, to train the denoiser on unlabeled real data. Experimental results show that the considered approaches are able to outperform other state-of-the-art techniques and achieve superior denoising performances.

4.
Sensors (Basel) ; 21(6)2021 Mar 11.
Article in English | MEDLINE | ID: mdl-33799603

ABSTRACT

In this work, we propose a novel approach for correcting multi-path interference (MPI) in Time-of-Flight (ToF) cameras by estimating the direct and global components of the incoming light. MPI is an error source linked to the multiple reflections of light inside a scene; each sensor pixel receives information coming from different light paths which generally leads to an overestimation of the depth. We introduce a novel deep learning approach, which estimates the structure of the time-dependent scene impulse response and from it recovers a depth image with a reduced amount of MPI. The model consists of two main blocks: a predictive model that learns a compact encoded representation of the backscattering vector from the noisy input data and a fixed backscattering model which translates the encoded representation into the high dimensional light response. Experimental results on real data show the effectiveness of the proposed approach, which reaches state-of-the-art performances.

5.
Data Brief ; 27: 104619, 2019 Dec.
Article in English | MEDLINE | ID: mdl-31687438

ABSTRACT

Time-of-Flight (ToF) sensors and stereo vision systems are two of the most diffused depth acquisition devices for commercial and industrial applications. They share complementary strengths and weaknesses. For this reason, the combination of data acquired from these devices can improve the final depth estimation accuracy. This paper introduces a dataset acquired with a multi-camera system composed by a Microsoft Kinect v2 ToF sensor, an Intel RealSense R200 active stereo sensor and a Stereolabs ZED passive stereo camera system. The acquired scenes include indoor settings with different external lighting conditions. The depth ground truth has been acquired for each scene of the dataset using a line laser. The data can be used for developing fusion and denoising algorithms for depth estimation and test with different lighting conditions. A subset of the data has already been used for the experimental evaluation of the work "Stereo and ToF Data Fusion by Learning from Synthetic Data".

6.
IEEE Trans Pattern Anal Mach Intell ; 37(11): 2260-72, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26440266

ABSTRACT

This paper proposes a method for fusing data acquired by a ToF camera and a stereo pair based on a model for depth measurement by ToF cameras which accounts also for depth discontinuity artifacts due to the mixed pixel effect. Such model is exploited within both a ML and a MAP-MRF frameworks for ToF and stereo data fusion. The proposed MAP-MRF framework is characterized by site-dependent range values, a rather important feature since it can be used both to improve the accuracy and to decrease the computational complexity of standard MAP-MRF approaches. This paper, in order to optimize the site dependent global cost function characteristic of the proposed MAP-MRF approach, also introduces an extension to Loopy Belief Propagation which can be used in other contexts. Experimental data validate the proposed ToF measurements model and the effectiveness of the proposed fusion techniques.

7.
IEEE Trans Image Process ; 22(5): 1982-95, 2013 May.
Article in English | MEDLINE | ID: mdl-23335671

ABSTRACT

Recent work on depth map compression has revealed the importance of incorporating a description of discontinuity boundary geometry into the compression scheme. We propose a novel compression strategy for depth maps that incorporates geometry information while achieving the goals of scalability and embedded representation. Our scheme involves two separate image pyramid structures, one for breakpoints and the other for sub-band samples produced by a breakpoint-adaptive transform. Breakpoints capture geometric attributes, and are amenable to scalable coding. We develop a rate-distortion optimization framework for determining the presence and precision of breakpoints in the pyramid representation. We employ a variation of the EBCOT scheme to produce embedded bit-streams for both the breakpoint and sub-band data. Compared to JPEG 2000, our proposed scheme enables the same the scalability features while achieving substantially improved rate-distortion performance at the higher bit-rate range and comparable performance at the lower rates.

SELECTION OF CITATIONS
SEARCH DETAIL
...