Results 1 - 11 of 11
1.
IEEE Trans Pattern Anal Mach Intell ; 45(9): 10615-10631, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37079402

ABSTRACT

Deep convolutional neural networks for dense prediction tasks are commonly optimized using synthetic data, as generating pixel-wise annotations for real-world data is laborious. However, the synthetically trained models do not generalize well to real-world environments. We address this poor "synthetic to real" (S2R) generalization through the lens of shortcut learning. We demonstrate that the learning of feature representations in deep convolutional networks is heavily influenced by synthetic data artifacts (shortcut attributes). To mitigate this issue, we propose an Information-Theoretic Shortcut Avoidance (ITSA) approach to automatically restrict shortcut-related information from being encoded into the feature representations. Specifically, our proposed method minimizes the sensitivity of latent features to input variations, which regularizes the learning of robust and shortcut-invariant features in synthetically trained models. To avoid the prohibitive computational cost of direct input sensitivity optimization, we propose a practical and feasible algorithm to achieve robustness. Our results show that the proposed method can effectively improve S2R generalization in multiple distinct dense prediction tasks, such as stereo matching, optical flow, and semantic segmentation. Importantly, the proposed method enhances the robustness of the synthetically trained networks and outperforms their fine-tuned counterparts (on real data) for challenging out-of-domain applications.

2.
Article in English | MEDLINE | ID: mdl-32275594

ABSTRACT

This paper presents an innovative method for motion segmentation in RGB-D dynamic videos with multiple moving objects. The focus is on finding static, small or slow-moving objects (often overlooked by other methods) whose inclusion can improve the motion segmentation results. In our approach, semantic object-based segmentation and motion cues are combined to estimate the number of moving objects and their motion parameters, and to perform segmentation. Selective object-based sampling and correspondence matching are used to estimate object-specific motion parameters. The main issue with such an approach is the over-segmentation of moving parts, because different objects can have the same motion (e.g. background objects). To resolve this issue, we propose to identify objects with similar motions by characterizing each motion by a distribution of a simple metric and using a statistical inference theory to assess their similarities. To demonstrate the significance of the proposed statistical inference, we present an ablation study, with and without the inclusion of static objects, on SLAM accuracy using the TUM-RGBD dataset. To test the effectiveness of the proposed method for finding small or slow-moving objects, we applied the method to the RGB-D MultiBody and SBM-RGBD motion segmentation datasets. The results showed that we can improve the accuracy of motion segmentation for small objects while remaining competitive on overall measures.

3.
Sensors (Basel) ; 20(3)2020 Feb 10.
Article in English | MEDLINE | ID: mdl-32050574

ABSTRACT

One of the core challenges in visual multi-target tracking is occlusion. This is especially important in applications such as video surveillance and sports analytics. While offline batch processing algorithms can utilise future measurements to handle occlusion effectively, online algorithms have to rely on current and past measurements only. As such, it is markedly more challenging to handle occlusion in online applications. To address this problem, we propagate information over time in a way that generates a sense of déjà vu when similar visual and motion features are observed. To achieve this, we extend the Generalized Labeled Multi-Bernoulli (GLMB) filter, originally designed for tracking point-sized targets, to be used in visual multi-target tracking. The proposed algorithm includes novel false alarm detection/removal and label recovery methods capable of reliably recovering tracks that have been lost, even for a substantial period of time. We compare the performance of the proposed method with state-of-the-art methods on challenging datasets using standard visual tracking metrics. Our comparisons show that the proposed method performs favourably compared to the state-of-the-art methods, particularly in terms of the ID switch and fragmentation metrics, which reflect how well occlusion is handled.

4.
IEEE Trans Med Imaging ; 39(4): 854-865, 2020 04.
Article in English | MEDLINE | ID: mdl-31425069

ABSTRACT

Volumetric imaging is an essential diagnostic tool for medical practitioners. The use of popular techniques such as convolutional neural networks (CNN) for analysis of volumetric images is constrained by the availability of detailed (with local annotations) training data and GPU memory. In this paper, the volumetric image classification problem is posed as a multi-instance classification problem and a novel method is proposed to adaptively select positive instances from positive bags during the training phase. This method uses extreme value theory to model the feature distribution of the images without a pathology and uses it to identify positive instances of an imaged pathology. The experimental results on three separate image classification tasks (classification of retinal OCT images according to the presence or absence of fluid build-ups, emphysema detection in pulmonary 3D-CT images, and detection of cancerous regions in 2D histopathology images) show that the proposed method produces classifiers that have similar performance to fully supervised methods and achieves state-of-the-art performance in all examined test cases.


Subject(s)
Deep Learning , Imaging, Three-Dimensional/methods , Tomography, X-Ray Computed/methods , Algorithms , Emphysema/diagnostic imaging , Humans , Lung/diagnostic imaging , Pulmonary Disease, Chronic Obstructive/diagnostic imaging , Supervised Machine Learning
5.
Sensors (Basel) ; 19(17)2019 Sep 01.
Article in English | MEDLINE | ID: mdl-31480502

ABSTRACT

In many multi-object tracking applications, the sensor(s) may have controllable states. Examples include movable sensors in multi-target tracking applications in defence, and unmanned air vehicles (UAVs) as sensors in multi-object systems used in civil applications such as inspection and fault detection. Uncertainties in the number of objects (due to random appearances and disappearances) as well as false alarms and detection uncertainties collectively make the above problem a highly challenging stochastic sensor control problem. Numerous solutions have been proposed to tackle the problem of precise control of sensor(s) for multi-object detection and tracking, and, in this work, recent contributions towards the advancement in the domain are comprehensively reviewed. After an introduction, we provide an overview of the sensor control problem and present the key components of sensor control solutions in general. Then, we present a categorization of the existing methods and review those methods under each category. The categorization includes a new generation of solutions called selective sensor control that have been recently developed for applications where particular objects of interest need to be accurately detected and tracked by controllable sensors.

6.
Sensors (Basel) ; 19(9)2019 Apr 29.
Article in English | MEDLINE | ID: mdl-31035720

ABSTRACT

This paper presents a novel Track-Before-Detect (TBD) Labeled Multi-Bernoulli (LMB) filter tailored for industrial mobile platform safety applications. At the core of the developed solution are two techniques for fusion of color and edge information in visual tracking. We derive an application-specific separable likelihood function that captures the geometric shape of the human targets wearing safety vests. We use a novel geometric shape likelihood along with a color likelihood to devise two Bayesian update steps which fuse shape- and color-related information. One approach is sequential and the other is based on the weighted Kullback-Leibler average (KLA). Experimental results show that the KLA-based fusion variant of the proposed algorithm outperforms both the sequential-update-based variant and a state-of-the-art method in terms of the performance metrics commonly used in computer vision literature.
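
As a rough illustration of the weighted Kullback-Leibler average mentioned in this abstract: for densities defined over a common support, the weighted KLA is the normalized weighted geometric mean of the inputs. The sketch below applies that to two small discrete distributions. The function name `kl_average`, the example values, and the equal weights are assumptions made for illustration; the paper fuses multi-object densities inside an LMB filter, which this toy example does not attempt to reproduce.

```python
import numpy as np

def kl_average(densities, weights):
    """Weighted Kullback-Leibler average of discrete densities.

    Computed as the normalized weighted geometric mean, evaluated in
    log space for numerical stability. All names here are illustrative.
    """
    densities = np.asarray(densities, dtype=float)   # shape (n_sources, n_bins)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()                # weights must sum to one
    log_mix = weights @ np.log(densities + 1e-300)   # weighted sum of log-densities
    fused = np.exp(log_mix - log_mix.max())          # subtract max before exponentiating
    return fused / fused.sum()                       # renormalize to a valid density

# Hypothetical posteriors over three candidate states from two cues.
shape_posterior = np.array([0.7, 0.2, 0.1])
color_posterior = np.array([0.5, 0.4, 0.1])
fused = kl_average([shape_posterior, color_posterior], [0.5, 0.5])
print(fused)
```

Fusing a density with itself returns the same density, which is a quick sanity check on any implementation of this average.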

7.
Sensors (Basel) ; 19(7)2019 Apr 03.
Article in English | MEDLINE | ID: mdl-30987259

ABSTRACT

There is a large body of literature on solving the SLAM problem for various autonomous vehicle applications. A substantial part of the solutions is formulated using statistical (mainly Bayesian) filters such as the Kalman filter and its extended version. In such solutions, the measurements are commonly point features or detections collected by the sensor(s) on board the autonomous vehicle. With the increasing utilization of scanners in common autonomous cars, and the availability of 3D point clouds in real time and at fast rates, it is now possible to use more sophisticated features extracted from the point clouds for filtering. This paper presents the idea of using planar features with multi-object Bayesian filters for SLAM. With Bayesian filters, the first step is prediction, where the object states are propagated to the next time step based on a stochastic transition model. We first present how such a transition model can be developed, and then propose a solution for state prediction. In the simulation studies, using a dataset of measurements acquired from real vehicle sensors, we apply the proposed model to predict the next planar features and vehicle states. The results show reasonable accuracy and efficiency for statistical filtering-based SLAM applications.

8.
IEEE Trans Image Process ; 27(9): 4182-4194, 2018 Sep.
Article in English | MEDLINE | ID: mdl-29870340

ABSTRACT

Identifying the underlying models in a set of data points that is contaminated by noise and outliers leads to a highly complex multi-model fitting problem. This problem can be posed as a clustering problem by the projection of higher-order affinities between data points into a graph, which can be clustered using spectral clustering. Calculating all possible higher-order affinities is computationally expensive. Hence, in most cases, only a subset is used. In this paper, we propose an effective sampling method for obtaining a highly accurate approximation of the full graph, which is required to solve multi-structural model fitting problems in computer vision. The proposed method is based on the observation that the usefulness of a graph for segmentation improves as the distribution of the hypotheses that are used to build the graph approaches the distribution of the actual parameters for the given data. We approximate this actual parameter distribution by using a kth-order statistics-based cost function, and the samples are generated using a greedy algorithm that is coupled with a data sub-sampling strategy. The experimental analysis shows that the proposed method is both accurate and computationally efficient compared with the state-of-the-art robust multi-model fitting techniques. The implementation of the method is publicly available from https://github.com/RuwanT/model-fitting-cbs.
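
To make the idea of an order-statistics-based cost concrete, the sketch below scores a line hypothesis by its k-th smallest squared residual, so that gross outliers beyond the k-th rank do not affect the score. This is a generic least k-th order statistic cost written under stated assumptions; the abstract does not give the exact cost function, and the function name `kth_order_cost`, the synthetic data, and the choice k = 8 are all hypothetical.

```python
import numpy as np

def kth_order_cost(residuals, k):
    """Return the k-th smallest squared residual (k is 1-based)."""
    squared = np.square(np.asarray(residuals, dtype=float))
    # np.partition places the element of rank k-1 at index k-1.
    return np.partition(squared, k - 1)[k - 1]

# Synthetic data: 10 points on the line y = 2x + 1, plus 5 gross outliers.
x = np.arange(15.0)
y = 2.0 * x + 1.0
y[10:] += 50.0

good_residuals = y - (2.0 * x + 1.0)   # hypothesis matching the inlier structure
bad_residuals = y - 20.0               # a horizontal-line hypothesis

print(kth_order_cost(good_residuals, 8))   # low cost despite the outliers
print(kth_order_cost(bad_residuals, 8))    # clearly higher cost
```

Because only the k best-fitting points determine the cost, a hypothesis close to a true structure scores well even when a large fraction of the data belongs to other structures or to outliers.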

9.
IEEE Trans Pattern Anal Mach Intell ; 38(2): 350-62, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26761739

ABSTRACT

Identifying the underlying model in a set of data contaminated by noise and outliers is a fundamental task in computer vision. The cost function associated with such tasks is often highly complex; hence, in most cases only an approximate solution is obtained by evaluating the cost function on discrete locations in the parameter (hypothesis) space. To be successful, at least one hypothesis has to be in the vicinity of the solution. Due to noise, hypotheses generated by minimal subsets can be far from the underlying model, even when the samples are from the said structure. In this paper we investigate the feasibility of using higher than minimal subset sampling for hypothesis generation. Our empirical studies showed that increasing the sample size beyond the minimal size (p), in particular up to p+2, will significantly increase the probability of generating a hypothesis closer to the true model when subsets are selected from inliers. On the other hand, the probability of selecting an all-inlier sample rapidly decreases with the sample size, making direct extension of existing methods infeasible. Hence, we propose a new computationally tractable method for robust model fitting that uses higher than minimal subsets. Here, one starts from an arbitrary hypothesis (which does not need to be in the vicinity of the solution) and moves until either a structure in the data is found or the process is re-initialized. The method also has the ability to identify when the algorithm has reached a hypothesis with adequate accuracy and stops appropriately, thereby saving computational time. The experimental analysis carried out using synthetic and real data shows that the proposed method is both accurate and efficient compared to the state-of-the-art robust model fitting techniques.
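
The tradeoff described in this abstract can be illustrated with a short calculation: when a subset is drawn uniformly at random without replacement, the chance that every sampled point is an inlier shrinks as the subset grows. The sketch below computes that probability for sizes p, p+1 and p+2. The function name and the numbers (100 points, 50 inliers, p = 2 as for a 2D line) are assumptions chosen for illustration and do not come from the paper.

```python
from math import comb

def all_inlier_probability(n_points, n_inliers, sample_size):
    """Probability that a uniformly drawn subset contains only inliers."""
    return comb(n_inliers, sample_size) / comb(n_points, sample_size)

# Hypothetical setting: 100 points, half of them inliers, line model (p = 2).
p = 2
probabilities = {
    size: all_inlier_probability(100, 50, size) for size in (p, p + 1, p + 2)
}
for size, prob in probabilities.items():
    print(f"sample size {size}: P(all inliers) = {prob:.4f}")
```

Each additional point roughly halves the probability at this inlier ratio, which is why random sampling with larger subsets becomes expensive and a guided search is attractive.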

10.
ScientificWorldJournal ; 2013: 878417, 2013.
Article in English | MEDLINE | ID: mdl-24348191

ABSTRACT

Motion segmentation is an important task in computer vision and several practical approaches have already been developed. A common approach to motion segmentation is to use the optical flow and formulate the segmentation problem using a linear approximation of the brightness constancy constraints. Although there are numerous solutions to this problem and their accuracies and reliabilities have been studied, the exact definition of the segmentation problem, its theoretical feasibility and the conditions for successful motion segmentation are yet to be derived. This paper presents a simplified theoretical framework for predicting the feasibility of segmentation of a two-dimensional linear equation system. A statistical definition of a separable motion (structure) is presented and a relatively straightforward criterion for predicting the separability of two different motions in this framework is derived. The applicability of the proposed criterion for predicting the existence of multiple motions in practice is examined using both synthetic and real image sequences. The prescribed separability criterion is useful in designing computer vision applications as it is solely based on the amount of relative motion and the scale of measurement noise.


Subject(s)
Models, Theoretical , Motion , Algorithms , Computer Simulation
11.
ISA Trans ; 41(3): 283-301, 2002 Jul.
Article in English | MEDLINE | ID: mdl-12160343

ABSTRACT

In any autonomous mobile robot, one of the most important issues to be designed and implemented is environment perception. In this paper, a new approach is formulated to perform sensory data integration for generation of an occupancy grid map of the environment. This method is an extended version of the Bayesian fusion method for independent sources of information. The performance of the proposed fusion method and its sensitivity are discussed. Map building for a cylindrical robot with eight ultrasonic sensors has been tested in simulation, and mapping for a Khepera robot has been implemented experimentally. A new neural structure is introduced for conversion of the proximity data given by the Khepera IR sensors to occupancy probabilities. Path planning experiments have also been performed on the resulting maps. For each map, two factors are considered and calculated: the fitness and the augmented occupancy of the map with respect to the ideal map. The length and the least distance to obstacles were the other two factors calculated for the routes produced by the path planning experiments. Experimental and simulation results show that more informative maps of the environment are obtained by using the new fusion formulas. With these maps, more appropriate routes can be obtained. There is a tradeoff between the length of the resulting routes and their safety, and by choosing the proper fusion function this tradeoff can be suitably tuned for different map building applications.


Subject(s)
Algorithms , Computer Simulation , Feedback , Models, Statistical , Robotics/instrumentation , Robotics/methods , Bayes Theorem , Fuzzy Logic , Motion , Neural Networks, Computer , Transducers
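
For context on the baseline that this abstract extends, the sketch below shows the standard Bayesian fusion of two independent occupancy estimates for the same grid cells, assuming a uniform prior of 0.5. It is not the paper's extended fusion formula, which the abstract does not state. The function name `bayes_fuse`, the clipping constant, and the sample grids are assumptions made for illustration.

```python
import numpy as np

def bayes_fuse(p_a, p_b, eps=1e-9):
    """Fuse two independent occupancy probabilities per cell (uniform prior)."""
    # Clip away exact 0 and 1 so contradictory certain readings do not give 0/0.
    p_a = np.clip(np.asarray(p_a, dtype=float), eps, 1.0 - eps)
    p_b = np.clip(np.asarray(p_b, dtype=float), eps, 1.0 - eps)
    occupied = p_a * p_b                  # evidence that the cell is occupied
    free = (1.0 - p_a) * (1.0 - p_b)      # evidence that the cell is free
    return occupied / (occupied + free)

# Hypothetical 2x2 occupancy grids from two sensors.
sonar_map = np.array([[0.5, 0.8], [0.2, 0.9]])
ir_map = np.array([[0.7, 0.8], [0.3, 0.4]])
fused_map = bayes_fuse(sonar_map, ir_map)
print(fused_map)
```

A reading of 0.5 carries no information under this rule, so fusing it with any other estimate returns that estimate unchanged, while two agreeing readings reinforce each other.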