Results 1 - 13 of 13
1.
IEEE Trans Pattern Anal Mach Intell ; 44(1): 196-210, 2022 Jan.
Article in English | MEDLINE | ID: mdl-32750796

ABSTRACT

In this paper, we propose a novel approach to two-view minimal-case relative pose problems based on homography with known gravity direction. This case is relevant to smartphones, tablets, and other camera-IMU (inertial measurement unit) systems which have accelerometers to measure the gravity vector. We explore the rank-1 constraint on the difference between the Euclidean homography matrix and the corresponding rotation, and propose an efficient two-step solution for solving both the calibrated and semi-calibrated (unknown focal length) problems. Based on the hidden variable technique, we convert the problems to polynomial eigenvalue problems, and derive new 3.5-point, 3.5-point, and 4-point solvers for two cameras such that the two focal lengths are unknown but equal, one of them is unknown, and both are unknown and possibly different, respectively. We present detailed analyses and comparisons with the existing 6- and 7-point solvers, including results with smartphone images.
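
The rank-1 constraint mentioned in this abstract can be checked numerically with a small sketch. It assumes the common plane-induced form H = R + (1/d) t nᵀ (sign conventions differ between texts) and takes the y-axis as the gravity-aligned axis; the function names and numeric values are hypothetical and are not taken from the paper.

```python
import math

def rot_y(theta):
    """Rotation about the y-axis (assumed here to be the gravity-aligned axis)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def euclidean_homography(R, t, n, d):
    """H = R + (1/d) * t * n^T for a plane with unit normal n at distance d."""
    return [[R[i][j] + t[i] * n[j] / d for j in range(3)] for i in range(3)]

def max_abs_2x2_minor(M):
    """Largest |2x2 minor| of a 3x3 matrix; it is zero iff rank(M) <= 1."""
    best = 0.0
    for i1 in range(3):
        for i2 in range(i1 + 1, 3):
            for j1 in range(3):
                for j2 in range(j1 + 1, 3):
                    minor = M[i1][j1] * M[i2][j2] - M[i1][j2] * M[i2][j1]
                    best = max(best, abs(minor))
    return best

# Illustrative values only.
R = rot_y(0.3)
t = [0.2, -0.1, 0.5]
n = [0.0, 0.6, 0.8]   # unit plane normal
d = 2.0
H = euclidean_homography(R, t, n, d)

# H - R equals (1/d) * t * n^T, an outer product, so all its 2x2 minors vanish.
diff = [[H[i][j] - R[i][j] for j in range(3)] for i in range(3)]
rank1_residual = max_abs_2x2_minor(diff)
full_residual = max_abs_2x2_minor(H)   # H itself is not rank 1
```

This only illustrates the constraint itself; the solvers described in the abstract (hidden variable technique, polynomial eigenvalue problems) are not reproduced here.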

2.
IEEE Trans Pattern Anal Mach Intell ; 44(3): 1399-1414, 2022 Mar.
Article in English | MEDLINE | ID: mdl-32750842

ABSTRACT

We address the problem of semantic correspondence, that is, establishing a dense flow field between images depicting different instances of the same object or scene category. We propose to use images annotated with binary foreground masks and subjected to synthetic geometric deformations to train a convolutional neural network (CNN) for this task. Using these masks as part of the supervisory signal provides an object-level prior for the semantic correspondence task and offers a good compromise between semantic flow methods, where the amount of training data is limited by the cost of manually selecting point correspondences, and semantic alignment ones, where the regression of a single global geometric transformation between images may be sensitive to image-specific details such as background clutter. We propose a new CNN architecture, dubbed SFNet, which implements this idea. It leverages a new and differentiable version of the argmax function for end-to-end training, with a loss that combines mask and flow consistency with smoothness terms. Experimental results demonstrate the effectiveness of our approach, which significantly outperforms the state of the art on standard benchmarks.
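
As a rough illustration of what a differentiable stand-in for argmax can look like, the sketch below uses the standard softmax-weighted average of indices (often called soft-argmax). This is a generic technique and not necessarily the variant used in SFNet; the function name and the temperature parameter `beta` are assumptions.

```python
import math

def soft_argmax_1d(scores, beta=10.0):
    """Softmax-weighted average of indices; approaches argmax as beta grows."""
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(beta * (s - m)) for s in scores]
    total = sum(weights)
    return sum(i * w for i, w in enumerate(weights)) / total

# Illustrative matching scores for four candidate positions.
scores = [0.1, 0.2, 3.0, 0.3]
soft_idx = soft_argmax_1d(scores, beta=10.0)
hard_idx = max(range(len(scores)), key=scores.__getitem__)
```

Unlike the hard argmax, the soft version is a smooth function of the scores, so gradients can flow through it during end-to-end training.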

3.
IEEE Trans Image Process ; 27(5): 2176-2188, 2018 May.
Article in English | MEDLINE | ID: mdl-29432099

ABSTRACT

In this paper, we propose a vanishing-point constrained Dijkstra road model for road detection in a stereo-vision paradigm. First, the stereo camera is used to generate the u- and v-disparity maps of the road image, from which the horizon can be extracted. With the horizon and ground region constraints, we can robustly locate the vanishing point of the road region. Second, a weighted graph is constructed using all pixels of the image, and the detected vanishing point is treated as the source node of the graph. By computing a vanishing-point constrained Dijkstra minimum-cost map, where both the disparity and the gradient of the gray image are used to calculate the cost between two neighboring pixels, the problem of detecting road borders in the image is transformed into that of finding two shortest paths that originate from the vanishing point and end at two pixels in the last row of the image. The proposed approach has been implemented and tested over 2600 grayscale images of different road scenes in the KITTI data set. The experimental results demonstrate that this training-free approach can detect the horizon, vanishing point, and road regions very accurately and robustly.
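
The minimum-cost map idea can be sketched with a plain Dijkstra search on a pixel grid whose source is the vanishing point. The cost grid below is hypothetical; the paper derives its costs from disparity and image gradient, which this sketch does not attempt to reproduce, and the 4-connected neighborhood is an assumption.

```python
import heapq

def dijkstra_cost_map(cost, source):
    """Minimum cumulative cost from `source` to every pixel on a 4-connected grid.

    cost[r][c] is a hypothetical non-negative cost of stepping onto pixel (r, c).
    Returns (dist, parent) so that paths can be traced back to the source.
    """
    rows, cols = len(cost), len(cost[0])
    dist = [[float("inf")] * cols for _ in range(rows)]
    parent = {}
    dist[source[0]][source[1]] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist[r][c]:
            continue  # stale heap entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + cost[nr][nc]
                if nd < dist[nr][nc]:
                    dist[nr][nc] = nd
                    parent[(nr, nc)] = (r, c)
                    heapq.heappush(heap, (nd, (nr, nc)))
    return dist, parent

def trace_path(parent, source, target):
    """Follow parent links from target back to source, then reverse."""
    path = [target]
    while path[-1] != source:
        path.append(parent[path[-1]])
    return path[::-1]

# Toy cost grid: low values mark cheap pixels, 9 marks expensive ones.
cost = [
    [1, 1, 1, 1, 1],
    [9, 1, 9, 1, 9],
    [1, 1, 9, 1, 1],
    [1, 9, 9, 9, 1],
]
vanishing_point = (0, 2)
dist, parent = dijkstra_cost_map(cost, vanishing_point)
left_path = trace_path(parent, vanishing_point, (3, 0))
right_path = trace_path(parent, vanishing_point, (3, 4))
```

Here the two traced paths end at two pixels in the last row, mirroring the two-shortest-paths formulation described in the abstract.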

4.
IEEE Trans Pattern Anal Mach Intell ; 40(1): 192-207, 2018 Jan.
Article in English | MEDLINE | ID: mdl-28212077

ABSTRACT

Filtering images using a guidance signal, a process called guided or joint image filtering, has been used in various tasks in computer vision and computational photography, particularly for noise reduction and joint upsampling. This uses an additional guidance signal as a structure prior, and transfers the structure of the guidance signal to an input image, restoring noisy or altered image structure. The main drawbacks of such a data-dependent framework are that it does not consider structural differences between guidance and input images, and that it is not robust to outliers. We propose a novel SD (for static/dynamic) filter to address these problems in a unified framework, and jointly leverage structural information from guidance and input images. Guided image filtering is formulated as a nonconvex optimization problem, which is solved by the majorize-minimization algorithm. The proposed algorithm converges quickly while guaranteeing a local minimum. The SD filter effectively controls the underlying image structure at different scales, and can handle a variety of types of data from different sensors. It is robust to outliers and other artifacts such as gradient reversal and global intensity shift, and has good edge-preserving smoothing properties. We demonstrate the flexibility and effectiveness of the proposed SD filter in a variety of applications, including depth upsampling, scale-space filtering, texture removal, flash/non-flash denoising, and RGB/NIR denoising.

5.
IEEE Trans Pattern Anal Mach Intell ; 40(7): 1711-1725, 2018 Jul.
Article in English | MEDLINE | ID: mdl-28708543

ABSTRACT

Finding image correspondences remains a challenging problem in the presence of intra-class variations and large changes in scene layout. Semantic flow methods are designed to handle images depicting different instances of the same object or scene category. We introduce a novel approach to semantic flow, dubbed proposal flow, that establishes reliable correspondences using object proposals. Unlike prevailing semantic flow approaches that operate on pixels or regularly sampled local regions, proposal flow benefits from the characteristics of modern object proposals, that exhibit high repeatability at multiple scales, and can take advantage of both local and geometric consistency constraints among proposals. We also show that the corresponding sparse proposal flow can effectively be transformed into a conventional dense flow field. We introduce two new challenging datasets that can be used to evaluate both general semantic flow techniques and region-based approaches such as proposal flow. We use these benchmarks to compare different matching algorithms, object proposals, and region features within proposal flow, to the state of the art in semantic flow. This comparison, along with experiments on standard datasets, demonstrates that proposal flow significantly outperforms existing semantic flow methods in various settings.

6.
IEEE Trans Pattern Anal Mach Intell ; 34(4): 791-804, 2012 Apr.
Article in English | MEDLINE | ID: mdl-21808090

ABSTRACT

Modeling data with linear combinations of a few elements from a learned dictionary has been the focus of much recent research in machine learning, neuroscience, and signal processing. For signals such as natural images that admit such sparse representations, it is now well established that these models are well suited to restoration tasks. In this context, learning the dictionary amounts to solving a large-scale matrix factorization problem, which can be done efficiently with classical optimization tools. The same approach has also been used for learning features from data for other purposes, e.g., image classification, but tuning the dictionary in a supervised way for these tasks has proven to be more difficult. In this paper, we present a general formulation for supervised dictionary learning adapted to a wide variety of tasks, and present an efficient algorithm for solving the corresponding optimization problem. Experiments on handwritten digit classification, digital art identification, nonlinear inverse image problems, and compressed sensing demonstrate that our approach is effective in large-scale settings, and is well suited to supervised and semi-supervised classification, as well as regression tasks for data that admit sparse representations.


Subject(s)
Algorithms; Pattern Recognition, Automated/methods; Databases, Factual; Humans
7.
IEEE Trans Pattern Anal Mach Intell ; 33(12): 2383-95, 2011 Dec.
Article in English | MEDLINE | ID: mdl-21646677

ABSTRACT

This paper addresses the problem of establishing correspondences between two sets of visual features using higher order constraints instead of the unary or pairwise ones used in classical methods. Concretely, the corresponding hypergraph matching problem is formulated as the maximization of a multilinear objective function over all permutations of the features. This function is defined by a tensor representing the affinity between feature tuples. It is maximized using a generalization of spectral techniques where a relaxed problem is first solved by a multidimensional power method and the solution is then projected onto the closest assignment matrix. The proposed approach has been implemented, and it is compared to state-of-the-art algorithms on both synthetic and real data.
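
The "multidimensional power method" step can be illustrated with a generic power iteration on a third-order tensor. This is a minimal sketch under simplifying assumptions: a tiny synthetic rank-1 affinity tensor, unit-norm normalization, and no projection onto an assignment matrix. It is not the paper's exact algorithm, and all names are hypothetical.

```python
import math

def tensor_power_iteration(T, v, iters=20):
    """Iterate v_i <- sum_{j,k} T[i][j][k] * v[j] * v[k], renormalizing each step."""
    n = len(v)
    for _ in range(iters):
        w = [
            sum(T[i][j][k] * v[j] * v[k] for j in range(n) for k in range(n))
            for i in range(n)
        ]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Synthetic rank-1 tensor T = u (x) u (x) u built from a unit vector u.
u = [0.6, 0.0, 0.8]
T = [[[u[i] * u[j] * u[k] for k in range(3)] for j in range(3)] for i in range(3)]

v0 = [1.0 / math.sqrt(3.0)] * 3   # uniform starting point
v = tensor_power_iteration(T, v0)
```

For this rank-1 tensor the iteration recovers `u`; in a matching setting the entries of the tensor would encode affinities between feature tuples, and the relaxed solution would then be projected onto the closest assignment matrix, as the abstract describes.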

8.
IEEE Trans Pattern Anal Mach Intell ; 32(8): 1362-76, 2010 Aug.
Article in English | MEDLINE | ID: mdl-20558871

ABSTRACT

This paper proposes a novel algorithm for multiview stereopsis that outputs a dense set of small rectangular patches covering the surfaces visible in the images. Stereopsis is implemented as a match, expand, and filter procedure, starting from a sparse set of matched keypoints, and repeatedly expanding these before using visibility constraints to filter away false matches. The keys to the performance of the proposed algorithm are effective techniques for enforcing local photometric consistency and global visibility constraints. Simple but effective methods are also proposed to turn the resulting patch model into a mesh which can be further refined by an algorithm that enforces both photometric consistency and regularization constraints. The proposed approach automatically detects and discards outliers and obstacles and does not require any initialization in the form of a visual hull, a bounding box, or valid depth ranges. We have tested our algorithm on various data sets including objects with fine surface details, deep concavities, and thin structures, outdoor scenes observed from a restricted set of viewpoints, and "crowded" scenes where moving obstacles appear in front of a static structure of interest. A quantitative evaluation on the Middlebury benchmark shows that the proposed method outperforms all others submitted so far for four out of the six data sets.

9.
IEEE Trans Image Process ; 19(8): 2211-20, 2010 Aug.
Article in English | MEDLINE | ID: mdl-20371404

ABSTRACT

Given a single image of an arbitrary road, that may not be well-paved, or have clearly delineated edges, or some a priori known color or texture distribution, is it possible for a computer to find this road? This paper addresses this question by decomposing the road detection process into two steps: the estimation of the vanishing point associated with the main (straight) part of the road, followed by the segmentation of the corresponding road area based upon the detected vanishing point. The main technical contributions of the proposed approach are a novel adaptive soft voting scheme based upon a local voting region using high-confidence voters, whose texture orientations are computed using Gabor filters, and a new vanishing-point-constrained edge detection technique for detecting road boundaries. The proposed method has been implemented, and experiments with 1003 general road images demonstrate that it is effective at detecting road regions in challenging conditions.


Subject(s)
Algorithms; Image Enhancement/methods; Image Interpretation, Computer-Assisted/methods; Pattern Recognition, Automated/methods; Transportation; Reproducibility of Results; Sensitivity and Specificity
10.
IEEE Trans Image Process ; 19(8): 2201-10, 2010 Aug.
Article in English | MEDLINE | ID: mdl-20371405

ABSTRACT

This paper presents a novel framework for detecting nonflat abandoned objects by matching reference and target video sequences. The reference video is taken by a moving camera when there is no suspicious object in the scene. The target video is taken by a camera following the same route and may contain extra objects. The objective is to find these objects. GPS information is used to roughly align the two videos and find the corresponding frame pairs. Based upon the GPS alignment, four simple but effective ideas are proposed to achieve the objective: an intersequence geometric alignment based upon homographies, which is computed by a modified RANSAC, to find all possible suspicious areas; an intrasequence geometric alignment to remove false alarms caused by high objects; a local appearance comparison between two aligned intrasequence frames to remove false alarms in flat areas; and a temporal filtering step to confirm the existence of suspicious objects. Experiments on fifteen pairs of videos show the promise of the proposed method.


Subject(s)
Algorithms; Image Enhancement/methods; Image Interpretation, Computer-Assisted/methods; Pattern Recognition, Automated/methods; Security Measures; Video Recording/methods; Motion; Reproducibility of Results; Sensitivity and Specificity
11.
IEEE Trans Pattern Anal Mach Intell ; 29(3): 477-91, 2007 Mar.
Article in English | MEDLINE | ID: mdl-17224617

ABSTRACT

This paper presents a novel representation for dynamic scenes composed of multiple rigid objects that may undergo different motions and are observed by a moving camera. Multiview constraints associated with groups of affine-covariant scene patches and a normalized description of their appearance are used to segment a scene into its rigid components, construct three-dimensional models of these components, and match instances of models recovered from different image sequences. The proposed approach has been applied to the detection and matching of moving objects in video sequences and to shot matching, i.e., the identification of shots that depict the same scene in a video clip.


Subject(s)
Algorithms; Artificial Intelligence; Image Enhancement/methods; Image Interpretation, Computer-Assisted/methods; Models, Theoretical; Pattern Recognition, Automated/methods; Video Recording/methods; Computer Simulation; Information Storage and Retrieval/methods; Motion; Movement; Reproducibility of Results; Sensitivity and Specificity; Subtraction Technique
12.
IEEE Trans Pattern Anal Mach Intell ; 28(2): 302-15, 2006 Feb.
Article in English | MEDLINE | ID: mdl-16468625

ABSTRACT

This paper addresses the problem of estimating the motion of a camera as it observes the outline (or apparent contour) of a solid bounded by a smooth surface in successive image frames. In this context, the surface points that project onto the outline of an object depend on the viewpoint and the only true correspondences between two outlines of the same object are the projections of frontier points where the viewing rays intersect in the tangent plane of the surface. In turn, the epipolar geometry is easily estimated once these correspondences have been identified. Given the apparent contours detected in an image sequence, a robust procedure based on RANSAC and a voting strategy is proposed to simultaneously estimate the camera configurations and a consistent set of frontier point projections by enforcing the redundancy of multiview epipolar geometry. The proposed approach is, in principle, applicable to orthographic, weak-perspective, and affine projection models. Experiments with nine real image sequences are presented for the orthographic projection case, including a quantitative comparison with the ground-truth data for the six data sets for which the latter information is available. Sample visual hulls have been computed from all image sequences for qualitative evaluation.


Subject(s)
Algorithms; Artificial Intelligence; Image Enhancement/methods; Image Interpretation, Computer-Assisted/methods; Movement; Pattern Recognition, Automated/methods; Video Recording/methods; Information Storage and Retrieval/methods; Motion; Subtraction Technique
13.
IEEE Trans Pattern Anal Mach Intell ; 27(8): 1265-78, 2005 Aug.
Article in English | MEDLINE | ID: mdl-16119265

ABSTRACT

This paper introduces a texture representation suitable for recognizing images of textured surfaces under a wide range of transformations, including viewpoint changes and nonrigid deformations. At the feature extraction stage, a sparse set of affine Harris and Laplacian regions is found in the image. Each of these regions can be thought of as a texture element having a characteristic elliptic shape and a distinctive appearance pattern. This pattern is captured in an affine-invariant fashion via a process of shape normalization followed by the computation of two novel descriptors, the spin image and the RIFT descriptor. When affine invariance is not required, the original elliptical shape serves as an additional discriminative feature for texture recognition. The proposed approach is evaluated in retrieval and classification tasks using the entire Brodatz database and a publicly available collection of 1,000 photographs of textured surfaces taken from different viewpoints.


Subject(s)
Algorithms; Artificial Intelligence; Image Interpretation, Computer-Assisted/methods; Pattern Recognition, Automated/methods; Cluster Analysis; Computer Graphics; Image Enhancement/methods; Information Storage and Retrieval/methods; Numerical Analysis, Computer-Assisted