Results 1 - 8 of 8
1.
IEEE Trans Pattern Anal Mach Intell ; 43(1): 1-16, 2021 Jan.
Article in English | MEDLINE | ID: mdl-31331880

ABSTRACT

Many real-world video sequences cannot be conveniently categorized as general or degenerate; in such cases, imposing a false dichotomy in using the fundamental matrix or homography model for motion segmentation on video sequences would lead to difficulty. Even when we are confronted with a general scene-motion, the fundamental matrix approach as a model for motion segmentation still suffers from several defects, which we discuss in this paper. The full potential of the fundamental matrix approach could only be realized if we judiciously harness information from the simpler homography model. From these considerations, we propose a multi-model spectral clustering framework that synergistically combines multiple models (homography and fundamental matrix) together. We show that the performance can be substantially improved in this way. For general motion segmentation tasks, the number of independently moving objects is often unknown a priori and needs to be estimated from the observations. This is referred to as model selection and it is essentially still an open research problem. In this work, we propose a set of model selection criteria balancing data fidelity and model complexity. We perform extensive testing on existing motion segmentation datasets with both segmentation and model selection tasks, achieving state-of-the-art performance on all of them; we also put forth a more realistic and challenging dataset adapted from the KITTI benchmark, containing real-world effects such as strong perspectives and strong forward translations not seen in the traditional datasets.

2.
Nat Commun ; 10(1): 4995, 2019 Nov 01.
Article in English | MEDLINE | ID: mdl-31676790

ABSTRACT

Maintenance of working memory is thought to involve the activity of prefrontal neuronal populations with strong recurrent connections. However, it was recently shown that distractors evoke a morphing of the prefrontal population code, even when memories are maintained throughout the delay. How can a morphing code maintain time-invariant memory information? We hypothesized that dynamic prefrontal activity contains time-invariant memory information within a subspace of neural activity. Using an optimization algorithm, we found a low-dimensional subspace that contains time-invariant memory information. This information was reduced in trials where the animals made errors in the task, and was also found in periods of the trial not used to find the subspace. A bump attractor model replicated these properties, and provided predictions that were confirmed in the neural data. Our results suggest that the high-dimensional responses of prefrontal cortex contain subspaces where different types of information can be simultaneously encoded with minimal interference.


Subject(s)
Macaca fascicularis/physiology , Memory, Short-Term/physiology , Neurons/physiology , Prefrontal Cortex/physiology , Algorithms , Animals , Male , Models, Neurological , Prefrontal Cortex/cytology , Time Factors
3.
IEEE Trans Pattern Anal Mach Intell ; 40(8): 1964-1978, 2018 Aug.
Article in English | MEDLINE | ID: mdl-28809676

ABSTRACT

While clustering has been well studied in the past decade, model selection has drawn much less attention due to the difficulty of the problem. In this paper, we address both problems in a joint manner by recovering an ideal affinity tensor from an imperfect input. By taking into account the relationship of the affinities induced by the cluster structures, we are able to significantly improve the affinity input, such as repairing those entries corrupted by gross outliers. More importantly, the recovered ideal affinity tensor also directly indicates the number of clusters and their membership, thus solving the model selection and clustering jointly. To enforce the requisite global consistency in the affinities demanded by the cluster structure, we impose a number of constraints: among others, the tensor should be low rank and sparse, and it should obey what we call the rank-1 sum constraint. To solve this highly non-smooth and non-convex problem, we exploit the mathematical structures, and express the original problem in an equivalent form amenable to numerical optimization and convergence analysis. To scale to large problem sizes, we also propose an alternative formulation, so that those problems can be efficiently solved via stochastic optimization in an online fashion. We evaluate our algorithm with different applications to demonstrate its superiority, and show it can adapt to a large variety of settings.

4.
IEEE Trans Pattern Anal Mach Intell ; 36(10): 1975-87, 2014 Oct.
Article in English | MEDLINE | ID: mdl-26352629

ABSTRACT

Recent evaluations [2], [13] of representative background subtraction techniques demonstrated that there are still considerable challenges facing these methods. Challenges in realistic environments include illumination change causing complex intensity variation, background motions (trees, waves, etc.) whose magnitude can be greater than that of the foreground, poor image quality under low light, camouflage, etc. Existing methods often handle only part of these challenges; we address all these challenges in a unified framework which makes few specific assumptions about the background. We regard the observed image sequence as being made up of the sum of a low-rank background matrix and a sparse outlier matrix and solve the decomposition using the Robust Principal Component Analysis method. Our contribution lies in dynamically estimating the support of the foreground regions via a motion saliency estimation step, so as to impose spatial coherence on these regions. Unlike smoothness constraints such as MRF, our method is able to obtain crisply defined foreground regions, and in general, handles large dynamic background motion much better. Furthermore, we also introduce an image alignment step to handle camera jitter. Extensive experiments on benchmark and additional challenging data sets demonstrate that our method works effectively on a wide range of complex scenarios and significantly outperforms many state-of-the-art approaches.
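
The low-rank plus sparse decomposition mentioned in this abstract is commonly written in the following textbook form (principal component pursuit). This is a sketch only: the symbols D, L, S and the weight \lambda are assumptions introduced here, and the abstract does not state the paper's exact objective, which additionally estimates the foreground support from motion saliency.

```latex
% D: observed frames stacked as columns (assumed notation)
% L: low-rank background, S: sparse foreground/outliers
\min_{L,\,S}\; \|L\|_{*} + \lambda \,\|S\|_{1}
\quad \text{subject to} \quad D = L + S
```

Here \|L\|_{*} is the nuclear norm (sum of singular values), which acts as a convex surrogate for rank, and \|S\|_{1} is the entrywise l1 norm, which encourages sparsity.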

5.
IEEE Trans Pattern Anal Mach Intell ; 34(4): 639-53, 2012 Apr.
Article in English | MEDLINE | ID: mdl-22383341

ABSTRACT

Attention is an integral part of the human visual system and has been widely studied in the visual attention literature. The human eyes fixate at important locations in the scene, and every fixation point lies inside a particular region of arbitrary shape and size, which can either be an entire object or a part of it. Using that fixation point as an identification marker on the object, we propose a method to segment the object of interest by finding the "optimal" closed contour around the fixation point in the polar space, avoiding the perennial problem of scale in the Cartesian space. The proposed segmentation process is carried out in two separate steps: First, all visual cues are combined to generate the probabilistic boundary edge map of the scene; second, in this edge map, the "optimal" closed contour around a given fixation point is found. Having two separate steps also makes it possible to establish a simple feedback between the mid-level cue (regions) and the low-level visual cues (edges). In fact, we propose a segmentation refinement process based on such a feedback process. Finally, our experiments show the promise of the proposed method as an automatic segmentation framework for a general purpose visual system.


Subject(s)
Algorithms , Eye , Image Processing, Computer-Assisted/methods , Vision, Ocular/physiology , Cues , Form Perception , Humans
6.
J Opt Soc Am A Opt Image Sci Vis ; 24(6): 1485-500, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17491617

ABSTRACT

Cinema viewed from a location other than a canonical viewing point (CVP) presents distortions to the viewer in both its static and its dynamic aspects. Past works have investigated mainly the static aspect of this problem and attempted to explain why viewers still seem to perceive the scene very well. The dynamic aspect of depth perception, which is known as structure from motion, and its possible distortion, have not been well investigated. We derive the dynamic depth cues perceived by the viewer and use the so-called isodistortion framework to understand its distortion. The result is that viewers seated at a reasonably central position experience a shift in the intrinsic parameters of their visual systems. Despite this shift, the key properties of the perceived depths remain largely the same, being determined in the main by the accuracy to which extrinsic motion parameters can be recovered. For a viewer seated at a noncentral position and watching the movie screen at a slant angle, the view is related to the view at the CVP by a homography, resulting in various aberrations such as noncentral projection.
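
As a minimal illustration of what "related by a homography" means in the abstract above, the sketch below maps an image point through a 3x3 projective transformation. The function name `apply_homography` and the matrix `H_SLANT` are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def apply_homography(H, point):
    """Map a 2D point through a 3x3 homography H (nested lists, row-major)."""
    x, y = point
    # Lift (x, y) to homogeneous coordinates (x, y, 1) and multiply by H.
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the third coordinate to return to the image plane.
    return (xh / w, yh / w)


# Hypothetical matrix (an assumption, not from the paper): the nonzero
# entry in the last row makes the mapping projective rather than affine,
# so the scaling applied to a point depends on its x coordinate.
H_SLANT = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 1.0],
]

mapped = apply_homography(H_SLANT, (2.0, 2.0))
```

Points with larger x are divided by a larger third coordinate, which is the kind of position-dependent foreshortening a viewer at a slant angle would see.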

7.
Vision Res ; 42(16): 1991-2003, 2002 Jul.
Article in English | MEDLINE | ID: mdl-12160571

ABSTRACT

We investigated the ability of monocular human observers to scale absolute distance during sagittal head motion in the presence of pure optic flow information. Subjects were presented with computer-generated spheres (covered with randomly distributed dots) placed at eye level at several distances. We compared the condition of self-motion (SM) versus object-motion (OM) using equivalent optic flow fields. When the amplitude of head movement was relatively constant, subjects estimated absolute distance rather accurately in both the SM and OM conditions. However, when the amplitude changed on a trial-to-trial basis, subjects' performance deteriorated only in the OM condition. We found that distance judgment in the OM condition correlated strongly with optic flow divergence, and that non-visual cues served as important factors for scaling distances in the SM condition. Absolute distance also seemed to be better scaled with sagittal head movement when compared with lateral head translation.
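
To see why optic flow divergence alone cannot fix absolute distance, consider a simplified case that is an assumption made here, not the experimental setup of the paper: an observer approaches a frontoparallel surface at distance Z with speed V along the line of sight, with image coordinates x = X/Z, y = Y/Z (unit focal length).

```latex
% Assumed setup: approach speed V, so \dot{Z} = -V
u = \dot{x} = \frac{V}{Z}\,x, \qquad
v = \dot{y} = \frac{V}{Z}\,y
\quad\Longrightarrow\quad
\nabla \cdot (u, v) = \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = \frac{2V}{Z}
```

Divergence therefore specifies only the ratio V/Z. Recovering Z requires an independent estimate of V, which is consistent with the abstract's finding that non-visual cues matter for scaling distance in the self-motion condition.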


Subject(s)
Cues , Depth Perception/physiology , Head Movements , Vision, Monocular , Adolescent , Adult , Humans , Kinesthesis , Psychophysics
8.
IEEE Trans Image Process ; 11(10): 1179-91, 2002.
Article in English | MEDLINE | ID: mdl-18249690

ABSTRACT

Temporal texture accounts for a large proportion of motion commonly experienced in the visual world. Current temporal texture techniques extract primarily motion-based features for recognition. We propose a representation where both the spatial and the temporal aspects of texture are coupled together. Such a representation has the advantages of improving efficiency as well as retaining both spatial and temporal semantics. Flow measurements form the basis of our representation. The magnitudes and directions of the normal flow are mapped as spatiotemporal textures. These textures are then aggregated over time and are subsequently analyzed by classical texture analysis tools. Such aggregation traces the history of a motion which can be useful in the understanding of motion types. By providing a spatiotemporal analysis, our approach gains several advantages over previous implementations. The strength of our approach was demonstrated in a series of experiments, including classification and comparisons with other algorithms.
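
The abstract above maps the magnitudes and directions of the normal flow into textures. As a rough sketch of how those two quantities can be obtained at a single pixel, the code below uses the standard brightness-constancy relation I_x*u + I_y*v + I_t = 0, from which only the flow component along the image gradient is recoverable. The function name `normal_flow` and its parameters are hypothetical, and the paper's actual estimator may differ.

```python
import math


def normal_flow(ix, iy, it, eps=1e-9):
    """Return (magnitude, direction in radians) of normal flow at one pixel.

    ix, iy: spatial image derivatives; it: temporal derivative.
    Returns None where the gradient is too weak to define a direction.
    """
    grad_norm = math.hypot(ix, iy)
    if grad_norm < eps:
        return None
    # Signed speed along the unit gradient direction.
    signed_speed = -it / grad_norm
    if signed_speed >= 0:
        return (signed_speed, math.atan2(iy, ix))
    # Motion against the gradient: report a positive magnitude and
    # flip the direction by 180 degrees.
    return (-signed_speed, math.atan2(-iy, -ix))
```

For example, `normal_flow(3, 4, -10)` gives a magnitude of 2.0 along the gradient direction `atan2(4, 3)`. Collecting the two returned values over all pixels would yield the per-frame magnitude and direction maps that the abstract describes as spatiotemporal textures.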
