Results 1 - 20 of 20
1.
IEEE Trans Pattern Anal Mach Intell ; 41(2): 379-393, 2019 02.
Article in English | MEDLINE | ID: mdl-29994497

ABSTRACT

We propose a method designed to push the frontiers of unconstrained face recognition in the wild with an emphasis on extreme out-of-plane pose variations. Existing methods either expect a single model to learn pose invariance by training on massive amounts of data or else normalize images by aligning faces to a single frontal pose. Contrary to these, our method is designed to explicitly tackle pose variations. Our proposed Pose-Aware Models (PAM) process a face image using several pose-specific, deep convolutional neural networks (CNN). 3D rendering is used to synthesize multiple face poses from input images to both train these models and to provide additional robustness to pose variations at test time. Our paper presents an extensive analysis of the IARPA Janus Benchmark A (IJB-A), evaluating the effects that landmark detection accuracy, CNN layer selection, and pose model selection all have on the performance of the recognition pipeline. It further provides comparative evaluations on IJB-A and the PIPA dataset. These tests show that our approach outperforms existing methods, even surprisingly matching the accuracy of methods that were specifically fine-tuned to the target dataset. Parts of this work previously appeared in [1] and [2].

2.
IEEE Trans Pattern Anal Mach Intell ; 40(12): 3067-3074, 2018 12.
Article in English | MEDLINE | ID: mdl-29990138

ABSTRACT

This paper concerns the problem of facial landmark detection. We provide a unique new analysis of the features produced at intermediate layers of a convolutional neural network (CNN) trained to regress facial landmark coordinates. This analysis shows that while being processed by the CNN, face images can be partitioned in an unsupervised manner into subsets containing faces in similar poses (i.e., 3D views) and facial properties (e.g., presence or absence of eye-wear). Based on this finding, we describe a novel CNN architecture, specialized to regress the facial landmark coordinates of faces in specific poses and appearances. To address the shortage of training data, particularly in extreme profile poses, we additionally present data augmentation techniques designed to provide sufficient training examples for each of these specialized sub-networks. The proposed Tweaked CNN (TCNN) architecture is shown to outperform existing landmark detection methods in an extensive battery of tests on the AFW, AFLW, and 300W benchmarks. Finally, to promote reproducibility of our results, we make code and trained models publicly available through our project webpage.

3.
J Med Imaging (Bellingham) ; 2(1): 014501, 2015 Jan.
Article in English | MEDLINE | ID: mdl-26158084

ABSTRACT

Geographic atrophy (GA) is a manifestation of the advanced or late stage of age-related macular degeneration (AMD). AMD is the leading cause of blindness in people over the age of 65 in the western world. The purpose of this study is to develop a fully automated supervised pixel classification approach for segmenting GA, including uni- and multifocal patches in fundus autofluorescence (FAF) images. The image features include region-wise intensity measures, gray-level co-occurrence matrix measures, and Gaussian filter banks. A k-nearest-neighbor pixel classifier is applied to obtain a GA probability map, representing the likelihood that the image pixel belongs to GA. Sixteen randomly chosen FAF images were obtained from 16 subjects with GA. The algorithm-defined GA regions are compared with manual delineation performed by a certified image reading center grader. Eight-fold cross-validation is applied to evaluate the algorithm performance. The mean overlap ratio (OR), area correlation (Pearson's r), accuracy (ACC), true positive rate (TPR), specificity (SPC), positive predictive value (PPV), and false discovery rate (FDR) between the algorithm- and manually defined GA regions are [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text], and [Formula: see text], respectively.
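
The pixel-wise measures named in this abstract can be computed from two binary masks. The following is a minimal sketch, not the authors' code: the function name `segmentation_metrics` and the flat 0/1 list representation are assumptions made for illustration, and the overlap ratio is taken here to be intersection over union, which the abstract does not spell out. The area correlation is omitted because it is computed across images rather than within one.

```python
def segmentation_metrics(pred, truth):
    """Compare two equal-length binary masks given as flat lists of 0/1.

    Hypothetical helper for illustration; returns a dict of pixel-wise scores.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")

    # Confusion counts over all pixels.
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    tn = sum(1 for p, t in zip(pred, truth) if not p and not t)

    def ratio(num, den):
        # Undefined ratios (empty denominators) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "OR": ratio(tp, tp + fp + fn),         # overlap ratio, assumed to be IoU
        "ACC": ratio(tp + tn, tp + fp + fn + tn),
        "TPR": ratio(tp, tp + fn),             # sensitivity
        "SPC": ratio(tn, tn + fp),             # specificity
        "PPV": ratio(tp, tp + fp),
        "FDR": ratio(fp, tp + fp),             # equals 1 - PPV when defined
    }


if __name__ == "__main__":
    algorithm_mask = [1, 1, 0, 0, 1, 0]
    manual_mask = [1, 0, 0, 0, 1, 1]
    for name, value in segmentation_metrics(algorithm_mask, manual_mask).items():
        print(f"{name}: {value:.3f}")
```

In a cross-validated study such as the one described, these per-image scores would then be averaged over the held-out images of each fold.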

4.
IEEE Trans Pattern Anal Mach Intell ; 36(7): 1414-27, 2014 Jul.
Article in English | MEDLINE | ID: mdl-26353312

ABSTRACT

We address the problem of structure learning of human motion in order to recognize actions from a continuous monocular motion sequence of an arbitrary person from an arbitrary viewpoint. Human motion sequences are represented by multivariate time series in the joint-trajectories space. Under this structured time series framework, we first propose Kernelized Temporal Cut (KTC), an extension of previous works on change-point detection by incorporating Hilbert space embedding of distributions, to handle the nonparametric and high dimensionality issues of human motions. Experimental results demonstrate the effectiveness of our approach, which yields realtime segmentation, and produces high action segmentation accuracy. Second, a spatio-temporal manifold framework is proposed to model the latent structure of time series data. Then an efficient spatio-temporal alignment algorithm Dynamic Manifold Warping (DMW) is proposed for multivariate time series to calculate motion similarity between action sequences (segments). Furthermore, by combining the temporal segmentation algorithm and the alignment algorithm, online human action recognition can be performed by associating a few labeled examples from motion capture data. The results on human motion capture data and 3D depth sensor data demonstrate the effectiveness of the proposed approach in automatically segmenting and recognizing motion sequences, and its ability to handle noisy and partially occluded data, in the transfer learning module.


Subjects
Three-Dimensional Imaging/methods , Movement/physiology , Automated Pattern Recognition/methods , Subtraction Technique , Video Recording/methods , Whole Body Imaging/methods , Actigraphy/methods , Algorithms , Humans , Image Enhancement/methods , Computer-Assisted Image Interpretation/methods , Motor Activity/physiology , Reproducibility of Results , Sensitivity and Specificity
5.
Invest Ophthalmol Vis Sci ; 54(13): 8375-83, 2013 Dec 30.
Article in English | MEDLINE | ID: mdl-24265015

ABSTRACT

PURPOSE: Geographic atrophy (GA) is the atrophic late-stage manifestation of age-related macular degeneration (AMD), which may result in severe vision loss and blindness. The purpose of this study was to develop a reliable, effective approach for GA segmentation in both spectral-domain optical coherence tomography (SD-OCT) and fundus autofluorescence (FAF) images using a level set-based approach and to compare the segmentation performance in the two modalities. METHODS: To identify GA regions in SD-OCT images, three retinal surfaces were first segmented in volumetric SD-OCT images using a double-surface graph search scheme. A two-dimensional (2-D) partial OCT projection image was created from the segmented choroid layer. A level set approach was applied to segment the GA in the partial OCT projection image. In addition, the algorithm was applied to FAF images for the GA segmentation. Twenty randomly chosen macular SD-OCT (Zeiss Cirrus) volumes and 20 corresponding FAF (Heidelberg Spectralis) images were obtained from 20 subjects with GA. The algorithm-defined GA region was compared with consensus manual delineation performed by certified graders. RESULTS: The mean Dice similarity coefficients (DSC) between the algorithm- and manually defined GA regions were 0.87 ± 0.09 in partial OCT projection images and 0.89 ± 0.07 in registered FAF images. The area correlations between them were 0.93 (P < 0.001) in partial OCT projection images and 0.99 (P < 0.001) in FAF images. The mean DSC between the algorithm-defined GA regions in the partial OCT projection and registered FAF images was 0.79 ± 0.12, and the area correlation was 0.96 (P < 0.001). CONCLUSIONS: A level set approach was developed to segment GA regions in both SD-OCT and FAF images. This approach demonstrated good agreement between the algorithm- and manually defined GA regions within each single modality. The GA segmentation in FAF images performed better than in partial OCT projection images. Across the two modalities, the GA segmentation presented reasonable agreement.
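
The Dice similarity coefficient reported in this abstract has the standard definition DSC = 2|A ∩ B| / (|A| + |B|) for two regions A and B. The following is a minimal sketch of that definition only; the function name `dice_coefficient`, the flat 0/1 list representation, and the convention of returning 1.0 for two empty masks are assumptions made for illustration, not details taken from the study.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient of two equal-length binary masks (0/1 lists)."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same length")

    # Pixels labeled as the region in both masks.
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    # Total region size, counted separately in each mask.
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)

    # Assumed convention: two empty masks agree perfectly.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    algorithm_mask = [1, 1, 0, 0, 1, 0]
    manual_mask = [1, 0, 0, 0, 1, 1]
    print(f"DSC = {dice_coefficient(algorithm_mask, manual_mask):.3f}")
```

Comparing masks across modalities, as the abstract does for OCT projection and FAF images, additionally requires the two images to be registered to a common pixel grid first.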


Subjects
Fluorescein Angiography , Geographic Atrophy/diagnosis , Optical Coherence Tomography , Aged , Aged 80 and over , Algorithms , Choroid/pathology , Female , Fundus Oculi , Humans , Three-Dimensional Imaging , Male , Middle Aged , Retinal Pigment Epithelium/pathology , Visual Acuity
6.
J Opt Soc Am A Opt Image Sci Vis ; 30(3): 353-66, 2013 Mar 01.
Article in English | MEDLINE | ID: mdl-23456111

ABSTRACT

Existing hierarchical techniques that decompose an image into a smooth image and high frequency components based on Gaussian filter and bilateral filter suffer from halo effects, whereas techniques based on weighted least squares extract low contrast features as details. Other techniques require multiple images and are not tolerant to noise. We use a single image to enhance sharpness based on a hierarchical framework using a modified Laplacian pyramid. In order to ensure robustness, we remove noise by using an extra level in the hierarchical framework. We use an edge-preserving nonlocal means filter and modify it to remove potential halo effects and gradient reversals. However, these effects are only reduced but not removed completely after similar modifications are made to the bilateral filter. We compare our results with existing techniques and show better decomposition and enhancement. Based on validation by human observers, we introduce a new measure to quantify sharpness quality, which allows us to automatically set parameters in order to achieve preferred sharpness enhancement. This causes blurry images to be sharpened more and sufficiently sharp images not to be sharpened. Finally, we demonstrate applications in the context of robust high dynamic range tone mapping that is better than state-of-the-art approaches and enhancement of archaeological artifacts.


Subjects
Image Enhancement/methods , Algorithms , Archaeology , Automation , Signal-to-Noise Ratio , Software , Time Factors
7.
Article in English | MEDLINE | ID: mdl-23365889

ABSTRACT

Retinal prostheses for the blind have demonstrated the ability to provide the sensation of light in otherwise blind individuals. However, visual task performance in these patients remains poor relative to someone with normal vision. Computer vision algorithms for navigation and object detection were evaluated for their ability to improve task performance. Blind subjects navigating a mobility course had fewer collisions when using a wearable camera system that guided them on a safe path. Subjects using a retinal prosthesis simulator could locate objects more quickly when an object detection algorithm assisted them. Computer vision algorithms can assist retinal prosthesis patients and low-vision patients in general.


Subjects
Algorithms , Blindness , Artificial Eye , Computer-Assisted Image Processing , Retina , Task Performance and Analysis , Visual Prostheses , Female , Humans , Computer-Assisted Image Processing/instrumentation , Computer-Assisted Image Processing/methods , Male
8.
IEEE Trans Pattern Anal Mach Intell ; 34(8): 1482-95, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22184257

ABSTRACT

We prove a closed-form solution to tensor voting (CFTV): Given a point set in any dimensions, our closed-form solution provides an exact, continuous, and efficient algorithm for computing a structure-aware tensor that simultaneously achieves salient structure detection and outlier attenuation. Using CFTV, we prove the convergence of tensor voting on a Markov random field (MRF), thus termed as MRFTV, where the structure-aware tensor at each input site reaches a stationary state upon convergence in structure propagation. We then embed structure-aware tensor into expectation maximization (EM) for optimizing a single linear structure to achieve efficient and robust parameter estimation. Specifically, our EMTV algorithm optimizes both the tensor and fitting parameters and does not require random sampling consensus typically used in existing robust statistical techniques. We performed quantitative evaluation on its accuracy and robustness, showing that EMTV performs better than the original TV and other state-of-the-art techniques in fundamental matrix estimation for multiview stereo matching. The extensions of CFTV and EMTV for extracting multiple and nonlinear structures are underway.

9.
Article in English | MEDLINE | ID: mdl-21097167

ABSTRACT

We present a light-weight, cheap and low-power, wearable system for assisting the visually impaired in performing routine mobility tasks. Our system extends the range of the white cane by providing the user with vibro-tactile cues corresponding to the location of obstacles and a safe path for traversal through a cluttered environment. The presented approach keeps cognitive load to a minimum, and while being autonomous, adapts to the changing mobility requirements of a navigating user. In this paper, we provide an overview of the hardware and algorithmic components of our system, and show results of pilot studies with human test subjects. Our system operates at 20Hz, and significantly improves mobility performance compared to using only the white cane.


Subjects
Clothing , Sensory Aids , Visually Impaired Persons , Algorithms , Depth Perception , Humans , Time Factors
10.
J Mach Learn Res ; 2010: 265-272, 2010.
Article in English | MEDLINE | ID: mdl-25309138

ABSTRACT

We study the problem of image denoising where images are assumed to be samples from low dimensional (sub)manifolds. We propose the algorithm of locally linear denoising. The algorithm approximates manifolds with locally linear patches by constructing nearest neighbor graphs. Each image is then locally denoised within its neighborhoods. A global optimal denoising result is then identified by aligning those local estimates. The algorithm has a closed-form solution that is efficient to compute. We evaluated and compared the algorithm to alternative methods on two image data sets. We demonstrated the effectiveness of the proposed algorithm, which yields visually appealing denoising results, incurs smaller reconstruction errors and results in lower error rates when the denoised data are used in supervised learning tasks.

11.
IEEE Trans Pattern Anal Mach Intell ; 31(12): 2196-210, 2009 Dec.
Article in English | MEDLINE | ID: mdl-19834141

ABSTRACT

We propose a framework for tracking multiple targets, where the input is a set of candidate regions in each frame, as obtained from a state-of-the-art background segmentation module, and the goal is to recover trajectories of targets over time. Due to occlusions by targets and static objects, as well as noisy segmentation and false alarms, one foreground region may not correspond to one target faithfully. Therefore, the one-to-one assumption used in most data association algorithms is not always satisfied. Our method overcomes the one-to-one assumption by formulating the visual tracking problem in terms of finding the best spatial and temporal association of observations, which maximizes the consistency of both motion and appearance of trajectories. To avoid enumerating all possible solutions, we take a Data-Driven Markov Chain Monte Carlo (DD-MCMC) approach to sample the solution space efficiently. The sampling is driven by an informed proposal scheme controlled by a joint probability model combining motion and appearance. Comparative experiments with quantitative evaluations are provided.

12.
IEEE Trans Pattern Anal Mach Intell ; 30(9): 1589-602, 2008 Sep.
Article in English | MEDLINE | ID: mdl-18617717

ABSTRACT

We present a novel local spatiotemporal approach to produce motion segmentation and dense temporal trajectories from an image sequence. A common representation of image sequences is a 3D spatiotemporal volume, (x,y,t), and its corresponding mathematical formalism is the fiber bundle. However, directly enforcing the spatiotemporal smoothness constraint is difficult in the fiber bundle representation. Thus, we convert the representation into a new 5D space (x,y,t,vx,vy) with an additional velocity domain, where each moving object produces a separate 3D smooth layer. The smoothness constraint is now enforced by extracting 3D layers using the tensor voting framework in a single step that solves both correspondence and segmentation simultaneously. Motion segmentation is achieved by identifying those layers, and the dense temporal trajectories are obtained by converting the layers back into the fiber bundle representation. We proceed to address three applications (tracking, mosaic, and 3D reconstruction) that are hard to solve from the video stream directly because of the segmentation and dense matching steps, but become straightforward with our framework. The approach does not make restrictive assumptions about the observed scene or camera motion and is therefore generally applicable. We present results on a number of data sets.


Subjects
Algorithms , Artificial Intelligence , Image Enhancement/methods , Computer-Assisted Image Interpretation/methods , Three-Dimensional Imaging/methods , Automated Pattern Recognition/methods , Motion , Reproducibility of Results , Sensitivity and Specificity
13.
Med Image Anal ; 12(2): 174-90, 2008 Apr.
Article in English | MEDLINE | ID: mdl-18060827

ABSTRACT

This study presents methods for 2-D registration of retinal image sequences and 3-D shape inference from fluorescein images. The Y-feature is a robust geometric entity that is largely invariant across modalities as well as across the temporal grey level variations induced by the propagation of the dye in the vessels. We first present a Y-feature extraction method that finds a set of Y-feature candidates using local image gradient information. A gradient-based approach is then used to align an articulated model of the Y-feature to the candidates more accurately while optimizing a cost function. Using mutual information, fitted Y-features are subsequently matched across images, including color and fluorescein angiographic frames, for registration. To reconstruct the retinal fundus in 3-D, the extracted Y-features are used to estimate the epipolar geometry with a plane-and-parallax approach. The proposed solution provides a robust estimation of the fundamental matrix suitable for plane-like surfaces, such as the retinal fundus. The mutual information criterion is used to accurately estimate the dense disparity map. Our experimental results validate the proposed method on a set of difficult fluorescein image pairs.


Subjects
Artificial Intelligence , Fluorescein Angiography/methods , Fluorescein , Computer-Assisted Image Interpretation/methods , Three-Dimensional Imaging/methods , Automated Pattern Recognition/methods , Subtraction Technique , Algorithms , Fluorescein Angiography/instrumentation , Humans , Image Enhancement/methods , Imaging Phantoms , Reproducibility of Results , Sensitivity and Specificity
14.
IEEE Trans Pattern Anal Mach Intell ; 29(9): 1627-41, 2007 Sep.
Article in English | MEDLINE | ID: mdl-17627049

ABSTRACT

We present a method for detecting motion regions in video sequences observed by a moving camera, in the presence of strong parallax due to static 3D structures. The proposed method classifies each image pixel into planar background, parallax or motion regions by sequentially applying 2D planar homographies, the epipolar constraint and a novel geometric constraint, called the "structure consistency constraint". The structure consistency constraint, as the main contribution of this paper, is derived from the relative camera poses among three frames and implemented within the "Plane+Parallax" framework. Unlike previous planar-parallax constraints, the proposed constraint does not require the reference plane to be constant across multiple views. It directly measures the inconsistency between the projective structures from the same point under camera motion and reference plane change. The structure consistency constraint is capable of detecting moving objects followed by a moving camera in the same direction, a so-called degenerate configuration where the epipolar constraint fails. We demonstrate the effectiveness and robustness of our method with experimental results on real-world video sequences.


Subjects
Algorithms , Artifacts , Artificial Intelligence , Image Enhancement/methods , Computer-Assisted Image Interpretation/methods , Three-Dimensional Imaging/methods , Automated Pattern Recognition/methods , Photogrammetry/methods , Motion , Photogrammetry/instrumentation , Reproducibility of Results , Sensitivity and Specificity
15.
IEEE Trans Pattern Anal Mach Intell ; 28(6): 968-82, 2006 Jun.
Article in English | MEDLINE | ID: mdl-16724590

ABSTRACT

We address the fundamental problem of matching in two static images. The remaining challenges are related to occlusion and lack of texture. Our approach addresses these difficulties within a perceptual organization framework, considering both binocular and monocular cues. Initially, matching candidates for all pixels are generated by a combination of matching techniques. The matching candidates are then embedded in disparity space, where perceptual organization takes place in 3D neighborhoods and, thus, does not suffer from problems associated with scanline or image neighborhoods. The assumption is that correct matches produce salient, coherent surfaces, while wrong ones do not. Matching candidates that are consistent with the surfaces are kept and grouped into smooth layers. Thus, we achieve surface segmentation based on geometric and not photometric properties. Surface overextensions, which are due to occlusion, can be corrected by removing matches whose projections are not consistent in color with their neighbors of the same surface in both images. Finally, the projections of the refined surfaces on both images are used to obtain disparity hypotheses for unmatched pixels. The final disparities are selected after a second tensor voting stage, during which information is propagated from more reliable pixels to less reliable ones. We present results on widely used benchmark stereo pairs.


Subjects
Artificial Intelligence , Computer-Assisted Image Interpretation/methods , Three-Dimensional Imaging/methods , Automated Pattern Recognition/methods , Photogrammetry/methods , Binocular Vision , Monocular Vision , Algorithms , Cues , Humans , Image Enhancement/methods , Information Storage and Retrieval/methods , Reproducibility of Results , Sensitivity and Specificity
16.
Article in English | MEDLINE | ID: mdl-17354883

ABSTRACT

We present a method for the 3-D shape reconstruction of the retinal fundus from stereo paired images. Detection of retinal elevation plays a critical role in the diagnosis and management of many retinal diseases. However, since the shape of ocular fundus is nearly planar, its 3-D depth range is very narrow. Therefore, we use the location of vascular bifurcations and a plane+parallax approach to provide a robust estimation of the epipolar geometry. Matching is then performed using a mutual information algorithm for accurate estimation of the disparity maps. To validate our results, in the absence of camera calibration, we compared the results with measurements from the current clinical gold standard, optical coherence tomography (OCT).


Subjects
Image Enhancement/methods , Computer-Assisted Image Interpretation/methods , Three-Dimensional Imaging/methods , Ophthalmoscopy/methods , Automated Pattern Recognition/methods , Photogrammetry/methods , Retina/cytology , Algorithms , Artificial Intelligence , Humans , Reproducibility of Results , Sensitivity and Specificity
17.
IEEE Trans Image Process ; 14(8): 1202-14, 2005 Aug.
Article in English | MEDLINE | ID: mdl-16121466

ABSTRACT

The three-dimensional (3-D) reconstruction of generalized cylinders (GCs) is an important research field in computer vision. One of the main difficulties is that some contour features in images cannot be reconstructed by traditional stereovision because they do not correspond to reflectance discontinuities of surfaces in space. In this paper, we present a novel, parametric approach for the 3-D reconstruction of circular generalized cylinders (CGCs) only from the limb edges of CGCs in two images. Instead of exploiting the invariant and quasi-invariant properties of some specific subclasses of GCs in projections, our reconstruction is achieved by some general assumptions on GCs, and can, therefore, be applied to a broader subclass of GCs. In order to improve robustness, we perform the extraction and labeling of the limb edge interactively, and estimate the epipolar geometry between two images by an optimal algorithm. Then, for different types of GCs, three kinds of symmetries (parallel symmetry, skew symmetry, and local smooth symmetry) are employed to compute the symmetry of limb edges. The surface points corresponding to limb edges in images are reconstructed by integrating the recovered epipolar geometry and the properties induced from the assumptions that we make on the GCs. Finally, a homography-based method is exploited to further refine the 3-D description of the GC with a coplanar curved axis.


Subjects
Algorithms , Artificial Intelligence , Computer-Assisted Image Interpretation/methods , Three-Dimensional Imaging/methods , Information Storage and Retrieval/methods , Automated Pattern Recognition/methods , Image Enhancement/methods , Subtraction Technique
18.
IEEE Trans Pattern Anal Mach Intell ; 27(5): 739-52, 2005 May.
Article in English | MEDLINE | ID: mdl-15875795

ABSTRACT

Most approaches for motion analysis and interpretation rely on restrictive parametric models and involve iterative methods which depend heavily on initial conditions and are subject to instability. Further difficulties are encountered in image regions where motion is not smooth, typically around motion boundaries. This work addresses the problem of visual motion analysis and interpretation by formulating it as an inference of motion layers from a noisy and possibly sparse point set in a 4D space. The core of the method is based on a layered 4D representation of data and a voting scheme for affinity propagation. The inherent problem caused by the ambiguity of 2D to 3D interpretation is usually handled by adding additional constraints, such as rigidity. However, enforcing such a global constraint has been problematic in the combined presence of noise and multiple independent motions. By decoupling the processes of matching, outlier rejection, segmentation, and interpretation, we extract accurate motion layers based on the smoothness of image motion, then locally enforce rigidity for each layer in order to infer its 3D structure and motion. The proposed framework is noniterative and consistently handles both smooth moving regions and motion discontinuities without using any prior knowledge of the motion model.


Subjects
Algorithms , Artificial Intelligence , Computer-Assisted Image Interpretation/methods , Three-Dimensional Imaging/methods , Biological Models , Movement , Automated Pattern Recognition/methods , Image Enhancement/methods , Information Storage and Retrieval/methods , Statistical Models , Photography/methods , Reproducibility of Results , Sensitivity and Specificity , Subtraction Technique , Video Recording/methods
19.
IEEE Trans Pattern Anal Mach Intell ; 26(5): 594-611, 2004 May.
Article in English | MEDLINE | ID: mdl-15460281

ABSTRACT

Most computer vision applications require the reliable detection of boundaries. In the presence of outliers, missing data, orientation discontinuities, and occlusion, this problem is particularly challenging. We propose to address it by complementing the tensor voting framework, which was limited to second order properties, with first order representation and voting. First order voting fields and a mechanism to vote for 3D surface and volume boundaries and curve endpoints in 3D are defined. Boundary inference is also useful for a second difficult problem in grouping, namely, automatic scale selection. We propose an algorithm that automatically infers the smallest scale that can preserve the finest details. Our algorithm then proceeds with progressively larger scales to ensure continuity where it has not been achieved. Therefore, the proposed approach does not oversmooth features or delay the handling of boundaries and discontinuities until model misfit occurs. The interaction of smooth features, boundaries, and outliers is accommodated by the unified representation, making possible the perceptual organization of data in curves, surfaces, volumes, and their boundaries simultaneously. We present results on a variety of data sets to show the efficacy of the improved formalism.


Subjects
Algorithms , Artificial Intelligence , Brain/anatomy & histology , Brain/diagnostic imaging , Computer-Assisted Image Interpretation/methods , Three-Dimensional Imaging/methods , Automated Pattern Recognition , Cluster Analysis , Computer Simulation , Humans , Image Enhancement/methods , Information Storage and Retrieval/methods , Computer-Assisted Numerical Analysis , Radiography , Reproducibility of Results , Sensitivity and Specificity , Computer-Assisted Signal Processing , Subtraction Technique
20.
IEEE Trans Pattern Anal Mach Intell ; 26(9): 1167-84, 2004 Sep.
Article in English | MEDLINE | ID: mdl-15742892

ABSTRACT

We address the problem of simultaneous two-view epipolar geometry estimation and motion segmentation from nonstatic scenes. Given a set of noisy image pairs containing matches of n objects, we propose an unconventional, efficient, and robust method, 4D tensor voting, for estimating the unknown n epipolar geometries, and segmenting the static and motion matching pairs into n independent motions. By considering the 4D isotropic and orthogonal joint image space, only two tensor voting passes are needed, and a very high noise to signal ratio (up to five) can be tolerated. Epipolar geometries corresponding to multiple, rigid motions are extracted in succession. Only two uncalibrated frames are needed, and no simplifying assumption (such as affine camera model or homographic model between images) other than the pin-hole camera model is made. Our novel approach consists of propagating a local geometric smoothness constraint in the 4D joint image space, followed by global consistency enforcement for extracting the fundamental matrices corresponding to independent motions. We have performed extensive experiments to compare our method with some representative algorithms to show that better performance on nonstatic scenes is achieved. Results on challenging data sets are presented.
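
A fundamental matrix F constrains every correct match (x1, x2) between two views through x2ᵀ F x1 = 0 in homogeneous coordinates, and matches belonging to a different rigid motion generally violate it. The following is a minimal sketch of that constraint only, not of 4D tensor voting; the function name `epipolar_residual` and the example matrix (the fundamental matrix of a pure sideways camera translation with identity intrinsics) are assumptions chosen for illustration.

```python
def epipolar_residual(F, p1, p2):
    """Return x2^T F x1 for pixel coordinates p1 = (x, y) and p2 = (x', y').

    F is a 3x3 fundamental matrix given as nested lists; the result is zero
    for a noise-free match that is consistent with F.
    """
    x1 = (p1[0], p1[1], 1.0)  # homogeneous point in the first image
    x2 = (p2[0], p2[1], 1.0)  # homogeneous point in the second image
    # Epipolar line of x1 in the second image.
    line = [sum(F[i][j] * x1[j] for j in range(3)) for i in range(3)]
    return sum(x2[i] * line[i] for i in range(3))


if __name__ == "__main__":
    # Assumed example: pure translation along the x axis, so epipolar lines
    # are horizontal and consistent matches share the same y coordinate.
    F = [[0.0, 0.0, 0.0],
         [0.0, 0.0, -1.0],
         [0.0, 1.0, 0.0]]
    print(epipolar_residual(F, (10.0, 5.0), (14.0, 5.0)))  # consistent match
    print(epipolar_residual(F, (10.0, 5.0), (14.0, 7.0)))  # inconsistent match
```

Thresholding the absolute residual is one simple way to assign matches to a given motion once its fundamental matrix is known; in practice a normalized distance such as the Sampson error is often preferred.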


Subjects
Algorithms , Artificial Intelligence , Computer-Assisted Image Interpretation/methods , Three-Dimensional Imaging/methods , Movement/physiology , Automated Pattern Recognition/methods , Subtraction Technique , Animals , Cluster Analysis , Humans , Image Enhancement/methods , Information Storage and Retrieval/methods , Computer-Assisted Numerical Analysis , Reproducibility of Results , Sensitivity and Specificity , Computer-Assisted Signal Processing , Video Recording/methods
...