Results 1 - 20 of 25
1.
IEEE Trans Pattern Anal Mach Intell ; 45(12): 14682-14692, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37751349

ABSTRACT

In this work, we present a deep learning approach to estimating age from facial images. First, we introduce a novel attention-based approach to image augmentation-aggregation, which allows multiple image augmentations to be adaptively aggregated using a Transformer-Encoder. A hierarchical probabilistic regression model is then proposed that combines discrete probabilistic age estimates with an ensemble of regressors. Each regressor is adapted and trained to refine the probability estimate over a given age range. We show that our age estimation scheme outperforms current schemes and achieves new state-of-the-art age estimation accuracy on the MORPH II and CACD datasets. We also present an analysis of the biases in state-of-the-art age estimation results.

2.
IEEE Trans Pattern Anal Mach Intell ; 45(12): 14222-14233, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37651496

ABSTRACT

Absolute camera pose regressors estimate the position and orientation of a camera given the captured image alone. Typically, a convolutional backbone with a multi-layer perceptron (MLP) head is trained using images and pose labels to embed a single reference scene at a time. Recently, this scheme was extended to learn multiple scenes by replacing the MLP head with a set of fully connected layers. In this work, we propose to learn multi-scene absolute camera pose regression with Transformers, where encoders are used to aggregate activation maps with self-attention and decoders transform latent features and scene encodings into pose predictions. This allows our model to focus on general features that are informative for localization, while embedding multiple scenes in parallel. We extend our previous MS-Transformer approach (Shavit et al., 2021) by introducing a mixed classification-regression architecture that improves the localization accuracy. Our method is evaluated on common indoor and outdoor benchmark datasets and is shown to outperform both multi-scene and state-of-the-art single-scene absolute pose regressors.

3.
Article in English | MEDLINE | ID: mdl-37402200

ABSTRACT

We propose a novel formulation of deep networks that do not use dot-product neurons and rely on a hierarchy of voting tables instead, denoted as convolutional tables (CTs), to enable accelerated CPU-based inference. Convolutional layers are the most time-consuming bottleneck in contemporary deep learning techniques, severely limiting their use in the Internet of Things and CPU-based devices. The proposed CT performs a fern operation at each image location: it encodes the location environment into a binary index and uses the index to retrieve the desired local output from a table. The results of multiple tables are combined to derive the final output. The computational complexity of a CT transformation is independent of the patch (filter) size and grows gracefully with the number of channels, outperforming comparable convolutional layers. CTs are shown to have a better capacity:compute ratio than dot-product neurons, and deep CT networks are shown to exhibit a universal approximation property similar to that of neural networks. As the transformation involves computing discrete indices, we derive a soft relaxation and gradient-based approach for training the CT hierarchy. Deep CT networks have been experimentally shown to have accuracy comparable to that of CNNs of similar architectures. In the low-compute regime, they enable an error:speed tradeoff superior to alternative efficient CNN architectures.
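
The fern operation described above (encode the neighbourhood of a location into a binary index, then look the output up in a table) can be illustrated with a minimal sketch. The bit tests used here (thresholded differences between pairs of pixels), the names `fern_index` and `ct_lookup`, and the toy table are assumptions made for illustration; the abstract does not specify how the bits are computed or how the tables are trained.

```python
from typing import List, Sequence, Tuple

Offset = Tuple[int, int]          # (row offset, column offset) relative to the location
PixelPair = Tuple[Offset, Offset]


def fern_index(image: Sequence[Sequence[float]], row: int, col: int,
               pairs: Sequence[PixelPair], thresholds: Sequence[float]) -> int:
    """Encode the neighbourhood of (row, col) as a K-bit integer.

    Each bit is an assumed test: is pixel_a - pixel_b above a threshold?
    The caller must keep all offsets inside the image bounds.
    """
    index = 0
    for ((r1, c1), (r2, c2)), t in zip(pairs, thresholds):
        a = image[row + r1][col + c1]
        b = image[row + r2][col + c2]
        bit = 1 if a - b > t else 0
        index = (index << 1) | bit   # append the new bit
    return index


def ct_lookup(image: Sequence[Sequence[float]], row: int, col: int,
              pairs: Sequence[PixelPair], thresholds: Sequence[float],
              table: Sequence[List[float]]) -> List[float]:
    """Retrieve the local output vector stored at the fern index."""
    return table[fern_index(image, row, col, pairs, thresholds)]


# Toy example: a 3x3 image, K = 2 bit tests, hence a table with 2**2 rows.
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
pairs = [((0, 0), (0, 1)),    # centre vs. right neighbour
         ((1, 0), (-1, 0))]   # lower vs. upper neighbour
thresholds = [0, 0]
table = [[0.0], [1.0], [2.0], [3.0]]
output = ct_lookup(image, 1, 1, pairs, thresholds, table)
```

In this sketch the cost per location depends on the number of bit tests K rather than on the spatial extent of the offsets, which is consistent with the abstract's statement that the complexity is independent of the patch size.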

4.
IEEE Trans Pattern Anal Mach Intell ; 45(1): 560-575, 2023 Jan.
Article in English | MEDLINE | ID: mdl-35471874

ABSTRACT

We present Face Swapping GAN (FSGAN) for face swapping and reenactment. Unlike previous work, we offer a subject-agnostic swapping scheme that can be applied to pairs of faces without requiring training on those faces. We derive a novel iterative deep learning-based approach for face reenactment that adjusts for significant pose and expression variations and can be applied to a single image or a video sequence. For video sequences, we introduce a continuous interpolation of the face views based on reenactment, Delaunay triangulation, and barycentric coordinates. Occluded face regions are handled by a face completion network. Finally, we use a face blending network for seamless blending of the two faces while preserving the target skin color and lighting conditions. This network uses a novel Poisson blending loss combining Poisson optimization with a perceptual loss. We compare our approach to existing state-of-the-art systems and show our results to be both qualitatively and quantitatively superior. This work describes extensions of the FSGAN method, proposed in an earlier conference version of our work (Nirkin et al. 2019), as well as additional experiments and results.
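
The barycentric-coordinate interpolation mentioned above can be sketched in a few lines. The sketch assumes that a query point (for example, a target pose) falls inside one triangle of a Delaunay triangulation and that the values attached to the three vertices are blended with the barycentric weights; the function names and the toy triangle are hypothetical, and the abstract does not state exactly which quantities FSGAN interpolates.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def barycentric(p: Point, a: Point, b: Point, c: Point) -> Tuple[float, float, float]:
    """Barycentric weights of point p with respect to triangle (a, b, c)."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if det == 0:
        raise ValueError("degenerate triangle")
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return wa, wb, 1.0 - wa - wb


def blend(weights: Sequence[float], values: Sequence[Sequence[float]]) -> List[float]:
    """Weighted sum of the vectors attached to the triangle vertices."""
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


# Toy example: a right triangle and a point inside it.
weights = barycentric((0.25, 0.25), (0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
mixed = blend(weights, [[4.0], [8.0], [0.0]])
```

The weights are non-negative and sum to one exactly when the point lies inside the triangle, so the blend varies continuously as the query point moves between vertices.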

5.
IEEE Trans Pattern Anal Mach Intell ; 44(12): 10252-10260, 2022 Dec.
Article in English | MEDLINE | ID: mdl-34855587

ABSTRACT

We present a deep learning approach for learning the joint semantic embeddings of images and captions in a Euclidean space, such that the semantic similarity is approximated by the L2 distances in the embedding space. For that, we introduce a metric learning scheme that utilizes multitask learning to learn the embedding of identical semantic concepts using a center loss. By introducing a differentiable quantization scheme into the end-to-end trainable network, we derive a semantic embedding of semantically similar concepts in Euclidean space. We also propose a novel metric learning formulation using an adaptive margin hinge loss that is refined during the training phase. The proposed scheme was applied to the MS-COCO, Flickr30K and Flickr8K datasets, and was shown to compare favorably with contemporary state-of-the-art approaches.
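
A margin-based hinge loss over L2 distances, of the general kind referred to above, can be sketched as follows. This is a generic triplet-style formulation given as an assumption: the abstract does not say how the margin is adapted during training, so the margin is simply passed in as a parameter, and the function names are illustrative.

```python
import math
from typing import Sequence


def l2_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean (L2) distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def hinge_triplet_loss(anchor: Sequence[float], positive: Sequence[float],
                       negative: Sequence[float], margin: float) -> float:
    """Zero once the negative is farther than the positive by at least `margin`."""
    return max(0.0, margin + l2_distance(anchor, positive) - l2_distance(anchor, negative))


# Toy example: an image embedding, a matching caption and a non-matching caption.
image_emb = (0.0, 0.0)
matching_caption = (3.0, 4.0)      # distance 5 from the image
other_caption = (6.0, 8.0)         # distance 10 from the image
loss_small_margin = hinge_triplet_loss(image_emb, matching_caption, other_caption, 2.0)
loss_large_margin = hinge_triplet_loss(image_emb, matching_caption, other_caption, 7.0)
```

A larger margin demands a wider gap between matching and non-matching pairs, which is why the same triplet yields a zero loss for the small margin and a positive loss for the large one.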

6.
IEEE Trans Pattern Anal Mach Intell ; 44(10): 6585-6593, 2022 Oct.
Article in English | MEDLINE | ID: mdl-34166186

ABSTRACT

In this work, we propose a novel Convolutional Neural Network (CNN) architecture for the joint detection and matching of feature points in images acquired by different sensors using a single forward pass. The resulting feature detector is tightly coupled with the feature descriptor, in contrast to classical approaches (SIFT, etc.), where the detection phase precedes and differs from computing the descriptor. Our approach utilizes two CNN subnetworks, the first being a Siamese CNN and the second consisting of dual non-weight-sharing CNNs. This allows simultaneous processing and fusion of the joint and disjoint cues in the multimodal image patches. The proposed approach is experimentally shown to outperform contemporary state-of-the-art schemes when applied to multiple datasets of multimodal images. It is also shown to provide repeatable feature point detections across multi-sensor images, outperforming state-of-the-art detectors. To the best of our knowledge, it is the first unified approach for the detection and matching of such images.


Subjects
Algorithms; Neural Networks, Computer
7.
IEEE Trans Pattern Anal Mach Intell ; 44(10): 6111-6121, 2022 Oct.
Article in English | MEDLINE | ID: mdl-34185639

ABSTRACT

We propose a method for detecting face swapping and other identity manipulations in single images. Face swapping methods, such as DeepFake, manipulate the face region, aiming to adjust the face to the appearance of its context, while leaving the context unchanged. We show that this modus operandi produces discrepancies between the two regions (e.g., Fig. 1). These discrepancies offer exploitable telltale signs of manipulation. Our approach involves two networks: (i) a face identification network that considers the face region bounded by a tight semantic segmentation, and (ii) a context recognition network that considers the face context (e.g., hair, ears, neck). We describe a method that uses the recognition signals from our two networks to detect such discrepancies, providing a complementary detection signal that improves conventional real versus fake classifiers commonly used for detecting fake images. Our method achieves state-of-the-art results on the FaceForensics++ and Celeb-DF-v2 benchmarks for face manipulation detection, and even generalizes to detect fakes produced by unseen methods.


Subjects
Algorithms; Face; Face/diagnostic imaging
8.
IEEE Trans Pattern Anal Mach Intell ; 43(8): 2851-2857, 2021 Aug.
Article in English | MEDLINE | ID: mdl-33175677

ABSTRACT

In this work, we propose a deep learning-based approach for kin verification using a unified multi-task learning scheme where all kinship classes are jointly learned. This allows us to better utilize small training sets that are typical of kin verification. We introduce a novel approach for fusing the embeddings of kin images, to avoid overfitting, which is a common issue in training such networks. An adaptive sampling scheme is derived for the training set images, to resolve the inherent imbalance in kin verification datasets. A thorough ablation study exemplifies the effectiveness of our approach, which is experimentally shown to outperform contemporary state-of-the-art kin verification results when applied to the Families In the Wild, FG2018, and FG2020 datasets.


Subjects
Algorithms; Humans
9.
Sci Rep ; 7(1): 4702, 2017 Jul 5.
Article in English | MEDLINE | ID: mdl-28680149

ABSTRACT

The recent proliferation in mobile touch-based devices paves the way for increasingly efficient, easy-to-use natural user interfaces (NUI). Unfortunately, touch-based NUIs might prove difficult, or even impossible, to operate in certain conditions, e.g., when suffering from a motor dysfunction such as Parkinson's Disease (PD). Yet, the prevalence of such devices makes them particularly suitable for acquiring motor function data, and enabling the early detection of PD symptoms and other conditions. In this work we acquired a unique database of more than 12,500 annotated NUI multi-touch gestures, collected from PD patients and healthy volunteers, that were analyzed by applying advanced shape analysis and statistical inference schemes. The proposed analysis leads to a novel detection scheme for early stages of PD. Moreover, our computational analysis revealed that young subjects may be using a 'slang' form of gesture-making to reduce effort and attention cost while maintaining meaning, whereas older subjects put an emphasis on content and precise performance.


Subjects
Parkinson Disease/diagnosis; Parkinson Disease/physiopathology; Touch; Adult; Aged; Aged, 80 and over; Case-Control Studies; Female; Gestures; Humans; Male; Middle Aged; Support Vector Machine; User-Computer Interface
10.
IEEE Trans Image Process ; 25(10): 4743-4752, 2016 Oct.
Article in English | MEDLINE | ID: mdl-27416599

ABSTRACT

This paper presents an unsupervised and semi-automatic image segmentation approach where we formulate the segmentation as an inference problem based on unary and pairwise assignment probabilities computed using low-level image cues. The inference is solved via a probabilistic graph matching scheme, which allows rigorous incorporation of low-level image cues and automatic tuning of parameters. The proposed scheme is experimentally shown to compare favorably with contemporary semi-supervised and unsupervised image segmentation schemes, when applied to contemporary state-of-the-art image sets.

11.
IEEE Trans Image Process ; 25(3): 1340-53, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26780796

ABSTRACT

We present a method for improving a non-local means (NLM) operator by computing its low-rank approximation. The low-rank operator is constructed by applying a filter to the spectrum of the original NLM operator. This results in an operator that is less sensitive to noise while preserving important properties of the original operator. The method is efficiently implemented based on Chebyshev polynomials and is demonstrated on the application of natural image denoising. For this application, we provide a comparison of our method with other denoising methods.

12.
PLoS Biol ; 13(8): e1002212, 2015 Aug.
Article in English | MEDLINE | ID: mdl-26241802

ABSTRACT

One of the major challenges that developing organs face is scaling, that is, the adjustment of physical proportions during the massive increase in size. Although organ scaling is fundamental for development and function, little is known about the mechanisms that regulate it. Bone superstructures are projections that typically serve for tendon and ligament insertion or articulation and, therefore, their position along the bone is crucial for musculoskeletal functionality. As bones are rigid structures that elongate only from their ends, it is unclear how superstructure positions are regulated during growth to end up in the right locations. Here, we document the process of longitudinal scaling in developing mouse long bones and uncover the mechanism that regulates it. To that end, we performed a computational analysis of hundreds of three-dimensional micro-CT images, using a newly developed method for recovering the morphogenetic sequence of developing bones. Strikingly, analysis revealed that the relative position of all superstructures along the bone is highly preserved during more than a 5-fold increase in length, indicating isometric scaling. It has been suggested that during development, bone superstructures are continuously reconstructed and relocated along the shaft, a process known as drift. Surprisingly, our results showed that most superstructures did not drift at all. Instead, we identified a novel mechanism for bone scaling, whereby each bone exhibits a specific and unique balance between proximal and distal growth rates, which accurately maintains the relative position of its superstructures. Moreover, we show mathematically that this mechanism minimizes the cumulative drift of all superstructures, thereby optimizing the scaling process. Our study reveals a general mechanism for the scaling of developing bones. 
More broadly, these findings suggest an evolutionary mechanism that facilitates variability in bone morphology by controlling the activity of individual epiphyseal plates.


Subjects
Arm Bones/embryology; Arm Bones/growth & development; Bone Development/physiology; Leg Bones/embryology; Leg Bones/growth & development; Animals; Arm Bones/diagnostic imaging; Imaging, Three-Dimensional; Leg Bones/diagnostic imaging; Male; Mice; Mice, Inbred C57BL; Models, Biological; Models, Statistical; X-Ray Microtomography
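
The statement in the abstract above that a balance between proximal and distal growth rates maintains the relative position of a superstructure can be checked with a simple one-dimensional model. The model is an assumption made for illustration, not the paper's method: the bone is a segment that elongates only at its two ends, and a superstructure stays fixed in the existing bone material. Under that assumption a superstructure at distance x0 from the proximal end of a bone of length L0 keeps its relative position exactly when the fraction of total growth added at the proximal end equals x0 / L0.

```python
def relative_position_after_growth(x0: float, length0: float,
                                   proximal_growth: float,
                                   distal_growth: float) -> float:
    """Relative position (0 = proximal end, 1 = distal end) after elongation.

    Assumed model: new material is added only at the two ends, so the
    superstructure moves away from the proximal end by `proximal_growth`.
    """
    new_length = length0 + proximal_growth + distal_growth
    return (x0 + proximal_growth) / new_length


# Toy numbers: a superstructure at 20% of a bone that grows 5-fold (10 -> 50).
x0, length0, total_growth = 2.0, 10.0, 40.0

# Growth split in proportion to the relative position: position is preserved.
balanced = relative_position_after_growth(x0, length0,
                                          0.2 * total_growth, 0.8 * total_growth)

# Equal growth at both ends: the same superstructure ends up at 44%.
symmetric = relative_position_after_growth(x0, length0,
                                           0.5 * total_growth, 0.5 * total_growth)
```

With several superstructures at different relative positions, no single split preserves all of them in this model, which gives some intuition for why the abstract speaks of minimizing the cumulative drift rather than eliminating it.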
13.
IEEE Trans Image Process ; 23(5): 2291-301, 2014 May.
Article in English | MEDLINE | ID: mdl-24818248

ABSTRACT

In this paper, we propose a novel approach for integrating multiple tracking cues within a unified probabilistic graph-based Markov random fields (MRFs) representation. We show how to integrate temporal and spatial cues encoded by unary and pairwise probabilistic potentials. As the inference of such high-order MRF models is known to be NP-hard, we propose an efficient spectral relaxation-based inference scheme. The proposed scheme is exemplified by applying it to a mixture of five tracking cues, and is shown to be applicable to wider sets of cues. This paves the way for a modular plug-and-play tracking framework that can be easily adapted to diverse tracking scenarios. The proposed scheme is experimentally shown to compare favorably with contemporary state-of-the-art schemes, and provides accurate tracking results.

14.
IEEE Trans Image Process ; 22(8): 2983-94, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23322762

ABSTRACT

We present a framework for image inpainting that utilizes the diffusion framework approach to spectral dimensionality reduction. We show that on formulating the inpainting problem in the embedding domain, the domain to be inpainted is smoother in general, particularly for the textured images. Thus, the textured images can be inpainted through simple exemplar-based and variational methods. We discuss the properties of the induced smoothness and relate it to the underlying assumptions used in contemporary inpainting schemes. As the diffusion embedding is nonlinear and noninvertible, we propose a novel computational approach to approximate the inverse mapping from the inpainted embedding space to the image domain. We formulate the mapping as a discrete optimization problem, solved through spectral relaxation. The effectiveness of the presented method is exemplified by inpainting real images, where it is shown to compare favorably with contemporary state-of-the-art schemes.


Subjects
Algorithms; Image Enhancement/methods; Image Interpretation, Computer-Assisted/methods; Pattern Recognition, Automated/methods; Reproducibility of Results; Sensitivity and Specificity
15.
IEEE Trans Pattern Anal Mach Intell ; 35(1): 18-27, 2013 Jan.
Article in English | MEDLINE | ID: mdl-22350163

ABSTRACT

Spectral Matching (SM) is a computationally efficient approach to approximate the solution of pairwise matching problems that are NP-hard. In this paper, we present a probabilistic interpretation of spectral matching schemes and derive a novel Probabilistic Matching (PM) scheme that is shown to outperform previous approaches. We show that spectral matching can be interpreted as a Maximum Likelihood (ML) estimate of the assignment probabilities and that the Graduated Assignment (GA) algorithm can be cast as a Maximum a Posteriori (MAP) estimator. Based on this analysis, we derive a ranking scheme for spectral matchings based on their reliability, and propose a novel iterative probabilistic matching algorithm that relaxes some of the implicit assumptions used in prior works. We experimentally show our approaches to outperform previous schemes when applied to exhaustive synthetic tests as well as the analysis of real image sequences.


Subjects
Algorithms; Data Interpretation, Statistical; Information Storage and Retrieval/methods; Pattern Recognition, Automated/methods; Computer Simulation; Models, Statistical
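
The baseline spectral matching that the abstract above reinterprets can be sketched as follows. This is a minimal sketch of the generic spectral relaxation (take the principal eigenvector of a pairwise affinity matrix over candidate assignments, then discretize it greedily under one-to-one constraints); it is not the Probabilistic Matching scheme proposed in the paper. The function names, the power-iteration solver and the toy affinity values are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Assignment = Tuple[int, int]   # (index in the first set, index in the second set)


def principal_eigenvector(M: Sequence[Sequence[float]], iters: int = 100) -> List[float]:
    """Power iteration for a symmetric matrix with non-negative entries."""
    n = len(M)
    x = [1.0 / n ** 0.5] * n
    for _ in range(iters):
        y = [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = sum(v * v for v in y) ** 0.5
        if norm == 0.0:
            break
        x = [v / norm for v in y]
    return x


def spectral_matching(candidates: Sequence[Assignment],
                      M: Sequence[Sequence[float]],
                      iters: int = 100) -> List[Assignment]:
    """Greedy one-to-one discretization of the principal eigenvector."""
    x = principal_eigenvector(M, iters)
    order = sorted(range(len(candidates)), key=lambda k: -x[k])
    used_a, used_b, result = set(), set(), []
    for k in order:
        if x[k] <= 0.0:
            break
        a, b = candidates[k]
        if a in used_a or b in used_b:
            continue              # conflicts with an assignment already accepted
        result.append((a, b))
        used_a.add(a)
        used_b.add(b)
    return result


# Toy problem: two points per set, four candidate assignments.
# M[i][j] scores how compatible candidates i and j are; the diagonal holds
# the score of each candidate on its own.
candidates = [(0, 0), (0, 1), (1, 0), (1, 1)]
M = [[1.0, 0.0, 0.0, 1.0],
     [0.0, 1.0, 0.2, 0.0],
     [0.0, 0.2, 1.0, 0.0],
     [1.0, 0.0, 0.0, 1.0]]
matches = spectral_matching(candidates, M)
```

In the toy problem the pair of assignments (0, 0) and (1, 1) supports each other strongly, so the eigenvector concentrates on them and the greedy step rejects the two conflicting candidates.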
16.
IEEE Trans Image Process ; 21(5): 2758-69, 2012 May.
Article in English | MEDLINE | ID: mdl-22249710

ABSTRACT

In this paper, we present a framework for detecting interest points in 3-D meshes and computing their corresponding descriptors. For that, we propose an intrinsic scale detection scheme per interest point and utilize it to derive two scale-invariant local features for mesh models. First, we present the scale-invariant spin image local descriptor that is a scale-invariant formulation of the spin image descriptor. Second, we adapt the scale-invariant feature transform feature to mesh data by representing the vicinity of each interest point as a depth map and estimating its dominant angle using principal component analysis to achieve rotation invariance. The proposed features were experimentally shown to be robust to scale changes and partial mesh matching, and they compared favorably with other local mesh features on the SHREC'10 and SHREC'11 testbeds. We applied the proposed local features to mesh retrieval using the bag-of-features approach and achieved state-of-the-art retrieval accuracy. Last, we applied the proposed local features to register models to scanned depth scenes and achieved high registration accuracy.


Subjects
Algorithms; Image Interpretation, Computer-Assisted/methods; Imaging, Three-Dimensional/methods; Information Storage and Retrieval/methods; Pattern Recognition, Automated/methods; Subtraction Technique; Computer Simulation; Image Enhancement/methods; Models, Statistical; Reproducibility of Results; Sensitivity and Specificity
17.
IEEE Trans Pattern Anal Mach Intell ; 32(12): 2205-15, 2010 Dec.
Article in English | MEDLINE | ID: mdl-20975118

ABSTRACT

We present a computational approach to high-order matching of data sets in $\mathbb{R}^d$. Those are matchings based on data affinity measures that score the matching of more than two pairs of points at a time. High-order affinities are represented by tensors and the matching is then given by a rank-one approximation of the affinity tensor and a corresponding discretization. Our approach is rigorously justified by extending Zass and Shashua's hypergraph matching to high-order spectral matching. This paves the way for a computationally efficient dual-marginalization spectral matching scheme. We also show that, based on the spectral properties of random matrices, affinity tensors can be randomly sparsified while retaining the matching accuracy. Our contributions are experimentally validated by applying them to synthetic as well as real data sets.

18.
IEEE Trans Pattern Anal Mach Intell ; 32(7): 1227-38, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20489226

ABSTRACT

We present a spectral approach for detecting and analyzing rotational and reflectional symmetries in n dimensions. Our main contribution is the derivation of a symmetry detection and analysis scheme for sets of points in $\mathbb{R}^n$ and its extension to image analysis by way of local features. Each object is represented by a set of points $S \in \mathbb{R}^n$, where the symmetry is manifested by the multiple self-alignments of $S$. The alignment problem is formulated as a quadratic binary optimization problem, with an efficient solution via spectral relaxation. For symmetric objects, this results in a multiplicity of eigenvalues whose corresponding eigenvectors allow the detection and analysis of both types of symmetry. We improve the scheme's robustness by incorporating geometrical constraints into the spectral analysis. Our approach is experimentally verified by applying it to 2D and 3D synthetic objects as well as real images.

19.
IEEE Trans Image Process ; 19(5): 1319-27, 2010 May.
Article in English | MEDLINE | ID: mdl-20071259

ABSTRACT

We propose two computational approaches for improving the retrieval of planar shapes. First, we suggest a geometrically motivated quadratic similarity measure that is optimized by way of spectral relaxation of a quadratic assignment. By utilizing state-of-the-art shape descriptors and a pairwise serialization constraint, we derive a formulation that is resilient to boundary noise, articulations and nonrigid deformations. This allows both shape matching and retrieval. We also introduce a shape meta-similarity measure that agglomerates pairwise shape similarities and improves the retrieval accuracy. Applying it to the MPEG-7 shape dataset in conjunction with the proposed geometric matching scheme, we obtained a retrieval rate of 92.5%.


Subjects
Algorithms; Artificial Intelligence; Image Enhancement/methods; Image Interpretation, Computer-Assisted/methods; Pattern Recognition, Automated/methods; Subtraction Technique; Numerical Analysis, Computer-Assisted; Reproducibility of Results; Sensitivity and Specificity; Signal Processing, Computer-Assisted
20.
IEEE Trans Pattern Anal Mach Intell ; 28(11): 1784-97, 2006 Nov.
Article in English | MEDLINE | ID: mdl-17063683

ABSTRACT

Data fusion and multicue data matching are fundamental tasks of high-dimensional data analysis. In this paper, we apply the recently introduced diffusion framework to address these tasks. Our contribution is three-fold: First, we present the Laplace-Beltrami approach for computing density invariant embeddings which are essential for integrating different sources of data. Second, we describe a refinement of the Nyström extension algorithm called "geometric harmonics." We also explain how to use this tool for data assimilation. Finally, we introduce a multicue data matching scheme based on nonlinear spectral graph alignment. The effectiveness of the presented schemes is validated by applying them to the problems of lipreading and image sequence alignment.


Subjects
Algorithms; Artificial Intelligence; Databases, Factual; Image Interpretation, Computer-Assisted/methods; Information Storage and Retrieval/methods; Pattern Recognition, Automated/methods; Subtraction Technique; Cluster Analysis; Image Enhancement/methods
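
The density-invariant (Laplace-Beltrami) normalization mentioned in the abstract above can be illustrated with the standard diffusion-maps construction. This is a sketch under assumptions: a Gaussian kernel with a hand-picked bandwidth `epsilon` and the usual two-step normalization (divide the kernel by the product of the density estimates, then row-normalize). The function name and the toy data are illustrative, and the paper's own implementation may differ.

```python
import math
from typing import List, Sequence


def density_invariant_markov_matrix(points: Sequence[Sequence[float]],
                                    epsilon: float) -> List[List[float]]:
    """Row-stochastic diffusion matrix with the sampling density divided out."""
    n = len(points)
    # Gaussian affinity between every pair of points.
    K = [[math.exp(-sum((a - b) ** 2 for a, b in zip(p, q)) / epsilon)
          for q in points] for p in points]
    # Kernel density estimate at each point.
    q = [sum(row) for row in K]
    # Divide out the density on both sides of the kernel.
    K1 = [[K[i][j] / (q[i] * q[j]) for j in range(n)] for i in range(n)]
    # Row-normalize so that each row is a probability distribution.
    d = [sum(row) for row in K1]
    return [[K1[i][j] / d[i] for j in range(n)] for i in range(n)]


# Toy example: three densely sampled points and one isolated point on a line.
points = [(0.0,), (0.1,), (0.2,), (2.0,)]
P = density_invariant_markov_matrix(points, epsilon=1.0)
```

The leading eigenvectors of `P` would then supply the embedding coordinates; computing them is omitted here to keep the sketch free of external dependencies.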