Results 1 - 18 of 18
1.
IEEE Trans Pattern Anal Mach Intell ; 45(8): 9567-9582, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37030772

ABSTRACT

Compatible features enable the direct comparison of old and new learned features, allowing them to be used interchangeably over time. In visual search systems, this eliminates the need to extract new features from the gallery-set when the representation model is upgraded with novel data. This is of great value in real applications, as re-indexing the gallery-set can be computationally expensive when the gallery-set is large, or even infeasible due to privacy or other application concerns. In this paper, we propose CoReS, a new training procedure to learn representations that are compatible with those previously learned, grounded in the stationarity of the features as provided by fixed classifiers based on polytopes. With this solution, classes are maximally separated in the representation space and maintain their spatial configuration stationary as new classes are added, so there is no need to learn any mapping between representations, nor to impose pairwise training with the previously learned model. We demonstrate that our training procedure largely outperforms the current state of the art and is particularly effective in the case of multiple upgrades of the training-set, which is the typical case in real applications.
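The stationarity idea behind fixed polytope classifiers can be sketched in a few lines. This is a toy illustration, not the CoReS training procedure: here the standard basis stands in for the regular-polytope vertices used in the paper, and the feature values are made up.

```python
# Sketch of a fixed (non-learned) classifier: class prototypes are frozen
# polytope vertices, so the class regions of the feature space never move
# across model upgrades. The standard basis is a simplified stand-in for
# the regular-polytope vertices of the paper.

def fixed_prototypes(num_classes):
    """One frozen unit prototype per class (simplified polytope vertices)."""
    return [[1.0 if i == c else 0.0 for i in range(num_classes)]
            for c in range(num_classes)]

def classify(feature, prototypes):
    """Assign the class whose frozen prototype has the largest dot product."""
    scores = [sum(f * p for f, p in zip(feature, proto)) for proto in prototypes]
    return max(range(len(scores)), key=scores.__getitem__)

protos = fixed_prototypes(3)
# A gallery feature from the "old" model and a query feature from the
# "upgraded" model stay directly comparable, because both models were
# trained against the same stationary prototypes.
old_gallery_feature = [0.9, 0.1, 0.0]
new_query_feature = [0.8, 0.15, 0.05]
assert classify(old_gallery_feature, protos) == classify(new_query_feature, protos)
```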

2.
Sensors (Basel) ; 23(1)2023 Jan 03.
Article in English | MEDLINE | ID: mdl-36617115

ABSTRACT

Action understanding is a fundamental branch of computer vision with several applications, ranging from surveillance to robotics. Most works deal with localizing and recognizing the action in both time and space, without providing a characterization of its evolution. Recent works have addressed the prediction of action progress, which is an estimate of how far the action has advanced as it is performed. In this paper, we propose to predict action progress using a different modality compared to previous methods: body joints. Human body joints carry very precise information about human poses, which we believe are a much more lightweight and effective way of characterizing actions and therefore their execution. Action progress can in fact be estimated by understanding how key poses follow one another during the development of an activity. We show how an action progress prediction model can exploit body joints and be integrated with modules providing keypoint and action information, so that it can run directly from raw pixels. The proposed method is experimentally validated on the Penn Action Dataset.
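The key-pose intuition can be made concrete with a deliberately simple sketch. This is not the paper's learned model: it estimates progress by matching the current pose to the nearest of a few hand-annotated key poses, and all pose vectors and progress values below are illustrative.

```python
# Toy sketch: action progress from body joints via nearest key pose.
# Each key pose is a flat list of joint coordinates annotated with how
# far through the action it occurs (0.0 = start, 1.0 = end).

def l2(a, b):
    """Euclidean distance between two joint vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def progress_from_pose(pose, key_poses):
    """key_poses: list of (joint_vector, progress_in_[0,1]) pairs."""
    return min(key_poses, key=lambda kp: l2(pose, kp[0]))[1]

# Hypothetical key poses of a squat: standing (0.0), half-way (0.5), bottom (1.0).
key_poses = [([0.0, 1.0], 0.0), ([0.0, 0.5], 0.5), ([0.0, 0.1], 1.0)]
assert progress_from_pose([0.0, 0.45], key_poses) == 0.5
```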

3.
IEEE Trans Affect Comput ; 13(4): 1813-1826, 2022.
Article in English | MEDLINE | ID: mdl-36452255

ABSTRACT

We propose an automatic method to estimate self-reported pain based on facial landmarks extracted from videos. For each video sequence, we decompose the face into four different regions, and the pain intensity is measured by modeling the dynamics of facial movement using the landmarks of these regions. A formulation based on Gram matrices is used to represent the trajectory of landmarks on the Riemannian manifold of symmetric positive semi-definite matrices of fixed rank. A curve-fitting algorithm is used to smooth the trajectories, and temporal alignment is performed to compute the similarity between trajectories on the manifold. A Support Vector Regression model is then trained to encode the extracted trajectories into pain intensity levels consistent with self-reported pain intensity measurements. Finally, a late fusion of the estimations from each region is performed to obtain the final predicted pain level. The proposed approach is evaluated on two publicly available datasets, the UNBC-McMaster Shoulder Pain Archive and the BioVid Heat Pain dataset. We compare our method to the state of the art on both datasets using different testing protocols, showing the competitiveness of the proposed approach.
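Why such landmark configurations live on a manifold of fixed-rank PSD matrices is easy to verify numerically. Below is a minimal sketch (not the paper's pipeline): the Gram matrix of centered 2D landmarks is always symmetric positive semi-definite with rank at most 2, which is exactly the kind of matrix the trajectories are made of.

```python
# Gram matrix G = P Pᵀ of centered 2D facial landmarks (n x 2).
# G is symmetric PSD of rank <= 2, hence a point on the manifold of
# fixed-rank PSD matrices on which landmark trajectories are compared.

def gram_matrix(landmarks):
    """landmarks: list of (x, y) points; returns the n x n Gram matrix."""
    n = len(landmarks)
    cx = sum(p[0] for p in landmarks) / n
    cy = sum(p[1] for p in landmarks) / n
    P = [(x - cx, y - cy) for x, y in landmarks]  # centered coordinates
    return [[a[0] * b[0] + a[1] * b[1] for b in P] for a in P]

G = gram_matrix([(0.0, 0.0), (2.0, 0.0), (1.0, 2.0)])
# Symmetry, and trace equal to the total squared spread of the landmarks.
assert all(abs(G[i][j] - G[j][i]) < 1e-9 for i in range(3) for j in range(3))
assert abs(sum(G[i][i] for i in range(3)) - 14.0 / 3.0) < 1e-9
```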

4.
J Imaging ; 7(12)2021 Nov 30.
Article in English | MEDLINE | ID: mdl-34940724

ABSTRACT

Estimating the 3D shape of objects from monocular images is a well-established and challenging task in the computer vision field. Further challenges arise when highly deformable objects, such as human faces or bodies, are considered. In this work, we address the problem of estimating the 3D shape of a human body from single images. In particular, we provide a solution to the problem of estimating the shape of the body when the subject is wearing clothes. This is a highly challenging scenario, as loose clothes might hide the underlying body shape to a large extent. To this aim, we make use of a parametric 3D body model, SMPL, whose parameters describe the pose and shape of the body. Our main intuition is that the shape parameters associated with an individual should not change whether or not the subject is wearing clothes. To improve the shape estimation under clothing, we train a deep convolutional network to regress the shape parameters from a single image of a person. To increase robustness to clothing, we build our training dataset by associating the shape parameters of a "minimally clothed" person with other samples of the same person wearing looser clothes. Experimental validation shows that our approach estimates body shape parameters more accurately than state-of-the-art approaches, even in the case of loose clothes.
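The dataset-construction idea can be sketched as a simple pairing step. This is an illustration under assumed names and values (the hypothetical SMPL betas and file names below are not from the paper): every image of a subject, clothed or not, gets the shape parameters estimated once from that subject's minimally clothed sample as its regression target.

```python
# Sketch: build (image, shape-target) training pairs so the network is
# forced to predict the same shape for a subject regardless of clothing.

def build_training_pairs(samples, minimal_betas):
    """samples: list of (subject_id, image); minimal_betas: subject_id -> betas.
    Every image of a subject gets that subject's minimal-clothing betas."""
    return [(image, minimal_betas[sid]) for sid, image in samples]

minimal_betas = {"s1": [0.2, -0.1], "s2": [0.5, 0.3]}  # hypothetical SMPL betas
samples = [("s1", "img_minimal.png"),
           ("s1", "img_coat.png"),    # loose clothing, same shape target
           ("s2", "img_loose.png")]
pairs = build_training_pairs(samples, minimal_betas)
assert pairs[0][1] == pairs[1][1]  # same subject -> same shape target
```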

5.
Article in English | MEDLINE | ID: mdl-34651145

ABSTRACT

We propose an automatic method for pain intensity measurement from video. For each video, pain intensity was measured from the dynamics of facial movement, captured by 66 facial points. A Gram matrix formulation was used to represent the facial point trajectories on the Riemannian manifold of symmetric positive semi-definite matrices of fixed rank. Curve fitting and temporal alignment were then used to smooth the extracted trajectories. A Support Vector Regression model was then trained to encode the extracted trajectories into ten pain intensity levels consistent with the Visual Analogue Scale for pain intensity measurement. The proposed approach was evaluated on the UNBC-McMaster Shoulder Pain Archive and compared to the state of the art on the same data. Using both 5-fold cross-validation and leave-one-subject-out cross-validation, our results are competitive with respect to state-of-the-art methods.
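Temporal alignment of trajectories recorded at different speeds is commonly done with dynamic time warping; the minimal version below is a generic sketch (plain 1-D sequences stand in for the manifold-valued trajectories the paper aligns).

```python
# Minimal dynamic-time-warping distance: the classic O(n*m) recurrence
# over a cumulative-cost table, with absolute difference as local cost.

def dtw(a, b):
    INF = float("inf")
    n, m = len(a), len(b)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

# The same motion performed at two speeds aligns with zero cost.
assert dtw([0, 1, 2, 3], [0, 1, 1, 2, 2, 3]) == 0.0
```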

6.
Sensors (Basel) ; 21(2)2021 Jan 15.
Article in English | MEDLINE | ID: mdl-33467595

ABSTRACT

Facial Action Units (AUs) correspond to the deformation/contraction of individual facial muscles or their combinations. As such, each AU affects just a small portion of the face, with deformations that are asymmetric in many cases. Generating and analyzing AUs in 3D is particularly relevant for the potential applications it can enable. In this paper, we propose a solution for 3D AU detection and synthesis, building on a newly defined 3D Morphable Model (3DMM) of the face. Unlike most 3DMMs in the literature, which mainly model global variations of the face and show limitations in adapting to local and asymmetric deformations, the proposed solution is specifically devised to cope with such difficult morphings. During a training phase, deformation coefficients are learned that enable the 3DMM to deform to 3D target scans showing neutral and expressive faces of the same individual, thus decoupling expression from identity deformations. Such deformation coefficients are then used, on the one hand, to train an AU classifier; on the other, they can be applied to a 3D neutral scan to generate AU deformations in a subject-independent manner. The proposed approach for AU detection is validated on the Bosphorus dataset, reporting competitive results with respect to the state of the art, even in a challenging cross-dataset setting. We further show that the learned coefficients are general enough to synthesize realistic 3D face instances with AU activations.
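The decoupling idea reduces to a simple observation: the deformation between a subject's neutral and expressive scans is an identity-free offset that can be re-applied to a different subject's neutral scan. The sketch below uses flat coordinate lists and made-up values purely for illustration; the actual 3DMM operates on deformation coefficients, not raw vertices.

```python
# Sketch of subject-independent AU transfer: learn an expression offset
# from subject A's scans, apply it to subject B's neutral scan.

def expression_offset(neutral, expressive):
    """Per-coordinate deformation from neutral to expressive."""
    return [e - n for n, e in zip(neutral, expressive)]

def apply_au(neutral, offset):
    """Deform a neutral scan with a previously learned offset."""
    return [n + o for n, o in zip(neutral, offset)]

subj_a_neutral = [0.0, 1.0, 2.0]
subj_a_smile = [0.1, 1.3, 2.0]
offset = expression_offset(subj_a_neutral, subj_a_smile)

subj_b_neutral = [5.0, 5.0, 5.0]
result = apply_au(subj_b_neutral, offset)
assert all(abs(r - e) < 1e-9 for r, e in zip(result, [5.1, 5.3, 5.0]))
```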

7.
Article in English | MEDLINE | ID: mdl-30059306

ABSTRACT

Face recognition "in the wild" has been revolutionized by the deployment of deep-learning-based approaches. In fact, it has been extensively demonstrated that Deep Convolutional Neural Networks (DCNNs) are powerful enough to overcome most of the limits that affected face recognition algorithms based on hand-crafted features. These include variations in illumination, pose, expression and occlusion, to name a few. The discriminative power of DCNNs comes from the fact that low- and high-level representations are learned directly from the raw image data. As a consequence, we expect the performance of a DCNN to be influenced by the characteristics of the image/video data that are fed to the network, and by their preprocessing. In this work, we present a thorough analysis of several aspects that impact the use of DCNNs for face recognition. The evaluation has been carried out from two main perspectives: the network architecture and the similarity measures used to compare deeply learned features; and the data (source and quality) and their preprocessing (bounding box and alignment). Results obtained on the IJB-A, MegaFace, UMDFaces and YouTube Faces datasets provide useful hints for designing, training and testing DCNNs. Taking into account the outcomes of the experimental evaluation, we show how performance competitive with the state of the art can be reached even with standard DCNN architectures and pipelines.
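A common similarity measure for comparing deeply learned face features is cosine similarity between feature vectors; a minimal generic version (the feature values below are placeholders, not real network outputs):

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Collinear features score 1, orthogonal ones 0.
assert abs(cosine_similarity([1.0, 2.0], [2.0, 4.0]) - 1.0) < 1e-9
assert abs(cosine_similarity([1.0, 0.0], [0.0, 1.0])) < 1e-9
```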

8.
IEEE Trans Pattern Anal Mach Intell ; 37(8): 1629-42, 2015 Aug.
Article in English | MEDLINE | ID: mdl-26353000

ABSTRACT

In this paper we introduce a method for person re-identification based on discriminative, sparse basis expansions of targets in terms of a labeled gallery of known individuals. We propose an iterative extension to sparse discriminative classifiers capable of ranking many candidate targets. The approach makes use of soft and hard re-weighting to redistribute energy among the most relevant contributing elements and to ensure that the best candidates are ranked at each iteration. Our approach also leverages a novel visual descriptor which we show to be discriminative while remaining robust to pose and illumination variations. An extensive comparative evaluation demonstrates that our approach achieves state-of-the-art performance in single- and multi-shot person re-identification scenarios on the VIPeR, i-LIDS, ETHZ, and CAVIAR4REID datasets. The combination of our descriptor and iterative sparse basis expansion improves state-of-the-art rank-1 performance by six percentage points on VIPeR and by 20 on CAVIAR4REID compared to other methods with a single gallery image per person. With multiple gallery and probe images per person, our approach improves the state of the art at rank-1 by 17 percentage points on i-LIDS and by 72 on CAVIAR4REID. The approach is also quite efficient, capable of single-shot person re-identification over galleries containing hundreds of individuals at about 30 re-identifications per second.
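The "rank every candidate" behavior can be sketched with a much simpler stand-in: at each iteration the best-matching gallery identity is selected and removed, so every candidate receives a rank. The paper does this with sparse basis expansions and re-weighting; plain squared distances and invented identities are used here only to illustrate the iterative ranking loop.

```python
# Toy iterative ranking: repeatedly pick the closest remaining gallery
# identity until all are ranked (a simplified stand-in for iterative
# sparse basis expansion).

def iterative_rank(probe, gallery):
    """gallery: dict name -> feature vector; returns names best-first."""
    remaining = dict(gallery)
    ranking = []
    while remaining:
        best = min(remaining,
                   key=lambda k: sum((p - g) ** 2
                                     for p, g in zip(probe, remaining[k])))
        ranking.append(best)
        del remaining[best]
    return ranking

gallery = {"anna": [1.0, 0.0], "bob": [0.0, 1.0], "cara": [0.9, 0.1]}
assert iterative_rank([1.0, 0.05], gallery) == ["anna", "cara", "bob"]
```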


Subjects
Algorithms, Biometric Identification/methods, Humans, Video Recording/methods
9.
Forensic Sci Int ; 251: e9-e14, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25851695

ABSTRACT

Photographic documents, both in digital and in printed format, play a fundamental role in crime scene analysis. Photos are crucial to reconstruct what happened and to freeze the scene with all the objects and pieces of evidence present. It is therefore immediate to comprehend the paramount importance of assessing the authenticity of such images, to avoid that a possible malicious counterfeit leads to a wrong evaluation of the circumstances. In this paper, we present a case study in which some printed photos, submitted as documentary evidence in a family murder case, had been fraudulently modified to bias the final judgement. In particular, the use of the CADET image forensics tool to verify the integrity of printed photos is introduced and discussed.

10.
IEEE Trans Image Process ; 24(1): 220-35, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25398180

ABSTRACT

In this paper, we present a novel and original framework, which we dub mesh-LBP, for computing local binary-like patterns on a triangular-mesh manifold. This framework can be adapted to all the Local Binary Pattern (LBP) variants employed in 2D image analysis and, as such, allows extending the related techniques to mesh surfaces. After describing the foundations, the construction and the main features of the mesh-LBP, we derive its possible variants and show how they can extend most of the 2D-LBP variants to the mesh manifold. In the experiments, we give evidence of the presence of the uniformity aspect in the mesh-LBP, similar to the one observed in the 2D-LBP. We also report repeatability experiments that confirm, in particular, the rotation-invariance of mesh-LBP descriptors. Furthermore, we analyze the potential of the mesh-LBP for 3D texture classification of triangular-mesh surfaces collected from public datasets. Comparison with state-of-the-art surface descriptors, as well as with 2D-LBP counterparts applied to depth images, also evidences the effectiveness of the proposed framework. Finally, we illustrate the robustness of the mesh-LBP with respect to the class of mesh irregularities typical of 3D surface-digitizer scans.
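The core encoding step mirrors 2D-LBP: threshold an ordered ring of neighbor values against the center and pack the bits into a code. The sketch below assumes the ring of neighboring-facet values has already been extracted and ordered (the hard part on a mesh, which the paper addresses); the values are illustrative.

```python
# Minimal LBP-style code over an ordered ring of neighbour values,
# as applied per-facet in the mesh-LBP framework.

def mesh_lbp_code(center_value, ring_values):
    """Bit i is set when neighbour i is >= the center value."""
    return sum((1 << i) for i, v in enumerate(ring_values) if v >= center_value)

# Ring of 8 neighbour values around a facet with value 0.5:
code = mesh_lbp_code(0.5, [0.6, 0.4, 0.7, 0.5, 0.2, 0.9, 0.1, 0.5])
assert code == 0b10101101  # bits 0, 2, 3, 5, 7 set
```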

11.
IEEE Trans Cybern ; 45(7): 1340-52, 2015 Jul.
Article in English | MEDLINE | ID: mdl-25216492

ABSTRACT

Recognizing human actions in 3-D video sequences is an important open problem that is currently at the heart of many research domains, including surveillance, natural interfaces and rehabilitation. However, the design and development of models for action recognition that are both accurate and efficient is a challenging task due to the variability of human pose, clothing and appearance. In this paper, we propose a new framework to extract a compact representation of a human action captured through a depth sensor, and to enable accurate action recognition. The proposed solution builds on fitting a human skeleton model to the acquired data so as to represent the 3-D coordinates of the joints and their change over time as a trajectory in a suitable action space. Thanks to such a 3-D joint-based framework, the proposed solution is capable of capturing both the shape and the dynamics of the human body simultaneously. The action recognition problem is then formulated as the problem of computing the similarity between the shapes of trajectories in a Riemannian manifold. Classification using k-nearest neighbors is finally performed on this manifold, taking advantage of Riemannian geometry in the open-curve shape space. Experiments are carried out on four representative benchmarks to demonstrate the potential of the proposed solution in terms of accuracy/latency for low-latency action recognition. Comparative results with state-of-the-art methods are reported.
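The final classification step is standard k-nearest-neighbor voting under a pluggable distance; in the paper the distance is the Riemannian shape-space distance between trajectories, for which plain Euclidean distance and made-up action labels stand in below.

```python
from collections import Counter

def knn_classify(query, examples, k, dist):
    """examples: list of (sample, label); dist: any distance function."""
    nearest = sorted(examples, key=lambda e: dist(query, e[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

# Hypothetical trajectory embeddings with action labels.
examples = [([0.0, 0.0], "wave"), ([0.1, 0.1], "wave"), ([1.0, 1.0], "kick")]
assert knn_classify([0.05, 0.0], examples, k=3, dist=euclidean) == "wave"
```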


Subjects
Three-Dimensional Imaging/methods, Machine Learning, Movement/physiology, Automated Pattern Recognition/methods, Photography/methods, Whole Body Imaging/methods, Actigraphy/methods, Humans, Computer-Assisted Image Interpretation/methods, Video Recording/methods
12.
IEEE Trans Pattern Anal Mach Intell ; 36(12): 2538-51, 2014 Dec.
Article in English | MEDLINE | ID: mdl-26353156

ABSTRACT

In this paper, we present the ALIEN tracking method, which exploits oversampling of local invariant representations to build a robust object/context discriminative classifier. To this end, we use multiple instances of scale-invariant local features weakly aligned along the object template. This makes it possible to take into account 3D shape deviations from planarity and their interactions with shadows, occlusions, and sensor quantization, for which no invariant representations can be defined. A non-parametric learning algorithm based on the transitive matching property discriminates the object from the context and prevents improper object template updating during occlusion. We show that our learning rule has asymptotic stability under mild conditions and confirms the drift-free capability of the method in long-term tracking. A real-time implementation of the ALIEN tracker has been evaluated against state-of-the-art tracking systems on an extensive set of publicly available video sequences that represent most of the critical conditions occurring in real tracking environments. We report superior or equal performance in most cases and verify tracking with no drift in very long video sequences.
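The transitive matching property can be sketched as a consistency filter: a candidate correspondence a→c is trusted only if it agrees with already-trusted correspondences a→b and b→c. This toy version over string feature IDs is an illustration of the property itself, not the ALIEN learning algorithm.

```python
# Transitive-consistency check over a set of trusted feature matches.

def transitive_consistent(matches, a, c):
    """matches: set of (src, dst) pairs already trusted.
    True when some b exists with a->b and b->c both trusted."""
    return any((a, b) in matches and (b, c) in matches
               for b in {m[1] for m in matches})

trusted = {("f1", "f2"), ("f2", "f3")}
assert transitive_consistent(trusted, "f1", "f3") is True
assert transitive_consistent(trusted, "f1", "f9") is False
```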

13.
IEEE Trans Pattern Anal Mach Intell ; 36(5): 1033-40, 2014 May.
Article in English | MEDLINE | ID: mdl-26353235

ABSTRACT

In this paper, we present a novel method to improve the flexibility of descriptor matching for image recognition by using local multiresolution pyramids in feature space. We propose that image patches be represented at multiple levels of descriptor detail and that these levels be defined in terms of local spatial pooling resolution. Preserving multiple levels of detail in local descriptors is a way of hedging one's bets on which levels will be most relevant for matching during learning and recognition. We introduce the Pyramid SIFT (P-SIFT) descriptor and show that its use in four state-of-the-art image recognition pipelines improves accuracy and yields state-of-the-art results. Our technique is applicable independently of spatial pyramid matching, and we show that spatial pyramids can be combined with local pyramids to obtain further improvement. We achieve state-of-the-art results on Caltech-101 (80.1%) and Caltech-256 (52.6%) compared to other approaches based on SIFT features over intensity images. Our technique is efficient and extremely easy to integrate into image recognition pipelines.
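Local multiresolution pooling can be sketched by describing one patch at several pooling resolutions and concatenating the results, so both a coarse and a fine view of the same measurements survive. P-SIFT does this with SIFT gradient histograms; raw cell sums over a tiny 2x2 "patch" stand in here.

```python
# Sum-pool a square patch at multiple grid resolutions and concatenate.

def pool(patch, grid):
    """Sum-pool a square patch (list of rows) into grid x grid cells."""
    n = len(patch)
    cell = n // grid
    return [sum(patch[r][c]
                for r in range(gr * cell, (gr + 1) * cell)
                for c in range(gc * cell, (gc + 1) * cell))
            for gr in range(grid) for gc in range(grid)]

def pyramid_descriptor(patch, grids=(1, 2)):
    """Concatenate pooled descriptors over all requested resolutions."""
    desc = []
    for g in grids:
        desc.extend(pool(patch, g))
    return desc

patch = [[1, 2], [3, 4]]
# 1x1 level: one cell summing everything; 2x2 level: the four raw values.
assert pyramid_descriptor(patch) == [10, 1, 2, 3, 4]
```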

14.
IEEE Trans Image Process ; 22(3): 1018-31, 2013 Mar.
Article in English | MEDLINE | ID: mdl-23144034

ABSTRACT

In this paper, we contribute to the design of a novel variational framework able to match and recognize multiple instances of multiple reference logos in image archives. Reference logos and test images are seen as constellations of local features (interest points, regions, etc.) and matched by minimizing an energy function mixing: 1) a fidelity term that measures the quality of feature matching; 2) a neighborhood criterion that captures feature co-occurrence/geometry; and 3) a regularization term that controls the smoothness of the matching solution. We also introduce a detection/recognition procedure and study its theoretical consistency. Finally, we show the validity of our method through extensive experiments on the challenging MICC-Logos dataset. Our method outperforms baseline as well as state-of-the-art matching/recognition procedures by 20%.
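The shape of such an energy can be sketched in miniature: score a candidate matching as a fidelity term (descriptor distance) plus a neighborhood/smoothness term that rewards neighboring logo features for landing near each other in the image. This toy (scalar descriptors, invented feature IDs) only illustrates the structure of the objective, not the paper's full variational model.

```python
# Toy matching energy: fidelity + lambda * geometric smoothness.

def energy(matching, logo_desc, img_desc, logo_neighbors, img_pos, lam=1.0):
    """matching: dict logo_id -> image_id."""
    fidelity = sum(abs(logo_desc[l] - img_desc[matching[l]]) for l in matching)
    smooth = 0.0
    for l1, l2 in logo_neighbors:  # neighboring logo features should stay close
        if l1 in matching and l2 in matching:
            p, q = img_pos[matching[l1]], img_pos[matching[l2]]
            smooth += ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5
    return fidelity + lam * smooth

logo_desc = {0: 1.0, 1: 2.0}
img_desc = {"a": 1.0, "b": 2.0, "c": 2.0}
img_pos = {"a": (0, 0), "b": (1, 0), "c": (10, 10)}
logo_neighbors = [(0, 1)]

good = {0: "a", 1: "b"}  # geometrically coherent matching
bad = {0: "a", 1: "c"}   # same fidelity, but neighbors torn far apart
assert energy(good, logo_desc, img_desc, logo_neighbors, img_pos) < \
       energy(bad, logo_desc, img_desc, logo_neighbors, img_pos)
```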


Subjects
Artificial Intelligence, Documentation/methods, Emblems and Insignia/classification, Computer-Assisted Image Interpretation/methods, Information Storage and Retrieval/methods, Automated Pattern Recognition/methods, Subtraction Technique, Algorithms, Image Enhancement/methods, Reproducibility of Results, Sensitivity and Specificity
15.
IEEE Trans Pattern Anal Mach Intell ; 32(12): 2162-77, 2010 Dec.
Article in English | MEDLINE | ID: mdl-20975115

ABSTRACT

In this paper, we present a novel approach to 3D face matching that shows high effectiveness in distinguishing facial differences between distinct individuals from differences induced by non-neutral expressions within the same individual. The approach takes into account geometrical information of the 3D face and encodes the relevant information into a compact graph representation. Nodes of the graph represent equal-width isogeodesic facial stripes. Arcs between pairs of nodes are labeled with descriptors, referred to as 3D Weighted Walkthroughs (3DWWs), that capture the mutual relative spatial displacement between all pairs of points of the corresponding stripes. Face partitioning into isogeodesic stripes and 3DWWs together provide an approximate representation of the local morphology of faces that exhibits smooth variations for changes induced by facial expressions. The graph-based representation permits very efficient matching for face recognition and is also suited to face identification in very large datasets with the support of appropriate index structures. The method obtained the best ranking at the SHREC 2008 contest for 3D face recognition. We present an extensive comparative evaluation of performance on the FRGC v2.0 and SHREC08 datasets.


Subjects
Biometric Identification/methods, Face/anatomy & histology, Computer-Assisted Image Processing/methods, Algorithms, Factual Databases, Female, Humans, Male, Principal Component Analysis
16.
IEEE Trans Pattern Anal Mach Intell ; 32(3): 517-29, 2010 Mar.
Article in English | MEDLINE | ID: mdl-20075475

ABSTRACT

Identifying correspondences between trajectory segments observed from non-synchronized cameras is important for reconstructing the complete trajectory of moving targets in a large scene. Such a reconstruction can be obtained from motion data by comparing the trajectory segments and estimating both the spatial and temporal alignments. Exhaustive testing of all possible correspondences of trajectories over a temporal window is only viable in cases with a limited number of moving targets and large view overlaps. Therefore, alternative solutions are required for situations with several trajectories that are only partially visible in each view. In this paper, we propose a new method based on a view-invariant representation of trajectories, which is used to produce a sparse set of salient points for the trajectory segments observed in each view. Only the neighborhoods of these salient points in the view-invariant representation are then used to estimate the spatial and temporal alignment of trajectory pairs in different views. We demonstrate that, for planar scenes, the method is able to recover both spatial and temporal alignments with good precision and efficiency, even given relatively small overlap between views and arbitrary (unknown) temporal shifts of the cameras. The method also provides the same capabilities in the case of trajectories that are only locally planar, but exhibit some non-planarity at a global level.

18.
IEEE Trans Pattern Anal Mach Intell ; 27(1): 99-114, 2005 Jan.
Article in English | MEDLINE | ID: mdl-15628272

ABSTRACT

Image analysis and computer vision can be effectively employed to recover the three-dimensional structure of imaged objects, together with their surface properties. In this paper, we address the problem of metric reconstruction and texture acquisition from a single uncalibrated view of a surface of revolution (SOR). Geometric constraints induced in the image by the symmetry properties of the SOR structure are exploited to perform self-calibration of a natural camera, 3D metric reconstruction, and texture acquisition. By exploiting the analogy with the geometry of single-axis motion, we demonstrate that the imaged apparent contour and the visible segments of two imaged cross-sections in a single SOR view provide enough information for these tasks. The original contributions of the paper are the following: single-view self-calibration and reconstruction based on planar rectification, previously developed for planar surfaces, is extended to the SOR class of curved surfaces; self-calibration is obtained by estimating both the camera focal length (one parameter) and the principal point (two parameters) from three independent linear constraints on the SOR fixed entities; and the invariant-based description of the SOR scaling function is extended from affine to perspective projection. The proposed solution exploits both the geometric and topological properties of the transformation that relates the apparent contour to the SOR scaling function. Therefore, with this method, a metric localization of the occluded parts of the SOR can be made, so as to cope with them correctly. For the reconstruction of textured SORs, texture acquisition is performed without requiring the estimation of external camera calibration parameters, using only the internal camera parameters obtained from self-calibration.


Subjects
Algorithms, Artificial Intelligence, Computer-Assisted Image Interpretation/methods, Three-Dimensional Imaging/methods, Information Storage and Retrieval/methods, Automated Pattern Recognition/methods, Subtraction Technique, Calibration, Cluster Analysis, Computer Simulation, Image Enhancement/methods, Computer-Assisted Numerical Analysis, Reproducibility of Results, Sensitivity and Specificity, Computer-Assisted Signal Processing