Results 1 - 18 of 18
1.
Article in English | MEDLINE | ID: mdl-38923485

ABSTRACT

Recent advances in the understanding of Generative Adversarial Networks (GANs) have led to remarkable progress in visual editing and synthesis tasks, capitalizing on the rich semantics that are embedded in the latent spaces of pre-trained GANs. However, existing methods are often tailored to specific GAN architectures and are limited to either discovering global semantic directions that do not facilitate localized control, or require some form of supervision through manually provided regions or segmentation masks. In this light, we present an architecture-agnostic approach that jointly discovers factors representing spatial parts and their appearances in an entirely unsupervised fashion. These factors are obtained by applying a semi-nonnegative tensor factorization on the feature maps, which in turn enables context-aware local image editing with pixel-level control. In addition, we show that the discovered appearance factors correspond to saliency maps that localize concepts of interest, without using any labels. Experiments on a wide range of GAN architectures and datasets show that, in comparison to the state of the art, our method is far more efficient in terms of training time and, most importantly, provides much more accurate localized control.

2.
Scand J Urol ; 59: 131-136, 2024 Jun 19.
Article in English | MEDLINE | ID: mdl-38896113

ABSTRACT

OBJECTIVE: Disease recurrence, particularly intravesical recurrence (IVR) after radical nephroureterectomy (RNU) for upper tract urothelial carcinoma (UTUC), is common. We investigated whether violations of onco-surgical principles before or during RNU, collectively referred to as surgical violation (SV), were associated with survival outcomes. MATERIAL AND METHODS: Data from a consecutive series of patients who underwent RNU for UTUC 2001-2012 at Skåne University Hospital Lund/Malmö were collected. Preoperative insertion of a nephrostomy tube, opening the urinary tract during surgery, or refraining from excising the distal ureter were considered SVs. Survival outcomes in patients with and without SV (IVR-free [IVRFS], disease-specific [DSS] and overall survival [OS]) were assessed using multivariate Cox regression analyses (adjusted for tumour stage group, prior or concomitant bladder cancer, comorbidity and preoperative urinary cytology). RESULTS: Of 150 patients, 47 (31%) were subjected to at least one SV. Overall, SV was not associated with IVRFS (HR 0.81, 95% CI 0.4-1.6) but was associated with worse DSS (HR 1.9, 95% CI 1.03-3.7) and OS (HR 1.9, 95% CI 1.2-3) in multivariable analysis. Additional analyses with a broader definition of SV that also included preoperative instrumentation of the upper urinary tract (ureteroscopy and/or double J stenting) showed similar outcomes for DSS (HR 2.1, 95% CI 1.1-4.3). CONCLUSION: Worse survival outcomes, despite no difference in IVR, for patients who were subjected to the violation of sound onco-surgical principles before or during RNU for UTUC strengthen the notion that adhering to such principles is a cornerstone in upper tract urothelial cancer surgery.


Subjects
Carcinoma, Transitional Cell; Kidney Neoplasms; Nephroureterectomy; Ureteral Neoplasms; Humans; Nephroureterectomy/methods; Female; Male; Aged; Ureteral Neoplasms/surgery; Ureteral Neoplasms/mortality; Ureteral Neoplasms/pathology; Carcinoma, Transitional Cell/surgery; Carcinoma, Transitional Cell/mortality; Carcinoma, Transitional Cell/pathology; Kidney Neoplasms/surgery; Kidney Neoplasms/mortality; Kidney Neoplasms/pathology; Middle Aged; Survival Rate; Retrospective Studies; Neoplasm Recurrence, Local/epidemiology; Aged, 80 and over; Ureter/surgery
3.
Circ Arrhythm Electrophysiol ; 13(4): e007614, 2020 04.
Article in English | MEDLINE | ID: mdl-32189516

ABSTRACT

BACKGROUND: Heart rate variability (HRV) and pulse rate variability are indices of autonomic cardiac modulation. Increased pericardial fat is associated with worse cardiovascular outcomes. We hypothesized that progressive increases in pericardial fat volume and inflammation prospectively dampen HRV in hypercholesterolemic pigs. METHODS: WT (wild type) or PCSK9 (proprotein convertase subtilisin-like/kexin type-9) gain-of-function Ossabaw mini-pigs were studied in vivo before and after 3 and 6 months of a normal diet (WT-normal diet, n=4; PCSK9-normal diet, n=6) or high-fat diet (HFD; WT-HFD, n=3; PCSK9-HFD, n=6). The arterial pulse waveform was obtained from an arterial telemetry transmitter to analyze HRV indices, including SD (SD of all pulse-to-pulse intervals over a single 5-minute period), root mean square of successive differences, proportion >50 ms of normal-to-normal R-R intervals, and the calculated ratio of low-to-high frequency distributions (low-frequency power/high-frequency power). Pericardial fat volumes were evaluated using multidetector computed tomography and its inflammation by gene expression of TNF (tumor necrosis factor)-α. Plasma lipid panel and norepinephrine level were also measured. RESULTS: At diet completion, hypercholesterolemic PCSK9-HFD had significantly (P<0.05 versus baseline) depressed HRV (SD of all pulse-to-pulse intervals over a single 5-minute period, root mean square of successive differences, proportion >50 ms, high-frequency power, low-frequency power), and both HFD groups had higher sympathovagal balance (SD of all pulse-to-pulse intervals over a single 5-minute period/root mean square of successive differences, low-frequency power/high-frequency power) compared with normal diet. Pericardial fat volumes and LDL (low-density lipoprotein) cholesterol concentrations correlated inversely with HRV and directly with sympathovagal balance, while sympathovagal balance correlated directly with plasma norepinephrine. 
Pericardial fat TNF-α expression was upregulated in PCSK9-HFD, colocalized with nerve fibers, and correlated inversely with root mean square of successive differences and proportion >50 ms. CONCLUSIONS: Progressive pericardial fat expansion and inflammation are associated with a fall in HRV in Ossabaw mini-pigs, implying aggravated autonomic imbalance. Hence, pericardial fat accumulation is associated with alterations in HRV and the autonomic nervous system. Visual Overview: A visual overview is available for this article.
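The time-domain indices named in this abstract (SD of all intervals, root mean square of successive differences, and the proportion of successive differences above 50 ms) can be illustrated with a minimal sketch. The function name `hrv_time_domain`, the input format (a list of successive beat-to-beat intervals in milliseconds), and the example values are assumptions made for illustration; the study's own processing of the telemetry waveform is not described in the abstract.

```python
import math
import statistics


def hrv_time_domain(intervals_ms):
    """Time-domain HRV indices from successive beat-to-beat intervals (ms).

    Hypothetical helper for illustration; not taken from the cited study.
    """
    if len(intervals_ms) < 3:
        raise ValueError("need at least three intervals")
    # Successive differences between adjacent intervals.
    diffs = [b - a for a, b in zip(intervals_ms, intervals_ms[1:])]
    # Sample standard deviation of all intervals in the window.
    sd = statistics.stdev(intervals_ms)
    # Root mean square of successive differences.
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    # Proportion of successive differences larger than 50 ms.
    pnn50 = sum(1 for d in diffs if abs(d) > 50) / len(diffs)
    # SD/RMSSD, used in the abstract as one sympathovagal balance index.
    ratio = sd / rmssd if rmssd > 0 else float("nan")
    return {"sd": sd, "rmssd": rmssd, "pnn50": pnn50, "sd_over_rmssd": ratio}


# Made-up intervals, only to exercise the function.
example = hrv_time_domain([800, 810, 790, 850, 845])
```

The frequency-domain ratio (low-frequency power/high-frequency power) needs a spectral estimate and is left out of this sketch.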


Subjects
Adipose Tissue/physiopathology; Adiposity; Arrhythmias, Cardiac/etiology; Autonomic Nervous System/physiopathology; Heart Rate; Hypercholesterolemia/complications; Inflammation/etiology; Pericardium/physiopathology; Adipose Tissue/metabolism; Animals; Animals, Genetically Modified; Arrhythmias, Cardiac/metabolism; Arrhythmias, Cardiac/physiopathology; Autonomic Nervous System/metabolism; Cholesterol/blood; Disease Models, Animal; Hypercholesterolemia/metabolism; Hypercholesterolemia/physiopathology; Inflammation/metabolism; Inflammation/physiopathology; Inflammation Mediators/metabolism; Male; Norepinephrine/blood; Pericardium/metabolism; Swine; Swine, Miniature/genetics; Time Factors; Tumor Necrosis Factor-alpha/metabolism
4.
IEEE Trans Pattern Anal Mach Intell ; 40(12): 2948-2962, 2018 12.
Article in English | MEDLINE | ID: mdl-29990153

ABSTRACT

In this paper, we propose a maximum margin classifier that deals with uncertainty in data input. More specifically, we reformulate the SVM framework such that each training example can be modeled by a multi-dimensional Gaussian distribution described by its mean vector and its covariance matrix, the latter modeling the uncertainty. We address the classification problem and define a cost function that is the expected value of the classical SVM cost when data samples are drawn from the multi-dimensional Gaussian distributions that form the set of the training examples. Our formulation approximates the classical SVM formulation when the training examples are isotropic Gaussians with variance tending to zero. We arrive at a convex optimization problem, which we solve efficiently in the primal form using a stochastic gradient descent approach. The resulting classifier, which we name SVM with Gaussian Sample Uncertainty (SVM-GSU), is tested on synthetic data and five publicly available and popular datasets; namely, the MNIST, WDBC, DEAP, TV News Channel Commercial Detection, and TRECVID MED datasets. Experimental results verify the effectiveness of the proposed method.
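The idea of taking the expected value of the hinge loss over a Gaussian-distributed training example can be sketched for a linear classifier. If x ~ N(mu, Sigma), then z = 1 - y(w·x + b) is a one-dimensional Gaussian with mean m = 1 - y(w·mu + b) and variance s² = wᵀ Sigma w, and the standard rectified-Gaussian identity gives E[max(0, z)] = m Φ(m/s) + s φ(m/s). The function name and arguments below are assumptions for illustration, and the paper's exact cost function (regularization, optimization details) is not reproduced here.

```python
import math


def expected_hinge_loss(w, b, mu, sigma, y):
    """E[max(0, 1 - y (w.x + b))] for x ~ N(mu, sigma).

    Hypothetical helper: w and mu are lists, sigma is a covariance matrix
    given as a list of rows, y is +1 or -1.
    """
    # Mean of the margin variable z = 1 - y (w.x + b).
    m = 1.0 - y * (sum(wi * xi for wi, xi in zip(w, mu)) + b)
    # Variance of z is w^T sigma w (y squared equals 1).
    n = len(w)
    var = sum(w[i] * sigma[i][j] * w[j] for i in range(n) for j in range(n))
    s = math.sqrt(max(var, 0.0))
    if s == 0.0:
        # No uncertainty: falls back to the classical hinge loss.
        return max(0.0, m)
    t = m / s
    cdf = 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))          # standard normal CDF
    pdf = math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)   # standard normal PDF
    return m * cdf + s * pdf


zero_cov = [[0.0, 0.0], [0.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
certain = expected_hinge_loss([2.0, -1.0], 0.5, [0.1, 0.3], zero_cov, 1)
uncertain = expected_hinge_loss([1.0, 0.0], 0.0, [1.0, 0.0], identity, 1)
```

With zero covariance the value reduces to the ordinary hinge loss, which mirrors the abstract's statement that the formulation approximates the classical SVM as the variance tends to zero.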

5.
IEEE Trans Image Process ; 24(8): 2393-403, 2015 Aug.
Article in English | MEDLINE | ID: mdl-25872211

ABSTRACT

Face alignment has been well studied in recent years; however, when a face alignment model is applied to facial images with heavy partial occlusion, the performance deteriorates significantly. In this paper, instead of training an occlusion-aware model with visibility annotation, we address this issue via a model adaptation scheme that uses the result of a local regression forest (RF) voting method. In the proposed scheme, the consistency of the votes of the local RF in each of several oversegmented regions is used to determine the reliability of predicting the location of the facial landmarks. The latter is what we call regional predictive power (RPP). Subsequently, we adapt a holistic voting method (cascaded pose regression based on random ferns) by putting weights on the votes of each fern according to the RPP of the regions used in the fern tests. The proposed method shows superior performance over existing face alignment models on the most challenging data sets (COFW and 300-W). Moreover, it can also estimate with high accuracy (72.4% overlap ratio) which image areas belong to the face or nonface objects, on the heavily occluded images of the COFW data set, without explicit occlusion modeling.


Subjects
Biometric Identification/methods; Face/anatomy & histology; Algorithms; Databases, Factual; Decision Trees; Humans; Models, Statistical
6.
IEEE Trans Image Process ; 24(2): 619-31, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25532183

ABSTRACT

In this paper, we propose an object alignment method that detects the landmarks of an object in 2D images. In the regression forests (RFs) framework, observations (patches) that are extracted at several image locations cast votes for the localization of several landmarks. We propose to refine the votes before accumulating them into the Hough space, by sieving and/or aggregating. In order to filter out false positive votes, we pass them through several sieves, each associated with a discrete or continuous latent variable. The sieves filter out votes that are not consistent with the latent variable in question, something that implicitly enforces global constraints. In order to aggregate the votes when necessary, we adjust on the fly a proximity threshold by applying a classifier on middle-level features extracted from voting maps for the object landmark in question. Moreover, our method is able to predict the unreliability of an individual object landmark. This information can be useful for subsequent object analysis like object recognition. Our contributions are validated for two object alignment tasks, face alignment and car alignment, on data sets with challenging images collected in the wild, i.e., the Labeled Face in the Wild, the Annotated Facial Landmarks in the Wild, and the street scene car data set. We show that with the proposed approach, and without explicitly introducing shape models, we obtain performance superior or close to the state of the art for both tasks.

7.
IEEE Trans Pattern Anal Mach Intell ; 35(6): 1357-69, 2013 Jun.
Article in English | MEDLINE | ID: mdl-23599052

ABSTRACT

We propose a method for head-pose invariant facial expression recognition that is based on a set of characteristic facial points. To achieve head-pose invariance, we propose the Coupled Scaled Gaussian Process Regression (CSGPR) model for head-pose normalization. In this model, we first learn independently the mappings between the facial points in each pair of (discrete) nonfrontal poses and the frontal pose, and then perform their coupling in order to capture dependences between them. During inference, the outputs of the coupled functions from different poses are combined using a gating function, devised based on the head-pose estimation for the query points. The proposed model outperforms state-of-the-art regression-based approaches to head-pose normalization, 2D and 3D Point Distribution Models (PDMs), and Active Appearance Models (AAMs), especially in cases of unknown poses and imbalanced training data. To the best of our knowledge, the proposed method is the first one that is able to deal with expressive faces in the range from -45° to +45° pan rotation and -30° to +30° tilt rotation, and with continuous changes in head pose, despite the fact that training was conducted on a small set of discrete poses. We evaluate the proposed method on synthetic and real images depicting acted and spontaneously displayed facial expressions.


Subjects
Facies; Normal Distribution; Algorithms; Biometry/methods; Head/anatomy & histology; Humans; Image Enhancement; Image Interpretation, Computer-Assisted/methods; Pattern Recognition, Automated/methods; Regression Analysis
8.
IEEE Trans Image Process ; 21(2): 816-27, 2012 Feb.
Article in English | MEDLINE | ID: mdl-21859620

ABSTRACT

In this paper, we exploit the advantages of tensorial representations and propose several tensor learning models for regression. The models are based on the canonical/parallel-factor decomposition of tensors of multiple modes and allow the simultaneous projection of an input tensor onto more than one direction along each mode. Two empirical risk functions are studied, namely, the square loss and ε-insensitive loss functions. The former leads to higher rank tensor ridge regression (TRR), and the latter leads to higher rank support tensor regression (STR), both formulated using the Frobenius norm for regularization. We also use the group-sparsity norm for regularization, thereby favoring a low rank decomposition of the tensorial weight. In that way, we achieve the automatic selection of the rank during the learning process and obtain the optimal-rank TRR and STR. Experiments conducted for the problems of head-pose, human-age, and 3-D body-pose estimation using real data from publicly available databases verified not only the superiority of tensors over their vector counterparts but also the efficiency of the proposed algorithms.
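To make the role of the canonical/parallel-factor (CP) decomposition concrete, the sketch below covers the two-mode (matrix) case: if the weight is W = Σ_r u_r v_rᵀ, the regression output ⟨X, W⟩ + b equals Σ_r u_rᵀ X v_r + b, so the input is projected onto R directions along each mode without forming W. The function names and the tiny example values are assumptions for illustration; the training procedures (ridge or ε-insensitive loss, group-sparsity regularization) are not shown.

```python
def cp_predict(x, us, vs, bias=0.0):
    """Rank-R prediction sum_r u_r^T X v_r + bias for a matrix input X.

    Hypothetical helper: x is a list of rows, us and vs are lists of the
    R factor vectors for mode 1 and mode 2.
    """
    total = bias
    for u, v in zip(us, vs):
        # Project X along mode 2 (X v), then along mode 1 (u^T (X v)).
        xv = [sum(x_ij * v_j for x_ij, v_j in zip(row, v)) for row in x]
        total += sum(u_i * xv_i for u_i, xv_i in zip(u, xv))
    return total


def full_weight(us, vs):
    """Assemble W = sum_r u_r v_r^T explicitly (only to check the identity)."""
    rows, cols = len(us[0]), len(vs[0])
    return [[sum(u[i] * v[j] for u, v in zip(us, vs)) for j in range(cols)]
            for i in range(rows)]


def inner_product(x, w):
    """Frobenius inner product <X, W>."""
    return sum(x_ij * w_ij for xr, wr in zip(x, w) for x_ij, w_ij in zip(xr, wr))


X = [[1.0, 2.0], [3.0, 4.0]]
US = [[1.0, 0.0], [0.0, 1.0]]   # u_1, u_2
VS = [[1.0, 1.0], [2.0, 0.0]]   # v_1, v_2
factored = cp_predict(X, US, VS)
explicit = inner_product(X, full_weight(US, VS))
```

The factored form stores only the factor vectors rather than a full weight array, which is one reason such models scale to higher-order inputs.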


Subjects
Image Processing, Computer-Assisted/methods; Posture/physiology; Regression Analysis; Support Vector Machine; Databases, Factual; Humans; Video Recording
9.
IEEE Trans Neural Netw Learn Syst ; 23(1): 127-37, 2012 Jan.
Article in English | MEDLINE | ID: mdl-24808462

ABSTRACT

One of the most informative measures for feature extraction (FE) is mutual information (MI). In terms of MI, the optimal FE creates new features that jointly have the largest dependency on the target class. However, obtaining an accurate estimate of a high-dimensional MI as well as optimizing with respect to it is not always easy, especially when only small training sets are available. In this paper, we propose an efficient tree-based method for FE in which at each step a new feature is created by selecting and linearly combining two features such that the MI between the new feature and the class is maximized. Both the selection of the features to be combined and the estimation of the coefficients of the linear transform rely on estimating 2-D MIs. The estimation of the latter is computationally very efficient and robust. The effectiveness of our method is evaluated on several real-world data sets. The results show that the classification accuracy obtained by the proposed method is higher than that achieved by other FE methods.
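One step of the kind described here, combining two features linearly so that the mutual information (MI) between the new feature and the class is as large as possible, can be sketched with a plug-in MI estimate on discretized values and a grid search over the mixing angle. Everything below (function names, equal-width binning, the angle grid) is an assumption chosen for brevity; the paper's own MI estimator and coefficient search are not specified in the abstract.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats for two discrete sequences."""
    n = len(xs)
    joint, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(c / n * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())


def discretize(values, bins):
    """Equal-width binning of real values into integer bin indices."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def best_linear_pair(f1, f2, labels, bins=2, steps=180):
    """Grid-search the angle theta so that cos(theta)*f1 + sin(theta)*f2
    has maximal estimated MI with the class labels."""
    best_mi, best_theta = -1.0, 0.0
    for k in range(steps):
        theta = math.pi * k / steps
        combined = [math.cos(theta) * a + math.sin(theta) * b
                    for a, b in zip(f1, f2)]
        mi = mutual_information(discretize(combined, bins), labels)
        if mi > best_mi:
            best_mi, best_theta = mi, theta
    return best_mi, best_theta


# Toy data: f2 determines the class, f1 is unrelated to it.
labels = [0, 0, 1, 1]
f1 = [0.0, 1.0, 0.0, 1.0]
f2 = [0.0, 0.0, 1.0, 1.0]
best_mi, best_theta = best_linear_pair(f1, f2, labels)
```

In a tree-based scheme of the kind the abstract outlines, a step like this would be repeated, with each new feature becoming a candidate for later combinations.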

10.
IEEE Trans Syst Man Cybern B Cybern ; 41(5): 1366-81, 2011 Oct.
Article in English | MEDLINE | ID: mdl-21642042

ABSTRACT

Computer vision techniques have made considerable progress in recognizing object categories by learning models that normally rely on a set of discriminative features. However, in contrast to human perception that makes extensive use of logic-based rules, these models fail to benefit from knowledge that is explicitly provided. In this paper, we propose a framework that can perform knowledge-assisted analysis of visual content. We use ontologies to model the domain knowledge and a set of conditional probabilities to model the application context. Then, a Bayesian network is used for integrating statistical and explicit knowledge and performing hypothesis testing using evidence-driven probabilistic inference. In addition, we propose the use of a focus-of-attention (FoA) mechanism that is based on the mutual information between concepts. This mechanism selects the most prominent hypotheses to be verified/tested by the BN, hence removing the need to exhaustively test all possible combinations of the hypotheses set. We experimentally evaluate our framework using content from three domains and for the following three tasks: 1) image categorization; 2) localized region labeling; and 3) weak annotation of video shot keyframes. The results obtained demonstrate the improvement in performance compared to a set of baseline concept classifiers that are not aware of any context or domain knowledge. Finally, we also demonstrate the ability of the proposed FoA mechanism to significantly reduce the computational cost of visual inference while obtaining results comparable to the exhaustive case.


Subjects
Bayes Theorem; Cybernetics; Image Processing, Computer-Assisted/methods; Neural Networks, Computer; Algorithms; Attention; Probability; Visual Perception
11.
IEEE Trans Image Process ; 20(4): 1126-40, 2011 Apr.
Article in English | MEDLINE | ID: mdl-20851793

ABSTRACT

In this paper we address the problem of localization and recognition of human activities in unsegmented image sequences. The main contribution of the proposed method is the use of an implicit representation of the spatiotemporal shape of the activity which relies on the spatiotemporal localization of characteristic ensembles of feature descriptors. Evidence for the spatiotemporal localization of the activity is accumulated in a probabilistic spatiotemporal voting scheme. The local nature of the proposed voting framework allows us to deal with multiple activities taking place in the same scene, as well as with activities in the presence of clutter and occlusion. We use boosting in order to select characteristic ensembles per class. This leads to a set of class-specific codebooks where each codeword is an ensemble of features. During training, we store the spatial positions of the codeword ensembles with respect to a set of reference points, as well as their temporal positions with respect to the start and end of the action instance. During testing, each activated codeword ensemble casts votes concerning the spatiotemporal position and extent of the action, using the information that was stored during training. Mean Shift mode estimation in the voting space provides the most probable hypotheses concerning the localization of the subjects at each frame, as well as the extent of the activities depicted in the image sequences. We present classification and localization results for a number of publicly available datasets, and for a number of sequences where there is a significant amount of clutter and occlusion.


Subjects
Algorithms; Image Interpretation, Computer-Assisted/methods; Movement/physiology; Pattern Recognition, Automated/methods; Photography/methods; Subtraction Technique; Video Recording/methods; Artificial Intelligence; Humans; Image Enhancement/methods; Reproducibility of Results; Sensitivity and Specificity
12.
IEEE Trans Pattern Anal Mach Intell ; 32(11): 1940-54, 2010 Nov.
Article in English | MEDLINE | ID: mdl-20847386

ABSTRACT

In this work, we propose a dynamic texture-based approach to the recognition of facial Action Units (AUs, atomic facial gestures) and their temporal models (i.e., sequences of temporal segments: neutral, onset, apex, and offset) in near-frontal-view face videos. Two approaches to modeling the dynamics and the appearance in the face region of an input video are compared: an extended version of Motion History Images and a novel method based on Nonrigid Registration using Free-Form Deformations (FFDs). The extracted motion representation is used to derive motion orientation histogram descriptors in both the spatial and temporal domain. Per AU, a combination of discriminative, frame-based GentleBoost ensemble learners and dynamic, generative Hidden Markov Models detects the presence of the AU in question and its temporal segments in an input image sequence. When tested for recognition of all 27 lower and upper face AUs, occurring alone or in combination in 264 sequences from the MMI facial expression database, the proposed method achieved an average event recognition accuracy of 89.2 percent for the MHI method and 94.3 percent for the FFD method. The generalization performance of the FFD method has been tested using the Cohn-Kanade database. Finally, we also explored the performance on spontaneous expressions in the Sensitive Artificial Listener data set.


Subjects
Facial Expression; Pattern Recognition, Automated/methods; Video Recording/methods; Algorithms; Computing Methodologies; Face; Gestures; Humans; Information Storage and Retrieval/methods; Models, Biological; Recognition, Psychology; Time Factors
13.
IEEE Trans Pattern Anal Mach Intell ; 32(9): 1553-67, 2010 Sep.
Article in English | MEDLINE | ID: mdl-20634552

ABSTRACT

This paper addresses the problem of robust template tracking in image sequences. Our work falls within the discriminative framework in which the observations at each frame yield direct probabilistic predictions of the state of the target. Our primary contribution is that we explicitly address the problem that the prediction accuracy for different observations varies, and in some cases, can be very low. To this end, we couple the predictor to a probabilistic classifier which, when trained, can determine the probability that a new observation can accurately predict the state of the target (that is, determine the "relevance" or "reliability" of the observation in question). In the particle filtering framework, we derive a recursive scheme for maintaining an approximation of the posterior probability of the state in which multiple observations can be used and their predictions moderated by their corresponding relevance. In this way, the predictions of the "relevant" observations are emphasized, while the predictions of the "irrelevant" observations are suppressed. We apply the algorithm to the problem of 2D template tracking and demonstrate that the proposed scheme outperforms classical methods for discriminative tracking both in the case of motions which are large in magnitude and also for partial occlusions.


Subjects
Algorithms; Artificial Intelligence; Image Interpretation, Computer-Assisted/methods; Imaging, Three-Dimensional/methods; Pattern Recognition, Automated/methods; Image Enhancement/methods; Reproducibility of Results; Sensitivity and Specificity
14.
IEEE Trans Image Process ; 17(9): 1685-99, 2008 Sep.
Article in English | MEDLINE | ID: mdl-18713674

ABSTRACT

Low-level image analysis systems typically detect "points of interest", i.e., areas of natural images that contain corners or edges. Most of the robust and computationally efficient detectors proposed for this task use the autocorrelation matrix of the localized image derivatives. Although the performance of such detectors and their suitability for particular applications has been studied in relevant literature, their behavior under limited input source (image) precision or limited computational or energy resources is largely unknown. All existing frameworks assume that the input image is readily available for processing and that sufficient computational and energy resources exist for the completion of the result. Nevertheless, recent advances in incremental image sensors or compressed sensing, as well as the demand for low-complexity scene analysis in sensor networks now challenge these assumptions. In this paper, we investigate an approach to compute salient points of images incrementally, i.e., the salient point detector can operate with a coarsely quantized input image representation and successively refine the result (the derived salient points) as the image precision is successively refined by the sensor. This has the advantage that the image sensing and the salient point detection can be terminated at any input image precision (e.g., bound set by the sensory equipment or by computation, or by the salient point accuracy required by the application) and the obtained salient points under this precision are readily available. We focus on the popular detector proposed by Harris and Stephens and demonstrate how such an approach can operate when the image samples are refined in a bitwise manner, i.e., the image bitplanes are received one-by-one from the image sensor. We estimate the required energy for image sensing as well as the computation required for the salient point detection based on stochastic source modeling. 
The computation and energy required by the proposed incremental refinement approach is compared against the conventional salient-point detector realization that operates directly on each source precision and cannot refine the result. Our experiments demonstrate the feasibility of incremental approaches for salient point detection in various classes of natural images. In addition, a first comparison between the results obtained by the intermediate detectors is presented and a novel application for adaptive low-energy image sensing based on points of saliency is presented.
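The setting described here, a Harris and Stephens detector applied to an image whose bitplanes arrive one by one, can be illustrated with a small sketch. It computes the usual response R = det(M) - k·trace(M)² from the local autocorrelation matrix M of the image derivatives, once for each input precision. This is a plain recomputation at every precision, not the incremental refinement scheme the paper proposes; the function names, the 3x3 box window, central-difference gradients, k = 0.04, and the synthetic 8x8 image are all assumptions for illustration.

```python
def keep_top_bitplanes(pixel, bits, depth=8):
    """Zero all but the `bits` most significant bitplanes of an integer pixel."""
    shift = depth - bits
    return (pixel >> shift) << shift


def harris_response(img, k=0.04):
    """Harris-Stephens response map with a 3x3 box window (borders left at 0)."""
    h, w = len(img), len(img[0])
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    # Central-difference derivatives at interior pixels.
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            ix[r][c] = (img[r][c + 1] - img[r][c - 1]) / 2.0
            iy[r][c] = (img[r + 1][c] - img[r - 1][c]) / 2.0
    resp = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            sxx = syy = sxy = 0.0
            # Accumulate the autocorrelation matrix entries over the window.
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    gx, gy = ix[r + dr][c + dc], iy[r + dr][c + dc]
                    sxx += gx * gx
                    syy += gy * gy
                    sxy += gx * gy
            resp[r][c] = sxx * syy - sxy * sxy - k * (sxx + syy) ** 2
    return resp


# Synthetic image: a bright square in the lower-right quadrant, corner at (4, 4).
image = [[200 if r >= 4 and c >= 4 else 17 for c in range(8)] for r in range(8)]

# Response maps for every input precision, from 1 bitplane up to all 8.
responses = {}
for bits in range(1, 9):
    coarse = [[keep_top_bitplanes(p, bits) for p in row] for row in image]
    responses[bits] = harris_response(coarse)
```

In this toy image the response is positive at the corner and negative along the straight edge at every precision, which gives a feel for why a coarse input can already yield usable salient points.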


Subjects
Algorithms; Artificial Intelligence; Image Enhancement/methods; Image Interpretation, Computer-Assisted/methods; Pattern Recognition, Automated/methods; Reproducibility of Results; Sensitivity and Specificity
15.
IEEE Trans Syst Man Cybern B Cybern ; 36(3): 710-9, 2006 Jun.
Article in English | MEDLINE | ID: mdl-16761823

ABSTRACT

This paper addresses the problem of human-action recognition by introducing a sparse representation of image sequences as a collection of spatiotemporal events that are localized at points that are salient both in space and time. The spatiotemporal salient points are detected by measuring the variations in the information content of pixel neighborhoods not only in space but also in time. An appropriate distance metric between two collections of spatiotemporal salient points is introduced, which is based on the chamfer distance and an iterative linear time-warping technique that deals with time expansion or time-compression issues. A classification scheme that is based on relevance vector machines and on the proposed distance measure is proposed. Results on real image sequences from a small database depicting people performing 19 aerobic exercises are presented.
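A chamfer-style distance between two collections of spatiotemporal points, combined with a linear rescaling of the time axis, can be sketched as follows. The symmetric averaging, the coarse grid of candidate time scales, and all names are assumptions for illustration; the paper describes an iterative linear time-warping procedure whose details are not given in the abstract, so the grid search here is a stand-in rather than that procedure.

```python
import math


def directed_chamfer(a, b):
    """Mean distance from each point in `a` to its nearest point in `b`."""
    return sum(min(math.dist(p, q) for q in b) for p in a) / len(a)


def chamfer_distance(a, b):
    """Symmetric chamfer distance (average of the two directed terms)."""
    return 0.5 * (directed_chamfer(a, b) + directed_chamfer(b, a))


def warp_time(points, scale, shift=0.0):
    """Apply a linear time warp t -> scale * t + shift to (x, y, t) points."""
    return [(x, y, scale * t + shift) for x, y, t in points]


def best_time_scale(a, b, scales=(0.5, 1.0, 2.0)):
    """Pick the time scale for `a` that minimizes its chamfer distance to `b`."""
    return min(scales, key=lambda s: chamfer_distance(warp_time(a, s), b))


# Two toy point sets: the same motion, performed at half speed in `slow`.
slow = [(0.0, 0.0, 0.0), (1.0, 1.0, 2.0), (2.0, 2.0, 4.0)]
fast = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (2.0, 2.0, 2.0)]
scale = best_time_scale(slow, fast)
aligned = chamfer_distance(warp_time(slow, scale), fast)
```

A distance of this kind can then feed a kernel-based classifier such as the relevance vector machine mentioned in the abstract.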


Subjects
Artificial Intelligence; Image Interpretation, Computer-Assisted/methods; Models, Biological; Movement; Pattern Recognition, Automated/methods; Task Performance and Analysis; Video Recording/methods; Algorithms; Computer Simulation; Humans; Subtraction Technique; Time Factors
16.
IEEE Trans Syst Man Cybern B Cybern ; 36(2): 433-49, 2006 Apr.
Article in English | MEDLINE | ID: mdl-16602602

ABSTRACT

Automatic analysis of human facial expression is a challenging problem with many applications. Most of the existing automated systems for facial expression analysis attempt to recognize a few prototypic emotional expressions, such as anger and happiness. Instead of representing another approach to machine analysis of prototypic facial expressions of emotion, the method presented in this paper attempts to handle a large range of human facial behavior by recognizing facial muscle actions that produce expressions. Virtually all of the existing vision systems for facial muscle action detection deal only with frontal-view face images and cannot handle temporal dynamics of facial actions. In this paper, we present a system for automatic recognition of facial action units (AUs) and their temporal models from long, profile-view face image sequences. We exploit particle filtering to track 15 facial points in an input face-profile sequence, and we introduce facial-action-dynamics recognition from continuous video input using temporal rules. The algorithm performs both automatic segmentation of an input video into facial expressions pictured and recognition of temporal segments (i.e., onset, apex, offset) of 27 AUs occurring alone or in a combination in the input face-profile video. A recognition rate of 87% is achieved.


Subjects
Artificial Intelligence; Face/anatomy & histology; Face/physiology; Facial Expression; Image Interpretation, Computer-Assisted/methods; Movement/physiology; Pattern Recognition, Automated/methods; Algorithms; Cluster Analysis; Humans; Image Enhancement/methods; Information Storage and Retrieval/methods; Photography/methods; Reproducibility of Results; Sensitivity and Specificity; Subtraction Technique; Time Factors; Video Recording/methods
17.
IEEE Trans Image Process ; 15(1): 1-11, 2006 Jan.
Article in English | MEDLINE | ID: mdl-16435532

ABSTRACT

In this paper, we propose a new scheme that merges color- and shape-invariant information for object recognition. To obtain robustness against photometric changes, color-invariant derivatives are computed first. Color invariance is an important aspect of any object recognition scheme, as color changes considerably with the variation in illumination, object pose, and camera viewpoint. These color invariant derivatives are then used to obtain similarity invariant shape descriptors. Shape invariance is equally important as, under a change in camera viewpoint and object pose, the shape of a rigid object undergoes a perspective projection on the image plane. Then, the color and shape invariants are combined in a multidimensional color-shape context which is subsequently used as an index. As the indexing scheme makes use of a color-shape invariant context, it provides a high-discriminative information cue robust against varying imaging conditions. The matching function of the color-shape context allows for fast recognition, even in the presence of object occlusion and cluttering. From the experimental results, it is shown that the method recognizes rigid objects with high accuracy in 3-D complex scenes and is robust against changing illumination, camera viewpoint, object pose, and noise.


Subjects
Algorithms; Color; Colorimetry/methods; Image Interpretation, Computer-Assisted/methods; Imaging, Three-Dimensional/methods; Lighting; Pattern Recognition, Automated/methods; Artificial Intelligence; Image Enhancement/methods; Information Storage and Retrieval/methods; Subtraction Technique
18.
IEEE Trans Image Process ; 13(11): 1432-43, 2004 Nov.
Article in English | MEDLINE | ID: mdl-15540453

ABSTRACT

This paper presents a method for dense optical flow estimation in which the motion field within patches that result from an initial intensity segmentation is parametrized with models of different order. We propose a novel formulation which introduces regularization constraints between the model parameters of neighboring patches. In this way, we provide the additional constraints for very small patches and for patches whose intensity variation cannot sufficiently constrain the estimation of their motion parameters. In order to preserve motion discontinuities, we use robust functions as a means of regularization. We adopt a three-frame approach and control the balance between the backward and forward constraints by a real-valued direction field on which regularization constraints are applied. An iterative deterministic relaxation method is employed in order to solve the corresponding optimization problem. Experimental results show that the proposed method deals successfully with motions large in magnitude and with motion discontinuities, and produces accurate piecewise-smooth motion fields.


Subjects
Algorithms; Artificial Intelligence; Image Interpretation, Computer-Assisted/methods; Movement/physiology; Pattern Recognition, Automated/methods; Signal Processing, Computer-Assisted; Subtraction Technique; Cluster Analysis; Computer Graphics; Computer Simulation; Humans; Image Enhancement/methods; Information Storage and Retrieval/methods; Models, Biological; Models, Statistical; Motion; Numerical Analysis, Computer-Assisted; Reproducibility of Results; Sensitivity and Specificity; User-Computer Interface; Walking/physiology