1.
Scand J Urol ; 59: 131-136, 2024 Jun 19.
Article in English | MEDLINE | ID: mdl-38896113

ABSTRACT

OBJECTIVE: Disease recurrence, particularly intravesical recurrence (IVR) after radical nephroureterectomy (RNU) for upper tract urothelial carcinoma (UTUC), is common. We investigated whether violations of onco-surgical principles before or during RNU, collectively referred to as surgical violation (SV), were associated with survival outcomes. MATERIALS AND METHODS: Data from a consecutive series of patients who underwent RNU for UTUC between 2001 and 2012 at Skåne University Hospital Lund/Malmö were collected. Preoperative insertion of a nephrostomy tube, opening the urinary tract during surgery, or refraining from excising the distal ureter were considered SVs. Survival outcomes in patients with and without SV (IVR-free [IVRFS], disease-specific [DSS] and overall survival [OS]) were assessed using multivariable Cox regression analyses (adjusted for tumour stage group, prior or concomitant bladder cancer, comorbidity and preoperative urinary cytology). RESULTS: Of 150 patients, 47 (31%) were subjected to at least one SV. Overall, SV was not associated with IVRFS (HR 0.81, 95% CI 0.4-1.6) but was associated with worse DSS (HR 1.9, 95% CI 1.03-3.7) and OS (HR 1.9, 95% CI 1.2-3.0) in multivariable analysis. Additional analyses with a broader definition of SV, which also included preoperative instrumentation of the upper urinary tract (ureteroscopy and/or double-J stenting), showed similar outcomes for DSS (HR 2.1, 95% CI 1.1-4.3). CONCLUSION: Worse survival outcomes, despite no difference in IVR, for patients who were subjected to violations of sound onco-surgical principles before or during RNU for UTUC strengthen the notion that adhering to such principles is a cornerstone of upper tract urothelial cancer surgery.
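To make the statistical model concrete, here is a minimal sketch of a multivariable Cox proportional-hazards fit of the kind described above, using the lifelines library. The data frame, column names, and values are synthetic stand-ins, not the study cohort.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

# Synthetic stand-in cohort: one row per patient, covariates mirroring the
# adjustment set named in the abstract (names are illustrative assumptions).
rng = np.random.default_rng(0)
n = 150
df = pd.DataFrame({
    "surgical_violation": rng.integers(0, 2, n),          # any SV before/during RNU
    "tumour_stage_group": rng.integers(1, 4, n),
    "prior_or_concomitant_bladder_cancer": rng.integers(0, 2, n),
    "comorbidity": rng.integers(0, 2, n),
    "positive_preop_cytology": rng.integers(0, 2, n),
    "os_months": rng.exponential(60, n),                   # follow-up time
    "death": rng.integers(0, 2, n),                        # event indicator
})

cph = CoxPHFitter()
cph.fit(df, duration_col="os_months", event_col="death")
cph.print_summary()  # hazard ratios with 95% confidence intervals
```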


Subject(s)
Carcinoma, Transitional Cell , Kidney Neoplasms , Nephroureterectomy , Ureteral Neoplasms , Humans , Nephroureterectomy/methods , Female , Male , Aged , Ureteral Neoplasms/surgery , Ureteral Neoplasms/mortality , Ureteral Neoplasms/pathology , Carcinoma, Transitional Cell/surgery , Carcinoma, Transitional Cell/mortality , Carcinoma, Transitional Cell/pathology , Kidney Neoplasms/surgery , Kidney Neoplasms/mortality , Kidney Neoplasms/pathology , Middle Aged , Survival Rate , Retrospective Studies , Neoplasm Recurrence, Local/epidemiology , Aged, 80 and over , Ureter/surgery
2.
Article in English | MEDLINE | ID: mdl-38923485

ABSTRACT

Recent advances in the understanding of Generative Adversarial Networks (GANs) have led to remarkable progress in visual editing and synthesis tasks, capitalizing on the rich semantics that are embedded in the latent spaces of pre-trained GANs. However, existing methods are often tailored to specific GAN architectures and are limited to either discovering global semantic directions that do not facilitate localized control, or require some form of supervision through manually provided regions or segmentation masks. In this light, we present an architecture-agnostic approach that jointly discovers factors representing spatial parts and their appearances in an entirely unsupervised fashion. These factors are obtained by applying a semi-nonnegative tensor factorization on the feature maps, which in turn enables context-aware local image editing with pixel-level control. In addition, we show that the discovered appearance factors correspond to saliency maps that localize concepts of interest, without using any labels. Experiments on a wide range of GAN architectures and datasets show that, in comparison to the state of the art, our method is far more efficient in terms of training time and, most importantly, provides much more accurate localized control.
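As a rough illustration of the factorization step, the sketch below applies a plain semi-nonnegative matrix factorization (Ding-style multiplicative updates) to one flattened feature map. It is a simplification of the tensor method described above, with random data standing in for GAN activations; the layer size and rank are arbitrary assumptions.

```python
import numpy as np

def semi_nmf(X, k, n_iter=200, eps=1e-9):
    """Factor X (C x HW, may be negative) as F @ G.T with G >= 0."""
    pos = lambda A: (np.abs(A) + A) / 2
    neg = lambda A: (np.abs(A) - A) / 2
    d, n = X.shape
    rng = np.random.default_rng(0)
    G = np.abs(rng.standard_normal((n, k)))
    for _ in range(n_iter):
        F = X @ G @ np.linalg.pinv(G.T @ G)               # unconstrained factor
        XtF, FtF = X.T @ F, F.T @ F
        G *= np.sqrt((pos(XtF) + G @ neg(FtF) + eps) /
                     (neg(XtF) + G @ pos(FtF) + eps))      # keeps G nonnegative
    return F, G

# Hypothetical activations of one GAN layer, shape (channels, height, width).
feature_maps = np.random.randn(64, 16, 16)
C, H, W = feature_maps.shape
F, G = semi_nmf(feature_maps.reshape(C, H * W), k=6)
parts = G.T.reshape(6, H, W)   # spatial "part" factors usable as soft masks
```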

3.
Circ Arrhythm Electrophysiol ; 13(4): e007614, 2020 04.
Article in English | MEDLINE | ID: mdl-32189516

ABSTRACT

BACKGROUND: Heart rate variability (HRV) and pulse rate variability are indices of autonomic cardiac modulation. Increased pericardial fat is associated with worse cardiovascular outcomes. We hypothesized that progressive increases in pericardial fat volume and inflammation prospectively dampen HRV in hypercholesterolemic pigs. METHODS: WT (wild type) or PCSK9 (proprotein convertase subtilisin-like/kexin type-9) gain-of-function Ossabaw mini-pigs were studied in vivo before and after 3 and 6 months of a normal diet (WT-normal diet, n=4; PCSK9-normal diet, n=6) or high-fat diet (HFD; WT-HFD, n=3; PCSK9-HFD, n=6). The arterial pulse waveform was obtained from an arterial telemetry transmitter to analyze HRV indices, including SD (SD of all pulse-to-pulse intervals over a single 5-minute period), root mean square of successive differences, proportion of successive normal-to-normal R-R interval differences >50 ms, and the calculated ratio of low-to-high frequency distributions (low-frequency power/high-frequency power). Pericardial fat volumes were evaluated using multidetector computed tomography, and pericardial fat inflammation by gene expression of TNF (tumor necrosis factor)-α. Plasma lipid panel and norepinephrine levels were also measured. RESULTS: At diet completion, hypercholesterolemic PCSK9-HFD had significantly (P<0.05 versus baseline) depressed HRV (SD of all pulse-to-pulse intervals over a single 5-minute period, root mean square of successive differences, proportion >50 ms, high-frequency power, low-frequency power), and both HFD groups had higher sympathovagal balance (SD of all pulse-to-pulse intervals over a single 5-minute period/root mean square of successive differences, low-frequency power/high-frequency power) compared with normal diet. Pericardial fat volumes and LDL (low-density lipoprotein) cholesterol concentrations correlated inversely with HRV and directly with sympathovagal balance, while sympathovagal balance correlated directly with plasma norepinephrine. Pericardial fat TNF-α expression was upregulated in PCSK9-HFD, colocalized with nerve fibers, and correlated inversely with root mean square of successive differences and proportion >50 ms. CONCLUSIONS: Progressive pericardial fat expansion and inflammation are associated with a fall in HRV in Ossabaw mini-pigs, implying aggravated autonomic imbalance. Hence, pericardial fat accumulation is associated with alterations in HRV and the autonomic nervous system.
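The time-domain HRV indices named above have standard definitions; the following sketch computes them from a series of pulse-to-pulse intervals. The simulated intervals are illustrative only, and the frequency-domain indices (low-frequency/high-frequency power) are omitted.

```python
import numpy as np

def hrv_time_domain(pp_intervals_ms):
    """Time-domain HRV indices from pulse-to-pulse intervals in milliseconds."""
    pp = np.asarray(pp_intervals_ms, dtype=float)
    diffs = np.diff(pp)
    sdnn = pp.std(ddof=1)                          # SD of all intervals (5-min window)
    rmssd = np.sqrt(np.mean(diffs ** 2))           # root mean square of successive differences
    pnn50 = 100.0 * np.mean(np.abs(diffs) > 50.0)  # % of successive differences > 50 ms
    return sdnn, rmssd, pnn50

# Example: a simulated 5-minute recording with ~800 ms mean interval.
rng = np.random.default_rng(1)
intervals = 800 + 40 * rng.standard_normal(375)
print(hrv_time_domain(intervals))
```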


Subject(s)
Adipose Tissue/physiopathology , Adiposity , Arrhythmias, Cardiac/etiology , Autonomic Nervous System/physiopathology , Heart Rate , Hypercholesterolemia/complications , Inflammation/etiology , Pericardium/physiopathology , Adipose Tissue/metabolism , Animals , Animals, Genetically Modified , Arrhythmias, Cardiac/metabolism , Arrhythmias, Cardiac/physiopathology , Autonomic Nervous System/metabolism , Cholesterol/blood , Disease Models, Animal , Hypercholesterolemia/metabolism , Hypercholesterolemia/physiopathology , Inflammation/metabolism , Inflammation/physiopathology , Inflammation Mediators/metabolism , Male , Norepinephrine/blood , Pericardium/metabolism , Swine , Swine, Miniature/genetics , Time Factors , Tumor Necrosis Factor-alpha/metabolism
4.
IEEE Trans Pattern Anal Mach Intell ; 40(12): 2948-2962, 2018 12.
Article in English | MEDLINE | ID: mdl-29990153

ABSTRACT

In this paper, we propose a maximum margin classifier that deals with uncertainty in the input data. More specifically, we reformulate the SVM framework such that each training example can be modeled by a multi-dimensional Gaussian distribution described by its mean vector and its covariance matrix, with the latter modeling the uncertainty. We address the classification problem and define a cost function that is the expected value of the classical SVM cost when data samples are drawn from the multi-dimensional Gaussian distributions that form the set of training examples. Our formulation approximates the classical SVM formulation when the training examples are isotropic Gaussians with variance tending to zero. We arrive at a convex optimization problem, which we solve efficiently in the primal form using a stochastic gradient descent approach. The resulting classifier, which we name SVM with Gaussian Sample Uncertainty (SVM-GSU), is tested on synthetic data and five publicly available and popular datasets; namely, the MNIST, WDBC, DEAP, TV News Channel Commercial Detection, and TRECVID MED datasets. Experimental results verify the effectiveness of the proposed method.
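For orientation, the sketch below shows primal stochastic subgradient descent for the classical linear SVM, i.e., the limiting case the abstract mentions when every training Gaussian shrinks to an isotropic point. The expectation over Gaussian sample uncertainty that defines SVM-GSU itself is not implemented here, and the Pegasos-style step size is an illustrative choice.

```python
import numpy as np

def svm_sgd(X, y, lam=1e-2, epochs=20):
    """X: (n, d) features; y: labels in {-1, +1}; returns (w, b)."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    t = 0
    for _ in range(epochs):
        for i in np.random.permutation(n):
            t += 1
            eta = 1.0 / (lam * t)                  # decreasing step size
            margin = y[i] * (X[i] @ w + b)
            if margin < 1:                         # hinge loss is active
                w = (1 - eta * lam) * w + eta * y[i] * X[i]
                b += eta * y[i]
            else:                                  # only the regularizer acts
                w = (1 - eta * lam) * w
    return w, b
```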

5.
IEEE Trans Image Process ; 24(8): 2393-403, 2015 Aug.
Article in English | MEDLINE | ID: mdl-25872211

ABSTRACT

Face alignment has been well studied in recent years; however, when a face alignment model is applied to facial images with heavy partial occlusion, the performance deteriorates significantly. In this paper, instead of training an occlusion-aware model with visibility annotation, we address this issue via a model adaptation scheme that uses the result of a local regression forest (RF) voting method. In the proposed scheme, the consistency of the votes of the local RF in each of several oversegmented regions is used to determine the reliability of predicting the location of the facial landmarks. The latter is what we call regional predictive power (RPP). Subsequently, we adapt a holistic voting method (cascaded pose regression based on random ferns) by putting weights on the votes of each fern according to the RPP of the regions used in the fern tests. The proposed method shows superior performance over existing face alignment models on the most challenging data sets (COFW and 300-W). Moreover, it can also estimate with high accuracy (72.4% overlap ratio) which image areas belong to the face or to nonface objects, on the heavily occluded images of the COFW data set, without explicit occlusion modeling.
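A toy version of the reweighting idea: each vote for a landmark is weighted by the reliability (RPP) of the region it comes from before accumulation. The inputs and the simple weighted average below are illustrative assumptions, not the cascaded-pose-regression adaptation used in the paper.

```python
import numpy as np

def weighted_landmark_estimate(votes, region_of_vote, rpp):
    """votes: (n, 2) predicted landmark positions;
    region_of_vote: (n,) region index of each vote;
    rpp: (n_regions,) reliability score per region in [0, 1]."""
    w = rpp[region_of_vote]                       # weight each vote by its region's RPP
    return (votes * w[:, None]).sum(axis=0) / w.sum()
```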


Subject(s)
Biometric Identification/methods , Face/anatomy & histology , Algorithms , Databases, Factual , Decision Trees , Humans , Models, Statistical
6.
IEEE Trans Image Process ; 24(2): 619-31, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25532183

ABSTRACT

In this paper, we propose an object alignment method that detects the landmarks of an object in 2D images. In the regression forests (RFs) framework, observations (patches) that are extracted at several image locations cast votes for the localization of several landmarks. We propose to refine the votes before accumulating them into the Hough space, by sieving and/or aggregating. In order to filter out false-positive votes, we pass them through several sieves, each associated with a discrete or continuous latent variable. The sieves filter out votes that are not consistent with the latent variable in question, something that implicitly enforces global constraints. In order to aggregate the votes when necessary, we adjust a proximity threshold on the fly by applying a classifier to middle-level features extracted from voting maps for the object landmark in question. Moreover, our method is able to predict the unreliability of an individual object landmark. This information can be useful for subsequent object analysis like object recognition. Our contributions are validated for two object alignment tasks, face alignment and car alignment, on data sets with challenging images collected in the wild, i.e., the Labeled Face in the Wild, the Annotated Facial Landmarks in the Wild, and the street scene car data set. We show that with the proposed approach, and without explicitly introducing shape models, we obtain performance superior or close to the state of the art for both tasks.

7.
IEEE Trans Pattern Anal Mach Intell ; 35(6): 1357-69, 2013 Jun.
Article in English | MEDLINE | ID: mdl-23599052

ABSTRACT

We propose a method for head-pose invariant facial expression recognition that is based on a set of characteristic facial points. To achieve head-pose invariance, we propose the Coupled Scaled Gaussian Process Regression (CSGPR) model for head-pose normalization. In this model, we first learn independently the mappings between the facial points in each pair of (discrete) nonfrontal poses and the frontal pose, and then perform their coupling in order to capture dependences between them. During inference, the outputs of the coupled functions from different poses are combined using a gating function, devised based on the head-pose estimation for the query points. The proposed model outperforms state-of-the-art regression-based approaches to head-pose normalization, 2D and 3D Point Distribution Models (PDMs), and Active Appearance Models (AAMs), especially in cases of unknown poses and imbalanced training data. To the best of our knowledge, the proposed method is the first one that is able to deal with expressive faces in the range from -45° to +45° pan rotation and -30° to +30° tilt rotation, and with continuous changes in head pose, despite the fact that training was conducted on a small set of discrete poses. We evaluate the proposed method on synthetic and real images depicting acted and spontaneously displayed facial expressions.


Subject(s)
Facies , Normal Distribution , Algorithms , Biometry/methods , Head/anatomy & histology , Humans , Image Enhancement , Image Interpretation, Computer-Assisted/methods , Pattern Recognition, Automated/methods , Regression Analysis
8.
IEEE Trans Image Process ; 21(2): 816-27, 2012 Feb.
Article in English | MEDLINE | ID: mdl-21859620

ABSTRACT

In this paper, we exploit the advantages of tensorial representations and propose several tensor learning models for regression. The models are based on the canonical/parallel-factor decomposition of tensors of multiple modes and allow the simultaneous projection of an input tensor onto more than one direction along each mode. Two empirical risk functions are studied, namely, the square loss and ε-insensitive loss functions. The former leads to higher rank tensor ridge regression (TRR), and the latter leads to higher rank support tensor regression (STR), both formulated using the Frobenius norm for regularization. We also use the group-sparsity norm for regularization, thereby favoring a low-rank decomposition of the tensorial weight. In this way, we achieve the automatic selection of the rank during the learning process and obtain the optimal-rank TRR and STR. Experiments conducted on the problems of head-pose, human-age, and 3-D body-pose estimation, using real data from publicly available databases, verified not only the superiority of tensors over their vector counterparts but also the efficiency of the proposed algorithms.
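The prediction side of a rank-R, CP-decomposed (canonical/parallel-factor) weight tensor can be written compactly, as sketched below. Fitting the factors with ridge or ε-insensitive losses, which is the paper's contribution, is omitted, and the dimensions are arbitrary assumptions.

```python
import numpy as np

def cp_predict(X, factors, bias=0.0):
    """Inner product of input tensor X with a rank-R CP weight tensor.
    X: tensor with X.ndim == len(factors); factors[m]: array of shape (dim_m, R)."""
    R = factors[0].shape[1]
    y = bias
    for r in range(R):
        comp = X
        for U in factors:
            # Contract the leading mode of the running tensor with the r-th column.
            comp = np.tensordot(U[:, r], comp, axes=([0], [0]))
        y += comp                 # scalar once every mode has been contracted
    return y

# Example: a 3-mode input and random rank-2 factors (illustrative only).
X = np.random.randn(4, 5, 6)
factors = [np.random.randn(4, 2), np.random.randn(5, 2), np.random.randn(6, 2)]
print(cp_predict(X, factors))
```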


Subject(s)
Image Processing, Computer-Assisted/methods , Posture/physiology , Regression Analysis , Support Vector Machine , Databases, Factual , Humans , Video Recording
9.
IEEE Trans Neural Netw Learn Syst ; 23(1): 127-37, 2012 Jan.
Article in English | MEDLINE | ID: mdl-24808462

ABSTRACT

One of the most informative measures for feature extraction (FE) is mutual information (MI). In terms of MI, the optimal FE creates new features that jointly have the largest dependency on the target class. However, obtaining an accurate estimate of a high-dimensional MI as well as optimizing with respect to it is not always easy, especially when only small training sets are available. In this paper, we propose an efficient tree-based method for FE in which at each step a new feature is created by selecting and linearly combining two features such that the MI between the new feature and the class is maximized. Both the selection of the features to be combined and the estimation of the coefficients of the linear transform rely on estimating 2-D MIs. The estimation of the latter is computationally very efficient and robust. The effectiveness of our method is evaluated on several real-world data sets. The results show that the classification accuracy obtained by the proposed method is higher than that achieved by other FE methods.
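A hedged sketch of the core 2-D operation described above: estimate the mutual information between a (discretized) candidate feature and the class labels, and linearly combine two features using the coefficient that maximizes that estimate. The histogram binning and the coarse grid search over coefficients are illustrative choices, not the authors' estimator.

```python
import numpy as np
from sklearn.metrics import mutual_info_score

def mi_with_class(feature, labels, bins=16):
    """Histogram-based estimate of MI between a scalar feature and class labels."""
    binned = np.digitize(feature, np.histogram_bin_edges(feature, bins))
    return mutual_info_score(labels, binned)

def combine_pair(f1, f2, labels, alphas=np.linspace(-2, 2, 41)):
    """Return alpha*f1 + f2 for the alpha that maximizes MI with the class."""
    scores = [mi_with_class(a * f1 + f2, labels) for a in alphas]
    best = alphas[int(np.argmax(scores))]
    return best * f1 + f2, best

# Tiny synthetic usage example.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 500)
f1, f2 = rng.standard_normal(500) + y, rng.standard_normal(500)
new_feature, alpha = combine_pair(f1, f2, y)
```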

10.
IEEE Trans Syst Man Cybern B Cybern ; 41(5): 1366-81, 2011 Oct.
Article in English | MEDLINE | ID: mdl-21642042

ABSTRACT

Computer vision techniques have made considerable progress in recognizing object categories by learning models that normally rely on a set of discriminative features. However, in contrast to human perception, which makes extensive use of logic-based rules, these models fail to benefit from knowledge that is explicitly provided. In this paper, we propose a framework that can perform knowledge-assisted analysis of visual content. We use ontologies to model the domain knowledge and a set of conditional probabilities to model the application context. Then, a Bayesian network (BN) is used for integrating statistical and explicit knowledge and performing hypothesis testing using evidence-driven probabilistic inference. In addition, we propose the use of a focus-of-attention (FoA) mechanism that is based on the mutual information between concepts. This mechanism selects the most prominent hypotheses to be verified/tested by the BN, hence removing the need to exhaustively test all possible combinations of the hypothesis set. We experimentally evaluate our framework using content from three domains and for the following three tasks: 1) image categorization; 2) localized region labeling; and 3) weak annotation of video shot keyframes. The results obtained demonstrate the improvement in performance compared to a set of baseline concept classifiers that are not aware of any context or domain knowledge. Finally, we also demonstrate the ability of the proposed FoA mechanism to significantly reduce the computational cost of visual inference while obtaining results comparable to the exhaustive case.


Subject(s)
Bayes Theorem , Cybernetics , Image Processing, Computer-Assisted/methods , Neural Networks, Computer , Algorithms , Attention , Probability , Visual Perception
11.
IEEE Trans Image Process ; 20(4): 1126-40, 2011 Apr.
Article in English | MEDLINE | ID: mdl-20851793

ABSTRACT

In this paper, we address the problem of localization and recognition of human activities in unsegmented image sequences. The main contribution of the proposed method is the use of an implicit representation of the spatiotemporal shape of the activity, which relies on the spatiotemporal localization of characteristic ensembles of feature descriptors. Evidence for the spatiotemporal localization of the activity is accumulated in a probabilistic spatiotemporal voting scheme. The local nature of the proposed voting framework allows us to deal with multiple activities taking place in the same scene, as well as with activities in the presence of clutter and occlusion. We use boosting in order to select characteristic ensembles per class. This leads to a set of class-specific codebooks where each codeword is an ensemble of features. During training, we store the spatial positions of the codeword ensembles with respect to a set of reference points, as well as their temporal positions with respect to the start and end of the action instance. During testing, each activated codeword ensemble casts votes concerning the spatiotemporal position and extent of the action, using the information that was stored during training. Mean Shift mode estimation in the voting space provides the most probable hypotheses concerning the localization of the subjects at each frame, as well as the extent of the activities depicted in the image sequences. We present classification and localization results for a number of publicly available datasets, and for a number of sequences where there is a significant amount of clutter and occlusion.
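Mean Shift mode estimation over a cloud of votes, the step used above to extract the most probable localization hypotheses, can be sketched as follows; the Gaussian kernel, bandwidth, and stopping rule are assumptions for illustration.

```python
import numpy as np

def mean_shift_mode(votes, start, bandwidth=10.0, n_iter=50):
    """votes: (n, d) vote positions; returns a local density mode near `start`."""
    x = np.asarray(start, dtype=float)
    for _ in range(n_iter):
        w = np.exp(-np.sum((votes - x) ** 2, axis=1) / (2 * bandwidth ** 2))
        x_new = (w[:, None] * votes).sum(axis=0) / w.sum()   # kernel-weighted mean
        if np.linalg.norm(x_new - x) < 1e-3:                 # converged to a mode
            break
        x = x_new
    return x

# Synthetic (x, y, t) votes clustered around one spatiotemporal position.
votes = np.random.default_rng(0).normal([50.0, 50.0, 10.0], 5.0, size=(200, 3))
print(mean_shift_mode(votes, start=votes.mean(axis=0)))
```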


Subject(s)
Algorithms , Image Interpretation, Computer-Assisted/methods , Movement/physiology , Pattern Recognition, Automated/methods , Photography/methods , Subtraction Technique , Video Recording/methods , Artificial Intelligence , Humans , Image Enhancement/methods , Reproducibility of Results , Sensitivity and Specificity
12.
IEEE Trans Pattern Anal Mach Intell ; 32(11): 1940-54, 2010 Nov.
Article in English | MEDLINE | ID: mdl-20847386

ABSTRACT

In this work, we propose a dynamic texture-based approach to the recognition of facial Action Units (AUs, atomic facial gestures) and their temporal models (i.e., sequences of temporal segments: neutral, onset, apex, and offset) in near-frontal-view face videos. Two approaches to modeling the dynamics and the appearance in the face region of an input video are compared: an extended version of Motion History Images and a novel method based on Nonrigid Registration using Free-Form Deformations (FFDs). The extracted motion representation is used to derive motion orientation histogram descriptors in both the spatial and temporal domain. Per AU, a combination of discriminative, frame-based GentleBoost ensemble learners and dynamic, generative Hidden Markov Models detects the presence of the AU in question and its temporal segments in an input image sequence. When tested for recognition of all 27 lower and upper face AUs, occurring alone or in combination in 264 sequences from the MMI facial expression database, the proposed method achieved an average event recognition accuracy of 89.2 percent for the MHI method and 94.3 percent for the FFD method. The generalization performance of the FFD method has been tested using the Cohn-Kanade database. Finally, we also explored the performance on spontaneous expressions in the Sensitive Artificial Listener data set.
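For reference, a basic Motion History Image update is sketched below, in the spirit of the (extended) MHI representation compared above; the frame-differencing threshold and decay constant are illustrative values.

```python
import numpy as np

def update_mhi(mhi, prev_frame, frame, tau=30, thresh=25):
    """Decay the history and stamp the newest motion at full intensity tau."""
    motion = np.abs(frame.astype(int) - prev_frame.astype(int)) > thresh
    mhi = np.maximum(mhi - 1, 0)   # older motion fades linearly
    mhi[motion] = tau              # newest motion gets the maximum value
    return mhi

# Usage on a short synthetic clip.
frames = (np.random.rand(5, 120, 160) * 255).astype(np.uint8)
mhi = np.zeros(frames.shape[1:], dtype=int)
for prev, cur in zip(frames, frames[1:]):
    mhi = update_mhi(mhi, prev, cur)
```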


Subject(s)
Facial Expression , Pattern Recognition, Automated/methods , Video Recording/methods , Algorithms , Computing Methodologies , Face , Gestures , Humans , Information Storage and Retrieval/methods , Models, Biological , Recognition, Psychology , Time Factors
13.
IEEE Trans Pattern Anal Mach Intell ; 32(9): 1553-67, 2010 Sep.
Article in English | MEDLINE | ID: mdl-20634552

ABSTRACT

This paper addresses the problem of robust template tracking in image sequences. Our work falls within the discriminative framework in which the observations at each frame yield direct probabilistic predictions of the state of the target. Our primary contribution is that we explicitly address the problem that the prediction accuracy for different observations varies, and in some cases, can be very low. To this end, we couple the predictor to a probabilistic classifier which, when trained, can determine the probability that a new observation can accurately predict the state of the target (that is, determine the "relevance" or "reliability" of the observation in question). In the particle filtering framework, we derive a recursive scheme for maintaining an approximation of the posterior probability of the state in which multiple observations can be used and their predictions moderated by their corresponding relevance. In this way, the predictions of the "relevant" observations are emphasized, while the predictions of the "irrelevant" observations are suppressed. We apply the algorithm to the problem of 2D template tracking and demonstrate that the proposed scheme outperforms classical methods for discriminative tracking both in the case of motions which are large in magnitude and also for partial occlusions.
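A schematic of the moderation idea described above: in the particle-filter weight update, each observation's likelihood contribution is tempered by its estimated relevance, so irrelevant observations contribute a near-uniform factor. The mixture form used here is a stand-in assumption, not the paper's exact derivation, and the likelihood model is left to the caller.

```python
import numpy as np

def pf_update(particles, weights, observations, likelihood_fn, relevance):
    """particles: (n, d) states; weights: (n,) importance weights;
    observations: list of measurements; relevance[i] in [0, 1] per observation."""
    weights = np.asarray(weights, dtype=float).copy()
    for obs, r in zip(observations, relevance):
        lik = likelihood_fn(particles, obs)        # p(obs | state) per particle
        weights *= r * lik + (1.0 - r)             # irrelevant obs -> flat contribution
    return weights / weights.sum()
```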


Subject(s)
Algorithms , Artificial Intelligence , Image Interpretation, Computer-Assisted/methods , Imaging, Three-Dimensional/methods , Pattern Recognition, Automated/methods , Image Enhancement/methods , Reproducibility of Results , Sensitivity and Specificity
14.
IEEE Trans Image Process ; 17(9): 1685-99, 2008 Sep.
Article in English | MEDLINE | ID: mdl-18713674

ABSTRACT

Low-level image analysis systems typically detect "points of interest", i.e., areas of natural images that contain corners or edges. Most of the robust and computationally efficient detectors proposed for this task use the autocorrelation matrix of the localized image derivatives. Although the performance of such detectors and their suitability for particular applications has been studied in the relevant literature, their behavior under limited input source (image) precision or limited computational or energy resources is largely unknown. All existing frameworks assume that the input image is readily available for processing and that sufficient computational and energy resources exist for the completion of the result. Nevertheless, recent advances in incremental image sensors or compressed sensing, as well as the demand for low-complexity scene analysis in sensor networks, now challenge these assumptions. In this paper, we investigate an approach to compute salient points of images incrementally, i.e., the salient point detector can operate with a coarsely quantized input image representation and successively refine the result (the derived salient points) as the image precision is successively refined by the sensor. This has the advantage that the image sensing and the salient point detection can be terminated at any input image precision (e.g., a bound set by the sensory equipment or by computation, or by the salient point accuracy required by the application) and the obtained salient points under this precision are readily available. We focus on the popular detector proposed by Harris and Stephens and demonstrate how such an approach can operate when the image samples are refined in a bitwise manner, i.e., the image bitplanes are received one-by-one from the image sensor. We estimate the required energy for image sensing as well as the computation required for the salient point detection based on stochastic source modeling. The computation and energy required by the proposed incremental refinement approach are compared against the conventional salient-point detector realization that operates directly on each source precision and cannot refine the result. Our experiments demonstrate the feasibility of incremental approaches for salient point detection in various classes of natural images. In addition, a first comparison between the results obtained by the intermediate detectors is presented, along with a novel application for adaptive low-energy image sensing based on points of saliency.
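The quantity being refined is the Harris-Stephens corner response, computed from the autocorrelation (structure) matrix of the image derivatives; a plain full-precision version is sketched below, without the bitplane-incremental machinery. The smoothing scale, response constant k, and threshold are conventional illustrative values.

```python
import numpy as np
from scipy.ndimage import sobel, gaussian_filter

def harris_response(img, sigma=1.5, k=0.04):
    """Harris-Stephens corner response for a grayscale image."""
    ix = sobel(img.astype(float), axis=1)
    iy = sobel(img.astype(float), axis=0)
    # Elements of the smoothed autocorrelation matrix at each pixel.
    ixx = gaussian_filter(ix * ix, sigma)
    iyy = gaussian_filter(iy * iy, sigma)
    ixy = gaussian_filter(ix * iy, sigma)
    det = ixx * iyy - ixy ** 2
    trace = ixx + iyy
    return det - k * trace ** 2        # large positive values mark salient corners

# Usage: threshold the response to obtain candidate salient points.
img = np.random.rand(128, 128)
R = harris_response(img)
corners = np.argwhere(R > 0.01 * R.max())
```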


Subject(s)
Algorithms , Artificial Intelligence , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Pattern Recognition, Automated/methods , Reproducibility of Results , Sensitivity and Specificity
15.
IEEE Trans Syst Man Cybern B Cybern ; 36(3): 710-9, 2006 Jun.
Article in English | MEDLINE | ID: mdl-16761823

ABSTRACT

This paper addresses the problem of human-action recognition by introducing a sparse representation of image sequences as a collection of spatiotemporal events that are localized at points that are salient both in space and time. The spatiotemporal salient points are detected by measuring the variations in the information content of pixel neighborhoods not only in space but also in time. An appropriate distance metric between two collections of spatiotemporal salient points is introduced, which is based on the chamfer distance and an iterative linear time-warping technique that deals with time expansion or time-compression issues. A classification scheme based on relevance vector machines and on the proposed distance measure is then introduced. Results on real image sequences from a small database depicting people performing 19 aerobic exercises are presented.
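The symmetric chamfer distance between two collections of spatiotemporal salient points, the core of the matching scheme above, can be sketched as follows; brute-force nearest neighbours are used purely for clarity, and the iterative time-warping step is omitted.

```python
import numpy as np

def chamfer_distance(A, B):
    """Symmetric chamfer distance between point sets A (n, 3) and B (m, 3),
    where each row is an (x, y, t) spatiotemporal salient point."""
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)  # pairwise distances
    return d.min(axis=1).mean() + d.min(axis=0).mean()

# Example with two small random point sets.
rng = np.random.default_rng(0)
print(chamfer_distance(rng.random((40, 3)), rng.random((55, 3))))
```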


Subject(s)
Artificial Intelligence , Image Interpretation, Computer-Assisted/methods , Models, Biological , Movement , Pattern Recognition, Automated/methods , Task Performance and Analysis , Video Recording/methods , Algorithms , Computer Simulation , Humans , Subtraction Technique , Time Factors
16.
IEEE Trans Syst Man Cybern B Cybern ; 36(2): 433-49, 2006 Apr.
Article in English | MEDLINE | ID: mdl-16602602

ABSTRACT

Automatic analysis of human facial expression is a challenging problem with many applications. Most of the existing automated systems for facial expression analysis attempt to recognize a few prototypic emotional expressions, such as anger and happiness. Instead of representing another approach to machine analysis of prototypic facial expressions of emotion, the method presented in this paper attempts to handle a large range of human facial behavior by recognizing facial muscle actions that produce expressions. Virtually all of the existing vision systems for facial muscle action detection deal only with frontal-view face images and cannot handle temporal dynamics of facial actions. In this paper, we present a system for automatic recognition of facial action units (AUs) and their temporal models from long, profile-view face image sequences. We exploit particle filtering to track 15 facial points in an input face-profile sequence, and we introduce facial-action-dynamics recognition from continuous video input using temporal rules. The algorithm performs both automatic segmentation of an input video into facial expressions pictured and recognition of temporal segments (i.e., onset, apex, offset) of 27 AUs occurring alone or in a combination in the input face-profile video. A recognition rate of 87% is achieved.


Subject(s)
Artificial Intelligence , Face/anatomy & histology , Face/physiology , Facial Expression , Image Interpretation, Computer-Assisted/methods , Movement/physiology , Pattern Recognition, Automated/methods , Algorithms , Cluster Analysis , Humans , Image Enhancement/methods , Information Storage and Retrieval/methods , Photography/methods , Reproducibility of Results , Sensitivity and Specificity , Subtraction Technique , Time Factors , Video Recording/methods
17.
IEEE Trans Image Process ; 15(1): 1-11, 2006 Jan.
Article in English | MEDLINE | ID: mdl-16435532

ABSTRACT

In this paper, we propose a new scheme that merges color- and shape-invariant information for object recognition. To obtain robustness against photometric changes, color-invariant derivatives are computed first. Color invariance is an important aspect of any object recognition scheme, as color changes considerably with the variation in illumination, object pose, and camera viewpoint. These color invariant derivatives are then used to obtain similarity invariant shape descriptors. Shape invariance is equally important as, under a change in camera viewpoint and object pose, the shape of a rigid object undergoes a perspective projection on the image plane. Then, the color and shape invariants are combined in a multidimensional color-shape context which is subsequently used as an index. As the indexing scheme makes use of a color-shape invariant context, it provides a high-discriminative information cue robust against varying imaging conditions. The matching function of the color-shape context allows for fast recognition, even in the presence of object occlusion and cluttering. From the experimental results, it is shown that the method recognizes rigid objects with high accuracy in 3-D complex scenes and is robust against changing illumination, camera viewpoint, object pose, and noise.


Subject(s)
Algorithms , Color , Colorimetry/methods , Image Interpretation, Computer-Assisted/methods , Imaging, Three-Dimensional/methods , Lighting , Pattern Recognition, Automated/methods , Artificial Intelligence , Image Enhancement/methods , Information Storage and Retrieval/methods , Subtraction Technique
18.
IEEE Trans Image Process ; 13(11): 1432-43, 2004 Nov.
Article in English | MEDLINE | ID: mdl-15540453

ABSTRACT

This paper presents a method for dense optical flow estimation in which the motion field within patches that result from an initial intensity segmentation is parametrized with models of different order. We propose a novel formulation that introduces regularization constraints between the model parameters of neighboring patches. In this way, we provide the additional constraints for very small patches and for patches whose intensity variation cannot sufficiently constrain the estimation of their motion parameters. In order to preserve motion discontinuities, we use robust functions for regularization. We adopt a three-frame approach and control the balance between the backward and forward constraints by a real-valued direction field on which regularization constraints are applied. An iterative deterministic relaxation method is employed in order to solve the corresponding optimization problem. Experimental results show that the proposed method deals successfully with motions large in magnitude and with motion discontinuities, and produces accurate piecewise-smooth motion fields.
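An illustrative robust penalty on the difference between the motion parameters of neighbouring patches, the kind of inter-patch regularization constraint the formulation introduces; the Charbonnier function, the patch adjacency list, and the parameterization are assumptions made for the sketch, not the paper's exact energy.

```python
import numpy as np

def charbonnier(x, eps=1e-3):
    """Smooth, robust approximation of |x| that preserves discontinuities."""
    return np.sqrt(x ** 2 + eps ** 2)

def regularization_energy(params, neighbours):
    """params: (n_patches, p) motion-model parameters per patch;
    neighbours: iterable of (i, j) index pairs of adjacent patches."""
    return sum(charbonnier(params[i] - params[j]).sum() for i, j in neighbours)
```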


Subject(s)
Algorithms , Artificial Intelligence , Image Interpretation, Computer-Assisted/methods , Movement/physiology , Pattern Recognition, Automated/methods , Signal Processing, Computer-Assisted , Subtraction Technique , Cluster Analysis , Computer Graphics , Computer Simulation , Humans , Image Enhancement/methods , Information Storage and Retrieval/methods , Models, Biological , Models, Statistical , Motion , Numerical Analysis, Computer-Assisted , Reproducibility of Results , Sensitivity and Specificity , User-Computer Interface , Walking/physiology