1.
IEEE Trans Pattern Anal Mach Intell ; 45(2): 2505-2518, 2023 Feb.
Article in English | MEDLINE | ID: mdl-35358043

ABSTRACT

Deep metric learning has yielded impressive results in tasks such as clustering and image retrieval by leveraging neural networks to obtain highly discriminative feature embeddings, which can be used to group samples into different classes. Much research has been devoted to the design of smart loss functions or data mining strategies for training such networks. Most methods consider only pairs or triplets of samples within a mini-batch to compute the loss function, which is commonly based on the distance between embeddings. We propose Group Loss, a loss function based on a differentiable label-propagation method that enforces embedding similarity across all samples of a group while promoting, at the same time, low-density regions amongst data points belonging to different groups. Guided by the smoothness assumption that "similar objects should belong to the same group", the proposed loss trains the neural network for a classification task, enforcing a consistent labelling amongst samples within a class. We design a set of inference strategies tailored to our algorithm, named Group Loss++, that further improve the results of our model. We show state-of-the-art results on clustering and image retrieval on four retrieval datasets, and present competitive results on two person re-identification datasets, providing a unified framework for retrieval and re-identification.

2.
IEEE Trans Pattern Anal Mach Intell ; 44(11): 7778-7796, 2022 Nov.
Article in English | MEDLINE | ID: mdl-34613910

ABSTRACT

In the past decade, object detection has achieved significant progress in natural images but not in aerial images, due to the massive variations in the scale and orientation of objects caused by the bird's-eye view of aerial images. More importantly, the lack of large-scale benchmarks has become a major obstacle to the development of object detection in aerial images (ODAI). In this paper, we present a large-scale Dataset of Object deTection in Aerial images (DOTA) and comprehensive baselines for ODAI. The proposed DOTA dataset contains 1,793,658 object instances of 18 categories of oriented-bounding-box annotations collected from 11,268 aerial images. Based on this large-scale and well-annotated dataset, we build baselines covering 10 state-of-the-art algorithms with over 70 configurations, where the speed and accuracy performances of each model have been evaluated. Furthermore, we provide a code library for ODAI and build a website for evaluating different algorithms. Previous challenges run on DOTA have attracted more than 1300 teams worldwide. We believe that the expanded large-scale DOTA dataset, the extensive baselines, the code library and the challenges can facilitate the designs of robust algorithms and reproducible research on the problem of object detection in aerial images.


Subjects
Algorithms, Benchmarking
3.
IEEE Trans Cybern ; 51(5): 2748-2760, 2021 May.
Article in English | MEDLINE | ID: mdl-31331899

ABSTRACT

The training of an autoencoder (AE) focuses on the selection of connection weights via a minimization of both the training error and a regularized term. However, the ultimate goal of AE training is to autoencode future unseen samples correctly (i.e., good generalization). Minimizing the training error with different regularized terms only indirectly minimizes the generalization error. Moreover, the trained model may not be robust to small perturbations of inputs, which may lead to a poor generalization capability. In this paper, we propose a localized stochastic sensitive AE (LiSSA) to enhance the robustness of AE with respect to input perturbations. With the local stochastic sensitivity regularization, LiSSA reduces sensitivity to unseen samples with small differences (perturbations) from training samples. Meanwhile, LiSSA preserves the local connectivity from the original input space to the representation space and thereby learns more robust features (intermediate representations) for unseen samples. The classifier using these learned features yields a better generalization capability. Extensive experimental results on 36 benchmarking datasets indicate that LiSSA outperforms several classical and recent AE training methods significantly on classification tasks.

4.
IEEE Trans Pattern Anal Mach Intell ; 42(11): 2737-2754, 2020 Nov.
Article in English | MEDLINE | ID: mdl-31144627

ABSTRACT

Graph matching is an important and persistent problem in computer vision and pattern recognition for finding node-to-node correspondence between graphs. However, graph matching that incorporates pairwise constraints can be formulated as a quadratic assignment problem (QAP), which is NP-complete and results in intrinsic computational difficulties. This paper presents a functional representation for graph matching (FRGM) that aims to provide more geometric insights on the problem and reduce the space and time complexities. To achieve these goals, we represent each graph by a linear function space equipped with a functional such as inner product or metric, that has an explicit geometric meaning. Consequently, the correspondence matrix between graphs can be represented as a linear representation map. Furthermore, this map can be reformulated as a new parameterization for matching graphs in Euclidean space such that it is consistent with graphs under rigid or nonrigid deformations. This allows us to estimate the correspondence matrix and geometric deformations simultaneously. We use the representation of edge-attributes rather than the affinity matrix to reduce the space complexity and propose an efficient optimization strategy to reduce the time complexity. The experimental results on both synthetic and real-world datasets show that the FRGM can achieve state-of-the-art performance.

5.
IEEE Trans Pattern Anal Mach Intell ; 41(1): 148-161, 2019 Jan.
Article in English | MEDLINE | ID: mdl-29990281

ABSTRACT

This paper presents a new approach for the challenging problem of geo-localization using image matching in a structured database of city-wide reference images with known GPS coordinates. We cast the geo-localization as a clustering problem of local image features. Akin to existing approaches to the problem, our framework builds on low-level features which allow local matching between images. For each local feature in the query image, we find its approximate nearest neighbors in the reference set. Next, we cluster the features from reference images using Dominant Set clustering, which affords several advantages over existing approaches. First, it permits a variable number of nodes in the cluster, which we use to dynamically select the number of nearest neighbors for each query feature based on its discrimination value. Second, this approach is several orders of magnitude faster than existing approaches. Thus, we obtain multiple clusters (different local maximizers) and obtain a robust final solution to the problem using multiple weak solutions through constrained Dominant Set clustering on global image features, where we enforce the constraint that the query image must be included in the cluster. This second level of clustering also bypasses heuristic approaches to voting and selecting the reference image that matches the query. We evaluate the proposed framework on an existing dataset of 102k street view images as well as a new larger dataset of 300k images, and show that it outperforms the state-of-the-art by 20 and 7 percent, respectively, on the two datasets.

6.
IEEE Trans Pattern Anal Mach Intell ; 41(10): 2438-2451, 2019 Oct.
Article in English | MEDLINE | ID: mdl-30040623

ABSTRACT

Image segmentation has come a long way since the early days of computer vision, and still remains a challenging task. Modern variations of the classical (purely bottom-up) approach involve, e.g., some form of user assistance (interactive segmentation) or ask for the simultaneous segmentation of two or more images (co-segmentation). At an abstract level, all these variants can be thought of as "constrained" versions of the original formulation, whereby the segmentation process is guided by some external source of information. In this paper, we propose a new approach to tackle this kind of problem in a unified way. Our work is based on some properties of a family of quadratic optimization problems related to dominant sets, a graph-theoretic notion of a cluster which generalizes the concept of a maximal clique to edge-weighted graphs. In particular, we show that by properly controlling a regularization parameter which determines the structure and the scale of the underlying problem, we are in a position to extract groups of dominant-set clusters that are constrained to contain predefined elements. Specifically, we focus on interactive segmentation and co-segmentation (in both the unsupervised and the interactive versions). The proposed algorithm can deal naturally with several types of constraints and input modalities, including scribbles, sloppy contours and bounding boxes, and is able to robustly handle noisy annotations on the part of the user. Experiments on standard benchmark datasets show the effectiveness of our approach as compared to state-of-the-art algorithms on a variety of natural images under several input conditions and constraints.

7.
IEEE Trans Neural Netw Learn Syst ; 28(11): 2466-2478, 2017 Nov.
Article in English | MEDLINE | ID: mdl-27514067

ABSTRACT

In spam and malware detection, attackers exploit randomization to obfuscate malicious data and increase their chances of evading detection at test time, e.g., malware code is typically obfuscated using random strings or byte sequences to hide known exploits. Interestingly, randomization has also been proposed to improve security of learning algorithms against evasion attacks, as it results in hiding information about the classifier to the attacker. Recent work has proposed game-theoretical formulations to learn secure classifiers, by simulating different evasion attacks and modifying the classification function accordingly. However, both the classification function and the simulated data manipulations have been modeled in a deterministic manner, without accounting for any form of randomization. In this paper, we overcome this limitation by proposing a randomized prediction game, namely, a noncooperative game-theoretic formulation in which the classifier and the attacker make randomized strategy selections according to some probability distribution defined over the respective strategy set. We show that our approach allows one to improve the tradeoff between attack detection and false alarms with respect to the state-of-the-art secure classifiers, even against attacks that are different from those hypothesized during design, on application examples including handwritten digit recognition, spam, and malware detection.

8.
IEEE Trans Pattern Anal Mach Intell ; 36(10): 2104-16, 2014 Oct.
Article in English | MEDLINE | ID: mdl-26352638

ABSTRACT

Ensembles of randomized decision trees, known as Random Forests, have become a valuable machine learning tool for addressing many computer vision problems. Despite their popularity, few works have tried to exploit contextual and structural information in random forests in order to improve their performance. In this paper, we propose a simple and effective way to integrate contextual information in random forests, which is typically reflected in the structured output space of complex problems like semantic image labelling. Our paper has several contributions: We show how random forests can be augmented with structured label information and be used to deliver structured low-level predictions. The learning task is carried out by employing a novel split function evaluation criterion that exploits the joint distribution observed in the structured label space. This allows the forest to learn typical label transitions between object classes and avoid locally implausible label configurations. We provide two approaches for integrating the structured output predictions obtained at a local level from the forest into a concise, global, semantic labelling. We integrate our new ideas also in the Hough-forest framework with the view of exploiting contextual information at the classification level to improve the performance on the task of object detection. Finally, we provide experimental evidence for the effectiveness of our approach on different tasks: Semantic image labelling on the challenging MSRCv2 and CamVid databases, reconstruction of occluded handwritten Chinese characters on the Kaist database and pedestrian detection on the TU Darmstadt databases.

9.
IEEE Trans Pattern Anal Mach Intell ; 35(6): 1312-27, 2013 Jun.
Article in English | MEDLINE | ID: mdl-23599050

ABSTRACT

Hypergraph clustering refers to the process of extracting maximally coherent groups from a set of objects using high-order (rather than pairwise) similarities. Traditional approaches to this problem are based on the idea of partitioning the input data into a predetermined number of classes, thereby obtaining the clusters as a by-product of the partitioning process. In this paper, we offer a radically different view of the problem. In contrast to the classical approach, we attempt to provide a meaningful formalization of the very notion of a cluster and we show that game theory offers an attractive and unexplored perspective that serves our purpose well. To this end, we formulate the hypergraph clustering problem in terms of a noncooperative multiplayer "clustering game," and show that a natural notion of a cluster turns out to be equivalent to a classical (evolutionary) game-theoretic equilibrium concept. We prove that the problem of finding the equilibria of our clustering game is equivalent to locally optimizing a polynomial function over the standard simplex, and we provide a discrete-time high-order replicator dynamics to perform this optimization, based on the Baum-Eagon inequality. Experiments over synthetic as well as real-world data are presented which show the superiority of our approach over the state of the art.


Subjects
Algorithms, Cluster Analysis, Biological Evolution, Game Theory, Humans
10.
IEEE Trans Pattern Anal Mach Intell ; 29(1): 167-72, 2007 Jan.
Article in English | MEDLINE | ID: mdl-17108392

ABSTRACT

We develop a new graph-theoretic approach for pairwise data clustering which is motivated by the analogies between the intuitive concept of a cluster and that of a dominant set of vertices, a notion introduced here which generalizes that of a maximal complete subgraph to edge-weighted graphs. We establish a correspondence between dominant sets and the extrema of a quadratic form over the standard simplex, thereby allowing the use of straightforward and easily implementable continuous optimization techniques from evolutionary game theory. Numerical examples on various point-set and image segmentation problems confirm the potential of the proposed approach.
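
The correspondence between dominant sets and extrema of a quadratic form over the standard simplex can be illustrated with a small sketch. It assumes that the evolutionary-game-theoretic technique in question is the discrete-time replicator dynamics, x_i <- x_i (Ax)_i / (x'Ax), which the abstract does not spell out; the function names, the toy affinity matrix, and the support cutoff are all hypothetical choices made for this illustration.

```python
def replicator_dynamics(A, iters=1000, tol=1e-12):
    """Iterate x_i <- x_i * (A x)_i / (x^T A x) on the standard simplex.

    A is a symmetric, non-negative affinity matrix with zero diagonal,
    given as a list of lists. Returns the final point on the simplex.
    """
    n = len(A)
    x = [1.0 / n] * n  # start from the barycenter of the simplex
    for _ in range(iters):
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        xAx = sum(x[i] * Ax[i] for i in range(n))
        if xAx == 0.0:  # no affinities at all: nothing to optimize
            break
        new_x = [x[i] * Ax[i] / xAx for i in range(n)]
        change = sum(abs(new_x[i] - x[i]) for i in range(n))
        x = new_x
        if change < tol:
            break
    return x


def dominant_set(A, cutoff=1e-4):
    """Return the indices whose weight stays above the (hypothetical) cutoff."""
    x = replicator_dynamics(A)
    return [i for i, xi in enumerate(x) if xi > cutoff]


# Toy data: nodes 0-2 are mutually similar (0.9); nodes 3-4 are outliers (0.1).
A = [
    [0.0, 0.9, 0.9, 0.1, 0.1],
    [0.9, 0.0, 0.9, 0.1, 0.1],
    [0.9, 0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.0, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.0],
]
print(dominant_set(A))  # prints [0, 1, 2]
```

On this toy matrix the weight of the two outliers shrinks geometrically, so the support of the limit point is the tightly connected group. Peeling off the extracted group and rerunning on the remaining nodes is one common way to obtain further clusters, though that step is not described in the abstract.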


Subjects
Algorithms, Artificial Intelligence, Cluster Analysis, Image Enhancement/methods, Computer-Assisted Image Interpretation/methods, Information Storage and Retrieval/methods, Automated Pattern Recognition/methods
11.
Neural Comput ; 18(5): 1215-58, 2006 May.
Article in English | MEDLINE | ID: mdl-16595063

ABSTRACT

Evolutionary game-theoretic models and, in particular, the so-called replicator equations have recently proven to be remarkably effective at approximately solving the maximum clique and related problems. The approach is centered around a classic result from graph theory that formulates the maximum clique problem as a standard (continuous) quadratic program and exploits the dynamical properties of these models, which, under a certain symmetry assumption, possess a Lyapunov function. In this letter, we generalize previous work along these lines in several respects. We introduce a wide family of game-dynamic equations known as payoff-monotonic dynamics, of which replicator dynamics are a special instance, and show that they enjoy precisely the same dynamical properties as standard replicator equations. These properties make any member of this family a potential heuristic for solving standard quadratic programs and, in particular, the maximum clique problem. Extensive simulations, performed on random as well as DIMACS benchmark graphs, show that this class contains dynamics that are considerably faster than and at least as accurate as replicator equations. One problem associated with these models, however, relates to their inability to escape from poor local solutions. To overcome this drawback, we focus on a particular subclass of payoff-monotonic dynamics used to model the evolution of behavior via imitation processes and study the stability of their equilibria when a regularization parameter is allowed to take on negative values. A detailed analysis of these properties suggests a whole class of annealed imitation heuristics for the maximum clique problem, which are based on the idea of varying the parameter during the imitation optimization process in a principled way, so as to avoid unwanted inefficient solutions. 
Experiments show that the proposed annealing procedure does help to avoid poor local optima by initially driving the dynamics toward promising regions in state space. Furthermore, the models outperform state-of-the-art neural network algorithms for maximum clique, such as mean field annealing, and compare well with powerful continuous-based heuristics.
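
As a worked illustration of the quadratic-program formulation mentioned above, the following assumes that the "classic result from graph theory" is the Motzkin-Straus theorem (the abstract does not name it). Here A is the adjacency matrix of a graph G on n vertices, ω(G) is its clique number, and Δ_n is the standard simplex; the update rule in the last line is the standard discrete-time replicator equation, again stated as an assumption about the model family.

```latex
% Motzkin-Straus: the maximum of the quadratic form encodes the clique number.
\max_{x \in \Delta_n} x^{\top} A x \;=\; 1 - \frac{1}{\omega(G)},
\qquad
\Delta_n = \Bigl\{ x \in \mathbb{R}^n : x_i \ge 0,\ \sum_{i=1}^{n} x_i = 1 \Bigr\}

% Check on a clique C of size k, using its characteristic vector
% x^C_i = 1/k for i in C and 0 otherwise: there are k(k-1) ordered pairs.
(x^C)^{\top} A \, x^C \;=\; \sum_{\substack{i, j \in C \\ i \ne j}} \frac{1}{k^2}
\;=\; \frac{k(k-1)}{k^2} \;=\; 1 - \frac{1}{k}

% Example: k = 4 gives 12/16 = 3/4 = 1 - 1/4.

% Discrete-time replicator update, which does not decrease x^T A x
% when A is symmetric:
x_i(t+1) \;=\; x_i(t)\, \frac{\bigl(A x(t)\bigr)_i}{x(t)^{\top} A \, x(t)}
```

Because the objective value at a clique's characteristic vector grows with the clique size, larger cliques correspond to better solutions of the program, which is why a dynamics that climbs this objective can serve as a heuristic for the maximum clique problem.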

12.
IEEE Trans Pattern Anal Mach Intell ; 27(7): 1087-99, 2005 Jul.
Article in English | MEDLINE | ID: mdl-16013756

ABSTRACT

We address the problem of comparing attributed trees and propose four novel distance measures centered around the notion of a maximal similarity common subtree. The proposed measures are general and defined on trees endowed with either symbolic or continuous-valued attributes and can be applied to rooted as well as unrooted trees. We prove that our measures satisfy the metric constraints and provide a polynomial-time algorithm to compute them. This is a remarkable and attractive property, since the computation of traditional edit-distance-based metrics is, in general, NP-complete, at least in the unordered case. We experimentally validate the usefulness of our metrics on shape matching tasks and compare them with (an approximation of) edit-distance.


Subjects
Algorithms, Artificial Intelligence, Computer-Assisted Image Interpretation/methods, Information Storage and Retrieval/methods, Statistical Models, Automated Pattern Recognition/methods, Computer-Assisted Signal Processing, Cluster Analysis, Computer Simulation, Computer-Assisted Numerical Analysis, Time Factors