Results 1 - 8 of 8
1.
Front Robot AI ; 9: 887910, 2022.
Article in English | MEDLINE | ID: mdl-36071857

ABSTRACT

Inspired by human behavior when traveling over unknown terrain, this study proposes the use of probing strategies and integrates them into a traversability analysis framework to address safe navigation on unknown rough terrain. Our framework integrates collapsibility information into our existing traversability analysis, because vision and geometric information alone can be misled by unpredictable non-rigid terrain such as soft soil, bushes, or water puddles. With the new traversability analysis framework, our robot has a more comprehensive assessment of unpredictable terrain, which is critical for its safety in outdoor environments. The pipeline first uses an RGB-D camera to identify the terrain's geometric and semantic properties and to select probing locations on questionable terrain. These regions are probed using a force sensor to determine the risk of the terrain collapsing when the robot steps on it. This risk is formulated as a collapsibility metric, which estimates the ground collapsibility of an unpredictable region. The collapsibility metric is then combined with geometric and semantic spatial data and analyzed to produce global and local traversability grid maps. These traversability grid maps tell the robot whether it is safe to step on different regions of the map. The grid maps are then used to generate optimal paths for the robot to navigate safely to its goal. Our approach has been successfully verified on a quadrupedal robot in both simulation and real-world experiments.
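
As a rough illustration of how the three information sources described in this abstract could be fused into a traversability grid, here is a minimal sketch. The abstract does not give the combination rule, so the weighted sum, the weights, the threshold, the default risk for unprobed cells, and every name (`Cell`, `traversability`, `traversability_grid`) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Cell:
    """One grid cell; all scores are hypothetical and normalized to [0, 1]."""
    geometric: float                        # 1.0 = flat and easy, 0.0 = impassable
    semantic: float                         # 1.0 = safe terrain class, 0.0 = unsafe class
    collapsibility: Optional[float] = None  # 1.0 = collapses, 0.0 = firm, None = not probed


def traversability(cell: Cell,
                   weights: Tuple[float, float, float] = (0.4, 0.3, 0.3),
                   unprobed_risk: float = 0.5) -> float:
    """Weighted sum of the three cues (an assumed rule, not the paper's)."""
    # Cells that were never probed get a neutral, assumed collapse risk.
    risk = unprobed_risk if cell.collapsibility is None else cell.collapsibility
    w_geo, w_sem, w_col = weights
    return w_geo * cell.geometric + w_sem * cell.semantic + w_col * (1.0 - risk)


def traversability_grid(grid: List[List[Cell]],
                        threshold: float = 0.7) -> List[List[bool]]:
    """Mark each cell as safe to step on (True) or not (False)."""
    return [[traversability(cell) >= threshold for cell in row] for row in grid]
```

In this sketch, a cell that looks fine to the camera (high geometric and semantic scores) is still rejected once probing reports a high collapsibility, which is the failure mode of vision-only analysis that the abstract describes.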

2.
Article in English | MEDLINE | ID: mdl-29994115

ABSTRACT

Existing subspace clustering methods typically employ shallow models to estimate the underlying subspaces of unlabeled data points and cluster them into corresponding groups. However, due to the limited representative capacity of these shallow models, such methods may fail on realistic data that lack a linear subspace structure. To address this issue, we propose a novel subspace clustering approach by introducing a new deep model, the Structured AutoEncoder (StructAE). The StructAE learns a set of explicit transformations to progressively map input data points into nonlinear latent spaces while preserving the local and global subspace structure. In particular, to preserve local structure, the StructAE learns a representation for each data point by minimizing the reconstruction error w.r.t. the point itself. To preserve global structure, the StructAE incorporates prior structured information by encouraging the learned representation to preserve specified reconstruction patterns over the entire data set. To the best of our knowledge, StructAE is one of the first deep subspace clustering approaches. Extensive experiments show that the proposed StructAE significantly outperforms 15 state-of-the-art subspace clustering approaches in terms of five evaluation metrics.
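
The two goals named in this abstract (a local self-reconstruction term and a global structure-preserving term) can be written as a single objective. The form below is only a sketch of that idea: the symbols $\hat{x}_i$ (the autoencoder's reconstruction of $x_i$), $h_i$ (its latent representation), $H$ (the matrix of all latent representations), $c_i$ (a precomputed coefficient vector standing in for the "specified reconstruction patterns"), and the trade-off weight $\lambda$ are assumptions and are not taken from the abstract.

```latex
\min_{\Theta}\;
\underbrace{\sum_{i=1}^{n} \left\| x_i - \hat{x}_i \right\|_2^2}_{\text{local: each point reconstructs itself}}
\;+\;
\lambda \underbrace{\sum_{i=1}^{n} \left\| h_i - H c_i \right\|_2^2}_{\text{global: latent codes keep the prescribed reconstruction pattern}}
```

Here $\Theta$ denotes the network parameters; setting $\lambda = 0$ would reduce the sketch to a plain autoencoder.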

3.
Front Comput Neurosci ; 12: 103, 2018.
Article in English | MEDLINE | ID: mdl-30622466

ABSTRACT

The Hough transform (HT) is one of the best-known techniques in computer vision and has been the basis of many practical image processing algorithms. HT, however, is designed to work for frame-based systems such as conventional digital cameras. Recently, event-based systems such as Dynamic Vision Sensor (DVS) cameras have become popular among researchers. Event-based cameras have a very high temporal resolution (1 µs), but each pixel can only detect change and not color. As such, conventional image processing algorithms cannot be readily applied to event-based output streams and must be adapted for event-based cameras. This paper provides a systematic explanation, starting from extending conventional HT to 3D HT, through its adaptation to event-based systems, to the implementation of the 3D HT using Spiking Neural Networks (SNNs). Using SNNs enables the proposed solution to be easily realized in hardware on an FPGA, without requiring a CPU or additional memory. In addition, we discuss techniques for optimal SNN-based implementation using an efficient number of neurons for the required accuracy and resolution along each dimension, without increasing the overall computational complexity. We hope that this will help to reduce the gap between event-based and frame-based systems.
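
To make the adaptation to event streams concrete, the sketch below applies the conventional 2D line Hough transform one event at a time, so the accumulator is updated as each pixel event arrives instead of once per frame. It is not the paper's 3D HT or its SNN implementation; the function names, the 1-degree angular resolution, and the `rho_step` parameter are illustrative assumptions.

```python
import math
from collections import defaultdict


def make_accumulator():
    """Sparse vote table keyed by (theta index, rho bin)."""
    return defaultdict(int)


def vote(acc, x, y, n_theta=180, rho_step=1.0):
    """Add one event's votes: one (theta, rho) bin per sampled angle for lines through (x, y)."""
    for i in range(n_theta):
        theta = math.pi * i / n_theta
        # Normal form of a line: rho = x*cos(theta) + y*sin(theta)
        rho = x * math.cos(theta) + y * math.sin(theta)
        acc[(i, round(rho / rho_step))] += 1


def strongest_line(acc, n_theta=180, rho_step=1.0):
    """Return (theta in degrees, rho, votes) for the fullest bin."""
    (i, r), votes = max(acc.items(), key=lambda item: item[1])
    return math.degrees(math.pi * i / n_theta), r * rho_step, votes


# Usage: events arriving one by one along the vertical line x = 5.
acc = make_accumulator()
for k in range(10):
    vote(acc, 5, 10 * k)
theta_deg, rho, votes = strongest_line(acc)
```

Because each event touches only `n_theta` accumulator bins, the per-event cost is constant, which is the property that makes this kind of voting scheme a natural fit for asynchronous sensors.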

4.
IEEE Trans Image Process ; 24(7): 2140-52, 2015 Jul.
Article in English | MEDLINE | ID: mdl-25823034

ABSTRACT

Recognition of natural emotions from human faces is an interesting topic with a wide range of potential applications, such as human-computer interaction, automated tutoring systems, image and video retrieval, smart environments, and driver warning systems. Traditionally, facial emotion recognition systems have been evaluated on laboratory-controlled data, which is not representative of the environment faced in real-world applications. To robustly recognize facial emotions in real-world natural situations, this paper proposes an approach called extreme sparse learning, which jointly learns a dictionary (a set of bases) and a nonlinear classification model. The proposed approach combines the discriminative power of the extreme learning machine with the reconstruction property of sparse representation to enable accurate classification when presented with noisy signals and imperfect data recorded in natural settings. In addition, this paper presents a new local spatio-temporal descriptor that is distinctive and pose-invariant. The proposed framework achieves state-of-the-art recognition accuracy on both acted and spontaneous facial emotion databases.


Subjects
Emotions/physiology , Facial Expression , Facial Recognition/physiology , Image Interpretation, Computer-Assisted/methods , Machine Learning , Photography/methods , Algorithms , Humans , Image Enhancement/methods , Pattern Recognition, Automated/methods , Reproducibility of Results , Sensitivity and Specificity , Subtraction Technique
5.
IEEE Trans Pattern Anal Mach Intell ; 35(10): 2484-97, 2013 Oct.
Article in English | MEDLINE | ID: mdl-23969391

ABSTRACT

Most face recognition systems require faces to be detected and localized a priori. In this paper, an approach to simultaneously detect and localize multiple faces having arbitrary views and different scales is proposed. The main contribution of this paper is the introduction of a face constellation, which enables multiview face detection and localization. In contrast to other multiview approaches that require many manually labeled images for training, the proposed face constellation requires only a single reference image of a face containing two manually indicated reference points for initialization. Subsequent training face images from arbitrary views are automatically added to the constellation (registered to the reference image) by finding the correspondences between distinctive local features. Thus, the key advantage of the proposed scheme is the minimal manual intervention required to train the face constellation. We also propose an approach to identify distinctive correspondence points between pairs of face images in the presence of a large number of false matches. To detect and localize multiple faces with arbitrary views, we then propose a probabilistic classifier-based formulation to evaluate whether a local feature cluster corresponds to a face. Experiments conducted on the FERET, CMU, and FDDB datasets show that our proposed approach performs better than state-of-the-art approaches for detecting faces with arbitrary pose.


Subjects
Algorithms , Artificial Intelligence , Biometry/methods , Face/anatomy & histology , Image Interpretation, Computer-Assisted/methods , Pattern Recognition, Automated/methods , Photography/methods , Image Enhancement/methods , User-Computer Interface
6.
IEEE Trans Image Process ; 20(7): 2049-62, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21245008

ABSTRACT

Facial age classification is an approach to classify face images into one of several predefined age groups. One of the difficulties in applying learning techniques to the age classification problem is the large amount of labeled training data required. Acquiring such training data is very costly in terms of age progress, privacy, human time, and effort. Although unlabeled face images can be obtained easily, it would be expensive to manually label them on a large scale and obtain the ground truth. The frugal selection of unlabeled data for labeling, so as to quickly reach high classification performance with minimal labeling effort, is a challenging problem. In this paper, we present an active learning approach based on an online incremental bilateral two-dimensional linear discriminant analysis (IB2DLDA), which initially learns from a small pool of labeled data and then iteratively selects the most informative samples from the unlabeled set to progressively improve the classifier. Specifically, we propose a novel data selection criterion called the furthest nearest-neighbor (FNN), which generalizes margin-based uncertainty to the multiclass case and is easy to compute, so that the proposed active learning algorithm can handle a large number of classes and large data sizes efficiently. Empirical experiments on the FG-NET and Morph databases, together with a large unlabeled data set for age categorization problems, show that the proposed approach can achieve results comparable to, or even better than, those of a conventionally trained active classifier that requires much more labeling effort. Our IB2DLDA-FNN algorithm can achieve similar results much faster than random selection and with fewer samples for age categorization. It can also achieve results comparable with active SVM but is much faster to train because kernel methods are not needed. The results on the face recognition database and the palmprint/palm vein database showed that our approach can handle problems with a large number of classes. Our contributions in this paper are twofold. First, we proposed the IB2DLDA-FNN, the FNN being our novel idea, as a generic online or active learning paradigm. Second, we showed that it can be another viable tool for active learning of facial age range classification.
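
The abstract names the furthest nearest-neighbor (FNN) selection criterion without defining it. The sketch below shows one plausible reading of the name, stated here as an assumption rather than as the paper's definition: for each unlabeled sample, measure the distance to its nearest class centroid, then query the sample for which that distance is largest. The function names and the use of class centroids in plain Euclidean space are hypothetical.

```python
import math


def nearest_distance(sample, centroids):
    """Distance from a sample to the closest class centroid."""
    return min(math.dist(sample, c) for c in centroids)


def select_fnn(unlabeled, centroids):
    """Index of the unlabeled sample whose nearest centroid is furthest away.

    Assumed reading of 'furthest nearest-neighbor': a sample far from every
    class is the one the current classifier is least certain about.
    """
    return max(range(len(unlabeled)),
               key=lambda i: nearest_distance(unlabeled[i], centroids))


# Usage: two class centroids and three candidate samples to label.
centroids = [(0.0, 0.0), (10.0, 0.0)]
unlabeled = [(1.0, 0.0), (5.0, 0.0), (9.0, 1.0)]
query_index = select_fnn(unlabeled, centroids)  # the midpoint (5.0, 0.0) is chosen
```

Under this reading the cost per query is one distance computation per sample per class, which is consistent with the abstract's claim that the criterion is easy to compute and scales to many classes.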


Subjects
Artificial Intelligence , Biometric Identification/methods , Face/anatomy & histology , Adolescent , Adult , Age Factors , Algorithms , Child , Child, Preschool , Cluster Analysis , Databases, Factual , Discriminant Analysis , Humans , Infant , Middle Aged
7.
IEEE Trans Image Process ; 15(6): 1583-600, 2006 Jun.
Article in English | MEDLINE | ID: mdl-16764283

ABSTRACT

This paper presents a real-time foreground detection method for monitoring swimming activities at an outdoor swimming pool. Robust performance and high accuracy in detecting objects of interest are the two central concerns. Therefore, considerable attention is given to the following aspects: 1) establishing a better method of modeling the aquatic background, which exhibits dynamic characteristics with random spatial movements, and 2) establishing a method of enhancing the visibility of the foreground by removing specular reflection at nighttime. First, the development of a new background modeling method is reported. In the proposed approach, the background is modeled as a composition of homogeneous blob movements. With the implementation of a spatial searching process, the proposed method is able to associate and distinguish movements caused by the background, which contributes to better performance in foreground detection. To enhance the visibility of the foreground, a decision-based filtering scheme is proposed as a preprocessing step. A term, fluctuation measure, is defined for classifying each pixel as one of the predefined types. This allows suitable spatial or spatiotemporal filters to be applied accordingly in the color compensation step. All of these developments are evaluated by live testing at a busy Olympic-size outdoor public swimming pool. Both qualitative and quantitative evaluations are reported, providing a comprehensive study of the system.


Subjects
Biometry/methods , Image Interpretation, Computer-Assisted/methods , Movement , Pattern Recognition, Automated/methods , Swimming , Video Recording/methods , Water , Algorithms , Artificial Intelligence , Cluster Analysis , Computer Systems , Humans , Image Enhancement/methods , Information Storage and Retrieval/methods , Reproducibility of Results , Sensitivity and Specificity
8.
IEEE Trans Syst Man Cybern B Cybern ; 34(2): 1196-209, 2004 Apr.
Article in English | MEDLINE | ID: mdl-15376864

ABSTRACT

In this paper, we treat the problem of combining fingerprint and speech biometric decisions as a classifier fusion problem. By exploiting the specialist capabilities of each classifier, a combined classifier may yield results that would not be possible with a single classifier. The Feedforward Neural Network provides a natural choice for such data fusion, as it has been shown to be a universal approximator. However, the training process remains largely a trial-and-error effort, since no learning algorithm can guarantee convergence to an optimal solution within a finite number of iterations. In this work, we propose a network model that generates different combinations of hyperbolic functions to achieve certain approximation and classification properties. This circumvents the iterative training problem seen in neural network learning. In many decision data fusion applications, the individual classifiers or estimators to be combined have already attained a certain level of classification or approximation accuracy, so this hyperbolic functions network can be used to combine these classifiers, taking their decision outputs as the inputs to the network. The proposed hyperbolic functions network model is first applied to a function approximation problem to illustrate its approximation capability. This is followed by some case studies on pattern classification problems. Finally, the model is applied to combine fingerprint and speaker verification decisions, where it shows better or comparable results with respect to several commonly used methods.
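
The idea of fusing classifier decisions through combinations of hyperbolic functions, without iterative training, can be illustrated with a small sketch. The abstract does not specify the network's structure, so everything here is an assumption: the fixed basis (a bias term plus `tanh` and `sinh` of each score), the closed-form ridge-regularized least-squares fit, and all function names.

```python
import math


def features(scores):
    """Assumed hyperbolic basis: bias, then tanh and sinh of each classifier score."""
    feats = [1.0]
    for s in scores:
        feats += [math.tanh(s), math.sinh(s)]
    return feats


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit(samples, labels, ridge=1e-3):
    """One-shot fit of the output weights: (P^T P + ridge * I) w = P^T y."""
    p = [features(s) for s in samples]
    k = len(p[0])
    gram = [[sum(row[i] * row[j] for row in p) + (ridge if i == j else 0.0)
             for j in range(k)] for i in range(k)]
    rhs = [sum(row[i] * y for row, y in zip(p, labels)) for i in range(k)]
    return solve(gram, rhs)


def fuse(weights, scores):
    """Fused decision score; values above 0.5 are read as 'accept' in this sketch."""
    return sum(w * f for w, f in zip(weights, features(scores)))


# Usage: (fingerprint score, speaker score) pairs; 1 = genuine, 0 = impostor.
samples = [(0.9, 0.8), (0.7, 0.9), (0.8, 0.6), (0.1, 0.2), (0.3, 0.1), (0.2, 0.4)]
labels = [1, 1, 1, 0, 0, 0]
weights = fit(samples, labels)
```

Because the basis functions are fixed and only the linear output weights are estimated, the fit is a single linear solve rather than an iterative search, which mirrors the motivation given in the abstract.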


Subjects
Artificial Intelligence , Biometry/methods , Decision Support Techniques , Dermatoglyphics/classification , Pattern Recognition, Automated , Speech Production Measurement/methods , Speech/classification , Algorithms , Humans , Information Storage and Retrieval/methods , Systems Integration
...