1.
IEEE Trans Neural Netw Learn Syst ; 32(1): 77-90, 2021 Jan.
Article in English | MEDLINE | ID: mdl-32167913

ABSTRACT

Pedestrian path prediction is a very challenging problem because scenes are often crowded or contain obstacles. Existing state-of-the-art long short-term memory (LSTM)-based prediction methods have mainly focused on analyzing the influence of other people in the neighborhood of each pedestrian while neglecting the role of potential destinations in determining a walking path. In this article, we propose classifying pedestrian trajectories into a number of route classes (RCs) and using them to describe the pedestrian movement patterns. Based on the RCs obtained from trajectory clustering, our algorithm, which we name the prediction of pedestrian paths by LSTM (PoPPL), predicts the destination regions through a bidirectional LSTM classification network in the first stage and then generates trajectories corresponding to the predicted destination regions through one of the three proposed LSTM-based architectures in the second stage. Our algorithm also outputs probabilities of multiple predicted trajectories that head toward the destination regions. We have evaluated PoPPL against other state-of-the-art methods on two public data sets. The results show that our algorithm outperforms other methods and that incorporating potential destination prediction improves the trajectory prediction accuracy.


Subjects
Movement; Pattern Recognition, Automated/methods; Algorithms; Cluster Analysis; Forecasting; Humans; Neural Networks, Computer; Pedestrians
2.
IEEE Trans Image Process ; 29: 15-28, 2020.
Article in English | MEDLINE | ID: mdl-31283506

ABSTRACT

Video-based human action recognition is currently one of the most active research areas in computer vision. Various research studies indicate that the performance of action recognition is highly dependent on the type of features being extracted and how the actions are represented. Since the release of the Kinect camera, a large number of Kinect-based human action recognition techniques have been proposed in the literature. However, there still does not exist a thorough comparison of these Kinect-based techniques under the grouping of feature types, such as handcrafted versus deep learning features and depth-based versus skeleton-based features. In this paper, we analyze and compare 10 recent Kinect-based algorithms for both cross-subject action recognition and cross-view action recognition using six benchmark datasets. In addition, we have implemented and improved some of these techniques and included their variants in the comparison. Our experiments show that the majority of methods perform better on cross-subject action recognition than cross-view action recognition, that the skeleton-based features are more robust for cross-view recognition than the depth-based features, and that the deep learning features are suitable for large datasets.

3.
IEEE Trans Cybern ; 50(4): 1726-1738, 2020 Apr.
Article in English | MEDLINE | ID: mdl-30582560

ABSTRACT

Visual tracking has been an active research area in computer vision for decades. However, the performance of existing techniques is still challenged by various factors, such as occlusion and change in appearance of the target. In this paper, we propose a novel framework based on correlation filtering and probabilistic finite state machines (FSMs) to handle occlusion. In our tracking framework, the target is partitioned into several parts whose occlusion states are automatically detected. A set of states for the target is defined in terms of the combination of the parts' occlusion states. The probabilistic FSMs are then used to model the target's state transitions so as to reduce the effect of noise in the output response maps of correlation filters. Our target model's update strategy is adaptable online depending on the estimated state of the target. Extensive experiments have been performed on several public benchmarks and the proposed algorithm achieves competitive results against state-of-the-art techniques.
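
To illustrate how a probabilistic state machine can damp noisy per-frame occlusion decisions, the sketch below runs a two-state forward filter over a sequence of raw detections for a single part. The state names, the transition probabilities, and the detector accuracy are hypothetical values chosen for the example. The abstract does not give them, and the paper defines target states over combinations of part occlusion states rather than over one part.

```python
# Minimal sketch: a two-state probabilistic FSM (visible / occluded) for one part.
# All probabilities below are illustrative assumptions, not values from the paper.

STATES = ("visible", "occluded")

# TRANSITION[a][b] = P(next state b | current state a); states tend to persist.
TRANSITION = {
    "visible": {"visible": 0.9, "occluded": 0.1},
    "occluded": {"visible": 0.2, "occluded": 0.8},
}

# Assumed probability that the raw per-frame detector reports the true state.
DETECTOR_ACCURACY = 0.7


def filter_states(observations, prior=None):
    """Return the most probable state for each frame using forward filtering."""
    belief = dict(prior or {"visible": 0.5, "occluded": 0.5})
    decoded = []
    for obs in observations:
        # Predict: push the current belief through the transition model.
        predicted = {
            s: sum(belief[p] * TRANSITION[p][s] for p in STATES) for s in STATES
        }
        # Update: weight each state by the likelihood of the raw observation.
        unnormalized = {
            s: predicted[s]
            * (DETECTOR_ACCURACY if obs == s else 1.0 - DETECTOR_ACCURACY)
            for s in STATES
        }
        total = sum(unnormalized.values())
        belief = {s: unnormalized[s] / total for s in STATES}
        decoded.append(max(STATES, key=lambda s: belief[s]))
    return decoded
```

With these assumed numbers, a single spurious "occluded" reading inside a run of "visible" readings is suppressed, while a sustained run of "occluded" readings switches the decoded state after a short delay.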

4.
IEEE Trans Image Process ; 27(3): 1361-1375, 2018 Mar.
Article in English | MEDLINE | ID: mdl-29990195

ABSTRACT

We present a multiple pedestrian tracking method for monocular videos captured by a fixed camera in an interacting multiple model (IMM) framework. Our tracking method involves multiple IMM trackers running in parallel, which are tied together by a robust data association component. We investigate two data association strategies which take into account both the target appearance and motion errors. We use a 4D color histogram as the appearance model for each pedestrian returned by a people detector that is based on the histogram of oriented gradients features. Short-term occlusion problems and false negative errors from the detector are dealt with using a sliding window of video frames, where tracking persists in the absence of observations. Our method has been evaluated, and compared both qualitatively and quantitatively with four state-of-the-art visual tracking methods using benchmark video databases. The experiments demonstrate that, on average, our tracking method outperforms these four methods.

5.
IEEE Trans Image Process ; 25(1): 92-103, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26595920

ABSTRACT

The choice of metric critically affects the performance of classification and clustering algorithms. Metric learning algorithms attempt to improve performance by learning a more appropriate metric. Unfortunately, most of the current algorithms learn a distance function which is not invariant to rigid transformations of images. Therefore, the distances between two images and their rigidly transformed pair may differ, leading to inconsistent classification or clustering results. We propose to constrain the learned metric to be invariant to the geometry preserving transformations of images that induce permutations in the feature space. The constraint that these transformations are isometries of the metric ensures consistent results and improves accuracy. Our second contribution is a dimension reduction technique that is consistent with the isometry constraints. Our third contribution is the formulation of the isometry constrained logistic discriminant metric learning (IC-LDML) algorithm, by incorporating the isometry constraints within the objective function of the LDML algorithm. The proposed algorithm is compared with existing techniques on the publicly available Labeled Faces in the Wild, Viewpoint Invariant Pedestrian Recognition, and Toy Cars data sets. The IC-LDML algorithm outperformed existing techniques for the tasks of face recognition, person identification, and object classification by a significant margin.
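
The isometry constraint can be made concrete with a small sketch. For a squared distance of the form d(x, y) = (x - y)^T M (x - y), a feature permutation p is an isometry exactly when M[p[i]][p[j]] = M[i][j] for all i, j. The code below enforces this by averaging an arbitrary matrix M over a small permutation group (identity plus horizontal flip). This averaging step is an illustrative device assumed for the example, not the IC-LDML optimization described in the abstract, and all function names are hypothetical.

```python
# Minimal sketch, assuming a squared Mahalanobis-style distance and a
# permutation group given as index lists. Not the paper's IC-LDML procedure.


def flip_permutation(height, width):
    """Index map p with flipped[i] = image[p[i]] for a row-major flattened image."""
    return [r * width + (width - 1 - c) for r in range(height) for c in range(width)]


def apply_permutation(p, x):
    """Apply the index map p to the feature vector x."""
    return [x[k] for k in p]


def symmetrize_metric(M, group):
    """Average M over a permutation group so every member becomes an isometry.

    `group` must be closed under composition (here: identity and one flip).
    """
    n = len(M)
    return [
        [sum(M[p[i]][p[j]] for p in group) / len(group) for j in range(n)]
        for i in range(n)
    ]


def squared_distance(M, x, y):
    """Compute (x - y)^T M (x - y)."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    return sum(d[i] * M[i][j] * d[j] for i in range(n) for j in range(n))
```

For a 2-by-3 image, `flip_permutation(2, 3)` together with the identity `list(range(6))` forms a closed group; after `symmetrize_metric`, the distance between two images equals the distance between their flipped versions.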


Subjects
Algorithms; Machine Learning; Pattern Recognition, Automated/methods; Automobiles; Databases, Factual; Face/anatomy & histology; Humans; Pedestrians/classification; Play and Playthings
6.
PLoS One ; 10(5): e0127113, 2015.
Article in English | MEDLINE | ID: mdl-26000460

ABSTRACT

Active video games that require physical exertion during game play have been shown to confer health benefits. Typically, energy expended during game play is measured using devices attached to players, such as accelerometers, or portable gas analyzers. Since 2010, active video gaming technology incorporates marker-less motion capture devices to simulate human movement into game play. Using the Kinect Sensor and Microsoft SDK, this research aimed to estimate the mechanical work performed by the human body and estimate subsequent metabolic energy using predictive algorithmic models. Nineteen university students participated in a repeated measures experiment performing four fundamental movements (arm swings, standing jumps, body-weight squats, and jumping jacks). Metabolic energy was captured using a Cortex Metamax 3B automated gas analysis system with mechanical movement captured by the combined motion data from two Kinect cameras. Estimations of the body segment properties, such as segment mass, length, centre of mass position, and radius of gyration, were calculated from the Zatsiorsky-Seluyanov's equations of de Leva, with adjustment made for posture cost. GPML toolbox implementation of the Gaussian Process Regression, a locally weighted k-Nearest Neighbour Regression, and a linear regression technique were evaluated for their performance on predicting the metabolic cost from new feature vectors. The experimental results show that Gaussian Process Regression outperformed the other two techniques by a small margin. This study demonstrated that physical activity energy expenditure during exercise, using the Kinect camera as a motion capture system, can be estimated from segmental mechanical work. Estimates for high-energy activities, such as standing jumps and jumping jacks, can be made accurately, but for low-energy activities, such as squatting, the posture of static poses should be considered as a contributing factor. When translated into the active video gaming environment, the results could be incorporated into game play to more accurately control the energy expenditure requirements.
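
As a rough illustration of how segmental mechanical work might be derived from the segment properties listed above, the sketch below computes the instantaneous mechanical energy of one rigid segment and sums the absolute frame-to-frame energy changes. The abstract does not state which work definition the study used; summing absolute changes is one common convention and is an assumption here, as are the function names and the treatment of the radius of gyration as a fraction of segment length.

```python
# Minimal sketch under stated assumptions; not the study's actual pipeline.

G = 9.81  # gravitational acceleration in m/s^2


def segment_energy(mass, length, gyration_ratio, com_height, com_speed, angular_speed):
    """Total mechanical energy (J) of one rigid body segment at one instant.

    gyration_ratio is assumed to be the radius of gyration divided by the
    segment length, so the moment of inertia about the centre of mass is
    mass * (gyration_ratio * length) ** 2.
    """
    inertia = mass * (gyration_ratio * length) ** 2
    potential = mass * G * com_height
    translational = 0.5 * mass * com_speed ** 2
    rotational = 0.5 * inertia * angular_speed ** 2
    return potential + translational + rotational


def mechanical_work(energies):
    """Sum of absolute frame-to-frame changes in a segment's energy (J)."""
    return sum(abs(later - earlier) for earlier, later in zip(energies, energies[1:]))
```

Under this convention, whole-body work would be the sum of `mechanical_work` over all segments, and that total (or per-segment values) could serve as input features for a regression model that predicts metabolic cost.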


Subjects
Energy Metabolism/physiology; Exercise/physiology; Movement/physiology; Recreation/physiology; Video Games; Adult; Female; Humans; Male; Physical Exertion/physiology; Posture/physiology; Young Adult
7.
IEEE Trans Image Process ; 22(11): 4286-300, 2013 Nov.
Article in English | MEDLINE | ID: mdl-23846470

ABSTRACT

In this paper, we propose a hybrid method that combines Gaussian process learning, a particle filter, and annealing to track the 3D pose of a human subject in video sequences. Our approach, which we refer to as annealed Gaussian process guided particle filter, comprises two steps. In the training step, we use a supervised learning method to train a Gaussian process regressor that takes the silhouette descriptor as an input and produces multiple output poses modeled by a mixture of Gaussian distributions. In the tracking step, the output pose distributions from the Gaussian process regression are combined with the annealed particle filter to track the 3D pose in each frame of the video sequence. Our experiments show that the proposed method does not require initialization and does not lose tracking of the pose. We compare our approach with a standard annealed particle filter using the HumanEva-I dataset and with other state-of-the-art approaches using the HumanEva-II dataset. The evaluation results show that our approach can successfully track the 3D human pose over long video sequences and give more accurate pose tracking results than the annealed particle filter.
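
The annealing idea behind an annealed particle filter can be sketched on a toy one-dimensional problem. At each layer, particles are weighted by exp(-beta * cost), resampled, and diffused with shrinking noise, while beta grows so that the weighting sharpens gradually. The cost function, the beta schedule, the noise halving rule, and the function name are all assumptions made for this example; the sketch omits the Gaussian process guidance and the silhouette-based likelihood described in the abstract.

```python
import math
import random


def annealed_search(cost, particles, betas, noise_scale, rng):
    """Toy annealed particle search in one dimension.

    cost        : function mapping a particle (float) to a non-negative cost
    particles   : initial list of floats
    betas       : increasing annealing exponents, one per layer (assumed schedule)
    noise_scale : diffusion standard deviation at the first layer
    rng         : random.Random instance, passed in for reproducibility
    """
    for layer, beta in enumerate(betas):
        # Weight each particle; a small beta gives a broad, forgiving weighting.
        weights = [math.exp(-beta * cost(p)) for p in particles]
        # Resample particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=len(particles))
        # Diffuse with noise that halves at every layer (an assumed rule).
        sigma = noise_scale * (0.5 ** layer)
        particles = [p + rng.gauss(0.0, sigma) for p in particles]
    # Report the particle mean as the state estimate.
    return sum(particles) / len(particles)
```

A possible call starts from particles spread uniformly over a wide interval, for example `annealed_search(lambda x: (x - 3.0) ** 2, particles, [0.1, 0.5, 2.0, 8.0], 1.0, random.Random(0))`, which should concentrate the particles near the minimum of the cost.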


Subjects
Algorithms; Biometry/methods; Image Interpretation, Computer-Assisted/methods; Imaging, Three-Dimensional/methods; Pattern Recognition, Automated/methods; Posture; Computer Simulation; Humans; Image Enhancement/methods; Models, Biological; Models, Statistical; Normal Distribution; Reproducibility of Results; Sensitivity and Specificity