1.
IEEE Trans Pattern Anal Mach Intell ; 43(2): 623-637, 2021 Feb.
Article in English | MEDLINE | ID: mdl-31369369

ABSTRACT

Much progress has been made in non-rigid structure from motion (NRSfM) during the last two decades, making it possible to obtain reasonable solutions for synthetically created benchmark data. To use these NRSfM techniques in more realistic situations, however, two important problems must be solved. First, general scenes contain complex deformations as well as multiple objects, which violates the usual assumptions of previous NRSfM proposals. Second, many regions in a video are unreconstructable, either because their 2D trajectories are discontinued or because the regions are static relative to the camera, and these require careful handling. In this paper, we show that a consensus-based reconstruction framework can handle these issues effectively. Even though the entire scene is complex, its parts usually have simpler deformations, and even though some parts are unreconstructable, they can be weeded out to reduce their harmful effect on the overall reconstruction. The main difficulty of this approach lies in identifying appropriate parts; however, this can be avoided by sampling parts stochastically and aggregating their reconstructions afterwards. Experimental results show that the proposed method advances the state of the art on popular benchmark data under much harsher conditions, i.e., narrow camera view ranges, and that it can reconstruct video-based real-world data effectively over as many areas as possible without elaborate user input.
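The sample-then-aggregate idea in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' algorithm: `reconstruct_part` is a hypothetical stand-in for a part-level NRSfM solver, the median is one arbitrary choice of aggregation rule, and the toy data (ten points, two of them unreconstructable) is invented for the example.

```python
import random
import statistics


def consensus_reconstruct(num_points, reconstruct_part, num_samples=50,
                          part_size=4, seed=0):
    """Aggregate per-point estimates from randomly sampled parts.

    reconstruct_part(indices) is a hypothetical callable returning a dict
    {point_index: estimate}; points it cannot reconstruct are omitted.
    """
    rng = random.Random(seed)
    estimates = {i: [] for i in range(num_points)}
    for _ in range(num_samples):
        # Sample a part stochastically instead of searching for a good one.
        part = rng.sample(range(num_points), part_size)
        for idx, value in reconstruct_part(part).items():
            estimates[idx].append(value)
    # Median aggregation (an assumption); points with no estimate are dropped.
    return {i: statistics.median(v) for i, v in estimates.items() if v}


# Toy stand-in for a part-level solver: true depths plus bounded noise,
# with points 2 and 7 treated as unreconstructable (invented for this demo).
TRUE_DEPTH = [float(i) for i in range(10)]
UNRECONSTRUCTABLE = {2, 7}
_noise = random.Random(1)


def toy_reconstruct_part(indices):
    return {i: TRUE_DEPTH[i] + _noise.uniform(-0.05, 0.05)
            for i in indices if i not in UNRECONSTRUCTABLE}


result = consensus_reconstruct(len(TRUE_DEPTH), toy_reconstruct_part)
print(sorted(result))  # indices that received at least one estimate
```

Points that no sampled part manages to reconstruct never enter the aggregate, which mirrors the "weeding out" described in the abstract.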

2.
Sensors (Basel) ; 19(23)2019 Nov 29.
Article in English | MEDLINE | ID: mdl-31795509

ABSTRACT

Compound eyes, also known as insect eyes, have a unique structure: a hemispheric surface on which many single eyes are arranged regularly. Thanks to this form, compound images offer several advantages, such as a large field of view (FOV) with low aberrations. These benefits can be exploited in high-level vision applications, such as object recognition or semantic segmentation for a moving robot, by emulating the compound images that compound eye cameras would capture of a scene. In this paper, we propose what is, to the best of our knowledge, the first convolutional neural network (CNN)-based ego-motion classification algorithm designed for the compound eye structure. To achieve this, we introduce a voting-based approach that exploits one of the unique features of compound images, namely that each consists of many single-eye images. The proposed method classifies a number of local motions with a CNN, and these local classifications, which represent the motion of each single-eye image, are aggregated into the final classification by a voting procedure. For the experiments, we collected a new dataset for compound eye camera ego-motion classification, which contains scenes from inside and outside a building. Each sample in the dataset consists of two consecutive emulated compound images and the corresponding ego-motion class. The experimental results show that the proposed method achieves a classification accuracy of 85.0%, which is superior to the baselines on the proposed dataset. The proposed model is also lightweight compared with conventional CNN-based image recognition models such as AlexNet, ResNet50, and MobileNetV2.
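The voting step described above can be sketched as a simple majority vote over per-eye predictions. The abstract does not specify the exact voting rule, so plain majority voting is an assumption here, and the class names and the `aggregate_by_vote` helper are hypothetical.

```python
from collections import Counter


def aggregate_by_vote(local_labels):
    """Return the most frequent label among the per-eye predictions."""
    if not local_labels:
        raise ValueError("need at least one local prediction")
    counts = Counter(local_labels)
    # most_common(1) yields [(label, count)]; on ties, the label that was
    # encountered first wins (Python 3.7+ ordering).
    return counts.most_common(1)[0][0]


# Hypothetical CNN outputs, one ego-motion label per single-eye image.
local_predictions = ["forward", "forward", "left", "forward", "right", "left"]
print(aggregate_by_vote(local_predictions))  # forward
```

A weighted vote (for example, by classifier confidence) would be a natural variation, but nothing in the abstract indicates which rule was used.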


Subjects
Image Processing, Computer-Assisted ; Motion ; Neural Networks, Computer ; Video Recording/instrumentation ; Algorithms ; Animals ; Compound Eye, Arthropod ; Humans ; Surface Properties
3.
IEEE Trans Pattern Anal Mach Intell ; 41(1): 107-120, 2019 Jan.
Article in English | MEDLINE | ID: mdl-29990037

ABSTRACT

In this paper, we consider the problem of estimating the head pose and body orientation of a person from a low-resolution image. Under this setting, it is difficult to reliably extract facial features or detect body parts. We propose a convolutional random projection forest (CRPforest) algorithm for these tasks. A convolutional random projection network (CRPnet) is used at each node of the forest. It maps an input image to a high-dimensional feature space using a rich filter bank. The filter bank is designed to generate sparse responses so that they can be efficiently computed by compressive sensing. A sparse random projection matrix can capture most of the essential information contained in the filter bank without using all the filters in it. Therefore, the CRPnet is fast, e.g., it requires [...] to process an image of [...] pixels, due to the small number of convolutions (e.g., 0.01 percent of a layer of a neural network), at the expense of less than 2 percent accuracy. The overall forest estimates head and body pose well on benchmark datasets, e.g., over 98 percent on the HIIT dataset, while requiring [...] without using a GPU. Extensive experiments on challenging datasets show that the proposed algorithm performs favorably against state-of-the-art methods on low-resolution images with noise, occlusion, and motion blur.
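As a rough illustration of what a sparse random projection matrix looks like, the sketch below builds one in the well-known Achlioptas style, where most entries are zero and the rest are a scaled +1 or -1. This is a swapped-in textbook construction, not the CRPnet itself; the function names, the sparsity parameter `s`, and the dimensions are all assumptions made for the example.

```python
import math
import random


def sparse_projection_matrix(k, d, s=3, seed=0):
    """Build a k x d sparse random projection matrix.

    Each entry is +sqrt(s/k) or -sqrt(s/k) with probability 1/(2s) each,
    and 0 with probability 1 - 1/s, so squared norms are preserved in
    expectation.
    """
    rng = random.Random(seed)
    scale = math.sqrt(s / k)
    matrix = []
    for _ in range(k):
        row = []
        for _ in range(d):
            u = rng.random()
            if u < 1 / (2 * s):
                row.append(scale)
            elif u < 1 / s:
                row.append(-scale)
            else:
                row.append(0.0)
        matrix.append(row)
    return matrix


def project(matrix, x):
    """Project x, skipping the zero entries that make the matrix cheap."""
    return [sum(w * xi for w, xi in zip(row, x) if w != 0.0) for row in matrix]


# Hypothetical sizes: reduce a 1000-dimensional response vector to 16 numbers.
matrix = sparse_projection_matrix(k=16, d=1000)
x = [1.0] * 1000
y = project(matrix, x)
print(len(y))  # 16
```

With `s = 3`, about two thirds of the entries are zero, so only about a third of the input coordinates contribute to each output value.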


Subjects
Head/physiology ; Image Processing, Computer-Assisted/methods ; Posture/physiology ; Algorithms ; Humans ; Models, Statistical ; Neural Networks, Computer ; Torso/physiology
4.
Sensors (Basel) ; 18(2)2018 Feb 14.
Article in English | MEDLINE | ID: mdl-29443897

ABSTRACT

This paper considers two important problems for autonomous robot navigation in a dynamic environment: predicting pedestrian motion and controlling a robot with that prediction for safe navigation. While there are several methods for predicting the motion of a pedestrian and controlling a robot to avoid incoming pedestrians, it is still difficult to navigate safely in a dynamic environment due to challenges such as the varying quality and complexity of training data with unwanted noise. This paper addresses these challenges simultaneously by proposing a robust kernel subspace learning algorithm based on recent advances in nuclear-norm and l1-norm minimization. We model the motion of a pedestrian and the robot controller using Gaussian processes. The proposed method efficiently approximates a kernel matrix used in Gaussian process regression by learning a low-rank structured matrix (with symmetric positive semi-definiteness) to find an orthogonal basis, which eliminates the effects of erroneous and inconsistent data. Based on structured kernel subspace learning, we propose a robust motion model and motion controller for safe navigation in dynamic environments. We evaluate the proposed robust kernel learning in various tasks, including regression, motion prediction, and motion control problems, and demonstrate that the proposed learning-based systems are robust against outliers and outperform existing regression and navigation methods.

5.
IEEE Trans Neural Netw Learn Syst ; 26(2): 237-51, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25608287

ABSTRACT

Low-rank matrix approximation plays an important role in computer vision and image processing. Most conventional low-rank matrix approximation methods are based on the l2-norm (Frobenius norm), with principal component analysis (PCA) being the most popular among them. However, this can give a poor approximation for data contaminated by outliers (including missing data), because the l2-norm exaggerates the negative effect of outliers. Recently, to overcome this problem, various methods based on the l1-norm, such as robust PCA methods, have been proposed for low-rank matrix approximation. Despite their robustness, these methods require heavy computational effort and substantial memory for high-dimensional data, which makes them impractical for real-world problems. In this paper, we propose two efficient low-rank factorization methods based on the l1-norm that find proper projection and coefficient matrices using the alternating rectified gradient method. The proposed methods are applied to a number of low-rank matrix approximation problems to demonstrate their efficiency and robustness. The experimental results show that our proposals are efficient in both execution time and reconstruction performance, unlike other state-of-the-art methods.
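The claim that the l2-norm exaggerates the effect of outliers can be seen in a one-dimensional analogue: the constant that minimizes squared (l2) error is the mean, while the constant that minimizes absolute (l1) error is the median. The sketch below is only that analogue, with invented sample values; it is not the alternating rectified gradient method from the paper.

```python
import statistics

# Four consistent measurements and one gross outlier (values are made up).
samples = [1.0, 1.1, 0.9, 1.0, 50.0]

# The best constant under squared (l2) loss is the mean.
l2_fit = statistics.mean(samples)
# The best constant under absolute (l1) loss is the median.
l1_fit = statistics.median(samples)

print(f"l2 fit (mean):   {l2_fit:.2f}")
print(f"l1 fit (median): {l1_fit:.2f}")
```

The single outlier drags the l2 fit to roughly 10.8, far from the bulk of the data, while the l1 fit stays at 1.0.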

6.
Sensors (Basel) ; 14(11): 21151-73, 2014 Nov 10.
Article in English | MEDLINE | ID: mdl-25390406

ABSTRACT

This paper proposes VibeComm, a novel communication method for smart devices using a built-in vibrator and accelerometer. The proposed approach is suited to low-rate off-line communication, and its communication medium is the object on which the smart devices are placed, such as a table or desk. When more than two smart devices are placed on an object and one device wants to transmit a message to the other devices, the transmitting device generates a sequence of vibrations. The vibrations propagate through the object on which the devices are placed. The receiving devices analyze their accelerometer readings to decode incoming messages. The proposed method can serve as an alternative communication method when common radio-based communication methods are not available. VibeComm is implemented on Android smartphones, and a comprehensive set of experiments is conducted to show its feasibility.
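The encode-vibrate-decode loop described above can be illustrated with a toy round trip. The abstract does not state the modulation scheme, so on-off keying (vibrate for a 1, stay silent for a 0) is assumed here purely for illustration; the function names, the simulated accelerometer magnitudes, and the threshold are all hypothetical and not taken from VibeComm.

```python
def text_to_bits(message):
    """Turn a message into a flat list of bits, most significant bit first."""
    bits = []
    for byte in message.encode("utf-8"):
        for shift in range(7, -1, -1):
            bits.append((byte >> shift) & 1)
    return bits


def bits_to_text(bits):
    """Rebuild the message from bits, ignoring any incomplete trailing byte."""
    data = bytearray()
    usable = len(bits) - len(bits) % 8
    for start in range(0, usable, 8):
        byte = 0
        for bit in bits[start:start + 8]:
            byte = (byte << 1) | bit
        data.append(byte)
    return data.decode("utf-8", errors="replace")


def decode_slots(magnitudes, threshold):
    """Receiver side: one accelerometer magnitude per time slot -> one bit."""
    return [1 if m > threshold else 0 for m in magnitudes]


# Sender: each 1 bit becomes a vibration slot, each 0 bit a silent slot.
sent_bits = text_to_bits("hi")
# Simulated readings (assumed values): strong while vibrating, weak otherwise.
readings = [0.9 if bit else 0.1 for bit in sent_bits]
print(bits_to_text(decode_slots(readings, threshold=0.5)))  # hi
```

A real receiver would also need slot synchronization and noise handling, which this sketch leaves out.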

7.
Sensors (Basel) ; 14(3): 5516-35, 2014 Mar 20.
Article in English | MEDLINE | ID: mdl-24658618

ABSTRACT

There is a growing interest in 3D content following the recent developments in 3D movies, 3D TVs and 3D smartphones. However, 3D content creation is still dominated by professionals, due to the high cost of 3D motion capture instruments. The availability of a low-cost motion capture system will promote 3D content generation by general users and accelerate the growth of the 3D market. In this paper, we describe the design and implementation of a real-time motion capture system based on a portable low-cost wireless camera sensor network. The proposed system performs motion capture based on the data-driven 3D human pose reconstruction method to reduce the computation time and to improve the 3D reconstruction accuracy. The system can reconstruct accurate 3D full-body poses at 16 frames per second using only eight markers on the subject's body. The performance of the motion capture system is evaluated extensively in experiments.


Subjects
Computer Communication Networks/instrumentation ; Computer Systems ; Motion ; Software ; Wireless Technology/instrumentation ; Databases as Topic ; Humans ; Image Processing, Computer-Assisted ; Joints/physiology ; Models, Anatomic ; Photography/instrumentation ; Posture ; Range of Motion, Articular