Results 1 - 11 of 11
1.
IEEE Trans Image Process ; 30: 4571-4586, 2021.
Article in English | MEDLINE | ID: mdl-33830921

ABSTRACT

Classifying and modeling texture images, especially those with significant rotation, illumination, scale, and viewpoint variations, is an active topic in the computer vision field. Inspired by local graph structure (LGS), local ternary patterns (LTP), and their variants, this paper proposes a novel image feature descriptor for texture and material classification, which we call Petersen Graph Multi-Orientation based Multi-Scale Ternary Pattern (PGMO-MSTP). PGMO-MSTP is a histogram representation that efficiently encodes the joint information within an image across feature and scale spaces, exploiting the concepts of both LTP-like and LGS-like descriptors in order to overcome the shortcomings of these approaches. We first design two single-scale horizontal and vertical Petersen Graph-based Ternary Pattern descriptors (PGTPh and PGTPv). The essence of PGTPh and PGTPv is to encode each 5×5 image patch, extending the ideas of the LTP and LGS concepts, according to relationships between pixels sampled in a variety of spatial arrangements (i.e., up, down, left, and right) of Petersen graph-shaped oriented sampling structures. The histograms obtained from the single-scale descriptors PGTPh and PGTPv are then combined to build the effective multi-scale PGMO-MSTP model. Extensive experiments on sixteen challenging texture data sets demonstrate that PGMO-MSTP can outperform state-of-the-art handcrafted texture descriptors and deep learning-based feature extraction approaches. Moreover, a statistical comparison based on the Wilcoxon signed-rank test shows that PGMO-MSTP performs best across all tested data sets.
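
A minimal sketch of the ternary thresholding idea that LTP-like descriptors such as PGTPh/PGTPv build on. This is not the PGMO-MSTP descriptor itself: the Petersen-graph sampling structures and multi-scale histogram construction are not reproduced, and the threshold value is illustrative.

    import numpy as np

    def ternary_codes(patch, threshold=5):
        """Compare each pixel of a 5x5 patch to its center with a tolerance band (-1/0/+1)."""
        center = int(patch[2, 2])
        diff = patch.astype(int) - center
        codes = np.zeros_like(diff)
        codes[diff > threshold] = 1
        codes[diff < -threshold] = -1
        return codes

    patch = np.random.randint(0, 256, (5, 5), dtype=np.uint8)
    print(ternary_codes(patch))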

2.
Article in English | MEDLINE | ID: mdl-31634834

ABSTRACT

The background initialization (BI) problem has attracted the attention of researchers in different image/video processing fields. Recently, a tensor-based technique called spatiotemporal slice-based singular value decomposition (SS-SVD) was proposed for background initialization. SS-SVD applies the SVD to the tensor slices and estimates the background from the low-rank information. Despite its efficiency in background initialization, the performance of SS-SVD requires further improvement on complex sequences with challenges such as stationary foreground objects (SFOs), illumination changes, low frame rate, and clutter. In this paper, a self-motion-assisted tensor completion method is proposed to overcome the limitations of SS-SVD on complex video sequences and to enhance the visual appearance of the initialized background. With the proposed method, the motion information extracted from the sparse portion of the tensor slices is incorporated with the low-rank information of SS-SVD to eliminate artifacts in the initialized background. Efficient blending schemes between the low-rank (background) and sparse (foreground) information of the tensor slices are developed for scenarios such as SFO removal, lighting-variation processing, low-frame-rate processing, crowdedness estimation, and best-frame selection. The performance of the proposed method on video sequences with complex scenarios is compared with the top-ranked state-of-the-art techniques in the field of background initialization. The results not only validate the improved performance over the majority of the tested challenges but also demonstrate the capability of the proposed method to initialize the background in less computation time.
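
A minimal sketch of the blending idea, not the paper's algorithm: a low-rank background estimate is combined with motion information derived from the sparse residual, and frequently moving pixels fall back to a motion-masked temporal median. All names, thresholds, and data are illustrative.

    import numpy as np

    def motion_assisted_background(frames, low_rank_bg, motion_thresh=25):
        residual = np.abs(frames.astype(float) - low_rank_bg[None, ...])  # sparse (foreground) part
        motion = residual > motion_thresh                                 # per-frame motion mask
        bg = low_rank_bg.astype(float).copy()
        often_moving = motion.mean(axis=0) > 0.5                          # contaminated background pixels
        masked = np.where(motion, np.nan, frames.astype(float))
        median_bg = np.nanmedian(masked, axis=0)                          # median over low-motion frames
        median_bg = np.where(np.isnan(median_bg), bg, median_bg)          # fallback if no clean frame
        bg[often_moving] = median_bg[often_moving]
        return bg

    T, H, W = 50, 48, 64
    frames = np.random.randint(0, 256, (T, H, W)).astype(np.uint8)
    low_rank_bg = frames.mean(axis=0)   # stand-in for the SS-SVD low-rank estimate
    print(motion_assisted_background(frames, low_rank_bg).shape)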

3.
Sensors (Basel) ; 19(11)2019 May 28.
Article in English | MEDLINE | ID: mdl-31142006

ABSTRACT

Convolutional networks (ConvNets), with their strong image representation ability, have achieved significant progress in the computer vision and robotics fields. In this paper, we propose a visual localization approach based on place recognition that combines powerful ConvNet features with localized image sequence matching. An image distance matrix is constructed based on the cosine distance of the extracted ConvNet features, and a sequence search technique is then applied to this distance matrix for the final visual recognition. To improve computational efficiency, locality-sensitive hashing (LSH) is applied to achieve real-time performance with minimal accuracy degradation. We present extensive experiments on four real-world data sets to evaluate each of the specific challenges in visual recognition. A comprehensive performance comparison of different ConvNet layers (each defining a level of features) under both appearance and illumination changes is conducted. Compared with traditional approaches based on hand-crafted features and single-image matching, the proposed method performs well even in the presence of appearance and illumination changes.
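
A minimal sketch of the matching stage under simplifying assumptions: random vectors stand in for ConvNet descriptors, the LSH speed-up is omitted, and the "sequence search" is a straight-line sum of cosine distances along diagonals of the distance matrix.

    import numpy as np

    def cosine_distance_matrix(query_feats, db_feats):
        q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
        d = db_feats / np.linalg.norm(db_feats, axis=1, keepdims=True)
        return 1.0 - q @ d.T   # shape: (num_query, num_db)

    def sequence_match(dist, seq_len=5):
        """Find the database index whose diagonal of length seq_len has the lowest total cost."""
        nq, nd = dist.shape
        best_idx, best_cost = -1, np.inf
        for start in range(nd - seq_len + 1):
            cost = sum(dist[nq - seq_len + i, start + i] for i in range(seq_len))
            if cost < best_cost:
                best_cost, best_idx = cost, start + seq_len - 1
        return best_idx, best_cost

    query = np.random.rand(10, 4096)   # hypothetical ConvNet descriptors of recent frames
    db = np.random.rand(200, 4096)     # hypothetical descriptors of mapped places
    print(sequence_match(cosine_distance_matrix(query, db)))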

4.
Sensors (Basel) ; 19(3)2019 Feb 08.
Article in English | MEDLINE | ID: mdl-30744074

ABSTRACT

The Codebook model is one of the popular real-time models for background subtraction. In this paper, we first extend it from the traditional Red-Green-Blue (RGB) color model to multispectral sequences. A self-adaptive mechanism is then designed based on statistical information extracted from the data themselves, which improves performance and saves the time and effort of searching for appropriate parameters. Furthermore, the Spectral Information Divergence is introduced to evaluate the spectral distance between the current and reference vectors, together with the Brightness and Spectral Distortion. Experiments on five multispectral sequences with different challenges show that the multispectral self-adaptive Codebook model is more capable of detecting moving objects than its counterpart applied to the corresponding RGB sequences. The proposed research framework opens the door to future work on applying multispectral sequences to moving object detection.
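
A minimal sketch of the Spectral Information Divergence (SID) used above to compare a current multispectral pixel vector with a codebook reference vector; the codebook matching and update logic is not shown, and the band values are illustrative.

    import numpy as np

    def spectral_information_divergence(x, y, eps=1e-12):
        """Symmetric KL-style divergence between two spectral vectors normalized to probabilities."""
        p = x / (x.sum() + eps) + eps
        q = y / (y.sum() + eps) + eps
        return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

    current = np.array([120., 95., 80., 60., 45., 30., 22.])    # hypothetical 7-band pixel
    reference = np.array([118., 97., 78., 61., 44., 31., 21.])  # hypothetical codeword
    print(spectral_information_divergence(current, reference))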

5.
Neural Netw ; 111: 35-46, 2019 Mar.
Article in English | MEDLINE | ID: mdl-30660101

ABSTRACT

Graph-based embedding methods are very useful for reducing the dimension of high-dimensional data and for extracting their relevant features. In this paper, we introduce a novel nonlinear method called Flexible Discriminant graph-based Embedding with Feature Selection (FDEFS). The proposed algorithm aims to classify image sample data in supervised and semi-supervised learning settings. Specifically, our method incorporates manifold smoothness, margin discriminant embedding, and sparse regression for feature selection. The weights add ℓ2,1-norm regularization for local linear approximation. The sparse regression implicitly performs feature selection on the original features of the data matrix and of the linear transform. We also provide an effective solution method to optimize the objective function. We apply the algorithm to six public image datasets, including scene, face, and object datasets. These experiments demonstrate the effectiveness of the proposed embedding method. They also show that the proposed method compares favorably with many competing embedding methods.
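
A minimal sketch of the ℓ2,1-norm regularizer mentioned above, which promotes row sparsity of a projection matrix W and thereby acts as feature selection: rows driven to zero correspond to discarded input features. The matrix sizes are illustrative and the FDEFS objective itself is not shown.

    import numpy as np

    def l21_norm(W):
        """||W||_{2,1}: sum over rows of the l2 norm of each row."""
        return float(np.sum(np.linalg.norm(W, axis=1)))

    W = np.random.randn(100, 10)   # hypothetical projection: 100 features -> 10 dimensions
    W[::2] = 0.0                   # zeroed rows = features effectively removed
    print(l21_norm(W))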


Subject(s)
Pattern Recognition, Automated/methods; Photic Stimulation/methods; Supervised Machine Learning; Algorithms; Humans; Pattern Recognition, Automated/trends; Supervised Machine Learning/trends
6.
IEEE Trans Image Process ; 27(6): 3114-3126, 2018 06.
Article in English | MEDLINE | ID: mdl-29993806

ABSTRACT

Extracting the background from a video in the presence of various moving patterns is the focus of several background-initialization approaches. To model the scene background using rank-one matrices, this paper proposes a background-initialization technique that relies on the singular value decomposition (SVD) of spatiotemporally extracted slices of the video tensor. The proposed method is referred to as spatiotemporal slice-based SVD (SS-SVD). To determine the SVD components that best model the background, an in-depth analysis of the computation of the left/right singular vectors and singular values is performed, and the relationship with tensor tube fibers is determined. The analysis proves that a rank-1 matrix built from the first left and right singular vectors and the first singular value is an efficient model of the scene background. The performance of the proposed SS-SVD method is evaluated on 93 complex video sequences covering different challenges, and the method is compared with state-of-the-art tensor/matrix completion-based, statistical, search-based, and labeling-based methods. The results not only show better performance over most of the tested challenges but also demonstrate the capability of the proposed technique to solve the background-initialization problem in less computation time and with fewer frames.
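
A minimal sketch of the rank-1 idea on synthetic data: take one spatiotemporal slice (a fixed image row stacked over time), compute its SVD, and reconstruct the slice from the first singular triplet as a background model for that row. The slice selection and full SS-SVD pipeline are not reproduced.

    import numpy as np

    T, H, W = 100, 60, 80
    video = np.random.randint(0, 256, (T, H, W)).astype(float)   # stand-in video tensor

    row = 30
    slice_xt = video[:, row, :]                 # shape (T, W): one spatiotemporal slice
    U, s, Vt = np.linalg.svd(slice_xt, full_matrices=False)
    rank1 = s[0] * np.outer(U[:, 0], Vt[0, :])  # rank-1 model of that slice
    background_row = rank1.mean(axis=0)         # background estimate for image row 30
    print(background_row.shape)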

7.
Sensors (Basel) ; 17(6)2017 Jun 14.
Article in English | MEDLINE | ID: mdl-28613251

ABSTRACT

A critical concern of autonomous vehicles is safety. Different approaches have tried to enhance driving safety to reduce the number of fatal crashes and severe injuries. As an example, Intelligent Speed Adaptation (ISA) systems warn the driver when the vehicle exceeds the recommended speed limit. However, these systems only take into account fixed speed limits, without considering factors like road geometry. In this paper, we combine road curvature with speed limits to automatically adjust the vehicle's speed to the ideal one through our proposed Dynamic Speed Adaptation (DSA) method. Furthermore, curve-analysis extraction and speed-limit database creation are also part of our contribution. An algorithm that analyzes GPS information off-line identifies high-curvature segments and estimates the speed for each curve. The speed-limit database contains information about the different speed-limit zones for each traveled path. Our DSA senses speed limits and curves of the road using GPS information and ensures smooth speed transitions between the current and ideal speeds. Through experimental simulations with different control algorithms on real and simulated datasets, we show that our method is able to significantly reduce lateral errors on sharp curves, to respect speed limits, and consequently to increase safety and comfort for the passenger.
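
A minimal sketch of one idea behind curve-aware speed adaptation, not the paper's exact algorithm: estimate the local turn radius from three consecutive path points (e.g., from GPS, in a local metric frame) and cap the speed so lateral acceleration stays below a comfort limit. The comfort limit and points are illustrative.

    import math

    def turn_radius(p1, p2, p3):
        """Radius of the circle through three 2D points; inf for collinear points."""
        a = math.dist(p1, p2); b = math.dist(p2, p3); c = math.dist(p1, p3)
        area = abs((p2[0] - p1[0]) * (p3[1] - p1[1]) - (p3[0] - p1[0]) * (p2[1] - p1[1])) / 2.0
        return float("inf") if area == 0 else (a * b * c) / (4.0 * area)

    def ideal_speed(radius_m, speed_limit_mps, a_lat_max=2.0):
        """Speed bounded by both the posted limit and a lateral-acceleration comfort limit."""
        return min(speed_limit_mps, math.sqrt(a_lat_max * radius_m))

    # Three points (metres) sketching a gentle curve, with a 50 km/h posted limit.
    print(ideal_speed(turn_radius((0, 0), (20, 2), (40, 8)), 50 / 3.6))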

8.
Sensors (Basel) ; 17(1)2017 Jan 17.
Article in English | MEDLINE | ID: mdl-28106746

ABSTRACT

Precise GNSS (Global Navigation Satellite System) localization is vital for autonomous road vehicles, especially in cluttered or urban environments where satellites are occluded, preventing accurate positioning. We propose to fuse GPS (Global Positioning System) data with fisheye stereovision to address this problem independently of additional data, which may be outdated, unavailable, or in need of correlation with reality. Our stereoscope is sky-facing, with 360° × 180° fisheye cameras that observe the surrounding obstacles. We propose 3D modelling and plane extraction through the following steps: stereoscope self-calibration for robustness to decalibration, stereo matching that considers neighbouring epipolar curves to compute 3D points, and robust plane fitting based on the generated cartography and the Hough transform. We use these 3D data together with GPS raw data to estimate the pseudorange delay of NLOS (Non Line Of Sight) reflected signals. We exploit the extracted planes to build a visibility mask for NLOS detection, and a simplified 3D canyon model allows the reflection pseudorange delays to be computed. Finally, the GPS position is computed from the corrected pseudoranges. Through experiments on real static scenes, we show that the generated 3D models reach metric accuracy and that horizontal GPS positioning accuracy improves by more than 50%. The proposed procedure is effective, and the proposed NLOS detection outperforms C/N0-based (carrier-to-receiver noise density) methods.
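
A minimal sketch of the visibility-mask idea for NLOS detection: a satellite is flagged as non line-of-sight when its elevation falls below the building-mask elevation in its azimuth direction. The mask here is a toy array; in the paper it is built from the fisheye-stereo 3D planes, and the pseudorange correction step is not shown.

    import numpy as np

    mask_elevation_deg = np.full(360, 15.0)   # open sky: 15 deg effective horizon (illustrative)
    mask_elevation_deg[60:120] = 55.0         # a tall facade towards the east (illustrative)

    def is_nlos(sat_azimuth_deg, sat_elevation_deg, mask=mask_elevation_deg):
        """True when the satellite is hidden behind the local skyline mask."""
        return sat_elevation_deg < mask[int(sat_azimuth_deg) % 360]

    print(is_nlos(90, 40))    # True: blocked by the facade
    print(is_nlos(200, 40))   # False: clear sky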

9.
IEEE Trans Vis Comput Graph ; 21(9): 1045-57, 2015 Sep.
Article in English | MEDLINE | ID: mdl-26357286

ABSTRACT

The goal of the structured mesh is to generate a compressed representation of the 3D surface, in which objects near the camera are given more detail than distant objects, according to the disparity map. The solution is based on Kohonen's Self-Organizing Map algorithm, chosen for its ability to generate a topological map according to a probability distribution and its potential as a naturally massively parallel algorithm. The disparity map, which stands for a density distribution reflecting the proximity of objects to the camera, is partitioned into an appropriate number of cell units, such that each cell is associated with a processing unit and is responsible for a certain area of the plane. The advantage of the proposed model is that it is decentralized and based on data decomposition; the required processing units and memory grow linearly with the problem size. Experimental results show that our GPU implementation provides near real-time performance on small disparity maps and that the running time increases linearly with a very small coefficient. The proposed method is therefore suitable for dealing with large-scale problems in a massively parallel way.
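
A minimal sketch of a single serial Kohonen SOM update driven by samples drawn from a disparity map treated as a density (higher disparity = nearer = more samples). The paper's contribution, the decentralized cell-based GPU-parallel formulation, is not reproduced; all sizes and rates are illustrative.

    import numpy as np

    rng = np.random.default_rng(0)
    H, W = 64, 96
    disparity = rng.random((H, W))                       # stand-in disparity map
    prob = disparity / disparity.sum()                   # sampling density over pixels

    grid = rng.random((16, 16, 2)) * [W, H]              # 16x16 mesh nodes in the image plane

    def som_step(grid, sample, lr=0.1, sigma=2.0):
        """Move the winner node and its grid neighbours towards the sampled point."""
        dists = np.linalg.norm(grid - sample, axis=2)
        winner = np.unravel_index(np.argmin(dists), dists.shape)
        yy, xx = np.mgrid[0:grid.shape[0], 0:grid.shape[1]]
        neigh = np.exp(-((yy - winner[0]) ** 2 + (xx - winner[1]) ** 2) / (2 * sigma ** 2))
        grid += lr * neigh[..., None] * (sample - grid)
        return grid

    idx = rng.choice(H * W, p=prob.ravel())
    sample = np.array([idx % W, idx // W], dtype=float)  # (x, y) drawn from the disparity density
    print(som_step(grid, sample).shape)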

10.
Sensors (Basel) ; 15(2): 3172-203, 2015 Feb 02.
Article in English | MEDLINE | ID: mdl-25648706

ABSTRACT

In this paper, we present a novel strategy for roof segmentation from aerial images (orthophotoplans) based on the cooperation of edge- and region-based segmentation methods. The proposed strategy is composed of three major steps. The first one, the pre-processing step, consists of simplifying the acquired image with an appropriate couple of invariant and gradient, optimized for the application, in order to limit the illumination changes (shadows, brightness, etc.) affecting the images. The second step is composed of two main parallel treatments: on the one hand, the simplified image is segmented by watershed regions. Even if this first segmentation generally provides good results, the image is often over-segmented. To alleviate this problem, an efficient region-merging strategy adapted to the particularities of the orthophotoplan, using a 2D roof-ridge modeling technique, is applied. On the other hand, the simplified image is segmented by watershed lines. The third step consists of integrating both watershed segmentation strategies into a single cooperative segmentation scheme in order to achieve satisfactory segmentation results. Tests have been performed on orthophotoplans containing 100 roofs of varying complexity, and the results are evaluated with the VINET criterion using ground-truth image segmentation. A comparison with five popular segmentation techniques from the literature demonstrates the effectiveness and reliability of the proposed approach. Indeed, we obtain a segmentation rate of 96% with the proposed method, compared to 87.5% with statistical region merging (SRM), 84% with mean shift, 82% with color structure code (CSC), 80% with the efficient graph-based segmentation algorithm (EGBIS), and 71% with JSEG.
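
A minimal sketch of a generic gradient-plus-watershed pipeline in the spirit of the region-based branch described above, built with scikit-image on random data. It is not the paper's cooperative edge/region scheme and omits the region-merging and roof-ridge modeling stages.

    import numpy as np
    from scipy import ndimage as ndi
    from skimage.filters import sobel
    from skimage.feature import peak_local_max
    from skimage.segmentation import watershed

    image = np.random.rand(128, 128)             # stand-in for a simplified orthophoto band
    gradient = sobel(image)                      # edge strength used as the watershed relief

    # Markers: local minima of the smoothed gradient, labelled as seeds.
    smooth = ndi.gaussian_filter(gradient, 3)
    seeds = peak_local_max(-smooth, min_distance=10)
    markers = np.zeros(image.shape, dtype=int)
    markers[tuple(seeds.T)] = np.arange(1, len(seeds) + 1)

    labels = watershed(gradient, markers)        # region-based watershed segmentation
    print(labels.max(), "regions")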


Subject(s)
Image Interpretation, Computer-Assisted; Pattern Recognition, Automated; Remote Sensing Technology; Satellite Imagery; Algorithms; Color; Humans; Image Enhancement
11.
Sensors (Basel) ; 14(6): 10454-78, 2014 Jun 13.
Article in English | MEDLINE | ID: mdl-24932866

ABSTRACT

The occupancy grid map is a popular tool for representing the surrounding environment of mobile robots and intelligent vehicles. Its applications date back to the 1980s, when researchers used sonar or LiDAR to represent environments with occupancy grids. However, research on vision-based occupancy grid mapping is scant in the literature. Furthermore, when moving in a real dynamic world, traditional occupancy grid mapping is required not only to detect occupied areas but also to understand dynamic environments. This paper addresses the issue by presenting a stereo-vision-based framework for creating a dynamic occupancy grid map, applied to an intelligent vehicle driving in an urban scenario. Besides representing the surroundings as occupancy grids, dynamic occupancy grid mapping provides the motion information of the grid cells. The proposed framework consists of two components. The first is motion estimation for the moving vehicle itself and for independent moving objects. The second is dynamic occupancy grid mapping, based on the estimated motion information and the dense disparity map. The main benefit of the proposed framework is the ability to map occupied areas and moving objects at the same time, which is very practical in real applications. The proposed method is evaluated using real data acquired by our intelligent vehicle platform "SeTCar" in urban environments.
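
A minimal sketch of turning disparity-derived 3D points into a 2D occupancy grid (static part only; the ego-motion and per-cell motion estimation that make the grid dynamic in the paper are not shown). The camera parameters, grid resolution, and disparity values are illustrative.

    import numpy as np

    fx, cx, baseline = 700.0, 320.0, 0.5                      # assumed stereo parameters
    disparity = np.random.uniform(1.0, 60.0, (480, 640))      # stand-in dense disparity map

    v, u = np.mgrid[0:480, 0:640]
    Z = fx * baseline / disparity               # depth (m)
    X = (u - cx) * Z / fx                       # lateral position (m)

    # Accumulate points into a grid: 0.2 m cells, 40 m ahead, +/- 20 m laterally.
    res, depth_max, half_width = 0.2, 40.0, 20.0
    grid = np.zeros((int(depth_max / res), int(2 * half_width / res)), dtype=np.int32)
    valid = (Z < depth_max) & (np.abs(X) < half_width)
    zi = (Z[valid] / res).astype(int)
    xi = ((X[valid] + half_width) / res).astype(int)
    np.add.at(grid, (zi, xi), 1)                # hit counts per cell
    occupied = grid > 5                         # simple occupancy threshold
    print(occupied.sum(), "occupied cells")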
