Results 1 - 7 of 7
1.
IEEE Trans Cybern ; 49(2): 481-494, 2019 Feb.
Article in English | MEDLINE | ID: mdl-29990288

ABSTRACT

Effective 3-D local features are significant elements for 3-D shape analysis. Existing hand-crafted 3-D local descriptors are effective but usually involve intensive human intervention and prior knowledge, which burdens the subsequent processing procedures. An alternative resorts to the unsupervised learning of features from raw 3-D representations via popular deep learning models. However, this alternative suffers from several significant unresolved issues, such as irregular vertex topology, arbitrary mesh resolution, orientation ambiguity on the 3-D surface, and rigid and slightly nonrigid transformation invariance. To tackle these issues, we propose an unsupervised 3-D local feature learning framework based on a novel permutation voxelization strategy to learn high-level and hierarchical 3-D local features from raw 3-D voxels. Specifically, the proposed strategy first applies a novel voxelization that discretizes each 3-D local region with irregular vertex topology and arbitrary mesh resolution into regular voxels; a novel permutation then reorders the voxels to simultaneously eliminate the effects of rotation transformation and orientation ambiguity on the surface. Based on the proposed strategy, the permuted voxels fully encode the geometry and structure of each local region in regular, sparse, and binary vectors. These voxel vectors are highly suitable for the learning of hierarchical common surface patterns by a stacked sparse autoencoder with hierarchical abstraction and a sparsity constraint. Experiments are conducted on three aspects for evaluating the learned local features: 1) global shape retrieval; 2) partial shape retrieval; and 3) shape correspondence. The experimental results show that the learned local features outperform other state-of-the-art 3-D shape descriptors.
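The learning stage described in this abstract (a stacked sparse autoencoder over regular, sparse, binary voxel vectors) can be sketched roughly as follows. The layer sizes, sparsity target, learning rate, and random stand-in data below are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class SparseAutoencoder:
    """One layer of a stacked sparse autoencoder (sizes are illustrative)."""

    def __init__(self, n_in, n_hid, rho=0.05, beta=0.1, lr=0.2):
        self.W1 = rng.normal(0, 0.1, (n_in, n_hid))
        self.b1 = np.zeros(n_hid)
        self.W2 = rng.normal(0, 0.1, (n_hid, n_in))
        self.b2 = np.zeros(n_in)
        self.rho, self.beta, self.lr = rho, beta, lr

    def encode(self, X):
        return sigmoid(X @ self.W1 + self.b1)

    def step(self, X):
        """One gradient step on reconstruction error + KL sparsity penalty."""
        H = self.encode(X)
        R = sigmoid(H @ self.W2 + self.b2)
        d_out = (R - X) * R * (1 - R)             # output-layer delta
        rho_hat = H.mean(axis=0)                  # mean activation per hidden unit
        kl = self.beta * (-self.rho / rho_hat + (1 - self.rho) / (1 - rho_hat))
        d_hid = (d_out @ self.W2.T + kl) * H * (1 - H)
        n = len(X)
        self.W2 -= self.lr * H.T @ d_out / n
        self.b2 -= self.lr * d_out.mean(axis=0)
        self.W1 -= self.lr * X.T @ d_hid / n
        self.b1 -= self.lr * d_hid.mean(axis=0)
        return float(np.mean((R - X) ** 2))

# stand-in for sparse binary voxel vectors of permuted local regions
X = (rng.random((256, 64)) < 0.2).astype(float)
ae = SparseAutoencoder(64, 16)
losses = [ae.step(X) for _ in range(300)]
```

The hidden codes `ae.encode(X)` would then serve as local features; stacking a second autoencoder on those codes gives the hierarchical abstraction the abstract refers to.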

2.
IEEE Trans Image Process ; 27(9): 3049-3063, 2018 06.
Article in English | MEDLINE | ID: mdl-29993805

ABSTRACT

The discriminability of Bag-of-Words representations can be increased by encoding the spatial relationship among virtual words on 3D shapes. However, this encoding task involves several issues, including arbitrary mesh resolutions, irregular vertex topology, orientation ambiguity on the 3D surface, and invariance to rigid and non-rigid shape transformations. To address these issues, a novel unsupervised spatial learning framework based on a deep neural network, deep spatiality (DS), is proposed. Specifically, DS employs two novel components: a spatial context extractor and a deep context learner. The spatial context extractor distills the spatial relationship among virtual words in a local region into a raw spatial representation. Along a consistent circular direction, a directed circular graph is constructed to encode the relative positions between pairwise virtual words in each face ring into a relative spatial matrix. By decomposing each relative spatial matrix using SVD, the raw spatial representation is formed, from which the deep context learner conducts unsupervised learning of global and local features. The deep context learner is a deep neural network whose novel model structure accommodates the proposed coupled softmax layer, which encodes not only the discriminative information among local regions but also that among global shapes. Experimental results show that DS outperforms state-of-the-art methods.
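The SVD step in the pipeline above can be illustrated with a toy example. The pairwise-offset construction below is a simplified stand-in for the paper's relative spatial matrix, but it shows why singular values give a rotation-invariant raw representation:

```python
import numpy as np

rng = np.random.default_rng(1)

def relative_spatial_matrix(points):
    """Stack pairwise offsets p_j - p_i between 'virtual word' positions."""
    n = len(points)
    rows = [points[j] - points[i] for i in range(n) for j in range(n) if i != j]
    return np.array(rows)                          # shape (n*(n-1), 3)

def spatial_descriptor(points):
    """Singular values of the offset matrix: unchanged under rotation."""
    M = relative_spatial_matrix(points)
    return np.linalg.svd(M, compute_uv=False)      # descending singular values

pts = rng.normal(size=(6, 3))                      # toy virtual-word positions
# a random orthogonal matrix (QR of a Gaussian matrix yields orthogonal Q)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
d1 = spatial_descriptor(pts)
d2 = spatial_descriptor(pts @ Q.T)                 # same shape, rotated
```

Since rotating the points multiplies the offset matrix by an orthogonal matrix on the right, its singular values, and hence the descriptor, are identical for `pts` and its rotated copy.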


Subject(s)
Deep Learning , Imaging, Three-Dimensional/methods , Unsupervised Machine Learning , Software
3.
PLoS One ; 13(4): e0195114, 2018.
Article in English | MEDLINE | ID: mdl-29649294

ABSTRACT

Explicit structural inference is one key to improving the accuracy of scene parsing. Meanwhile, adversarial training is able to reinforce spatial contiguity in output segmentations. To exploit the advantages of both structural learning and adversarial training simultaneously, we propose a novel deep learning network architecture called Structural Inference Embedded Adversarial Networks (SIEANs) for pixel-wise scene labeling. The generator of our SIEANs, a newly designed scene parsing network, makes full use of convolutional neural networks and long short-term memory networks to learn the global contextual information of objects in four different directions from RGB-(D) images, which describes the (three-dimensional) spatial distributions of objects in a more comprehensive and accurate way. To further improve performance, we explore adversarial training to optimize the generator along with a discriminator, which can not only detect and correct higher-order inconsistencies between the predicted segmentations and the corresponding ground truths, but also exploit the full potential of the generator by fine-tuning its parameters so as to obtain higher consistency. The experimental results demonstrate that our proposed SIEANs achieve better performance on the PASCAL VOC 2012, SIFT FLOW, PASCAL Person-Part, Cityscapes, Stanford Background, NYUDv2, and SUN-RGBD datasets than most state-of-the-art methods.
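The hybrid objective can be sketched numerically: a per-pixel segmentation loss plus an adversarial term that rewards the generator when the discriminator scores its map as real. The binary-label setting and the λ weight below are assumptions for illustration, not the paper's exact losses:

```python
import numpy as np

def bce(p, t, eps=1e-7):
    """Binary cross-entropy between probabilities p and targets t."""
    p = np.clip(p, eps, 1 - eps)
    return float(-np.mean(t * np.log(p) + (1 - t) * np.log(1 - p)))

def generator_loss(pred, gt, d_on_pred, lam=0.1):
    """Pixel-wise loss plus an adversarial term: the generator improves when
    the discriminator scores its segmentation as 'real' (target 1)."""
    return bce(pred, gt) + lam * bce(d_on_pred, np.ones_like(d_on_pred))

def discriminator_loss(d_on_gt, d_on_pred):
    """Discriminator targets: ground-truth maps -> 1, predicted maps -> 0."""
    return (bce(d_on_gt, np.ones_like(d_on_gt))
            + bce(d_on_pred, np.zeros_like(d_on_pred)))

gt = np.array([[0., 1.], [1., 0.]])                # toy 2x2 label map
good = generator_loss(gt, gt, np.array([0.99]))    # sharp map that fools D
bad = generator_loss(np.full_like(gt, 0.5), gt, np.array([0.01]))  # blurry map
```

Minimizing `generator_loss` and `discriminator_loss` alternately penalizes the higher-order inconsistencies the abstract mentions, which a purely per-pixel loss cannot see.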


Subject(s)
Image Processing, Computer-Assisted/methods , Neural Networks, Computer , Pattern Recognition, Automated/methods , Algorithms , Imaging, Three-Dimensional , Machine Learning , Models, Statistical , Software , User-Computer Interface
4.
IEEE Trans Image Process ; 26(8): 3707-3720, 2017 Aug.
Article in English | MEDLINE | ID: mdl-28534770

ABSTRACT

Highly discriminative 3D shape representations can be formed by encoding the spatial relationship among virtual words into the Bag of Words (BoW) method. To achieve this challenging task, several unresolved issues in the encoding procedure must be overcome for 3D shapes, including: 1) arbitrary mesh resolution; 2) irregular vertex topology; 3) orientation ambiguity on the 3D surface; and 4) invariance to rigid and non-rigid shape transformations. In this paper, a novel spatially enhanced 3D shape representation called bag of spatial context correlations (BoSCCs) is proposed to address all these issues. Adopting a novel local perspective, BoSCC is able to describe a 3D shape by an occurrence-frequency histogram of spatial context correlation patterns, which makes BoSCC more compact and discriminative than previous global perspective-based methods. Specifically, the spatial context correlation is proposed to simultaneously encode the geometric and spatial information of a 3D local region by the correlation among the spatial contexts of vertices in that region, which effectively resolves the aforementioned issues. The spatial context of each vertex is modeled by Markov chains in a multi-scale manner, which thoroughly captures the spatial relationship through the intra-virtual-word and inter-virtual-word transition probabilities. The high discriminability and compactness of BoSCC are effective for classification and retrieval, especially in the scenarios of limited samples and partial shape retrieval. Experimental results show that BoSCC outperforms the state-of-the-art spatially enhanced BoW methods in three common applications: global shape retrieval, shape classification, and partial shape retrieval.
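The Markov-chain component can be illustrated with a toy sequence of virtual-word labels along a vertex ring. The sequence itself is invented, and the multi-scale aggregation and the intra-/inter-word split are omitted; this only shows how transition probabilities summarize spatial order:

```python
import numpy as np

def transition_matrix(seq, n_words):
    """Row-stochastic transition probabilities between virtual-word labels
    observed along a ring of vertices (toy stand-in for a spatial context)."""
    T = np.zeros((n_words, n_words))
    for a, b in zip(seq, seq[1:]):
        T[a, b] += 1.0
    rows = T.sum(axis=1, keepdims=True)
    # rows with no observed transitions fall back to a uniform distribution
    return np.divide(T, rows, out=np.full_like(T, 1.0 / n_words),
                     where=rows > 0)

seq = [0, 1, 1, 2, 0, 1, 2, 2, 0]   # virtual-word labels along a ring
T = transition_matrix(seq, 3)
```

Because the ring is traversed in a consistent circular direction, the same surface neighborhood always yields the same transition counts, which is what makes the representation comparable across vertices and scales.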

5.
IEEE Trans Neural Netw Learn Syst ; 28(10): 2268-2281, 2017 10.
Article in English | MEDLINE | ID: mdl-28113522

ABSTRACT

Discriminative features of 3-D meshes are significant to many 3-D shape analysis tasks. However, handcrafted descriptors and traditional unsupervised 3-D feature learning methods suffer from several significant weaknesses: 1) extensive human intervention is involved; 2) the local and global structure information of 3-D meshes cannot be preserved, which is in fact an important source of discriminability; 3) the irregular vertex topology and arbitrary resolution of 3-D meshes do not allow the direct application of popular deep learning models; 4) the orientation is ambiguous on the mesh surface; and 5) the effect of rigid and nonrigid transformations on 3-D meshes cannot be eliminated. As a remedy, we propose a deep learning model with a novel irregular model structure, called mesh convolutional restricted Boltzmann machines (MCRBMs). MCRBM aims to simultaneously learn structure-preserving local and global features from a novel raw representation, the local function energy distribution. In addition, multiple MCRBMs can be stacked into a deeper model, called mesh convolutional deep belief networks (MCDBNs). MCDBN employs a novel local structure preserving convolution (LSPC) strategy to convolve the geometry and the local structure learned by the lower MCRBM to the upper MCRBM. LSPC facilitates resolving the challenging issue of orientation ambiguity on the mesh surface in MCDBN. Experiments using the proposed MCRBM and MCDBN were conducted on three common aspects: global shape retrieval, partial shape retrieval, and shape correspondence. Results show that the features learned by the proposed methods outperform other state-of-the-art 3-D shape features.
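The building block here, a restricted Boltzmann machine trained by contrastive divergence, can be sketched in its plain, non-convolutional form. The toy binary patterns stand in for the local function energy distribution, and the sizes, learning rate, and data are assumptions, not the paper's mesh-adapted model:

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Minimal binary RBM trained with one-step contrastive divergence (CD-1)."""

    def __init__(self, n_vis, n_hid, lr=0.1):
        self.W = rng.normal(0, 0.01, (n_vis, n_hid))
        self.bv = np.zeros(n_vis)
        self.bh = np.zeros(n_hid)
        self.lr = lr

    def hidden(self, V):
        return sigmoid(V @ self.W + self.bh)

    def cd1(self, V0):
        H0 = self.hidden(V0)
        Hs = (rng.random(H0.shape) < H0).astype(float)  # sample hidden states
        V1 = sigmoid(Hs @ self.W.T + self.bv)           # reconstruction probs
        H1 = self.hidden(V1)
        n = len(V0)
        self.W += self.lr * (V0.T @ H0 - V1.T @ H1) / n
        self.bv += self.lr * (V0 - V1).mean(axis=0)
        self.bh += self.lr * (H0 - H1).mean(axis=0)
        return float(np.mean((V0 - V1) ** 2))           # reconstruction error

# toy binary patterns standing in for local function energy distributions
proto = np.array([[1., 1., 1., 0., 0., 0.], [0., 0., 0., 1., 1., 1.]])
data = np.repeat(proto, 50, axis=0)
rbm = RBM(6, 4)
errs = [rbm.cd1(data) for _ in range(200)]
```

Stacking such units, with each layer trained on the hidden activations of the one below, mirrors how MCRBMs are composed into an MCDBN, minus the local structure preserving convolution.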

6.
IEEE Trans Neural Netw Learn Syst ; 27(6): 1150-62, 2016 06.
Article in English | MEDLINE | ID: mdl-26571539

ABSTRACT

We introduce a new concept for detecting the saliency of 3-D shapes, namely, human-centered saliency (HCS) detection on the surface of shapes, whereby a given shape is analyzed not on the basis of geometric or topological features obtained directly from the shape itself, but by studying how a human uses the object. Using virtual agents to simulate the ways in which humans interact with objects helps to understand shapes and detect their salient parts in relation to their functions. HCS detection is less affected by inconsistencies in the geometry or topology of the analyzed 3-D shapes. The potential benefit of the proposed method is that it is adaptable to variable shapes with the same semantics, as well as being robust against geometric and topological noise. Given a 3-D shape, its salient part is detected by automatically selecting a corresponding agent and making the two interact with each other. Their adaptation and alignment depend on an optimization framework and a training process. We demonstrate the detected salient parts for different types of objects together with their stability. The salient parts can be used for important vision tasks, such as 3-D shape retrieval.

7.
IEEE Trans Biomed Eng ; 59(5): 1354-63, 2012 May.
Article in English | MEDLINE | ID: mdl-22345521

ABSTRACT

Photoacoustic (PA) tomography (PAT) is a rapidly developing imaging modality that can provide images of the light-absorption distribution in tissue with high contrast and high spatial resolution. However, reconstruction of the absorption distribution is affected by nonuniform light fluence. This paper introduces a reconstruction method for reducing the amplification of noise and artifacts in low-fluence regions. In this method, fluence compensation is integrated into model-based reconstruction, and the absorption distribution is iteratively updated. At each iteration, we calculate the residual between the detected PA signals and the signals computed by a forward model from the initial pressure, which is the product of the estimated voxel value and the light fluence. By minimizing the residual, the reconstructed values converge to the true absorption distribution. In addition, we developed a matrix compression method for reducing memory requirements and accelerating reconstruction. The results of simulation and phantom experiments indicate that the proposed method provides a better contrast-to-noise ratio (CNR) in low-fluence regions. We expect that the capability of increasing imaging depth will broaden the clinical applications of PAT.
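The iteration described above can be sketched with a toy linear forward model. The matrix `A`, the fluence profile, and the step size below are invented for illustration; the real method's acoustic forward operator and matrix compression are omitted:

```python
import numpy as np

rng = np.random.default_rng(3)

n_meas, n_vox = 60, 10
A = rng.normal(size=(n_meas, n_vox)) / np.sqrt(n_meas)  # toy forward model
phi = np.linspace(1.0, 0.5, n_vox)       # light fluence, decaying with depth
x_true = rng.random(n_vox)               # true absorption distribution
y = A @ (phi * x_true)                   # detected signals: p0 = fluence * absorption

x = np.zeros(n_vox)                      # estimated absorption
alpha = 0.5                              # step size (assumed)
for _ in range(3000):
    r = y - A @ (phi * x)                # residual vs. modeled signals
    x += alpha * phi * (A.T @ r)         # gradient step; fluence enters the adjoint
```

Because the fluence multiplies the estimate inside the forward model rather than dividing the reconstruction afterwards, low-fluence voxels are recovered through the same residual minimization instead of having their noise amplified, which is the behavior the abstract targets.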


Subject(s)
Image Processing, Computer-Assisted/methods , Photoacoustic Techniques/methods , Signal Processing, Computer-Assisted , Tomography/methods , Computer Simulation , Phantoms, Imaging , Signal-To-Noise Ratio