Results 1 - 14 of 14
1.
IEEE Trans Image Process ; 33: 1432-1447, 2024.
Article in English | MEDLINE | ID: mdl-38354079

ABSTRACT

Few-shot semantic segmentation aims to segment novel-class objects in a query image with only a few annotated examples in support images. Although progress has been made recently by combining prototype-based metric learning, existing methods still face two main challenges. First, various intra-class objects between the support and query images or semantically similar inter-class objects can seriously harm the segmentation performance due to their poor feature representations. Second, the latent novel classes are treated as the background in most methods, leading to a learning bias, whereby these novel classes are difficult to correctly segment as foreground. To solve these problems, we propose a dual-branch learning method. The class-specific branch encourages representations of objects to be more distinguishable by increasing the inter-class distance while decreasing the intra-class distance. In parallel, the class-agnostic branch focuses on minimizing the foreground class feature distribution and maximizing the features between the foreground and background, thus increasing the generalizability to novel classes in the test stage. Furthermore, to obtain more representative features, pixel-level and prototype-level semantic learning are both involved in the two branches. The method is evaluated on PASCAL-5i 1-shot, PASCAL-5i 5-shot, COCO-20i 1-shot, and COCO-20i 5-shot, and extensive experiments show that our approach is effective for few-shot semantic segmentation despite its simplicity.
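
The prototype-based metric learning that this abstract builds on can be illustrated with a minimal sketch. It shows only the generic baseline (masked average pooling to form class prototypes, then cosine-similarity labeling of query pixels), not the dual-branch method itself. The function names and the toy 2-D features are assumptions made for illustration.

```python
import math

def masked_average_pool(features, mask):
    """Average the feature vectors at pixels where mask == 1 (a class prototype)."""
    dim = len(features[0][0])
    total = [0.0] * dim
    count = 0
    for feat_row, mask_row in zip(features, mask):
        for vec, m in zip(feat_row, mask_row):
            if m:
                count += 1
                for k in range(dim):
                    total[k] += vec[k]
    if count == 0:
        raise ValueError("mask selects no pixels")
    return [t / count for t in total]

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0

def segment_query(query_features, fg_proto, bg_proto):
    """Label a query pixel 1 if it is more similar to the foreground prototype."""
    return [[1 if cosine(v, fg_proto) > cosine(v, bg_proto) else 0 for v in row]
            for row in query_features]

# Toy 2x2 support image with hypothetical 2-D features and its foreground mask.
support = [[[1.0, 0.0], [0.9, 0.1]],
           [[0.0, 1.0], [0.1, 0.9]]]
support_mask = [[1, 1], [0, 0]]
background_mask = [[1 - m for m in row] for row in support_mask]

fg = masked_average_pool(support, support_mask)
bg = masked_average_pool(support, background_mask)

# One-row query image: the first pixel resembles the foreground, the second does not.
query = [[[0.8, 0.2], [0.2, 0.8]]]
pred = segment_query(query, fg, bg)
```

In a real pipeline the features would come from a learned backbone; here they are hand-written so the behavior is easy to follow.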

2.
Article in English | MEDLINE | ID: mdl-37801374

ABSTRACT

Creating visualizations of multiple volumetric density fields is demanding in virtual reality (VR) applications, which often include divergent volumetric density distributions mixed with geometric models and physics-based simulations. Real-time rendering of such complex environments poses significant challenges for rendering quality and performance. This paper presents a novel scheme for efficient real-time rendering of varying translucent volumetric density fields with global illumination (GI) effects on high-resolution binocular VR displays. Our scheme proposes creative solutions to address three challenges involved in the target problem. Firstly, to tackle the doubled heavy workloads of binocular ray marching, we explore the anti-aliasing principles and more advanced potentials of ray marching on interior cube-map faces, and propose a coupled ray-marching technique that converges to multi-resolution cube maps with interleaved adaptive sampling. Secondly, we devise a fully dynamic ambient GI approximation method that leverages spherical-harmonics (SH) transform information of the phase function to reduce the huge amount of ray sampling required for GI while ensuring fidelity. The method catalyzes spatial ray-marching reuse and adaptive temporal accumulation. Thirdly, we deploy a two-phase ray-tracing algorithm with a tiled k-buffer to achieve fast processing of order-independent transparency (OIT) for multiple volume instances. Consequently, high-quality and high-performance real-time dynamic volume rendering can be achieved under constrained budgets controlled by developers. As our solution supports mixed mesh-volume rendering, the test results prove the practical usefulness of our approach for high-resolution binocular VR rendering on hybrid multi-volumetric and geometric environments.

3.
IEEE Trans Vis Comput Graph ; 28(9): 3168-3179, 2022 Sep.
Article in English | MEDLINE | ID: mdl-33523813

ABSTRACT

In simulating viscous incompressible SPH fluids, incompressibility and viscosity are typically solved in two separate stages. However, interference between pressure and shear forces can cause the loss of sharp surface details and of characteristic viscous behaviors such as buckling and rope coiling. To alleviate this problem, we introduce for the first time the semi-implicit method for pressure linked equations (SIMPLE) into SPH to solve incompressible fluids with a broad range of viscosities. We propose to link the incompressibility and viscosity solvers, and impose incompressibility and viscosity constraints iteratively to gradually remove the interference between pressure and shear forces. We also discuss how to solve the particle deficiency problem for both incompressibility and viscosity solvers. Our method is stable at simulating incompressible fluids whose viscosity can range from zero to an extremely high value. Compared to state-of-the-art methods, our method not only produces realistic viscous behaviors, but is also better at preserving sharp surface details.

4.
IEEE Trans Pattern Anal Mach Intell ; 44(11): 7731-7746, 2022 11.
Article in English | MEDLINE | ID: mdl-34546919

ABSTRACT

This paper introduces versatile filters to construct efficient convolutional neural networks that are widely used in various visual recognition tasks. Considering the demands of efficient deep learning techniques running on cost-effective hardware, a number of methods have been developed to learn compact neural networks. Most of these works aim to slim down filters in different ways, e.g., investigating small, sparse or quantized filters. In contrast, we treat filters from an additive perspective. A series of secondary filters can be derived from a primary filter with the help of binary masks. These secondary filters are all inherited from the primary filter without occupying additional storage, but once unfolded during computation they can significantly enhance the capability of the filter by integrating information extracted from different receptive fields. Besides spatial versatile filters, we additionally investigate versatile filters from the channel perspective. Binary masks can be further customized for different primary filters under orthogonal constraints. We conduct a theoretical analysis of network complexity and introduce an efficient convolution scheme. Experimental results on benchmark datasets and neural networks demonstrate that our versatile filters are able to achieve accuracy comparable to that of the original filters while requiring less memory and computation.
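
The idea of deriving secondary filters from one primary filter with binary masks can be sketched as follows. The concentric square masks (5x5, 3x3, 1x1) are an assumption chosen to give different receptive fields; the abstract does not specify the mask layout, and all function names are hypothetical.

```python
def concentric_masks(size):
    """Binary masks keeping the centered size x size, (size-2) x (size-2), ... windows."""
    masks = []
    for border in range((size + 1) // 2):
        masks.append([[1 if (border <= r < size - border
                             and border <= c < size - border) else 0
                       for c in range(size)]
                      for r in range(size)])
    return masks

def secondary_filters(primary, masks):
    """Derive one secondary filter per mask by element-wise masking of the primary filter."""
    return [[[w * m for w, m in zip(w_row, m_row)]
             for w_row, m_row in zip(primary, mask)]
            for mask in masks]

def response(patch, filt):
    """Correlate a filter with an image patch of the same size (a single output value)."""
    return sum(p * w for p_row, w_row in zip(patch, filt)
               for p, w in zip(p_row, w_row))

# Only the 5x5 primary filter needs to be stored; the masks are fixed, not learned.
primary = [[1.0] * 5 for _ in range(5)]
masks = concentric_masks(5)
filters = secondary_filters(primary, masks)

# Each secondary filter sees a different receptive field of the same patch.
patch = [[1.0] * 5 for _ in range(5)]
responses = [response(patch, f) for f in filters]
```

With an all-ones filter and patch, each response simply counts the pixels its mask keeps, which makes the three receptive-field sizes visible.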


Subject(s)
Algorithms , Neural Networks, Computer
5.
IEEE Trans Pattern Anal Mach Intell ; 42(8): 2011-2023, 2020 08.
Article in English | MEDLINE | ID: mdl-31034408

ABSTRACT

The central building block of convolutional neural networks (CNNs) is the convolution operator, which enables networks to construct informative features by fusing both spatial and channel-wise information within local receptive fields at each layer. A broad range of prior research has investigated the spatial component of this relationship, seeking to strengthen the representational power of a CNN by enhancing the quality of spatial encodings throughout its feature hierarchy. In this work, we focus instead on the channel relationship and propose a novel architectural unit, which we term the "Squeeze-and-Excitation" (SE) block, that adaptively recalibrates channel-wise feature responses by explicitly modelling interdependencies between channels. We show that these blocks can be stacked together to form SENet architectures that generalise extremely effectively across different datasets. We further demonstrate that SE blocks bring significant improvements in performance for existing state-of-the-art CNNs at slight additional computational cost. Squeeze-and-Excitation Networks formed the foundation of our ILSVRC 2017 classification submission which won first place and reduced the top-5 error to 2.251 percent, surpassing the winning entry of 2016 by a relative improvement of ∼25 percent. Models and code are available at https://github.com/hujie-frank/SENet.
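
A minimal sketch of channel recalibration in the spirit of an SE block, written with plain Python lists rather than a deep learning framework. It assumes the commonly described design: global average pooling (squeeze), a two-layer bottleneck with ReLU and sigmoid (excitation), and per-channel scaling. The weights `w1` and `w2` are zero placeholders rather than trained values, and biases are omitted.

```python
import math

def squeeze(x):
    """Global average pooling: one scalar descriptor per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]

def matvec(w, v):
    """Multiply matrix w (list of rows) by vector v."""
    return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]

def excite(z, w1, w2):
    """Bottleneck gating: sigmoid(w2 . relu(w1 . z)), one gate in (0, 1) per channel."""
    hidden = [max(0.0, h) for h in matvec(w1, z)]
    return [1.0 / (1.0 + math.exp(-g)) for g in matvec(w2, hidden)]

def se_block(x, w1, w2):
    """Rescale every channel of x by its learned gate."""
    gates = excite(squeeze(x), w1, w2)
    return [[[v * g for v in row] for row in ch] for ch, g in zip(x, gates)]

# Feature map with C = 2 channels of size 2x2; reduction ratio r = 2 (assumed).
x = [[[1.0, 1.0], [1.0, 1.0]],
     [[2.0, 2.0], [2.0, 2.0]]]
w1 = [[0.0, 0.0]]    # shape (C / r) x C, placeholder weights
w2 = [[0.0], [0.0]]  # shape C x (C / r), placeholder weights
y = se_block(x, w1, w2)
```

With zero weights every gate equals sigmoid(0) = 0.5, so each channel is halved; trained weights would instead emphasize some channels and suppress others.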

6.
IEEE Trans Vis Comput Graph ; 25(7): 2471-2481, 2019 Jul.
Article in English | MEDLINE | ID: mdl-29993746

ABSTRACT

Modeling virtual textiles has long been an appealing topic in computer graphics. To date, considerable effort has been devoted to their distinctive appearance and physically-based simulation. The appearance of staining patterns, commonly seen on textiles, has received comparatively little attention. This paper introduces techniques for simulating staining effects on fabric. Based on the microstructure of yarn, we propose a triple-layer model (TLM) to handle the liquid-yarn interaction for the wetting and wicking computation, and we formalize the liquid spreading in woven cloth into two typical actions, the in-yarn diffusion and the cross-yarn diffusion. The dye diffusion is driven by the liquid diffusion and the concentration distribution of pigments. The warp-weft anisotropy is handled by simulation of the yarn's structure in the two directions. Experimental results demonstrate that a wide range of fabric stain phenomena on different textile materials, such as the water ring effect, the highly saturated stain contour, and the dynamic wash-away effect, can be simulated effectively without loss of visual realism. The realism of our simulation results is comparable to effects shown in photographs of real-world examples.

7.
IEEE Trans Vis Comput Graph ; 24(9): 2589-2599, 2018 09.
Article in English | MEDLINE | ID: mdl-28952943

ABSTRACT

Unified simulation of versatile elastoplastic materials and different dimensions offers many advantages in animation production, contact handling, and hardware acceleration. The unstructured particle representation is particularly suitable for this task, thanks to its simplicity. However, previous meshless techniques either incur too much computational cost in addressing stability issues, or lack physical meaning and fail to generate interesting deformation behaviors, such as the Poisson effect. In this paper, we study the development of an elastoplastic model under the state-based peridynamics framework, which uses integrals rather than partial derivatives in its formulation. To model elasticity, we propose a unique constitutive model and an efficient iterative simulator solved in a projective dynamics way. To handle plastic behaviors, we incorporate our simulator with the Drucker-Prager yield criterion and a reference position update scheme, both of which are implemented under peridynamics. Finally, we show how to strengthen the simulator by position-based constraints and spatially varying stiffness models, to achieve incompressibility, particle redistribution, cohesion, and friction effects in viscoelastic and granular flows. Our experiments demonstrate that our unified, meshless simulator is flexible, efficient, robust, and well suited to parallel computing.

8.
J Opt Soc Am A Opt Image Sci Vis ; 33(12): 2365-2375, 2016 Dec 01.
Article in English | MEDLINE | ID: mdl-27906263

ABSTRACT

Boundary priors have been extensively studied in salient region detection problems over the past few decades. Although several models based on the boundary prior have achieved good detection performance, there still exist drawbacks. The most common one is that they fail to detect the salient object when the background is complex or the salient object touches the image boundary. In this paper, we propose a novel model to detect the salient region. It is based on background cues and one complementary cue, that is, a foreground cue. A saliency score is obtained via solving an energy optimization problem which takes both the background cue and foreground cue into consideration. Extensive experiments, including both quantitative and qualitative evaluations on five widely used datasets, demonstrate the superiority of our proposed model to several other state-of-the-art models.

9.
IEEE Trans Vis Comput Graph ; 22(8): 1973-86, 2016 08.
Article in English | MEDLINE | ID: mdl-26353373

ABSTRACT

We propose a semi-Lagrangian method for multiphase interface tracking. In contrast to previous methods, our method maintains an explicit polygonal mesh, which is reconstructed from an unsigned distance function and an indicator function, to track the interface of an arbitrary number of phases. The surface mesh is reconstructed at each step using an efficient multiphase polygonization procedure with precomputed stencils, while the distance and indicator functions are updated with an accurate semi-Lagrangian path tracing from the meshes of the last step. Furthermore, we provide an adaptive data structure, the multiphase distance tree, to accelerate the updating of both the distance function and the indicator function. In addition, the adaptive structure also enables us to contour the distance tree accurately with simple bisection techniques. The major advantage of our method is that it can easily handle topological changes without ambiguities and preserve both the sharp features and the volume well. We evaluate its efficiency, accuracy, and robustness on several examples in the results section.
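
The semi-Lagrangian update mentioned here can be illustrated in one dimension: each grid point traces backward along the velocity to its departure point and interpolates the previous field there. This is a generic textbook sketch on a uniform grid with constant velocity, not the mesh-based path tracing of the paper. The function name and parameters are assumptions.

```python
def semi_lagrangian_step(phi, velocity, dt, dx):
    """Advect the 1D field phi by one time step using backward tracing."""
    n = len(phi)
    out = []
    for i in range(n):
        x_back = i * dx - velocity * dt          # departure point of grid node i
        s = min(max(x_back / dx, 0.0), n - 1.0)  # clamp to the grid, in cell units
        j = min(int(s), n - 2)                   # left neighbor index
        t = s - j                                # interpolation weight in [0, 1]
        out.append((1.0 - t) * phi[j] + t * phi[j + 1])
    return out

# A linear distance-like field moved half a cell to the right.
phi = [0.0, 1.0, 2.0, 3.0, 4.0]
new_phi = semi_lagrangian_step(phi, velocity=1.0, dt=0.5, dx=1.0)
```

Because values are interpolated from the previous step rather than extrapolated, the update remains stable even when the time step is large relative to the grid spacing.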

10.
IEEE Trans Image Process ; 24(10): 3109-23, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26080049

ABSTRACT

The sparse coding technique has shown flexibility and capability in image representation and analysis. It is a powerful tool in many visual applications. Some recent work has shown that incorporating task-specific properties (such as discrimination for classification tasks) into dictionary learning is effective for improving accuracy. However, traditional supervised dictionary learning methods suffer from high computational complexity when dealing with a large number of categories, making them less satisfactory in large scale applications. In this paper, we propose a novel multi-level discriminative dictionary learning method and apply it to large scale image classification. Our method takes advantage of hierarchical category correlation to encode multi-level discriminative information. Each internal node of the category hierarchy is associated with a discriminative dictionary and a classification model. The dictionaries at different layers are learnt to capture the information of different scales. Moreover, each node at lower layers also inherits the dictionary of its parent, so that the categories at lower layers can be described with multi-scale information. The learning of dictionaries and associated classification models is jointly conducted by minimizing an overall tree loss. The experimental results on challenging data sets demonstrate that our approach achieves excellent accuracy and competitive computation cost compared with other sparse coding methods for large scale image classification.

11.
Opt Express ; 21(8): 10070-86, 2013 Apr 22.
Article in English | MEDLINE | ID: mdl-23609712

ABSTRACT

In this paper, we present an efficient Computer Generated Integral Imaging (CGII) method, called multiple ray cluster rendering (MRCR). Based on the MRCR, an interactive integral imaging system is realized, which provides accurate 3D images that adapt to observers' changing positions in real time. The MRCR method can generate all the elemental image pixels within only one rendering pass by ray reorganization of multiple ray clusters and 3D content duplication. It is compatible with various graphic contents including mesh, point cloud, and medical data. Moreover, a multi-sampling method is embedded in the MRCR method to produce anti-aliased 3D images. To the best of our knowledge, the MRCR method outperforms the existing CGII methods in both speed and display quality. Experimental results show that the proposed CGII method can achieve real-time computational speed for large-scale 3D data with about 50,000 points.


Subject(s)
Algorithms , Computer Graphics , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Imaging, Three-Dimensional/methods , Signal Processing, Computer-Assisted , User-Computer Interface
12.
IEEE Trans Pattern Anal Mach Intell ; 34(7): 1437-44, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22450820

ABSTRACT

We propose a method for intrinsic image decomposition based on retinex theory and texture analysis. While most previous methods approach this problem by analyzing local gradient properties, our technique additionally identifies distant pixels with the same reflectance through texture analysis, and uses these nonlocal reflectance constraints to significantly reduce ambiguity in decomposition. We formulate the decomposition problem as the minimization of a quadratic function which incorporates both the retinex constraint and our nonlocal texture constraint. This optimization can be solved in closed form with the standard conjugate gradient algorithm. Extensive experimentation with comparisons to previous techniques validates our method in terms of both decomposition accuracy and runtime efficiency.
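
Minimizing a quadratic function f(x) = 1/2 x^T A x - b^T x with a symmetric positive definite A is equivalent to solving A x = b, which is what the conjugate gradient algorithm does. The sketch below shows that generic solver only. The 2x2 matrix `a` and vector `b` are placeholders, not the retinex or texture constraints of the paper.

```python
def matvec(a, v):
    """Multiply matrix a (list of rows) by vector v."""
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]

def dot(u, v):
    """Inner product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def conjugate_gradient(a, b, tol=1e-10, max_iter=100):
    """Solve a x = b for symmetric positive definite a, starting from x = 0."""
    x = [0.0] * len(b)
    r = list(b)   # residual b - a x, which equals b when x = 0
    p = list(r)   # first search direction
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs ** 0.5 < tol:
            break
        ap = matvec(a, p)
        alpha = rs / dot(p, ap)  # exact step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        # The next direction is conjugate to the previous ones with respect to a.
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

# Placeholder quadratic: the exact minimizer is x = (1/11, 7/11).
a = [[4.0, 1.0],
     [1.0, 3.0]]
b = [1.0, 2.0]
x = conjugate_gradient(a, b)
```

For large sparse systems such as per-pixel image constraints, a sparse matrix representation would replace the dense lists used here.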

13.
IEEE Comput Graph Appl ; 32(1): 78-86, 2012.
Article in English | MEDLINE | ID: mdl-24808295

ABSTRACT

This approach employs a hybrid texel-and-points scheme, allowing the volume models to handle time-varying simulations. The modeling of grass carries out physically based calculations on the point-based structure. These calculations express the geometric deformation of each grass blade while providing a basis for further transformation of the desired texel array.

14.
IEEE Trans Vis Comput Graph ; 14(1): 73-83, 2008.
Article in English | MEDLINE | ID: mdl-17993703

ABSTRACT

This paper presents the layer-based representation of polyhedrons and its use for point-in-polyhedron tests. In the representation, the facets and edges of a polyhedron are sequentially arranged, so the binary search algorithm can be used efficiently to speed up inclusion tests. In comparison with conventional representations of polyhedrons, the layer-based representation we propose greatly reduces the storage requirement because it represents much information implicitly, though it still has a storage complexity of O(n). It is simple to implement, and robust for inclusion tests because many singularities are erased in constructing the layer-based representation. By incorporating an octree structure for organizing polyhedrons, our approach can run at a speed comparable with Binary Space Partitioning (BSP)-based inclusion tests, and at the same time greatly reduce storage and preprocessing time in treating large polyhedrons. We have developed an efficient solution for point-in-polyhedron tests with the time complexity varying between O(n) and O(log n), depending on the polyhedron shape and the constructed representation, and less than O(log^3 n) in most cases. The time complexity of preprocessing is between O(n) and O(n^2), varying with polyhedrons, where n is the edge number of a polyhedron.
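
The role of binary search in such a test can be sketched with a deliberately simplified structure: if the layers are delimited by sorted boundary coordinates, the layer containing a query coordinate is found in O(log n) time. The `boundaries` list and the `find_layer` helper are hypothetical; the abstract does not describe the actual layout of facets and edges inside each layer.

```python
from bisect import bisect_right

def find_layer(boundaries, z):
    """Return i with boundaries[i] <= z < boundaries[i + 1], or None if z is outside."""
    if z < boundaries[0] or z >= boundaries[-1]:
        return None
    # bisect_right gives the index of the first boundary greater than z.
    return bisect_right(boundaries, z) - 1

# Four sorted boundaries delimit three layers: [0, 1), [1, 2.5), [2.5, 4).
boundaries = [0.0, 1.0, 2.5, 4.0]
inside = find_layer(boundaries, 1.7)
outside = find_layer(boundaries, 4.2)
```

A full inclusion test would then examine only the facets stored for the located layer, which is where the savings over a scan of the whole polyhedron would come from.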


Subject(s)
Algorithms , Computer Graphics , Image Interpretation, Computer-Assisted/methods , Imaging, Three-Dimensional/methods , Pattern Recognition, Automated/methods , Signal Processing, Computer-Assisted , Image Enhancement/methods , Information Storage and Retrieval/methods , Numerical Analysis, Computer-Assisted , Reproducibility of Results , Sensitivity and Specificity