Results 1 - 5 of 5
1.
Article in English | MEDLINE | ID: mdl-38315589

ABSTRACT

Recently, memory-based networks have achieved promising performance for video object segmentation (VOS). However, existing methods still suffer from unsatisfactory segmentation accuracy and inferior efficiency. The reasons are mainly twofold: 1) during memory construction, the inflexible memory storage mechanism results in a weak discriminative ability for similar appearances in complex scenarios, leading to video-level temporal redundancy, and 2) during memory reading, matching robustness and memory retrieval accuracy decrease as the number of video frames increases. To address these challenges, we propose an adaptive sparse memory network (ASM) that efficiently and effectively performs VOS by sparsely leveraging previous guidance while attending to key information. Specifically, we design an adaptive sparse memory constructor (ASMC) to adaptively memorize informative past frames according to dynamic temporal changes in video frames. Furthermore, we introduce an attentive local memory reader (ALMR) to quickly retrieve relevant information using a subset of memory, thereby reducing frame-level redundant computation and noise in a simpler and more convenient manner. To prevent key features from being discarded by the subset of memory, we further propose a novel attentive local feature aggregation (ALFA) module, which preserves useful cues by selectively aggregating discriminative spatial dependence from adjacent frames, thereby effectively increasing the receptive field of each memory frame. Extensive experiments demonstrate that our model achieves state-of-the-art performance with real-time speed on six popular VOS benchmarks. Furthermore, our ASM can be applied to existing memory-based methods as generic plugins to achieve significant performance improvements. More importantly, our method exhibits robustness in handling sparse videos with low frame rates.
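To make the memory-reading idea concrete, here is a minimal NumPy sketch of attention-style reading over a sparse subset of memory frames, in the spirit of the local memory reading described above. The function name, array shapes, scaling, and top-k selection rule are illustrative assumptions for exposition; this is not the ASM/ALMR implementation.

```python
import numpy as np

def read_local_memory(mem_keys, mem_vals, query, top_k=16):
    """Attention-style read of a sparse memory (illustrative sketch).

    mem_keys: (N, C)  keys from the frames kept in the sparse memory
    mem_vals: (N, C)  values paired with those keys
    query:    (HW, C) per-pixel features of the current frame
    Returns a (HW, C) readout for the current frame.
    """
    top_k = min(top_k, mem_keys.shape[0])
    affinity = query @ mem_keys.T / np.sqrt(query.shape[1])        # (HW, N)
    # keep only the top-k most similar memory entries per pixel ("local" reading)
    idx = np.argpartition(-affinity, top_k - 1, axis=1)[:, :top_k]
    rows = np.arange(query.shape[0])[:, None]
    local = affinity[rows, idx]                                    # (HW, top_k)
    weights = np.exp(local - local.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)                  # softmax over the subset
    return np.einsum('pk,pkc->pc', weights, mem_vals[idx])         # weighted sum of values
```

In a full system, mem_keys and mem_vals would be built only from the frames an adaptive constructor decided to store, which is what keeps the memory sparse in the first place.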

2.
IEEE Trans Image Process ; 32: 3924-3938, 2023.
Article in English | MEDLINE | ID: mdl-37432823

ABSTRACT

Recently, memory-based methods have achieved remarkable progress in video object segmentation. However, segmentation performance is still limited by error accumulation and redundant memory, primarily because of 1) the semantic gap caused by similarity matching and memory reading via heterogeneous key-value encoding, and 2) the continuously growing and inaccurate memory that results from directly storing unreliable predictions of all previous frames. To address these issues, we propose an efficient, effective, and robust segmentation method based on Isogenous Memory Sampling and Frame-Relation mining (IMSFR). Specifically, using an isogenous memory sampling module, IMSFR consistently conducts memory matching and reading between sampled historical frames and the current frame in an isogenous space, minimizing the semantic gap while speeding up the model through efficient random sampling. Furthermore, to avoid losing key information during the sampling process, we design a frame-relation temporal memory module to mine inter-frame relations, thereby effectively preserving contextual information from the video sequence and alleviating error accumulation. Extensive experiments demonstrate the effectiveness and efficiency of the proposed IMSFR method. In particular, IMSFR achieves state-of-the-art performance on six commonly used benchmarks in terms of region similarity & contour accuracy and speed. Our model also exhibits strong robustness against frame sampling due to its large receptive field.
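As an illustration of the isogenous-sampling idea, the following sketch randomly samples a few past frames and matches the current frame against them in a single shared feature space produced by one encoder. The function names, shapes, and cosine-similarity readout are assumptions made for exposition, not IMSFR's actual modules.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_memory(history_feats, history_masks, num_samples=4):
    """Randomly sample past frames; memory and query features are assumed to
    come from the same encoder, so matching stays in one (isogenous) space."""
    n = len(history_feats)
    idx = sorted(rng.choice(n, size=min(num_samples, n), replace=False))
    return [history_feats[i] for i in idx], [history_masks[i] for i in idx]

def match_and_read(current_feat, mem_feats, mem_masks):
    """Cosine-similarity matching against the sampled memory, followed by a
    soft readout of the stored masks (a stand-in for memory reading)."""
    cur = current_feat / np.linalg.norm(current_feat, axis=-1, keepdims=True)   # (HW, C)
    keys = np.concatenate(mem_feats, axis=0)                                    # (M*HW, C)
    vals = np.concatenate(mem_masks, axis=0)                                    # (M*HW, K)
    keys = keys / np.linalg.norm(keys, axis=-1, keepdims=True)
    sim = cur @ keys.T                                                          # (HW, M*HW)
    w = np.exp(sim - sim.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)                                           # softmax over memory pixels
    return w @ vals                                                             # (HW, K) predicted soft mask
```

A frame-relation module, as described in the abstract, would sit on top of this by propagating information between the sampled frames before the readout; that part is omitted here.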

3.
IEEE Trans Neural Netw Learn Syst ; 31(9): 3620-3633, 2020 Sep.
Article in English | MEDLINE | ID: mdl-31714242

ABSTRACT

Outliers due to occlusion, pixel corruption, and so on pose serious challenges to face recognition despite the recent progress brought by sparse representation. In this article, we show that the robust statistics implemented by state-of-the-art methods are insufficient for robustness against dense gross errors. By modeling the distribution of coding residuals with a Laplacian-uniform mixture, we obtain a sparse representation that is significantly more robust than previous methods. The nonconvex error term of the resulting objective function is nondifferentiable at zero and cannot be properly addressed by the usual iteratively reweighted least-squares formulation. We show that an iterative robust coding algorithm can be derived by local linear approximation of the nonconvex error term, which is both effective and efficient. With iteratively reweighted l1 minimization of the error term, the proposed algorithm handles the sparsity assumption on the coding errors more appropriately than previous methods. Notably, it addresses error detection and error correction cooperatively in the robust coding process. The proposed method demonstrates significantly improved robustness for face recognition against dense gross errors, either contiguous or discontiguous, as verified by extensive experiments.
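The sketch below illustrates only the generic iteratively-reweighted idea in a robust sparse-coding setting: an outer loop recomputes residual weights so that grossly corrupted pixels are down-weighted, and an inner proximal-gradient (ISTA) loop solves the weighted, l1-regularized coding subproblem. The weight rule 1/(|e| + eps) and all parameter values are placeholders; this is not the paper's Laplacian-uniform model, nor its local-linear-approximation algorithm.

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def reweighted_sparse_coding(y, D, lam=0.05, outer_iters=10, inner_iters=200, eps=1e-3):
    """Generic iteratively-reweighted robust sparse coding (illustrative only).

    y: (n,) observed face vector, possibly with gross errors
    D: (n, m) dictionary of training faces
    Returns the sparse code x and the final residual weights w.
    """
    n, m = D.shape
    x = np.zeros(m)
    w = np.ones(n)
    for _ in range(outer_iters):
        # inner ISTA on 0.5 * || sqrt(w) * (y - D x) ||^2 + lam * ||x||_1
        Dw = D * w[:, None]
        L = np.linalg.norm(D.T @ Dw, 2) + 1e-12        # Lipschitz constant of the gradient
        for _ in range(inner_iters):
            grad = D.T @ (w * (D @ x - y))
            x = soft_threshold(x - grad / L, lam / L)
        # illustrative robust rule: down-weight pixels with large residuals
        e = y - D @ x
        w = 1.0 / (np.abs(e) + eps)
    return x, w
```

The returned weights double as an error-detection map, which mirrors, in a simplified way, the cooperative detection-and-correction behavior mentioned in the abstract.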

4.
IEEE Trans Cybern ; 43(2): 490-503, 2013 Apr.
Article in English | MEDLINE | ID: mdl-22929435

ABSTRACT

Since 2001, a novel type of recurrent neural network called the Zhang neural network (ZNN) has been proposed, investigated, and exploited for solving online time-varying problems in a variety of scientific and engineering fields. In this paper, three discrete-time ZNN models are first proposed to solve the problem of time-varying quadratic minimization (TVQM). These discrete-time ZNN models methodically exploit the time derivatives of the time-varying coefficients and the inverse of the time-varying coefficient matrix. To eliminate the explicit matrix-inversion operation, the quasi-Newton BFGS method is introduced, which effectively approximates the inverse of the Hessian matrix; thus, three discrete-time ZNN models combined with the quasi-Newton BFGS method (named ZNN-BFGS) are proposed and investigated for TVQM. In addition, according to whether the time-derivative information of the time-varying coefficients is explicitly known and used, the proposed discrete-time models are classified into three categories: 1) models with time-derivative information known (i.e., ZNN-K and ZNN-BFGS-K models), 2) models with time-derivative information unknown (i.e., ZNN-U and ZNN-BFGS-U models), and 3) simplified models that do not use time-derivative information (i.e., ZNN-S and ZNN-BFGS-S models). The well-known gradient-based neural network is also developed to handle TVQM for comparison with the proposed ZNN and ZNN-BFGS models. Illustrative examples are provided and analyzed to substantiate the efficacy of the proposed models for TVQM.
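A minimal sketch may clarify how the time derivatives of the coefficients enter a ZNN-style update for TVQM. With f(x, t) = 0.5 x^T P(t) x + q(t)^T x, the error e(t) = P(t) x(t) + q(t) (the gradient) is forced to decay as de/dt = -gamma * e, and a forward-Euler step discretizes the resulting dynamics. The coefficient functions, step size, and design parameter below are hypothetical, and this explicit-inverse version is only a rough stand-in for the "derivative information known" case; it is not the paper's discrete models or the ZNN-BFGS variants.

```python
import numpy as np

# Hypothetical time-varying coefficients of f(x, t) = 0.5 x^T P(t) x + q(t)^T x
def P(t):      # symmetric positive definite for all t
    return np.array([[4 + np.sin(t), 1.0], [1.0, 4 + np.cos(t)]])

def q(t):
    return np.array([np.cos(t), np.sin(t)])

def Pdot(t):   # analytic time derivatives ("derivative information known" setting)
    return np.array([[np.cos(t), 0.0], [0.0, -np.sin(t)]])

def qdot(t):
    return np.array([-np.sin(t), np.cos(t)])

tau, gamma, T = 1e-3, 10.0, 10.0           # step size, design parameter, time horizon
x = np.zeros(2)
for k in range(int(T / tau)):
    t = k * tau
    e = P(t) @ x + q(t)                    # gradient of f; the ZNN error driven to zero
    # imposing de/dt = -gamma * e gives P xdot = -(Pdot x + qdot + gamma * e)
    xdot = -np.linalg.solve(P(t), Pdot(t) @ x + qdot(t) + gamma * e)
    x = x + tau * xdot                     # forward-Euler discretization

# x should now track the time-varying minimizer -P(t)^{-1} q(t)
print(x, -np.linalg.solve(P(T), q(T)))
```

Replacing the explicit solve with a quasi-Newton (BFGS-style) approximation of the inverse Hessian points in the rough direction of the ZNN-BFGS variants discussed above.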


Subjects
Neural Networks, Computer; Algorithms; Computer Simulation
5.
IEEE Trans Neural Netw ; 19(5): 746-57, 2008 May.
Article in English | MEDLINE | ID: mdl-18467205

ABSTRACT

The adaptive-subspace self-organizing map (ASSOM) is useful for invariant feature generation and visualization. However, its learning procedure is slow. In this paper, two fast implementations of the ASSOM are proposed to accelerate ASSOM learning, based on a close analysis of the ASSOM basis rotation operator. We investigate the objective function approximately maximized by the classical rotation operator. We then explore two schemes for applying the proposed ASSOM implementations to saliency-based invariant feature construction for image classification. In the first scheme, a cumulative activity map computed from a single ASSOM is used as the descriptor of the input image. In the second scheme, one ASSOM is trained for each image category, and a joint cumulative activity map is calculated as the descriptor. Both schemes are evaluated on a ten-class subset of the Corel photo database. The multi-ASSOM scheme is favored; it is also applied to adult-image filtering and shows promising results.
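To ground the first scheme, here is a small sketch of a cumulative activity map: each patch is projected onto every module's subspace, and the winning module accumulates its projection energy, yielding one activity value per map unit as the image descriptor. The module bases, patch preprocessing, and winner-take-all accumulation are simplifying assumptions, not the paper's exact procedure.

```python
import numpy as np

def subspace_energy(x, basis):
    """Energy of the orthogonal projection of x onto the subspace spanned by
    one module's (orthonormal) basis vectors. basis: (r, d), x: (d,)."""
    return np.sum((basis @ x) ** 2)

def cumulative_activity_map(patches, modules):
    """Accumulate the winning module's activity over all patches of an image,
    giving one value per ASSOM unit (a simplified image descriptor)."""
    act = np.zeros(len(modules))
    for x in patches:
        x = x / (np.linalg.norm(x) + 1e-12)
        energies = np.array([subspace_energy(x, B) for B in modules])
        act[np.argmax(energies)] += energies.max()   # winner-take-all accumulation
    return act
```

The second scheme would repeat this per image category, one ASSOM each, and concatenate the per-category maps into a joint descriptor.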


Subjects
Artificial Intelligence; Image Processing, Computer-Assisted/methods; Algorithms; Humans; Linear Models; Reproducibility of Results; Skin