Results 1 - 4 of 4
1.
Article in English | MEDLINE | ID: mdl-37227909

ABSTRACT

In cross-domain pattern classification, the supervised information (i.e., labeled patterns) in the source domain is often employed to help classify the unlabeled patterns in a target domain. In practice, multiple target domains are usually available, and unlabeled patterns in the different target domains that receive high-confidence predictions can also provide pseudo-supervised information for the downstream classification task. Performance in each target domain can be further improved if this pseudo-supervised information is used effectively. To this end, we propose an evidential multi-target domain adaptation (EMDA) method that takes full advantage of the useful information in the single source domain and the multiple target domains. In EMDA, we first align the distributions of the source and target domains by reducing the maximum mean discrepancy (MMD) and the covariance difference across domains. We then use the classifier learned from the labeled source-domain data to classify query patterns in the target domains. The query patterns with high-confidence predictions are selected to train a new classifier, which yields a second set of soft classification results for the query patterns. The two sets of soft classification results are combined by evidence theory. In practice, their reliabilities/weights are usually diverse, and treating them equally often yields unreliable combination results. We therefore use the distribution discrepancy across domains to estimate their weighting factors and discount them before fusion. The evidential combination of the two discounted soft classification results is used to make the final class decision. The effectiveness of EMDA was verified by comparing it with many advanced domain adaptation methods on several cross-domain pattern classification benchmark datasets.
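
A minimal sketch of two ingredients this abstract names: the squared MMD used for distribution alignment, and the evidential discounting-and-combination step. The Gaussian kernel, the discrepancy-to-reliability mapping exp(-MMD^2), and all names below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def mmd2(X, Y, gamma=0.5):
    """Squared maximum mean discrepancy with a Gaussian kernel (kernel choice assumed)."""
    k = lambda A, B: np.exp(-gamma * ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1))
    return k(X, X).mean() + k(Y, Y).mean() - 2.0 * k(X, Y).mean()

def discount(p, alpha):
    """Shafer discounting: keep alpha of each singleton mass and move the
    remaining 1 - alpha onto total ignorance (last entry = mass on frame Theta)."""
    return np.append(alpha * p, 1.0 - alpha)

def dempster(m1, m2):
    """Dempster's rule for two mass vectors over {theta_1..theta_C, Theta}."""
    C = len(m1) - 1
    s1, s2, w1, w2 = m1[:C], m2[:C], m1[C], m2[C]
    singles = s1 * s2 + s1 * w2 + w1 * s2             # mass landing on each class
    conflict = s1.sum() * s2.sum() - (s1 * s2).sum()  # mass on empty intersections
    return np.append(singles, w1 * w2) / (1.0 - conflict)

# Usage: fuse two classifiers' soft outputs for one query pattern.
rng = np.random.default_rng(0)
Xs, Xt = rng.normal(size=(40, 8)), rng.normal(0.3, 1.0, (40, 8))  # dummy domains
p1, p2 = np.array([0.7, 0.2, 0.1]), np.array([0.5, 0.4, 0.1])
alpha = float(np.exp(-mmd2(Xs, Xt)))  # assumed reliability mapping
fused = dempster(discount(p1, alpha), discount(p2, 0.8))
print(int(fused[:-1].argmax()))  # decide over the singleton masses only
```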

2.
Article in English | MEDLINE | ID: mdl-37027778

ABSTRACT

Besides combining appearance and motion information, another crucial factor for video salient object detection (VSOD) is mining spatial-temporal (ST) knowledge, including complementary long-short temporal cues and global-local spatial context from neighboring frames. However, existing methods have explored only part of these cues and ignored their complementarity. In this article, we propose a novel complementary ST transformer (CoSTFormer) for VSOD, which has a short-global branch and a long-local branch to aggregate complementary ST contexts. The former integrates the global context from the two neighboring frames using dense pairwise attention, while the latter fuses long-term temporal information from more consecutive frames with local attention windows. We thereby decompose the ST context into a short-global part and a long-local part and leverage the powerful transformer to model the context relationship and learn their complementarity. To resolve the contradiction between local window attention and object motion, we propose a novel flow-guided window attention (FGWA) mechanism that aligns the attention windows with object and camera movement. Furthermore, we deploy CoSTFormer on fused appearance and motion features, enabling the effective combination of all three VSOD factors. In addition, we present a pseudo-video generation method that synthesizes sufficient video clips from static images for training ST saliency models. Extensive experiments verify the effectiveness of our method and show that we achieve new state-of-the-art results on several benchmark datasets.
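
Read literally, the FGWA idea is to align attention windows with motion before computing local attention. Below is a minimal PyTorch sketch of the two building blocks that would precede per-window attention: flow-based feature warping and window partitioning. Shapes, the flow-channel convention, and the window size are assumptions, not the paper's exact design.

```python
import torch
import torch.nn.functional as F

def flow_warp(feat, flow):
    """Warp a feature map (N, C, H, W) by a flow field (N, 2, H, W) in pixels
    (channel 0 = x displacement, assumed), so windows follow object/camera motion."""
    N, _, H, W = feat.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float().to(feat.device)  # (2, H, W)
    coords = base[None] + flow                                   # absolute positions
    gx = 2.0 * coords[:, 0] / max(W - 1, 1) - 1.0                # normalize to [-1, 1]
    gy = 2.0 * coords[:, 1] / max(H - 1, 1) - 1.0
    grid = torch.stack((gx, gy), dim=-1)                         # (N, H, W, 2)
    return F.grid_sample(feat, grid, align_corners=True)

def window_partition(feat, ws):
    """Split (N, C, H, W) into non-overlapping ws x ws windows: (N*nH*nW, ws*ws, C)."""
    N, C, H, W = feat.shape
    x = feat.view(N, C, H // ws, ws, W // ws, ws)
    return x.permute(0, 2, 4, 3, 5, 1).reshape(-1, ws * ws, C)

# Usage: zero flow makes the warp an identity; each window is then a token
# sequence to which, e.g., torch.nn.MultiheadAttention could be applied.
feat = torch.randn(1, 32, 16, 16)
flow = torch.zeros(1, 2, 16, 16)
windows = window_partition(flow_warp(feat, flow), ws=4)  # (16, 16, 32)
```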

3.
IEEE Trans Pattern Anal Mach Intell; 44(11): 8321-8337, 2022 Nov.
Article in English | MEDLINE | ID: mdl-34437057

ABSTRACT

Conventional salient object detection models cannot differentiate the importance of different salient objects. Recently, two works have been proposed to detect saliency ranking by assigning different degrees of saliency to different objects. However, one of these models cannot differentiate object instances, and the other focuses more on inferring the sequential attention-shift order. In this paper, we investigate a practical problem setting that requires simultaneously segmenting salient instances and inferring their relative saliency rank order. We present a novel unified model as the first end-to-end solution, in which an improved Mask R-CNN first segments salient instances and a saliency ranking branch is then added to infer the relative saliency. For relative saliency ranking, we build a new graph reasoning module that combines four graphs to incorporate the instance interaction relation, local contrast, global contrast, and a high-level semantic prior, respectively. A novel loss function is also proposed to effectively train the saliency ranking branch. In addition, a new dataset and an evaluation metric are proposed for this task, aiming to push this field of research forward. Experimental results demonstrate that our proposed model is more effective than previous methods. We also show an example of its practical use in adaptive image retargeting.
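
As a reading aid, here is a hedged sketch of message passing over several instance graphs whose outputs are summed, in the spirit of the four-graph reasoning module described above. How the four adjacency matrices (interaction, local contrast, global contrast, semantic prior) are actually constructed is paper-specific; everything below, including the score head, is an illustrative assumption.

```python
import torch
import torch.nn as nn

class MultiGraphReasoning(nn.Module):
    """Sum GCN-style messages from several instance graphs (sketch)."""
    def __init__(self, dim, n_graphs=4):
        super().__init__()
        self.proj = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_graphs))
        self.score = nn.Linear(dim, 1)  # per-instance saliency rank score

    def forward(self, x, adjs):
        # x: (n_instances, dim) RoI features; adjs: row-normalized (n, n)
        # matrices, one per graph.
        h = sum(adj @ layer(x) for adj, layer in zip(adjs, self.proj))
        h = torch.relu(h + x)             # residual keeps the original feature
        return self.score(h).squeeze(-1)  # higher = more salient (assumed)

# Usage on dummy data: 5 instances, 4 graphs.
n, d = 5, 32
x = torch.randn(n, d)
adjs = [torch.softmax(torch.randn(n, n), dim=-1) for _ in range(4)]
rank_order = MultiGraphReasoning(d)(x, adjs).argsort(descending=True)
```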

4.
IEEE Trans Image Process; 30: 5862-5874, 2021.
Article in English | MEDLINE | ID: mdl-34152985

ABSTRACT

Unlike conventional instance segmentation, salient instance segmentation (SIS) faces two difficulties: it must segment only the salient instances while ignoring the background, and it targets generic object instances without pre-defined object categories. In this paper, building on the state-of-the-art Mask R-CNN model, we propose to leverage complementary saliency and contour information to handle these two challenges. We first improve Mask R-CNN by introducing an interleaved execution strategy and proposing a novel mask head network that incorporates global context within each RoI. We then add two branches to Mask R-CNN for saliency and contour detection, respectively. We fuse the Mask R-CNN features with the saliency and contour features, where the former supply pixel-wise saliency information that helps identify salient regions and the latter provide a generic object-contour prior that helps detect and segment generic objects. We also propose a novel multiscale global attention model to generate attentive global features from multiscale representative features for feature fusion. Experimental results demonstrate that all of the proposed model components improve SIS performance. Our overall model outperforms state-of-the-art SIS methods and Mask R-CNN by more than 6% and 3%, respectively. With additional multitask training data, we can further improve the model's performance on the ILSO dataset.
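
The multiscale global attention idea lends itself to a compact sketch: pool each scale's feature map with a learned spatial attention map, then merge the per-scale global vectors into one attentive global feature for fusion. The layer shapes and the merge-by-concatenation choice below are assumptions, not the paper's exact head.

```python
import torch
import torch.nn as nn

class MultiscaleGlobalAttention(nn.Module):
    """Attention-pool each scale, then merge into one global vector (sketch)."""
    def __init__(self, dim, n_scales=3):
        super().__init__()
        self.att = nn.ModuleList(nn.Conv2d(dim, 1, kernel_size=1)
                                 for _ in range(n_scales))
        self.merge = nn.Linear(n_scales * dim, dim)

    def forward(self, feats):
        # feats: list of (N, dim, H_i, W_i) maps at different scales
        pooled = []
        for f, att in zip(feats, self.att):
            w = torch.softmax(att(f).flatten(2), dim=-1)  # (N, 1, H*W) spatial weights
            pooled.append((f.flatten(2) * w).sum(-1))     # (N, dim) attentive pooling
        return self.merge(torch.cat(pooled, dim=-1))      # (N, dim) global feature

# Usage: the result could be broadcast-added to RoI features when fusing
# the saliency and contour cues (assumed fusion scheme).
feats = [torch.randn(2, 64, s, s) for s in (56, 28, 14)]
g = MultiscaleGlobalAttention(64)(feats)  # (2, 64)
```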
