1.
Article in English | MEDLINE | ID: mdl-38153834

ABSTRACT

Transformers have astounding representational power but typically incur computational cost that grows quadratically with image resolution. The prevailing Swin transformer reduces computational cost through a local window strategy. However, this strategy inevitably causes two drawbacks: 1) the local window-based self-attention (WSA) hinders the capability to model global dependencies and 2) recent studies point out that local windows impair robustness. To overcome these challenges, we pursue a preferable trade-off between computational cost and performance. Accordingly, we propose a novel factorization self-attention (FaSA) mechanism that enjoys both the low cost of local windows and the capability to model long-range dependencies. By factorizing the conventional attention matrix into sparse sub-attention matrices, FaSA captures long-range dependencies while aggregating mixed-grained information at a computational cost equivalent to that of the local WSA. Leveraging FaSA, we present the factorization vision transformer (FaViT) with a hierarchical structure. FaViT achieves high performance and robustness, with computational complexity that is linear in the input image spatial resolution. Extensive experiments demonstrate FaViT's strong performance in classification and downstream tasks. It also exhibits strong robustness to corrupted and biased data, which benefits practical applications. Compared to the baseline model Swin-T, our FaViT-B2 significantly improves classification accuracy by 1% and robustness by 7%, while reducing model parameters by 14%. Our code will soon be publicly available at https://github.com/q2479036243/FaViT.
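
The central idea of FaSA is to factorize the dense attention matrix into sparse sub-attention matrices so that long-range dependencies are captured at local-window cost. The PyTorch sketch below is a rough illustration of that general idea only, not the authors' exact FaSA formulation (mixed-grained aggregation is omitted); the class name `SparseFactorAttention`, the `stride` parameter, and the tensor shapes are hypothetical choices. Each attention call operates on a strided (dilated) subset of tokens, so its cost matches that of a local window while every token still sees distant positions.

```python
import torch
import torch.nn as nn

class SparseFactorAttention(nn.Module):
    """Illustrative sketch of factorized sparse attention: attend within
    strided (dilated) token subsets so long-range context is reachable at
    roughly local-window cost. Hypothetical simplification, not FaSA."""
    def __init__(self, dim, num_heads=4, stride=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.stride = stride

    def forward(self, x):                        # x: (B, N, C), N divisible by stride
        B, N, C = x.shape
        s = self.stride
        # Factorize the token sequence into s interleaved (dilated) subsets:
        # subset j holds positions j, j+s, j+2s, ...
        x = x.view(B, N // s, s, C).permute(0, 2, 1, 3).reshape(B * s, N // s, C)
        out, _ = self.attn(x, x, x)              # attention within each sparse subset
        # Restore the original token order.
        out = out.reshape(B, s, N // s, C).permute(0, 2, 1, 3).reshape(B, N, C)
        return out

# usage
tokens = torch.randn(2, 64, 96)                  # e.g. 8x8 patch tokens, dim 96
print(SparseFactorAttention(96)(tokens).shape)   # torch.Size([2, 64, 96])
```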

2.
IEEE Trans Med Imaging ; 42(9): 2476-2489, 2023 Sep.
Article in English | MEDLINE | ID: mdl-35862338

ABSTRACT

Automatic subcutaneous vessel imaging with near-infrared (NIR) optical apparatus can improve the accuracy of locating blood vessels, thus contributing significantly to clinical venipuncture research. Although deep learning models have achieved remarkable success in medical image segmentation, they still struggle in the subfield of subcutaneous vessel segmentation due to the scarcity and low quality of annotated data. To address this, this work presents a novel semi-supervised learning framework, SCANet, that achieves accurate vessel segmentation through an alternate training strategy. SCANet is composed of a multi-scale recurrent neural network that embeds coarse-to-fine features and two auxiliary branches, a consistency decoder and an adversarial learning branch, responsible for strengthening fine-grained details and eliminating differences between ground truths and predictions, respectively. Equipped with a novel semi-supervised alternate training strategy, the three components work collaboratively, enabling SCANet to accurately segment vessel regions with only a handful of labeled data and abundant unlabeled data. Moreover, to mitigate the shortage of annotated data in this field, we provide a new subcutaneous vessel dataset, VESSEL-NIR. Extensive experiments on a wide variety of tasks, including the segmentation of subcutaneous vessels, retinal vessels, and skin lesions, demonstrate the superiority and generality of our approach.
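
SCANet's key ingredient is the alternate training strategy: the segmenter is updated from a small labeled set and from a consistency signal on abundant unlabeled images. The snippet below is a minimal, hypothetical skeleton of one such alternating step, not the authors' implementation; the adversarial branch and the multi-scale recurrent architecture are omitted, and `model`, `cons_decoder`, and the loss weight are assumed placeholders.

```python
import torch
import torch.nn.functional as F

def alternate_train_step(model, cons_decoder, opt, labeled, unlabeled, w_cons=0.1):
    """Hypothetical skeleton of one alternating semi-supervised step:
    a supervised pass on labeled data, then a consistency pass on
    unlabeled data (adversarial branch omitted for brevity)."""
    imgs, masks = labeled

    # Supervised step: standard segmentation loss on annotated vessels.
    opt.zero_grad()
    sup_loss = F.binary_cross_entropy_with_logits(model(imgs), masks)
    sup_loss.backward()
    opt.step()

    # Unsupervised step: the main predictor and the consistency decoder
    # should agree on unlabeled images.
    opt.zero_grad()
    pred_main = torch.sigmoid(model(unlabeled))
    pred_cons = torch.sigmoid(cons_decoder(unlabeled))
    cons_loss = w_cons * F.mse_loss(pred_main, pred_cons)
    cons_loss.backward()
    opt.step()
    return sup_loss.item(), cons_loss.item()
```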


Subject(s)
Neural Networks, Computer , Retinal Vessels , Retinal Vessels/diagnostic imaging , Supervised Machine Learning , Image Processing, Computer-Assisted/methods
3.
IEEE Trans Cybern ; 52(8): 7527-7540, 2022 Aug.
Article in English | MEDLINE | ID: mdl-33417585

ABSTRACT

Visual object tracking with semantic deep features has recently attracted much attention in computer vision. In particular, Siamese trackers, which aim to learn a decision-making-based similarity evaluation, are widely used in the tracking community. However, online updating of Siamese trackers remains a tricky issue due to an inherent limitation: the tradeoff between model adaptation and degradation. To address this issue, in this article we propose a novel attentional transfer learning-based Siamese network (SiamATL), which fully exploits previous knowledge to inspire the current tracker's learning in the decision-making module. First, we explicitly model the template and its surroundings by using an attentional online update strategy to avoid template pollution. Then, we introduce an instance-transfer discriminative correlation filter (ITDCF) to enhance the distinguishing ability of the tracker. Finally, we propose a mutual compensation mechanism that integrates cross-correlation matching and ITDCF detection into the decision-making subnetwork to achieve online tracking. Comprehensive experiments demonstrate that our approach outperforms state-of-the-art tracking algorithms on multiple large-scale tracking datasets.
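
The cross-correlation matching fused into SiamATL's decision-making subnetwork follows the generic Siamese pattern: template features are slid over search-region features to produce a response map whose peak indicates the target location. The sketch below illustrates only this generic matching step, assuming nothing about the attentional update, ITDCF, or the decision fusion; the function name and feature shapes are illustrative.

```python
import torch
import torch.nn.functional as F

def cross_correlation_score(template_feat, search_feat):
    """Generic Siamese cross-correlation: use each template's feature map
    as a convolution kernel over the search-region features to obtain a
    response map. Illustrative only, not the full SiamATL pipeline."""
    # template_feat: (B, C, h, w), search_feat: (B, C, H, W)
    B, C, h, w = template_feat.shape
    # Grouped convolution treats every (batch, channel) slice as its own kernel.
    kernels = template_feat.reshape(B * C, 1, h, w)
    search = search_feat.reshape(1, B * C, *search_feat.shape[-2:])
    resp = F.conv2d(search, kernels, groups=B * C)
    # Sum channel-wise responses into a single response map per image.
    resp = resp.reshape(B, C, *resp.shape[-2:]).sum(dim=1, keepdim=True)
    return resp                                   # (B, 1, H-h+1, W-w+1)

# usage
z = torch.randn(2, 256, 6, 6)                     # template features
x = torch.randn(2, 256, 22, 22)                   # search-region features
print(cross_correlation_score(z, x).shape)        # torch.Size([2, 1, 17, 17])
```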


Subject(s)
Algorithms , Image Processing, Computer-Assisted , Attention , Learning , Machine Learning