CCTwins: A Weakly-Supervised Transformer-based Crowd Counting Method with Adaptive Scene Consistency Attention
IEEE Transactions on Consumer Electronics
Pages 1-1, 2023.
Article in English | Scopus | ID: covidwho-20234982
ABSTRACT
Recently, crowd counting has attracted significant attention, particularly in the context of the COVID-19 pandemic, due to its ability to automatically provide accurate crowd numbers in images. To address the challenges of location-level labeling, several transformer-based crowd counting methods have been proposed with only count-level supervision. However, these methods directly use the transformer as an encoder without considering the uneven crowd distribution. To address this issue, we propose CCTwins, a novel transformer-based crowd counting method with only count-level supervision. Specifically, we introduce an adaptive scene consistency attention mechanism to enhance the transformer-based model Twins-SVT-L for feature extraction in crowded scenes. Additionally, we design a multi-level weakly-supervised loss function that generates estimated crowd numbers in a coarse-to-fine manner, making it more appropriate for weakly-supervised settings. Moreover, intermediate features supervised by count-level labels are utilized to fuse multi-scale features. Experimental results on four public datasets demonstrate that our proposed method outperforms the state-of-the-art weakly-supervised methods, achieving up to a 16.6% improvement in MAE and up to a 13.8% improvement in RMSE across all evaluation settings. Moreover, the proposed CCTwins obtains competitive counting performance, even when compared to the state-of-the-art fully-supervised methods.
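The abstract reports results as MAE and RMSE over per-image counts. As a minimal sketch of how these two metrics and a relative improvement figure are conventionally computed, the Python below uses their standard definitions. The function names and the sample counts are hypothetical and are not taken from the paper, and the paper's exact evaluation protocol is not stated in this record.

```python
import math


def mae(predicted, actual):
    """Mean absolute error between predicted and ground-truth counts."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root mean squared error between predicted and ground-truth counts."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


def relative_improvement(baseline_error, new_error):
    """Percentage reduction in error relative to a baseline (assumed definition)."""
    return 100.0 * (baseline_error - new_error) / baseline_error


# Hypothetical per-image counts, for illustration only.
predicted_counts = [105.0, 48.0, 310.0]
actual_counts = [100.0, 50.0, 300.0]

print("MAE:", mae(predicted_counts, actual_counts))
print("RMSE:", rmse(predicted_counts, actual_counts))
print("Improvement (%):", relative_improvement(10.0, 8.0))
```

Because RMSE squares each error before averaging, it penalizes large miscounts on individual images more heavily than MAE does, which is why the two are usually reported together.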
Full text: Available
Collection: International organization databases
Database: Scopus
Study type: Experimental studies
Language: English
Journal: IEEE Transactions on Consumer Electronics
Year: 2023
Document type: Article