Results 1 - 20 of 65
1.
Article in English | MEDLINE | ID: mdl-38949943

ABSTRACT

The broad learning system (BLS), which features a lightweight structure, incremental extension, and strong generalization capabilities, has been successful in its applications. Despite these advantages, BLS struggles in multitask learning (MTL) scenarios: existing BLS models cannot adequately capture and leverage essential information across tasks, which limits their ability to handle multiple complex tasks simultaneously and decreases their effectiveness in MTL. To address these limitations, we proposed an innovative MTL framework explicitly designed for BLS, named group sparse regularization for broad multitask learning system using related task-wise (BMtLS-RG). This framework combines a task-related BLS learning mechanism with a group sparse optimization strategy, significantly boosting BLS's ability to generalize in MTL environments. The task-related learning component harnesses task correlations to enable shared learning and optimize parameters efficiently. Meanwhile, the group sparse optimization approach helps minimize the effects of irrelevant or noisy data, thus enhancing the robustness and stability of BLS in complex learning scenarios. To address the varied requirements of MTL challenges, we presented two additional variants of BMtLS-RG: BMtLS-RG with sharing parameters of feature mapped nodes (BMtLS-RGf), which integrates a shared feature mapping layer, and BMtLS-RGf and enhanced nodes (BMtLS-RGfe), which further includes an enhanced node layer atop the shared feature mapping structure. These adaptations provide solutions tailored to the diverse landscape of MTL problems. We compared BMtLS-RG with state-of-the-art (SOTA) MTL and BLS algorithms through comprehensive experimental evaluation across multiple practical MTL and UCI datasets. BMtLS-RG outperformed SOTA methods in 97.81% of classification tasks and achieved optimal performance in 96.00% of regression tasks, demonstrating its superior accuracy and robustness. Furthermore, BMtLS-RG exhibited satisfactory training efficiency, outperforming existing MTL algorithms by 8.04-42.85 times.
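The abstract does not spell out how the group sparse optimization is carried out. As a minimal sketch only, assuming a row-wise L2,1 penalty (a common way to impose group sparsity, not necessarily the one used in BMtLS-RG), the corresponding proximal step shrinks each row of a weight matrix and zeroes rows whose norm falls below the threshold. The function name, `lam`, and the toy matrix are hypothetical.

```python
import numpy as np

def group_soft_threshold(W, lam):
    """Proximal operator of lam * sum_i ||W[i, :]||_2 (row-wise L2,1 norm)."""
    W = np.asarray(W, dtype=float)
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    # Rows with norm <= lam are set to zero; the others shrink toward zero by lam.
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return W * scale

# Hypothetical weights: one strong row, one weak (noisy) row, one empty row.
W = np.array([[3.0, 4.0], [0.1, 0.2], [0.0, 0.0]])
W_sparse = group_soft_threshold(W, lam=1.0)
```

In this toy case the strong row keeps its direction with a reduced norm, while the weak row is removed entirely, which is the behavior that makes group penalties useful for discarding irrelevant or noisy groups of parameters.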

2.
Article in English | MEDLINE | ID: mdl-38691434

ABSTRACT

This article studies an emerging practical problem called heterogeneous prototype learning (HPL). Unlike the conventional heterogeneous face synthesis (HFS) problem that focuses on precisely translating a face image from a source domain to another target one without removing facial variations, HPL aims at learning the variation-free prototype of an image in the target domain while preserving the identity characteristics. HPL is a compounded problem involving two cross-coupled subproblems, that is, domain transfer and prototype learning (PL), thus making most of the existing HFS methods that simply transfer the domain style of images unsuitable for HPL. To tackle HPL, we advocate disentangling the prototype and domain factors in their respective latent feature spaces and then replacing the source domain with the target one for generating a new heterogeneous prototype. In doing so, the two subproblems in HPL can be solved jointly in a unified manner. Based on this, we propose a disentangled HPL framework, dubbed DisHPL, which is composed of one encoder-decoder generator and two discriminators. The generator and discriminators play adversarial games such that the generator embeds contaminated images into a prototype feature space only capturing identity information and a domain-specific feature space, while generating realistic-looking heterogeneous prototypes. Experiments on various heterogeneous datasets with diverse variations validate the superiority of DisHPL.

3.
Article in English | MEDLINE | ID: mdl-38593012

ABSTRACT

Graph-based multi-view clustering encodes multi-view data into sample affinities to find a consensus representation, effectively overcoming heterogeneity across different views. However, traditional affinity measures tend to collapse as the feature dimension expands, posing challenges in estimating a unified alignment that reveals both cross-view and inner relationships. To tackle this challenge, we propose to achieve multi-view uniform clustering via consensus representation co-regularization. First, the sample affinities are encoded by both the popular dyadic affinity and recent high-order affinities to comprehensively characterize the spatial distributions of high-dimension, low-sample-size (HDLSS) data. Second, a fused consensus representation is learned by aligning the multi-view low-dimensional representations through co-regularization. The learning of the fused representation is modeled as a high-order eigenvalue problem within manifold space to preserve the intrinsic connections and complementary correlations of the original data. A numerical scheme via manifold minimization is designed to solve the high-order eigenvalue problem efficiently. Experiments on eight HDLSS datasets demonstrate the effectiveness of our proposed method in comparison with thirteen recent benchmark methods.

4.
Article in English | MEDLINE | ID: mdl-38652619

ABSTRACT

Cross-modal hashing (CMH) has attracted considerable attention in recent years. Almost all existing CMH methods focus primarily on reducing the modality gap and the semantic gap, i.e., aligning multi-modal features and their semantics in Hamming space, without taking into account the space gap, i.e., the difference between the real number space and the Hamming space. In fact, the space gap can affect the performance of CMH methods. In this paper, we analyze and demonstrate how the space gap affects existing CMH methods, giving rise to two problems: solution space compression and loss function oscillation. These two problems eventually cause retrieval performance to deteriorate. Based on these findings, we propose a novel algorithm, namely Semantic Channel Hashing (SCH). First, we classify sample pairs into fully semantic-similar, partially semantic-similar, and semantic-negative ones based on their similarity and impose different constraints on each class to ensure that the entire Hamming space is utilized. Then, we introduce a semantic channel to alleviate the issue of loss function oscillation. Experimental results on three public datasets demonstrate that SCH outperforms the state-of-the-art methods. Furthermore, experimental validations are provided to substantiate the conjectures regarding solution space compression and loss function oscillation, offering visual evidence of their impact on CMH methods. Codes are available at https://github.com/hutt94/SCH.
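To make the space gap concrete, the following is a small illustration and not the SCH algorithm itself: two real-valued features that are close in Euclidean space can still differ in a bit once they are quantized with a sign function. The feature vectors, function names, and the choice of sign quantization are assumptions for the example.

```python
import numpy as np

def binarize(u):
    """Map real-valued features to {-1, +1} codes (zeros map to +1)."""
    return np.where(np.asarray(u) >= 0, 1, -1)

def hamming(b1, b2):
    """Hamming distance between two {-1, +1} codes of equal length."""
    return int(np.sum(b1 != b2))

u = np.array([0.9, -0.1, 0.2, -0.8])   # hypothetical image feature
v = np.array([0.8, 0.1, 0.3, -0.7])    # hypothetical text feature
bu, bv = binarize(u), binarize(v)

real_dist = float(np.linalg.norm(u - v))    # distance in real number space
ham_dist = hamming(bu, bv)                  # distance in Hamming space
quant_err = float(np.linalg.norm(u - bu))   # gap between u and its binary code
```

The second coordinate sits near zero in both vectors, so a small real-valued difference flips a whole bit. For codes in {-1, +1} of length K, the Hamming distance also equals (K - b1·b2) / 2, which is why many hashing losses are written in terms of inner products.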

5.
IEEE Trans Pattern Anal Mach Intell ; 46(7): 5080-5091, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38315604

ABSTRACT

Tensor spectral clustering (TSC) is an emerging approach that explores multi-wise similarities to boost learning. However, two key challenges have yet to be well addressed in existing TSC methods: (1) the construction and storage of high-order affinity tensors to encode the multi-wise similarities are memory-intensive and hamper their applicability, and (2) they mostly employ a two-stage approach that integrates multiple affinity tensors of different orders to learn a consensus tensor spectral embedding, often leading to a suboptimal clustering result. To this end, this paper proposes a tensor spectral clustering network (TSC-Net) to achieve one-stage learning of a consensus tensor spectral embedding while reducing the memory cost. TSC-Net employs a deep neural network that learns to map the input samples to the consensus tensor spectral embedding, guided by a TSC objective with multiple affinity tensors. It uses stochastic optimization to calculate only a small part of the affinity tensors at a time, which avoids loading the whole affinity tensors for computation and significantly reduces the memory cost. By using an ensemble of multiple affinity tensors, the TSC can dramatically improve clustering performance. Empirical studies on benchmark datasets demonstrate that TSC-Net outperforms recent baseline methods.

6.
Article in English | MEDLINE | ID: mdl-38289837

ABSTRACT

Partial multilabel learning (PML) addresses the issue of noisy supervision, in which each training instance is given an overcomplete set of candidate labels of which only a subset is valid. Using label enhancement techniques, researchers have computed the probability of a label being ground truth. However, enhancing labels in the noisy label space makes it impossible for the existing partial multilabel label enhancement methods to achieve satisfactory results. Besides, few methods simultaneously address the ambiguity problem, the redundancy of the feature space, and the efficiency of the model in PML. To address these issues, this article presents a novel joint partial multilabel framework using broad learning systems (namely BLS-PML) with three innovative mechanisms: 1) a trustworthy label space is reconstructed through a novel label enhancement method to avoid the bias caused by noisy labels; 2) a low-dimensional feature space is obtained by a confidence-based dimensionality reduction method to reduce the effect of redundancy in the feature space; and 3) a noise-tolerant BLS is proposed by adding a dimensionality reduction layer and a trustworthy label layer to deal with the PML problem. We evaluated it on six real-world and seven synthetic datasets, using eight state-of-the-art partial multilabel algorithms as baselines and six evaluation metrics. Out of 144 experimental scenarios, our method significantly outperforms the baselines by about 80%, demonstrating its robustness and effectiveness in handling partial multilabel tasks.

7.
IEEE Trans Pattern Anal Mach Intell ; 46(5): 3637-3652, 2024 May.
Article in English | MEDLINE | ID: mdl-38145535

ABSTRACT

In multi-view environments, observations are often missing owing to limitations of the observation process. Most current representation learning methods struggle to explore complete information by lacking either cross-generative via simply filling in missing view data, or solidative via inferring a consistent representation among the existing views. To address this problem, we propose a deep generative model to learn a complete generative latent representation, namely Complete Multi-view Variational Auto-Encoders (CMVAE), which models the generation of the multiple views from a complete latent variable represented by a mixture of Gaussian distributions. Thus, the missing view can be fully characterized by the latent variables and is resolved by estimating its posterior distribution. Accordingly, a novel variational lower bound is introduced to integrate view-invariant information into posterior inference to enhance the solidative of the learned latent representation. The intrinsic correlations between views are mined to seek cross-view generality, and information leading to missing views is fused by view weights to reach solidity. Benchmark experimental results in clustering, classification, and cross-view image generation tasks demonstrate the superiority of CMVAE, while time complexity and parameter sensitivity analyses illustrate its efficiency and robustness. Additionally, an application to bioinformatics data exemplifies its practical significance.

8.
Article in English | MEDLINE | ID: mdl-37566497

ABSTRACT

Mounting evidence shows that Alzheimer's disease (AD) manifests as dysfunction of the brain network well before the onset of clinical symptoms, making early diagnosis possible. Current brain network analyses treat high-dimensional network data as a regular matrix or vector, which destroys the essential network topology and thereby seriously affects diagnostic accuracy. In this context, harmonic waves provide a solid theoretical background for exploring brain network topology. However, harmonic waves were originally intended to discover neurological disease propagation patterns in the brain, which makes it difficult to accommodate brain disease diagnosis with high heterogeneity. To address this challenge, this article proposes a network manifold harmonic discriminant analysis (MHDA) method for accurately detecting AD. Each brain network is regarded as an instance drawn on a Stiefel manifold. Every instance is represented by a set of orthonormal eigenvectors (i.e., harmonic waves) derived from its Laplacian matrix, which fully respects the topological structure of the brain network. An MHDA method within the Stiefel space is proposed to identify the group-dependent common harmonic waves, which can be used as group-specific references for downstream analyses. Extensive experiments are conducted to demonstrate the effectiveness of the proposed method in stratifying cognitively normal (CN) controls, mild cognitive impairment (MCI), and AD.
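The representation step described in this abstract, orthonormal eigenvectors of a network's Laplacian matrix, can be sketched as follows. This covers only that step under simple assumptions (an unnormalized Laplacian L = D - A and a hypothetical 4-node path graph standing in for a brain network); it is not the MHDA method.

```python
import numpy as np

def harmonic_waves(A, k):
    """Return the k lowest-frequency eigenpairs of the graph Laplacian L = D - A."""
    A = np.asarray(A, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    # eigh handles symmetric matrices: eigenvalues ascend, eigenvectors are orthonormal.
    eigvals, eigvecs = np.linalg.eigh(L)
    return eigvals[:k], eigvecs[:, :k]

# Hypothetical network: path graph 0-1-2-3 (symmetric adjacency matrix).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
vals, U = harmonic_waves(A, k=2)
```

Because the columns of `U` satisfy U^T U = I, the matrix is a point on a Stiefel manifold, which is why the abstract treats each network as an instance drawn on that manifold.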

9.
Front Nutr ; 10: 1060226, 2023.
Article in English | MEDLINE | ID: mdl-37025617

ABSTRACT

Background: Cardiovascular diseases (CVDs) have been the major cause of mortality in type 2 diabetes. However, new approaches are still warranted since current diabetic medications, which focus mainly on glycemic control, do not effectively lower the cardiovascular mortality rate in diabetic patients. Protocatechuic acid (PCA) is a phenolic acid widely distributed in garlic, onion, cauliflower and other plant-based foods. Given the anti-oxidative effects of PCA in vitro, we hypothesized that PCA would also have direct beneficial effects on endothelial function in addition to the systemic effects on vascular health demonstrated by previous studies. Methods and results: Since IL-1β is the major pathological contributor to endothelial dysfunction in diabetes, the anti-inflammatory effects of PCA specifically on endothelial cells were further verified using an IL-1β-induced inflammation model. Direct incubation of db/db mouse aortas with a physiological concentration of PCA significantly ameliorated the impairment of endothelium-dependent relaxation, as well as the reactive oxygen species overproduction mediated by diabetes. In addition to its well-studied anti-oxidative activity, PCA demonstrated strong anti-inflammatory effects by suppressing the pro-inflammatory cytokines MCP1, VCAM1 and ICAM1, as well as increasing the phosphorylation of eNOS and Akt in the inflammatory endothelial cell model induced by IL-1β, the key player in diabetic endothelial dysfunction. Upon blocking of Akt phosphorylation, p-eNOS/eNOS remained low and the inhibition of pro-inflammatory cytokines by PCA ceased. Conclusion: PCA protects vascular endothelial function against inflammation through the Akt/eNOS pathway, suggesting that daily intake of PCA may be encouraged for diabetic patients.

10.
Article in English | MEDLINE | ID: mdl-37021983

ABSTRACT

The scene classification of remote sensing (RS) images plays an essential role in the RS community, aiming to assign semantics to different RS scenes. With the increase in the spatial resolution of RS images, high-resolution RS (HRRS) image scene classification has become a challenging task because the contents within HRRS images are diverse in type, various in scale, and massive in volume. Recently, deep convolutional neural networks (DCNNs) have provided promising results for HRRS scene classification. Most of them regard HRRS scene classification tasks as single-label problems. In this way, the semantics represented by the manual annotation decide the final classification results directly. Although this is feasible, the various semantics hidden in HRRS images are ignored, resulting in inaccurate decisions. To overcome this limitation, we propose a semantic-aware graph network (SAGN) for HRRS images. SAGN consists of a dense feature pyramid network (DFPN), an adaptive semantic analysis module (ASAM), a dynamic graph feature update module, and a scene decision module (SDM). Their functions are, respectively, to extract the multi-scale information, mine the various semantics, exploit the unstructured relations between diverse semantics, and make the decision for HRRS scenes. Instead of transforming single-label problems into multi-label issues, our SAGN elaborates the proper methods to make full use of the diverse semantics hidden in HRRS images to accomplish scene classification tasks. Extensive experiments are conducted on three popular HRRS scene data sets. Experimental results show the effectiveness of the proposed SAGN.

11.
Article in English | MEDLINE | ID: mdl-37079407

ABSTRACT

Quality prediction is beneficial to intelligent inspection, advanced process control, operation optimization, and product quality improvements of complex industrial processes. Most of the existing work assumes that training samples and testing samples follow similar data distributions. The assumption is, however, not true for practical multimode processes with dynamics. In practice, traditional approaches mostly establish a prediction model using the samples from the principal operating mode (POM), for which samples are abundant. The model is inapplicable to other modes with only a few samples. In view of this, this article proposes a novel dynamic latent variable (DLV)-based transfer learning approach, called transfer DLV regression (TDLVR), for quality prediction of multimode processes with dynamics. The proposed TDLVR can not only derive the dynamics between process variables and quality variables in the POM but also extract the co-dynamic variations among process variables between the POM and the new mode. This can effectively overcome the marginal distribution discrepancy of the data and enrich the information of the new mode. To make full use of the available labeled samples from the new mode, an error compensation mechanism is incorporated into the established TDLVR, termed compensated TDLVR (CTDLVR), to adapt to the conditional distribution discrepancy. Empirical studies show the efficacy of the proposed TDLVR and CTDLVR methods in several case studies, including numerical simulation examples and two real industrial process examples.

12.
Cells ; 12(4)2023 02 19.
Article in English | MEDLINE | ID: mdl-36831329

ABSTRACT

Progress has been made in identifying stem cell aging as a pathological manifestation of a variety of diseases, including obesity. Adipose stem cells (ASCs) play a core role in adipocyte turnover, which maintains tissue homeostasis. Given aberrant lineage determination as a feature of stem cell aging, failure in adipogenesis is a culprit of adipose hypertrophy, resulting in adiposopathy and related complications. In this review, we elucidate how ASCs fail to enter the adipogenic lineage, with a specific focus on extracellular signaling pathways, epigenetic drift, metabolic reprogramming, and mechanical stretch. Nonetheless, such detrimental alterations can be reversed by guiding ASCs towards adipogenesis. Considering the pathological role of ASC aging in obesity, targeting adipogenesis as an anti-obesity treatment will be a key area of future research, and a strategy to rejuvenate tissue stem cells will be capable of alleviating metabolic syndrome.


Subject(s)
Adipocytes , Adipose Tissue , Humans , Adipose Tissue/metabolism , Adipocytes/metabolism , Adipogenesis , Stem Cells/metabolism , Aging , Obesity/metabolism
13.
IEEE Trans Neural Netw Learn Syst ; 34(2): 867-881, 2023 02.
Article in English | MEDLINE | ID: mdl-34403349

ABSTRACT

Single sample per person face recognition (SSPP FR) is one of the most challenging problems in FR due to the extreme lack of enrolment data. To date, the most popular SSPP FR methods are the generic learning methods, which recognize query face images based on the so-called prototype plus variation (i.e., P+V) model. However, the classic P+V model suffers from two major limitations: 1) it linearly combines the prototype and variation images in the observational pixel-spatial space and cannot generalize to multiple nonlinear variations, e.g., poses, which are common in face images and 2) it would be severely impaired once the enrolment face images are contaminated by nuisance variations. To address the two limitations, it is desirable to disentangle the prototype and variation in a latent feature space and to manipulate the images in a semantic manner. To this end, we propose a novel disentangled prototype plus variation model, dubbed DisP+V, which consists of an encoder-decoder generator and two discriminators. The generator and discriminators play two adversarial games such that the generator nonlinearly encodes the images into a latent semantic space, where the more discriminative prototype feature and the less discriminative variation feature are disentangled. Meanwhile, the prototype and variation features can guide the generator to generate an identity-preserved prototype and the corresponding variation, respectively. Experiments on various real-world face datasets demonstrate the superiority of our DisP+V model over the classic P+V model for SSPP FR. Furthermore, DisP+V demonstrates its unique characteristics in both prototype recovery and face editing/interpolation.


Subject(s)
Algorithms , Neural Networks, Computer , Humans , Face , Pattern Recognition, Automated/methods
14.
IEEE Trans Neural Netw Learn Syst ; 34(10): 7541-7554, 2023 Oct.
Article in English | MEDLINE | ID: mdl-35120009

ABSTRACT

Recent weakly supervised semantic segmentation methods generate pseudolabels to recover the lost position information in weak labels for training the segmentation network. Unfortunately, those pseudolabels often contain mislabeled regions and inaccurate boundaries due to the incomplete recovery of position information. It turns out that the result of semantic segmentation becomes determinate to a certain degree. In this article, we decompose the position information into two components: high-level semantic information and low-level physical information, and develop a componentwise approach to recover each component independently. Specifically, we propose a simple yet effective pseudolabel updating mechanism to iteratively correct mislabeled regions inside objects to precisely refine high-level semantic information. To reconstruct low-level physical information, we utilize a customized superpixel-based random walk mechanism to trim the boundaries. Finally, we design a novel network architecture, namely, a dual-feedback network (DFN), to integrate the two mechanisms into a unified model. Experiments on benchmark datasets show that DFN outperforms the existing state-of-the-art methods in terms of mean intersection-over-union (mIoU).
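For readers unfamiliar with the metric named at the end of this abstract, here is a minimal sketch of mean intersection-over-union on integer label maps. The arrays are made up for illustration, and skipping classes that appear in neither map is one common convention rather than a detail taken from the paper.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Average per-class IoU over classes present in pred or target."""
    ious = []
    for c in range(num_classes):
        p, t = pred == c, target == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class absent from both maps, so it is skipped
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))

# Hypothetical 2x3 label maps with two classes.
pred = np.array([[0, 0, 1],
                 [1, 1, 1]])
target = np.array([[0, 1, 1],
                   [1, 1, 0]])
score = mean_iou(pred, target, num_classes=2)
```

Here class 0 has IoU 1/3 and class 1 has IoU 3/5, so the mean is their average. Mislabeled regions and inaccurate boundaries both lower the intersection relative to the union, which is why the metric is sensitive to the two error types the abstract targets.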

15.
IEEE Trans Cybern ; 53(6): 3688-3701, 2023 Jun.
Article in English | MEDLINE | ID: mdl-35427226

ABSTRACT

Reversible data hiding in ciphertext has potential applications for privacy protection and for transmitting extra data in a cloud environment. For instance, an original plain-text image can be recovered from the encrypted image generated after data embedding, while the embedded data can be extracted before or after decryption. However, homomorphic processing can hardly be applied to an encrypted image with hidden data to generate the desired image. This is partly because the image content may be changed by preprocessing and/or data embedding. Even if the corresponding plain-text pixel values are kept unchanged by lossless data hiding, the hidden data will be destroyed by outer processing. To address this issue, a lossless data hiding method called random element substitution (RES) is proposed for the Paillier cryptosystem, which substitutes the to-be-hidden bits for the random element of a cipher value. Moreover, the RES method is combined with another preprocessing-free algorithm to generate two schemes for lossless data hiding in encrypted images. With either scheme, a processed image will be obtained after the encrypted image undergoes processing in the homomorphic encrypted domain. Besides retrieving a part of the hidden data without image decryption, the data hidden with the RES method can be extracted after decryption, even after some processing has been conducted on encrypted images. The experimental results show the efficacy and superior performance of the proposed schemes.
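The property that RES relies on can be seen in a toy Paillier implementation: the random element r changes the ciphertext but not the decrypted value, and multiplying ciphertexts adds plaintexts. The sketch below uses tiny, insecure primes and hypothetical function names, fixes g = n + 1, and does not reproduce the RES embedding or extraction procedure. It requires Python 3.8 or later for the modular inverse via `pow`.

```python
from math import gcd

def keygen(p, q):
    """Toy Paillier key generation with generator g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)   # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)                           # valid because g = n + 1
    return n, lam, mu

def encrypt(m, r, n):
    """c = g^m * r^n mod n^2, where r is the random element."""
    n2 = n * n
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(c, n, lam, mu):
    """m = L(c^lam mod n^2) * mu mod n, with L(x) = (x - 1) // n."""
    n2 = n * n
    return ((pow(c, lam, n2) - 1) // n * mu) % n

n, lam, mu = keygen(11, 13)       # toy primes, insecure, for illustration only
c1 = encrypt(42, r=5, n=n)
c2 = encrypt(42, r=7, n=n)        # same plaintext, different random element
c_sum = (c1 * encrypt(17, r=3, n=n)) % (n * n)   # homomorphic addition
```

Since decryption ignores r, the choice of r is free capacity from the plaintext's point of view, which gives some intuition for why substituting data-dependent values for the random element can leave the pixel values untouched.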

16.
IEEE Trans Neural Netw Learn Syst ; 34(10): 7621-7634, 2023 Oct.
Article in English | MEDLINE | ID: mdl-35130173

ABSTRACT

This work addresses unsupervised partial domain adaptation (PDA), in which classes in the target domain are a subset of the source domain. The key challenges of PDA are how to leverage source samples in the shared classes to promote positive transfer and filter out the irrelevant source samples to mitigate negative transfer. Existing PDA methods based on adversarial DA do not consider the loss of class discriminative representation. To this end, this article proposes a contrastive learning-assisted alignment (CLA) approach for PDA to jointly align distributions across domains for better adaptation and to reweight source instances to reduce the contribution of outlier instances. A contrastive learning-assisted conditional alignment (CLCA) strategy is presented for distribution alignment. CLCA first exploits contrastive losses to discover the class discriminative information in both domains. It then employs a contrastive loss to match the clusters across the two domains based on adversarial domain learning. In this respect, CLCA attempts to reduce the domain discrepancy by matching the class-conditional and marginal distributions. Moreover, a new reweighting scheme is developed to improve the quality of weights estimation, which explores information from both the source and the target domains. Empirical results on several benchmark datasets demonstrate that the proposed CLA outperforms the existing state-of-the-art PDA methods.

17.
IEEE Trans Neural Netw Learn Syst ; 34(9): 6530-6544, 2023 Sep.
Article in English | MEDLINE | ID: mdl-36094993

ABSTRACT

Heterogeneous attribute data composed of attributes with different types of values are quite common in a variety of real-world applications. As data annotation is usually expensive, clustering has provided a promising way of processing unlabeled data, where the adopted similarity measure plays a key role in determining the clustering accuracy. However, it is very challenging to appropriately define the similarity between data objects with heterogeneous attributes because the values of heterogeneous attributes generally have very different characteristics. Specifically, numerical attributes have quantitative values, while categorical attributes have qualitative values. Furthermore, categorical attributes can be categorized into nominal and ordinal ones according to the order information of their values. To circumvent the awkward gap among the heterogeneous attributes, this article proposes a new dissimilarity metric for cluster analysis of such data. We first study the connections among the heterogeneous attributes and build graph representations for them. Then, a metric is proposed, which computes the dissimilarities between attribute values under the guidance of the graph structures. Finally, we develop a new k-means-type clustering algorithm associated with this proposed metric. It turns out that the proposed method is competent to perform cluster analysis of datasets composed of an arbitrary combination of numerical, nominal, and ordinal attributes. Experimental results show its efficacy in comparison with its counterparts.

18.
Curr Probl Cardiol ; 48(1): 101380, 2023 Jan.
Article in English | MEDLINE | ID: mdl-36031015

ABSTRACT

Immune checkpoint inhibitors (ICI) have known associations with cardiotoxicity. However, a representative quantification of the adverse cardiovascular events and cardiovascular attendances amongst Asian users of ICI has been lacking. This retrospective cohort study identified all ICI users in Hong Kong, China, between 2013 and 2021. All patients were followed up until the end of 2021 for the primary outcome of major adverse cardiovascular event (MACE; a composite of cardiovascular mortality, myocardial infarction, heart failure, and stroke). Patients with prior diagnosis of any component of MACE were excluded from all MACE analyses. In total, 4324 patients were analyzed (2905 (67.2%) males; median age 63.5 years old (interquartile range 55.4-70.7 years old); median follow-up 1.0 year (interquartile range 0.4-2.3 years)), of whom 153 were excluded from MACE analyses due to prior events. MACE occurred in 116 (2.8%) with an incidence rate (IR) of 1.7 [95% confidence interval: 1.4, 2.0] events per 100 patient-years; IR was higher within the first year of follow-up (2.9 [2.3, 3.5] events per 100 patient-years). Cardiovascular hospitalization(s) occurred in 188 (4.4%) with 254 episodes (0.5% of all episodes) and 1555 days of hospitalization (1.3% of all hospitalized days), for whom the IR of cardiovascular hospitalization was 5.6 [4.6, 6.9] episodes per 100 person-years with 52.9 [39.8, 70.3] days' stay per 100 person-years. Amongst Asian users of ICI, MACE was uncommon, and a small proportion of hospitalizations were cardiovascular in nature. Most MACE and cardiovascular hospitalizations occurred during the first year after initiating ICI.
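The percentages reported in this abstract can be checked from the counts it gives. The short sketch below reproduces two of them and shows the general form of an incidence rate per 100 patient-years. The abstract does not report the total patient-years, so the rate function is shown without study data and any inputs passed to it would be hypothetical.

```python
total_patients = 4324
males = 2905
excluded_prior_mace = 153
mace_events = 116

# Patients remaining in the MACE analyses after excluding prior events.
mace_cohort = total_patients - excluded_prior_mace

male_pct = round(100 * males / total_patients, 1)      # reported as 67.2%
mace_pct = round(100 * mace_events / mace_cohort, 1)   # reported as 2.8%

def incidence_rate_per_100(events, person_years):
    """Events per 100 person-years of follow-up."""
    return 100.0 * events / person_years
```

Note that the MACE percentage uses the reduced cohort of 4171 patients as its denominator, not the full 4324, which is consistent with the exclusion rule stated in the abstract.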


Subject(s)
Cardiovascular Diseases , Heart Failure , Myocardial Infarction , Male , Humans , Middle Aged , Aged , Female , Immune Checkpoint Inhibitors/adverse effects , Retrospective Studies , Hospitalization , Heart Failure/epidemiology , Cardiovascular Diseases/epidemiology , Cardiovascular Diseases/etiology
19.
IEEE Trans Pattern Anal Mach Intell ; 45(4): 4812-4825, 2023 04.
Article in English | MEDLINE | ID: mdl-35921338

ABSTRACT

For data with a long-tailed distribution, existing classification models often learn overwhelmingly from the head classes while ignoring the tail classes, resulting in poor generalization capability. To address this problem, we propose a new approach in which a key point sensitive (KPS) loss is presented to strongly regularize the key points and thereby improve the generalization performance of the classification model. Meanwhile, in order to improve the performance on tail classes, the proposed KPS loss also assigns relatively large margins to tail classes. Furthermore, we propose a gradient adjustment (GA) optimization strategy to re-balance the gradients of positive and negative samples for each class. A gradient analysis of the loss function shows that the tail classes always receive negative signals during training, which biases the tail predictions towards the head. The proposed GA strategy can circumvent excessive negative signals on tail classes and further improve the overall classification accuracy. Extensive experiments conducted on long-tailed benchmarks show that the proposed method is capable of significantly improving the classification accuracy of the model in tail classes while maintaining competent performance in head classes.


Subject(s)
Algorithms , Learning
20.
Article in English | MEDLINE | ID: mdl-36107889

ABSTRACT

Despite the great success of the existing work in fine-grained visual categorization (FGVC), there are still several unsolved challenges, e.g., poor interpretation and vagueness contribution. To circumvent these drawbacks, motivated by the hypersphere embedding method, we propose a discriminative suprasphere embedding (DSE) framework, which can provide intuitive geometric interpretation and effectively extract discriminative features. Specifically, DSE consists of three modules. The first module is a suprasphere embedding (SE) block, which learns discriminative information by emphasizing weight and phase. The second module is a phase activation map (PAM) used to analyze the contribution of local descriptors to the suprasphere feature representation, which uniformly highlights the object region and exhibits remarkable object localization capability. The last module is a class contribution map (CCM), which quantitatively analyzes the network classification decision and provides insight into the domain knowledge about classified objects. Comprehensive experiments on three benchmark datasets demonstrate the effectiveness of our proposed method in comparison with state-of-the-art methods.
