Search | VHL Regional Portal

1.

Automatic offline-capable smartphone paper-based microfluidic device for efficient biomarker detection of Alzheimer's disease.

Duan, Sixuan; Cai, Tianyu; Liu, Fuyuan; Li, Yifan; Yuan, Hang; Yuan, Wenwen; Huang, Kaizhu; Hoettges, Kai; Chen, Min; Lim, Eng Gee; Zhao, Chun; Song, Pengfei.

Anal Chim Acta ; 1308: 342575, 2024 Jun 15.

Article in English | MEDLINE | ID: mdl-38740448

ABSTRACT

BACKGROUND: Alzheimer's disease (AD) is a prevalent neurodegenerative disease with no effective treatment. Efficient and rapid detection plays a crucial role in mitigating and managing AD progression. Deep learning-assisted smartphone-based microfluidic paper analysis devices (µPADs) offer the advantages of low cost, good sensitivity, and rapid detection, providing a strategic pathway to address large-scale disease screening in resource-limited areas. However, existing smartphone-based detection platforms usually rely on large devices or cloud servers for data transfer and processing. Additionally, the implementation of automated colorimetric enzyme-linked immunoassay (c-ELISA) on µPADs can further facilitate the realization of smartphone µPADs platforms for efficient disease detection. RESULTS: This paper introduces a new deep learning-assisted offline smartphone platform for early AD screening, offering rapid disease detection in low-resource areas. The proposed platform features a simple mechanical rotating structure controlled by a smartphone, enabling fully automated c-ELISA on µPADs. Our platform successfully applied sandwich c-ELISA for detecting the ß-amyloid peptide 1-42 (Aß 1-42, a crucial AD biomarker) and demonstrated its efficacy in 38 artificial plasma samples (healthy: 19, unhealthy: 19, N = 6). Moreover, we employed the YOLOv5 deep learning model and achieved an impressive 97 % accuracy on a dataset of 1824 images, which is 10.16 % higher than the traditional method of curve-fitting results. The trained YOLOv5 model was seamlessly integrated into the smartphone using the NCNN (Tencent's Neural Network Inference Framework), enabling deep learning-assisted offline detection. A user-friendly smartphone application was developed to control the entire process, realizing a streamlined "samples in, answers out" approach. SIGNIFICANCE: This deep learning-assisted, low-cost, user-friendly, highly stable, and rapid-response automated offline smartphone-based detection platform represents a good advancement in point-of-care testing (POCT). Moreover, our platform provides a feasible approach for efficient AD detection by examining the level of Aß 1-42, particularly in areas with low resources and limited communication infrastructure.

Subject(s)

Alzheimer Disease , Amyloid beta-Peptides , Biomarkers , Enzyme-Linked Immunosorbent Assay , Paper , Smartphone , Alzheimer Disease/diagnosis , Alzheimer Disease/blood , Humans , Biomarkers/blood , Biomarkers/analysis , Amyloid beta-Peptides/analysis , Amyloid beta-Peptides/blood , Peptide Fragments/blood , Peptide Fragments/analysis , Lab-On-A-Chip Devices , Deep Learning , Automation , Microfluidic Analytical Techniques/instrumentation

2.

Instance-Specific Model Perturbation Improves Generalized Zero-Shot Learning.

Yang, Guanyu; Huang, Kaizhu; Zhang, Rui; Yang, Xi.

Neural Comput ; 36(5): 936-962, 2024 Apr 23.

Article in English | MEDLINE | ID: mdl-38457762

ABSTRACT

Zero-shot learning (ZSL) refers to the design of predictive functions on new classes (unseen classes) of data that have never been seen during training. In a more practical scenario, generalized zero-shot learning (GZSL) requires predicting both seen and unseen classes accurately. In the absence of target samples, many GZSL models may overfit training data and are inclined to predict individuals as categories that have been seen in training. To alleviate this problem, we develop a parameter-wise adversarial training process that promotes robust recognition of seen classes while designing during the test a novel model perturbation mechanism to ensure sufficient sensitivity to unseen classes. Concretely, adversarial perturbation is conducted on the model to obtain instance-specific parameters so that predictions can be biased to unseen classes in the test. Meanwhile, the robust training encourages the model robustness, leading to nearly unaffected prediction for seen classes. Moreover, perturbations in the parameter space, computed from multiple individuals simultaneously, can be used to avoid the effect of perturbations that are too extreme and ruin the predictions. Comparison results on four benchmark ZSL data sets show the effective improvement that the proposed framework made on zero-shot methods with learned metrics.

3.

EgPDE-Net: Building Continuous Neural Networks for Time Series Prediction With Exogenous Variables.

Gao, Penglei; Yang, Xi; Zhang, Rui; Guo, Ping; Goulermas, John Y; Huang, Kaizhu.

IEEE Trans Cybern ; PP2024 Feb 28.

Article in English | MEDLINE | ID: mdl-38416628

ABSTRACT

While exogenous variables have a major impact on performance improvement in time series analysis, interseries correlation and time dependence among them are rarely considered in the present continuous methods. The dynamical systems of multivariate time series could be modeled with complex unknown partial differential equations (PDEs) which play a prominent role in many disciplines of science and engineering. In this article, we propose a continuous-time model for arbitrary-step prediction to learn an unknown PDE system in multivariate time series whose governing equations are parameterized by self-attention and gated recurrent neural networks. The proposed model, exogenous-guided PDE network (EgPDE-Net), takes account of the relationships among the exogenous variables and their effects on the target series. Importantly, the model can be reduced into a regularized ordinary differential equation (ODE) problem with specially designed regularization guidance, which makes the PDE problem tractable to obtain numerical solutions and feasible to predict multiple future values of the target series at arbitrary time points. Extensive experiments demonstrate that our proposed model could achieve competitive accuracy over strong baselines: on average, it outperforms the best baseline by reducing 9.85% on RMSE and 13.98% on MAE for arbitrary-step prediction.

4.

Perturbation diversity certificates robust generalization.

Qian, Zhuang; Zhang, Shufei; Huang, Kaizhu; Wang, Qiufeng; Yi, Xinping; Gu, Bin; Xiong, Huan.

Neural Netw ; 172: 106117, 2024 Apr.

Article in English | MEDLINE | ID: mdl-38232423

ABSTRACT

Whilst adversarial training has been proven to be one most effective defending method against adversarial attacks for deep neural networks, it suffers from over-fitting on training adversarial data and thus may not guarantee the robust generalization. This may result from the fact that the conventional adversarial training methods generate adversarial perturbations usually in a supervised way so that the resulting adversarial examples are highly biased towards the decision boundary, leading to an inhomogeneous data distribution. To mitigate this limitation, we propose to generate adversarial examples from a perturbation diversity perspective. Specifically, the generated perturbed samples are not only adversarial but also diverse so as to certify robust generalization and significant robustness improvement through a homogeneous data distribution. We provide theoretical and empirical analysis, establishing a foundation to support the proposed method. As a major contribution, we prove that promoting perturbations diversity can lead to a better robust generalization bound. To verify our methods' effectiveness, we conduct extensive experiments over different datasets (e.g., CIFAR-10, CIFAR-100, SVHN) with different adversarial attacks (e.g., PGD, CW). Experimental results show that our method outperforms other state-of-the-art (e.g., PGD and Feature Scattering) in robust generalization performance.

Subject(s)

Generalization, Psychological , Neural Networks, Computer

5.

Rebalanced Zero-Shot Learning.

Ye, Zihan; Yang, Guanyu; Jin, Xiaobo; Liu, Youfa; Huang, Kaizhu.

IEEE Trans Image Process ; 32: 4185-4198, 2023.

Article in English | MEDLINE | ID: mdl-37467099

ABSTRACT

Zero-shot learning (ZSL) aims to identify unseen classes with zero samples during training. Broadly speaking, present ZSL methods usually adopt class-level semantic labels and compare them with instance-level semantic predictions to infer unseen classes. However, we find that such existing models mostly produce imbalanced semantic predictions, i.e. these models could perform precisely for some semantics, but may not for others. To address the drawback, we aim to introduce an imbalanced learning framework into ZSL. However, we find that imbalanced ZSL has two unique challenges: (1) Its imbalanced predictions are highly correlated with the value of semantic labels rather than the number of samples as typically considered in the traditional imbalanced learning; (2) Different semantics follow quite different error distributions between classes. To mitigate these issues, we first formalize ZSL as an imbalanced regression problem which offers empirical evidences to interpret how semantic labels lead to imbalanced semantic predictions. We then propose a re-weighted loss termed Re-balanced Mean-Squared Error (ReMSE), which tracks the mean and variance of error distributions, thus ensuring rebalanced learning across classes. As a major contribution, we conduct a series of analyses showing that ReMSE is theoretically well established. Extensive experiments demonstrate that the proposed method effectively alleviates the imbalance in semantic prediction and outperforms many state-of-the-art ZSL methods.

6.

Machine learning and 3D bioprinting.

Sun, Jie; Yao, Kai; An, Jia; Jing, Linzhi; Huang, Kaizhu; Huang, Dejian.

Int J Bioprint ; 9(4): 717, 2023.

Article in English | MEDLINE | ID: mdl-37323491

ABSTRACT

48With the growing number of biomaterials and printing technologies, bioprinting has brought about tremendous potential to fabricate biomimetic architectures or living tissue constructs. To make bioprinting and bioprinted constructs more powerful, machine learning (ML) is introduced to optimize the relevant processes, applied materials, and mechanical/biological performances. The objectives of this work were to collate, analyze, categorize, and summarize published articles and papers pertaining to ML applications in bioprinting and their impact on bioprinted constructs, as well as the directions of potential development. From the available references, both traditional ML and deep learning (DL) have been applied to optimize the printing process, structural parameters, material properties, and biological/mechanical performance of bioprinted constructs. The former uses features extracted from image or numerical data as inputs in prediction model building, and the latter uses the image directly for segmentation or classification model building. All of these studies present advanced bioprinting with a stable and reliable printing process, desirable fiber/droplet diameter, and precise layer stacking, and also enhance the bioprinted constructs with better design and cell performance. The current challenges and outlooks in developing process-material-performance models are highlighted, which may pave the way for revolutionizing bioprinting technologies and bioprinted construct design.

7.

Mind the Gap: Alleviating Local Imbalance for Unsupervised Cross-Modality Medical Image Segmentation.

Su, Zixian; Yao, Kai; Yang, Xi; Wang, Qiufeng; Yan, Yuyao; Sun, Jie; Huang, Kaizhu.

IEEE J Biomed Health Inform ; 27(7): 3396-3407, 2023 Jul.

Article in English | MEDLINE | ID: mdl-37134027

ABSTRACT

Unsupervised cross-modality medical image adaptation aims to alleviate the severe domain gap between different imaging modalities without using the target domain label. A key in this campaign relies upon aligning the distributions of source and target domain. One common attempt is to enforce the global alignment between two domains, which, however, ignores the fatal local-imbalance domain gap problem, i.e., some local features with larger domain gap are harder to transfer. Recently, some methods conduct alignment focusing on local regions to improve the efficiency of model learning. While this operation may cause a deficiency of critical information from contexts. To tackle this limitation, we propose a novel strategy to alleviate the domain gap imbalance considering the characteristics of medical images, namely Global-Local Union Alignment. Specifically, a feature-disentanglement style-transfer module first synthesizes the target-like source images to reduce the global domain gap. Then, a local feature mask is integrated to reduce the 'inter-gap' for local features by prioritizing those discriminative features with larger domain gap. This combination of global and local alignment can precisely localize the crucial regions in segmentation target while preserving the overall semantic consistency. We conduct a series of experiments with two cross-modality adaptation tasks, i,e. cardiac substructure and abdominal multi-organ segmentation. Experimental results indicate that our method achieves state-of-the-art performance in both tasks.

Subject(s)

Heart , Semantics , Humans , Image Processing, Computer-Assisted

8.

Generalized image outpainting with U-transformer.

Gao, Penglei; Yang, Xi; Zhang, Rui; Goulermas, John Y; Geng, Yujie; Yan, Yuyao; Huang, Kaizhu.

Neural Netw ; 162: 1-10, 2023 May.

Article in English | MEDLINE | ID: mdl-36878166

ABSTRACT

In this paper, we develop a novel transformer-based generative adversarial neural network called U-Transformer for generalized image outpainting problems. Different from most present image outpainting methods conducting horizontal extrapolation, our generalized image outpainting could extrapolate visual context all-side around a given image with plausible structure and details even for complicated scenery, building, and art images. Specifically, we design a generator as an encoder-to-decoder structure embedded with the popular Swin Transformer blocks. As such, our novel neural network can better cope with image long-range dependencies which are crucially important for generalized image outpainting. We propose additionally a U-shaped structure and multi-view Temporal Spatial Predictor (TSP) module to reinforce image self-reconstruction as well as unknown-part prediction smoothly and realistically. By adjusting the predicting step in the TSP module in the testing stage, we can generate arbitrary outpainting size given the input sub-image. We experimentally demonstrate that our proposed method could produce visually appealing results for generalized image outpainting against the state-of-the-art image outpainting approaches.

Subject(s)

Image Processing, Computer-Assisted , Neural Networks, Computer

9.

Deep learning-assisted ultra-accurate smartphone testing of paper-based colorimetric ELISA assays.

Duan, Sixuan; Cai, Tianyu; Zhu, Jia; Yang, Xi; Lim, Eng Gee; Huang, Kaizhu; Hoettges, Kai; Zhang, Quan; Fu, Hao; Guo, Qiang; Liu, Xinyu; Yang, Zuming; Song, Pengfei.

Anal Chim Acta ; 1248: 340868, 2023 Apr 01.

Article in English | MEDLINE | ID: mdl-36813452

ABSTRACT

Smartphone has long been considered as one excellent platform for disease screening and diagnosis, especially when combined with microfluidic paper-based analytical devices (µPADs) that feature low cost, ease of use, and pump-free operations. In this paper, we report a deep learning-assisted smartphone platform for ultra-accurate testing of paper-based microfluidic colorimetric enzyme-linked immunosorbent assay (c-ELISA). Different from existing smartphone-based µPAD platforms, whose sensing reliability is suffered from uncontrolled ambient lighting conditions, our platform is able to eliminate those random lighting influences for enhanced sensing accuracy. We first constructed a dataset that contains c-ELISA results (n = 2048) of rabbit IgG as the model target on µPADs under eight controlled lighting conditions. Those images are then used to train four different mainstream deep learning algorithms. By training with these images, the deep learning algorithms can well eliminate the influences of lighting conditions. Among them, the GoogLeNet algorithm gives the highest accuracy (>97%) in quantitative rabbit IgG concentration classification/prediction, which also provides 4% higher area under curve (AUC) value than that of the traditional curve fitting results analysis method. In addition, we fully automate the whole sensing process and achieve the "image in, answer out" to maximize the convenience of the smartphone. A simple and user-friendly smartphone application has been developed that controls the whole process. This newly developed platform further enhances the sensing performance of µPADs for use by laypersons in low-resource areas and can be facilely adapted to the real disease protein biomarkers detection by c-ELISA on µPADs.

Subject(s)

Deep Learning , Microfluidic Analytical Techniques , Smartphone , Colorimetry , Reproducibility of Results , Enzyme-Linked Immunosorbent Assay , Immunoglobulin G , Paper

10.

FastAdaBelief: Improving Convergence Rate for Belief-Based Adaptive Optimizers by Exploiting Strong Convexity.

Zhou, Yangfan; Huang, Kaizhu; Cheng, Cheng; Wang, Xuguang; Hussain, Amir; Liu, Xin.

IEEE Trans Neural Netw Learn Syst ; 34(9): 6515-6529, 2023 Sep.

Article in English | MEDLINE | ID: mdl-35271450

ABSTRACT

AdaBelief, one of the current best optimizers, demonstrates superior generalization ability over the popular Adam algorithm by viewing the exponential moving average of observed gradients. AdaBelief is theoretically appealing in which it has a data-dependent O(âT) regret bound when objective functions are convex, where T is a time horizon. It remains, however, an open problem whether the convergence rate can be further improved without sacrificing its generalization ability. To this end, we make the first attempt in this work and design a novel optimization algorithm called FastAdaBelief that aims to exploit its strong convexity in order to achieve an even faster convergence rate. In particular, by adjusting the step size that better considers strong convexity and prevents fluctuation, our proposed FastAdaBelief demonstrates excellent generalization ability and superior convergence. As an important theoretical contribution, we prove that FastAdaBelief attains a data-dependent O(logT) regret bound, which is substantially lower than AdaBelief in strongly convex cases. On the empirical side, we validate our theoretical analysis with extensive experiments in scenarios of strong convexity and nonconvexity using three popular baseline models. Experimental results are very encouraging: FastAdaBelief converges the quickest in comparison to all mainstream algorithms while maintaining an excellent generalization ability, in cases of both strong convexity or nonconvexity. FastAdaBelief is, thus, posited as a new benchmark model for the research community.

11.

Multi-Scope Feature Extraction for Intracranial Aneurysm 3D Point Cloud Completion.

Ma, Wuwei; Yang, Xi; Wang, Qiufeng; Huang, Kaizhu; Huang, Xiaowei.

Cells ; 11(24)2022 12 17.

Article in English | MEDLINE | ID: mdl-36552872

ABSTRACT

3D point clouds are gradually becoming more widely used in the medical field, however, they are rarely used for 3D representation of intracranial vessels and aneurysms due to the time-consuming data reconstruction. In this paper, we simulate the incomplete intracranial vessels (including aneurysms) in the actual collection from different angles, then propose Multi-Scope Feature Extraction Network (MSENet) for Intracranial Aneurysm 3D Point Cloud Completion. MSENet adopts a multi-scope feature extraction encoder to extract the global features from the incomplete point cloud. This encoder utilizes different scopes to fuse the neighborhood information for each point fully. Then a folding-based decoder is applied to obtain the complete 3D shape. To enable the decoder to intuitively match the original geometric structure, we engage the original points coordinates input to perform residual linking. Finally, we merge and sample the complete but coarse point cloud from the decoder to obtain the final refined complete 3D point cloud shape. We conduct extensive experiments on both 3D intracranial aneurysm datasets and general 3D vision PCN datasets. The results demonstrate the effectiveness of the proposed method on three evaluation metrics compared to baseline: our model increases the F-score to 0.379 (+21.1%)/0.320 (+7.7%), reduces Chamfer Distance score to 0.998 (-33.8%)/0.974 (-6.4%), and reduces the Earth Mover's Distance to 2.750 (17.8%)/2.858 (-0.8%).

Subject(s)

Intracranial Aneurysm , Humans

12.

Learning Disentangled Graph Convolutional Networks Locally and Globally.

Guo, Jingwei; Huang, Kaizhu; Yi, Xinping; Zhang, Rui.

IEEE Trans Neural Netw Learn Syst ; PP2022 Aug 23.

Article in English | MEDLINE | ID: mdl-35969544

ABSTRACT

Graph convolutional networks (GCNs) emerge as the most successful learning models for graph-structured data. Despite their success, existing GCNs usually ignore the entangled latent factors typically arising in real-world graphs, which results in nonexplainable node representations. Even worse, while the emphasis has been placed on local graph information, the global knowledge of the entire graph is lost to a certain extent. In this work, to address these issues, we propose a novel framework for GCNs, termed LGD-GCN, taking advantage of both local and global information for disentangling node representations in the latent space. Specifically, we propose to represent a disentangled latent continuous space with a statistical mixture model, by leveraging neighborhood routing mechanism locally. From the latent space, various new graphs can then be disentangled and learned, to overall reflect the hidden structures with respect to different factors. On the one hand, a novel regularizer is designed to encourage interfactor diversity for model expressivity in the latent space. On the other hand, the factor-specific information is encoded globally via employing a message passing along these new graphs, in order to strengthen intrafactor consistency. Extensive evaluations on both synthetic and five benchmark datasets show that LGD-GCN brings significant performance gains over the recent competitive models in both disentangling and node classification. Particularly, LGD-GCN is able to outperform averagely the disentangled state-of-the-arts by 7.4% on social network datasets.

13.

Multi-Model Running Latency Optimization in an Edge Computing Paradigm.

Li, Peisong; Wang, Xinheng; Huang, Kaizhu; Huang, Yi; Li, Shancang; Iqbal, Muddesar.

Sensors (Basel) ; 22(16)2022 Aug 15.

Article in English | MEDLINE | ID: mdl-36015856

ABSTRACT

Recent advances in both lightweight deep learning algorithms and edge computing increasingly enable multiple model inference tasks to be conducted concurrently on resource-constrained edge devices, allowing us to achieve one goal collaboratively rather than getting high quality in each standalone task. However, the high overall running latency for performing multi-model inferences always negatively affects the real-time applications. To combat latency, the algorithms should be optimized to minimize the latency for multi-model deployment without compromising the safety-critical situation. This work focuses on the real-time task scheduling strategy for multi-model deployment and investigating the model inference using an open neural network exchange (ONNX) runtime engine. Then, an application deployment strategy is proposed based on the container technology and inference tasks are scheduled to different containers based on the scheduling strategies. Experimental results show that the proposed solution is able to significantly reduce the overall running latency in real-time applications.

Subject(s)

Neural Networks, Computer , Running , Algorithms

14.

Guest Editorial: Advances in Deep Learning for Clinical and Healthcare Applications.

Ieracitano, Cosimo; Morabito, Francesco Carlo; Squartini, Stefano; Huang, Kaizhu; Li, Xuelong; Mahmud, Mufti.

Cognit Comput ; : 1-3, 2022 Aug 17.

Article in English | MEDLINE | ID: mdl-35991007

15.

A Novel 3D Unsupervised Domain Adaptation Framework for Cross-Modality Medical Image Segmentation.

Yao, Kai; Su, Zixian; Huang, Kaizhu; Yang, Xi; Sun, Jie; Hussain, Amir; Coenen, Frans.

IEEE J Biomed Health Inform ; 26(10): 4976-4986, 2022 10.

Article in English | MEDLINE | ID: mdl-35324451

ABSTRACT

We consider the problem of volumetric (3D) unsupervised domain adaptation (UDA) in cross-modality medical image segmentation, aiming to perform segmentation on the unannotated target domain (e.g. MRI) with the help of labeled source domain (e.g. CT). Previous UDA methods in medical image analysis usually suffer from two challenges: 1) they focus on processing and analyzing data at 2D level only, thus missing semantic information from the depth level; 2) one-to-one mapping is adopted during the style-transfer process, leading to insufficient alignment in the target domain. Different from the existing methods, in our work, we conduct a first of its kind investigation on multi-style image translation for complete image alignment to alleviate the domain shift problem, and also introduce 3D segmentation in domain adaptation tasks to maintain semantic consistency at the depth level. In particular, we develop an unsupervised domain adaptation framework incorporating a novel quartet self-attention module to efficiently enhance relationships between widely separated features in spatial regions on a higher dimension, leading to a substantial improvement in segmentation accuracy in the unlabeled target domain. In two challenging cross-modality tasks, specifically brain structures and multi-organ abdominal segmentation, our model is shown to outperform current state-of-the-art methods by a significant margin, demonstrating its potential as a benchmark resource for the biomedical and health informatics research community.

Subject(s)

Abdomen , Magnetic Resonance Imaging , Brain/diagnostic imaging , Humans , Image Processing, Computer-Assisted/methods , Magnetic Resonance Imaging/methods

16.

Analyzing Cell-Scaffold Interaction through Unsupervised 3D Nuclei Segmentation.

Yao, Kai; Sun, Jie; Huang, Kaizhu; Jing, Linzhi; Liu, Hang; Huang, Dejian; Jude, Curran.

Int J Bioprint ; 8(1): 495, 2022.

Article in English | MEDLINE | ID: mdl-35187282

ABSTRACT

Fibrous scaffolds have been extensively used in three-dimensional (3D) cell culture systems to establish in vitro models in cell biology, tissue engineering, and drug screening. It is a common practice to characterize cell behaviors on such scaffolds using confocal laser scanning microscopy (CLSM). As a noninvasive technology, CLSM images can be utilized to describe cell-scaffold interaction under varied morphological features, biomaterial composition, and internal structure. Unfortunately, such information has not been fully translated and delivered to researchers due to the lack of effective cell segmentation methods. We developed herein an end-to-end model called Aligned Disentangled Generative Adversarial Network (AD-GAN) for 3D unsupervised nuclei segmentation of CLSM images. AD-GAN utilizes representation disentanglement to separate content representation (the underlying nuclei spatial structure) from style representation (the rendering of the structure) and align the disentangled content in the latent space. The CLSM images collected from fibrous scaffold-based culturing A549, 3T3, and HeLa cells were utilized for nuclei segmentation study. Compared with existing commercial methods such as Squassh and CellProfiler, our AD-GAN can effectively and efficiently distinguish nuclei with the preserved shape and location information. Building on such information, we can rapidly screen cell-scaffold interaction in terms of adhesion, migration and proliferation, so as to improve scaffold design.

17.

Manifold adversarial training for supervised and semi-supervised learning.

Zhang, Shufei; Huang, Kaizhu; Zhu, Jianke; Liu, Yang.

Neural Netw ; 140: 282-293, 2021 Aug.

Article in English | MEDLINE | ID: mdl-33839600

ABSTRACT

We propose a new regularization method for deep learning based on the manifold adversarial training (MAT). Unlike previous regularization and adversarial training methods, MAT further considers the local manifold of latent representations. Specifically, MAT manages to build an adversarial framework based on how the worst perturbation could affect the statistical manifold in the latent space rather than the output space. Particularly, a latent feature space with the Gaussian Mixture Model (GMM) is first derived in a deep neural network. We then define the smoothness by the largest variation of Gaussian mixtures when a local perturbation is given around the input data point. On one hand, the perturbations are added in the way that would rough the statistical manifold of the latent space the worst. On the other hand, the model is trained to promote the manifold smoothness the most in the latent space. Importantly, since the latent space is more informative than the output space, the proposed MAT can learn a more robust and compact data representation, leading to further performance improvement. The proposed MAT is important in that it can be considered as a superset of one recently-proposed discriminative feature learning approach called center loss. We conduct a series of experiments in both supervised and semi-supervised learning on four benchmark data sets, showing that the proposed MAT can achieve remarkable performance, much better than those of the state-of-the-art approaches. In addition, we present a series of visualization which could generate further understanding or explanation on adversarial examples.

Subject(s)

Supervised Machine Learning/standards , Benchmarking

18.

A Multipath Fusion Strategy Based Single Shot Detector.

Qu, Shuyi; Huang, Kaizhu; Hussain, Amir; Goulermas, Yannis.

Sensors (Basel) ; 21(4)2021 Feb 15.

Article in English | MEDLINE | ID: mdl-33671859

ABSTRACT

Object detection has wide applications in intelligent systems and sensor applications. Compared with two stage detectors, recent one stage counterparts are capable of running more efficiently with comparable accuracy, which satisfy the requirement of real-time processing. To further improve the accuracy of one stage single shot detector (SSD), we propose a novel Multi-Path fusion Single Shot Detector (MPSSD). Different from other feature fusion methods, we exploit the connection among different scale representations in a pyramid manner. We propose feature fusion module to generate new feature pyramids based on multiscale features in SSD, and these pyramids are sent to our pyramid aggregation module for generating final features. These enhanced features have both localization and semantics information, thus improving the detection performance with little computation cost. A series of experiments on three benchmark datasets PASCAL VOC2007, VOC2012, and MS COCO demonstrate that our approach outperforms many state-of-the-art detectors both qualitatively and quantitatively. In particular, for input images with size 512 × 512, our method attains mean Average Precision (mAP) of 81.8% on VOC2007 test, 80.3% on VOC2012 test, and 33.1% mAP on COCO test-dev 2015.

19.

Automated Social Text Annotation With Joint Multilabel Attention Networks.

Dong, Hang; Wang, Wei; Huang, Kaizhu; Coenen, Frans.

IEEE Trans Neural Netw Learn Syst ; 32(5): 2224-2238, 2021 05.

Article in English | MEDLINE | ID: mdl-32584774

ABSTRACT

Automated social text annotation is the task of suggesting a set of tags for shared documents on social media platforms. The automated annotation process can reduce users' cognitive overhead in tagging and improve tag management for better search, browsing, and recommendation of documents. It can be formulated as a multilabel classification problem. We propose a novel deep learning-based method for this problem and design an attention-based neural network with semantic-based regularization, which can mimic users' reading and annotation behavior to formulate better document representation, leveraging the semantic relations among labels. The network separately models the title and the content of each document and injects an explicit, title-guided attention mechanism into each sentence. To exploit the correlation among labels, we propose two semantic-based loss regularizers, i.e., similarity and subsumption, which enforce the output of the network to conform to label semantics. The model with the semantic-based loss regularizers is referred to as the joint multilabel attention network (JMAN). We conducted a comprehensive evaluation study and compared JMAN to the state-of-the-art baseline models, using four large, real-world social media data sets. In terms of F1 , JMAN significantly outperformed bidirectional gated recurrent unit (Bi-GRU) relatively by around 12.8%-78.6% and the hierarchical attention network (HAN) by around 3.9%-23.8%. The JMAN model demonstrates advantages in convergence and training speed. Further improvement of performance was observed against latent Dirichlet allocation (LDA) and support vector machine (SVM). When applying the semantic-based loss regularizers, the performance of HAN and Bi-GRU in terms of F1 was also boosted. It is also found that dynamic update of the label semantic matrices (JMANd) has the potential to further improve the performance of JMAN but at the cost of substantial memory and warrants further study.

20.

Compressing Deep Networks by Neuron Agglomerative Clustering.

Wang, Li-Na; Liu, Wenxue; Liu, Xiang; Zhong, Guoqiang; Roy, Partha Pratim; Dong, Junyu; Huang, Kaizhu.

Sensors (Basel) ; 20(21)2020 Oct 23.

Article in English | MEDLINE | ID: mdl-33114078

ABSTRACT

In recent years, deep learning models have achieved remarkable successes in various applications, such as pattern recognition, computer vision, and signal processing. However, high-performance deep architectures are often accompanied by a large storage space and long computational time, which make it difficult to fully exploit many deep neural networks (DNNs), especially in scenarios in which computing resources are limited. In this paper, to tackle this problem, we introduce a method for compressing the structure and parameters of DNNs based on neuron agglomerative clustering (NAC). Specifically, we utilize the agglomerative clustering algorithm to find similar neurons, while these similar neurons and the connections linked to them are then agglomerated together. Using NAC, the number of parameters and the storage space of DNNs are greatly reduced, without the support of an extra library or hardware. Extensive experiments demonstrate that NAC is very effective for the neuron agglomeration of both the fully connected and convolutional layers, which are common building blocks of DNNs, delivering similar or even higher network accuracy. Specifically, on the benchmark CIFAR-10 and CIFAR-100 datasets, using NAC to compress the parameters of the original VGGNet by 92.96% and 81.10%, respectively, the compact network obtained still outperforms the original networks.

Subject(s)

Cluster Analysis , Data Compression , Neural Networks, Computer , Neurons , Algorithms

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

ABSTRACT

Subject(s)

SEND TO:

SELECTION OF CITATIONS

SEARCH DETAIL