Search | VHL Regional Portal

1.

Federated Learning With Only Positive Labels by Exploring Label Correlations.

An, Xuming; Wang, Dui; Shen, Li; Luo, Yong; Hu, Han; Du, Bo; Wen, Yonggang; Tao, Dacheng.

IEEE Trans Neural Netw Learn Syst ; PP, 2024 May 14.

Article in English | MEDLINE | ID: mdl-38743541

ABSTRACT

Federated learning (FL) aims to collaboratively learn a model by using the data from multiple users under privacy constraints. In this article, we study the multilabel classification (MLC) problem under the FL setting, where a trivial solution and extremely poor performance may be obtained, especially when only positive data with respect to a single class label is provided for each client. This issue can be addressed by adding a specially designed regularizer on the server side. Although this is sometimes effective, the label correlations are simply ignored and thus suboptimal performance may be obtained. Besides, it is expensive and unsafe to exchange users' private embeddings between server and clients frequently, especially when training the model in a contrastive way. To remedy these drawbacks, we propose a novel and generic method termed federated averaging (FedAvg) by exploring label correlations (FedALC). Specifically, FedALC estimates the label correlations in the class embedding learning for different label pairs and utilizes them to improve the model training. To further improve the safety and also reduce the communication overhead, we propose a variant to learn a fixed class embedding for each client, so that the server and clients only need to exchange class embeddings once. Extensive experiments on multiple popular datasets demonstrate that our FedALC can significantly outperform the existing counterparts.
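
Illustrative note: the abstract builds on federated averaging (FedAvg). The sketch below shows only the standard FedAvg aggregation step, in which the server averages client parameters weighted by each client's sample count. It is not the FedALC method, whose regularizer and label-correlation estimate are not specified here. The function name `fed_avg` and the dictionary-of-lists parameter layout are assumptions made for illustration.

```python
from typing import Dict, List


def fed_avg(
    client_weights: List[Dict[str, List[float]]],
    client_sizes: List[int],
) -> Dict[str, List[float]]:
    """Average client parameters, weighted by local sample counts.

    Hypothetical layout: each client sends a dict mapping a parameter
    name to a flat list of floats; all clients share the same keys.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Dict[str, List[float]] = {}
    for name in client_weights[0]:
        dim = len(client_weights[0][name])
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(dim)
        ]
    return averaged


# Two clients; the second holds three times as much data as the first.
clients = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}]
global_model = fed_avg(clients, client_sizes=[1, 3])
```

Weighting by sample count means a client with more data pulls the global parameters further toward its local solution; with equal counts the step reduces to a plain mean.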

2.

Automatic Transformation Search Against Deep Leakage From Gradients.

Gao, Wei; Zhang, Xu; Guo, Shangwei; Zhang, Tianwei; Xiang, Tao; Qiu, Han; Wen, Yonggang; Liu, Yang.

IEEE Trans Pattern Anal Mach Intell ; 45(9): 10650-10668, 2023 Sep.

Article in English | MEDLINE | ID: mdl-37030873

ABSTRACT

Collaborative learning has gained great popularity due to its benefit of data privacy protection: participants can jointly train a Deep Learning model without sharing their training sets. However, recent works discovered that an adversary can fully recover the sensitive training samples from the shared gradients. Such reconstruction attacks pose severe threats to collaborative learning. Hence, effective mitigation solutions are urgently desired. In this paper, we systematically analyze existing reconstruction attacks and propose to leverage data augmentation to defeat these attacks: by preprocessing sensitive images with carefully-selected transformation policies, it becomes infeasible for the adversary to extract training samples from the corresponding gradients. We first design two new metrics to quantify the impacts of transformations on data privacy and model usability. With the two metrics, we design a novel search method to automatically discover qualified policies from a given data augmentation library. Our defense method can be further combined with existing collaborative training systems without modifying the training protocols. We conduct comprehensive experiments on various system settings. Evaluation results demonstrate that the policies discovered by our method can defeat state-of-the-art reconstruction attacks in collaborative learning, with high efficiency and negligible impact on the model performance.

3.

Distributed Energy Trading and Scheduling Among Microgrids via Multiagent Reinforcement Learning.

Gao, Guanyu; Wen, Yonggang; Tao, Dacheng.

IEEE Trans Neural Netw Learn Syst ; 34(12): 10638-10652, 2023 Dec.

Article in English | MEDLINE | ID: mdl-35552143

ABSTRACT

Renewable energy technologies empower microgrids to generate electricity to supply themselves and trade with others. Under this paradigm, microgrids have become autonomous entities that must intelligently determine their policies for energy trading and scheduling. Many factors influence a microgrid's decision-making, such as the complex microgrid infrastructure, the uncertain energy yield and demand, and the competition among the energy market players. These factors are usually hard to precisely model, and deriving the optimal policy for a microgrid is challenging. We propose a multiagent reinforcement learning (MARL) approach with an attention mechanism to learn the optimal policies for the microgrids without complex system modeling. We model each microgrid as an autonomous agent, which learns how to schedule energy resources and trade with others by collaborating with other agents. We adopt an attention mechanism to intelligently select contextual information for the training of each agent. After training, an agent can make control decisions using only its local information, which preserves the microgrids' privacy and reduces the communication overhead among microgrids to facilitate distributed control. We implement a simulation environment and evaluate the performance of our proposed method using real-world datasets. The experimental results show that our method can significantly reduce the cost of the microgrids compared with the baseline methods.

4.

Machine learning for a sustainable energy future.

Yao, Zhenpeng; Lum, Yanwei; Johnston, Andrew; Mejia-Mendoza, Luis Martin; Zhou, Xin; Wen, Yonggang; Aspuru-Guzik, Alán; Sargent, Edward H; Seh, Zhi Wei.

Nat Rev Mater ; 8(3): 202-215, 2023.

Article in English | MEDLINE | ID: mdl-36277083

ABSTRACT

Transitioning from fossil fuels to renewable energy sources is a critical global challenge; it demands advances - at the materials, devices and systems levels - for the efficient harvesting, storage, conversion and management of renewable energy. Energy researchers have begun to incorporate machine learning (ML) techniques to accelerate these advances. In this Perspective, we highlight recent advances in ML-driven energy research, outline current and future challenges, and describe what is required to make the best use of ML techniques. We introduce a set of key performance indicators with which to compare the benefits of different ML-accelerated workflows for energy research. We discuss and evaluate the latest advances in applying ML to the development of energy harvesting (photovoltaics), storage (batteries), conversion (electrocatalysis) and management (smart grids). Finally, we offer an overview of potential research areas in the energy field that stand to benefit further from the application of ML.

5.

Not All Instances Contribute Equally: Instance-Adaptive Class Representation Learning for Few-Shot Visual Recognition.

Han, Mengya; Zhan, Yibing; Luo, Yong; Du, Bo; Hu, Han; Wen, Yonggang; Tao, Dacheng.

IEEE Trans Neural Netw Learn Syst ; PP, 2022 Sep 22.

Article in English | MEDLINE | ID: mdl-36136920

ABSTRACT

Few-shot visual recognition refers to recognizing novel visual concepts from a few labeled instances. Many few-shot visual recognition methods adopt the metric-based meta-learning paradigm by comparing the query representation with class representations to predict the category of the query instance. However, the current metric-based methods generally treat all instances equally and consequently often obtain a biased class representation, since not all instances are equally significant when summarizing the instance-level representations for the class-level representation. For example, some instances may contain unrepresentative information, such as too much background and information of unrelated concepts, which skew the results. To address the above issues, we propose a novel metric-based meta-learning framework termed instance-adaptive class representation learning network (ICRL-Net) for few-shot visual recognition. Specifically, we develop an adaptive instance revaluing network (AIRN) with the capability to address the biased representation issue when generating the class representation, by learning and assigning adaptive weights for different instances according to their relative significance in the support set of the corresponding class. In addition, we design an improved bilinear instance representation and incorporate two novel structural losses, i.e., intraclass instance clustering loss and interclass representation distinguishing loss, to further regulate the instance revaluation process and refine the class representation. We conduct extensive experiments on four commonly adopted few-shot benchmarks: miniImageNet, tieredImageNet, CIFAR-FS, and FC100 datasets. The experimental results compared with the state-of-the-art approaches demonstrate the superiority of our ICRL-Net.

6.

Machine Learning: An Advanced Platform for Materials Development and State Prediction in Lithium-Ion Batteries.

Lv, Chade; Zhou, Xin; Zhong, Lixiang; Yan, Chunshuang; Srinivasan, Madhavi; Seh, Zhi Wei; Liu, Chuntai; Pan, Hongge; Li, Shuzhou; Wen, Yonggang; Yan, Qingyu.

Adv Mater ; 34(25): e2101474, 2022 Jun.

Article in English | MEDLINE | ID: mdl-34490683

ABSTRACT

Lithium-ion batteries (LIBs) are vital energy-storage devices in modern society. However, the performance and cost are still not satisfactory in terms of energy density, power density, cycle life, safety, etc. To further improve the performance of batteries, traditional "trial-and-error" processes require a vast number of tedious experiments. Computational chemistry and artificial intelligence (AI) can significantly accelerate the research and development of novel battery systems. Herein, a heterogeneous category of AI technology for predicting and discovering battery materials and estimating the state of the battery system is reviewed. Successful examples, the challenges of deploying AI in real-world scenarios, and an integrated framework are analyzed and outlined. The state-of-the-art research about the applications of ML in the property prediction and battery discovery, including electrolyte and electrode materials, are further summarized. Meanwhile, the prediction of battery states is also provided. Finally, various existing challenges and the framework to tackle the challenges on the further development of machine learning for rechargeable LIBs are proposed.

7.

Intelligent Trainer for Dyna-Style Model-Based Deep Reinforcement Learning.

Dong, Linsen; Li, Yuanlong; Zhou, Xin; Wen, Yonggang; Guan, Kyle.

IEEE Trans Neural Netw Learn Syst ; 32(6): 2758-2771, 2021 Jun.

Article in English | MEDLINE | ID: mdl-32866102

ABSTRACT

Model-based reinforcement learning (MBRL) has been proposed as a promising alternative solution to tackle the high sampling cost challenge in canonical RL, by leveraging a system dynamics model to generate synthetic data for policy training. The MBRL framework, nevertheless, is inherently limited by the convoluted process of jointly optimizing the control policy, learning the system dynamics, and sampling data from two sources controlled by complicated hyperparameters. As such, the training process involves overwhelmingly manual tuning and is prohibitively costly. In this research, we propose a "reinforcement on reinforcement" (RoR) architecture to decompose the convoluted tasks into two decoupled layers of RL. The inner layer is the canonical MBRL training process, which is formulated as a Markov decision process, called the training process environment (TPE). The outer layer serves as an RL agent, called the intelligent trainer, to learn an optimal hyperparameter configuration for the inner TPE. This decomposition approach provides much-needed flexibility to implement different trainer designs, referred to as "train the trainer." In our research, we propose and optimize two alternative trainer designs: 1) a unihead trainer and 2) a multihead trainer. Our proposed RoR framework is evaluated for five tasks in the OpenAI gym. Compared with three other baseline methods, our proposed intelligent trainer methods have a competitive performance in autotuning capability, with up to 56% expected sampling cost saving without knowing the best parameter configurations in advance. The proposed trainer framework can be easily extended to tasks that require costly hyperparameter tuning.

8.

Lab-on-Mask for Remote Respiratory Monitoring.

Pan, Liang; Wang, Cong; Jin, Haoran; Li, Jie; Yang, Le; Zheng, Yuanjin; Wen, Yonggang; Tan, Ban Hock; Loh, Xian Jun; Chen, Xiaodong.

ACS Mater Lett ; 2: 1178-1181, 2020.

Article in English | MEDLINE | ID: mdl-34192277

ABSTRACT

A smart mask integrated with a remote, noncontact multiplexed sensor system, or "Lab-on-Mask" (LOM), is designed for monitoring respiratory diseases, such as COVID-19. This LOM can monitor the heart rate, blood oxygen saturation, blood pressure, and body temperature associated with symptoms of pneumonia caused by coronaviruses in real time. Because of this remote monitoring system, frontline healthcare staff can minimize the exposure they face from close contact with the patients and reduce the risks of being infected.

9.

Transforming Cooling Optimization for Green Data Center via Deep Reinforcement Learning.

Li, Yuanlong; Wen, Yonggang; Tao, Dacheng; Guan, Kyle.

IEEE Trans Cybern ; 50(5): 2002-2013, 2020 May.

Article in English | MEDLINE | ID: mdl-31352360

ABSTRACT

Data centers (DCs) play an important role in supporting services such as e-commerce and cloud computing. The resulting energy consumption from this growing market has drawn significant attention, and noticeably almost half of the energy cost is used to cool the DC to a particular temperature. It is thus a critical operational challenge to curb the cooling energy cost without sacrificing the thermal safety of a DC. The existing solutions typically follow a two-step approach, in which the system is first modeled based on expert knowledge and, thus, the operational actions are determined with heuristics and/or best practices. These approaches are often hard to generalize and might result in suboptimal performance due to intrinsic model errors for large-scale systems. In this paper, we propose optimizing the DC cooling control via the emerging deep reinforcement learning (DRL) framework. Compared to the existing approaches, our solution lends itself to an end-to-end cooling control algorithm (CCA) via an off-policy offline version of the deep deterministic policy gradient (DDPG) algorithm, in which an evaluation network is trained to predict the DC energy cost along with resulting cooling effects, and a policy network is trained to gauge optimized control settings. Moreover, we introduce a de-underestimation (DUE) validation mechanism for the critic network to reduce the potential underestimation of the risk caused by neural approximation. Our proposed algorithm is evaluated on an EnergyPlus simulation platform and on a real data trace collected from the National Super Computing Centre (NSCC) of Singapore. The numerical results show that the proposed CCA can achieve up to 11% cooling cost reduction on the simulation platform compared with a manually configured baseline control algorithm. In the trace-based study of conservative nature, the proposed algorithm can achieve about 15% cooling energy savings on the NSCC data trace. Our pioneering approach can shed new light on the application of DRL to optimize and automate DC operations and management, potentially revolutionizing digital infrastructure management with intelligence.

10.

Transferring Knowledge Fragments for Learning Distance Metric from a Heterogeneous Domain.

Luo, Yong; Wen, Yonggang; Liu, Tongliang; Tao, Dacheng.

IEEE Trans Pattern Anal Mach Intell ; 41(4): 1013-1026, 2019 Apr.

Article in English | MEDLINE | ID: mdl-29993977

ABSTRACT

The goal of transfer learning is to improve the performance of a target learning task by leveraging information (or transferring knowledge) from other related tasks. In this paper, we examine the problem of transfer distance metric learning (DML), which usually aims to mitigate the label information deficiency issue in the target DML. Most of the current Transfer DML (TDML) methods are not applicable to the scenario where data are drawn from heterogeneous domains. Some existing heterogeneous transfer learning (HTL) approaches can learn the target distance metric, usually by transforming the samples of the source and target domains into a common subspace. However, these approaches lack flexibility in real-world applications, and the learned transformations are often restricted to be linear. This motivates us to develop a general flexible heterogeneous TDML (HTDML) framework. In particular, any (linear/nonlinear) DML algorithm can be employed to learn the source metric beforehand. Then the pre-learned source metric is represented as a set of knowledge fragments to help target metric learning. We show how generalization error in the target domain could be reduced using the proposed transfer strategy, and develop a novel algorithm to learn either a linear or nonlinear target metric. Extensive experiments on various applications demonstrate the effectiveness of the proposed method.

11.

Heterogeneous Multitask Metric Learning Across Multiple Domains.

Luo, Yong; Wen, Yonggang; Tao, Dacheng.

IEEE Trans Neural Netw Learn Syst ; 29(9): 4051-4064, 2018 Sep.

Article in English | MEDLINE | ID: mdl-28981432

ABSTRACT

Distance metric learning plays a crucial role in diverse machine learning algorithms and applications. When the labeled information in a target domain is limited, transfer metric learning (TML) helps to learn the metric by leveraging the sufficient information from other related domains. Multitask metric learning (MTML), which can be regarded as a special case of TML, performs transfer across all related domains. Current TML tools usually assume that the same feature representation is exploited for different domains. However, in real-world applications, data may be drawn from heterogeneous domains. Heterogeneous transfer learning approaches can be adopted to remedy this drawback by deriving a metric from the learned transformation across different domains. However, they are often limited in that only two domains can be handled. To appropriately handle multiple domains, we develop a novel heterogeneous MTML (HMTML) framework. In HMTML, the metrics of all different domains are learned together. The transformations derived from the metrics are utilized to induce a common subspace, and the high-order covariance among the predictive structures of these domains is maximized in this subspace. There do exist a few heterogeneous transfer learning approaches that deal with multiple domains, but the high-order statistics (correlation information), which can only be exploited by simultaneously examining all domains, is ignored in these approaches. Compared with them, the proposed HMTML can effectively explore such high-order information, thus obtaining more reliable feature transformations and metrics. Effectiveness of our method is validated by the extensive and intensive experiments on text categorization, scene classification, and social image annotation.

12.

Can We Speculate Running Application With Server Power Consumption Trace?

Li, Yuanlong; Hu, Han; Wen, Yonggang; Zhang, Jun.

IEEE Trans Cybern ; 48(5): 1500-1512, 2018 May.

Article in English | MEDLINE | ID: mdl-28541919

ABSTRACT

In this paper, we propose to detect the running applications in a server by classifying the observed power consumption series for the purpose of data center energy consumption monitoring and analysis. The time series classification problem has been extensively studied, with various distance measurements developed; recently, deep learning-based sequence models have also proved to be promising. In this paper, we propose a novel distance measurement and build a time series classification algorithm hybridizing nearest neighbor and long short-term memory (LSTM) neural network. More specifically, first we propose a new distance measurement termed local time warping (LTW), which utilizes a user-specified index set for local warping, and is designed to be noncommutative and nondynamic programming. Second, we hybridize the 1-nearest neighbor (1NN)-LTW and LSTM together. In particular, we combine the prediction probability vectors of 1NN-LTW and LSTM to determine the label of the test cases. Finally, using the power consumption data from a real data center, we show that the proposed LTW can improve the classification accuracy of dynamic time warping (DTW) from about 84% to 90%. Our experimental results show that the proposed LTW is competitive on our data set compared with existing DTW variants and that its noncommutative feature is indeed beneficial. We also test a linear version of LTW and find that it can perform similarly to the state-of-the-art DTW-based method while it runs as fast as the linear runtime lower bound methods like LB_Keogh for our problem. With the hybrid algorithm, for the power series classification task we achieve an accuracy up to about 93%. Our research can inspire more studies on time series distance measurement and the hybrid of the deep learning models with other traditional models.
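
Illustrative note: the abstract compares its LTW measure against dynamic time warping (DTW) inside a 1-nearest-neighbor classifier. The sketch below shows only that classic baseline: the textbook dynamic-programming DTW distance with an absolute-difference local cost, plus a 1NN lookup. It is not the authors' LTW, whose index set and warping rule are not given here. The function names and the toy power traces are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic DTW distance computed by dynamic programming, O(len(a) * len(b))."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also matched to the previous b
                cost[i][j - 1],      # b[j-1] also matched to the previous a
                cost[i - 1][j - 1],  # both series advance together
            )
    return cost[n][m]


def classify_1nn(
    query: Sequence[float],
    train: List[Tuple[Sequence[float], str]],
) -> str:
    """Return the label of the training series closest to `query` under DTW."""
    _, label = min(train, key=lambda item: dtw_distance(query, item[0]))
    return label


# Toy power traces (hypothetical values and labels).
training_set = [([1.0, 1.0, 5.0, 5.0], "batch_job"), ([3.0, 3.0, 3.0, 3.0], "idle")]
predicted = classify_1nn([1.0, 5.0, 5.0, 5.0], training_set)
```

Because DTW lets one sample align with several neighbors in the other series, two traces with the same shape but different timing can have zero distance, which is why it is a common baseline for power-consumption series.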

13.

Facial Age Estimation With Age Difference.

Hu, Zhenzhen; Wen, Yonggang; Wang, Jianfeng; Wang, Meng; Hong, Richang; Yan, Shuicheng.

IEEE Trans Image Process ; 26(7): 3087-3097, 2017 Jul.

Article in English | MEDLINE | ID: mdl-27913345

ABSTRACT

Age estimation based on the human face remains a significant problem in computer vision and pattern recognition. In order to estimate an accurate age or age group of a facial image, most of the existing algorithms require a huge face data set annotated with age labels. This imposes a constraint on the utilization of the immense amount of unlabeled or weakly labeled training data, e.g., the huge amount of human photos in social networks. These images may provide no age label, but it is easy to derive the age difference for an image pair of the same person. To improve the age estimation accuracy, we propose a novel learning scheme to take advantage of these weakly labeled data through deep convolutional neural networks. For each image pair, Kullback-Leibler divergence is employed to embed the age difference information. The entropy loss and the cross entropy loss are adaptively applied on each image to make the distribution exhibit a single peak value. The combination of these losses is designed to drive the neural network to understand the age gradually from only the age difference information. We also contribute a data set, including more than 100 000 face images with the dates on which they were taken. Each image is labeled with both the timestamp and the person's identity. Experimental results on two aging face databases show the advantages of the proposed age difference learning system, and state-of-the-art performance is achieved.

Subject(s)

Aging/physiology , Face/diagnostic imaging , Image Processing, Computer-Assisted/methods , Pattern Recognition, Automated/methods , Adolescent , Adult , Aged , Algorithms , Child , Child, Preschool , Databases, Factual , Female , Humans , Infant , Infant, Newborn , Male , Middle Aged , Neural Networks, Computer , Young Adult
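
Illustrative note: the abstract of this entry names Kullback-Leibler divergence and an entropy loss over a predicted age distribution. The sketch below shows only the standard definitions of those two quantities for discrete distributions. How the paper embeds the age difference into the divergence is not stated in the abstract, so nothing here should be read as the authors' loss. The function names, the `eps` guard, and the toy distributions are assumptions made for illustration.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """D_KL(p || q) = sum_i p_i * log(p_i / q_i) for discrete distributions.

    Terms with p_i == 0 contribute nothing; q_i is floored at `eps`
    (an assumption of this sketch) to avoid division by zero.
    """
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def entropy(p: Sequence[float]) -> float:
    """Shannon entropy H(p) = -sum_i p_i * log(p_i), in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


# Toy distributions over three age bins (hypothetical values).
peaked = [0.0, 1.0, 0.0]
flat = [1 / 3, 1 / 3, 1 / 3]
divergence = kl_divergence(peaked, flat)
```

A single-peaked distribution has entropy close to zero, so minimizing an entropy term pushes predictions toward one dominant age bin, which matches the abstract's stated goal of a distribution with a single peak.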

14.

Multiview vector-valued manifold regularization for multilabel image classification.

Luo, Yong; Tao, Dacheng; Xu, Chang; Xu, Chao; Liu, Hong; Wen, Yonggang.

IEEE Trans Neural Netw Learn Syst ; 24(5): 709-22, 2013 May.

Article in English | MEDLINE | ID: mdl-24808422

ABSTRACT

In computer vision, image datasets used for classification are naturally associated with multiple labels and comprised of multiple views, because each image may contain several objects (e.g., pedestrian, bicycle, and tree) and is properly characterized by multiple visual features (e.g., color, texture, and shape). Currently available tools ignore either the label relationship or the view complementarity. Motivated by the success of the vector-valued function that constructs matrix-valued kernels to explore the multilabel structure in the output space, we introduce multiview vector-valued manifold regularization (MV³MR) to integrate multiple features. MV³MR exploits the complementary property of different features and discovers the intrinsic local geometry of the compact support shared by different features under the theme of manifold regularization. We conduct extensive experiments on two challenging, but popular, datasets, PASCAL VOC'07 and MIR Flickr, and validate the effectiveness of the proposed MV³MR for image classification.

Subject(s)

Image Interpretation, Computer-Assisted , Neural Networks, Computer , Pattern Recognition, Automated/methods , Algorithms , Database Management Systems , Models, Statistical , Reproducibility of Results
