Results 1 - 5 of 5
1.
Neural Netw ; 172: 106101, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38232426

ABSTRACT

The Centralized Training and Decentralized Execution (CTDE) paradigm, in which a centralized critic is allowed to access global information during the training phase while the learned policies are executed in a decentralized way using only local information, has achieved great progress in recent years. Despite this progress, CTDE may suffer from the Centralized-Decentralized Mismatch (CDM) issue: the suboptimality of one agent's policy can exacerbate the policy learning of other agents through the centralized joint critic. In contrast to centralized learning, the cooperative model that most closely resembles the way humans cooperate in nature is fully decentralized, i.e., Independent Learning (IL). However, two issues must be addressed before agents can coordinate through IL: (1) how agents become aware of the presence of other agents, and (2) how agents coordinate with each other to improve the joint policy under IL. In this paper, we propose an inference-based coordinated MARL method: Deep Motor System (DMS). DMS first introduces the idea of individual intention inference, which allows agents to disentangle other agents from their environment. Second, causal inference is introduced to enhance coordination by reasoning about each agent's effect on the behavior of others. The proposed model was evaluated extensively on a series of Multi-Agent MuJoCo and StarCraft II tasks. The results show that the proposed method outperforms independent learning algorithms, including the state-of-the-art baselines IPPO and HAPPO, and that coordinated behavior among agents can be learned even without the CTDE paradigm.
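The abstract does not specify how DMS quantifies one agent's causal effect on another, so the following is only a hedged sketch of the general idea, in the spirit of counterfactual influence measures: agent j's predicted action distribution given agent i's actual action is compared against a counterfactual distribution in which agent i's action is marginalised out under its own policy. The model interface `model_j(obs_j, a_i)`, the function name, and the KL-based score are our assumptions, not the paper's implementation.

```python
# Hedged sketch only; not the authors' DMS code. We assume a model with the
# interface model_j(obs_j, a_i) -> logits over agent j's next action.
import torch
import torch.nn.functional as F


def causal_influence(model_j, obs_j, candidate_actions_i, actual_action_i, policy_probs_i):
    """Estimate agent i's influence on agent j as
    KL( p(a_j | obs_j, a_i = actual) || sum_a pi_i(a) * p(a_j | obs_j, a_i = a) )."""
    with torch.no_grad():
        p_actual = F.softmax(model_j(obs_j, actual_action_i), dim=-1)
        p_counterfactual = torch.zeros_like(p_actual)
        for a_i, w in zip(candidate_actions_i, policy_probs_i):
            # Marginalise agent i's action under its own policy.
            p_counterfactual += w * F.softmax(model_j(obs_j, a_i), dim=-1)
    # F.kl_div expects log-probabilities as its first argument and computes
    # KL(target || input), i.e. KL(p_actual || p_counterfactual) here.
    return F.kl_div(p_counterfactual.log(), p_actual, reduction="sum")
```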


Subject(s)
Algorithms, Intention, Humans, Policies, Problem Solving, Reinforcement, Psychology
2.
Entropy (Basel) ; 25(10)2023 Sep 24.
Article in English | MEDLINE | ID: mdl-37895498

ABSTRACT

The Minimum Vertex Weighted Coloring (MinVWC) problem is an important generalization of the classic Minimum Vertex Coloring (MinVC) problem, which is NP-hard. Given a simple undirected graph G=(V,E), the MinVC problem is to find a coloring such that any pair of adjacent vertices are assigned different colors and the number of colors used is minimized. The MinVWC problem associates each vertex with a positive weight and defines the weight of a color to be the weight of its heaviest vertex; the goal is then to find a coloring that minimizes the sum of weights over all colors. Among various approaches, reduction is an effective one: it tries to obtain a subgraph whose optimal solutions can conveniently be extended into optimal ones for the whole graph, without costly branching. In this paper, we propose a reduction algorithm based on maximal clique enumeration. More specifically, our algorithm utilizes a certain proportion of maximal cliques and obtains lower bounds in order to perform reductions. It alternates between clique sampling and graph reductions and consists of three successive procedures: promising clique reductions, better bound reductions, and post reductions. Experimental results show that our algorithm returns considerably smaller subgraphs than the most recent method, named RedLS, on numerous large benchmark graphs. We also evaluate the individual impacts and some practical properties of our algorithm. Furthermore, we prove a theorem indicating that the reduction effects of our algorithm are equivalent to those of a counterpart that enumerates all maximal cliques in the whole graph, provided the running time is sufficiently long.
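For concreteness, here is a small sketch (our own illustration, not the paper's or RedLS's code) of the MinVWC objective defined above and of a clique-based lower bound of the kind such reductions rely on: the vertices of a clique are pairwise adjacent and must all receive distinct colors, so the clique's total vertex weight bounds the objective from below.

```python
# Illustrative sketch only; the graph is given as a weight dict and the
# coloring as a mapping color -> set of vertices (our own conventions).
def coloring_weight(coloring, weight):
    """MinVWC objective: sum over colors of the weight of the heaviest vertex."""
    return sum(max(weight[v] for v in vertices) for vertices in coloring.values())


def clique_lower_bound(clique, weight):
    """Each clique vertex needs its own color, and that color weighs at least
    as much as the vertex it covers, so the clique's total weight is a lower bound."""
    return sum(weight[v] for v in clique)


# Example: a triangle with weights 3, 2 and 1 forces an objective of at least
# 3 + 2 + 1 = 6, which any valid coloring of the whole graph must meet or exceed.
```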

3.
Neural Netw ; 136: 17-27, 2021 Apr.
Article in English | MEDLINE | ID: mdl-33422929

ABSTRACT

The goal of monaural speech enhancement is to separate clean speech from noisy speech. Recently, many studies have employed generative adversarial networks (GANs) for monaural speech enhancement tasks. When GANs are used for this task, the output of the generator is a speech waveform or a spectrum, such as a magnitude spectrum, a mel-spectrum, or a complex-valued spectrum. The spectra generated by current speech enhancement methods in the time-frequency domain usually lack details, such as consonants and harmonics with low energy. In this paper, we propose a new type of adversarial training framework for spectrum generation, named µ-law spectrum generative adversarial networks (µ-law SGAN). We introduce a trainable µ-law spectrum compression layer (USCL) into the proposed discriminator to compress the dynamic range of the spectrum, so that the compressed spectrum reveals more detailed information. In addition, we use the spectrum transformed by USCL to regularize the generator's training, so that the generator pays more attention to the details of the spectrum. Experimental results on the open dataset Voice Bank + DEMAND show that µ-law SGAN is an effective generative adversarial architecture for speech enhancement. Moreover, visual spectrogram analysis suggests that µ-law SGAN pays more attention to the enhancement of low-energy harmonics and consonants.
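The abstract only describes USCL as a trainable µ-law compression of the spectrum's dynamic range; below is a minimal sketch of what such a layer might look like in PyTorch. The parameterisation (a learnable µ kept positive through an exponential) and the class name are our assumptions, not the authors' implementation.

```python
# Hedged sketch of a trainable mu-law spectrum compression layer; not the
# authors' USCL code. Non-negative magnitude spectra are companded as
# log(1 + mu * x) / log(1 + mu), which expands low-energy detail.
import torch
import torch.nn as nn


class MuLawSpectrumCompression(nn.Module):
    def __init__(self, init_mu: float = 255.0):
        super().__init__()
        # Learn log(mu) so that mu = exp(log_mu) stays positive during training.
        self.log_mu = nn.Parameter(torch.log(torch.tensor(float(init_mu))))

    def forward(self, spectrum: torch.Tensor) -> torch.Tensor:
        mu = torch.exp(self.log_mu)
        spectrum = spectrum.clamp(min=0)  # magnitude spectra are non-negative
        return torch.log1p(mu * spectrum) / torch.log1p(mu)


# Usage: compress magnitude spectrograms before feeding them to the discriminator.
# compressed = MuLawSpectrumCompression()(torch.rand(8, 1, 257, 100))
```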


Subject(s)
Deep Learning, Neural Networks, Computer, Speech Perception/physiology, Speech Recognition Software, Data Compression/methods, Humans, Speech/physiology
4.
Comput Intell Neurosci ; 2015: 721367, 2015.
Article in English | MEDLINE | ID: mdl-26448739

ABSTRACT

We propose multistate activation functions (MSAFs) for deep neural networks (DNNs). These MSAFs are new kinds of activation functions that can represent more than two states, including the N-order MSAFs and the symmetrical MSAF. DNNs with these MSAFs can be trained via conventional Stochastic Gradient Descent (SGD) as well as mean-normalised SGD. We also discuss how these MSAFs perform when used to resolve classification problems. Experimental results on the TIMIT corpus reveal that, on speech recognition tasks, DNNs with MSAFs perform better than conventional DNNs, achieving a relative improvement of 5.60% in phoneme error rate. Further experiments also reveal that mean-normalised SGD facilitates the training of DNNs with MSAFs, especially with large training sets. The models can also be trained directly, without pretraining, when the training set is sufficiently large, which results in a considerable relative improvement of 5.82% in word error rate.
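The abstract does not give the exact functional form of the MSAFs, so the sketch below is only an assumed illustration of the general idea: summing several shifted logistic sigmoids yields an activation that saturates at more than two levels.

```python
# Assumed illustration only, not the paper's definition of the N-order MSAF:
# a sum of shifted logistic sigmoids plateaus near 0, 1, ..., n_states - 1.
import numpy as np


def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))


def multistate_activation(x, n_states=3, shift=6.0):
    """Sum of (n_states - 1) sigmoids whose inflection points are `shift` apart."""
    return sum(logistic(x - k * shift) for k in range(n_states - 1))


# Example: multistate_activation(np.array([-6.0, 3.0, 12.0])) is close to the
# plateau values 0, 1 and 2, respectively.
```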


Subject(s)
Algorithms, Neural Networks, Computer, Stochastic Processes, Entropy, Humans, Linear Models, Logistic Models, Models, Psychological, Speech
5.
IEEE Trans Cybern ; 45(5): 1014-27, 2015 May.
Article in English | MEDLINE | ID: mdl-25134096

ABSTRACT

Two-mode stochastic local search (SLS) and focused random walk (FRW) are the two most influential paradigms of SLS algorithms for the propositional satisfiability (SAT) problem. Recently, an interesting idea called configuration checking (CC) was proposed to handle the cycling problem in SLS. The CC idea has been successfully used to improve SLS algorithms for SAT, resulting in state-of-the-art solvers. Previous CC strategies for SAT are based on neighboring variables and have proved successful in two-mode SLS algorithms. However, this kind of neighboring-variable-based CC strategy is not suitable for improving FRW algorithms. In this paper, we propose a new CC strategy based on clause states. We apply this clause-state-based CC (CSCC) strategy to both the two-mode SLS and FRW paradigms. Our experiments show that the CSCC strategy is effective in both paradigms. Furthermore, our FRW algorithms based on CSCC achieve state-of-the-art performance on a broad range of random SAT benchmarks.
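To make the clause-state idea concrete, below is a hedged reconstruction of how CSCC might be embedded in a focused random walk step; the data structures, tie-breaking, and function names are our assumptions, not the authors' solver. The rule sketched is that a variable becomes eligible to flip only after the satisfied/unsatisfied state of at least one clause containing it has changed since the variable's last flip.

```python
# Hedged sketch, not the authors' CSCC solver. CNF clauses are lists of signed
# integer literals; a literal v > 0 is true when variable v is assigned True.
import random


def clause_satisfied(clause, assign):
    return any(assign[abs(lit)] == (lit > 0) for lit in clause)


def cscc_focused_random_walk(clauses, n_vars, max_flips=100_000, seed=0):
    rng = random.Random(seed)
    assign = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}
    occurs = {v: [] for v in assign}                       # variable -> clause indices
    for i, clause in enumerate(clauses):
        for lit in clause:
            occurs[abs(lit)].append(i)
    sat = [clause_satisfied(c, assign) for c in clauses]   # current clause states
    conf_changed = {v: True for v in assign}               # every variable starts eligible

    for _ in range(max_flips):
        unsat = [i for i, s in enumerate(sat) if not s]    # kept simple, not incremental
        if not unsat:
            return assign                                   # satisfying assignment found
        clause = clauses[rng.choice(unsat)]
        # CSCC filter: only variables whose clause-state configuration has changed.
        eligible = [abs(lit) for lit in clause if conf_changed[abs(lit)]]
        var = rng.choice(eligible) if eligible else rng.choice([abs(l) for l in clause])
        assign[var] = not assign[var]
        conf_changed[var] = False                           # reset the flipped variable
        for i in occurs[var]:
            new_state = clause_satisfied(clauses[i], assign)
            if new_state != sat[i]:                         # a clause changed state, so
                sat[i] = new_state                          # its other variables become eligible
                for lit in clauses[i]:
                    if abs(lit) != var:
                        conf_changed[abs(lit)] = True
    return None                                             # gave up within the flip budget
```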
