1.
ISA Trans ; : 1-15, 2024 Aug 28.
Article in English | MEDLINE | ID: mdl-39261266

ABSTRACT

Global Nash equilibrium is an optimal solution for each player in a graphical game. This paper proposes an iterative adaptive dynamic programming-based algorithm to solve for the global Nash equilibrium of the optimal containment control problem, with a robustness analysis with respect to the iterative error. The containment control problem is transformed into a graphical game formulation. Sufficient conditions are given to decouple the Hamilton-Jacobi equations, which guarantee the solvability of the global Nash equilibrium solution. The iterative algorithm is designed to obtain the solution without any knowledge of system dynamics. Conditions on the iterative error for global stability are given with rigorous proof. Compared with existing works, the design procedures of control gain and coupling strength are separated, which avoids trivial cases in the design procedure. The robustness analysis exactly quantifies the effect of the iterative error caused by various sources in engineering practice. The theoretical results are validated by two numerical examples with marginally stable and unstable dynamics of the leader.

2.
ISA Trans ; : 1-14, 2024 Sep 16.
Article in English | MEDLINE | ID: mdl-39299846

ABSTRACT

This article studies the problem of formation tracking control in multi-agent systems, achieved in finite time, under challenging conditions such as strong nonlinearity, aperiodic intermittent communication, and time-delay effects, all within a hybrid impulsive framework. The impulses are categorized as either stabilizing control impulses or disruptive impulses. Furthermore, by integrating Lyapunov-based stability theory, graph theory, and the linear matrix inequality (LMI) method, new stability criteria are established. These criteria ensure finite-time intermittent formation tracking while considering weak Lyapunov inequality conditions, intermittent communication rates, and time-varying gain strengths. Additionally, the approach manages an indefinite number of impulsive moments and adjusts the control domain's width based on the average impulsive interval and state-dependent control width. Numerical simulations are provided to validate the applicability and effectiveness of the proposed formation tracking control protocols.

3.
Neural Netw ; 179: 106566, 2024 Nov.
Article in English | MEDLINE | ID: mdl-39089157

ABSTRACT

This paper studies an optimal synchronous control protocol design for nonlinear multi-agent systems under partially known dynamics and uncertain external disturbance. Under some mild assumptions, the Hamilton-Jacobi-Isaacs equation is derived from the performance index function and system dynamics, which serves as an equivalent formulation. Distributed policy iteration adaptive dynamic programming is developed to obtain the numerical solution to the Hamilton-Jacobi-Isaacs equation. Three theoretical results are given about the proposed algorithm. First, the iterative variables are proved to converge to the solution to the Hamilton-Jacobi-Isaacs equation. Second, the L2-gain performance of the closed-loop system is achieved. As a special case, the origin of the nominal system is asymptotically stable. Third, the obtained control protocol constitutes a Nash equilibrium solution. Neural network-based implementation is designed following the main results. Finally, two numerical examples are provided to verify the effectiveness of the proposed method.
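For reference, the L2-gain property named in the abstract is conventionally stated as the inequality below. The symbols are generic notation assumed here for illustration (z for the performance output, w for the disturbance, γ for the attenuation level) and are not taken from the paper itself.

```latex
\int_{0}^{T} \lVert z(t) \rVert^{2}\,dt \;\le\; \gamma^{2} \int_{0}^{T} \lVert w(t) \rVert^{2}\,dt,
\qquad \forall\, T \ge 0,\quad \forall\, w \in L_2[0,T],\quad x(0) = 0
```

A smaller γ means the closed-loop system attenuates the disturbance more strongly; with w = 0 the condition is usually paired with asymptotic stability of the origin, which matches the special case the abstract mentions.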


Subject(s)
Algorithms , Neural Networks, Computer , Nonlinear Dynamics , Computer Simulation
4.
Neural Netw ; 179: 106625, 2024 Nov.
Article in English | MEDLINE | ID: mdl-39168072

ABSTRACT

In this paper, a smoothing approximation-based adaptive neurodynamic approach is proposed for a nonsmooth resource allocation problem (NRAP) with multiple constraints. The smoothing approximation method is combined with multi-agent systems to avoid the introduction of set-valued subgradient terms, thereby facilitating the practical implementation of the neurodynamic approach. In addition, using the adaptive penalty technique, private inequality constraints are processed, which eliminates the need for additional quantitative estimation of penalty parameters and significantly reduces the computational cost. Moreover, to reduce the impact of smoothing approximation on the convergence of the neurodynamic approach, time-varying control parameters are introduced. Due to the parallel computing characteristics of multi-agent systems, the neurodynamic approach proposed in this paper is completely distributed. Theoretical proof shows that the state solution of the neurodynamic approach converges to the optimal solution of NRAP. Finally, two application examples are used to validate the feasibility of the neurodynamic approach.


Subject(s)
Resource Allocation , Neural Networks, Computer , Algorithms , Computer Simulation , Nonlinear Dynamics , Humans
5.
Neural Netw ; 179: 106547, 2024 Nov.
Article in English | MEDLINE | ID: mdl-39068677

ABSTRACT

Centralized Training with Decentralized Execution (CTDE) is a prevalent paradigm in the field of fully cooperative Multi-Agent Reinforcement Learning (MARL). Existing algorithms often encounter two major problems: independent strategies tend to underestimate the potential value of actions, leading to the convergence on sub-optimal Nash Equilibria (NE); some communication paradigms introduce added complexity to the learning process, complicating the focus on the essential elements of the messages. To address these challenges, we propose a novel method called Optimistic Sequential Soft Actor Critic with Motivational Communication (OSSMC). The key idea of OSSMC is to utilize a greedy-driven approach to explore the potential value of individual policies, named optimistic Q-values, which serve as an upper bound for the Q-value of the current policy. We then integrate a sequential update mechanism with optimistic Q-value for agents, aiming to ensure monotonic improvement in the joint policy optimization process. Moreover, we establish motivational communication modules for each agent to disseminate motivational messages to promote cooperative behaviors. Finally, we employ a value regularization strategy from the Soft Actor Critic (SAC) method to maximize entropy and improve exploration capabilities. The performance of OSSMC was rigorously evaluated against a series of challenging benchmark sets. Empirical results demonstrate that OSSMC not only surpasses current baseline algorithms but also exhibits a more rapid convergence rate.


Subject(s)
Algorithms , Motivation , Reinforcement, Psychology , Communication , Humans , Neural Networks, Computer , Cooperative Behavior
6.
Heliyon ; 10(12): e32122, 2024 Jun 30.
Article in English | MEDLINE | ID: mdl-39021935

ABSTRACT

The importance of the dependencies between water and power systems is more acutely perceived when challenges emerge. As both energy and water supply are limited, efficient use is a must for any sustainable future, especially in rural areas. Although important, a modeling tool that can analyze water-energy system interdependencies in rural systems, at the architectural level highlighting the physical interconnections and synergies of these systems, is still lacking. We present a multi-agent system model that captures the features of both systems, at the same levels of fidelity and resolution, with coordinated operations and contingency components represented. Unlike other models, ours captures architectural features of both systems and technical constraints of the systems' components, which is critical to capture physical intricacies of the interplay between system components and shed light on the impacts of disruptions of either system on the other. This model, which includes multiple infrastructure components, shows the importance of a holistic understanding of the systems, for cooperation across the systems' physical boundaries and enhanced benefits at larger scales. This study investigates water-power resource management in an irrigation system via the analysis of physical links and highlights strengths and vulnerabilities. The effects of water shortage, water re-allocation and load shedding are analyzed through scenarios designed to illustrate the utility of such a model. Results highlight the importance of inter-reservoir relationships for alleviating effects of disruption and unforeseen rise in energy demand. Water storage is also critical, helping to mitigate the impacts of water scarcity, and by extension, to keep the energy system unaffected. It can be a viable part of the solution to compensate for the negative impact of shortage for both resources.

7.
ISA Trans ; 151: 33-40, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38876951

ABSTRACT

This paper is concerned with the secure output consensus problem for the heterogeneous multi-agent systems under the event-triggered scheme in the presence of the denial-of-service attack. Without detecting the attack, the hold-input controller update strategy is adopted when some transmission data may be lost due to the effect of the attack. Based on the tolerable duration of the attack, a novel edge-based event-triggered scheme is developed. The scheme can avoid continuous communication and exclude Zeno behavior. With the aid of the switched system theory, output consensus is preserved. An example shows the effectiveness.

8.
Neural Netw ; 178: 106432, 2024 Oct.
Article in English | MEDLINE | ID: mdl-38901092

ABSTRACT

In the realm of fully cooperative multi-agent reinforcement learning (MARL), effective communication can induce implicit cooperation among agents and improve overall performance. In current communication strategies, agents are allowed to exchange local observations or latent embeddings, which can augment individual local policy inputs and mitigate uncertainty in local decision-making processes. Unfortunately, in previous communication schemes, agents may potentially receive irrelevant information, which increases training difficulty and leads to poor performance in complex settings. Furthermore, most existing works lack the consideration of the impact of small coalitions formed by agents in the multi-agent system. To address these challenges, we propose HyperComm, a novel framework that uses the hypergraph to model the multi-agent system, improving the accuracy and specificity of communication among agents. Our approach brings the concept of hypergraph for the first time in multi-agent communication for MARL. Within this framework, each agent can communicate more effectively with other agents within the same hyperedge, leading to better cooperation in environments with multiple agents. Compared to those state-of-the-art communication-based approaches, HyperComm demonstrates remarkable performance in scenarios involving a large number of agents.


Subject(s)
Communication , Reinforcement, Psychology , Humans , Decision Making/physiology , Neural Networks, Computer , Computer Simulation , Algorithms
9.
Neural Netw ; 175: 106270, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38569458

ABSTRACT

This paper addresses the predefined-time distributed optimization of nonlinear multi-agent system using a hierarchical control approach. Considering unknown nonlinear functions and external disturbances, we propose a two-layer hierarchical control framework. At the first layer, a predefined-time distributed estimator is employed to produce optimal consensus trajectories. At the second layer, a neural-network-based predefined-time disturbance observer is introduced to estimate the disturbance, with neural networks used to approximate the unknown nonlinear functions. A neural-network-based anti-disturbance sliding mode control mechanism is presented to ensure that the system trajectories can track the optimal trajectories within a predefined time. The feasibility of this hierarchical control framework is verified by utilizing the Lyapunov method. Numerical simulations are conducted separately using models of robotic arms and mobile robots to validate the effectiveness of the proposed method.


Subject(s)
Algorithms , Computer Simulation , Neural Networks, Computer , Nonlinear Dynamics , Robotics , Time Factors
10.
Neural Netw ; 174: 106129, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38508044

ABSTRACT

Multi-task multi-agent systems (MASs) are challenging to model because they involve heterogeneous agents with different behavior patterns that need to cooperate across various tasks. Existing networks for single-agent policies are not suitable for this setting, as they cannot share policies among agents without losing task-specific performance. We propose a novel framework called Role-based Multi-Agent Transformer (RoMAT), which uses a sequence modeling technique and a role-based actor to enable agents to adapt to different tasks and roles in MASs. RoMAT has a modular model architecture, where backbone networks are shared by all agents, but a small part of the parameters (role-based actor) is independent, depending on the agents' exclusive structures. We evaluate RoMAT on several benchmark tasks and show that it can capture the behavior patterns of heterogeneous agents and achieve better performance and generalization than other methods in both single and multi-task settings.


Subject(s)
Benchmarking , Generalization, Psychological , Policy
11.
Neural Netw ; 174: 106243, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38531123

ABSTRACT

Generative Flow Networks (GFlowNets) aim to generate diverse trajectories from a distribution in which the final states of the trajectories are proportional to the reward, serving as a powerful alternative to reinforcement learning for exploratory control tasks. However, the individual-flow matching constraint in GFlowNets limits their applications for multi-agent systems, especially continuous joint-control problems. In this paper, we propose a novel Multi-Agent generative Continuous Flow Networks (MACFN) method to enable multiple agents to perform cooperative exploration for various compositional continuous objects. Technically, MACFN trains decentralized individual-flow-based policies in a centralized global-flow-based matching fashion. During centralized training, MACFN introduces a continuous flow decomposition network to deduce the flow contributions of each agent in the presence of only global rewards. Then agents can deliver actions solely based on their assigned local flow in a decentralized way, forming a joint policy distribution proportional to the rewards. To guarantee the expressiveness of continuous flow decomposition, we theoretically derive a consistency condition on the decomposition network. Experimental results demonstrate that the proposed method yields results superior to the state-of-the-art counterparts and better exploration capability. Our code is available at https://github.com/isluoshuang/MACFN.


Subject(s)
Learning , Policy , Reinforcement, Psychology , Reward
12.
Sensors (Basel) ; 24(6)2024 Mar 08.
Article in English | MEDLINE | ID: mdl-38544027

ABSTRACT

The integration of the Internet of Things (IoT) and artificial intelligence (AI) is critical to the advancement of ambient intelligence (AmI), as it enables systems to understand contextual information and react accordingly. While many solutions focus on user-centric services that provide enhanced comfort and support, few expand on scenarios in which multiple users are present simultaneously, leaving a significant gap in service provisioning. To address this problem, this paper presents a multi-agent system in which software agents, aware of context, advocate for their users' preferences and negotiate service settings to achieve solutions that satisfy everyone, taking into account users' flexibility. The proposed negotiation algorithm is illustrated through a smart lighting use case, and the results are analyzed in terms of the concrete preferences defined by the user and the selected settings resulting from the negotiation in regard to user flexibility.

13.
ISA Trans ; 148: 412-421, 2024 May.
Article in English | MEDLINE | ID: mdl-38423837

ABSTRACT

In this paper, a Distributed Discrete-time Exponential Sliding Mode Consensus (DDESMC) protocol is proposed for the leader-follower consensus of a Discrete Multi-Agent System (DMAS). The proposed protocol ensures not only the minimal consensus effort and reaching time for the consensus among the agents but also the minimum consensus deviation in the order of O(T^3). The consensus stability of DMAS with the proposed protocol is analyzed using Lyapunov theory and the maximum number of reaching steps required for achieving the consensus among all agents is calculated. The proposed protocol is validated in a simulation and experimental setup comprised of multiple 2-Degree of Freedom (DOF) robotic arms where one of the robotic arms is real, and others are virtual. Further, the consensus performance is compared with the existing protocol in the literature, and it is inferred that the proposed protocol outperforms it in terms of time and effort required to achieve the consensus and the deviation from the consensus.
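To make the leader-follower consensus setting concrete, the following is a minimal sketch of a plain linear discrete-time consensus update. It is not the DDESMC protocol from the abstract, which is a sliding-mode design; the function name, the three-agent path graph, the pinning vector, and the step size are all hypothetical choices made for illustration.

```python
import numpy as np


def leader_follower_consensus(adjacency, pinning, x_init, leader_state,
                              step=0.2, num_steps=500):
    """Simulate first-order discrete-time leader-follower consensus.

    Each follower i applies (illustrative linear protocol, an assumption):
        x_i <- x_i + step * (sum_j a_ij (x_j - x_i) + b_i (x_leader - x_i))
    """
    a = np.asarray(adjacency, dtype=float)   # follower-to-follower weights
    b = np.asarray(pinning, dtype=float)     # 1 if a follower sees the leader
    x = np.asarray(x_init, dtype=float).copy()
    degree = a.sum(axis=1)
    for _ in range(num_steps):
        neighbor_term = a @ x - degree * x   # sum_j a_ij (x_j - x_i)
        leader_term = b * (leader_state - x)
        x = x + step * (neighbor_term + leader_term)
    return x


# Hypothetical example: three followers on a path graph, only the first
# follower receives the (constant) leader state.
ADJACENCY = [[0, 1, 0],
             [1, 0, 1],
             [0, 1, 0]]
PINNING = [1, 0, 0]
final_states = leader_follower_consensus(
    ADJACENCY, PINNING, x_init=[5.0, -3.0, 0.5], leader_state=1.0
)
print(final_states)
```

The step size must be small enough relative to the largest eigenvalue of the Laplacian-plus-pinning matrix for the iteration to be stable; the value used here satisfies that for this particular graph.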

14.
ISA Trans ; 147: 1-12, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38342650

ABSTRACT

This paper mainly studies the consensus control strategy for a novel heuristic nonlinear multi-agent system. Compared with most existing related research, firstly, the novel heuristic nonlinear multi-agent system has the ability to construct its communication network topology heuristically, and can withstand long-term DoS (Denial of Service) attacks, with the advantages of high practicality and security. Secondly, in order to control the multi-agent system, a control protocol based on both saturation effect and impulse control mechanism is studied, which has the advantages of high efficiency, low cost and wide applicability. Thirdly, for the multi-agent system, its dynamic model is constructed and analyzed by Lyapunov stability theory and matrix measure theory, and some sufficient conditions for achieving consensus are obtained. Finally, through two simulation experiments and some corresponding comparative analysis, the correctness, efficiency, and superiority of the theories proposed in this paper are verified.

15.
ISA Trans ; 147: 90-100, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38342651

ABSTRACT

This study addresses the fault detection (FD) problem in heterogeneous multi-agent systems (HMASs) with unknown system models. A novel data-driven FD scheme is proposed by properly combining hardware and temporal redundant information to accelerate the generation of fault detectors while ensuring detection accuracy. The computational burden associated with the FD scheme is alleviated by applying a two-step order reduction algorithm. Additionally, an optimization problem is formulated, simplified and solved to achieve a compromise between sensitivity to faults and robustness to disturbances, further enhancing the detection performance of agents. Through a series of examples and comparative experiments, the effectiveness and improvements of the proposed approach are demonstrated.

16.
Sensors (Basel) ; 24(2)2024 Jan 17.
Article in English | MEDLINE | ID: mdl-38257681

ABSTRACT

Although the formation control of multi-agent systems has been widely investigated from various aspects, the problem is still not well resolved, especially for the case of distributed output-feedback formation controller design without input information exchange among neighboring agents. Using relative output information, this paper presents a novel distributed reduced-order estimation of the formation error at a predefined time. Based on the proposed distributed observer, a neural-network-based formation controller is then designed for multi-agent systems with connected graphs. The results are verified by both theoretical analysis and a simulation example.

17.
Neural Netw ; 172: 106101, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38232426

ABSTRACT

The Centralized Training and Decentralized Execution (CTDE) paradigm, where a centralized critic is allowed to access global information during the training phase while maintaining the learned policies executed with only local information in a decentralized way, has achieved great progress in recent years. Despite the progress, CTDE may suffer from the issue of Centralized-Decentralized Mismatch (CDM): the suboptimality of one agent's policy can exacerbate policy learning of other agents through the centralized joint critic. In contrast to centralized learning, the cooperative model that most closely resembles the way humans cooperate in nature is fully decentralized, i.e. Independent Learning (IL). However, there are still two issues that need to be addressed before agents coordinate through IL: (1) how agents become aware of the presence of other agents, and (2) how to coordinate with other agents to improve joint policy under IL. In this paper, we propose an inference-based coordinated MARL method: Deep Motor System (DMS). DMS first presents the idea of individual intention inference where agents are allowed to disentangle other agents from their environment. Secondly, causal inference is introduced to enhance coordination by reasoning about each agent's effect on others' behavior. The proposed model was extensively evaluated on a series of Multi-Agent MuJoCo and StarCraft II tasks. Results show that the proposed method outperforms independent learning algorithms and the coordination behavior among agents can be learned even without the CTDE paradigm compared to the state-of-the-art baselines including IPPO and HAPPO.


Subject(s)
Algorithms , Intention , Humans , Policy , Problem Solving , Reinforcement, Psychology
18.
Neural Netw ; 169: 673-684, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37972511

ABSTRACT

This paper considers a class of multi-agent distributed convex optimization with a common set of constraints and provides several continuous-time neurodynamic approaches. In problem transformation, l1 and l2 penalty methods are used respectively to cast the linear consensus constraint into the objective function, which avoids introducing auxiliary variables and only involves information exchange among primal variables in the process of solving the problem. For nonsmooth cost functions, two differential inclusions with projection operator are proposed. Without convexity of the differential inclusions, the asymptotic behavior and convergence properties are explored. For smooth cost functions, by harnessing the smoothness of l2 penalty function, finite- and fixed-time convergent algorithms are provided via a specifically designed average consensus estimator. Finally, several numerical examples in the multi-agent simulation environment are conducted to illustrate the effectiveness of the proposed neurodynamic approaches.
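As a rough illustration of the penalty idea described in the abstract, one common way to move a linear consensus constraint into the objective is sketched below. The notation is assumed for illustration and is not taken from the paper: N agents with local costs f_i, a common constraint set Ω, the Laplacian L of a connected undirected graph, the stacked variable x = (x_1, ..., x_N), and a penalty parameter σ > 0.

```latex
% Original problem with an explicit consensus constraint
\min_{x_i \in \Omega} \; \sum_{i=1}^{N} f_i(x_i)
\quad \text{s.t.} \quad (L \otimes I_n)\, x = 0

% l1-penalized form (nonsmooth penalty term)
\min_{x_i \in \Omega} \; \sum_{i=1}^{N} f_i(x_i) + \sigma \,\lVert (L \otimes I_n)\, x \rVert_{1}

% l2-penalized form (smooth penalty term)
\min_{x_i \in \Omega} \; \sum_{i=1}^{N} f_i(x_i) + \frac{\sigma}{2} \,\lVert (L \otimes I_n)\, x \rVert_{2}^{2}
```

In both penalized forms only the primal variables x_i appear, so no auxiliary (dual) variables need to be exchanged between neighbors, which is the property the abstract highlights.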


Subject(s)
Algorithms , Neural Networks, Computer , Computer Simulation , Consensus
19.
R Soc Open Sci ; 10(12): 230736, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38094273

ABSTRACT

This paper addresses the problem of robust fault estimation for multi-agent systems (MASs) under communication constraints. Taking into account the possible data packet loss (DPL) in the information interaction of each subsystem, MASs are remodelled as switching systems by introducing a variable sampling strategy. Then, using the local information among agents, a novel intermediate observer design method based on switching scheme is proposed to estimate faults of MASs. Combining Lyapunov's criterion and linear matrix inequality, sufficient conditions for the intermediate observer to be exponentially stable and have H∞ performance against bounded disturbances and the DPL are given. Finally, some simulations are provided to verify the effectiveness of the proposed method.

20.
Viruses ; 15(12)2023 Nov 23.
Article in English | MEDLINE | ID: mdl-38140541

ABSTRACT

This study proposes a modification of the GeoCity model previously developed by the authors, detailing the age structure of the population, personal schedule on weekdays and weekends, and individual health characteristics of the agents. This made it possible to build a more realistic model of the functioning of the city and its residents. The developed model made it possible to simulate the spread of three types of strain of the COVID-19 virus, and to analyze the adequacy of this model in the case of unhindered spread of the virus among city residents. Calculations based on the proposed model show that SARS-CoV-2 spreads mainly from contacts in workplaces and transport, and schoolchildren and preschool children are the recipients, not the initiators of the epidemic. The simulations showed that fluctuations in the dynamics of various indicators of the spread of SARS-CoV-2 were associated with the difference in the daily schedule on weekdays and weekends. The results of the calculations showed that the daily schedules of people strongly influence the spread of SARS-CoV-2. Under the assumptions of the model, the results show that for the more contagious "fast" strains of SARS-CoV-2 (omicron), immunocompetent people become a significant source of infection. For the less contagious "slow" strains (alpha) of SARS-CoV-2, the most active source of infection is immunocompromised individuals (pregnant women). The more contagious, or "fast" strain of the SARS-CoV-2 virus (omicron), spreads faster in public transport. For less contagious, or "slow" strains of the virus (alpha), the greatest infection occurs due to work and educational contacts.


Subject(s)
COVID-19 , Epidemics , Pregnancy , Child, Preschool , Humans , Female , Child , COVID-19/epidemiology , SARS-CoV-2 , Immunocompromised Host , Transportation