1.
Article in English | MEDLINE | ID: mdl-37037246

ABSTRACT

Recently, multiagent reinforcement learning (MARL) has shown great potential for learning cooperative policies in multiagent systems (MASs). However, a notable drawback of current MARL is its low sample efficiency, which requires a huge number of interactions with the environment and greatly hinders real-world application. Fortunately, effectively incorporating experience knowledge can help MARL find effective solutions quickly, which significantly alleviates this drawback. In this article, a novel multiexperience-assisted reinforcement learning (MEARL) method is proposed to improve the learning efficiency of MASs. Specifically, monotonicity-constrained reward shaping is designed using expert experience to provide additional individual rewards that guide multiagent learning efficiently, while guaranteeing that the team optimization objective remains invariant. Furthermore, a reward distribution estimator is developed to model the implicit reward distribution of the environment using transition experience, that is, collected samples (state-action pair, reward, and next state). This estimator predicts each agent's expected reward for the action taken, which allows the state value function to be estimated accurately and accelerates its convergence. The performance of MEARL is evaluated on two multiagent environment platforms: our designed unmanned aerial vehicle combat (UAV-C) and StarCraft II Micromanagement (SCII-M). Simulation results demonstrate that the proposed MEARL greatly improves the learning efficiency and performance of MASs and is superior to state-of-the-art methods in multiagent tasks.
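
The abstract does not spell out the monotonicity-constrained shaping itself. As a rough illustration of how extra shaping rewards can leave an optimization objective unchanged, the sketch below uses classical potential-based shaping, a different and well-known technique swapped in here for illustration. The potential function, the goal state, and all names are hypothetical assumptions, not taken from the article.

```python
# Toy illustration of potential-based reward shaping (NOT the MEARL method).
# All names and the potential function are hypothetical assumptions.

GAMMA = 0.9


def potential(state: int) -> float:
    """Hypothetical expert potential: states closer to goal state 5 score higher."""
    return -abs(5 - state)


def shaping_term(state: int, next_state: int, gamma: float = GAMMA) -> float:
    """Extra reward F(s, s') = gamma * phi(s') - phi(s)."""
    return gamma * potential(next_state) - potential(state)


def shaping_return(states, gamma: float = GAMMA) -> float:
    """Discounted sum of the shaping terms along a trajectory of states."""
    terms = [shaping_term(s, s_next, gamma) for s, s_next in zip(states, states[1:])]
    return sum((gamma ** t) * f for t, f in enumerate(terms))


def endpoint_value(states, gamma: float = GAMMA) -> float:
    """Closed form the sum telescopes to: gamma^T * phi(s_T) - phi(s_0)."""
    steps = len(states) - 1
    return (gamma ** steps) * potential(states[-1]) - potential(states[0])


trajectory_a = [0, 1, 3, 2, 4, 5]
trajectory_b = [0, 2, 2, 4, 3, 5]  # same start, end, and length as trajectory_a
total_a = shaping_return(trajectory_a)
total_b = shaping_return(trajectory_b)
```

Because the shaping terms telescope, both trajectories receive the same total bonus, so the bonus cannot change which of them has the higher return. That is the sense in which this kind of shaping keeps the original objective invariant.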

2.
IEEE Trans Neural Netw Learn Syst ; 34(11): 8235-8249, 2023 Nov.
Article in English | MEDLINE | ID: mdl-35180087

ABSTRACT

In this article, a novel method, called attention enhanced reinforcement learning (AERL), is proposed to address issues including complex interaction, limited communication range, and time-varying communication topology in multiagent cooperation. AERL includes a communication enhanced network (CEN), a graph spatiotemporal long short-term memory network (GST-LSTM), and parameter-sharing multi-pseudo critic proximal policy optimization (PS-MPC-PPO). Specifically, CEN, which is based on a graph attention mechanism, is designed to enlarge the agents' communication range and to deal with complex interaction among the agents. GST-LSTM, which replaces the standard fully connected (FC) operator in LSTM with a graph attention operator, is designed to capture temporal dependence while maintaining the spatial structure learned by CEN. PS-MPC-PPO extends proximal policy optimization (PPO) to multiagent systems, using parameter sharing so that training scales to environments with a large number of agents; it is designed with multi-pseudo critics to mitigate the bias problem in training and accelerate convergence. Simulation results for three groups of representative scenarios, including formation control, group containment, and predator-prey games, demonstrate the effectiveness and robustness of AERL.
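
To make the idea of attention over a limited communication range concrete, here is a minimal sketch of one agent computing softmax attention weights over only those neighbours it can reach. It is an assumption-laden simplification: scores are plain dot products rather than the learned scoring of a graph attention layer, and the function name, positions, and feature vectors are hypothetical. It does not reproduce CEN.

```python
import math


def attention_weights(positions, features, agent, comm_range):
    """Softmax attention of one agent over neighbours within comm_range.

    Scores are plain dot products of feature vectors (a simplification;
    a real graph attention layer learns its scoring function).
    Returns a dict {neighbour_index: weight}; unreachable agents are absent.
    """
    xi, yi = positions[agent]
    scores = {}
    for j, (xj, yj) in enumerate(positions):
        if j == agent:
            continue
        if math.hypot(xj - xi, yj - yi) <= comm_range:
            scores[j] = sum(a * b for a, b in zip(features[agent], features[j]))
    if not scores:
        return {}
    max_score = max(scores.values())  # subtract the max for numerical stability
    exps = {j: math.exp(s - max_score) for j, s in scores.items()}
    total = sum(exps.values())
    return {j: e / total for j, e in exps.items()}


# Hypothetical 2-D positions and 2-D feature vectors for four agents.
positions = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (10.0, 10.0)]
features = [[1.0, 0.0], [0.5, 0.5], [1.0, 1.0], [2.0, 2.0]]
weights = attention_weights(positions, features, agent=0, comm_range=3.0)
```

Agent 3 lies outside the range of agent 0 and so receives no weight, while the in-range neighbours share a total weight of 1. When agents move, recomputing the mask at each step is one simple way to follow a time-varying topology.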

3.
IEEE Trans Cybern ; 52(7): 6809-6821, 2022 Jul.
Article in English | MEDLINE | ID: mdl-33301412

ABSTRACT

This article presents a new command-filtered composite adaptive neural control scheme for uncertain nonlinear systems. Compared with existing works, this approach focuses on achieving finite-time convergent composite adaptive control for higher-order nonlinear systems with unknown nonlinearities, parameter uncertainties, and external disturbances. First, radial basis function neural networks (NNs) are utilized to approximate the unknown functions of the considered uncertain nonlinear system. Prediction errors are constructed from serial-parallel nonsmooth estimation models and fused with the tracking errors to update the weights of the NNs. Afterward, the composite adaptive neural backstepping control scheme is proposed via nonsmooth command filter and adaptive disturbance estimation techniques. The proposed control scheme ensures that high-precision tracking performance and NN approximation performance are achieved simultaneously. Meanwhile, it avoids the singularity problem in the finite-time backstepping framework. Moreover, it is proved that all signals in the closed-loop control system converge in finite time. Finally, simulation results are given to illustrate the effectiveness of the proposed control scheme.


Subject(s)
Neural Networks, Computer , Nonlinear Dynamics , Computer Simulation , Feedback
4.
IEEE Trans Cybern ; 51(5): 2504-2517, 2021 May.
Article in English | MEDLINE | ID: mdl-31329154

ABSTRACT

This paper presents a novel robust adaptive tracking control method for a hypersonic vehicle in the cruise flight stage, based on an interval type-2 fuzzy-logic system (IT2-FLS) and a small-gain approach. After input-output linearization, the vehicle model can be decomposed into two uncertain subsystems by considering matching disturbances and parametric uncertainties. For each subsystem, an interval type-2 Takagi-Sugeno-Kang fuzzy logic system (IT2-TSK-FLS) is then employed to approximate the unavailable model information. Following the idea of the small-gain approach, a composite feedback form is constructed for each subsystem, based on which the final robust adaptive tracking control law is developed. Rigorous stability analysis shows that all signals in the derived closed-loop system are kept uniformly ultimately bounded (UUB). The main contribution of this paper is that the proposed control law for the hypersonic vehicle requires only two adaptive parameters in total, which greatly alleviates the computation and storage burden in practice; its superiority over the conventional minimal-learning-parameter (MLP)-based law is also specifically illustrated. Comparative numerical simulations of three cases demonstrate the effectiveness of the proposed control method under complicated uncertainties.

5.
IEEE Trans Neural Netw Learn Syst ; 32(6): 2358-2372, 2021 Jun.
Article in English | MEDLINE | ID: mdl-32673195

ABSTRACT

Generating collision-free, time-efficient paths in an uncertain dynamic environment poses huge challenges for the formation control with collision avoidance (FCCA) problem in a leader-follower structure. In particular, the followers have to take both formation maintenance and collision avoidance into account simultaneously. Unfortunately, most existing works simply combine methods that deal with the two problems separately. In this article, a new method based on deep reinforcement learning (RL) is proposed to solve the FCCA problem. Specifically, the learning-based policy is extended to the field of formation control through a two-stage training framework: an imitation learning (IL) stage followed by an RL stage. In the IL stage, a model-guided method consisting of a consensus theory-based formation controller and an optimal reciprocal collision avoidance strategy is designed to speed up training and increase efficiency. In the RL stage, a compound reward function is presented to guide the training. In addition, we design a formation-oriented network structure to perceive the environment. Long short-term memory is adopted to enable the network structure to perceive information about an uncertain number of obstacles, and a transfer training approach is adopted to improve the generalization of the network across different scenarios. Numerous representative simulations are conducted, and our method is further deployed on an experimental platform based on a multiomnidirectional-wheeled car system. The effectiveness and practicability of the proposed method are validated by both the simulation and experiment results.

6.
Article in English | MEDLINE | ID: mdl-23002393

ABSTRACT

Coronary heart disease (CHD) is a leading cause of morbidity and mortality in China. In the past, the diagnosis of CHD in Traditional Chinese Medicine (TCM) was based mainly on experience. In this paper, we propose four MI-based association algorithms to analyze phenotype networks of CHD, and establish a scale of syndromes to automatically generate a diagnosis for patients based on their phenotypes. We also compare how the core syndromes change when CHD is combined with other diseases, and present the different phenotype spectra.

7.
Article in English | MEDLINE | ID: mdl-22567030

ABSTRACT

Coronary artery disease (CAD) is the leading cause of death in the world. The differentiation of syndrome (ZHENG) is the criterion for diagnosis and therapy in TCM. Therefore, in silico syndrome prediction can improve the performance of treatment. In this paper, we present a Bayesian network framework to construct a high-confidence syndrome predictor based on the optimum symptom subset, which is selected by Support Vector Machine (SVM) feature selection. The syndromes of CAD can be divided into asthenia and sthenia syndromes. According to the hierarchical characteristics of syndromes, we first label every case with one of three syndrome types (asthenia, sthenia, or both) to handle patients who present several syndromes. On the basis of these three syndrome classes, we design SVM feature selection to obtain the optimum symptom subset and compare this subset with the one from Markov blanket feature selection using ROC. Using this subset, six predictors of CAD syndromes are constructed with the Bayesian network technique. We also build Naïve Bayes, C4.5, Logistic, and radial basis function (RBF) network models for comparison with the Bayesian network. In conclusion, the Bayesian network method based on the optimum symptoms is a practical way to predict six syndromes of CAD in TCM.

8.
Pharm Biol ; 49(5): 445-55, 2011 May.
Article in English | MEDLINE | ID: mdl-21501098

ABSTRACT

CONTEXT: Chinese herbal medicine (CHM) is a complex multicomponent system that interacts with multiple targets and functions via multiple pathways based on the whole human system. Therefore, identification of the key constituents of Chinese herbals (CH) not only plays a critical role in the quality control of CHM, but also lays a basis for their redevelopment. OBJECTIVE: Identification of key constituents in the volatile oil (VO) of Ligusticum chuanxiong Hort (Umbelliferae) (LCH), a CHM that has been used clinically in China for thousands of years. MATERIALS AND METHODS: The VO of LCH was pharmacologically demonstrated to have blood vessel activity (BVA) in vitro and was chemically investigated by gas chromatography-mass spectrometry (GC-MS) analysis. Data mining approaches were used to bridge the gap between chemical constituents (CCS) and bioactivities and to help select the key constituents of LCH automatically. RESULTS: Thirteen effective constituents of LCH with significant association with BVA were identified. CONCLUSION: The combination of 13 key constituents would accurately predict the blood vessel bioactivities of LCH. Furthermore, the strategy presented here lays a strong basis for identifying the key constituents of CH and elucidating the material basis of CHM.


Subject(s)
Data Mining/methods , Drugs, Chinese Herbal/analysis , Ligusticum/chemistry , Oils, Volatile/analysis , Animals , Gas Chromatography-Mass Spectrometry , Male , Oils, Volatile/pharmacology , Rats , Rats, Sprague-Dawley , Vasodilation/drug effects
9.
IEEE Trans Syst Man Cybern B Cybern ; 39(3): 788-99, 2009 Jun.
Article in English | MEDLINE | ID: mdl-19336336

ABSTRACT

This paper addresses the robust trajectory tracking problem for a redundantly actuated omnidirectional mobile manipulator in the presence of uncertainties and disturbances. The control algorithms are developed using the sliding mode control (SMC) technique. First, a dynamic model is derived based on the practical omnidirectional mobile manipulator system. Then, an SMC scheme based on the fixed large upper boundedness of the system dynamics (FLUBSMC) is designed to ensure trajectory tracking of the closed-loop system. However, the FLUBSMC scheme has an inherent deficiency: it requires computing the upper bound of the system dynamics, and it may cause high noise amplification and high control cost, particularly for the complex dynamics of the omnidirectional mobile manipulator system. Therefore, a robust neural network (NN)-based sliding mode controller (NNSMC), which uses an NN to identify the unstructured system dynamics directly, is further proposed to overcome the disadvantages of FLUBSMC and reduce the online computing burden of conventional NN adaptive controllers. Using the learning ability of the NN, NNSMC can effectively coordinate the control of the omnidirectional mobile platform and the mounted manipulator, which have different dynamics. The stability of the closed-loop system, the convergence of the NN weight-updating process, and the boundedness of the NN weight estimation errors are all strictly guaranteed. Then, in order to improve the NN learning efficiency, a partitioned NN structure is applied. Finally, simulation examples are given to demonstrate that the proposed NNSMC approach can guarantee the whole system's convergence to the desired manifold with prescribed performance.
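
The role of a fixed upper bound in a sliding mode controller can be seen on a toy problem. The sketch below is an assumption throughout, not the paper's manipulator model or controller: a scalar double integrator x'' = u + d(t) tracks sin(t), and the switching gain is chosen larger than the assumed disturbance bound of 0.5. The function name and all parameter values are hypothetical.

```python
import math


def simulate_smc(gain=2.0, lam=2.0, dt=0.001, t_end=10.0):
    """Track x_d(t) = sin(t) with the toy plant x'' = u + d(t), |d| <= 0.5.

    The switching gain must exceed the disturbance bound for the sliding
    variable to reach zero. Returns the final tracking error x - x_d.
    """
    x, v = 1.0, 0.0  # initial position and velocity (start off the reference)
    steps = int(round(t_end / dt))
    for k in range(steps):
        t = k * dt
        xd, xd_dot, xd_ddot = math.sin(t), math.cos(t), -math.sin(t)
        e, e_dot = x - xd, v - xd_dot
        s = e_dot + lam * e                         # sliding variable
        sign_s = (s > 0) - (s < 0)
        u = xd_ddot - lam * e_dot - gain * sign_s   # switching control law
        d = 0.5 * math.sin(3.0 * t)                 # disturbance, unknown to the controller
        x, v = x + v * dt, v + (u + d) * dt         # explicit Euler step
    return x - math.sin(steps * dt)


final_error = simulate_smc()
```

With this law the sliding variable obeys s' = d - gain * sign(s), so a gain above the disturbance bound drives s to zero, after which the error decays at rate lam. The sign term also makes the control input chatter, and a conservative gain makes that worse, which is one way to read the "high control cost" drawback noted in the abstract.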

10.
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi ; 25(5): 1003-8, 2008 Oct.
Article in Chinese | MEDLINE | ID: mdl-19024435

ABSTRACT

Mutual information can measure arbitrary statistical dependencies and has been widely applied in many fields. However, when mutual information is used as the correlation measure, features with more values tend to be chosen. To solve this problem, a novel definition of correlation degree is proposed in this paper. It avoids the tendency to select features with more values that arises when mutual information is used as the measure, and it also avoids the tendency to select features with fewer values that arises when correlation degree coefficients are used as the measure. In the method using the novel definition, the number of selected features is determined by the correct classification rate of a Support Vector Machine. Finally, the efficiency of the method is illustrated by analyzing the symptom combinations of seven essential elements of the syndrome corresponding to stroke.
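
The bias toward many-valued features is easy to reproduce on a toy dataset. The sketch below estimates mutual information from empirical counts and, for contrast, a standard normalisation known as symmetric uncertainty. That normalisation is an assumption chosen for illustration and is not the correlation degree defined in the paper, whose formula the abstract does not give. The data and feature names are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Empirical Shannon entropy in bits."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), from empirical frequencies."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def symmetric_uncertainty(xs, ys):
    """2 * I(X;Y) / (H(X) + H(Y)); a standard normalisation, NOT the paper's measure."""
    denom = entropy(xs) + entropy(ys)
    return 0.0 if denom == 0 else 2.0 * mutual_information(xs, ys) / denom


# Hypothetical toy data: eight cases with a binary class label.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
binary_feat = [0, 0, 0, 1, 1, 1, 1, 1]  # two values, informative but imperfect
id_like = [0, 1, 2, 3, 4, 5, 6, 7]      # eight distinct values, like a record ID

mi_binary = mutual_information(binary_feat, labels)
mi_id = mutual_information(id_like, labels)
su_binary = symmetric_uncertainty(binary_feat, labels)
su_id = symmetric_uncertainty(id_like, labels)
```

Raw mutual information ranks the ID-like feature first, because a unique value per case trivially determines the label, whereas the normalised score ranks the binary feature first. The abstract notes that coefficient-style measures can over-correct toward features with fewer values, which is the gap its proposed definition aims to close.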


Subject(s)
Computing Methodologies , Data Interpretation, Statistical , Medicine, Chinese Traditional/methods , Diagnosis, Differential , Humans , Medicine, Chinese Traditional/standards , Models, Statistical , Stroke/diagnosis