Results 1 - 16 of 16
1.
Article in English | MEDLINE | ID: mdl-38889021

ABSTRACT

This article proposes a data-driven, model-free inverse Q-learning algorithm for continuous-time linear quadratic regulators (LQRs). Using an agent's trajectories of states and optimal control inputs, the algorithm reconstructs the cost function that yields those same trajectories. The article first poses a model-based inverse value iteration scheme that uses the agent's system dynamics. Then, an online model-free inverse Q-learning algorithm is developed to recover the agent's cost function using only the demonstrated trajectories. It is more efficient than existing inverse reinforcement learning (RL) algorithms because it avoids the repetitive RL runs in inner loops. The proposed algorithms need no initial stabilizing control policies and solve for unbiased solutions. Asymptotic stability, convergence, and robustness of the proposed algorithm are guaranteed. Theoretical analysis and simulation examples show the effectiveness and advantages of the proposed algorithms.
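As a hedged illustration of the model-based inverse step described in this abstract (a scalar stand-in sketched under assumed dynamics, not the article's algorithm), a demonstrated LQR gain together with the dynamics and a fixed input weight pins down the state weight through the algebraic Riccati equation. All names and values below are hypothetical.

```python
import math

# Hypothetical scalar continuous-time LQR: dx/dt = a*x + b*u with cost
# J = integral of (q*x^2 + r*u^2) dt. The optimal gain is k = b*p/r, where p
# is the positive root of the Riccati equation 2*a*p - (b^2/r)*p^2 + q = 0.
def lqr_gain(a, b, q, r):
    A_, B_, C_ = b ** 2 / r, -2 * a, -q          # (b^2/r)*p^2 - 2*a*p - q = 0
    p = (-B_ + math.sqrt(B_ ** 2 - 4 * A_ * C_)) / (2 * A_)
    return b * p / r

# Inverse step: given the demonstrated gain k, the dynamics (a, b), and a
# fixed input weight r, recover the state weight q that makes k optimal.
def inverse_lqr_q(a, b, r, k):
    p = r * k / b                                # invert k = b*p/r
    return (b ** 2 / r) * p ** 2 - 2 * a * p     # rearranged Riccati equation

a, b, q_true, r = -1.0, 1.0, 4.0, 1.0
k = lqr_gain(a, b, q_true, r)       # the "demonstrated" expert gain
q_rec = inverse_lqr_q(a, b, r, k)   # recovers q_true up to rounding
```

The data-driven version in the article dispenses with (a, b); this sketch only shows why the demonstrated trajectories determine a consistent cost function.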

2.
IEEE Trans Cybern ; 54(3): 1391-1402, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37906478

ABSTRACT

This article proposes a data-efficient, model-free reinforcement learning (RL) algorithm that uses Koopman operators for complex nonlinear systems. A high-dimensional, data-driven optimal control of the nonlinear system is developed by lifting it into a linear system model. We use a data-driven, model-based RL framework to derive an off-policy Bellman equation. Building upon this equation, we deduce a data-efficient RL algorithm that does not need a Koopman-built linear system model. The algorithm preserves dynamic information while reducing the data required for optimal control learning. Numerical and theoretical analyses of the Koopman eigenfunctions used for dataset truncation in the proposed model-free, data-efficient RL algorithm are discussed. We validate the framework on the excitation control of a power system.
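A minimal sketch of the lifting idea mentioned above (a textbook toy system, not the article's power-system application): when a nonlinear system admits a finite Koopman-invariant set of observables, the dynamics become exactly linear in the lifted coordinates, so linear predictions reproduce the nonlinear trajectory. The system and constants below are assumptions for illustration.

```python
# Toy nonlinear system with an exact finite-dimensional Koopman lift:
#   x1+ = lam*x1,   x2+ = mu*x2 + c*x1^2
lam, mu, c = 0.9, 0.5, 1.0

def step(x1, x2):
    """One step of the nonlinear dynamics."""
    return lam * x1, mu * x2 + c * x1 ** 2

def lift(x1, x2):
    """Koopman observables psi = (x1, x2, x1^2)."""
    return (x1, x2, x1 ** 2)

# In the lifted coordinates the dynamics are exactly linear: psi+ = K psi.
K = [(lam, 0.0, 0.0),
     (0.0, mu,  c),
     (0.0, 0.0, lam ** 2)]

def lifted_step(psi):
    """One step of the lifted linear dynamics."""
    return tuple(sum(K[i][j] * psi[j] for j in range(3)) for i in range(3))

x = (2.0, -1.0)
psi = lift(*x)
for _ in range(5):
    x = step(*x)
    psi = lifted_step(psi)
# The first two lifted coordinates track the nonlinear state:
# psi[0] ~ x[0] and psi[1] ~ x[1] to numerical precision.
```

In practice the lifted matrix is identified from snapshot data (e.g. extended dynamic mode decomposition) rather than written down, which is where the data-efficiency question in the abstract arises.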

3.
IEEE Trans Cybern ; 54(2): 728-738, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38133983

ABSTRACT

This article addresses the problem of learning the objective function of linear discrete-time systems that use static output-feedback (OPFB) control by designing inverse reinforcement learning (RL) algorithms. Most existing inverse RL methods require the availability of states and state-feedback control from the expert or demonstrated system. In contrast, this article considers inverse RL in a more general case where the demonstrated system uses static OPFB control and only input-output measurements are available. We first develop a model-based inverse RL algorithm to reconstruct an input-output objective function of a demonstrated discrete-time system using its system dynamics and OPFB gain. This objective function explains the demonstrations and the OPFB gain of the demonstrated system. Then, an input-output Q-function is built for the inverse RL problem using a state-reconstruction technique. Given demonstrated inputs and outputs, a data-driven inverse Q-learning algorithm reconstructs the objective function without knowledge of the demonstrated system's dynamics or OPFB gain. This algorithm yields unbiased solutions even when exploration noises are present. Convergence properties and the nonunique-solution nature of the proposed algorithms are studied. Numerical simulation examples verify the effectiveness of the proposed methods.

4.
IEEE Trans Cybern ; 53(4): 2275-2287, 2023 Apr.
Article in English | MEDLINE | ID: mdl-34623292

ABSTRACT

This article investigates differential graphical games for linear multiagent systems with a leader on fixed communication graphs. The objective is to make each agent synchronize to the leader and, meanwhile, optimize a performance index that depends on its own control policy and those of its neighbors. To this end, a distributed adaptive Nash equilibrium solution is proposed for the differential graphical games. This solution, in contrast to existing ones, is not only Nash but also fully distributed in the sense that each agent uses only local information from itself and its immediate neighbors, without any global information about the communication graph. Moreover, the asymptotic stability and global Nash equilibrium properties of the proposed distributed adaptive Nash equilibrium solution are analyzed. As an illustrative example, the differential graphical game solution is applied to the microgrid secondary control problem to achieve fully distributed voltage synchronization with optimized performance.

5.
IEEE Trans Cybern ; 53(7): 4555-4566, 2023 Jul.
Article in English | MEDLINE | ID: mdl-36264741

ABSTRACT

This article considers autonomous systems whose behaviors seek to optimize an objective function. This goes beyond standard applications of condition-based maintenance, which seeks to detect faults or failures in nonoptimizing systems. Normal agents optimize a known, accepted objective function, whereas abnormal or misbehaving agents may optimize a renegade objective that does not conform to the accepted one. We provide a unified framework for anomaly detection and correction in optimizing autonomous systems described by differential equations using inverse reinforcement learning (RL). We first define several types of anomalies and false alarms, including noise anomalies, objective function anomalies, intention (control gain) anomalies, abnormal behaviors, noise-anomaly false alarms, and objective false alarms. We then propose model-free inverse RL algorithms to reconstruct the objective functions and intentions for given system behaviors. The inverse RL procedure for anomaly detection and correction consists of a training phase, a detection phase, and a correction phase. First, inverse RL in the training phase infers the objective function and intention of the normal-behavior system using offline stored data. Second, in the detection phase, inverse RL infers the objective function and intention of online-observed test system behaviors using online observation data; these are then compared with those of the nominal system to identify anomalies. Third, correction is executed for the anomalous system so that it learns the normal objective and intention. Simulations and experiments on a quadrotor unmanned aerial vehicle (UAV) verify the proposed methods.
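To illustrate only the comparison in the detection phase (a scalar stand-in under assumed dynamics; names, values, and the threshold are hypothetical, not the article's method): for dx/dt = a*x + b*u with input weight r, a demonstrated gain k is optimal for the state weight q = r*k^2 - 2*a*r*k/b (a rearranged Riccati equation), and an objective-function anomaly is flagged when the inferred weight deviates from the nominal one.

```python
# Infer the state weight that makes an observed gain k optimal for the
# hypothetical scalar system dx/dt = a*x + b*u with input weight r.
def inferred_state_weight(a, b, r, k):
    return r * k ** 2 - 2 * a * r * k / b

# Flag an objective-function anomaly when the inferred weight deviates
# from the nominal weight by more than a relative tolerance.
def objective_anomaly(q_nominal, q_inferred, rel_tol=0.1):
    return abs(q_inferred - q_nominal) > rel_tol * abs(q_nominal)

a, b, r, q_nominal = -1.0, 1.0, 1.0, 4.0
q_normal   = inferred_state_weight(a, b, r, 1.2360679)  # near-optimal behavior
q_renegade = inferred_state_weight(a, b, r, 3.0)        # misbehaving agent
flag_normal   = objective_anomaly(q_nominal, q_normal)    # False
flag_renegade = objective_anomaly(q_nominal, q_renegade)  # True
```

The article's procedure additionally infers intentions (control gains) and runs a correction phase; this sketch shows just the inferred-versus-nominal comparison that drives detection.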


Subjects
Learning ; Reinforcement, Psychology ; Algorithms
6.
Article in English | MEDLINE | ID: mdl-36315539

ABSTRACT

This article studies a distributed minmax strategy for multiplayer games and develops reinforcement learning (RL) algorithms to solve it. The proposed minmax strategy is distributed in the sense that it finds each player's optimal control policy without knowing all the other players' policies. Each player obtains its distributed control policy by solving a distributed algebraic Riccati equation in a multiplayer noncooperative game; this policy is found against the worst policies of all the other players. We guarantee the existence of distributed minmax solutions and study their L2 stability and asymptotic stability. Under mild conditions, the resulting minmax control policies are shown to improve the robust gain and phase margins of multiplayer systems compared with the standard linear quadratic regulator controller. Distributed minmax solutions are found using both model-based policy iteration and data-driven off-policy RL algorithms. Simulation examples verify the proposed formulation and its computational efficiency over nondistributed Nash solutions.
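A hedged sketch of the minmax idea in its simplest setting (a scalar two-player zero-sum LQ game under assumed dynamics, not the article's distributed multiplayer algorithm): the minimizing player's policy is computed against the worst-case opposing policy through a game algebraic Riccati equation. All names and values are hypothetical.

```python
import math

# Hypothetical scalar zero-sum LQ game:
#   dx/dt = a*x + b1*u + b2*w,  cost = integral of (q*x^2 + r*u^2 - g^2*w^2) dt.
# The game algebraic Riccati equation is
#   2*a*p + q - p^2*(b1^2/r - b2^2/g^2) = 0,
# giving minmax policies u = -(b1*p/r)*x and w = (b2*p/g^2)*x.
def game_riccati(a, b1, b2, q, r, g):
    s = b1 ** 2 / r - b2 ** 2 / g ** 2   # must be > 0 for a stabilizing solution
    # positive root of s*p^2 - 2*a*p - q = 0
    return (2 * a + math.sqrt(4 * a ** 2 + 4 * s * q)) / (2 * s)

a, b1, b2, q, r, g = -1.0, 1.0, 0.5, 1.0, 1.0, 2.0
p = game_riccati(a, b1, b2, q, r, g)
k_u = b1 * p / r                  # minimizing player's gain
k_w = b2 * p / g ** 2             # worst-case disturbance gain
a_cl = a - b1 * k_u + b2 * k_w    # closed loop, stable (negative) here
```

The distributed multiplayer version replaces this single equation with one such equation per player, solvable by policy iteration or off-policy RL as the abstract describes.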

7.
Article in English | MEDLINE | ID: mdl-35786561

ABSTRACT

This article proposes a data-driven inverse reinforcement learning (RL) control algorithm for nonzero-sum multiplayer games in linear continuous-time differential dynamical systems. The inverse RL problem in the games is solved by a learner reconstructing the unknown expert players' cost functions from the expert's demonstrated optimal state and control-input trajectories. The learner thus obtains the same control feedback gains and trajectories as the expert, using only data along system trajectories and without knowing the system dynamics. The article first proposes a model-based inverse RL policy iteration framework that has: 1) a policy evaluation step that reconstructs cost matrices using Lyapunov functions; 2) a state-reward weight improvement step using inverse optimal control (IOC); and 3) a policy improvement step using optimal control. Based on the model-based policy iteration algorithm, the article further develops an online data-driven off-policy inverse RL algorithm that requires no knowledge of the system dynamics or the expert's control gains. Rigorous convergence and stability analyses of the algorithms are provided. It is shown that the off-policy inverse RL algorithm guarantees unbiased solutions even when probing noises are added to satisfy the persistence of excitation (PE) condition. Finally, two different simulation examples validate the effectiveness of the proposed algorithms.

8.
IEEE Trans Cybern ; 50(3): 1240-1250, 2020 Mar.
Article in English | MEDLINE | ID: mdl-30908252

ABSTRACT

Resilient and robust distributed control protocols are designed for multiagent systems under attacks on sensors and actuators. A distributed H∞ control protocol is designed to attenuate the effects of disturbances or attacks. However, the H∞ controller alone is too conservative in the presence of attacks; therefore, it is augmented with a distributed adaptive compensator to mitigate their adverse effects. The proposed controller can make the synchronization error arbitrarily small in the presence of faulty attacks and satisfies global L2-gain performance in the presence of malicious attacks or disturbances. A significant advantage of the proposed method is that it places no restriction on the number of agents or of agents' neighbors under attacks on sensors and/or actuators, and it recovers even compromised agents under attacks on actuators. Simulation examples verify the effectiveness of the proposed method.

9.
IET Nanobiotechnol ; 12(6): 757-763, 2018 Sep.
Article in English | MEDLINE | ID: mdl-30104449

ABSTRACT

Chondroitin sulphate is a sulphated glycosaminoglycan biopolymer composed of over 100 individual sugars. Chondroitin sulphate nanoparticles (NPs) loaded with catechin were prepared by an ionic gelation method using AlCl3 and optimised for polymer and cross-linking agent concentrations, curing time, and stirring speed. Zeta potential, particle size, loading efficiency, and release efficiency over 24 h (RE24%) were evaluated. The surface morphology of the NPs was investigated by scanning electron microscopy and their thermal behaviour by differential scanning calorimetry. The antioxidant effect of the NPs was determined by their iron-ion chelating activity. The viability of mesenchymal stem cells was determined by the 3-[4,5-dimethylthiazol-2-yl]-2,5-diphenyl tetrazolium bromide assay, and the calcification of osteoblasts was studied by Alizarin red staining. The optimised NPs showed a particle size of 176 nm, a zeta potential of -20.8 mV, a loading efficiency of 93.3%, and an RE24% of 80.6%. The catechin-loaded chondroitin sulphate NPs showed 70-fold higher antioxidant activity, a 3-fold proliferation effect, and greater calcium precipitation in osteoblasts compared with free catechin.


Subjects
Aluminum/chemistry ; Chondroitin Sulfates/chemical synthesis ; Drug Carriers/chemical synthesis ; Drug Compounding/methods ; Flavonoids/administration & dosage ; Nanoparticles/chemistry ; Tea/chemistry ; Calcification, Physiologic/drug effects ; Catechin/administration & dosage ; Catechin/isolation & purification ; Catechin/pharmacokinetics ; Cell Survival/drug effects ; Cells, Cultured ; Chondroitin Sulfates/chemistry ; Cross-Linking Reagents/chemistry ; Drug Carriers/chemistry ; Drug Liberation ; Flavonoids/isolation & purification ; Flavonoids/pharmacokinetics ; Humans ; Ions ; Mesenchymal Stem Cells/drug effects ; Mesenchymal Stem Cells/physiology ; Osteoblasts/drug effects ; Osteoblasts/physiology ; Particle Size
10.
IEEE Trans Cybern ; 48(11): 3197-3207, 2018 Nov.
Article in English | MEDLINE | ID: mdl-29989978

ABSTRACT

This paper investigates the optimal robust output containment problem for general linear heterogeneous multiagent systems (MAS) with completely unknown dynamics. A model-based algorithm using offline policy iteration (PI) is first developed, where the p-copy internal model principle is utilized to address system parameter variations. This offline PI algorithm requires the nominal model of each agent, which may not be available in most real-world applications. To address this issue, a discounted performance function is introduced to express the optimal robust output containment problem as an optimal output-feedback design problem with bounded L2-gain. To solve this problem online in real time, a Bellman equation is first developed that evaluates a given control policy and finds the updated control policies simultaneously, using only the state/output information measured online. Then, using this Bellman equation, a model-free off-policy integral reinforcement learning algorithm is proposed to solve the optimal robust output containment problem of heterogeneous MAS in real time, without requiring any knowledge of the system dynamics. Simulation results are provided to verify the effectiveness of the proposed method.

11.
IEEE Trans Cybern ; 47(8): 2099-2109, 2017 Aug.
Article in English | MEDLINE | ID: mdl-28060716

ABSTRACT

This paper studies the output containment control of linear heterogeneous multiagent systems, where the system dynamics and even the state dimensions can differ across agents. Since the states can have different dimensions, standard results from state containment control do not apply. The control objective is therefore to guarantee convergence of each follower's output to the dynamic convex hull spanned by the leaders' outputs, which can be achieved by making certain output containment errors go to zero asymptotically. Based on this formulation, two control protocols, namely full-state feedback and static output-feedback, are designed based on internal model principles. Sufficient local conditions for the existence of the proposed control protocols are developed in terms of stabilizing the local followers' dynamics and satisfying a certain H∞ criterion. Unified design procedures for the two proposed control protocols are presented by formulating and solving certain local state-feedback and static output-feedback problems, respectively. Numerical simulations are given to validate the proposed control protocols.

12.
Caspian J Intern Med ; 5(4): 227-31, 2014.
Article in English | MEDLINE | ID: mdl-25489435

ABSTRACT

BACKGROUND: The antibiotic resistance of nosocomial organisms is rapidly increasing. The purpose of this study was to determine the frequency of bacterial agents isolated from patients with nosocomial infections. METHODS: This study was performed in different wards of the teaching hospitals of Mazandaran University of Medical Sciences (northern Iran). The study population consisted of patients with symptoms of nosocomial infection admitted to these hospitals in 2012. Patient data (including age, sex, type of infection, type of isolated organisms, and their antibiotic susceptibility) were collected and analyzed. RESULTS: The total number of hospitalizations was 57122, and the number of nosocomial infections was 592. The overall prevalence of nosocomial infection was 1.03%, occurring mostly in the burn unit and intensive care unit. The most common nosocomial infection was wound infection (44.6%), and the most common organisms were Pseudomonas aeruginosa and Acinetobacter. CONCLUSION: Given the increasing number of nosocomial infections in this region, especially infections with Pseudomonas aeruginosa, precise reporting and improved infection-control procedures in hospitals are necessary.

13.
Caspian J Intern Med ; 5(2): 127-9, 2014.
Article in English | MEDLINE | ID: mdl-24778791

ABSTRACT

BACKGROUND: Brucellosis can involve almost any organ system and may present with a broad spectrum of clinical presentations. In this study, we present a case of deep vein thrombosis due to human brucellosis. CASE PRESENTATION: A 15-year-old boy presented with acute pain and swelling in his left thigh in June 2011, complaining of fever, chills, and lower extremity pain such that he could barely walk. In the family history, his older brother had had brucellosis 3 weeks earlier and had received appropriate medication. The tube standard agglutination test (Wright test) and 2ME test were positive (at titers of 1/1280 and 1/640, respectively). Peripheral venous Doppler ultrasound of the left lower extremity showed that the common iliac, external iliac, femoral, superficial and deep femoral, and popliteal veins were enlarged and contained echogenic clot. He was treated with rifampicin 600 mg once a day and doxycycline 100 mg twice a day (both for three months) plus amikacin 500 mg twice a day (for 2 weeks), accompanied by an anticoagulant. Ten days after the onset of this treatment, the thrombophlebitis was cured. Follow-up of the patient showed no abnormality approximately one year later. CONCLUSION: In brucellosis-endemic areas, clinicians who encounter patients with deep vein thrombosis and a recent history of febrile illness should consider the likelihood of brucellosis.

14.
Caspian J Intern Med ; 3(1): 372-6, 2012.
Article in English | MEDLINE | ID: mdl-26557289

ABSTRACT

BACKGROUND: Allergic diseases, including asthma, allergic rhinitis (AR), and eczema, are common chronic diseases in children. The purpose of this study was to determine the prevalence of asthma, AR, and eczema in Sari, Iran. METHODS: This study was carried out in elementary schools selected by cluster sampling from February 2010 to July 2010 in Sari, North of Iran. A questionnaire was administered according to the International Study of Asthma and Allergies in Childhood (ISAAC) protocol. Asthma, AR, eczema, and their combinations were recorded. RESULTS: Of the 1818 cases, 646 (35%) subjects had an allergic disorder; 223 (12%) had asthma, 318 (17%) had AR, and 105 (6%) had eczema. The prevalence of allergic disorders in boys (65%) was higher than in girls (40%) (p<0.05). CONCLUSION: The results show that around one-third of the elementary school children have allergic disorders, with a higher prevalence in males than in females.

15.
Caspian J Intern Med ; 3(1): 377-81, 2012.
Article in English | MEDLINE | ID: mdl-26557290

ABSTRACT

BACKGROUND: The clinical manifestations and outcome of influenza infection differ among patients worldwide. The purpose of this study was to assess the clinical manifestations of patients with confirmed or suspected novel H1N1 flu infection in Sari, North of Iran. METHODS: From September 2009 to January 2010, patient data were collected by retrospective chart review of medical records. Laboratory confirmation consisted of a positive RT-PCR (reverse transcriptase-polymerase chain reaction) assay from a nasal or pharyngeal swab sample. RESULTS: Nearly 80% of confirmed patients were in the 15-45-year age group, and approximately 14.6% of female cases were pregnant. There was no significant difference in the clinical and laboratory characteristics of patients with confirmed H1N1 virus infection compared with the total cases with influenza-like illness (ILI). Thirty-nine (95.1%) of the confirmed patients had a combination of fever plus sore throat or cough. Relative lymphopenia was reported in 36.6%. Pneumonia was the most common complication; acute pericarditis evolved in one case, and aseptic meningitis was reported in another. CONCLUSION: Precise collection of information on the clinical manifestations, risk factors, and other characteristics of flu can help with early detection of infection, timely treatment of patients, and proper preventive measures.
