2.
Front Robot AI ; 8: 733104, 2021.
Article in English | MEDLINE | ID: mdl-34977161

ABSTRACT

Reinforcement learning has been established over the past decade as an effective tool for finding optimal control policies for dynamical systems, with recent focus on approaches that guarantee safety during the learning and/or execution phases. In general, safety guarantees are critical in reinforcement learning when the system is safety-critical and/or task restarts are not practically feasible. In optimal control theory, safety requirements are often expressed as state and/or control constraints. In recent years, reinforcement learning approaches that rely on persistent excitation have been combined with a barrier transformation to learn optimal control policies under state constraints. To soften the excitation requirements, model-based reinforcement learning methods that rely on exact model knowledge have also been integrated with the barrier transformation framework. The objective of this paper is to develop a safe reinforcement learning method for deterministic nonlinear systems with parametric uncertainties in the model that learns approximate constrained optimal policies without relying on stringent excitation conditions. To that end, a model-based reinforcement learning technique that utilizes a novel filtered concurrent learning method, along with a barrier transformation, is developed to simultaneously learn the unknown model parameters and approximate optimal state-constrained control policies for safety-critical systems.
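
The barrier transformation referenced above can be made concrete with a small sketch. The Python snippet below implements the logarithmic barrier map commonly used for this purpose, which sends a constrained scalar state x in (a, A), with a < 0 < A, to an unconstrained coordinate; the specific bounds and test points are illustrative assumptions, not values from the paper.

    import numpy as np

    # Hypothetical bounds: the constrained state must stay in (a, A), a < 0 < A.
    def barrier(x, a=-2.0, A=3.0):
        # Maps (a, A) onto the whole real line, with barrier(0) == 0.
        return np.log((A * (a - x)) / (a * (A - x)))

    def barrier_inv(s, a=-2.0, A=3.0):
        # Inverse map: recovers the constrained state from the transformed one.
        return a * A * (np.exp(s) - 1.0) / (a * np.exp(s) - A)

    # Round trip: a policy learned in the transformed coordinate keeps the
    # original state inside (a, A) by construction.
    x = np.linspace(-1.9, 2.9, 7)
    assert np.allclose(barrier_inv(barrier(x)), x)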

3.
IEEE Trans Neural Netw Learn Syst ; 30(6): 1716-1730, 2019 Jun.
Article in English | MEDLINE | ID: mdl-30369450

ABSTRACT

A function approximation method is developed that aims to approximate a function in a small neighborhood of a state that travels within a compact set. The method provides a novel strategy for the efficient approximation of nonlinear functions in real-time simulations and experiments. The development is based on the theory of universal reproducing kernel Hilbert spaces over n-dimensional Euclidean space. Several theorems are introduced that support the development of this state following (StaF) method. In particular, it is shown that there is a bound on the number of kernel functions required to maintain an accurate function approximation as a state moves through a compact set. In addition, a weight update law based on gradient descent is introduced, for which arbitrarily close accuracy can be achieved provided the update law is iterated at a sufficient frequency, as detailed in Theorem 4. An experience-based approximation method is also presented that uses samples of the ideal-weight estimates to generate a global approximation of a function by interpolating the weight-estimate samples with radial basis functions. To illustrate the StaF method, it is applied to derivative estimation, function approximation, and an adaptive dynamic programming problem, where it is demonstrated that stability is maintained with a reduced number of basis functions.
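
As an illustration of the StaF idea, the sketch below places a few Gaussian kernels whose centers follow the current state and updates their weights by gradient descent on the instantaneous approximation error. The target function, trajectory, kernel width, offsets, and step size are all assumed for the example and are not taken from the paper.

    import numpy as np

    def staf_kernels(x, centers):
        # Gaussian RBFs; the width 0.5 is an assumed tuning parameter.
        return np.exp(-(x - centers) ** 2 / 0.5)

    f = np.sin                            # assumed target function
    offsets = np.array([-0.3, 0.0, 0.3])  # kernel centers ride with the state
    w = np.zeros(3)                       # StaF weight estimates

    for t in np.linspace(0.0, 6.0, 600):  # state sweeping a compact set
        x = np.cos(t)                     # assumed state trajectory
        phi = staf_kernels(x, x + offsets)
        err = phi @ w - f(x)              # instantaneous approximation error
        w -= 0.5 * err * phi              # gradient-descent weight update

    print("final local error:", abs(staf_kernels(x, x + offsets) @ w - f(x)))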

4.
IEEE Trans Neural Netw Learn Syst ; 29(6): 2154-2166, 2018 Jun.
Article in English | MEDLINE | ID: mdl-29771668

ABSTRACT

An infinite-horizon optimal regulation problem for a control-affine deterministic system is solved online using a local state following (StaF) kernel and a regional model-based reinforcement learning (R-MBRL) method to approximate the value function. Unlike traditional methods such as R-MBRL, which aim to approximate the value function over a large compact set, the StaF kernel approach approximates the value function in a local neighborhood of the state as it travels within a compact set. In this paper, the value function is approximated using a state-dependent convex combination of the StaF-based and R-MBRL-based approximations. As the state enters a neighborhood containing the origin, the value function transitions from being approximated by the StaF approach to the R-MBRL approach. Semiglobal uniformly ultimately bounded (SGUUB) convergence of the system states to the origin is established using a Lyapunov-based analysis. Simulation results are provided for two-, three-, six-, and ten-state dynamical systems to demonstrate the scalability and performance of the developed method.
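
A minimal sketch of the state-dependent convex combination described above follows; the two stand-in value estimates and the Gaussian blending weight (which tends to 1 near the origin, handing the approximation to the regional estimate) are illustrative placeholders, not the paper's approximators.

    import numpy as np

    def V_regional(x):
        return x @ x                      # stand-in regional (R-MBRL) estimate

    def V_local(x):
        return 1.1 * (x @ x)              # stand-in local (StaF) estimate

    def blend(x, r=1.0):
        # Smooth weight in (0, 1]: -> 1 as x -> 0, -> 0 for ||x|| >> r.
        return np.exp(-(x @ x) / r ** 2)

    def V(x):
        lam = blend(x)
        return lam * V_regional(x) + (1.0 - lam) * V_local(x)

    for x in (np.array([0.1, 0.0]), np.array([2.0, 2.0])):
        print(x, V(x))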

5.
IEEE Trans Neural Netw Learn Syst ; 28(3): 753-758, 2017 Mar.
Article in English | MEDLINE | ID: mdl-26863674

ABSTRACT

This brief paper provides an approximate online adaptive solution to the infinite-horizon optimal tracking problem for control-affine continuous-time nonlinear systems with unknown drift dynamics. To relax the persistence of excitation condition, model-based reinforcement learning is implemented using a concurrent-learning-based system identifier to simulate experience by evaluating the Bellman error over unexplored areas of the state space. Tracking of the desired trajectory and convergence of the developed policy to a neighborhood of the optimal policy are established via Lyapunov-based stability analysis. Simulation results demonstrate the effectiveness of the developed technique.
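
The role of simulated experience can be sketched as follows: once an identifier supplies a model estimate, the Bellman error can be evaluated, and the critic updated, at sample states the trajectory never visits. The scalar dynamics, cost, quadratic value ansatz, and gains below are assumed for illustration; this is a toy of the idea, not the paper's algorithm.

    import numpy as np

    # Toy problem: xdot = theta*x + u, cost x^2 + u^2, critic V = w*x^2.
    theta_hat = 0.9                       # identifier's estimate of the drift
    w = 3.0                               # critic weight, stabilizing initial guess
    samples = np.linspace(-2.0, 2.0, 9)   # off-trajectory Bellman-error samples

    for _ in range(5000):
        for x in samples:
            u = -w * x                                        # policy from critic
            delta = 2*w*x*(theta_hat*x + u) + x**2 + u**2     # Bellman error at x
            grad = 2*x*(theta_hat*x + u)                      # d(delta)/dw
            w -= 1e-3 * delta * grad / (1.0 + grad**2)        # normalized step

    # For this problem the HJB solution is V = w*x^2 with
    # w = theta + sqrt(theta^2 + 1), about 2.245 for theta_hat = 0.9.
    print("critic weight:", w)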

6.
IEEE Trans Cybern ; 46(7): 1679-1690, 2016 Jul.
Article in English | MEDLINE | ID: mdl-26241989

ABSTRACT

An upper motor neuron lesion (UMNL) can be caused by various neurological disorders or trauma and leads to disabilities. Neuromuscular electrical stimulation (NMES) is a technique widely used for rehabilitation and restoration of motor function in people suffering from UMNL. Typically, stability analysis for closed-loop NMES ignores the modulated implementation of NMES; however, electrical stimulation must be applied to muscle as a modulated series of pulses. In this paper, a muscle activation model with an amplitude-modulated control input is developed to capture the discontinuous nature of muscle activation, and an identification-based closed-loop NMES controller is designed and analyzed for the uncertain amplitude-modulated muscle activation model. Semiglobal uniformly ultimately bounded tracking is guaranteed. The stability of the closed-loop system is analyzed with Lyapunov-based methods, and a pulse-frequency-related gain condition is obtained. Experiments performed with five able-bodied subjects demonstrate the interplay between the control gains and the pulse frequency; the results indicate that control gains should be increased to maintain stability if the stimulation pulse frequency is decreased to mitigate muscle fatigue. For the first time, this paper brings together an analysis of the controller and the modulation scheme.
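
To make the modulation issue concrete, the sketch below shows one simple amplitude-modulation scheme: a continuous controller output is sampled at the pulse frequency and delivered as rectangular pulses, so the muscle receives zero input between pulses. The frequencies, pulse width, and control signal are illustrative assumptions, not the paper's experimental values.

    import numpy as np

    def modulate(u, t, freq=30.0, width=4e-4):
        # u: continuous control signal (callable); freq: pulses per second;
        # width: pulse width in seconds. All values assumed for illustration.
        period = 1.0 / freq
        t_pulse = np.floor(t / period) * period   # start of the current pulse
        inside = (t - t_pulse) < width            # within the pulse window?
        return u(t_pulse) * inside                # amplitude held during the pulse

    u = lambda t: 5.0 + 2.0 * np.sin(2.0 * np.pi * t)  # example controller output
    t = np.linspace(0.0, 0.2, 20000)
    pulses = modulate(u, t, freq=30.0)

    # Lowering freq lengthens the zero-input gaps between pulses, which is why
    # the gain condition demands larger gains at lower pulse frequencies.
    print("duty cycle:", np.mean(pulses > 0.0))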


Subject(s)
Electric Stimulation Therapy , Electric Stimulation , Motor Neuron Disease , Motor Neurons , Algorithms , Electric Stimulation Therapy/standards , Extremities/physiopathology , Humans , Motor Neuron Disease/physiopathology , Motor Neuron Disease/therapy , Motor Neurons/physiology , Muscle Fatigue
7.
IEEE Trans Neural Netw Learn Syst ; 26(8): 1645-1658, 2015 Aug.
Article in English | MEDLINE | ID: mdl-25312943

ABSTRACT

An approximate online equilibrium solution is developed for an N-player nonzero-sum game subject to continuous-time nonlinear unknown dynamics and an infinite-horizon quadratic cost. A novel actor-critic-identifier structure is used, wherein a robust dynamic neural network asymptotically identifies the uncertain system with additive disturbances, and sets of critic and actor neural networks (NNs) approximate the value functions and equilibrium policies, respectively. The weight update laws for the actor NNs are generated using a gradient-descent method, and those for the critic NNs are generated by least-squares regression; both are based on a modified Bellman error that is independent of the system dynamics. A Lyapunov-based stability analysis shows that uniformly ultimately bounded tracking is achieved, and a convergence analysis demonstrates that the approximate control policies converge to a neighborhood of the optimal solutions. The actor, critic, and identifier structures are implemented simultaneously and in real time. Simulations on two- and three-player games illustrate the performance of the developed method.
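
A minimal sketch of the actor-critic pattern follows for a two-player scalar linear-quadratic game: each critic weight is fit by least-squares regression on Bellman-error samples, and each actor is nudged toward its critic. The dynamics, costs, and the alternating batch update are illustrative simplifications; the paper's implementation runs all structures simultaneously in continuous time.

    import numpy as np

    # Toy game: xdot = x + u1 + u2, player costs x^2 + ui^2,
    # quadratic value guesses Vi = wi*x^2, so policies ui = -wi*x.
    xs = np.linspace(-2.0, 2.0, 21)      # Bellman-error sample states
    wa = np.array([2.0, 2.0])            # actor weights (ui = -wa[i]*x)
    wc = np.zeros(2)                     # critic weights (Vi = wc[i]*x^2)

    for _ in range(200):
        drift = 1.0 - wa.sum()           # closed-loop xdot = drift * x
        for i in range(2):
            # Bellman error: x^2*(1 + wa[i]^2) + wc[i]*2*x^2*drift = 0,
            # linear in wc[i], so fit it by least squares over the samples.
            A = (2.0 * xs**2 * drift)[:, None]
            b = -(xs**2) * (1.0 + wa[i] ** 2)
            wc[i] = np.linalg.lstsq(A, b, rcond=None)[0][0]
        wa += 0.1 * (wc - wa)            # gradient-style actor update

    print("Nash weights:", wa)           # symmetric solution here is w1 = w2 = 1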


Subject(s)
Neural Networks, Computer , Nonlinear Dynamics , Uncertainty , Algorithms , Least-Squares Analysis , Signal Processing, Computer-Assisted