ABSTRACT
This article studies optimal tracking in switched systems with a fixed mode sequence and free final time. In the optimal control problem formulation, the switching times and the final time are treated as parameters. Approximate dynamic programming (ADP) is used to solve the optimal control problem. The ADP solution uses an inner loop to converge to the optimal policy at each time step. To reduce the computational burden of this solution, a new method is introduced that learns the optimal solution using evolving suboptimal policies rather than the optimal policies. The effectiveness of the proposed solutions is evaluated through numerical simulations.
Subjects
Learning, Neural Networks (Computer)
ABSTRACT
Two approximate solutions for optimal control of switched systems with autonomous subsystems and continuous-time dynamics are presented. The first solution formulates a policy iteration (PI) algorithm for switched systems using recursive least squares. To reduce the computational burden imposed by the PI algorithm, a second solution, called single-loop PI, is presented. Online and concurrent training algorithms are discussed for implementing each solution. Finally, the effectiveness of the presented algorithms is evaluated through numerical simulations.
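
The abstract names recursive least squares (RLS) as the tool used inside policy iteration but gives no detail. As a rough illustration only, the sketch below shows a generic RLS update fitting the weights of a linear-in-parameters value function approximation. The quadratic basis, the sample states, the known "true" weights, and the names `rls_update` and `quadratic_features` are all assumptions made for this example; none of them comes from the article, and the article's actual algorithm may differ.

```python
def rls_update(w, P, phi, y, forgetting=1.0):
    """One standard RLS step for the linear model y ~ phi . w.

    w: current weight estimate (list), P: symmetric covariance-like matrix
    (list of lists), phi: feature vector, y: scalar target.
    """
    n = len(w)
    # P @ phi; because P stays symmetric this also equals (phi^T P)^T.
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = forgetting + sum(phi[i] * P_phi[i] for i in range(n))
    gain = [v / denom for v in P_phi]
    # Prediction error of the current weights on this sample.
    error = y - sum(phi[i] * w[i] for i in range(n))
    w_new = [w[i] + gain[i] * error for i in range(n)]
    P_new = [[(P[i][j] - gain[i] * P_phi[j]) / forgetting for j in range(n)]
             for i in range(n)]
    return w_new, P_new


def quadratic_features(x1, x2):
    """Hypothetical basis for a quadratic value function of a 2-D state."""
    return [x1 * x1, x1 * x2, x2 * x2]


# Synthetic data (an assumption for illustration): targets are generated
# from known weights so that the estimate can be checked against them.
true_w = [2.0, -1.0, 0.5]
samples = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0),
           (1.0, -1.0), (2.0, 1.0), (-1.0, 2.0)]

w_est = [0.0, 0.0, 0.0]
# A large initial P expresses low confidence in the initial weights.
P = [[1e6 if i == j else 0.0 for j in range(3)] for i in range(3)]
for x1, x2 in samples:
    phi = quadratic_features(x1, x2)
    target = sum(a * b for a, b in zip(true_w, phi))
    w_est, P = rls_update(w_est, P, phi, target)
```

In a policy evaluation setting the targets would presumably come from observed or simulated costs rather than from known weights; the synthetic targets here only make the sketch self-checking.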