Search | BVS Regional Portal

Model-Based and Model-Free Replay Mechanisms for Reinforcement Learning in Neurorobotics.

Massi, Elisa; Barthélemy, Jeanne; Mailly, Juliane; Dromnelle, Rémi; Canitrot, Julien; Poniatowski, Esther; Girard, Benoît; Khamassi, Mehdi.

Front Neurorobot ; 16: 864380, 2022.

Article in English | MEDLINE | ID: mdl-35812782

ABSTRACT

Experience replay is widely used in AI to bootstrap reinforcement learning (RL) by enabling an agent to remember and reuse past experiences. Classical techniques include shuffled, reverse-ordered, and prioritized memory buffers, which have different properties and advantages depending on the nature of the data and problem. Interestingly, recent computational neuroscience work has shown that these techniques are relevant for modeling hippocampal reactivations recorded during rodent navigation. Nevertheless, the brain mechanisms for orchestrating hippocampal replay are still unclear. In this paper, we present recent neurorobotics research aiming to endow a navigating robot with a neuro-inspired RL architecture (including different learning strategies, such as model-based (MB) and model-free (MF), and different replay techniques). We illustrate through a series of numerical simulations how the specificities of robotic experimentation (e.g., autonomous state decomposition by the robot, noisy perception, state transition uncertainty, non-stationarity) can shed new light on which replay techniques turn out to be more efficient in different situations. Finally, we close the loop by raising new hypotheses for neuroscience from such robotic models of hippocampal replay.
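
The three buffer orderings named in the abstract can be illustrated with a minimal sketch. This is not the authors' architecture: it assumes a tabular Q-learning agent, a hypothetical `(state, action, reward, next_state)` transition format, a four-state chain with a single action, and priorities computed once from the absolute temporal-difference error before replay starts (a simplification of prioritized replay).

```python
import random
from typing import List, Sequence, Tuple

# Hypothetical transition format: (state, action, reward, next_state).
Transition = Tuple[int, int, float, int]
QTable = List[List[float]]


def shuffled_order(buffer: Sequence[Transition], rng: random.Random) -> List[Transition]:
    """Return the stored transitions in random order."""
    items = list(buffer)
    rng.shuffle(items)
    return items


def reversed_order(buffer: Sequence[Transition]) -> List[Transition]:
    """Return the stored transitions from most recent to oldest."""
    return list(reversed(buffer))


def prioritized_order(buffer: Sequence[Transition], q: QTable, gamma: float) -> List[Transition]:
    """Sort transitions by absolute TD error, largest first.

    Assumption: priorities are computed once, before replay, and not updated.
    """
    def abs_td_error(t: Transition) -> float:
        s, a, r, s2 = t
        return abs(r + gamma * max(q[s2]) - q[s][a])

    return sorted(buffer, key=abs_td_error, reverse=True)


def replay(order: Sequence[Transition], q: QTable, alpha: float, gamma: float) -> None:
    """Apply one Q-learning update per replayed transition, in the given order."""
    for s, a, r, s2 in order:
        q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])


def fresh_q() -> QTable:
    """Four states, one action, all values initialised to zero."""
    return [[0.0] for _ in range(4)]


# One episode along the chain 0 -> 1 -> 2 -> 3, rewarded only on the last step.
buffer: List[Transition] = [(0, 0, 0.0, 1), (1, 0, 0.0, 2), (2, 0, 1.0, 3)]
ALPHA, GAMMA = 1.0, 0.9

q_fwd = fresh_q()
replay(buffer, q_fwd, ALPHA, GAMMA)  # stored (forward) order

q_rev = fresh_q()
replay(reversed_order(buffer), q_rev, ALPHA, GAMMA)

q_pri = fresh_q()
replay(prioritized_order(buffer, q_pri, GAMMA), q_pri, ALPHA, GAMMA)

q_shuf = fresh_q()
replay(shuffled_order(buffer, random.Random(0)), q_shuf, ALPHA, GAMMA)
```

In this toy chain, a single pass in stored order only updates the rewarded transition, whereas a single reverse-ordered pass propagates the reward back to the first state. Which ordering is preferable under noisy perception or non-stationarity is the question the paper addresses; this sketch does not attempt to answer it.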

Combining Evolutionary and Adaptive Control Strategies for Quadruped Robotic Locomotion.

Massi, Elisa; Vannucci, Lorenzo; Albanese, Ugo; Capolei, Marie Claire; Vandesompele, Alexander; Urbain, Gabriel; Sabatini, Angelo Maria; Dambre, Joni; Laschi, Cecilia; Tolu, Silvia; Falotico, Egidio.

Front Neurorobot ; 13: 71, 2019.

Article in English | MEDLINE | ID: mdl-31555118

ABSTRACT

In traditional robotics, model-based controllers are usually needed in order to bring a robotic plant to the next desired state, but they present critical issues when the dimensionality of the control problem increases and disturbances from the external environment affect the system behavior, in particular during locomotion tasks. It is generally accepted that the motion control of quadruped animals is performed by neural circuits located in the spinal cord that act as a Central Pattern Generator and can generate appropriate locomotion patterns. This is thought to be the result of evolutionary processes that have optimized this network. On top of this, fine motor control is learned during the lifetime of the animal thanks to the plastic connections of the cerebellum that provide descending corrective inputs. This research aims at understanding and identifying the possible advantages of using learning during an evolution-inspired optimization for finding the best locomotion patterns in a robotic locomotion task. Accordingly, we propose a comparative study between two bio-inspired control architectures for quadruped legged robots where learning takes place either during the evolutionary search or only afterwards. The evolutionary process is carried out in a simulated environment, on a quadruped legged robot. To verify the possibility of overcoming the reality gap, the performance of both systems has been analyzed by changing the robot dynamics and its interaction with the external environment. Results show better performance metrics for the robotic agent whose locomotion method has been discovered by applying the adaptive module during the evolutionary exploration of the locomotion trajectories. Even when the motion dynamics and the interaction with the environment are altered, the locomotion patterns found on the learning robotic system are more stable, both in the joint and in the task space.
