Search | BVS Regional Portal

Learning to express reward prediction error-like dopaminergic activity requires plastic representations of time.

Cone, Ian; Clopath, Claudia; Shouval, Harel Z.

Nat Commun ; 15(1): 5856, 2024 Jul 12.

Article in English | MEDLINE | ID: mdl-38997276

ABSTRACT

The dominant theoretical framework to account for reinforcement learning in the brain is temporal difference (TD) learning, whereby certain units signal reward prediction errors (RPE). The TD algorithm has been traditionally mapped onto the dopaminergic system, as firing properties of dopamine neurons can resemble RPEs. However, certain predictions of TD learning are inconsistent with experimental results, and previous implementations of the algorithm have made unscalable assumptions regarding stimulus-specific fixed temporal bases. We propose an alternate framework to describe dopamine signaling in the brain, FLEX (Flexibly Learned Errors in Expected Reward). In FLEX, dopamine release is similar, but not identical, to RPE, leading to predictions that contrast with those of TD. While FLEX itself is a general theoretical framework, we describe a specific, biophysically plausible implementation, the results of which are consistent with a preponderance of both existing and reanalyzed experimental data.

Subject(s)

Algorithms; Dopamine; Dopaminergic Neurons; Reward; Dopaminergic Neurons/physiology; Dopaminergic Neurons/metabolism; Dopamine/metabolism; Animals; Learning/physiology; Models, Neurological; Humans; Reinforcement, Psychology; Brain/physiology; Brain/metabolism; Neuronal Plasticity/physiology; Time Factors

Latent representations in hippocampal network model co-evolve with behavioral exploration of task structure.

Cone, Ian; Clopath, Claudia.

Nat Commun ; 15(1): 687, 2024 Jan 23.

Article in English | MEDLINE | ID: mdl-38263408

ABSTRACT

To successfully learn real-life behavioral tasks, animals must pair actions or decisions to the task's complex structure, which can depend on abstract combinations of sensory stimuli and internal logic. The hippocampus is known to develop representations of this complex structure, forming a so-called "cognitive map". However, the precise biophysical mechanisms driving the emergence of task-relevant maps at the population level remain unclear. We propose a model in which plateau-based learning at the single cell level, combined with reinforcement learning in an agent, leads to latent representational structures codependently evolving with behavior in a task-specific manner. In agreement with recent experimental data, we show that the model successfully develops latent structures essential for task-solving (cue-dependent "splitters") while excluding irrelevant ones. Finally, our model makes testable predictions concerning the co-dependent interactions between split representations and split behavioral policy during their evolution.

Subject(s)

Hippocampus; Learning; Animals; Biophysics; Policy; Reinforcement, Psychology

Learning to Express Reward Prediction Error-like Dopaminergic Activity Requires Plastic Representations of Time.

Cone, Ian; Clopath, Claudia; Shouval, Harel Z.

Res Sq ; 2023 Sep 19.

Article in English | MEDLINE | ID: mdl-37790466

ABSTRACT

The dominant theoretical framework to account for reinforcement learning in the brain is temporal difference (TD) reinforcement learning. The TD framework predicts that some neuronal elements should represent the reward prediction error (RPE), which means they signal the difference between the expected future rewards and the actual rewards. The prominence of the TD theory arises from the observation that firing properties of dopaminergic neurons in the ventral tegmental area appear similar to those of RPE model-neurons in TD learning. Previous implementations of TD learning assume a fixed temporal basis for each stimulus that might eventually predict a reward. Here we show that such a fixed temporal basis is implausible and that certain predictions of TD learning are inconsistent with experiments. We propose instead an alternative theoretical framework, coined FLEX (Flexibly Learned Errors in Expected Reward). In FLEX, feature-specific representations of time are learned, allowing for neural representations of stimuli to adjust their timing and relation to rewards in an online manner. In FLEX, dopamine acts as an instructive signal which helps build temporal models of the environment. FLEX is a general theoretical framework that has many possible biophysical implementations. In order to show that FLEX is a feasible approach, we present a specific biophysically plausible model which implements the principles of FLEX. We show that this implementation can account for various reinforcement learning paradigms, and that its results and predictions are consistent with a preponderance of both existing and reanalyzed experimental data.
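For orientation, the conventional TD account that this abstract argues against can be sketched in a few lines. The following is a minimal illustration, not the authors' model and not FLEX: it assumes a textbook TD(0) learner whose fixed temporal basis is a tapped delay line (one value weight per time step after the cue). The function name, trial layout, and all parameter values are hypothetical choices made for the example.

```python
import numpy as np

def simulate_td_fixed_basis(n_trials=500, n_steps=20, cue_t=5, reward_t=15,
                            alpha=0.3, gamma=1.0):
    """Return the TD error (RPE) at every time step of every trial.

    Assumption: the cue starts a tapped delay line, i.e. one learned value
    weight per time step from cue onset onward. Time steps before the cue
    carry no value, because the cue itself is treated as unpredictable.
    """
    w = np.zeros(n_steps)                 # one value weight per time step
    rpe = np.zeros((n_trials, n_steps))

    def value(t):
        # Value exists only while the cue-triggered basis is active.
        return w[t] if cue_t <= t < n_steps else 0.0

    for trial in range(n_trials):
        for t in range(n_steps):
            r = 1.0 if t == reward_t else 0.0
            # TD error: received reward plus predicted future value,
            # minus the value that was predicted at this step.
            delta = r + gamma * value(t + 1) - value(t)
            if t >= cue_t:
                w[t] += alpha * delta     # update only post-cue weights
            rpe[trial, t] = delta
    return rpe

rpe = simulate_td_fixed_basis()
```

In this sketch the error on the first trial sits at the reward step (`rpe[0, 15]`), and after training it has moved to the transition into the cue (`rpe[-1, 4]`), which is the classic RPE-like signature. The delay line needs a separate set of weights for every stimulus that might predict a reward, which is the kind of fixed temporal basis the abstract calls implausible.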

Correction: Learning precise spatiotemporal sequences via biophysically realistic learning rules in a modular, spiking network.

Cone, Ian; Shouval, Harel Z.

Elife ; 12, 2023 Mar 13.

Article in English | MEDLINE | ID: mdl-36912878

Learning precise spatiotemporal sequences via biophysically realistic learning rules in a modular, spiking network.

Cone, Ian; Shouval, Harel Z.

Elife ; 10, 2021 Mar 18.

Article in English | MEDLINE | ID: mdl-33734085

ABSTRACT

Multiple brain regions are able to learn and express temporal sequences, and this functionality is an essential component of learning and memory. We propose a substrate for such representations via a network model that learns and recalls discrete sequences of variable order and duration. The model consists of a network of spiking neurons placed in a modular, microcolumn-based architecture. Learning is performed via a biophysically realistic learning rule that depends on synaptic 'eligibility traces'. Before training, the network contains no memory of any particular sequence. After training, presentation of only the first element in that sequence is sufficient for the network to recall an entire learned representation of the sequence. An extended version of the model also demonstrates the ability to successfully learn and recall non-Markovian sequences. This model provides a possible framework for biologically plausible sequence learning and memory, in agreement with recent experimental results.

Subject(s)

Action Potentials/physiology; Learning/physiology; Models, Neurological; Neural Networks, Computer; Neurons/physiology; Biophysical Phenomena; Spatio-Temporal Analysis

Behavioral Time Scale Plasticity of Place Fields: Mathematical Analysis.

Cone, Ian; Shouval, Harel Z.

Front Comput Neurosci ; 15: 640235, 2021.

Article in English | MEDLINE | ID: mdl-33732128

ABSTRACT

Traditional synaptic plasticity experiments and models depend on tight temporal correlations between pre- and postsynaptic activity. These tight temporal correlations, on the order of tens of milliseconds, are incompatible with significantly longer behavioral time scales, and as such might not be able to account for plasticity induced by behavior. Indeed, recent findings in hippocampus suggest that rapid, bidirectional synaptic plasticity which modifies place fields in CA1 operates at behavioral time scales. These experimental results suggest that presynaptic activity generates synaptic eligibility traces both for potentiation and depression, which last on the order of seconds. These traces can be converted to changes in synaptic efficacies by the activation of an instructive signal that depends on naturally occurring or experimentally induced plateau potentials. We have developed a simple mathematical model that is consistent with these observations. This model can be fully analyzed to find the fixed points of induced place fields and how these fixed points depend on system parameters such as the size and shape of presynaptic place fields, the animal's velocity during induction, and the parameters of the plasticity rule. We also make predictions about the convergence time to these fixed points, both for induced and pre-existing place fields.
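The mechanism described in this abstract (seconds-long presynaptic eligibility traces for potentiation and depression, converted into weight changes only while an instructive plateau signal is active) can be illustrated with a short simulation. This is a sketch under stated assumptions rather than the paper's equations: the first-order trace dynamics, the soft-bounded weight update, the function name `run_trace_rule`, and every time constant and learning rate below are hypothetical.

```python
import numpy as np

def run_trace_rule(pre, plateau, w0, dt=0.01, tau_p=1.5, tau_d=0.8,
                   eta_p=1.0, eta_d=0.5, w_max=1.0):
    """Integrate one synapse with Euler steps and return its final weight.

    pre     : presynaptic activity over time (array)
    plateau : instructive signal over time (array, 0 or 1)
    Assumed form: both traces are low-pass filters of presynaptic activity,
    and the weight changes only while the plateau signal is nonzero.
    """
    t_p = 0.0   # potentiation eligibility trace
    t_d = 0.0   # depression eligibility trace
    w = w0
    for x, p in zip(pre, plateau):
        t_p += dt * (-t_p + x) / tau_p
        t_d += dt * (-t_d + x) / tau_d
        # Potentiation pulls w toward w_max, depression pulls it toward 0.
        w += dt * p * (eta_p * t_p * (w_max - w) - eta_d * t_d * w)
    return w

dt = 0.01
time = np.arange(0.0, 6.0, dt)
pre = ((time >= 1.0) & (time < 1.5)).astype(float)       # presynaptic burst
plateau = ((time >= 2.5) & (time < 2.8)).astype(float)   # plateau 1 s later
no_plateau = np.zeros_like(time)

w_weak = run_trace_rule(pre, plateau, w0=0.2)     # weak synapse
w_strong = run_trace_rule(pre, plateau, w0=0.95)  # strong synapse
w_silent = run_trace_rule(pre, no_plateau, w0=0.2)
```

With these illustrative values, the plateau arrives a full second after presynaptic activity ends yet still changes the weight, because the traces have not decayed. The change is bidirectional: the weak synapse is potentiated and the strong one is depressed, so the weight moves toward an intermediate fixed point. Without a plateau, the weight does not change at all.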
