1.
J Neurosci ; 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-38969504

ABSTRACT

Dopamine release in the nucleus accumbens core (NAcC) is generally considered to be a proxy for phasic firing of dopamine neurons in the ventral tegmental area (VTADA). Thus, dopamine release in NAcC is hypothesized to reflect a unitary role in reward prediction error signalling. However, recent studies revealed more diverse roles of dopamine neurons, which support an emerging idea that dopamine regulates learning differently in distinct circuits. To understand whether the NAcC might regulate a unique component of learning, we recorded dopamine release in NAcC while male rats performed a backward conditioning task in which a reward was followed by a cue. We used this task because it allows us to delineate different components of learning, which include sensory-specific inhibitory and general excitatory components. Further, we have shown that VTADA neurons are necessary for both the specific and general components of backward associations. Here, we found that dopamine release in NAcC increased to the reward across learning, while the response to the cue that followed diminished as that cue became more expected. This mirrors the dopamine prediction error signal seen during forward conditioning and cannot be accounted for by temporal-difference reinforcement learning (TDRL). Subsequent tests allowed us to dissociate these learning components and revealed that dopamine release in NAcC reflects the general excitatory component of backward associations, but not their sensory-specific component. These results emphasize the importance of examining distinct functions of different dopamine projections in reinforcement learning.

Significance Statement: Dopamine regulates reinforcement learning. While it was previously believed that this system contributed to simple value assignment to reward cues, we now know dopamine plays increasingly diverse roles in reinforcement learning. How these diverse roles are achieved in distinct circuits is not fully understood. By using behavioural tasks that examine distinctive components of learning separately, we reveal that NAcC dopamine release contributes to a unique component of learning. Thus, the present study supports a distinct role of NAcC in reinforcement learning, consistent with the idea that different dopamine systems serve different learning functions. Examining the roles of different dopamine projections is an important approach to identifying neuronal mechanisms underlying the reinforcement-learning deficits observed in schizophrenia and drug addiction.
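
One way to see why TDRL cannot account for this pattern is a minimal simulation. The sketch below (an illustrative tabular TD(0) model with arbitrary parameters, not the authors' analysis) shows that a cue presented after reward never acquires value and never generates a TD error, so the evolving dopamine response observed at the backward cue has no TDRL counterpart.

    import numpy as np

    # Minimal tabular TD(0) sketch of backward conditioning (reward -> cue).
    # All parameter values are arbitrary, for illustration only.
    alpha, gamma = 0.1, 0.95              # learning rate, discount factor
    ITI, REWARD, CUE, END = 0, 1, 2, 3    # states within a trial
    V = np.zeros(4)                       # V[END] stays 0 (terminal state)

    for trial in range(200):
        for s in (ITI, REWARD, CUE):
            r = 1.0 if s == REWARD else 0.0      # reward arrives before the cue
            delta = r + gamma * V[s + 1] - V[s]  # TD prediction error
            V[s] += alpha * delta

    print(V)  # V[CUE] stays ~0: TD assigns no value, and no error, to a backward cue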

2.
Nat Neurosci ; 2024 May 13.
Article in English | MEDLINE | ID: mdl-38741021

ABSTRACT

Dopamine neurons in the ventral tegmental area support intracranial self-stimulation (ICSS), yet the cognitive representations underlying this phenomenon remain unclear. Here, 20-Hz stimulation of dopamine neurons, which approximates a physiologically relevant prediction error, was not sufficient to support ICSS beyond a continuously reinforced schedule and did not endow cues with general or specific value. However, 50-Hz stimulation of dopamine neurons was sufficient to drive robust ICSS and was represented as a specific reward that motivated behavior. The frequency dependence of this effect is due to the rate (not the number) of action potentials produced by dopamine neurons, which differentially modulates dopamine release downstream.

3.
Nat Neurosci ; 27(4): 728-736, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38396258

ABSTRACT

To make adaptive decisions, we build an internal model of the associative relationships in an environment and use it to make predictions and inferences about specific available outcomes. Detailed, identity-specific cue-reward memories are a core feature of such cognitive maps. Here we used fiber photometry, cell-type and pathway-specific optogenetic manipulation, Pavlovian cue-reward conditioning and decision-making tests in male and female rats, to reveal that ventral tegmental area dopamine (VTADA) projections to the basolateral amygdala (BLA) drive the encoding of identity-specific cue-reward memories. Dopamine is released in the BLA during cue-reward pairing; VTADA→BLA activity is necessary and sufficient to link the identifying features of a reward to a predictive cue but does not assign general incentive properties to the cue or mediate reinforcement. These data reveal a dopaminergic pathway for the learning that supports adaptive decision-making and help explain how VTADA neurons achieve their emerging multifaceted role in learning.


Subject(s)
Basolateral Nuclear Complex , Rats , Male , Female , Animals , Basolateral Nuclear Complex/physiology , Dopamine , Learning/physiology , Reward , Reinforcement, Psychology , Cues
4.
Trends Cogn Sci ; 28(1): 18-29, 2024 01.
Article in English | MEDLINE | ID: mdl-37758590

ABSTRACT

Despite the physiological complexity of the hypothalamus, its role is typically restricted to the initiation or cessation of innate behaviors. For example, theories of the lateral hypothalamus argue that it is a switch that turns feeding 'on' and 'off', as dictated by higher-order structures that determine when feeding is appropriate. However, recent data demonstrate that the lateral hypothalamus is critical for learning about food-related cues. Furthermore, the lateral hypothalamus opposes learning about information that is neutral or distal to food. This reveals the lateral hypothalamus as a unique arbitrator of learning, capable of shifting behavior toward or away from important events. This has relevance for disorders characterized by changes in this balance, including addiction and schizophrenia. More generally, this suggests that hypothalamic function is more complex than increasing or decreasing innate behaviors.


Subject(s)
Hypothalamic Area, Lateral , Hypothalamus , Humans , Hypothalamic Area, Lateral/physiology , Hypothalamus/physiology , Learning/physiology , Cues , Cognition , Reward
5.
Elife ; 11: 2022 08 23.
Article in English | MEDLINE | ID: mdl-35997072

ABSTRACT

Quantitative descriptions of animal behavior are essential to study the neural substrates of cognitive and emotional processes. Analyses of naturalistic behaviors are often performed by hand or with expensive, inflexible commercial software. Recently, machine learning methods for markerless pose estimation enabled automated tracking of freely moving animals, including in labs with limited coding expertise. However, classifying specific behaviors based on pose data requires additional computational analyses and remains a significant challenge for many groups. We developed BehaviorDEPOT (DEcoding behavior based on POsitional Tracking), a simple, flexible software program that can detect behavior from video timeseries and can analyze the results of experimental assays. BehaviorDEPOT calculates kinematic and postural statistics from keypoint tracking data and creates heuristics that reliably detect behaviors. It requires no programming experience and is applicable to a wide range of behaviors and experimental designs. We provide several hard-coded heuristics. Our freezing detection heuristic achieves above 90% accuracy in videos of mice and rats, including those wearing tethered head-mounts. BehaviorDEPOT also helps researchers develop their own heuristics and incorporate them into the software's graphical interface. Behavioral data is stored framewise for easy alignment with neural data. We demonstrate the immediate utility and flexibility of BehaviorDEPOT using popular assays including fear conditioning, decision-making in a T-maze, open field, elevated plus maze, and novel object exploration.
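
BehaviorDEPOT itself is distributed as MATLAB software with a graphical interface; purely to illustrate the style of heuristic the abstract describes, below is a hypothetical Python sketch of a velocity-threshold freezing detector operating on keypoint tracking, with the function name and every threshold invented for this example.

    import numpy as np

    def detect_freezing(xy, fps, vel_thresh=0.5, min_bout_s=1.0):
        """Hypothetical freezing heuristic: flag frames where smoothed
        keypoint speed stays below a threshold for a minimum duration.
        xy: (n_frames, 2) array of one tracked body part (pixels).
        Thresholds are illustrative, not BehaviorDEPOT's values."""
        speed = np.linalg.norm(np.diff(xy, axis=0), axis=1) * fps  # px/s
        speed = np.convolve(speed, np.ones(5) / 5, mode="same")    # smooth
        below = speed < vel_thresh
        freezing = np.zeros_like(below)
        min_frames = int(min_bout_s * fps)
        start = None
        for i, b in enumerate(below):           # keep only long-enough bouts
            if b and start is None:
                start = i
            elif not b and start is not None:
                if i - start >= min_frames:
                    freezing[start:i] = True
                start = None
        if start is not None and below.size - start >= min_frames:
            freezing[start:] = True
        return freezing  # framewise booleans, easy to align with neural data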


Subject(s)
Behavior, Animal , Software , Animals , Biomechanical Phenomena , Machine Learning , Rats
6.
Nat Neurosci ; 25(8): 1071-1081, 2022 08.
Article in English | MEDLINE | ID: mdl-35902648

ABSTRACT

Studies investigating the neural mechanisms by which associations between cues and predicted outcomes control behavior often rely on associative learning frameworks. These frameworks do not always account for the full range of effects that novelty can have on behavior and future associative learning. Here, in mice, we show that dopamine in the nucleus accumbens core is evoked by novel, neutral stimuli, and that the trajectory of this response over time tracked habituation to these stimuli. Habituation to novel cues before associative learning reduced future associative learning, a phenomenon known as latent inhibition. Crucially, trial-by-trial dopamine response patterns tracked this phenomenon. Optogenetic manipulation of dopamine responses to the cue during the habituation period bidirectionally influenced future associative learning. Thus, dopamine signaling in the nucleus accumbens core has a causal role in novelty-based learning in a way that cannot be predicted from purely associative factors.


Subject(s)
Dopamine , Nucleus Accumbens , Animals , Conditioning, Classical/physiology , Cues , Dopamine/physiology , Memory , Mice , Nucleus Accumbens/physiology
7.
Curr Biol ; 32(14): 3210-3218.e3, 2022 07 25.
Article in English | MEDLINE | ID: mdl-35752165

ABSTRACT

For over two decades, phasic activity in midbrain dopamine neurons was considered synonymous with the prediction error in temporal-difference reinforcement learning.1-4 Central to this proposal is the notion that reward-predictive stimuli become endowed with the scalar value of predicted rewards. When these cues are subsequently encountered, their predictive value is compared to the value of the actual reward received, allowing for the calculation of prediction errors.5,6 Phasic firing of dopamine neurons was proposed to reflect this computation,1,2 facilitating the backpropagation of value from the predicted reward to the reward-predictive stimulus, thus reducing future prediction errors. There are two critical assumptions of this proposal: (1) that dopamine errors can only facilitate learning about scalar value and not more complex features of predicted rewards, and (2) that the dopamine signal can only be involved in anticipatory cue-reward learning in which cues or actions precede rewards. Recent work7-15 has challenged the first assumption, demonstrating that phasic dopamine signals across species are involved in learning about more complex features of the predicted outcomes, in a manner that transcends this value computation. Here, we tested the validity of the second assumption. Specifically, we examined whether phasic midbrain dopamine activity would be necessary for backward conditioning, in which a neutral cue reliably follows a rewarding outcome.16-20 Using a specific Pavlovian-to-instrumental transfer (PIT) procedure,21-23 we show that rats learn both excitatory and inhibitory components of a backward association, and that this association entails knowledge of the specific identity of the reward and cue. We demonstrate that brief optogenetic inhibition of VTADA neurons timed to the transition between the reward and cue reduces both of these components of backward conditioning. These findings suggest that VTADA neurons are capable of facilitating associations between contiguously occurring events, regardless of the content of those events. We conclude that these data may be in line with suggestions that the VTADA error acts as a universal teaching signal. This may provide insight into why dopamine function has been implicated in myriad psychological disorders that are characterized by very distinct reinforcement-learning deficits.


Subject(s)
Dopamine , Reward , Animals , Cues , Dopamine/physiology , Dopaminergic Neurons/physiology , Learning/physiology , Rats , Reinforcement, Psychology
8.
Behav Brain Res ; 417: 113587, 2022 01 24.
Article in English | MEDLINE | ID: mdl-34543677

ABSTRACT

Prior experience changes the way we learn about our environment. Stress predisposes individuals to developing psychological disorders, just as positive experiences protect against this eventuality (Kirkpatrick & Heller, 2014; Koenigs & Grafman, 2009; Pechtel & Pizzagalli, 2011). Yet current models of how the brain processes information often do not consider a role for prior experience. The considerable literature that examines how stress impacts the brain is an exception to this. This research demonstrates that stress can bias the interpretation of ambiguous events towards being aversive in nature, owing to changes in amygdala physiology (Holmes et al., 2013; Perusini et al., 2016; Rau et al., 2005; Shors et al., 1992). This is thought to be an important model for how people develop anxiety disorders, like post-traumatic stress disorder (PTSD; Rau et al., 2005). However, more recent evidence suggests that experience with reward learning can also change the neural circuits that are involved in learning about fear (Sharpe et al., 2021). Specifically, the lateral hypothalamus, a region typically thought to be restricted to modulating feeding and reward behavior, can be recruited to encode fear memories after experience with reward learning. This review discusses the literature on how stress and reward change the way we acquire and encode memories for aversive events, offering a testable model of how these regions may interact to promote either adaptive or maladaptive fear memories.


Subject(s)
Amygdala/physiology , Fear/physiology , Hypothalamic Area, Lateral/physiology , Memory/physiology , Reward , Brain/physiology , Humans , Learning/physiology
9.
Neuropsychopharmacology ; 47(3): 628-640, 2022 02.
Article in English | MEDLINE | ID: mdl-34588607

ABSTRACT

Schizophrenia is a severe psychiatric disorder affecting 21 million people worldwide. People with schizophrenia suffer from symptoms including psychosis and delusions, apathy, anhedonia, and cognitive deficits. Strikingly, schizophrenia is characterised by a learning paradox involving difficulties learning from rewarding events, whilst simultaneously 'overlearning' about irrelevant or neutral information. While dysfunction in dopaminergic signalling has long been linked to the pathophysiology of schizophrenia, a cohesive framework that accounts for this learning paradox remains elusive. Recently, there has been an explosion of new research illustrating that midbrain dopamine contributes to reinforcement learning in complex ways not previously envisioned. These new data bring new possibilities for how dopamine signalling contributes to the symptomatology of schizophrenia. Building on recent work, we present a new neural framework for how specific dopamine circuits might contribute to this learning paradox in schizophrenia, in the context of models of reinforcement learning. Further, we discuss avenues of preclinical research using cutting-edge neuroscience techniques in which aspects of this model may be tested. Ultimately, it is hoped that this review will spur more research utilising specific reinforcement learning paradigms in preclinical models of schizophrenia, to reconcile seemingly disparate symptomatology and develop more effective therapeutics.


Subject(s)
Psychotic Disorders , Schizophrenia , Dopamine/physiology , Humans , Psychotic Disorders/psychology , Reinforcement, Psychology , Reward
10.
Front Behav Neurosci ; 15: 745388, 2021.
Article in English | MEDLINE | ID: mdl-34671247

ABSTRACT

Higher-order conditioning involves learning causal links between multiple events, which then allows one to make novel inferences. For example, observing a correlation between two events (e.g., a neighbor wearing a particular sports jersey) later helps one make new predictions based on this knowledge (e.g., the neighbor's wife's favorite sports team). This type of learning is important because it allows one to benefit maximally from previous experiences and perform adaptively in complex environments where many things are ambiguous or uncertain. Two laboratory procedures are often used to probe this kind of learning: second-order conditioning (SOC) and sensory preconditioning (SPC). In SOC, we first teach subjects that there is a relationship between a stimulus and an outcome (e.g., a tone that predicts food). Then, an additional stimulus is taught to precede the predictive stimulus (e.g., a light leads to the food-predictive tone). In SPC, this order of training is reversed. Specifically, the two neutral stimuli (i.e., light and tone) are first paired together and then the tone is paired separately with food. Interestingly, in both SPC and SOC, humans, rodents, and even insects and other invertebrates will later predict that both the light and tone are likely to lead to food, even though they only experienced the tone directly paired with food, as summarized in the schematic below. While these processes are procedurally similar, a wealth of research suggests they are associatively and neurobiologically distinct. However, midbrain dopamine, a neurotransmitter long thought to facilitate basic Pavlovian conditioning in a relatively simplistic manner, appears critical for both SOC and SPC. These findings suggest dopamine may contribute to learning in ways that transcend differences in associative and neurological structure. We discuss how research demonstrating that dopamine is critical to both SOC and SPC places it at the center of more complex forms of cognition (e.g., spatial navigation and causal reasoning). Further, we suggest that these more sophisticated learning procedures, coupled with recent advances in recording and manipulating dopamine neurons, represent a new path forward in understanding dopamine's contribution to learning and cognition.
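
The two designs differ only in the order of the training phases; schematically (responding at test to the light, which was never directly paired with food, is the signature of higher-order learning):

    SOC:  Phase 1: tone -> food     Phase 2: light -> tone    Test: light
    SPC:  Phase 1: light -> tone    Phase 2: tone -> food     Test: light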

11.
Nat Neurosci ; 24(3): 391-400, 2021 03.
Article in English | MEDLINE | ID: mdl-33589832

ABSTRACT

Experimental research controls for past experience, yet prior experience influences how we learn. Here, we tested whether we could recruit a neural population that usually encodes rewards to encode aversive events. Specifically, we found that GABAergic neurons in the lateral hypothalamus (LH) were not involved in learning about fear in naïve rats. However, if these rats had prior experience with rewards, LH GABAergic neurons became important for learning about fear. Interestingly, inhibition of these neurons paradoxically enhanced learning about neutral sensory information, regardless of prior experience, suggesting that LH GABAergic neurons normally oppose learning about irrelevant information. These experiments suggest that prior experience shapes the neural circuits recruited for future learning in a highly specific manner, reopening the neural boundaries we have drawn for learning of particular types of information from work in naïve subjects.


Subject(s)
Conditioning, Classical/physiology , Fear/physiology , GABAergic Neurons/physiology , Hypothalamic Area, Lateral/physiology , Learning/physiology , Animals , Cues , Female , Male , Neural Pathways/physiology , Rats , Rats, Long-Evans , Rats, Transgenic , Reward
12.
J Neurosci ; 41(2): 342-353, 2021 01 13.
Article in English | MEDLINE | ID: mdl-33219006

ABSTRACT

Substance use disorders (SUDs) are characterized by maladaptive behavior. The ability to properly adjust behavior according to changes in environmental contingencies necessitates the interlacing of existing memories with updated information. This can be achieved by assigning learning in different contexts to compartmentalized "states." Though not often framed this way, the maladaptive behavior observed in individuals with SUDs may result from a failure to properly encode states because of drug-induced neural alterations. Previous studies found that the dorsomedial striatum (DMS) is important for behavioral flexibility and state encoding, suggesting the DMS may be an important substrate for these effects. Here, we recorded DMS neural activity in cocaine-experienced male rats during a decision-making task where blocks of trials represented distinct states to probe whether the encoding of state and state-related information is affected by prior drug exposure. We found that DMS medium spiny neurons (MSNs) and fast-spiking interneurons (FSIs) encoded such information and that prior cocaine experience disrupted the evolution of representations both within trials and across recording sessions. Specifically, DMS MSNs and FSIs from cocaine-experienced rats demonstrated higher classification accuracy of trial-specific rules, defined by response direction and value, compared with those drawn from sucrose-experienced rats, and these overly strengthened trial-type representations were related to slower switching behavior and reaction times. These data show that prior cocaine experience paradoxically increases the encoding of state-specific information and rules in the DMS and suggest a model in which abnormally specific and persistent representation of rules throughout trials in DMS slows value-based decision-making in well trained subjects.

SIGNIFICANCE STATEMENT: Substance use disorders (SUDs) may result from a failure to properly encode rules guiding situationally appropriate behavior. The dorsomedial striatum (DMS) is thought to be important for such behavioral flexibility and encoding that defines the situation or "state." This suggests that the DMS may be an important substrate for the maladaptive behavior observed in SUDs. In the current study, we show that prior cocaine experience results in over-encoding of state-specific information and rules in the DMS, which may impair normal adaptive decision-making in the task, akin to what is observed in SUDs.
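
The classification analyses behind these results are population decoding of trial type from firing rates. As a generic sketch of that style of analysis (hypothetical data shapes and a standard cross-validated linear classifier, not the paper's exact pipeline):

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    # Hypothetical shapes: a trials x neurons matrix of spike counts and a
    # label per trial (e.g., response direction or value condition).
    rng = np.random.default_rng(0)
    X = rng.poisson(5.0, size=(200, 40)).astype(float)  # 200 trials, 40 cells
    y = rng.integers(0, 2, size=200)                    # binary trial-type label

    # Cross-validated decoding accuracy: higher accuracy means a stronger
    # trial-type representation in the recorded population.
    clf = LogisticRegression(max_iter=1000)
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(f"decoding accuracy: {acc:.2f}")  # ~0.5 for this random data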


Subject(s)
Cocaine-Related Disorders/psychology , Cocaine/pharmacology , Decision Making/drug effects , Neostriatum/drug effects , Animals , Choice Behavior/drug effects , Interneurons/drug effects , Male , Neurons/drug effects , Odorants , Psychomotor Performance/drug effects , Rats , Rats, Long-Evans , Reaction Time/drug effects , Reward , Self Administration , Sucrose/pharmacology
13.
Elife ; 9: 2020 08 24.
Article in English | MEDLINE | ID: mdl-32831173

ABSTRACT

The orbitofrontal cortex (OFC) is necessary for inferring value in tests of model-based reasoning, including in sensory preconditioning. This involvement could be accounted for by representation of value or by representation of broader associative structure. We recently reported neural correlates of such broader associative structure in OFC during the initial phase of sensory preconditioning (Sadacca et al., 2018). Here, we used optogenetic inhibition of OFC to test whether these correlates might be necessary for value inference during later probe testing. We found that inhibition of OFC during cue-cue learning abolished value inference during the probe test, inference subsequently shown in control rats to be sensitive to devaluation of the expected reward. These results demonstrate that OFC must be online during cue-cue learning, consistent with the argument that the correlates previously observed are not simply downstream readouts of sensory processing and instead contribute to building the associative model supporting later behavior.


Subject(s)
Conditioning, Psychological/physiology , Learning/physiology , Prefrontal Cortex/physiology , Animals , Cues , Female , Male , Optogenetics , Rats , Rats, Long-Evans
14.
Nat Commun ; 11(1): 106, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31913274

ABSTRACT

Dopamine neurons are proposed to signal the reward prediction error in model-free reinforcement learning algorithms. This term represents the unpredicted or 'excess' value of the rewarding event, value that is then added to the intrinsic value of any antecedent cues, contexts or events. To support this proposal, proponents cite evidence that artificially-induced dopamine transients cause lasting changes in behavior. Yet these studies do not generally assess learning under conditions where an endogenous prediction error would occur. Here, to address this, we conducted three experiments where we optogenetically activated dopamine neurons while rats were learning associative relationships, both with and without reward. In each experiment, the antecedent cues failed to acquire value and instead entered into associations with the later events, whether valueless cues or valued rewards. These results show that in learning situations appropriate for the appearance of a prediction error, dopamine transients support associative, rather than model-free, learning.


Subject(s)
Dopamine/metabolism , Dopaminergic Neurons/physiology , Learning , Animals , Behavior, Animal , Conditioning, Classical , Cues , Female , Male , Models, Neurological , Rats , Reward
15.
Nat Neurosci ; 23(2): 176-178, 2020 02.
Article in English | MEDLINE | ID: mdl-31959935

ABSTRACT

Reward-evoked dopamine transients are well established as prediction errors. However, the central tenet of temporal difference accounts, namely that similar transients evoked by reward-predictive cues also function as errors, remains untested. In the present communication we addressed this by showing that optogenetically shunting dopamine activity at the start of a reward-predicting cue prevents second-order conditioning without affecting blocking. These results indicate that cue-evoked transients function as temporal-difference prediction errors rather than reward predictions.
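
Blocking is the critical comparison here: in error-driven accounts such as Rescorla-Wagner or TD, a pretrained cue absorbs the available prediction, leaving no error to support learning about a cue added alongside it. A minimal Rescorla-Wagner sketch (arbitrary parameters, for illustration only):

    # Minimal Rescorla-Wagner sketch of blocking; parameters are arbitrary.
    alpha, lam = 0.2, 1.0                # learning rate, reward magnitude
    V = {"A": 0.0, "X": 0.0}

    for _ in range(100):                 # Stage 1: A -> reward
        V["A"] += alpha * (lam - V["A"])

    for _ in range(100):                 # Stage 2: AX -> reward
        error = lam - (V["A"] + V["X"])  # error shared by both cues
        V["A"] += alpha * error
        V["X"] += alpha * error

    print(V)  # V["A"] ~ 1.0, V["X"] ~ 0.0: learning about X is "blocked"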


Subject(s)
Association Learning/physiology , Brain/physiology , Dopamine/metabolism , Animals , Conditioning, Operant/physiology , Cues , Dopaminergic Neurons/physiology , Rats , Rats, Long-Evans , Rats, Transgenic , Reward
16.
Annu Rev Psychol ; 70: 53-76, 2019 01 04.
Article in English | MEDLINE | ID: mdl-30260745

ABSTRACT

Making decisions in environments with few choice options is easy. We select the action that results in the most valued outcome. Making decisions in more complex environments, where the same action can produce different outcomes in different conditions, is much harder. In such circumstances, we propose that accurate action selection relies on top-down control from the prelimbic and orbitofrontal cortices over striatal activity through distinct thalamostriatal circuits. We suggest that the prelimbic cortex exerts direct influence over medium spiny neurons in the dorsomedial striatum to represent the state space relevant to the current environment. Conversely, the orbitofrontal cortex is argued to track a subject's position within that state space, likely through modulation of cholinergic interneurons.


Subject(s)
Cerebral Cortex/physiology , Corpus Striatum/physiology , Decision Making/physiology , Executive Function/physiology , Models, Psychological , Animals , Humans
17.
Psychol Rev ; 125(5): 822-843, 2018 10.
Article in English | MEDLINE | ID: mdl-30299142

ABSTRACT

Theories of functioning in the medial prefrontal cortex are distinct across appetitively and aversively motivated procedures. In the appetitive domain, it is argued that the medial prefrontal cortex is important for producing adaptive behavior when circumstances change. This view advocates a role for this region in using higher-order information to bias performance appropriate to that circumstance. Conversely, literature born out of aversive studies has led to the theory that the prelimbic region of the medial prefrontal cortex is necessary for the expression of conditioned fear, whereas the infralimbic region is necessary for a decrease in responding following extinction. Here, the argument is that these regions are primed to increase or decrease fear responses and that this tendency is gated by subcortical inputs. However, we believe the data from aversive studies can be explained by a supraordinate role for the medial prefrontal cortex in behavioral flexibility, in line with the appetitive literature. Using a dichotomy between the voluntary control of behavior and the execution of well-trained responses, we attempt to reconcile these theories. We argue that the prelimbic region exerts voluntary control over behavior via top-down modulation of stimulus-response pathways according to task demands, contextual cues, and how well a stimulus predicts an outcome. Conversely, the infralimbic region promotes responding based on the strength of stimulus-response pathways determined by experience with reinforced contingencies. This system resolves the tension between executing voluntary actions sensitive to recent changes in contingencies, and responses that reflect the animal's experience across the long run.


Subject(s)
Attention/physiology , Behavior, Animal/physiology , Conditioning, Classical/physiology , Prefrontal Cortex/physiology , Animals , Rats
18.
Nat Neurosci ; 21(10): 1493, 2018 Oct.
Article in English | MEDLINE | ID: mdl-30018354

ABSTRACT

In the version of this article initially published, the laser activation at the start of cue X in experiment 1 was described in the first paragraph of the Results and in the third paragraph of the Experiment 1 section of the Methods as lasting 2 s; in fact, it lasted only 1 s. The error has been corrected in the HTML and PDF versions of the article.

19.
Neuropsychopharmacology ; 43(8): 1-2, 2018 07.
Article in English | MEDLINE | ID: mdl-29520057

ABSTRACT

We have long known that dopamine encodes the predictive relationship between cues and rewards. But what about relief learning? In this issue of Neuropsychopharmacology, Mayer et al. show that the same circuits encoding rewarding events also encode relief from aversive events, and they appear to do so in a manner distinct from encoding of the aversive event itself. So does dopamine only contribute to learning about positive events? And are these events encoded in the same way regardless of how that positive experience came about? Not quite. Turns out, the devil is in the details.


Subject(s)
Dopamine , Neurons , Brain , Learning , Reward
20.
Neurobiol Learn Mem ; 153(Pt B): 131-136, 2018 09.
Article in English | MEDLINE | ID: mdl-29269085

ABSTRACT

The phasic dopamine error signal is currently argued to be synonymous with the prediction error in Sutton and Barto's (1987, 1998) model-free reinforcement learning algorithm (Schultz et al., 1997). This theory argues that phasic dopamine reflects a cached-value signal that endows reward-predictive cues with the scalar value inherent in reward. Such an interpretation does not envision a role for dopamine in the more complex cognitive representations between events that underlie many forms of associative learning, restricting the role dopamine can play in learning. The cached-value hypothesis of dopamine makes three concrete predictions about when a phasic dopamine response should be seen and what types of learning this signal should be able to promote. We discuss these predictions in light of recent evidence that we believe provides particularly strong tests of their validity. In doing so, we find that while the phasic dopamine signal conforms to a cached-value account in some circumstances, other evidence demonstrates that this signal is not restricted to a model-free cached-value reinforcement learning signal. In light of this evidence, we argue that the phasic dopamine signal functions more generally to signal violations of expectancies, driving real-world associations between events.


Subject(s)
Association Learning/physiology , Brain/physiology , Dopamine/physiology , Models, Neurological , Reward , Animals , Humans