Results 1 - 20 of 27
1.
bioRxiv ; 2024 Sep 16.
Article in English | MEDLINE | ID: mdl-39345642

ABSTRACT

Limits on information processing capacity impose limits on task performance. We show that animals achieve performance on a perceptual decision task that is near-optimal given their capacity limits, as measured by policy complexity (the mutual information between states and actions). This behavioral profile could be achieved by reinforcement learning with a penalty on high complexity policies, realized through modulation of dopaminergic learning signals. In support of this hypothesis, we find that policy complexity suppresses midbrain dopamine responses to reward outcomes, thereby reducing behavioral sensitivity to these outcomes. Our results suggest that policy compression shapes basic mechanisms of reinforcement learning in the brain.
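Policy complexity as defined in this abstract is the mutual information I(S; A) between states and actions. The sketch below computes it for a discrete policy; it is an illustrative calculation (function and variable names are ours), not the authors' code:

```python
import numpy as np

def policy_complexity(p_state, policy):
    """Policy complexity: mutual information I(S; A), in bits, between
    states and actions, for policy[s, a] = P(a | s) and a state
    distribution p_state[s] = P(s). Illustrative discrete example."""
    p_state = np.asarray(p_state, dtype=float)
    policy = np.asarray(policy, dtype=float)
    p_action = p_state @ policy                # marginal P(a)
    joint = p_state[:, None] * policy          # joint P(s, a)
    mask = joint > 0                           # skip zero-probability pairs
    ratio = policy[mask] / np.broadcast_to(p_action, policy.shape)[mask]
    return float(np.sum(joint[mask] * np.log2(ratio)))

# A state-specific deterministic policy is maximally complex (1 bit here);
# a state-independent policy has zero complexity, i.e. it ignores the state.
deterministic = np.array([[1.0, 0.0], [0.0, 1.0]])
uniform = np.array([[0.5, 0.5], [0.5, 0.5]])
```

A complexity penalty in this framework favors policies closer to the `uniform` end, trading task reward for cheaper information processing.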

2.
Nat Commun ; 15(1): 7093, 2024 Aug 17.
Article in English | MEDLINE | ID: mdl-39154025

ABSTRACT

Perceptual decisions should depend on sensory evidence. However, such decisions are also influenced by past choices and outcomes. These choice history biases may reflect advantageous strategies to exploit temporal regularities of natural environments. However, it is unclear whether and how observers can adapt their choice history biases to different temporal regularities, to exploit the multitude of temporal correlations that exist in nature. Here, we show that male mice adapt their perceptual choice history biases to different temporal regularities of visual stimuli. This adaptation was slow, evolving over hundreds of trials across several days. It occurred alongside a fast non-adaptive choice history bias, limited to a few trials. Both fast and slow trial history effects are well captured by a normative reinforcement learning algorithm with multi-trial belief states, comprising both current trial sensory and previous trial memory states. We demonstrate that dorsal striatal dopamine tracks predictions of the model and behavior, suggesting that striatal dopamine reports reward predictions associated with adaptive choice history biases. Our results reveal the adaptive nature of perceptual choice history biases and shed light on their underlying computational principles and neural correlates.


Subject(s)
Choice Behavior , Corpus Striatum , Dopamine , Animals , Male , Dopamine/metabolism , Mice , Corpus Striatum/metabolism , Corpus Striatum/physiology , Choice Behavior/physiology , Mice, Inbred C57BL , Decision Making/physiology , Reward , Photic Stimulation , Visual Perception/physiology , Reinforcement, Psychology
3.
PLoS Comput Biol ; 20(4): e1011516, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38626219

ABSTRACT

When facing an unfamiliar environment, animals need to explore to gain new knowledge about which actions provide reward, but also put the newly acquired knowledge to use as quickly as possible. Optimal reinforcement learning strategies should therefore assess the uncertainties of these action-reward associations and utilise them to inform decision making. We propose a novel model whereby direct and indirect striatal pathways act together to estimate both the mean and variance of reward distributions, and mesolimbic dopaminergic neurons provide transient novelty signals, facilitating effective uncertainty-driven exploration. We utilised electrophysiological recording data to verify our model of the basal ganglia, and we fitted exploration strategies derived from the neural model to data from behavioural experiments. We also compared the performance of directed exploration strategies inspired by our basal ganglia model with other exploration algorithms including classic variants of upper confidence bound (UCB) strategy in simulation. The exploration strategies inspired by the basal ganglia model can achieve overall superior performance in simulation, and we found qualitatively similar results in fitting model to behavioural data compared with the fitting of more idealised normative models with less implementation level detail. Overall, our results suggest that transient dopamine levels in the basal ganglia that encode novelty could contribute to an uncertainty representation which efficiently drives exploration in reinforcement learning.
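One of the baselines named here, the classic upper confidence bound (UCB) strategy, directs exploration by adding an uncertainty bonus to each action's estimated value. A minimal UCB1-style sketch (illustrative; names and the constant `c` are our assumptions, not the paper's fitted model):

```python
import math

def ucb_choice(counts, mean_rewards, t, c=2.0):
    """UCB1-style action selection: pick the action maximizing an
    optimistic score (estimated mean + exploration bonus). The bonus
    shrinks as an action is sampled more, so exploration is directed
    toward uncertain options. Illustrative sketch."""
    def score(a):
        if counts[a] == 0:
            return float("inf")   # try every action at least once
        return mean_rewards[a] + c * math.sqrt(math.log(t) / counts[a])
    return max(range(len(counts)), key=score)
```

With equal sample counts the bonus is equal and the higher-mean action wins; an untried action is always selected first.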


Subject(s)
Basal Ganglia , Dopamine , Models, Neurological , Reward , Dopamine/metabolism , Dopamine/physiology , Uncertainty , Animals , Basal Ganglia/physiology , Exploratory Behavior/physiology , Reinforcement, Psychology , Dopaminergic Neurons/physiology , Computational Biology , Computer Simulation , Male , Algorithms , Decision Making/physiology , Behavior, Animal/physiology , Rats
4.
Cell Rep ; 41(2): 111470, 2022 10 11.
Article in English | MEDLINE | ID: mdl-36223748

ABSTRACT

Goal-directed navigation requires learning to accurately estimate location and select optimal actions in each location. Midbrain dopamine neurons are involved in reward value learning and have been linked to reward location learning. They are therefore ideally placed to provide teaching signals for goal-directed navigation. By imaging dopamine neural activity as mice learned to actively navigate a closed-loop virtual reality corridor to obtain reward, we observe phasic and pre-reward ramping dopamine activity, which are modulated by learning stage and task engagement. A Q-learning model incorporating position inference recapitulates our results, displaying prediction errors resembling phasic and ramping dopamine neural activity. The model predicts that ramping is followed by improved task performance, which we confirm in our experimental data, indicating that the dopamine ramp may have a teaching effect. Our results suggest that midbrain dopamine neurons encode phasic and ramping reward prediction error signals to improve goal-directed navigation.
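The Q-learning account described here rests on the standard temporal-difference update, whose prediction error is the quantity compared with dopamine activity. A minimal tabular sketch (the paper's model additionally incorporates position inference, which this omits; names are ours):

```python
def q_update(Q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step. Returns the reward prediction error
    delta, the teaching signal that phasic and ramping dopamine activity
    is proposed to resemble. Illustrative sketch only."""
    delta = reward + gamma * max(Q[next_state]) - Q[state][action]
    Q[state][action] += alpha * delta   # move the estimate toward the target
    return delta
```

As the estimate of a rewarded location improves, `delta` at reward delivery shrinks while value-driven responses migrate earlier along the trajectory.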


Subject(s)
Dopamine , Dopaminergic Neurons , Animals , Dopamine/physiology , Goals , Mesencephalon/physiology , Mice , Reward
5.
J Neurosci ; 41(34): 7197-7205, 2021 08 25.
Article in English | MEDLINE | ID: mdl-34253628

ABSTRACT

The striatum plays critical roles in visually-guided decision-making and receives dense axonal projections from midbrain dopamine neurons. However, the roles of striatal dopamine in visual decision-making are poorly understood. We trained male and female mice to perform a visual decision task with asymmetric reward payoff, and we recorded the activity of dopamine axons innervating striatum. Dopamine axons in the dorsomedial striatum (DMS) responded to contralateral visual stimuli and contralateral rewarded actions. Neural responses to contralateral stimuli could not be explained by orienting behavior such as eye movements. Moreover, these contralateral stimulus responses persisted in sessions where the animals were instructed to not move to obtain reward, further indicating that these signals are stimulus-related. Lastly, we show that DMS dopamine signals were qualitatively different from dopamine signals in the ventral striatum (VS), which responded to both ipsilateral and contralateral stimuli, conforming to canonical prediction error signaling under sensory uncertainty. Thus, during visual decisions, DMS dopamine encodes visual stimuli and rewarded actions in a lateralized fashion, and could facilitate associations between specific visual stimuli and actions.

SIGNIFICANCE STATEMENT: While the striatum is central to goal-directed behavior, the precise roles of its rich dopaminergic innervation in perceptual decision-making are poorly understood. We found that in a visual decision task, dopamine axons in the dorsomedial striatum (DMS) signaled stimuli presented contralaterally to the recorded hemisphere, as well as the onset of rewarded actions. Stimulus-evoked signals persisted in a no-movement task variant. We distinguish the patterns of these signals from those in the ventral striatum (VS). Our results contribute to the characterization of region-specific dopaminergic signaling in the striatum and highlight a role in stimulus-action association learning.


Subject(s)
Association Learning/physiology , Axons/physiology , Choice Behavior/physiology , Corpus Striatum/physiology , Dopaminergic Neurons/physiology , Photic Stimulation , Reward , Animals , Corpus Striatum/cytology , Dominance, Cerebral , Dopamine/physiology , Eye Movements/physiology , Female , Male , Mice , Mice, Inbred C57BL , Nerve Fibers/ultrastructure
6.
Cell ; 182(1): 112-126.e18, 2020 07 09.
Article in English | MEDLINE | ID: mdl-32504542

ABSTRACT

Every decision we make is accompanied by a sense of confidence about its likely outcome. This sense informs subsequent behavior, such as investing more (whether time, effort, or money) when reward is more certain. A neural representation of confidence should originate from a statistical computation and predict confidence-guided behavior. An additional requirement for confidence representations to support metacognition is abstraction: they should emerge irrespective of the source of information and inform multiple confidence-guided behaviors. It is unknown whether neural confidence signals meet these criteria. Here, we show that single orbitofrontal cortex neurons in rats encode statistical decision confidence irrespective of the sensory modality, olfactory or auditory, used to make a choice. The activity of these neurons also predicts two confidence-guided behaviors: trial-by-trial time investment and cross-trial choice strategy updating. Orbitofrontal cortex thus represents decision confidence consistent with a metacognitive process that is useful for mediating confidence-guided economic decisions.


Subject(s)
Behavior/physiology , Prefrontal Cortex/physiology , Animals , Choice Behavior/physiology , Decision Making , Models, Biological , Neurons/physiology , Rats, Long-Evans , Sensation/physiology , Task Performance and Analysis , Time Factors
7.
Elife ; 9, 2020 04 15.
Article in English | MEDLINE | ID: mdl-32286227

ABSTRACT

Learning from successes and failures often improves the quality of subsequent decisions. Past outcomes, however, should not influence purely perceptual decisions after task acquisition is complete since these are designed so that only sensory evidence determines the correct choice. Yet, numerous studies report that outcomes can bias perceptual decisions, causing spurious changes in choice behavior without improving accuracy. Here we show that the effects of reward on perceptual decisions are principled: past rewards bias future choices specifically when previous choice was difficult and hence decision confidence was low. We identified this phenomenon in six datasets from four laboratories, across mice, rats, and humans, and sensory modalities from olfaction and audition to vision. We show that this choice-updating strategy can be explained by reinforcement learning models incorporating statistical decision confidence into their teaching signals. Thus, reinforcement learning mechanisms are continually engaged to produce systematic adjustments of choices even in well-learned perceptual decisions in order to optimize behavior in an uncertain world.


Subject(s)
Bias , Decision Making/physiology , Reinforcement, Psychology , Animals , Choice Behavior , Hearing , Humans , Mice , Rats , Smell , Vision, Ocular
8.
Neuron ; 105(1): 4-6, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31951527

ABSTRACT

Fundamental research into early circuits of the neocortex provides insight into the etiology of mental illness. In this issue of Neuron, Chini et al. (2020) probe the consequences of combined genetic and environmental perturbation on emergent network activity in the prefrontal cortex, identifying a window for possible intervention.


Subject(s)
Cognitive Dysfunction , Neocortex , Animals , Mice , Neurons , Prefrontal Cortex
9.
Neuron ; 105(4): 700-711.e6, 2020 02 19.
Article in English | MEDLINE | ID: mdl-31859030

ABSTRACT

Deciding between stimuli requires combining their learned value with one's sensory confidence. We trained mice in a visual task that probes this combination. Mouse choices reflected not only present confidence and past rewards but also past confidence. Their behavior conformed to a model that combines signal detection with reinforcement learning. In the model, the predicted value of the chosen option is the product of sensory confidence and learned value. We found precise correlates of this variable in the pre-outcome activity of midbrain dopamine neurons and of medial prefrontal cortical neurons. However, only the latter played a causal role: inactivating medial prefrontal cortex before outcome strengthened learning from the outcome. Dopamine neurons played a causal role only after outcome, when they encoded reward prediction errors graded by confidence, influencing subsequent choices. These results reveal neural signals that combine reward value with sensory confidence and guide subsequent learning.


Subject(s)
Choice Behavior/physiology , Dopaminergic Neurons/metabolism , Learning/physiology , Prefrontal Cortex/metabolism , Reward , Animals , Dopaminergic Neurons/chemistry , Male , Mice , Mice, Inbred C57BL , Mice, Transgenic , Optogenetics/methods , Prefrontal Cortex/chemistry
10.
Cereb Cortex ; 29(5): 2196-2210, 2019 05 01.
Article in English | MEDLINE | ID: mdl-30796825

ABSTRACT

Cortical activity is organized across multiple spatial and temporal scales. Most research on the dynamics of neuronal spiking is concerned with timescales of 1 ms-1 s, and little is known about spiking dynamics on timescales of tens of seconds and minutes. Here, we used frequency domain analyses to study the structure of individual neurons' spiking activity and its coupling to local population rate and to arousal level across 0.01-100 Hz frequency range. In mouse medial prefrontal cortex, the spiking dynamics of individual neurons could be quantitatively captured by a combination of interspike interval and firing rate power spectrum distributions. The relative strength of coherence with local population often differed across timescales: a neuron strongly coupled to population rate on fast timescales could be weakly coupled on slow timescales, and vice versa. On slow but not fast timescales, a substantial proportion of neurons showed firing anticorrelated with the population. Infraslow firing rate changes were largely determined by arousal rather than by local factors, which could explain the timescale dependence of individual neurons' population coupling strength. These observations demonstrate how neurons simultaneously partake in fast local dynamics, and slow brain-wide dynamics, extending our understanding of infraslow cortical activity beyond the mesoscale resolution of fMRI.


Subject(s)
Action Potentials/physiology , Neurons/physiology , Prefrontal Cortex/physiology , Animals , Female , Male , Mice, Inbred C57BL , Models, Neurological , Signal Processing, Computer-Assisted , Time Factors
11.
Cell Rep ; 20(10): 2513-2524, 2017 Sep 05.
Article in English | MEDLINE | ID: mdl-28877482

ABSTRACT

Research in neuroscience increasingly relies on the mouse, a mammalian species that affords unparalleled genetic tractability and brain atlases. Here, we introduce high-yield methods for probing mouse visual decisions. Mice are head-fixed, facilitating repeatable visual stimulation, eye tracking, and brain access. They turn a steering wheel to make two alternative choices, forced or unforced. Learning is rapid thanks to intuitive coupling of stimuli to wheel position. The mouse decisions deliver high-quality psychometric curves for detection and discrimination and conform to the predictions of a simple probabilistic observer model. The task is readily paired with two-photon imaging of cortical activity. Optogenetic inactivation reveals that the task requires mice to use their visual cortex. Mice are motivated to perform the task by fluid reward or optogenetic stimulation of dopamine neurons. This stimulation elicits a larger number of trials and faster learning. These methods provide a platform to accurately probe mouse vision and its neural basis.


Subject(s)
Choice Behavior/physiology , Dopaminergic Neurons/metabolism , Psychophysics/methods , Visual Cortex/metabolism , Visual Cortex/physiology , Animals , Female , Male , Mice , Photic Stimulation
12.
Curr Opin Neurobiol ; 43: 139-148, 2017 04.
Article in English | MEDLINE | ID: mdl-28390863

ABSTRACT

The phasic dopamine reward prediction error response is a major brain signal underlying learning, approach and decision making. This dopamine response consists of two components that reflect, initially, stimulus detection from physical impact and, subsequently, reward valuation; dopamine activations by punishers reflect physical impact rather than aversiveness. The dopamine reward signal is distinct from earlier reported and recently confirmed phasic changes with behavioural activation. Optogenetic activation of dopamine neurons in monkeys causes value learning and biases economic choices. The dopamine reward signal conforms to formal economic utility and thus constitutes a utility prediction error signal. In these combined ways, the dopamine reward prediction error signal constitutes a potential neuronal substrate for the crucial economic decision variable of utility.


Subject(s)
Behavior/physiology , Dopamine/metabolism , Dopaminergic Neurons/physiology , Animals , Brain/physiology , Decision Making/physiology , Learning/physiology , Reward
13.
Curr Biol ; 27(6): 821-832, 2017 Mar 20.
Article in English | MEDLINE | ID: mdl-28285994

ABSTRACT

Central to the organization of behavior is the ability to predict the values of outcomes to guide choices. The accuracy of such predictions is honed by a teaching signal that indicates how incorrect a prediction was ("reward prediction error," RPE). In several reinforcement learning contexts, such as Pavlovian conditioning and decisions guided by reward history, this RPE signal is provided by midbrain dopamine neurons. In many situations, however, the stimuli predictive of outcomes are perceptually ambiguous. Perceptual uncertainty is known to influence choices, but it has been unclear whether or how dopamine neurons factor it into their teaching signal. To cope with uncertainty, we extended a reinforcement learning model with a belief state about the perceptually ambiguous stimulus; this model generates an estimate of the probability of choice correctness, termed decision confidence. We show that dopamine responses in monkeys performing a perceptually ambiguous decision task comply with the model's predictions. Consequently, dopamine responses did not simply reflect a stimulus' average expected reward value but were predictive of the trial-to-trial fluctuations in perceptual accuracy. These confidence-dependent dopamine responses emerged prior to monkeys' choice initiation, raising the possibility that dopamine impacts impending decisions, in addition to encoding a post-decision teaching signal. Finally, by manipulating reward size, we found that dopamine neurons reflect both the upcoming reward size and the confidence in achieving it. Together, our results show that dopamine responses convey teaching signals that are also appropriate for perceptual decisions.


Subject(s)
Choice Behavior , Decision Making , Dopaminergic Neurons/physiology , Macaca/physiology , Mesencephalon/physiology , Perception , Reinforcement, Psychology , Animals , Dopamine/physiology , Macaca/psychology , Male , Models, Animal , Reward
14.
Elife ; 5, 2016 10 27.
Article in English | MEDLINE | ID: mdl-27787196

ABSTRACT

Economic theories posit reward probability as one of the factors defining reward value. Individuals learn the value of cues that predict probabilistic rewards from experienced reward frequencies. Building on the notion that responses of dopamine neurons increase with reward probability and expected value, we asked how dopamine neurons in monkeys acquire this value signal that may represent an economic decision variable. We found in a Pavlovian learning task that reward probability-dependent value signals arose from experienced reward frequencies. We then assessed neuronal response acquisition during choices among probabilistic rewards. Here, dopamine responses became sensitive to the value of both chosen and unchosen options. Both experiments also showed novelty responses of dopamine neurons that decreased as learning advanced. These results show that dopamine neurons acquire predictive value signals from the frequency of experienced rewards. This flexible and fast signal reflects a specific decision variable and could update neuronal decision mechanisms.


Subject(s)
Choice Behavior , Dopaminergic Neurons/physiology , Learning , Reward , Animals , Cues , Haplorhini
15.
Cell ; 166(6): 1564-1571.e6, 2016 Sep 08.
Article in English | MEDLINE | ID: mdl-27610576

ABSTRACT

Optogenetic studies in mice have revealed new relationships between well-defined neurons and brain functions. However, there are currently no means to achieve the same cell-type specificity in monkeys, which possess an expanded behavioral repertoire and closer anatomical homology to humans. Here, we present a resource for cell-type-specific channelrhodopsin expression in Rhesus monkeys and apply this technique to modulate dopamine activity and monkey choice behavior. These data show that two viral vectors label dopamine neurons with greater than 95% specificity. Infected neurons were activated by light pulses, indicating functional expression. The addition of optical stimulation to reward outcomes promoted the learning of reward-predicting stimuli at the neuronal and behavioral level. Together, these results demonstrate the feasibility of effective and selective stimulation of dopamine neurons in non-human primates and a resource that could be applied to other cell types in the monkey brain.


Subject(s)
Choice Behavior/physiology , Dopaminergic Neurons/metabolism , Optogenetics/methods , Animals , Dependovirus/genetics , Dopamine/metabolism , Gene Expression Regulation , Genetic Vectors/genetics , Macaca mulatta , Promoter Regions, Genetic/genetics , Rhodopsin/genetics
16.
PLoS One ; 11(3): e0151180, 2016.
Article in English | MEDLINE | ID: mdl-26959638

ABSTRACT

A key experimental approach in neuroscience involves measuring neuronal activity in behaving animals with extracellular chronic recordings. Such chronic recordings were initially made with single electrodes and tetrodes, and are now increasingly performed with high-density, high-count silicon probes. A common way to achieve long-term chronic recording is to attach the probes to microdrives that progressively advance them into the brain. Here we report, however, that such microdrives are not strictly necessary. Indeed, we obtained high-quality recordings in both head-fixed and freely moving mice for several months following the implantation of immobile chronic probes. Probes implanted into the primary visual cortex yielded well-isolated single units whose spike waveform and orientation tuning were highly reproducible over time. Although electrode drift was not completely absent, stable waveforms occurred in at least 70% of the neurons tested across consecutive days. Thus, immobile silicon probes represent a straightforward and reliable technique to obtain stable, long-term population recordings in mice, and to follow the activity of populations of well-isolated neurons over multiple days.


Subject(s)
Microelectrodes , Silicon , Action Potentials/physiology , Animals , Behavior, Animal , Brain/physiology , Electrodes, Implanted , Electrophysiology/instrumentation , Female , Male
17.
J Comp Neurol ; 524(8): 1699-711, 2016 Jun 01.
Article in English | MEDLINE | ID: mdl-26272220

ABSTRACT

Rewards are defined by their behavioral functions in learning (positive reinforcement), approach behavior, economic choices, and emotions. Dopamine neurons respond to rewards with two components, similar to higher order sensory and cognitive neurons. The initial, rapid, unselective dopamine detection component reports all salient environmental events irrespective of their reward association. It is highly sensitive to factors related to reward and thus detects a maximal number of potential rewards. It also senses aversive stimuli but reports their physical impact rather than their aversiveness. The second response component processes reward value accurately and starts early enough to prevent confusion with unrewarded stimuli and objects. It codes reward value as a numeric, quantitative utility prediction error, consistent with formal concepts of economic decision theory. Thus, the dopamine reward signal is fast, highly sensitive and appropriate for driving and updating economic decisions.


Subject(s)
Brain/physiology , Dopaminergic Neurons/physiology , Reward , Animals , Choice Behavior/physiology , Dopamine/metabolism , Humans , Learning/physiology
18.
J Neurosci ; 35(7): 3146-54, 2015 Feb 18.
Article in English | MEDLINE | ID: mdl-25698750

ABSTRACT

Economic choices are largely determined by two principal elements, reward value (utility) and probability. Although nonlinear utility functions have been acknowledged for centuries, nonlinear probability weighting (probability distortion) was only recently recognized as a ubiquitous aspect of real-world choice behavior. Even when outcome probabilities are known and acknowledged, human decision makers often overweight low probability outcomes and underweight high probability outcomes. Whereas recent studies measured utility functions and their corresponding neural correlates in monkeys, it is not known whether monkeys distort probability in a manner similar to humans. Therefore, we investigated economic choices in macaque monkeys for evidence of probability distortion. We trained two monkeys to predict reward from probabilistic gambles with constant outcome values (0.5 ml or nothing). The probability of winning was conveyed using explicit visual cues (sector stimuli). Choices between the gambles revealed that the monkeys used the explicit probability information to make meaningful decisions. Using these cues, we measured probability distortion from choices between the gambles and safe rewards. Parametric modeling of the choices revealed classic probability weighting functions with inverted-S shape. Therefore, the animals overweighted low probability rewards and underweighted high probability rewards. Empirical investigation of the behavior verified that the choices were best explained by a combination of nonlinear value and nonlinear probability distortion. Together, these results suggest that probability distortion may reflect evolutionarily preserved neuronal processing.
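The inverted-S probability weighting reported here is conventionally captured by one-parameter forms such as the Tversky-Kahneman function. The sketch below is illustrative of that standard form, not the parametric model fitted in the paper:

```python
def tk_weight(p, gamma=0.61):
    """Tversky-Kahneman probability weighting function,
    w(p) = p^g / (p^g + (1-p)^g)^(1/g).
    For gamma < 1 it has the classic inverted-S shape: low probabilities
    are overweighted and high probabilities underweighted. The default
    gamma = 0.61 is a commonly cited human estimate, used here only
    for illustration."""
    return p**gamma / (p**gamma + (1 - p) ** gamma) ** (1 / gamma)
```

Note that w(0) = 0 and w(1) = 1: distortion affects intermediate probabilities while certain outcomes are weighted veridically.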


Subject(s)
Choice Behavior/physiology , Probability , Reward , Risk-Taking , Animals , Conditioning, Classical , Cues , Games, Experimental , Macaca mulatta , Male
20.
Curr Biol ; 24(21): 2491-500, 2014 Nov 03.
Article in English | MEDLINE | ID: mdl-25283778

ABSTRACT

BACKGROUND: Optimal choices require an accurate neuronal representation of economic value. In economics, utility functions are mathematical representations of subjective value that can be constructed from choices under risk. Utility usually exhibits a nonlinear relationship to physical reward value that corresponds to risk attitudes and reflects the increasing or decreasing marginal utility obtained with each additional unit of reward. Accordingly, neuronal reward responses coding utility should robustly reflect this nonlinearity. RESULTS: In two monkeys, we measured utility as a function of physical reward value from meaningful choices under risk (that adhered to first- and second-order stochastic dominance). The resulting nonlinear utility functions predicted the certainty equivalents for new gambles, indicating that the functions' shapes were meaningful. The monkeys were risk seeking (convex utility function) for low reward and risk avoiding (concave utility function) with higher amounts. Critically, the dopamine prediction error responses at the time of reward itself reflected the nonlinear utility functions measured at the time of choices. In particular, the reward response magnitude depended on the first derivative of the utility function and thus reflected the marginal utility. Furthermore, dopamine responses recorded outside of the task reflected the marginal utility of unpredicted reward. Accordingly, these responses were sufficient to train reinforcement learning models to predict the behaviorally defined expected utility of gambles. CONCLUSIONS: These data suggest a neuronal manifestation of marginal utility in dopamine neurons and indicate a common neuronal basis for fundamental explanatory constructs in animal learning theory (prediction error) and economic decision theory (marginal utility).
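The reported dependence of dopamine reward responses on the first derivative of the utility function can be illustrated numerically. The concave utility below is a generic stand-in with diminishing marginal utility, not the function measured from the monkeys' choices:

```python
import math

def marginal_utility(u, x, h=1e-6):
    """Numerical first derivative u'(x) of a utility function u at reward
    size x (central difference). The study relates dopamine response
    magnitude to this marginal utility. Illustrative sketch; u is a
    generic placeholder, not the fitted utility function."""
    return (u(x + h) - u(x - h)) / (2 * h)

u_concave = math.sqrt   # concave: each extra unit of reward adds less utility
```

Under a concave utility, an unpredicted reward added on top of a large baseline yields a smaller marginal utility, hence a smaller predicted dopamine response, than the same increment from a small baseline.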


Subject(s)
Choice Behavior , Dopamine/physiology , Macaca mulatta/psychology , Reward , Risk-Taking , Animals , Dopamine/metabolism , Male , Probability , Stochastic Processes