Results 1 - 20 of 129
1.
PLoS Comput Biol ; 20(5): e1012119, 2024 May.
Article in English | MEDLINE | ID: mdl-38748770

ABSTRACT

Computational cognitive models have been used extensively to formalize cognitive processes. Model parameters offer a simple way to quantify individual differences in how humans process information. Similarly, model comparison allows researchers to identify which theories, embedded in different models, provide the best accounts of the data. Cognitive modeling uses statistical tools to quantitatively relate models to data; these tools often rely on computing or estimating the likelihood of the data under the model. However, this likelihood is computationally intractable for a substantial number of models. Such models may embody reasonable theories of cognition, but they are often under-explored because of the limited range of tools available to relate them to data. We contribute to filling this gap in a simple way, using artificial neural networks (ANNs) to map data directly onto model identity and parameters, bypassing likelihood estimation. We test our instantiation of an ANN as a cognitive model fitting tool on classes of cognitive models with strong inter-trial dependencies (such as reinforcement learning models), which pose unique challenges to most methods. We show that we can adequately perform both parameter estimation and model identification using our ANN approach, including for models that cannot be fit using traditional likelihood-based methods. We further discuss our work in the context of ongoing research leveraging simulation-based approaches to parameter estimation and model identification, and how these approaches broaden the class of cognitive models researchers can quantitatively investigate.
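To make the approach concrete, below is a minimal sketch of the idea, not the paper's architecture or task: a network is trained on simulated (parameters, behavior) pairs from a two-armed-bandit Q-learner and then reads parameters off new data, with no likelihood ever computed. The simulator, priors, and the use of a feedforward network on flattened trial histories are all illustrative assumptions (a recurrent network over trial sequences is an equally natural choice).

```python
# Illustrative sketch, not the paper's architecture: train a network on
# simulated (parameters, behavior) pairs, then read parameters off new data.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

def simulate_q_learner(alpha, beta, n_trials=100):
    """Two-armed bandit Q-learner; returns the flattened (choice, reward) history."""
    q = np.zeros(2)
    p_reward = np.array([0.8, 0.2])          # assumed task contingencies
    history = np.zeros((n_trials, 2))
    for t in range(n_trials):
        p1 = 1.0 / (1.0 + np.exp(-beta * (q[1] - q[0])))  # softmax over 2 actions
        c = int(rng.random() < p1)
        r = float(rng.random() < p_reward[c])
        q[c] += alpha * (r - q[c])           # delta-rule update
        history[t] = (c, r)
    return history.ravel()

# Training set: parameters drawn from assumed priors, paired with simulated data.
n_sims = 5000
alphas = rng.uniform(0.05, 0.95, n_sims)
betas = rng.uniform(1.0, 10.0, n_sims)
X = np.stack([simulate_q_learner(a, b) for a, b in zip(alphas, betas)])
y = np.column_stack([alphas, betas])
net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500).fit(X, y)

# Parameter recovery on held-out synthetic subjects with known parameters.
test = np.stack([simulate_q_learner(0.3, 4.0) for _ in range(20)])
print(net.predict(test).mean(axis=0))        # should land near (0.3, 4.0)
```

The same setup extends to model identification by training a classifier on data simulated from competing models.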


Subject(s)
Cognition , Computational Biology , Computer Simulation , Neural Networks, Computer , Humans , Cognition/physiology , Computational Biology/methods , Likelihood Functions , Algorithms , Models, Neurological
2.
bioRxiv ; 2024 Jan 26.
Article in English | MEDLINE | ID: mdl-38328176

ABSTRACT

Computational cognitive modeling is an important tool for understanding the processes supporting human and animal decision-making. Choice data in decision-making tasks are inherently noisy, and separating noise from signal can improve the quality of computational modeling. Common approaches to model decision noise often assume constant levels of noise or exploration throughout learning (e.g., the ϵ-softmax policy). However, this assumption is not guaranteed to hold - for example, a subject might disengage and lapse into an inattentive phase for a series of trials in the middle of otherwise low-noise performance. Here, we introduce a new, computationally inexpensive method to dynamically infer the levels of noise in choice behavior, under a model assumption that agents can transition between two discrete latent states (e.g., fully engaged and random). Using simulations, we show that modeling noise levels dynamically instead of statically can substantially improve model fit and parameter estimation, especially in the presence of long periods of noisy behavior, such as prolonged attentional lapses. We further demonstrate the empirical benefits of dynamic noise estimation at the individual and group levels by validating it on four published datasets featuring diverse populations, tasks, and models. Based on the theoretical and empirical evaluation of the method reported in the current work, we expect that dynamic noise estimation will improve modeling in many decision-making paradigms over the static noise estimation method currently used in the modeling literature, while keeping additional model complexity and assumptions minimal.
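A minimal sketch of the two-state idea follows, with names, priors, and transition probabilities as illustrative assumptions rather than the paper's exact specification: a hidden Markov model forward pass that tracks, trial by trial, the posterior probability that the agent is in a random lapse state rather than an engaged state.

```python
# Minimal sketch: HMM forward pass over two latent states (engaged, random).
import numpy as np

def lapse_posterior(p_choice_engaged, n_options=2, stay=0.95):
    """p_choice_engaged[t]: probability of the observed choice at trial t under
    the engaged policy (e.g., softmax over learned values). Returns the
    per-trial posterior probability of the random lapse state."""
    T = np.array([[stay, 1 - stay],    # engaged -> (engaged, random)
                  [1 - stay, stay]])   # random  -> (engaged, random)
    belief = np.array([0.5, 0.5])      # flat prior over the two states
    out = []
    for p_eng in p_choice_engaged:
        belief = T.T @ belief                     # latent state may transition
        lik = np.array([p_eng, 1.0 / n_options])  # choice likelihood per state
        belief = belief * lik
        belief /= belief.sum()                    # Bayes rule, normalized
        out.append(belief[1])
    return np.array(out)

# An engaged block (choices well predicted), a lapse in which choices are
# poorly predicted by the engaged policy, then re-engagement.
p = np.concatenate([np.full(20, 0.9), np.full(10, 0.1), np.full(20, 0.9)])
print(lapse_posterior(p).round(2))
```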

3.
bioRxiv ; 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-37767088

ABSTRACT

Computational cognitive models have been used extensively to formalize cognitive processes. Model parameters offer a simple way to quantify individual differences in how humans process information. Similarly, model comparison allows researchers to identify which theories, embedded in different models, provide the best accounts of the data. Cognitive modeling uses statistical tools to quantitatively relate models to data; these tools often rely on computing or estimating the likelihood of the data under the model. However, this likelihood is computationally intractable for a substantial number of models. Such models may embody reasonable theories of cognition, but they are often under-explored because of the limited range of tools available to relate them to data. We contribute to filling this gap in a simple way, using artificial neural networks (ANNs) to map data directly onto model identity and parameters, bypassing likelihood estimation. We test our instantiation of an ANN as a cognitive model fitting tool on classes of cognitive models with strong inter-trial dependencies (such as reinforcement learning models), which pose unique challenges to most methods. We show that we can adequately perform both parameter estimation and model identification using our ANN approach, including for models that cannot be fit using traditional likelihood-based methods. We further discuss our work in the context of ongoing research leveraging simulation-based approaches to parameter estimation and model identification, and how these approaches broaden the class of cognitive models researchers can quantitatively investigate.

4.
bioRxiv ; 2023 Nov 13.
Article in English | MEDLINE | ID: mdl-38014354

ABSTRACT

Dopamine release in the nucleus accumbens has been hypothesized to signal reward prediction error, the difference between observed and predicted reward, suggesting a biological implementation for reinforcement learning. Rigorous tests of this hypothesis require assumptions about how the brain maps sensory signals to reward predictions, yet this mapping is still poorly understood. In particular, the mapping is non-trivial when sensory signals provide ambiguous information about the hidden state of the environment. Previous work using classical conditioning tasks has suggested that reward predictions are generated conditional on probabilistic beliefs about the hidden state, such that dopamine implicitly reflects these beliefs. Here we test this hypothesis in the context of an instrumental task (a two-armed bandit), where the hidden state switches repeatedly. We measured choice behavior and recorded dLight signals reflecting dopamine release in the nucleus accumbens core. Model comparison based on the behavioral data favored models that used Bayesian updating of probabilistic beliefs. These same models also quantitatively matched the dopamine measurements better than non-Bayesian alternatives. We conclude that probabilistic belief computation plays a fundamental role in instrumental performance and associated mesolimbic dopamine signaling.
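As a rough illustration of the belief computation at stake, not the paper's fitted model: a Bayesian update of the probability that a given arm is currently the good one in a switching two-armed bandit, with the hazard rate and reward probabilities chosen arbitrarily.

```python
# Rough illustration: Bayesian belief update over which arm is currently good.
import numpy as np

def belief_update(b_arm1, choice, reward, p_hi=0.8, p_lo=0.2, hazard=0.05):
    """b_arm1: current P(arm 1 is the good arm). Returns the updated belief
    after observing (choice, reward), allowing for a hidden-state switch."""
    # Reward probability of the chosen arm under each hidden state
    p_r = np.array([p_hi if choice == 0 else p_lo,   # state: arm 0 is good
                    p_hi if choice == 1 else p_lo])  # state: arm 1 is good
    lik = p_r if reward else 1.0 - p_r
    post = lik * np.array([1.0 - b_arm1, b_arm1])
    post /= post.sum()
    # The hidden state may switch before the next trial (hazard rate)
    return post[1] * (1.0 - hazard) + post[0] * hazard

# Belief-weighted reward prediction for choosing arm 1; the gap between this
# prediction and the observed reward is the belief-dependent prediction error
# that dopamine is hypothesized to track.
b = belief_update(0.5, choice=1, reward=1)
print(b, b * 0.8 + (1 - b) * 0.2)
```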

5.
Trends Cogn Sci ; 27(12): 1150-1164, 2023 12.
Article in English | MEDLINE | ID: mdl-37696690

ABSTRACT

Goals play a central role in human cognition. However, computational theories of learning and decision-making often take goals as given. Here, we review key empirical findings showing that goals shape the representations of inputs, responses, and outcomes, such that setting a goal crucially influences the central aspects of any learning process: states, actions, and rewards. We thus argue that studying goal selection is essential to advance our understanding of learning. By following existing literature in framing goal selection within a hierarchy of decision-making problems, we synthesize important findings on the principles underlying goal value attribution and exploration strategies. Ultimately, we propose that a goal-centric perspective will help develop more complete accounts of learning in both biological and artificial agents.


Subject(s)
Goals , Reinforcement, Psychology , Humans , Decision Making/physiology , Motivation , Learning
6.
Cogn Affect Behav Neurosci ; 23(5): 1346-1364, 2023 10.
Article in English | MEDLINE | ID: mdl-37656373

ABSTRACT

How does the similarity between stimuli affect our ability to learn appropriate response associations for them? In typical laboratory experiments, learning is investigated under somewhat ideal circumstances, where stimuli are easily discriminable. This is not representative of most real-life learning, where overlapping "stimuli" can result in different "rewards" and may be learned simultaneously (e.g., you may learn over repeated interactions that a specific dog is friendly, but that a very similar-looking one isn't). With two experiments, we test how humans learn in three stimulus conditions: one "best case" condition in which stimuli have idealized and highly discriminable visual and semantic representations, and two in which stimuli have overlapping representations, making them less discriminable. We find that, unsurprisingly, decreasing stimulus discriminability decreases performance. We develop computational models to test different hypotheses about how reinforcement learning (RL) and working memory (WM) processes are affected by the different stimulus conditions. Our results replicate earlier studies demonstrating the importance of both processes in capturing behavior. However, our results extend previous studies by demonstrating that RL, and not WM, is affected by stimulus distinctness: people learn more slowly and show greater across-stimulus value confusion at decision time when stimuli are more similar to each other. These results illustrate strong effects of stimulus type on learning and demonstrate the importance of considering the parallel contributions of different cognitive processes when studying behavior.
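One way to picture the across-stimulus value confusion, offered as a sketch rather than the paper's model: a Q-learner whose updates generalize across stimuli in proportion to an assumed similarity kernel.

```python
# Sketch: Q-learning with similarity-weighted credit spreading across stimuli.
import numpy as np

def rl_with_confusion(trials, similarity, n_actions, alpha=0.3):
    """trials: iterable of (stimulus, action, reward). similarity[s, s2] in
    [0, 1] with similarity[s, s] == 1 (an assumed kernel). Each outcome
    updates the chosen action's value for ALL stimuli, weighted by their
    similarity to the presented stimulus, producing value confusion."""
    n_stim = similarity.shape[0]
    Q = np.full((n_stim, n_actions), 1.0 / n_actions)  # uninformative init
    for s, a, r in trials:
        Q[:, a] += alpha * similarity[s] * (r - Q[s, a])
    return Q

# Two similar stimuli (0, 1) and one distinct stimulus (2):
sim = np.array([[1.0, 0.6, 0.1],
                [0.6, 1.0, 0.1],
                [0.1, 0.1, 1.0]])
Q = rl_with_confusion([(0, 2, 1.0), (1, 0, 1.0)], sim, n_actions=3)
print(Q.round(2))  # stimulus 1 has inherited spurious value for action 2
```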


Subject(s)
Learning , Reinforcement, Psychology , Humans , Animals , Dogs , Learning/physiology , Reward , Memory
7.
PLoS Biol ; 21(7): e3002201, 2023 07.
Article in English | MEDLINE | ID: mdl-37459394

ABSTRACT

When observing the outcome of a choice, people are sensitive to the choice's context, such that the experienced value of an option depends on the alternatives: getting $1 when the possibilities were 0 or 1 feels much better than when the possibilities were 1 or 10. Context-sensitive valuation has been documented within reinforcement learning (RL) tasks, in which values are learned from experience through trial and error. Range adaptation, wherein options are rescaled according to the range of values yielded by available options, has been proposed to account for this phenomenon. However, we propose that other mechanisms, reflecting a different theoretical viewpoint, may also explain this phenomenon. Specifically, we theorize that internally defined goals play a crucial role in shaping the subjective value attributed to any given option. Motivated by this theory, we develop a new "intrinsically enhanced" RL model, which combines extrinsically provided rewards with internally generated signals of goal achievement as a teaching signal. Across 7 different studies (including previously published data sets as well as a novel, preregistered experiment with replication and control studies), we show that the intrinsically enhanced model can explain context-sensitive valuation as well as, or better than, range adaptation. Our findings indicate a more prominent role of intrinsic, goal-dependent rewards than previously recognized within formal models of human RL. By integrating internally generated signals of reward, standard RL theories should better account for human behavior, including context-sensitive valuation and beyond.
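The core of the intrinsically enhanced idea can be sketched in a few lines, with the mixing weight and the binary goal signal as assumptions for illustration: the teaching signal adds an internally generated goal-achievement bonus to the extrinsic reward.

```python
# Sketch of the teaching signal: extrinsic reward plus an internally
# generated goal-achievement bonus (weight and goal test are assumptions).
def intrinsic_update(q, reward, goal_achieved, alpha=0.3, w_intrinsic=0.5):
    """One value update where the effective outcome combines the extrinsic
    reward with a binary internal signal of goal achievement."""
    r_total = reward + w_intrinsic * float(goal_achieved)
    return q + alpha * (r_total - q)

# Getting $1 achieves the goal when the alternatives were {0, 1}, but not
# when they were {1, 10}, so the same extrinsic outcome teaches differently.
print(intrinsic_update(0.5, reward=1.0, goal_achieved=True))
print(intrinsic_update(0.5, reward=1.0, goal_achieved=False))
```

On this view, the $1 example above follows from goal achievement rather than from rescaling outcomes to the range of available options.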


Subject(s)
Reinforcement, Psychology , Reward , Humans , Learning , Motivation
8.
Elife ; 12, 2023 04 18.
Article in English | MEDLINE | ID: mdl-37070807

ABSTRACT

The ability to use past experience to effectively guide decision-making declines in older adulthood. Such declines have been theorized to emerge from either impairments of striatal reinforcement learning systems (RL) or impairments of recurrent networks in prefrontal and parietal cortex that support working memory (WM). Distinguishing between these hypotheses has been challenging because either RL or WM could be used to facilitate successful decision-making in typical laboratory tasks. Here we investigated the neurocomputational correlates of age-related decision-making deficits using an RL-WM task to disentangle these mechanisms, a computational model to quantify them, and magnetic resonance spectroscopy to link them to their molecular bases. Our results reveal that task performance is worse in older age, in a manner best explained by working memory deficits, as might be expected if cortical recurrent networks were unable to sustain persistent activity across multiple trials. Consistent with this, we show that older adults had lower levels of prefrontal glutamate, the excitatory neurotransmitter thought to support persistent activity, compared to younger adults. Individuals with the lowest prefrontal glutamate levels displayed the greatest impairments in working memory after controlling for other anatomical and metabolic factors. Together, our results suggest that lower levels of prefrontal glutamate may contribute to failures of working memory systems and impaired decision-making in older adulthood.


Subject(s)
Glutamic Acid , Memory, Short-Term , Humans , Aged , Learning , Reinforcement, Psychology , Task Performance and Analysis , Prefrontal Cortex/diagnostic imaging
9.
Psychol Methods ; 2023 Mar 27.
Article in English | MEDLINE | ID: mdl-36972080

ABSTRACT

Using Bayesian methods to apply computational models of cognitive processes, or Bayesian cognitive modeling, is an important new trend in psychological research. The rise of Bayesian cognitive modeling has been accelerated by the introduction of software that efficiently automates the Markov chain Monte Carlo sampling used for Bayesian model fitting, including the popular Stan and PyMC packages, which automate the dynamic Hamiltonian Monte Carlo and No-U-Turn Sampler (HMC/NUTS) algorithms that we spotlight here. Unfortunately, Bayesian cognitive models can struggle to pass the growing number of diagnostic checks required of Bayesian models. If any failures are left undetected, inferences about cognition based on the model's output may be biased or incorrect. As such, Bayesian cognitive models almost always require troubleshooting before being used for inference. Here, we present a deep treatment of the diagnostic checks and procedures that are critical for effective troubleshooting, but are often left underspecified by tutorial papers. After a conceptual introduction to Bayesian cognitive modeling and HMC/NUTS sampling, we outline the diagnostic metrics, procedures, and plots necessary to detect problems in model output, with an emphasis on how these requirements have recently been changed and extended. Throughout, we explain how uncovering the exact nature of the problem is often the key to identifying solutions. We also demonstrate the troubleshooting process for an example hierarchical Bayesian model of reinforcement learning, including supplementary code. With this comprehensive guide to techniques for detecting, identifying, and overcoming problems in fitting Bayesian cognitive models, psychologists across subfields can more confidently build and use Bayesian cognitive models in their research. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
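For the flavor of the checks in code, here is a hedged sketch using the PyMC and ArviZ stack mentioned above; the model is a toy stand-in, not the paper's hierarchical RL example, and the thresholds in the comments are common rules of thumb rather than hard rules.

```python
# Toy model plus the core HMC/NUTS diagnostic checks (PyMC + ArviZ).
import pymc as pm
import arviz as az

with pm.Model():
    mu = pm.Normal("mu", 0.0, 1.0)
    sigma = pm.HalfNormal("sigma", 1.0)
    pm.Normal("y", mu, sigma, observed=[0.1, -0.3, 0.7, 0.2])
    idata = pm.sample(1000, tune=1000, chains=4, target_accept=0.9)

summary = az.summary(idata)                          # r_hat, ESS per parameter
n_divergent = int(idata.sample_stats["diverging"].sum())
print(summary[["r_hat", "ess_bulk", "ess_tail"]])
print("divergent transitions:", n_divergent)
# Rough targets: r_hat at or below ~1.01, bulk/tail ESS in the hundreds, and
# zero divergences; failures call for reparameterization, better priors, or a
# higher target_accept, depending on the exact nature of the problem.
```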

10.
J Neurosci ; 43(17): 3131-3143, 2023 04 26.
Article in English | MEDLINE | ID: mdl-36931706

ABSTRACT

Human learning and decision-making are supported by multiple systems operating in parallel. Recent studies isolating the contributions of reinforcement learning (RL) and working memory (WM) have revealed a trade-off between the two. An interactive WM/RL computational model predicts that although high WM load slows behavioral acquisition, it also induces larger prediction errors in the RL system that enhance robustness and retention of learned behaviors. Here, we tested this account by parametrically manipulating WM load during RL in conjunction with EEG in both male and female participants and administered two surprise memory tests. We further leveraged single-trial decoding of EEG signatures of RL and WM to determine whether their interaction predicted robust retention. Consistent with the model, behavioral learning was slower for associations acquired under higher load but showed parametrically improved future retention. This paradoxical result was mirrored by EEG indices of RL, which were strengthened under higher WM loads and predictive of more robust future behavioral retention of learned stimulus-response contingencies. We further tested whether stress alters the ability to shift between the two systems strategically to maximize immediate learning versus retention of information and found that induced stress had only a limited effect on this trade-off. The present results offer a deeper understanding of the cooperative interaction between WM and RL and show that relying on WM can benefit the rapid acquisition of choice behavior during learning but impairs retention.

SIGNIFICANCE STATEMENT: Successful learning is achieved by the joint contribution of the dopaminergic RL system and WM. The cooperative WM/RL model was productive in improving our understanding of the interplay between the two systems during learning, demonstrating that reliance on RL computations is modulated by WM load. However, the role of WM/RL systems in the retention of learned stimulus-response associations remained unestablished. Our results show that increased neural signatures of learning, indicative of greater RL computation, under high WM load also predicted better stimulus-response retention. This result supports a trade-off between the two systems, where degraded WM increases RL processing, which improves retention. Notably, we show that this cooperative interplay remains largely unaffected by acute stress.
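A toy sketch of the cooperative WM/RL prediction, with the mixing rule and parameter names as assumptions in the spirit of the RLWM model family: WM contributes to the reward expectation, so when the set size exceeds WM capacity the expectation drops and RL prediction errors grow.

```python
# Toy sketch; mixing rule and names are assumptions in the RLWM spirit.
def rl_wm_prediction_error(q_rl, wm_expectation, set_size, capacity=3.0, reward=1.0):
    """The reward expectation is a capacity-limited mix of WM and RL; the
    returned prediction error is what would update the RL system's Q-values."""
    w_wm = min(1.0, capacity / set_size)    # WM weight shrinks with load
    expectation = w_wm * wm_expectation + (1.0 - w_wm) * q_rl
    return reward - expectation

# Under low load, WM dominates the expectation and the prediction error is
# small; under high load, the expectation drops and the error grows. That
# larger teaching signal is what predicts better retention.
print(rl_wm_prediction_error(q_rl=0.2, wm_expectation=1.0, set_size=2))  # 0.0
print(rl_wm_prediction_error(q_rl=0.2, wm_expectation=1.0, set_size=6))  # 0.4
```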


Subject(s)
Learning , Memory, Short-Term , Male , Humans , Female , Memory, Short-Term/physiology , Learning/physiology , Reinforcement, Psychology , Choice Behavior , Cognition
11.
J Cogn Neurosci ; : 1-17, 2022 Nov 28.
Article in English | MEDLINE | ID: mdl-36473098

ABSTRACT

In reinforcement learning (RL) experiments, participants learn to make rewarding choices in response to different stimuli; RL models use outcomes to estimate stimulus-response values that change incrementally. RL models consider any response type indiscriminately, ranging from concretely defined motor choices (pressing a key with the index finger) to more general choices that can be executed in a number of ways (selecting dinner at the restaurant). However, does the learning process vary as a function of the choice type? In Experiment 1, we show that it does: participants were slower and less accurate in learning correct choices of a general format compared with learning more concrete motor actions. Using computational modeling, we show that two mechanisms contribute to this. First, there was evidence of irrelevant credit assignment: the values of motor actions interfered with the values of other choice dimensions, resulting in more incorrect choices when the correct response was not defined by a single motor action. Second, information integration for relevant general choices was slower. In Experiment 2, we replicated and further extended the findings from Experiment 1 by showing that slowed learning was attributable to weaker working memory use, rather than slowed RL. In both experiments, we ruled out the explanation that the difference in performance between the two condition types was driven by difficulty or different levels of complexity. We conclude that defining a more abstract choice space used by multiple learning systems for credit assignment recruits executive resources, limiting how much such processes then contribute to fast learning.
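Sketching the irrelevant-credit-assignment mechanism as an illustration, not the fitted model: outcomes are credited to both the abstract choice and the motor action that executed it, and both value tables feed the next decision, so motor values interfere whenever the correct response is not a single motor action.

```python
# Illustration of irrelevant credit assignment across two choice dimensions.
import numpy as np

def update_both(q_abstract, q_motor, choice, motor_action, reward, alpha=0.3):
    """Credit the outcome to the abstract choice AND the executing motor action."""
    q_abstract[choice] += alpha * (reward - q_abstract[choice])
    q_motor[motor_action] += alpha * (reward - q_motor[motor_action])

def decision_values(q_abstract, q_motor, motor_map, w_motor=0.3):
    """motor_map[i]: the motor action that would execute abstract choice i.
    Motor values leak into the decision, creating interference whenever the
    correct response is not defined by a single motor action."""
    return (1.0 - w_motor) * q_abstract + w_motor * q_motor[motor_map]

q_a, q_m = np.zeros(3), np.zeros(3)
update_both(q_a, q_m, choice=0, motor_action=1, reward=1.0)
# On a later trial the mapping has changed: motor action 1 now executes
# abstract choice 2, which inherits spurious value from the earlier outcome.
print(decision_values(q_a, q_m, motor_map=np.array([0, 2, 1])))
```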

12.
Cogsci ; 44: 948-954, 2022 Jul.
Article in English | MEDLINE | ID: mdl-36534042

ABSTRACT

Humans have the exceptional ability to efficiently structure past knowledge during learning to enable fast generalization. Xia and Collins (2021) evaluated this ability in a hierarchically structured, sequential decision-making task, where participants could build "options" (strategy "chunks") at multiple levels of temporal and state abstraction. A quantitative model, the Option Model, captured the transfer effects observed in human participants, suggesting that humans create and compose hierarchical options and use them to explore novel contexts. However, it is not well understood how learning in a new context is attributed to new and old options (i.e., the credit assignment problem). In a new context with new contingencies, where participants can recompose some aspects of previously learned options, do they reliably create new options or overwrite existing ones? Does the credit assignment depend on how similar the new option is to an old one? In our experiment, two groups of participants (n=124 and n=104) learned hierarchically structured options, experienced different amounts of negative transfer in a new option context, and were subsequently tested on the previously learned options. Behavioral analysis showed that old options were successfully reused without interference, and new options were appropriately created and credited. This credit assignment did not depend on how similar the new option was to the old option, showing great flexibility and precision in human hierarchical learning. These behavioral results were captured by the Option Model, providing further evidence for option learning and transfer in humans.

13.
Elife ; 11, 2022 Nov 04.
Article in English | MEDLINE | ID: mdl-36331872

ABSTRACT

Reinforcement Learning (RL) models have revolutionized the cognitive and brain sciences, promising to explain behavior from simple conditioning to complex problem solving, to shed light on developmental and individual differences, and to anchor cognitive processes in specific brain mechanisms. However, the RL literature increasingly reveals contradictory results, which might cast doubt on these claims. We hypothesized that many contradictions arise from two commonly held assumptions about computational model parameters that are actually often invalid: that parameters generalize between contexts (e.g., tasks, models) and that they capture interpretable (i.e., unique, distinctive) neurocognitive processes. To test this, we asked 291 participants aged 8-30 years to complete three learning tasks in one experimental session, and fitted RL models to each. We found that some parameters (exploration / decision noise) showed significant generalization: they followed similar developmental trajectories, and were reciprocally predictive between tasks. Still, generalization was significantly below the methodological ceiling. Furthermore, other parameters (learning rates, forgetting) did not show evidence of generalization, and sometimes even showed opposite developmental trajectories. Interpretability was low for all parameters. We conclude that the systematic study of context factors (e.g., reward stochasticity, task volatility) will be necessary to enhance the generalizability and interpretability of computational cognitive models.
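The generalization-versus-ceiling comparison can be made concrete with the classical Spearman attenuation formula; the sketch below assumes per-task parameter reliabilities (e.g., from split-half fits), which is one reasonable way to estimate the methodological ceiling.

```python
# Sketch: observed cross-task correlation vs. a reliability-based ceiling
# (classical Spearman attenuation; reliabilities are assumed inputs).
import numpy as np

def generalization_vs_ceiling(param_task1, param_task2, rel1, rel2):
    """Correlate one fitted parameter across two tasks and compare it with
    the maximum correlation the parameters' reliabilities allow."""
    r_observed = np.corrcoef(param_task1, param_task2)[0, 1]
    ceiling = np.sqrt(rel1 * rel2)
    return r_observed, ceiling
```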


Subject(s)
Learning , Reinforcement, Psychology , Humans , Reward , Generalization, Psychological , Computer Simulation
14.
Cell Rep ; 40(4): 111129, 2022 07 26.
Article in English | MEDLINE | ID: mdl-35905722

ABSTRACT

The dorsomedial striatum (DMS) plays a key role in action selection, but less is known about how direct and indirect pathway spiny projection neurons (dSPNs and iSPNs, respectively) contribute to choice rejection in freely moving animals. Here, we use pathway-specific chemogenetic manipulation during a serial choice foraging task to test the role of dSPNs and iSPNs in learned choice rejection. We find that chemogenetic activation, but not inhibition, of iSPNs disrupts rejection of nonrewarded choices, contrary to predictions of a simple "select/suppress" heuristic. Our findings suggest that iSPNs' role in stopping and freezing does not extend in a simple fashion to choice rejection in an ethological, freely moving context. These data may provide insights critical for the successful design of interventions for addiction or other conditions in which it is desirable to strengthen choice rejection.


Subject(s)
Corpus Striatum , Neurons , Animals , Corpus Striatum/metabolism , Learning , Neostriatum , Neurites , Neurons/metabolism
15.
Dev Cogn Neurosci ; 55: 101106, 2022 06.
Article in English | MEDLINE | ID: mdl-35537273

ABSTRACT

During adolescence, youth venture out, explore the wider world, and are challenged to learn how to navigate novel and uncertain environments. We investigated how performance changes across adolescent development in a stochastic, volatile reversal-learning task that uniquely taxes the balance of persistence and flexibility. In a sample of 291 participants aged 8-30, we found that in the mid-teen years, adolescents outperformed both younger and older participants. We developed two independent cognitive models, based on reinforcement learning (RL) and Bayesian inference (BI). The RL parameter for learning from negative outcomes and the BI parameters specifying participants' mental models were closest to optimal in mid-teen adolescents, suggesting that these computations play a central role in adolescent cognitive processing. By contrast, persistence and noise parameters improved monotonically with age. We distilled the insights of RL and BI using principal component analysis and found that three shared components interacted to form the adolescent performance peak: adult-like behavioral quality, child-like time scales, and developmentally unique processing of positive feedback. This research highlights adolescence as a neurodevelopmental window that can create performance advantages in volatile and uncertain environments. It also shows how detailed insights can be gleaned by using cognitive models in new ways.
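The shared-component step can be sketched as follows; the parameter matrix here is a random placeholder, and the column layout and component count are assumptions (in the study, columns would be the fitted RL and BI parameters per participant).

```python
# Sketch: PCA over fitted model parameters to extract shared components.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

params = np.random.default_rng(1).normal(size=(291, 6))  # placeholder matrix
z = StandardScaler().fit_transform(params)   # z-score each parameter
pca = PCA(n_components=3).fit(z)
print(pca.explained_variance_ratio_)         # variance per shared component
scores = pca.transform(z)                    # per-participant component scores
```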


Subject(s)
Attention , Reinforcement, Psychology , Adolescent , Adolescent Development , Adult , Bayes Theorem , Humans , Reversal Learning
16.
Front Psychiatry ; 13: 800290, 2022.
Article in English | MEDLINE | ID: mdl-35360119

ABSTRACT

Impulsivity is defined as a trait-like tendency to engage in rash actions that are poorly thought out or expressed in an untimely manner. Previous research has found that impulsivity relates to deficits in decision making, in particular when decisions require executive control or hinge on reward outcomes. Reinforcement learning (RL) relies on the ability to integrate reward or punishment outcomes to make good decisions, and has recently been shown to often recruit executive function; as such, it is unsurprising that impulsivity has been studied in the context of RL. However, how impulsivity relates to the mechanisms of RL remains unclear. We aimed to investigate the relationship between impulsivity and learning in a reward-driven learning task with probabilistic feedback and reversal known to recruit executive function. Based on prior literature in clinical populations, we predicted that higher impulsivity would be associated with poorer performance on the task, driven by more frequent switching following unrewarded outcomes. Our results did not support this prediction, but more advanced, trial-history-dependent analyses revealed specific effects of impulsivity on switching behavior following consecutive unrewarded trials. Computational modeling captured group-level behavior, but not the impulsivity effects. Our results support previous findings highlighting the importance of sensitivity to negative outcomes in understanding how impulsivity relates to learning, but indicate that this may stem from more complex strategies than usually considered in computational models of learning. This should be an important target for future research.
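The trial-history analysis can be sketched as follows, with variable names as assumptions: estimate the probability of switching as a function of the number of consecutive unrewarded trials preceding the choice.

```python
# Sketch of the trial-history analysis: P(switch) by preceding loss streak.
import numpy as np

def p_switch_by_loss_streak(choices, rewards, max_streak=3):
    """choices, rewards: 1-D arrays over trials. Returns P(switch) conditioned
    on the number of consecutive unrewarded trials preceding the choice."""
    choices, rewards = np.asarray(choices), np.asarray(rewards)
    streak, tallies = 0, {k: [] for k in range(1, max_streak + 1)}
    for t in range(1, len(choices)):
        streak = streak + 1 if rewards[t - 1] == 0 else 0
        if 1 <= streak <= max_streak:
            tallies[streak].append(choices[t] != choices[t - 1])
    return {k: float(np.mean(v)) if v else float("nan") for k, v in tallies.items()}
```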

17.
iScience ; 25(3): 103902, 2022 Mar 18.
Article in English | MEDLINE | ID: mdl-35252809

ABSTRACT

We encounter the world as a continuous flow and effortlessly segment sequences of events into episodes. This process of event segmentation engages working memory (WM) for tracking the flow of events and impacts subsequent memory accuracy. WM is limited both in how much information it can hold (i.e., WM capacity) and in how long that information is retained (i.e., forgetting rate). In this study, across multiple tasks, we estimated participants' WM capacity and forgetting rate in a dynamic context and evaluated their relationship to event segmentation. A U-shaped relationship across tasks shows that individuals who segmented the movie more finely or more coarsely than average have a faster WM forgetting rate. A separate task assessing long-term memory retrieval revealed that the coarse-segmenters have better recognition of the temporal order of events compared to the fine-segmenters. These findings show that event segmentation employs dissociable memory strategies and correlates with how long information is retained in WM.

18.
Neuropsychopharmacology ; 47(1): 104-118, 2022 01.
Article in English | MEDLINE | ID: mdl-34453117

ABSTRACT

An organism's survival depends on its ability to learn about its environment and to make adaptive decisions in the service of achieving the best possible outcomes in that environment. To study the neural circuits that support these functions, researchers have increasingly relied on models that formalize the computations required to carry them out. Here, we review the recent history of computational modeling of learning and decision-making, and how these models have been used to advance understanding of prefrontal cortex function. We discuss how such models have advanced from their origins in basic algorithms of updating and action selection to increasingly account for complexities in the cognitive processes required for learning and decision-making, and the representations over which they operate. We further discuss how a deeper understanding of the real-world complexities in these computations has shed light on the fundamental constraints on optimal behavior, and on the complex interactions between corticostriatal pathways to determine such behavior. The continuing and rapid development of these models holds great promise for understanding the mechanisms by which animals adapt to their environments, and what leads to maladaptive forms of learning and decision-making within clinical populations.


Subject(s)
Decision Making , Neurosciences , Animals , Computer Simulation , Learning , Prefrontal Cortex
19.
J Cogn Neurosci ; 34(4): 551-568, 2022 03 05.
Article in English | MEDLINE | ID: mdl-34942642

ABSTRACT

Reinforcement learning and working memory are two core processes of human cognition and are often considered cognitively, neuroscientifically, and algorithmically distinct. Here, we show that the brain networks that support them actually overlap significantly and that they are less distinct cognitive processes than often assumed. We review literature demonstrating the benefits of considering each process to explain properties of the other and highlight recent work investigating their more complex interactions. We discuss how future research in both computational and cognitive sciences can benefit from one another, suggesting that a key missing piece for artificial agents to learn to behave with more human-like efficiency is taking working memory's role in learning seriously. This review highlights the risks of neglecting the interplay between different processes when studying human behavior (in particular when considering individual differences). We emphasize the importance of investigating these dynamics to build a comprehensive understanding of human cognition.


Subject(s)
Memory, Short-Term , Reinforcement, Psychology , Brain , Cognition , Humans , Learning
20.
Cogsci ; 43: 618-624, 2021 Jul.
Article in English | MEDLINE | ID: mdl-34964045

ABSTRACT

Humans have the astonishing capacity to quickly adapt to varying environmental demands and reach complex goals in the absence of extrinsic rewards. Part of what underlies this capacity is the ability to flexibly reuse and recombine previous experiences, and to plan future courses of action in a psychological space that is shaped by these experiences. Decades of research have suggested that humans use hierarchical representations for efficient planning and flexibility, but the origin of these representations has remained elusive. This study investigates how 73 participants learned hierarchical representations through experience, in a task in which they had to perform complex action sequences to obtain rewards. Complex action sequences were composed of simpler action sequences, which were not rewarded, but whose completion was signaled to participants. We investigated the process by which participants learned to perform simpler action sequences and combined them into complex action sequences. After learning the action sequences, participants completed a transfer phase in which either simple or complex sequences were manipulated without notice. Relearning progressed more slowly when simple sequences were changed than when complex sequences were changed, in accordance with a hierarchical representation in which lower levels are quickly consolidated, potentially stabilizing exploration, while higher levels remain malleable, with benefits for flexible recombination.
