
Softsatisficing: Risk-sensitive softmax action selection.

Kamiya, Takumi; Takahashi, Tatsuji.

Biosystems; 213: 104633, 2022 Mar.

Article in English | MEDLINE | ID: mdl-35104613

ABSTRACT

Animals, humans, and organizations are known to adjust how, and how much, they explore complex environments that exceed their information-processing capacity, rather than relentlessly searching for the optimal action. The resulting depth of exploration is thought to depend on an aspiration level internal to the agent. This action selection tendency is known as satisficing. The Risk-sensitive Satisficing (RS) model implements satisficing in the reinforcement learning framework by converting action values into gains (or losses) relative to the aspiration level. The risk-sensitive evaluation of action values by RS has been shown to be effective in reinforcement learning. In this paper, we first analyze RS in comparison with the UCB and Thompson sampling algorithms. We also show that RS exhibits different risk attitudes depending on the risks involved. We then propose the Softsatisficing policy, a stochastic equivalent of RS, and further analyze the exploratory behavior of risk-sensitive satisficing that RS and Softsatisficing implement. We emphasize that Softsatisficing has the potential to model risk-sensitive foraging and other decision-making behaviors by humans, animals, and organizations.
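The abstract gives no formulas, so the following is only a minimal sketch of the idea it describes: action values are turned into gains or losses relative to an aspiration level, and a softmax over those values gives a stochastic counterpart. The weighting by trial share, (n_i / N) * (E_i - aspiration), is an assumption based on how RS is commonly stated elsewhere. The softmax form and the names `rs_values`, `greedy_rs`, `soft_rs`, and `beta` are hypothetical and may differ from the paper's definitions.

```python
import math
import random

def rs_values(counts, means, aspiration):
    """Gain (or loss) of each action relative to the aspiration level,
    weighted by the share of trials spent on it (assumed form)."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [(n / total) * (q - aspiration) for n, q in zip(counts, means)]

def greedy_rs(counts, means, aspiration):
    """Deterministic selection: pick the action with the largest RS value."""
    values = rs_values(counts, means, aspiration)
    return max(range(len(values)), key=values.__getitem__)

def softmax_probs(values, beta=1.0):
    """Numerically stable softmax; beta is a hypothetical inverse temperature."""
    m = max(values)
    exps = [math.exp(beta * (v - m)) for v in values]
    z = sum(exps)
    return [e / z for e in exps]

def soft_rs(counts, means, aspiration, beta=1.0, rng=random):
    """Stochastic selection: sample an action from a softmax over RS values."""
    probs = softmax_probs(rs_values(counts, means, aspiration), beta)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Under these assumptions, with counts `[8, 2]` and mean rewards `[0.6, 0.5]`, an aspiration of 0.7 makes every action a loss and `greedy_rs` prefers the less-tried action, while an aspiration of 0.4 makes every action a gain and it prefers the well-tried one. This is one way to read the differing risk attitudes mentioned in the abstract.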

Subjects

Decision Making, Psychological Reinforcement, Algorithms, Animals, Cognition, Exploratory Behavior
