Search | BVS Regional Portal

Quantifying Reinforcement-Learning Agent's Autonomy, Reliance on Memory and Internalisation of the Environment.

Ingel, Anti; Makkeh, Abdullah; Corcoll, Oriol; Vicente, Raul.

Entropy (Basel) ; 24(3)2022 Mar 13.

Article in English | MEDLINE | ID: mdl-35327912

ABSTRACT

Intuitively, the level of autonomy of an agent is related to the degree to which the agent's goals and behaviour are decoupled from the immediate control by the environment. Here, we capitalise on a recent information-theoretic formulation of autonomy and introduce an algorithm for calculating autonomy in the limit as the time step approaches infinity. We tackle the question of how the autonomy level of an agent changes during training. In particular, in this work, we use the partial information decomposition (PID) framework to monitor the levels of autonomy and environment internalisation of reinforcement-learning (RL) agents. We performed experiments on two environments: a grid world, in which the agent has to collect food, and a repeating-pattern environment, in which the agent has to learn to imitate a sequence of actions by memorising the sequence. PID also allows us to answer how much the agent relies on its internal memory (versus how much it relies on the observations) when transitioning to its next internal state. The experiments show that specific terms of PID strongly correlate with the obtained reward and with the agent's behaviour against perturbations in the observations.

Introducing a differentiable measure of pointwise shared information.

Makkeh, Abdullah; Gutknecht, Aaron J; Wibral, Michael.

Phys Rev E ; 103(3-1): 032149, 2021 Mar.

Article in English | MEDLINE | ID: mdl-33862718

ABSTRACT

Partial information decomposition of the multivariate mutual information describes the distinct ways in which a set of source variables contains information about a target variable. The groundbreaking work of Williams and Beer has shown that this decomposition cannot be determined from classic information theory without making additional assumptions, and several candidate measures have been proposed, often drawing on principles from related fields such as decision theory. None of these measures is differentiable with respect to the underlying probability mass function. We here present a measure that satisfies this property, emerges solely from information-theoretic principles, and has the form of a local mutual information. We show how the measure can be understood from the perspective of exclusions of probability mass, a principle that is foundational to the original definition of mutual information by Fano. Since our measure is well defined for individual realizations of random variables it lends itself, for example, to local learning in artificial neural networks. We also show that it has a meaningful Möbius inversion on a redundancy lattice and obeys a target chain rule. We give an operational interpretation of the measure based on the decisions that an agent should take if given only the shared information.
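As a small illustration of what "local" (pointwise) mutual information means, the sketch below computes the standard quantity log2 p(s,t) / (p(s) p(t)) for each realization of a discrete joint distribution. It is not the shared-information measure introduced in the paper, only the textbook local mutual information whose form that measure is said to take. The function names and the example distribution are hypothetical.

```python
import math

def local_mutual_information(joint):
    """Map each realization (s, t) with p(s, t) > 0 to log2 p(s,t) / (p(s) p(t)).

    `joint` is a dict {(s, t): probability}; this layout is an assumption
    made for the sketch, not something taken from the paper.
    """
    p_s, p_t = {}, {}
    for (s, t), p in joint.items():
        p_s[s] = p_s.get(s, 0.0) + p
        p_t[t] = p_t.get(t, 0.0) + p
    return {
        (s, t): math.log2(p / (p_s[s] * p_t[t]))
        for (s, t), p in joint.items()
        if p > 0
    }

def mutual_information(joint):
    """Average the local values under the joint distribution (in bits)."""
    local = local_mutual_information(joint)
    return sum(joint[key] * value for key, value in local.items())

# Hypothetical example: a binary symmetric channel with 20% flip probability.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
local = local_mutual_information(joint)
average = mutual_information(joint)
```

Local values can be negative for individual realizations (here, the "flipped" outcomes), while their average, the mutual information, is non-negative.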

Estimating the Unique Information of Continuous Variables.

Pakman, Ari; Nejatbakhsh, Amin; Gilboa, Dar; Makkeh, Abdullah; Mazzucato, Luca; Wibral, Michael; Schneidman, Elad.

Adv Neural Inf Process Syst ; 34: 20295-20307, 2021 Dec.

Article in English | MEDLINE | ID: mdl-35645551

ABSTRACT

The integration and transfer of information from multiple sources to multiple targets is a core motive of neural systems. The emerging field of partial information decomposition (PID) provides a novel information-theoretic lens into these mechanisms by identifying synergistic, redundant, and unique contributions to the mutual information between one and several variables. While many works have studied aspects of PID for Gaussian and discrete distributions, the case of general continuous distributions is still uncharted territory. In this work we present a method for estimating the unique information in continuous distributions, for the case of one versus two variables. Our method solves the associated optimization problem over the space of distributions with fixed bivariate marginals by combining copula decompositions and techniques developed to optimize variational autoencoders. We obtain excellent agreement with known analytic results for Gaussians, and illustrate the power of our new approach in several brain-inspired neural models. Our method is capable of recovering the effective connectivity of a chaotic network of rate neurons, and uncovers a complex trade-off between redundancy, synergy and unique information in recurrent networks trained to solve a generalized XOR task.
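To make the notion of synergy concrete, the sketch below evaluates the textbook discrete XOR gate: each input alone carries no information about the output, while the pair determines it completely. This is only an illustration with plain mutual information on a discrete distribution. It is not the continuous estimator described in the abstract, nor its generalized XOR task, and the helper names are hypothetical.

```python
import math
from itertools import product

def mutual_information(joint):
    """I(A;B) in bits for a joint pmf given as a dict {(a, b): probability}."""
    p_a, p_b = {}, {}
    for (a, b), p in joint.items():
        p_a[a] = p_a.get(a, 0.0) + p
        p_b[b] = p_b.get(b, 0.0) + p
    return sum(
        p * math.log2(p / (p_a[a] * p_b[b]))
        for (a, b), p in joint.items()
        if p > 0
    )

# Uniform binary inputs x1, x2 and target y = x1 XOR x2.
joint_x1_y, joint_x2_y, joint_pair_y = {}, {}, {}
for x1, x2 in product([0, 1], repeat=2):
    y = x1 ^ x2
    joint_x1_y[(x1, y)] = joint_x1_y.get((x1, y), 0.0) + 0.25
    joint_x2_y[(x2, y)] = joint_x2_y.get((x2, y), 0.0) + 0.25
    joint_pair_y[((x1, x2), y)] = 0.25

i_x1 = mutual_information(joint_x1_y)      # information from x1 alone
i_x2 = mutual_information(joint_x2_y)      # information from x2 alone
i_pair = mutual_information(joint_pair_y)  # information from both together
```

Because both single-source terms vanish while the joint term equals one bit, the whole bit is available only from the two sources combined, which is the situation PID labels as purely synergistic.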

BROJA-2PID: A Robust Estimator for Bivariate Partial Information Decomposition.

Makkeh, Abdullah; Theis, Dirk Oliver; Vicente, Raul.

Entropy (Basel) ; 20(4)2018 Apr 11.

Article in English | MEDLINE | ID: mdl-33265362

ABSTRACT

Makkeh, Theis, and Vicente found that the Cone Programming model is the most robust approach for computing the Bertschinger et al. partial information decomposition (BROJA PID) measure. We developed production-quality, robust software that computes the BROJA PID measure based on the Cone Programming model. In this paper, we prove the important property of strong duality for the Cone Program and prove an equivalence between the Cone Program and the original Convex problem. Then, we describe our software in detail, explain how to use it, and perform some experiments comparing it to other estimators. Finally, we show that the software can be extended to compute some quantities of a trivariate PID measure.
