Results 1 - 20 of 56
1.
Nat Commun ; 15(1): 4461, 2024 May 25.
Article in English | MEDLINE | ID: mdl-38796491

ABSTRACT

Behaving efficiently and flexibly is crucial for biological and artificial embodied agents. Behavior is generally classified into two types: habitual (fast but inflexible) and goal-directed (flexible but slow). While these two types of behavior are typically considered to be managed by two distinct systems in the brain, recent studies have revealed a more sophisticated interplay between them. We introduce a theoretical framework using variational Bayesian theory, incorporating a Bayesian intention variable. Habitual behavior depends on the prior distribution of intention, computed from sensory context without goal specification. In contrast, goal-directed behavior relies on the goal-conditioned posterior distribution of intention, inferred through variational free energy minimization. Assuming that an agent behaves using a synergized intention, our simulations in vision-based sensorimotor tasks explain the key properties of their interaction as observed in experiments. Our work suggests a fresh perspective on the neural mechanisms of habits and goals, shedding light on future research in decision making.
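The interplay between the habitual prior and the goal-conditioned posterior described above can be illustrated with a toy precision-weighted fusion of two Gaussian beliefs over a scalar intention variable. This is a hypothetical simplification for intuition only; the paper's model infers intention with a variational recurrent network, not a closed-form Gaussian product.

```python
def fuse_intentions(mu_prior, var_prior, mu_post, var_post):
    """Precision-weighted product of two Gaussian intention beliefs.

    The more precise (lower-variance) belief dominates the synergized
    intention, mirroring habit-goal arbitration.
    """
    p1, p2 = 1.0 / var_prior, 1.0 / var_post      # precisions
    var = 1.0 / (p1 + p2)                         # fused variance
    mu = var * (p1 * mu_prior + p2 * mu_post)     # fused mean
    return mu, var

# A confident habitual prior (variance 0.1) dominates a vague goal
# signal (variance 10.0): the fused mean stays near the habitual mean 0.
mu, var = fuse_intentions(0.0, 0.1, 1.0, 10.0)
print(mu, var)
```

With equal variances, the fused mean falls exactly halfway between the two means, i.e., habit and goal contribute equally.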


Subject(s)
Bayes Theorem , Goals , Habits , Humans , Intention , Decision Making/physiology , Brain/physiology
2.
Entropy (Basel) ; 25(11)2023 Oct 31.
Article in English | MEDLINE | ID: mdl-37998198

ABSTRACT

This study investigated how a physical robot can adapt goal-directed actions in dynamically changing environments, in real time, using an active inference-based approach with incremental learning from human tutoring examples. While our active inference-based model achieves good generalization with appropriate parameters, when the environment changes suddenly and substantially, a human may have to intervene to correct the robot's actions so that it reaches the goal, much as a caregiver might guide the hands of a child performing an unfamiliar task. For the robot to learn from the human tutor, we propose a new scheme that accomplishes incremental learning from these proprioceptive-exteroceptive experiences, combined with mental rehearsal of past experiences. Our experimental results demonstrate that, using only a few tutoring examples, the robot using our model was able to significantly improve its performance on new tasks without catastrophic forgetting of previously learned tasks.

3.
Front Psychiatry ; 14: 1080668, 2023.
Article in English | MEDLINE | ID: mdl-37009124

ABSTRACT

Introduction: Investigating the pathological mechanisms of developmental disorders is challenging because symptoms result from complex, dynamic factors such as neural networks, cognitive behavior, environment, and developmental learning. Recently, computational methods have started to provide a unified framework for understanding developmental disorders, enabling us to describe the interactions among the multiple factors underlying symptoms. However, this approach is still limited because most studies to date have focused on cross-sectional task performance and lacked the perspective of developmental learning. Here, we proposed a new research method for understanding the mechanisms of the acquisition of hierarchical Bayesian representations and its failures, using a state-of-the-art computational model referred to as an in silico neurodevelopment framework for atypical representation learning. Methods: Simple simulation experiments were conducted using the proposed framework to examine whether manipulating neural stochasticity and the noise level in the external environment during the learning process can lead to altered acquisition of hierarchical Bayesian representations and reduced flexibility. Results: Networks with normal neural stochasticity acquired hierarchical representations that reflected the underlying probabilistic structures in the environment, including higher-order representations, and exhibited good behavioral and cognitive flexibility. When neural stochasticity was high during learning, top-down generation using higher-order representations became atypical, although flexibility did not differ from that of the normal-stochasticity setting. However, when neural stochasticity was low during learning, the networks demonstrated reduced flexibility and altered hierarchical representations. Notably, this altered acquisition of higher-order representations and flexibility was ameliorated by increasing the level of noise in external stimuli. Discussion: These results demonstrate that the proposed method assists in modeling developmental disorders by bridging multiple factors, such as the inherent characteristics of neural dynamics, the acquisition of hierarchical representations, flexible behavior, and the external environment.

4.
Entropy (Basel) ; 25(2)2023 Jan 31.
Article in English | MEDLINE | ID: mdl-36832633

ABSTRACT

This study explains how the leader-follower relationship and turn-taking could develop in a dyadic imitative interaction, based on robotic simulation experiments grounded in the free energy principle. Our prior study showed that introducing a parameter during the model training phase can determine leader and follower roles in subsequent imitative interactions. The parameter, w, the so-called meta-prior, is a weighting factor that regulates the complexity term against the accuracy term when minimizing free energy. This can be read as sensory attenuation, in which the robot's prior beliefs about action become less sensitive to sensory evidence. The current extended study examines the possibility that the leader-follower relationship shifts depending on changes in w during the interaction phase. Using comprehensive simulation experiments that swept w for both robots during the interaction, we identified a phase space structure with three distinct types of behavioral coordination. Ignoring behavior, in which each robot follows its own intention, was observed when both ws were set to large values. Behavior in which one robot led and the other followed was observed when one w was set larger and the other smaller. Spontaneous, random turn-taking between leader and follower was observed when both ws were set to small or intermediate values. Finally, we examined a case in which w oscillated slowly in anti-phase between the two agents during the interaction. The simulation experiment resulted in turn-taking in which the leader-follower relationship switched in determined sequences, accompanied by periodic shifts of w. An analysis using transfer entropy found that the direction of information flow between the two agents also shifted along with turn-taking. Herein, we discuss qualitative differences between random/spontaneous turn-taking and agreed-upon sequential turn-taking by reviewing both synthetic and empirical studies.
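The direction-of-information-flow analysis mentioned above can be sketched with a minimal plug-in transfer entropy estimator for discrete sequences with history length 1. The study's analysis of continuous robot trajectories is necessarily more elaborate; this toy shows only the core quantity.

```python
from collections import Counter
from math import log2

def transfer_entropy(src, dst):
    """Plug-in transfer entropy TE(src -> dst) for discrete sequences,
    history length 1: I(dst[t+1]; src[t] | dst[t])."""
    triples = list(zip(dst[1:], dst[:-1], src[:-1]))   # (x, y, z)
    n = len(triples)
    c_xyz = Counter(triples)
    c_yz = Counter((y, z) for _, y, z in triples)
    c_xy = Counter((x, y) for x, y, _ in triples)
    c_y = Counter(y for _, y, _ in triples)
    te = 0.0
    for (x, y, z), c in c_xyz.items():
        # p(x,y,z) * log2( p(x|y,z) / p(x|y) ), counts cancel the 1/n factors
        te += (c / n) * log2(c * c_y[y] / (c_yz[(y, z)] * c_xy[(x, y)]))
    return te

# dst copies src with a one-step lag, so information flows src -> dst
src = [0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0] * 50
dst = [0] + src[:-1]
print(transfer_entropy(src, dst) > transfer_entropy(dst, src))  # True
```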

5.
Front Neurorobot ; 16: 891031, 2022.
Article in English | MEDLINE | ID: mdl-36187567

ABSTRACT

Robot kinematic data, despite being high-dimensional, is highly correlated, especially when considering motions grouped in certain primitives. These almost linear correlations within primitives allow us to interpret motions as points drawn close to a union of low-dimensional affine subspaces in the space of all motions. Motivated by results of embedding theory, in particular, generalizations of the Whitney embedding theorem, we show that random linear projection of motor sequences into low-dimensional space loses very little information about the structure of kinematic data. Projected points offer good initial estimates for values of latent variables in a generative model of robot sensory-motor behavior primitives. We conducted a series of experiments in which we trained a Recurrent Neural Network to generate sensory-motor sequences for a robotic manipulator with 9 degrees of freedom. Experimental results demonstrate substantial improvement in generalization abilities for unobserved samples during initialization of latent variables with a random linear projection of motor data over initialization with zero or random values. Moreover, latent space is well-structured such that samples belonging to different primitives are well separated from the onset of the training process.
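The claim that a random linear projection preserves the structure of kinematic data has a Johnson-Lindenstrauss flavor and can be checked numerically on synthetic points drawn near a union of affine subspaces. The data generator and dimensions below are illustrative stand-ins for flattened motor sequences, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_subspace(n, ambient=90, intrinsic=3):
    """Points on a random 3-D affine subspace of a 90-D ambient space
    (a stand-in for joint trajectories within one motion primitive)."""
    basis = rng.standard_normal((intrinsic, ambient))
    offset = rng.standard_normal(ambient)
    return rng.standard_normal((n, intrinsic)) @ basis + offset

# Two motion primitives = points near a union of two affine subspaces
X = np.vstack([sample_subspace(100), sample_subspace(100)])

# Random linear projection into a 10-D latent space
k = 10
P = rng.standard_normal((X.shape[1], k)) / np.sqrt(k)
Y = X @ P

# Pairwise distances are largely preserved after projection
d_hi = np.linalg.norm(X[:50, None] - X[None, :50], axis=-1)
d_lo = np.linalg.norm(Y[:50, None] - Y[None, :50], axis=-1)
mask = d_hi > 0
ratio = d_lo[mask] / d_hi[mask]
print(round(float(ratio.mean()), 2), round(float(ratio.std()), 2))
```

The distance ratios cluster around 1, which is why such projections give useful initial estimates of latent variables.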

6.
Sci Rep ; 12(1): 14542, 2022 08 25.
Article in English | MEDLINE | ID: mdl-36008463

ABSTRACT

The brain attenuates its responses to self-produced exteroceptions (e.g., we cannot tickle ourselves). Is this phenomenon, known as sensory attenuation, enabled innately, or acquired through learning? Here, our simulation study using a multimodal hierarchical recurrent neural network model, based on variational free-energy minimization, shows that a mechanism for sensory attenuation can develop through learning of two distinct types of sensorimotor experience, involving self-produced or externally produced exteroceptions. For each sensorimotor context, a particular free-energy state emerged through interaction between top-down prediction with precision and bottom-up sensory prediction error from each sensory area. The executive area in the network served as an information hub. Consequently, shifts between the two sensorimotor contexts triggered transitions from one free-energy state to another in the network via executive control, which caused shifts between attenuating and amplifying prediction-error-induced responses in the sensory areas. This study situates emergence of sensory attenuation (or self-other distinction) in development of distinct free-energy states in the dynamic hierarchical neural system.


Subject(s)
Learning , Sensation , Brain , Learning/physiology , Neural Networks, Computer
7.
Entropy (Basel) ; 24(4)2022 Mar 28.
Article in English | MEDLINE | ID: mdl-35455132

ABSTRACT

We show that goal-directed action planning and generation in a teleological framework can be formulated by extending the active inference framework. The proposed model, built on a variational recurrent neural network, is characterized by three essential features: (1) goals can be specified both for static sensory states (e.g., goal images to be reached) and for dynamic processes (e.g., moving around an object); (2) the model can not only generate goal-directed action plans, but can also understand goals through sensory observation; and (3) the model generates future action plans for given goals based on the best estimate of the current state, inferred from past sensory observations. The proposed model is evaluated in experiments on a simulated mobile agent as well as on a real humanoid robot performing object manipulation.

8.
Artif Life ; 28(1): 3-21, 2022 06 09.
Article in English | MEDLINE | ID: mdl-35287173

ABSTRACT

Evolution and development operate at different timescales: generations for the one, a lifetime for the other. These two processes, the basis of much of life on earth, interact in many non-trivial ways, but their temporal hierarchy (evolution overarching development) is observed for most multicellular life forms. When designing robots, however, this tenet no longer binds: it becomes, however natural, a design choice. We propose to invert this temporal hierarchy and design a developmental process that happens at the phylogenetic timescale. On top of a classic evolutionary search aimed at finding good gaits for 2D tentacle robots, we add a developmental process over the robots' morphologies. Within a generation, the morphology of the robots does not change; but from one generation to the next, the morphology develops. Much as we become bigger, stronger, and heavier as we age, our robots grow bigger, stronger, and heavier with each passing generation. Our robots start with baby morphologies and, a few thousand generations later, end up with adult ones. We show that this produces better and qualitatively different gaits than an evolutionary search with only adult robots, and that it prevents premature convergence by fostering exploration. In addition, we validate our method on voxel lattice 3D robots from the literature and compare it to a recent evolutionary developmental approach. Our method is conceptually simple, effective on small or large populations of robots, and intrinsic to the robot and its morphology rather than to the task or environment. Furthermore, by recasting the evolutionary search as a learning process, these results can be viewed in the context of developmental learning robotics.
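The inverted hierarchy (morphology fixed within a generation, developing across generations) can be sketched as a toy evolutionary loop. The fitness function and scaling schedule below are hypothetical stand-ins for gait evaluation on a growing body, not the paper's simulator.

```python
import random

random.seed(1)

def fitness(gait_params, scale):
    """Toy stand-in for gait evaluation: the optimum shifts with body
    scale, so controllers must track the developing morphology."""
    return -sum((p - scale) ** 2 for p in gait_params)

def evolve(generations=200, pop_size=20, develop=True):
    pop = [[random.uniform(0, 1) for _ in range(3)] for _ in range(pop_size)]
    for g in range(generations):
        # Morphology is fixed within a generation but develops across
        # generations: body scale grows from "baby" (0.2) to "adult" (1.0).
        scale = 0.2 + 0.8 * g / (generations - 1) if develop else 1.0
        ranked = sorted(pop, key=lambda ind: fitness(ind, scale), reverse=True)
        parents = ranked[: pop_size // 2]
        pop = [[p + random.gauss(0, 0.05) for p in random.choice(parents)]
               for _ in range(pop_size)]
    # Evaluate the final population on the adult morphology
    return max(fitness(ind, 1.0) for ind in pop)

best = evolve()
print(round(best, 3))
```

Because the optimum drifts slowly relative to the mutation scale, the population tracks the developing morphology and ends well adapted to the adult body.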


Subject(s)
Robotics , Algorithms , Gait , Learning , Phylogeny , Robotics/methods
9.
Neural Comput ; 33(9): 2353-2407, 2021 08 19.
Article in English | MEDLINE | ID: mdl-34412116

ABSTRACT

Generalization by learning is an essential cognitive competency for humans. For example, we can manipulate even unfamiliar objects and can generate mental images before enacting a plan. How is this possible? Our study investigated this problem by revisiting our previous study (Jung, Matsumoto, & Tani, 2019), which examined the problem of vision-based, goal-directed planning by robots performing a block-stacking task. Extending the previous study, our work introduces a large network comprising dynamically interacting submodules, including visual working memories (VWMs), a visual attention module, and an executive network. The executive network predicts motor signals, visual images, and various controls for attention, as well as masking of visual information. The most significant difference from the previous study is that our current model contains an additional VWM. The entire network is trained using predictive coding, and an optimal visuomotor plan to achieve a given goal state is inferred using active inference. Results indicate that our current model performs significantly better than that used in Jung et al. (2019), especially when manipulating blocks with unlearned colors and textures. Simulation results revealed that the observed generalization was achieved because content-agnostic information processing developed through synergistic interaction between the second VWM and other modules during the course of learning, in which memorizing image contents and transforming them are dissociated. This letter verifies this claim through both qualitative and quantitative analysis of simulation results.


Subject(s)
Memory, Short-Term , Robotics , Cognition , Humans , Visual Perception
10.
Front Psychol ; 11: 584869, 2020.
Article in English | MEDLINE | ID: mdl-33335499

ABSTRACT

Interdisciplinary efforts from developmental psychology, phenomenology, and philosophy of mind have studied the rudiments of social cognition and conceptualized distinct forms of intersubjective communication and interaction in early human life. Interaction theorists consider primary intersubjectivity a non-mentalist, pre-theoretical, non-conceptual sort of process that grounds a certain level of communication and understanding and supports higher-level cognitive skills. We argue that the study of human/neurorobot interaction offers a unique opportunity to deepen understanding of the underlying mechanisms of social cognition through synthetic modeling, while allowing us to examine second-person experiential (2PP) access to intersubjectivity in embodied dyadic interaction. Concretely, we propose studying primary intersubjectivity as a 2PP experience characterized by predictive engagement, where perception, cognition, and action are accounted for by a hermeneutic circle in dyadic interaction. From our interpretation of the concept of active inference in free-energy principle theory, we propose an open-source methodology named the neural robotics library (NRL) for experimental human/neurorobot interaction, wherein a demonstration program named the virtual Cartesian robot (VCBot) lets general audiences experience the aforementioned embodied interaction. Lastly, through a case study, we discuss ways human-robot primary intersubjectivity can contribute to cognitive science research, such as in the fields of developmental psychology, educational technology, and cognitive rehabilitation.

11.
Entropy (Basel) ; 22(5)2020 May 18.
Article in English | MEDLINE | ID: mdl-33286336

ABSTRACT

It is crucial to ask how agents can achieve goals by generating action plans using only partial models of the world acquired through habituated sensory-motor experiences. Although many existing robotics studies use a forward model framework, there are generalization issues with high degrees of freedom. The current study shows that the predictive coding (PC) and active inference (AIF) frameworks, which employ a generative model, can develop better generalization by learning a prior distribution in a low dimensional latent state space representing probabilistic structures extracted from well habituated sensory-motor trajectories. In our proposed model, learning is carried out by inferring optimal latent variables as well as synaptic weights for maximizing the evidence lower bound, while goal-directed planning is accomplished by inferring latent variables for maximizing the estimated lower bound. Our proposed model was evaluated with both simple and complex robotic tasks in simulation, which demonstrated sufficient generalization in learning with limited training data by setting an intermediate value for a regularization coefficient. Furthermore, comparative simulation results show that the proposed model outperforms a conventional forward model in goal-directed planning, due to the learned prior confining the search of motor plans within the range of habituated trajectories.
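Goal-directed planning by inferring latent variables, with a prior-like regularizer confining the search near habituated regions, can be sketched as gradient descent on a goal error. The decoder below is a fixed random stand-in for the paper's learned generative model; the regularization coefficient plays the role of the learned prior.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical trained generative model: latent plan z -> predicted sensation.
W = rng.standard_normal((2, 8))
decode = lambda z: np.tanh(z @ W)

def plan(goal, steps=2000, lr=0.02, reg=0.01):
    """Infer a latent plan by gradient descent on squared goal error plus
    a prior-like term keeping the plan near habituated regions (z ~ 0)."""
    z = np.zeros(2)
    for _ in range(steps):
        d = decode(z)
        err = d - goal
        # Backprop through tanh and the linear decoder, plus the prior term
        grad = 2 * (err * (1 - d ** 2)) @ W.T + 2 * reg * z
        z -= lr * grad
    return z

goal = decode(np.array([0.5, -0.3]))   # a reachable goal sensation
z = plan(goal)
print(round(float(np.abs(decode(z) - goal).max()), 4))  # small residual
```

Increasing `reg` pulls inferred plans toward the prior, which is the toy analogue of confining the motor-plan search to habituated trajectories.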

12.
Front Neurorobot ; 14: 61, 2020.
Article in English | MEDLINE | ID: mdl-33013346

ABSTRACT

When agents interact socially with different intentions (or wills), conflicts are difficult to avoid. Although the means by which social agents can resolve such problems autonomously have not been determined, dynamic characteristics of agency may shed light on underlying mechanisms. Therefore, the current study focused on the sense of agency, a specific aspect of agency referring to congruence between the agent's intention in acting and the outcome, especially in social interaction contexts. Employing predictive coding and active inference as theoretical frameworks of perception and action generation, we hypothesize that regulation of complexity in the evidence lower bound of an agent's model should affect the strength of the agent's sense of agency and should have a significant impact on social interactions. To evaluate this hypothesis, we built a computational model of imitative interaction between a robot and a human via visuo-proprioceptive sensation with a variational Bayes recurrent neural network, and simulated the model in the form of pseudo-imitative interaction using recorded human body movement data, which serve as the counterpart in the interactions. A key feature of the model is that the complexity of each modality can be regulated differently by changing the values of a hyperparameter assigned to each local module of the model. We first searched for an optimal setting of hyperparameters that endow the model with appropriate coordination of multimodal sensation. These searches revealed that complexity of the vision module should be more tightly regulated than that of the proprioception module because of greater uncertainty in visual information flow. Using this optimally trained model as a default model, we investigated how changing the tightness of complexity regulation in the entire network after training affects the strength of the sense of agency during imitative interactions. The results showed that with looser regulation of complexity, an agent tends to act more egocentrically, without adapting to the other. In contrast, with tighter regulation, the agent tends to follow the other by adjusting its intention. We conclude that the tightness of complexity regulation significantly affects the strength of the sense of agency and the dynamics of interactions between agents in social settings.

13.
Neural Netw ; 129: 149-162, 2020 Sep.
Article in English | MEDLINE | ID: mdl-32534378

ABSTRACT

Recurrent neural networks (RNNs) for reinforcement learning (RL) have shown distinct advantages, e.g., solving memory-dependent tasks and meta-learning. However, little effort has been spent on improving RNN architectures and on understanding the underlying neural mechanisms behind performance gains. In this paper, we propose a novel multiple-timescale, stochastic RNN for RL. Empirical results show that the network can autonomously learn to abstract sub-goals and can self-develop an action hierarchy using internal dynamics in a challenging continuous control task. Furthermore, we show that the self-developed compositionality of the network enables faster re-learning when adapting to a new task that is a re-composition of previously learned sub-goals than learning from scratch. We also found that improved performance can be achieved when neural activities are subject to stochastic rather than deterministic dynamics.
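A multiple-timescale RNN update, in which slow units integrate with a large time constant and fast units with a small one, can be sketched as a leaky-integrator step. Layer sizes, weights, and time constants below are illustrative; the paper's network is additionally stochastic and trained by RL.

```python
import numpy as np

rng = np.random.default_rng(0)

def mtrnn_step(h_fast, h_slow, x, params, tau_fast=2.0, tau_slow=20.0):
    """One leaky-integrator update with two timescales (MTRNN-style).

    Slow units (large tau) change gradually, a substrate for abstract
    sub-goals; fast units track immediate input."""
    W_ff, W_sf, W_xf, W_ss, W_fs = params
    u_fast = h_fast @ W_ff + h_slow @ W_sf + x @ W_xf
    u_slow = h_slow @ W_ss + h_fast @ W_fs
    h_fast = (1 - 1 / tau_fast) * h_fast + (1 / tau_fast) * np.tanh(u_fast)
    h_slow = (1 - 1 / tau_slow) * h_slow + (1 / tau_slow) * np.tanh(u_slow)
    return h_fast, h_slow

n_fast, n_slow, n_in = 8, 4, 3
shapes = [(n_fast, n_fast), (n_slow, n_fast), (n_in, n_fast),
          (n_slow, n_slow), (n_fast, n_slow)]
params = [rng.standard_normal(s) * 0.3 for s in shapes]
h_fast, h_slow = np.zeros(n_fast), np.zeros(n_slow)
drift_fast, drift_slow = [], []
for _ in range(50):
    x = rng.standard_normal(n_in)
    prev_f, prev_s = h_fast, h_slow
    h_fast, h_slow = mtrnn_step(h_fast, h_slow, x, params)
    drift_fast.append(np.abs(h_fast - prev_f).mean())
    drift_slow.append(np.abs(h_slow - prev_s).mean())
print(np.mean(drift_fast) > np.mean(drift_slow))  # slow units change less
```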


Subject(s)
Machine Learning/standards
14.
Neural Comput ; 31(11): 2025-2074, 2019 11.
Article in English | MEDLINE | ID: mdl-31525309

ABSTRACT

This study introduces PV-RNN, a novel variational RNN inspired by predictive-coding ideas. The model learns to extract the probabilistic structures hidden in fluctuating temporal patterns by dynamically changing the stochasticity of its latent states. Its architecture attempts to address two major concerns of variational Bayes RNNs: how latent variables can learn meaningful representations and how the inference model can transfer future observations to the latent variables. PV-RNN does both by introducing adaptive vectors mirroring the training data, whose values can then be adapted differently during evaluation. Moreover, prediction errors during backpropagation, rather than external inputs during the forward computation, are used to convey information to the network about the external data. For testing, we introduce error regression, a predictive-coding-inspired procedure for predicting unseen sequences that leverages those mechanisms. As in other variational Bayes RNNs, our model learns by maximizing a lower bound on the marginal likelihood of the sequential data, which is composed of two terms: the negative of the expectation of prediction errors and the negative of the Kullback-Leibler divergence between the prior and the approximate posterior distributions. The model introduces a weighting parameter, the meta-prior, to balance the optimization pressure placed on those two terms. We test the model on two data sets with probabilistic structures and show that with high values of the meta-prior, the network develops deterministic chaos through which the randomness of the data is imitated. For low values, the model behaves as a random process. The network performs best on intermediate values and is able to capture the latent probabilistic structure with good generalization. Analyzing the meta-prior's impact on the network allows us to precisely study the theoretical value and practical benefits of incorporating stochastic dynamics in our model. Using error regression, our model achieves better prediction performance on a robot imitation task than a standard variational Bayes model lacking such a procedure.
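The meta-prior-weighted objective, accuracy plus w times complexity, can be written out directly for Gaussian latent states. This is a schematic of the loss only, under the assumption of diagonal Gaussians, not of PV-RNN's architecture.

```python
import numpy as np

def gauss_kl(mu_q, sig_q, mu_p, sig_p):
    """KL( N(mu_q, sig_q^2) || N(mu_p, sig_p^2) ), summed over dimensions."""
    return np.sum(np.log(sig_p / sig_q)
                  + (sig_q ** 2 + (mu_q - mu_p) ** 2) / (2 * sig_p ** 2)
                  - 0.5)

def neg_lower_bound(pred, target, mu_q, sig_q, mu_p, sig_p, w):
    """Negative lower bound: accuracy term plus the meta-prior w times the
    complexity (KL) term, mirroring the weighting described above."""
    accuracy = np.sum((pred - target) ** 2)   # stands in for -E[log p(x|z)]
    complexity = gauss_kl(mu_q, sig_q, mu_p, sig_p)
    return accuracy + w * complexity

# A posterior that deviates from the prior is penalized more under larger w:
pred, target = np.zeros(3), np.full(3, 0.1)
mu_q, sig_q = np.ones(4), np.full(4, 0.5)
mu_p, sig_p = np.zeros(4), np.ones(4)
loose = neg_lower_bound(pred, target, mu_q, sig_q, mu_p, sig_p, w=0.1)
tight = neg_lower_bound(pred, target, mu_q, sig_q, mu_p, sig_p, w=1.0)
print(tight > loose)  # True: larger w presses the posterior toward the prior
```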

15.
Front Neurorobot ; 12: 78, 2018.
Article in English | MEDLINE | ID: mdl-30546302

ABSTRACT

Artificial autonomous agents and robots interacting in complex environments are required to continually acquire and fine-tune knowledge over sustained periods of time. The ability to learn from continuous streams of information is referred to as lifelong learning and represents a long-standing challenge for neural network models due to catastrophic forgetting in which novel sensory experience interferes with existing representations and leads to abrupt decreases in the performance on previously acquired knowledge. Computational models of lifelong learning typically alleviate catastrophic forgetting in experimental scenarios with given datasets of static images and limited complexity, thereby differing significantly from the conditions artificial agents are exposed to. In more natural settings, sequential information may become progressively available over time and access to previous experience may be restricted. Therefore, specialized neural network mechanisms are required that adapt to novel sequential experience while preventing disruptive interference with existing representations. In this paper, we propose a dual-memory self-organizing architecture for lifelong learning scenarios. The architecture comprises two growing recurrent networks with the complementary tasks of learning object instances (episodic memory) and categories (semantic memory). Both growing networks can expand in response to novel sensory experience: the episodic memory learns fine-grained spatiotemporal representations of object instances in an unsupervised fashion while the semantic memory uses task-relevant signals to regulate structural plasticity levels and develop more compact representations from episodic experience. For the consolidation of knowledge in the absence of external sensory input, the episodic memory periodically replays trajectories of neural reactivations. We evaluate the proposed model on the CORe50 benchmark dataset for continuous object recognition, showing that we significantly outperform current methods of lifelong learning in three different incremental learning scenarios.

16.
Inorg Chem ; 57(21): 13137-13149, 2018 Nov 05.
Article in English | MEDLINE | ID: mdl-30345760

ABSTRACT

Thermal decomposition of layered zinc hydroxides (LZHs) is a simple and convenient way to obtain porous ZnO nanostructures. The type of anion contained in an LZH determines the fundamental characteristics of the LZH and thus affects the formation process of the resulting porous ZnO. Here we report a comparative study on the crystal orientation relationship between LZH precursors and the corresponding porous ZnO products by using well-faceted and highly oriented LZH crystals with three different anions, i.e., NO3-, SO42-, and Cl-. Highly oriented LZH crystals were prepared on layer-by-layer coated indium tin oxide substrates by electrodeposition in aqueous solution and were transformed into porous ZnO by calcination in air. The synthesized materials were characterized by X-ray diffraction, scanning electron microscopy with electron backscatter diffraction, Fourier transform infrared spectroscopy, and X-ray photoelectron spectroscopy. The layered structure of the highly oriented LZHs was parallel to the substrate surface, and all were transformed into nanoporous ZnO with a ⟨0001⟩ preferred orientation. The ⟨0001⟩ orientation degree and in-plane orientation of the nanoporous ZnO differed significantly depending on the type of anion but not the decomposition temperature, revealing that the initial formation process of ZnO from the LZHs is crucial. Finally, a possible transformation mechanism explaining the difference in the resulting ZnO orientation by anions (NO3-, SO42-, and Cl-) is discussed on the basis of their layered structure and thermal decomposition processes.

17.
Neural Netw ; 105: 356-370, 2018 Sep.
Article in English | MEDLINE | ID: mdl-29936360

ABSTRACT

Video image recognition has been extensively studied with rapid progress recently. However, most methods focus on short-term rather than long-term (contextual) video recognition. Convolutional recurrent neural networks (ConvRNNs) provide robust spatio-temporal information processing capabilities for contextual video recognition, but require extensive computation that slows down training. Inspired by normalization and detrending methods, in this paper we propose "adaptive detrending" (AD) for temporal normalization in order to accelerate the training of ConvRNNs, especially of convolutional gated recurrent unit (ConvGRU). For each neuron in a recurrent neural network (RNN), AD identifies the trending change within a sequence and subtracts it, removing the internal covariate shift. In experiments testing for contextual video recognition with ConvGRU, results show that (1) ConvGRU clearly outperforms feed-forward neural networks, (2) AD consistently and significantly accelerates training and improves generalization, (3) performance is further improved when AD is coupled with other normalization methods, and most importantly, (4) the more long-term contextual information is required, the more AD outperforms existing methods.
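The idea of identifying and subtracting a per-neuron trend can be sketched with an exponential-moving-average detrender. This is a simplified stand-in; the published AD method estimates the trend adaptively and differs in detail.

```python
import numpy as np

def detrend(seq, alpha=0.1):
    """Subtract a running (exponential moving average) trend from each
    neuron's activation sequence, reducing slow covariate shift."""
    trend = np.zeros_like(seq[0], dtype=float)
    out = []
    for x in seq:
        trend = (1 - alpha) * trend + alpha * x   # track the slow trend
        out.append(x - trend)                     # keep the fast residual
    return np.array(out)

# A drifting activation: fast oscillation on a slow upward trend
t = np.arange(200)
seq = np.sin(t) + 0.05 * t
resid = detrend(seq[:, None])

# The residual's running mean stays near zero while the raw signal drifts
print(round(float(resid[100:].mean()), 2), round(float(seq[100:].mean()), 2))
```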


Subject(s)
Machine Learning , Pattern Recognition, Automated/methods , Neural Networks, Computer , Pattern Recognition, Automated/standards , Video Recording/methods
18.
Comput Psychiatr ; 2: 164-182, 2018 Dec.
Article in English | MEDLINE | ID: mdl-30627669

ABSTRACT

Recently, applying computational models developed in cognitive science to psychiatric disorders has been recognized as an essential approach for understanding the cognitive mechanisms underlying psychiatric symptoms. Autism spectrum disorder is a neurodevelopmental disorder that is hypothesized to affect information processing in the brain involving the estimation of sensory precision (uncertainty), but the mechanism by which observed symptoms are generated from such abnormalities has not been thoroughly investigated. Using a humanoid robot controlled by a neural network with a precision-weighted prediction error minimization mechanism, we suggest that both increased and decreased sensory precision could induce the behavioral rigidity characterized by resistance to change that is characteristic of autistic behavior. Specifically, decreased sensory precision caused any error signals to be disregarded, leading to invariability of the robot's intention, while increased sensory precision caused an excessive response to error signals, leading to fluctuations and subsequent fixation of intention. The results may provide a system-level explanation of mechanisms underlying different types of behavioral rigidity in autism spectrum and other psychiatric disorders. In addition, our findings suggest that symptoms caused by decreased and increased sensory precision could be distinguishable by examining the internal experience of patients and neural activity coding prediction error signals in the biological brain.
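The effect of sensory precision on intention dynamics can be sketched with a scalar precision-weighted prediction-error update. This is a caricature of the robot model, with hypothetical parameter values: low precision yields a rigidly invariant intention, excessive precision yields a fluctuating one.

```python
import numpy as np

def update_intention(intention, observations, precision, lr=0.5):
    """Intention updated by precision-weighted prediction errors.

    Low precision discards error signals (rigidly invariant intention);
    excessive precision over-amplifies noisy errors (fluctuating intention)."""
    history = []
    for obs in observations:
        error = obs - intention               # prediction error
        intention = intention + lr * precision * error
        history.append(intention)
    return np.array(history)

rng = np.random.default_rng(0)
obs = 1.0 + 0.3 * rng.standard_normal(200)    # noisy evidence around 1.0

low = update_intention(0.0, obs, precision=0.001)   # errors disregarded
mid = update_intention(0.0, obs, precision=0.5)     # evidence tracked
high = update_intention(0.0, obs, precision=3.0)    # errors over-amplified
print(round(float(low[-1]), 2), round(float(mid[-1]), 2),
      round(float(high[100:].std()), 2))
```

With low precision the intention barely moves from its initial value; with excessive precision it tracks the evidence but fluctuates far more than under moderate precision.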

19.
Neural Comput ; 30(1): 237-270, 2018 01.
Article in English | MEDLINE | ID: mdl-29064785

ABSTRACT

This letter proposes a novel predictive coding type neural network model, the predictive multiple spatiotemporal scales recurrent neural network (P-MSTRNN). The P-MSTRNN learns to predict visually perceived human whole-body cyclic movement patterns by exploiting multiscale spatiotemporal constraints imposed on network dynamics by using differently sized receptive fields as well as different time constant values for each layer. After learning, the network can imitate target movement patterns by inferring or recognizing corresponding intentions by means of the regression of prediction error. Results show that the network can develop a functional hierarchy by developing a different type of dynamic structure at each layer. The letter examines how model performance during pattern generation, as well as predictive imitation, varies depending on the stage of learning. The number of limit cycle attractors corresponding to target movement patterns increases as learning proceeds. Transient dynamics developing early in the learning process successfully perform pattern generation and predictive imitation tasks. The letter concludes that exploitation of transient dynamics facilitates successful task performance during early learning periods.


Subject(s)
Learning/physiology , Movement/physiology , Neural Networks, Computer , Nonlinear Dynamics , Visual Pathways/physiology , Visual Perception/physiology , Algorithms , Humans , Memory/physiology , Predictive Value of Tests
20.
Neural Netw ; 96: 137-149, 2017 Dec.
Article in English | MEDLINE | ID: mdl-29017140

ABSTRACT

Lifelong learning is fundamental in autonomous robotics for the acquisition and fine-tuning of knowledge through experience. However, conventional deep neural models for action recognition from videos do not account for lifelong learning but rather learn a batch of training data with a predefined number of action classes and samples. Thus, there is a need to develop learning systems with the ability to incrementally process available perceptual cues and to adapt their responses over time. We propose a self-organizing neural architecture for incrementally learning to classify human actions from video sequences. The architecture comprises growing self-organizing networks equipped with recurrent neurons for processing time-varying patterns. We use a set of hierarchically arranged recurrent networks for the unsupervised learning of action representations with increasingly large spatiotemporal receptive fields. Lifelong learning is achieved in terms of prediction-driven neural dynamics in which the growth and the adaptation of the recurrent networks are driven by their capability to reconstruct temporally ordered input sequences. Experimental results on a classification task using two action benchmark datasets show that our model is competitive with state-of-the-art methods for batch learning, even when a significant number of sample labels are missing or corrupted during training sessions. Additional experiments show the ability of our model to adapt to non-stationary input while avoiding catastrophic interference.


Subject(s)
Machine Learning/trends , Neural Networks, Computer , Pattern Recognition, Visual , Humans , Neurons/physiology , Pattern Recognition, Visual/physiology , Robotics/methods , Robotics/trends