Results 1 - 4 of 4
1.
Front Artif Intell ; 6: 1098982, 2023.
Article in English | MEDLINE | ID: mdl-36762255

ABSTRACT

Learning from only real-world collected data can be unrealistic and time consuming in many scenarios. One alternative is to use synthetic data as learning environments to learn rare situations, and replay buffers to speed up the learning. In this work, we examine how the creation of the environment affects the training of a reinforcement learning agent through auto-generated environment mechanisms. We take the autonomous vehicle as an application. We compare the effect of two approaches to generate training data for artificial cognitive agents. We consider the added value of curriculum learning (just as in human learning) as a way to structure novel training data that the agent has not seen before, as well as that of using a replay buffer to train further on data the agent has seen before. In other words, the focus of this paper is on characteristics of the training data rather than on learning algorithms. We therefore use two tasks that are commonly trained early on in autonomous vehicle research: lane keeping and pedestrian avoidance. Our main results show that curriculum learning indeed offers an additional benefit over a vanilla reinforcement learning approach (using Deep Q-Learning), but the replay buffer actually has a detrimental effect in most (but not all) combinations of data generation approaches we considered here. The benefit of curriculum learning does depend on the existence of a well-defined difficulty metric with which various training scenarios can be ordered. In the lane-keeping task, we can define it as a function of the curvature of the road: the steeper and more frequent the curves on the road, the more difficult the task. Defining such a difficulty metric in other scenarios is not always trivial. In general, the results of this paper emphasize both the importance of considering data characterization, such as curriculum learning, and the importance of defining an appropriate metric for the task.
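
As an illustration of ordering training scenarios by a curvature-based difficulty metric, here is a minimal sketch. It is not the paper's implementation: the scoring formula, the `threshold` value, the function names, and the sample tracks are all assumptions chosen so that both steeper and more frequent curves raise the score.

```python
from typing import Dict, List, Sequence


def curvature_difficulty(curvatures: Sequence[float], threshold: float = 0.01) -> float:
    """Hypothetical difficulty score for one track.

    Sums |curvature| over the samples that count as a curve, so the score
    grows with both the steepness and the number of curves.
    """
    return sum(abs(k) for k in curvatures if abs(k) > threshold)


def order_curriculum(tracks: Dict[str, Sequence[float]]) -> List[str]:
    """Return track names sorted from easiest to hardest."""
    return sorted(tracks, key=lambda name: curvature_difficulty(tracks[name]))


# Made-up curvature samples (units are an assumption, e.g. 1/m).
tracks = {
    "sharp_bends": [0.0, 0.2, -0.3, 0.25],
    "straight": [0.0, 0.0, 0.0, 0.0],
    "gentle_bend": [0.0, 0.05, 0.05, 0.0],
}

curriculum = order_curriculum(tracks)
print(curriculum)  # ['straight', 'gentle_bend', 'sharp_bends']
```

An agent trained with such a curriculum would see the tracks in the printed order. Any other monotone function of curvature could be swapped in for the score.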

2.
PLoS One ; 15(8): e0236939, 2020.
Article in English | MEDLINE | ID: mdl-32823270

ABSTRACT

We present a dataset of behavioral data recorded from 61 children diagnosed with Autism Spectrum Disorder (ASD). The data was collected during a large-scale evaluation of Robot Enhanced Therapy (RET). The dataset covers over 3000 therapy sessions and more than 300 hours of therapy. Half of the children interacted with the social robot NAO supervised by a therapist. The other half, constituting a control group, interacted directly with a therapist. Both groups followed the Applied Behavior Analysis (ABA) protocol. Each session was recorded with three RGB cameras and two RGBD (Kinect) cameras, providing detailed information about children's behavior during therapy. This public release of the dataset comprises body motion, head position and orientation, and eye gaze variables, all specified as 3D data in a joint frame of reference. In addition, metadata including participant age, gender, and autism diagnosis (ADOS) variables is included. We release this data with the hope of supporting further data-driven studies towards improved therapy methods as well as a better understanding of ASD in general.


Subject(s)
Autism Spectrum Disorder/therapy , Databases, Factual , Medical Informatics , Robotics , Behavior , Child , Evidence-Based Medicine , Female , Humans , Male
3.
Sensors (Basel) ; 17(11), 2017 Nov 09.
Article in English | MEDLINE | ID: mdl-29120389

ABSTRACT

In this paper, we developed a fully textile sensing fabric for tactile sensing, used as a robot skin to detect human-robot interactions. The sensor covers a 20-by-20 cm² area with 400 sensitive points and samples at 50 Hz per point. We defined seven gestures inspired by the social and emotional interactions of typical person-to-person or person-to-pet scenarios. We conducted two groups of mutually blinded experiments, involving 29 participants in total. The data processing algorithm first reduces the spatial complexity to frame descriptors, and temporal features are calculated through basic statistical representations and wavelet analysis. Various classifiers are evaluated and the feature calculation algorithms are analyzed in detail to determine the contribution of each stage and segment. The best performing feature-classifier combination can recognize the gestures with 93.3% accuracy from a known group of participants, and 89.1% from strangers.
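
The two-stage processing described above (spatial reduction to frame descriptors, then temporal statistics) can be sketched as follows. This is only an illustration under stated assumptions: the three descriptors, the `active_threshold` value, the function names, and the toy frames are hypothetical and not taken from the paper, and the wavelet analysis step is omitted.

```python
from statistics import mean, pstdev
from typing import Dict, List

Frame = List[List[float]]  # one 20x20 reading of the 400 sensing points


def frame_descriptor(frame: Frame, active_threshold: float = 0.1) -> Dict[str, float]:
    """Reduce one spatial frame to a few scalar descriptors (assumed choices)."""
    cells = [value for row in frame for value in row]
    return {
        "mean_pressure": mean(cells),
        "max_pressure": max(cells),
        "active_cells": float(sum(1 for value in cells if value > active_threshold)),
    }


def temporal_features(frames: List[Frame]) -> Dict[str, float]:
    """Summarise each descriptor over time with basic statistics."""
    descriptors = [frame_descriptor(frame) for frame in frames]
    features: Dict[str, float] = {}
    for name in descriptors[0]:
        series = [d[name] for d in descriptors]
        features[f"{name}_mean"] = mean(series)
        features[f"{name}_std"] = pstdev(series)
        features[f"{name}_max"] = max(series)
    return features


# Toy recording: an untouched frame followed by a single-point touch.
blank = [[0.0] * 20 for _ in range(20)]
touch = [[0.0] * 20 for _ in range(20)]
touch[10][10] = 1.0

features = temporal_features([blank, touch])
print(features["active_cells_max"])  # 1.0
```

The resulting feature dictionary is the kind of fixed-length vector that could be passed to any of the classifiers being compared.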


Subject(s)
Textiles , Emotions , Humans , Robotics , Skin , Touch
4.
Biol Cybern ; 111(5-6): 365-388, 2017 12.
Article in English | MEDLINE | ID: mdl-28913644

ABSTRACT

The partial reinforcement extinction effect (PREE) is an experimentally established phenomenon: behavioural response to a given stimulus is more persistent when previously inconsistently rewarded than when consistently rewarded. This phenomenon is, however, controversial in animal/human learning theory. Contradictory findings exist regarding when the PREE occurs. One body of research has found a within-subjects PREE, while another has found a within-subjects reversed PREE (RPREE). These opposing findings constitute what is considered the most important problem of PREE for theoreticians to explain. Here, we provide a neurocomputational account of the PREE, which helps to reconcile these seemingly contradictory findings of within-subjects experimental conditions. The performance of our model demonstrates how omission expectancy, learned according to low-probability reward, comes to control response choice following discontinuation of reward presentation (extinction). We find that a PREE will occur when multiple responses become controlled by omission expectation in extinction, but not when only one omission-mediated response is available. Our model exploits the affective states of reward acquisition and reward omission expectancy in order to differentially classify stimuli and differentially mediate response choice. We demonstrate that stimulus-response (retrospective) and stimulus-expectation-response (prospective) routes are required to provide a necessary and sufficient explanation of the PREE versus RPREE data, and that omission representation is key for explaining the nonlinear nature of extinction data.


Subject(s)
Affect/physiology , Computer Simulation , Extinction, Psychological , Models, Neurological , Neurons/physiology , Reinforcement, Psychology , Humans
...