Search | VHL Regional Portal

Emergent Solutions to High-Dimensional Multitask Reinforcement Learning.

Kelly, Stephen; Heywood, Malcolm I.

Evol Comput ; 26(3): 347-380, 2018.

Article in English | MEDLINE | ID: mdl-29932363

ABSTRACT

Algorithms that learn through environmental interaction and delayed rewards, or reinforcement learning (RL), increasingly face the challenge of scaling to dynamic, high-dimensional, and partially observable environments. Significant attention is being paid to frameworks from deep learning, which scale to high-dimensional data by decomposing the task through multilayered neural networks. While effective, the representation is complex and computationally demanding. In this work, we propose a framework based on genetic programming which adaptively complexifies policies through interaction with the task. We make a direct comparison with several deep reinforcement learning frameworks in the challenging Atari video game environment as well as more traditional reinforcement learning frameworks based on a priori engineered features. Results indicate that the proposed approach matches the quality of deep learning while being a minimum of three orders of magnitude simpler with respect to model complexity. This results in real-time operation of the champion RL agent without recourse to specialized hardware support. Moreover, the approach is capable of evolving solutions to multiple game titles simultaneously with no additional computational cost. In this case, agent behaviours for an individual game as well as single agents capable of playing all games emerge from the same evolutionary run.

Subject(s)

Algorithms , Learning/physiology , Models, Psychological , Neural Networks, Computer , Reinforcement, Psychology , Cooperative Behavior , Game Theory

Classification as clustering: a Pareto cooperative-competitive GP approach.

McIntyre, Andrew R; Heywood, Malcolm I.

Evol Comput ; 19(1): 137-66, 2011.

Article in English | MEDLINE | ID: mdl-20879899

ABSTRACT

Intuitively, population-based algorithms such as genetic programming provide a natural environment for supporting solutions that learn to decompose the overall task between multiple individuals, or a team. This work presents a framework for evolving teams without recourse to prespecifying the number of cooperating individuals. To do so, each individual evolves a mapping to a distribution of outcomes that, following clustering, establishes the parameterization of a (Gaussian) local membership function. This gives individuals the opportunity to represent subsets of tasks, where the overall task is that of classification under the supervised learning domain. Thus, rather than each team member representing an entire class, individuals are free to identify unique subsets of the overall classification task. The framework is supported by techniques from evolutionary multiobjective optimization (EMO) and Pareto competitive coevolution. EMO establishes the basis for encouraging individuals to provide accurate yet nonoverlapping behaviors, whereas competitive coevolution provides the mechanism for scaling to potentially large unbalanced datasets. Benchmarking is performed against recent examples of nonlinear SVM classifiers over 12 UCI datasets with between 150 and 200,000 training instances. Solutions from the proposed coevolutionary multiobjective GP framework appear to provide a good balance between classification performance and model complexity, especially as the dataset instance count increases.

Subject(s)

Artificial Intelligence , Classification/methods , Cluster Analysis , Humans

Scaling genetic programming to large datasets using hierarchical dynamic subset selection.

Curry, Robert; Lichodzijewski, Peter; Heywood, Malcolm I.

IEEE Trans Syst Man Cybern B Cybern ; 37(4): 1065-73, 2007 Aug.

Article in English | MEDLINE | ID: mdl-17702303

ABSTRACT

The computational overhead of genetic programming (GP) may be directly addressed without recourse to hardware solutions using active learning algorithms based on the random or dynamic subset selection heuristics (RSS or DSS). This correspondence begins by presenting a family of hierarchical DSS algorithms: RSS-DSS, cascaded RSS-DSS, and the balanced block DSS algorithm, where the latter has not been previously introduced. Extensive benchmarking over four unbalanced real-world binary classification problems with 30,000 to 500,000 training exemplars demonstrates that both the cascade and balanced block algorithms are able to reduce the likelihood of degenerates while providing a significant improvement in classification accuracy relative to the original RSS-DSS algorithm. Moreover, comparison with GP trained without an active learning algorithm indicates that classification performance is not compromised, while training is completed in minutes as opposed to half a day.

Subject(s)

Algorithms , Artificial Intelligence , Databases, Factual , Decision Support Techniques , Information Storage and Retrieval/methods , Models, Theoretical , Pattern Recognition, Automated/methods , Computer Simulation , Database Management Systems
