Results 1 - 4 of 4
1.
PLoS One ; 11(6): e0157088, 2016.
Article in English | MEDLINE | ID: mdl-27304891

ABSTRACT

In the Bayesian Reinforcement Learning (BRL) setting, an agent interacts with its environment and tries to maximise the rewards it collects, exploiting prior knowledge made available beforehand. Many BRL algorithms have already been proposed, but the benchmarks used to compare them are only relevant for specific cases. This paper addresses the problem and provides a new BRL comparison methodology along with the corresponding open-source library. The methodology defines a comparison criterion that measures the performance of algorithms on large sets of Markov Decision Processes (MDPs) drawn from given probability distributions. To enable the comparison of non-anytime algorithms, the methodology also includes a detailed analysis of each algorithm's computation time requirements. Our library is released with all source code and documentation: it includes three test problems, each with two different prior distributions, and seven state-of-the-art RL algorithms. Finally, the use of the library is illustrated by comparing all the available algorithms, and the results are discussed.
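
A minimal sketch of the comparison criterion described above, assuming a generic agent interface; the `sample_mdp` prior, the tabular MDP representation, and the `RandomAgent` placeholder are hypothetical stand-ins, not the released library's API. MDPs are drawn from a prior distribution, each agent is run on every draw, and its mean discounted return and online computation time are recorded.

```python
# Sketch: score a BRL agent on MDPs drawn from a prior (hypothetical API).
import time
import numpy as np

rng = np.random.default_rng(0)

def sample_mdp(n_states=5, n_actions=2):
    """Draw one tabular MDP from a flat Dirichlet prior over transitions."""
    P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
    R = rng.uniform(0.0, 1.0, size=(n_states, n_actions))
    return P, R

class RandomAgent:
    """Placeholder for a BRL algorithm; acts uniformly at random."""
    def act(self, state):
        return rng.integers(2)
    def learn(self, s, a, r, s2):
        pass  # a real agent would update its posterior here

def score(agent, n_mdps=100, horizon=200, gamma=0.95):
    """Mean discounted return over MDPs drawn from the prior, plus the
    agent's total online computation time."""
    returns, cpu = [], 0.0
    for _ in range(n_mdps):
        P, R = sample_mdp()
        s, ret, disc = 0, 0.0, 1.0
        for _ in range(horizon):
            t0 = time.perf_counter()
            a = agent.act(s)
            cpu += time.perf_counter() - t0   # charge only decision time
            s2 = rng.choice(P.shape[0], p=P[s, a])
            ret += disc * R[s, a]
            disc *= gamma
            agent.learn(s, a, R[s, a], s2)
            s = s2
        returns.append(ret)
    return np.mean(returns), cpu

print(score(RandomAgent()))
```

Comparing agents then amounts to comparing (mean return, computation time) pairs under a common time budget, which is what lets non-anytime algorithms enter the comparison.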


Subject(s)
Algorithms; Bayes Theorem; Benchmarking/methods; Learning/physiology; Reinforcement, Psychology; Animals; Choice Behavior; Computational Biology/methods; Computer Simulation; Decision Making; Humans; Markov Chains; Reproducibility of Results; Reward
2.
Biores Open Access ; 3(5): 233-41, 2014 Oct 01.
Article in English | MEDLINE | ID: mdl-25371860

ABSTRACT

This review shows the potentially ground-breaking impact that mathematical tools may have on the analysis and understanding of HIV dynamics. In the first part, early diagnosis of immunological failure is inferred from the estimation of certain parameters of a mathematical model of HIV infection dynamics. This method is supported by results from an original clinical trial: data collected during the first month after therapy initiation are used to carry out the model identification, and the resulting diagnosis is shown to be consistent with the outcome of patient monitoring at 6 months. In the second part of the review, prospective research results are given for the design of individualized anti-HIV treatments that optimize the recovery of the immune system while minimizing side effects. In this respect, two methods are discussed. The first combines HIV population dynamics with pharmacokinetic and pharmacodynamic models to generate drug treatments using impulsive control systems. The second is based on optimal control theory and uses a recently published differential equation to model the side effects produced by highly active antiretroviral therapy (HAART). The main advantage of these revisited methods is that the drug treatment is computed directly in drug amounts, which is easier for physicians and patients to interpret.
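
For concreteness, a minimal sketch of the kind of dynamics model such methods identify: the classical three-compartment HIV model (target cells T, infected cells I, free virus V) under a constant drug efficacy `eta`. Both the model form and the parameter values below are textbook illustrations, not those of the cited trial.

```python
# Sketch: classical three-compartment HIV dynamics under constant therapy.
import numpy as np
from scipy.integrate import solve_ivp

def hiv_dynamics(t, y, lam=10.0, d=0.01, beta=8e-7, delta=0.7,
                 p=100.0, c=13.0, eta=0.6):
    T, I, V = y
    dT = lam - d * T - (1 - eta) * beta * T * V   # healthy CD4+ target cells
    dI = (1 - eta) * beta * T * V - delta * I     # productively infected cells
    dV = p * I - c * V                            # free virions
    return [dT, dI, dV]

# Simulate 180 days of therapy from a pre-treatment state.
sol = solve_ivp(hiv_dynamics, (0, 180), y0=[600.0, 20.0, 5e4],
                dense_output=True, max_step=1.0)
print("CD4 count at day 30:", sol.sol(30)[0])
print("Viral load at day 180:", sol.sol(180)[2])
```

In the approach described above, parameters such as `beta` and `delta` would be estimated from the first month of measurements; the fitted model then predicts the immunological outcome observed at 6 months.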

4.
Ann Oper Res ; 208(1): 383-416, 2013 Sep 01.
Article in English | MEDLINE | ID: mdl-24049244

ABSTRACT

In this paper, we consider the batch mode reinforcement learning setting, in which the central problem is to learn, from a sample of trajectories, a policy that satisfies or optimizes a performance criterion. We focus on the continuous state space case, for which the usual solution schemes rely on function approximators to represent either the underlying control problem or its value function. As an alternative to function approximators, we rely on the synthesis of "artificial trajectories" from the given sample of trajectories, and show that this idea opens new avenues for designing and analyzing algorithms for batch mode reinforcement learning.
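
A minimal sketch of the "artificial trajectories" idea, under the usual batch assumption that the sample consists of one-step transitions (x, u, r, x'): starting from an initial state, a trajectory is synthesized by repeatedly borrowing the stored transition whose action matches the policy and whose start state is nearest to the current state. The toy system and the nearest-neighbour stitching rule are illustrative choices, not the paper's exact estimators.

```python
# Sketch: estimating a policy's return by stitching sampled transitions.
import numpy as np

rng = np.random.default_rng(1)

# Toy 1-D system: x' = x + 0.1*u + noise, reward = -x^2, actions in {-1, +1}.
def step(x, u):
    return x + 0.1 * u + rng.normal(0, 0.01), -x ** 2

# Collect a batch of one-step transitions (no full trajectories needed).
sample = []
for _ in range(2000):
    x = rng.uniform(-1, 1)
    u = rng.choice([-1.0, 1.0])
    x2, r = step(x, u)
    sample.append((x, u, r, x2))

def artificial_return(policy, x0, horizon=20):
    """Estimate the return of `policy` from x0 by stitching together stored
    transitions whose start state is closest to the current state."""
    x, total = x0, 0.0
    for _ in range(horizon):
        u = policy(x)
        candidates = [t for t in sample if t[1] == u]
        xs, us, rs, x2s = min(candidates, key=lambda t: abs(t[0] - x))
        total += rs
        x = x2s  # jump to the end state of the borrowed transition
    return total

go_to_zero = lambda x: -1.0 if x > 0 else 1.0
print(artificial_return(go_to_zero, x0=0.8))
```

No function approximator appears anywhere: the sample itself plays the role of the model, which is what makes the approach amenable to the kind of analysis the paper develops.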
