Publications on Function Approximation (RL)



Abe, Naoki, Alan Biermann and Philip M. Long (nabe@us.ibm.com)
Reinforcement learning with immediate rewards and linear hypotheses
Algorithmica (Postscript - 350KB) Abstract:
We perform theoretical analysis of algorithms for reinforcement learning with immediate rewards usi...

Ackley, David, Michael L. Littman (ackley@cs.unm.edu)
Generalization and scaling in reinforcement learning
D. S. Touretzky, editor, Advances in Neural Information Processing Systems, volume 2, pages 550-557, San Mateo, CA, 1990. Morgan Kaufmann (Postscript - 140KB) Abstract:
In associative reinforcement learning, an environment generates input vectors, a learning system ge...

Baird, Leemon, H. Klopf (leemon@cs.cmu.edu)
Reinforcement Learning with high-dimensional continuous actions
Technical Report WL-TR-93-1147, Wright Laboratory, Wright-Patterson Air Force Base, 1993 (HTML) Abstract:
Many reinforcement learning systems, such as Q-learning, or advantage updating, require that a func...

Baird, Leemon (leemon@cs.cmu.edu)
Residual Algorithms: Reinforcement Learning with Function Approximation
Armand Prieditis & Stuart Russell, eds. Machine Learning: Proceedings of the Twelfth International Conference, 9-12 July, Morgan Kaufmann Publishers, San Francisco, CA (HTML) Abstract:
A number of reinforcement learning algorithms have been developed that are guaranteed to converge t...

Bhulai, Sandjai (sbhulai@cs.vu.nl)
Markov Decision Processes: the control of high-dimensional systems
Ph.D. Thesis, Vrije Universiteit, 2002 (Postscript - ) Abstract:
We develop algorithms for the computation of (nearly) optimal decision rules in high-dimensional sys...

Boyan, Justin, Andrew Moore (Justin.Boyan@cs.cmu.edu)
Generalization in Reinforcement Learning: Safely Approximating the Value Function
Proceedings of Neural Information Processing Systems 7, Morgan Kaufmann, January 1995 (8 pages) (compressed Postscript - 743 KB) Abstract:
A straightforward approach to the curse of dimensionality in reinforcement learning and dynamic pro...

Boyan, Justin, Andrew W. Moore (jab@cs.cmu.edu)
Learning Evaluation Functions for Large Acyclic Domains
ICML-96 (Postscript - 147KB) Abstract:
Some of the most successful recent applications of reinforcement learning have used neural netw...

Carreras, Marc (marcc@eia.udg.es)
A Proposal of a Behavior-based Control Architecture with Reinforcement Learning for an Autonomous Underwater Robot
(pdf - 4 MB) Abstract:
The achievement of a mission with an autonomous robot in an unknown and unstructured environment is ...

Coulom, Rémi (Remi.Coulom@imag.fr)
Reinforcement Learning Using Neural Networks, with Applications to Motor Control
PhD thesis (html - 1Mb) Abstract:
This thesis is a study of practical methods to estimate value functions with feedforward neural netw...

Coulom, Rémi (Remi.Coulom@free.fr)
Feedforward Neural Networks in Reinforcement Learning Applied to High-dimensional Motor Control
Proceedings of ALT2002 (pdf - 139 Kb) Abstract:
Local linear function approximators are often preferred to feedforward neural networks to estimate v...

Dietterich, Thomas (tgd@cs.orst.edu)
State abstraction in MAXQ hierarchical reinforcement learning
unpublished (gzipped Postscript - 102Kb) Abstract:
Many researchers have explored methods for hierarchical reinforcement learning (RL) with tempora...

Dimitrakakis, Christos (olethros@geocities.com)
Reinforcement Learning With Continuous Action Values
unpublished (gzipped Postscript - 120KB) Abstract:
The problem of reinforcement learning in the case of a continuous action set remains largely unsolv...

Ernst, Damien, Pierre Geurts and Louis Wehenkel (dernst@ulg.ac.be)
Iteratively extending time horizon reinforcement learning
Proceedings of ECML 2003 (Postscript - 6 KB) Abstract:
Reinforcement learning aims to determine an (infinite time horizon) optimal control policy fro...

Ernst, Damien, Pierre Geurts, Louis Wehenkel (ernst@montefiore.ulg.ac.be)
Tree-based batch mode reinforcement learning
Journal of Machine Learning Research, April 2005, Volume 6, pp 503-556 (Pdf - 1290 KB) Abstract:
Reinforcement learning aims to determine an optimal control policy from interaction with a system or...

Ernst, Damien, Pierre Geurts, Mevludin Glavic, Louis Wehenkel (ernst@montefiore.ulg.ac.be)
Approximate value iteration in the reinforcement learning context. Application to electrical power system control
International Journal of Emerging Electric Power Systems (.pdf - 780) Abstract:
In this paper we explain how to design intelligent agents able to process the information acquired f...

Ernst, Damien (ernst@montefiore.ulg.ac.be)
Selecting concise sets of samples for a reinforcement learning agent
Conference Proceedings of CIRAS 2005 (pdf - 1036 KB) Abstract:
We derive an algorithm for selecting from the set of samples gathered by a reinforcement learning a...

Ernst, Damien, Raphael Marée, Louis Wehenkel (ernst@montefiore.ulg.ac.be)
Reinforcement learning with raw image pixels as state input
International Workshop on Intelligent Computing in Pattern Analysis/Synthesis (IWICPAS). Proceedings series: Lecture Notes in Computer Science, Volume 4153, pages 446-454, August 2006 Abstract:
We report in this paper some positive simulation results obtained when image pixels are directly u...

Fernandes de Arruda, Edilson (efarruda@gmail.com)
Approximate Dynamic Programming via Direct Search in the Space of Value Function Approximations
European Journal of Operational Research, 2010 Abstract:
This paper deals with approximate value iteration (AVI) algorithms applied to discounted dynamic programming (DP) problems. For a fixed control policy, the span semi-norm of the so-called Bellman residual is shown to be ...

Fernandez, Fernando, Daniel Borrajo (ffernand@grial.uc3m.es)
Vector Quantization Applied to Reinforcement Learning
Proceedings of the Fifth Workshop on RoboCup. Stockholm, Sweden. August, 1999. IJCAI'99 (Postscript - 202 KB) Abstract:
Reinforcement learning has proven to be a set of successful techniques for finding optimal policies ...

Fernandez, Fernando, Daniel Borrajo (ffernand@inf.uc3m.es)
On Determinism Handling while Learning Reduced State Space Representations
European Conference on Artificial Intelligence (Postscript - ) Abstract:
When applying a Reinforcement Learning technique to problems with continuous or very large state spa...

Rivest, Francois, Doina Precup (rivestfr@iro.umontreal.ca)
Combining TD-learning with Cascade-correlation Networks
ICML 2003 Abstract:
Using neural networks to represent value functions in reinforcement learning algorithms often invo...

Ghory, Imran (imran@bits.bris.ac.uk)
Reinforcement Learning in Board Games
Technical Report CSTR-04-004, Department of Computer Science, University of Bristol, May 2004. (pdf - 1097439 bytes) Abstract:
This project investigates the application of the TD(lambda) reinforcement learning algorithm and neu...

Girgin, S., Ph. Preux
Incremental Basis Function Expansion in Reinforcement Learning using Cascade-Correlation Networks
ICML-A 2008

Girgin, S., Ph. Preux
Incremental basis function expansion in reinforcement learning using cascade-correlation networks
Proc. ECAI Workshop, ERLARS, 2008

Girgin, S., Ph. Preux
Basis Expansion In Natural Actor Critic Methods
Recent Advances in Reinforcement Learning, Springer LNAI 5323

Girgin, S., Ph. Preux
Feature discovery in reinforcement learning using genetic programming
Proc. 11th European Conference on Genetic Programming (EUROGP)

Littman, Michael, Anthony Cassandra and Leslie Kaelbling (mlittman@cs.duke.edu)
Learning policies for partially observable environments: Scaling up
Proceedings of the Twelfth International Conference on Machine Learning (Postscript - 315K) Abstract:
Partially observable Markov decision processes (POMDPs) model decision problems in which an agent t...

Loth, M., Ph. Preux, M. Davy
A unified view of TD algorithms - Introducing full-gradient TD and Equi-gradient descent TD
Proc. 11th European Conference on Genetic Programming (EUROGP) Springer, 2008

Loth, M., M. Davy, Ph. Preux
Sparse temporal difference learning using LASSO
Proc. IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning 2007

Matt, Andreas, Georg Regensburger (andreas.matt@uibk.ac.at)
Reinforcement Learning for Several Environments: Theory and Applications
A joint PhD thesis by Andreas Matt and Georg Regensburger Abstract:
Until now reinforcement learning has been applied to learn the optimal behavior for a single environ...

Munos, Remi (munos@cs.cmu.edu)
Reinforcement Learning for Continuous Stochastic Control Problems
Neural Information Processing Systems, 1997 (Postscript - 809KB) Abstract:
This paper is concerned with the problem of Reinforcement Learning for continuous state space and ...

Munos, Remi (munos@cs.cmu.edu)
A convergent Reinforcement Learning algorithm in the continuous case based on a Finite Difference method
IJCAI'1997 (compressed Postscript - 225Kb) Abstract:
In this paper, we propose a convergent Reinforcement Learning algorithm for solving optimal contr...

Munos, Remi (munos@cs.cmu.edu)
A Convergent Reinforcement Learning algorithm in the continuous case: the Finite-Element Reinforcement Learning
International Conference on Machine Learning, 1996 (Postscript - 197Kb) Abstract:
This paper presents a direct reinforcement learning algorithm, called Finite-Element Reinforcem...

Munos, Remi (munos@cs.cmu.edu)
A general convergence method for Reinforcement Learning in the continuous case
European Conference on Machine Learning, 1998 (compressed Postscript - 230Kb) Abstract:
In this paper, we propose a general method for designing convergent Reinforcement Learning algorit...

Munos, Remi (munos@cs.cmu.edu)
Finite-Element methods with local triangulation refinement for continuous Reinforcement Learning problems
European Conference on Machine Learning, 1997 (compressed Postscript - 283Kb) Abstract:
This paper presents a reinforcement learning algorithm for generating an adaptive control for a ...

Munos, Remi, Andrew Moore (munos@cs.cmu.edu)
Variable resolution discretization for high-accuracy solutions of optimal control problems
IJCAI'99 (gzipped Postscript - 315KB) Abstract:
State abstraction is of central importance in reinforcement learning and Markov Decision Processes. ...

Munos, Remi, Leemon Baird, Andrew Moore (munos@cs.cmu.edu)
Gradient Descent Approaches to Neural-Net-Based Solutions of the Hamilton-Jacobi-Bellman Equation
IJCNN'99 (gzipped Postscript - 128KB) Abstract:
In this paper we investigate new approaches to dynamic-programming-based optimal control of contin...

Munos, Remi (remi.munos@polytechnique.fr)
Error Bounds for Approximate Policy Iteration
ICML 2003 (gzipped Postscript - 80 KB) Abstract:
In Dynamic Programming, convergence of algorithms such as Value Iteration or Policy Iteration resul...

Ormoneit, Dirk, Saunak Sen (ormoneit@stat.stanford.edu)
Kernel-Based Reinforcement Learning
Department of Statistics, Stanford University, Technical Report No. 1999-8 (Postscript - 260 KB) Abstract:
Kernel-based methods have recently attracted increased attention in the machine learning literature...

Preux, Ph., Girgin, S., Loth, M.
Feature Discovery in Approximate Dynamic Programming
in Proc. Approximate Dynamic Programming and Reinforcement Learning (ADPRL), IEEE Press, Nashville, Mar-Apr. 2009

Reynolds, Stuart (sir@cs.bham.ac.uk)
The Stability of General Discounted Reinforcement Learning with Linear Function Approximation
UKCI'02 (gzipped Postscript - 80) Abstract:
This paper shows that general discounted return estimating reinforcement learning algorithms ca...

Reynolds, Stuart (sir@cs.bham.ac.uk)
Decision Boundary Partitioning: Variable Resolution Model-Free Reinforcement Learning
ICML-2k (gzipped Postscript - 241 KB) Abstract:
This paper presents a method to refine the resolution of a continuous state Q-function. Q-functions ...

Reynolds, Stuart (sir@cs.bham.ac.uk)
Reinforcement Learning with Exploration
PhD Thesis, School of Computer Science, The University of Birmingham, B15 2TT, UK (gzipped Postscript - 1.1MB) Abstract:
Reinforcement Learning (RL) techniques may be used to find optimal controllers for multistep decisio...

Rivest, Francois, Yoshua Bengio, John Kalaska (rivestfr@iro.umontreal.ca)
Brain Inspired Reinforcement Learning
NIPS 2004 (NIPS 17) Abstract:
Successful application of reinforcement learning algorithms often involves considerable hand-craftin...

Siebel, Nils T, Yohannes Kassahun (nils-NOSPAM@siebel-research.de)
Learning neural networks for visual servoing using evolutionary methods
Proceedings of the 6th International Conference on Hybrid Intelligent Systems (HIS'06), Auckland, New Zealand (PDF - 168 KB) Abstract:
In this article we introduce a method to learn neural networks that solve a visual servoing task. O...

Singh, Satinder
Reinforcement Learning With Soft State Aggregation
NIPS 7 (gzipped Postscript - ) Abstract:
It is widely accepted that the use of more compact representations than lookup tables is crucial to scaling...

Strens, Malcolm (mjstrens@qinetiq.com)
Learning Multi-Agent Search Strategies
The Interdisciplinary Journal of Artificial Intelligence and the Simulation of Behaviour, 1(4), 2003. (pdf - 305KB) Abstract:
We identify a specialised class of reinforcement learning problem in which the agent(s) have the goa...

Sutton, Rich (rich@cs.umass.edu)
Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding
Advances in Neural Information Processing Systems 8, pp. 1038-1044, MIT Press (compressed Postscript - 230 KB) Abstract:
On large problems, reinforcement learning systems must use parameterized function approximat...

Tadepalli, Prasad, DoKyeong Ok (tadepall@cs.orst.edu)
Scaling up average reward reinforcement learning by approximating the domain models and the value function
Proceedings of the Thirteenth International Conference on Machine Learning, pages 471-479. Morgan Kaufmann, 1996 (Postscript - ) Abstract:
Almost all the work in Average-reward Reinforcement Learning (ARL) so far has focused on table-base...

Tadepalli, Prasad, DoKyeong Ok (tadepall@cs.orst.edu)
Model-based Average Reward Reinforcement Learning
Artificial Intelligence (Postscript - 53 pages) Abstract:
Reinforcement Learning (RL) is the study of programs that improve their performance by receiving r...

Tsitsiklis, John, Benjamin Van Roy (jnt@mit.edu)
Feature-Based Methods for Large Scale Dynamic Programming
Machine Learning, Vol. 22, 1996, pp. 59-94 (PDF - 2.8 MB) Abstract:
We develop a methodological framework and present a few different ways in which dynamic programmin...

Van Roy, Benjamin (bvr@stanford.edu)
Learning and Value Function Approximation in Complex Decision Processes
PhD Thesis (Postscript - 1691 KB) Abstract:
In principle, a wide variety of sequential decision problems -- ranging from dynamic resource alloc...

Wilson, Stewart (wilson@smith.rowland.org)
Generalization in the XCS classifier system
Genetic Programming 1998: Proceedings of the Third Annual Conference. San Francisco, CA: Morgan Kaufmann. (gzipped Postscript - 61 KB) Abstract:
This paper studies two changes to XCS, a classifier system in which fitness is based on prediction...

Xu, Xin, Han-gen He and Dewen Hu (xuxin_mail@263.net)
Efficient Reinforcement Learning Using Recursive Least-Squares Methods
Journal of Artificial Intelligence Research, Vol. 16, 2002, pp. 259-292 (gzipped Postscript - 700) Abstract:
The recursive least-squares (RLS) algorithm is one of the most well-known algorithms used in adaptiv...

Yin, ChangMing (cmyin@cs167.net)
Forgetting Algorithm for Q-learning
unpublished (Microsoft Word - 120) Abstract:
...