
Class sim.errFun.QLearning

java.lang.Object
   |
   +----sim.errFun.ErrFun
           |
           +----sim.errFun.RLErrFun
                   |
                   +----sim.errFun.QLearning

public class QLearning
extends RLErrFun
Performs Q-learning in its residual-gradient, residual, or direct form, given a Markov decision process, a function approximator, and a gradient-descent algorithm. This code works with both stochastic and deterministic systems.
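
The three variants named above differ only in how much of the successor state's gradient enters the weight update. The sketch below illustrates this for a linear function approximator, following the usual formulation of residual algorithms. It is not taken from this class: the name ResidualQSketch, the method step, and the mixing parameter phi are hypothetical, and the features of the successor state are assumed to be already paired with its greedy action. With phi = 0 the step is the direct update, with phi = 1 it is the residual-gradient update, and values in between give a residual update.

```java
import java.util.Arrays;

/** Hypothetical illustration only; not part of sim.errFun. */
class ResidualQSketch {

    /** Q(x,u) for a linear approximator: dot product of weights and features. */
    static double q(double[] w, double[] features) {
        double sum = 0.0;
        for (int i = 0; i < w.length; i++) {
            sum += w[i] * features[i];
        }
        return sum;
    }

    /**
     * One gradient-descent step on a single observed transition.
     * featNow:      features of the original (state, action) pair
     * featNextBest: features of the successor state with its greedy action
     * phi:          0 = direct, 1 = residual gradient, in between = residual
     */
    static double[] step(double[] w, double[] featNow, double[] featNextBest,
                         double reward, double gamma, double alpha, double phi) {
        // Bellman residual for this training example.
        double residual = reward + gamma * q(w, featNextBest) - q(w, featNow);
        double[] updated = w.clone();
        for (int i = 0; i < w.length; i++) {
            // For a linear approximator, dQ/dw_i is simply the i-th feature.
            double direction = phi * gamma * featNextBest[i] - featNow[i];
            updated[i] = w[i] - alpha * residual * direction;
        }
        return updated;
    }

    public static void main(String[] args) {
        double[] w = {0.5, -0.25};
        double[] featNow = {1.0, 0.0};
        double[] featNext = {0.0, 1.0};
        for (double phi : new double[] {0.0, 0.5, 1.0}) {
            double[] next = step(w, featNow, featNext, 1.0, 0.9, 0.1, phi);
            System.out.println("phi=" + phi + " -> " + Arrays.toString(next));
        }
    }
}
```

In this toy transition the direct update (phi = 0) leaves the weight tied to the successor's feature untouched, while the residual-gradient update (phi = 1) also moves it, since both ends of the transition contribute to the gradient of the squared Bellman residual.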

This code is (c) 1996 Mance E. Harmon <harmonme@aa.wpafb.af.mil>, http://www.cs.cmu.edu/~baird
The source and object code may be redistributed freely provided no fee is charged. If the code is modified, please state so in the comments.

Version:
1.03, 21 July 97
Author:
Mance E. Harmon

Variable Index

 o dEdInQ1
Gradient of the mean squared error for one training example with respect to the inputs of the original state.
 o dEdWeightsQ1
Gradient of the mean squared error for one training example with respect to the weights of the original state.
 o oldAction
A copy of the original action.
 o oldState
A copy of the original state.
 o rnd
The random number generator that will be used for this object.

Constructor Index

 o QLearning()

Method Index

 o BNF(int)
Return the BNF description of how to parse the parameters of this object.
 o evaluate(Random, boolean, boolean, boolean)
Return the scalar output for the current dInput vector.
 o findGradient()
Update the fGradient vector based on the current fInput vector.
 o initVects(MDP, RLErrFun)
Create inputs, state, and action vectors.
 o parse(Parser, int)
Parse the input file to get the parameters for this object.
 o unparse(Unparser, int)
Output a description of this object that can be parsed with parse().

Variables

 o dEdWeightsQ1
 protected MatrixD dEdWeightsQ1
Gradient of the mean squared error for one training example with respect to the weights of the original state.

 o dEdInQ1
 protected MatrixD dEdInQ1
Gradient of the mean squared error for one training example with respect to the inputs of the original state.

 o oldState
 protected MatrixD oldState
A copy of the original state.

 o oldAction
 protected MatrixD oldAction
A copy of the original action.

 o rnd
 protected Random rnd
The random number generator that will be used for this object. This is a copy of the generator passed to evaluate().

Constructors

 o QLearning
 public QLearning()

Methods

 o BNF
 public String BNF(int lang)
Return the BNF description of how to parse the parameters of this object.

Overrides:
BNF in class ErrFun
 o unparse
 public void unparse(Unparser u,
                     int lang)
Output a description of this object that can be parsed with parse().

Overrides:
unparse in class ErrFun
See Also:
Parsable
 o parse
 public Object parse(Parser p,
                     int lang) throws ParserException
Parse the input file to get the parameters for this object.

Throws: ParserException
if the parser did not find the required token
Overrides:
parse in class ErrFun
 o initVects
 public void initVects(MDP mdp,
                       RLErrFun rl)
Create inputs, state, and action vectors. Also, create any vectors that might be specific to this module.

Overrides:
initVects in class RLErrFun
 o evaluate
 public double evaluate(Random rnd,
                        boolean willFindDeriv,
                        boolean willFindHess,
                        boolean rememberNoise)
Return the scalar output for the current dInput vector.

Overrides:
evaluate in class ErrFun
 o findGradient
 public void findGradient()
Update the fGradient vector based on the current fInput vector.

Overrides:
findGradient in class ErrFun
