
Class sim.errFun.RLErrFun

java.lang.Object
   |
   +----sim.errFun.ErrFun
           |
           +----sim.errFun.RLErrFun

public abstract class RLErrFun
extends ErrFun
All RL algorithm objects inherit from this class. RLErrFun gives RL algorithm objects (e.g. QLearning, AdvantageLearning, ValueIteration) access to the parameters passed to ReinforcementLearning from the HTML file.
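
A rough sketch of how a concrete algorithm plugs into this class is given below. The class name is hypothetical, and the package/import declarations and the remaining abstract members of ErrFun (including initVects, described further down) are omitted, so the sketch is left abstract.

 public abstract class MySimpleRL extends RLErrFun {

     public MySimpleRL() {
         statesOnly = true;  // this hypothetical algorithm learns state
                             // values only, not state/action pairs
     }

     // gamma, dt, exploration, method, phi, mu, and the other protected
     // fields are inherited from RLErrFun and are bound by
     // ReinforcementLearning to the values given in the HTML file.
 }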

This code is (c) 1997 Mance E. Harmon <mharmon@acm.org>, http://www.cs.cmu.edu/~baird/java
The source and object code may be redistributed freely. If the code is modified, please state so in the comments.

Version:
1.03, 21 July 97
Author:
Mance E. Harmon

Variable Index

 o action
An action that can be chosen in a given state
 o dt
The time step size used in transitioning from state x(t) to x(t+1)
 o endTrajectory
Are we at an absorbing state? Used when doing batch training, where the length of a trajectory is the size of a batch.
 o exploration
The percentage of time a random action is chosen for training.
 o gamma
The discount factor
 o incremental
The mode of learning: incremental or epoch-wise.
 o mdp
the mdp to control
 o method
Specifies the method (0=residual, 1=resGrad, 2=direct)
 o methodStr
Specifies method "residual", "resGrad" or "direct"
 o mu
The decay factor for the trace of the resgrad and direct update vectors used to calculate phi.
 o newState
The state reached after performing an action
 o phi
The weighting factor for combining the resgrad and direct update vectors.
 o state
The state of the MDP
 o statesOnly
Used to tell ReinforcementLearning if this algorithm uses states only (as opposed to state/action pairs).
 o trajectories
Should we follow trajectories?
 o valueKnown
A flag stating whether or not we know for certain the value of a state.

Constructor Index

 o RLErrFun()

Method Index

 o initVects(MDP, RLErrFun)
Used to initialize the inputs, state, and action vectors in all RL algorithm objects (not ReinforcementLearning).

Variables

 o mdp
 protected MDP mdp
the mdp to control

 o state
 protected MatrixD state
The state of the MDP

 o newState
 protected MatrixD newState
The state reached after performing an action

 o action
 protected MatrixD action
An action that can be chosen in a given state

 o dt
 protected NumExp dt
The time step size used in transitioning from state x(t) to x(t+1)

 o phi
 protected NumExp phi
The weighting factor for combining the resgrad and direct update vectors; see the sketch following the variable descriptions below.

 o mu
 protected NumExp mu
The decay factor for the trace of the resgrad and direct update vectors used to calculate phi.

 o method
 protected NumExp method
Specifies the method (0=residual, 1=resGrad, 2=direct)

 o gamma
 protected NumExp gamma
The discount factor

 o methodStr
 protected PString methodStr
Specifies method "residual", "resGrad" or "direct"

 o incremental
 protected PBoolean incremental
The mode of learning: incremental or epoch-wise.

 o exploration
 protected NumExp exploration
The percentage of time a random action is chosen for training.

 o statesOnly
 protected boolean statesOnly
Used to tell ReinforcementLearning if this algorithm uses states only (as opposed to state/action pairs). If the RL algorithm being implemented uses only states, then this variable must be set to true.

 o endTrajectory
 protected PBoolean endTrajectory
Are we at an absorbing state? Used when doing batch training, where the length of a trajectory is the size of a batch.

 o valueKnown
 protected PBoolean valueKnown
A flag stating whether or not we know for certain the value of a state.

 o trajectories
 protected PBoolean trajectories
Should we follow trajectories?
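
The method, phi, and mu variables above follow the residual algorithms framework: the weight update that is actually applied is a phi-weighted blend of the direct update and the residual-gradient (resGrad) update, with phi either fixed by the chosen method or adapted from traces of the two update vectors that decay at rate mu. The snippet below is only a sketch of that blend, assuming deltaDirect and deltaResGrad are the two candidate updates for a single weight; it is not the package's actual update code.

 // Sketch only, not the package's update code.  deltaDirect and
 // deltaResGrad are assumed to be the direct and residual-gradient
 // updates for one weight.
 double blendedUpdate(double deltaDirect, double deltaResGrad, double phi) {
     // phi = 0 reproduces the direct method, phi = 1 the pure
     // residual-gradient method; the residual method chooses phi in
     // between, using traces of the update vectors decayed by mu.
     return (1.0 - phi) * deltaDirect + phi * deltaResGrad;
 }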

Constructors

 o RLErrFun
 public RLErrFun()

Methods

 o initVects
 public abstract void initVects(MDP mdp,
                                RLErrFun rl)
Used to initialize the inputs, state, and action vectors in all RL algorithm objects (not ReinforcementLearning).
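
As a hedged illustration (the field assignments are an assumption about how the shared vectors might be wired up, not the documented contract), a concrete subclass's initVects could simply point its own fields at the MDP and at the vectors already held by the RLErrFun object passed in, so that every RL object operates on the same state, newState, and action:

 // Hypothetical implementation in a concrete subclass; the exact wiring
 // expected by ReinforcementLearning may differ.
 public void initVects(MDP mdp, RLErrFun rl) {
     this.mdp      = mdp;          // the MDP to control
     this.state    = rl.state;     // share the current state vector
     this.newState = rl.newState;  // share the successor state vector
     this.action   = rl.action;    // share the action vector
 }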

