Class sim.errFun.RLErrFun
java.lang.Object
|
+----sim.errFun.ErrFun
|
+----sim.errFun.RLErrFun
- public abstract class RLErrFun
- extends ErrFun
All RL objects inherit from this class. RLErrFun allows RL algorithm objects (e.g. QLearning,
AdvantageLearning, ValueIteration) to access the parameters passed to ReinforcementLearning from the
html file, as the sketch below illustrates.
This code is (c) 1997 Mance E. Harmon
<mharmon@acm.org>,
http://www.cs.cmu.edu/~baird/java
The source and object code may be redistributed freely.
If the code is modified, please state so in the comments.
- Version:
- 1.03, 21 July 97
- Author:
- Mance E. Harmon
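As a minimal sketch of how that access works, a subclass only has to extend RLErrFun to see the shared parameter fields listed in the index below. The class name ExampleRL is hypothetical, the import of MDP is assumed, and ErrFun's remaining abstract members are omitted (hence the abstract modifier); a concrete algorithm such as QLearning would implement those as well.

    package sim.errFun;
    // (an import for MDP is assumed here; its package is not shown on this page)

    // Hypothetical sketch: an RL algorithm object that, like QLearning or
    // AdvantageLearning, inherits the protected parameter fields from RLErrFun.
    public abstract class ExampleRL extends RLErrFun {

        // Copy the MDP and the parameter references that ReinforcementLearning
        // filled in from the html file, so this object reads the same values.
        public void initVects(MDP mdp, RLErrFun rl) {
            this.mdp         = mdp;             // the MDP to control
            this.state       = rl.state;        // MatrixD: current state of the MDP
            this.newState    = rl.newState;     // MatrixD: state reached after an action
            this.action      = rl.action;       // MatrixD: action chosen in the current state
            this.gamma       = rl.gamma;        // NumExp: discount factor
            this.dt          = rl.dt;           // NumExp: time step size
            this.exploration = rl.exploration;  // NumExp: exploration rate
            this.incremental = rl.incremental;  // PBoolean: incremental vs. epoch-wise
            this.statesOnly  = false;           // this hypothetical algorithm uses state/action pairs
        }
    }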
-
action
- An action that can be chosen in a given state
-
dt
- The time step size used in transitioning from state x(t) to x(t+1)
-
endTrajectory
- Are we at an absorbing state? Used during batch training, where the length of a trajectory is the size of a batch.
-
exploration
- The percentage of the time a random action is chosen during training.
-
gamma
- The discount factor
-
incremental
- The mode of learning: incremental or epoch-wise.
-
mdp
- The MDP to control
-
method
- Specifies the method (0=residual, 1=resGrad, 2=direct)
-
methodStr
- Specifies the method as a string: "residual", "resGrad", or "direct"
-
mu
- The decay factor for the trace of the resgrad and direct update vectors used to calculate phi.
-
newState
- The state reached after performing an action
-
phi
- The weighting factor for combining the resgrad and direct update vectors.
-
state
- The state of the MDP
-
statesOnly
- Used to tell ReinforcementLearning if this algorithm uses states only (as opposed to state/action pairs).
-
trajectories
- Should we follow trajectories?
-
valueKnown
- A flag stating whether the value of a state is known for certain.
-
RLErrFun()
-
-
initVects(MDP, RLErrFun)
- Used to initialize the inputs, state, and action vectors in all RL algorithm objects (not ReinforcementLearning).
mdp
protected MDP mdp
- The MDP to control
state
protected MatrixD state
- The state of the MDP
newState
protected MatrixD newState
- The state reached after performing an action
action
protected MatrixD action
- An action that can be chosen in a given state
dt
protected NumExp dt
- The time step size used in transitioning from state x(t) to x(t+1)
phi
protected NumExp phi
- The weighting factor for combining the resgrad and direct update vectors.
mu
protected NumExp mu
- The decay factor for the trace of the resgrad and direct update vectors used to calculate phi.
method
protected NumExp method
- Specifies the method (0=residual, 1=resGrad, 2=direct)
gamma
protected NumExp gamma
- The discount factor
methodStr
protected PString methodStr
- Specifies the method as a string: "residual", "resGrad", or "direct"
incremental
protected PBoolean incremental
- The mode of learning: incremental or epoch-wise.
exploration
protected NumExp exploration
- The percentage of the time a random action is chosen during training.
statesOnly
protected boolean statesOnly
- Used to tell ReinforcementLearning if this algorithm uses states only (as opposed to state/action pairs).
If the RL algorithm being implemented uses only states, then this variable must be set to true.
endTrajectory
protected PBoolean endTrajectory
- Are we at an absorbing state? Used during batch training, where the length of a trajectory is the size of a batch.
valueKnown
protected PBoolean valueKnown
- A flag stating whether the value of a state is known for certain.
trajectories
protected PBoolean trajectories
- Should we follow trajectories?
RLErrFun
public RLErrFun()
initVects
public abstract void initVects(MDP mdp,
RLErrFun rl)
- Used to initialize the inputs, state, and action vectors in all RL algorithm objects (not ReinforcementLearning).
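A hedged sketch of the intended call pattern follows; the names algorithm, mdp, and source are placeholders, and this page does not show which object ReinforcementLearning actually passes as the second argument.

    // Hypothetical call: give the algorithm object the MDP to control and the
    // RLErrFun holding the parameters parsed from the html file.
    algorithm.initVects(mdp, source);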