Class sim.mdp.LQR
java.lang.Object
   |
   +----sim.mdp.MDP
           |
           +----sim.mdp.LQR
- public class LQR
- extends MDP
A Markov Decision Process (MDP) that takes a state and an action
and returns a new state and a reinforcement. This MDP is deterministic.
The state space is the interval [-1,1], discretized by the time step dt.
The agent always sits on one of the discrete states defined by dt and can
perform two actions: go left (-1) or go right (1). The reinforcement
returned after performing an action is the square of the new position on the number line.
The objective of this MDP is to minimize the total discounted reinforcement.
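The dynamics described above can be sketched in a few lines. This is a minimal stand-alone illustration, not the sim.mdp.LQR source: the class LqrStepSketch, its Step holder, and the method names are hypothetical, and two details the description does not state are assumed here, namely that an action moves the agent by exactly dt and that the position is clamped to [-1,1] at the ends of the interval.

```java
/** Hypothetical stand-alone sketch of the LQR dynamics; not the sim.mdp.LQR source. */
final class LqrStepSketch {

    /** Result of one transition: the new position and the reinforcement received. */
    static final class Step {
        final double newState;
        final double reinforcement;

        Step(double newState, double reinforcement) {
            this.newState = newState;
            this.reinforcement = reinforcement;
        }
    }

    /** Apply action -1 (left) or 1 (right) to a position on the grid. */
    static Step step(double state, int action, double dt) {
        if (action != -1 && action != 1) {
            throw new IllegalArgumentException("action must be -1 or 1");
        }
        // Assumption: one action moves the agent by one grid cell of width dt.
        double moved = state + action * dt;
        // Assumption: the agent cannot leave [-1, 1], so the position is clamped.
        double newState = Math.max(-1.0, Math.min(1.0, moved));
        // Documented: the reinforcement is the square of the new position.
        return new Step(newState, newState * newState);
    }

    /** Total discounted reinforcement along a fixed action sequence. */
    static double discountedReturn(double start, int[] actions, double dt, double gamma) {
        double total = 0.0;
        double discount = 1.0;
        double state = start;
        for (int action : actions) {
            Step s = step(state, action, dt);
            total += discount * s.reinforcement;
            discount *= gamma;
            state = s.newState;
        }
        return total;
    }
}
```

Under these assumptions the agent minimizes the total by moving toward 0 and then staying as close to it as the grid allows.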
If the parameter discrete is not set to true (that is, the state space is
continuous), then epochSize should be set when the experiment uses epochwise
training. This might be the case if, for example, the learning algorithm is an
inherently epochwise method such as conjugate gradient. In that case, the
incremental parameter for the experiment, assuming one exists, would be set to false,
the discrete parameter of LQR would be set to false, and epochSize would define the
number of gradients to average over. If discrete is set to true then epochSize is ignored.
This code is (c) 1996 Mance E. Harmon
<harmonme@aa.wpafb.af.mil>,
http://www.aa.wpafb.af.mil/~harmonme
The source and object code may be redistributed freely provided
no fee is charged. If the code is modified, please state so
in the comments.
- Version:
- 1.27, 21 July 97
- Author:
- Mance Harmon
-
discrete
- Whether the state space is discrete (true) or continuous (false).
-
epochSize
- Size of the epoch.
-
LQR()
-
-
actionSize()
- Return the number of elements in the action vector.
-
BNF(int)
- Return the BNF description of how to parse the parameters of this object.
-
findValAct(MatrixD, MatrixD, FunApp, MatrixD, PBoolean)
- Find the value and best action of this state.
-
findValue(MatrixD, MatrixD, PDouble, FunApp, PDouble, MatrixD, PDouble, PBoolean, NumExp, Random)
- Find the max over actions of R + gamma*V(x'), where V(x') is the value of the successor state
given state x, R is the reinforcement, and gamma is the discount factor.
-
getAction(MatrixD, MatrixD, Random)
- Return the next possible action in a state given an action.
-
getState(MatrixD, PDouble, Random)
- Return the next state when doing epoch-wise training.
-
initialAction(MatrixD, MatrixD, Random)
- Return an initial action possible in a given state.
-
initialState(MatrixD, Random)
- Return a start state for epoch-wise training.
-
nextState(MatrixD, MatrixD, MatrixD, PDouble, PBoolean, Random)
- Find a next state given a state and action,
and return the reinforcement received.
-
numActions(MatrixD)
- Return the number of actions in each state.
-
numPairs(PDouble)
- Return the number of state/action pairs for a given dt.
-
numStates(PDouble)
- Return the number of states in this LQR for a given dt.
-
parse(Parser, int)
- Parse the input file to get the parameters for this object.
-
randomAction(MatrixD, MatrixD, Random)
- Generates a random action from those possible.
-
randomState(MatrixD, Random)
- Generates a random state from those possible.
-
setWatchManager(WatchManager, String)
- Register all variables with this WatchManager.
-
stateSize()
- Return the number of elements in the state vector.
-
unparse(Unparser, int)
- Output a description of this object that can be parsed with parse().
discrete
protected boolean discrete
- Whether the state space is discrete (true) or continuous (false).
epochSize
protected IntExp epochSize
- Size of the epoch. Only needed when doing epochwise training on a continuous state space.
LQR
public LQR()
setWatchManager
public void setWatchManager(WatchManager wm,
String name)
- Register all variables with this WatchManager.
Override this if there are internal variables that
should be registered here.
- Overrides:
- setWatchManager in class MDP
numStates
public int numStates(PDouble dt)
- Return the number of states in this LQR for a given dt. dt must evenly divide 2.
- Overrides:
- numStates in class MDP
stateSize
public int stateSize()
- Return the number of elements in the state vector.
- Overrides:
- stateSize in class MDP
initialState
public void initialState(MatrixD state,
Random random) throws MatrixException
- Return a start state for epoch-wise training.
- Throws: MatrixException
- Vector is wrong length.
- Overrides:
- initialState in class MDP
getState
public void getState(MatrixD state,
PDouble dt,
Random random) throws MatrixException
- Return the next state when doing epoch-wise training.
If the state passed in is 1 then the next state is -1.
- Throws: MatrixException
- Vector is wrong length.
- Overrides:
- getState in class MDP
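The wrap-around rule for getState can be illustrated with a small sketch. Only the rule "from 1 the next state is -1" comes from the description; the assumption that other states advance by dt in increasing order is a guess, and EpochSweepSketch and nextSweepState are hypothetical names, not part of sim.mdp.

```java
/** Hypothetical sketch of an epoch-wise sweep over the grid; not the sim.mdp.LQR source. */
final class EpochSweepSketch {

    /** Tolerance for comparing grid positions stored as doubles. */
    private static final double EPS = 1e-9;

    static double nextSweepState(double state, double dt) {
        // Documented: if the state passed in is 1, the next state is -1.
        if (state >= 1.0 - EPS) {
            return -1.0;
        }
        // Assumption: every other state advances one grid cell to the right.
        return Math.min(1.0, state + dt);
    }
}
```

With dt = 0.5, repeated calls starting from -1 would visit -1, -0.5, 0, 0.5, 1 and then return to -1.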
actionSize
public int actionSize()
- Return the number of elements in the action vector.
- Overrides:
- actionSize in class MDP
numActions
public int numActions(MatrixD state)
- Return the number of actions in each state.
- Overrides:
- numActions in class MDP
initialAction
public void initialAction(MatrixD state,
MatrixD action,
Random random) throws MatrixException
- Return an initial action possible in a given state.
- Throws: MatrixException
- Vector is wrong length.
- Overrides:
- initialAction in class MDP
getAction
public void getAction(MatrixD state,
MatrixD action,
Random random) throws MatrixException
- Return the next possible action in a state given an action.
- Throws: MatrixException
- Vector is wrong length.
- Overrides:
- getAction in class MDP
numPairs
public int numPairs(PDouble dt)
- Return the number of state/action pairs for a given dt.
This only works when dt evenly divides 2.
- Overrides:
- numPairs in class MDP
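The counts returned by numStates and numPairs follow from the grid geometry and can be sketched as below. This is an illustration rather than the actual implementation: LqrCountsSketch is a hypothetical class, the plain double argument stands in for PDouble, and it is assumed that both endpoints are states and that both actions are available in every state.

```java
/** Hypothetical sketch of the state and pair counts; not the sim.mdp.LQR source. */
final class LqrCountsSketch {

    /** Two actions per state: left (-1) and right (1). */
    static final int ACTIONS_PER_STATE = 2;

    /** Number of grid points on [-1, 1] with spacing dt, endpoints included. */
    static int numStates(double dt) {
        if (!(dt > 0.0)) {
            throw new IllegalArgumentException("dt must be positive");
        }
        double intervals = 2.0 / dt;
        long rounded = Math.round(intervals);
        // dt must evenly divide the interval length 2 (up to rounding error).
        if (Math.abs(intervals - rounded) > 1e-9) {
            throw new IllegalArgumentException("dt must evenly divide 2");
        }
        return (int) rounded + 1;
    }

    /** Assumption: every state, including -1 and 1, offers both actions. */
    static int numPairs(double dt) {
        return numStates(dt) * ACTIONS_PER_STATE;
    }
}
```

For example, dt = 0.5 gives the five states -1, -0.5, 0, 0.5, 1 and therefore ten state/action pairs under these assumptions.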
randomAction
public void randomAction(MatrixD state,
MatrixD action,
Random random) throws MatrixException
- Generates a random action from those possible.
- Throws: MatrixException
- Vector is wrong length.
- Overrides:
- randomAction in class MDP
randomState
public void randomState(MatrixD state,
Random random) throws MatrixException
- Generates a random state from those possible.
This doesn't generate random states with uniform probability. States
-1 and 1 are half as likely as the other states.
- Throws: MatrixException
- Vector is wrong length.
- Overrides:
- randomState in class MDP
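One way the endpoint bias described for randomState could arise is by drawing a uniform point on [-1,1] and rounding it to the nearest grid point: interior states then collect a full cell of width dt, while -1 and 1 collect only half a cell. The sketch below illustrates that mechanism; it is an assumption about the cause, not the documented implementation, and RandomStateSketch is a hypothetical class.

```java
import java.util.Random;

/** Hypothetical sketch of non-uniform state sampling; not the sim.mdp.LQR source. */
final class RandomStateSketch {

    static double randomState(double dt, Random random) {
        // Uniform draw on [-1, 1).
        double u = -1.0 + 2.0 * random.nextDouble();
        // Round to the nearest grid index; the end cells are only half as wide.
        long index = Math.round((u + 1.0) / dt);
        return -1.0 + index * dt;
    }
}
```

With dt = 0.5 this scheme would select each interior state with probability 1/4 and each endpoint with probability 1/8.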
nextState
public double nextState(MatrixD state,
MatrixD action,
MatrixD newState,
PDouble dt,
PBoolean valueKnown,
Random random) throws MatrixException
- Find the next state given a state and action,
and return the reinforcement received.
The state, action, and newState arguments should all be vectors (single-column matrices).
The duration of the time step, dt, is also returned (via the dt argument). Most MDPs
will make this a constant, given in the parsed string.
- Throws: MatrixException
- if sizes aren't right.
- Overrides:
- nextState in class MDP
findValAct
public double findValAct(MatrixD state,
MatrixD action,
FunApp f,
MatrixD outputs,
PBoolean valueKnown) throws MatrixException
- Find the value and best action of this state. This overwrites the action passed in:
on return it holds the best action for the given state.
- Throws: MatrixException
- column vectors are wrong size or shape
- Overrides:
- findValAct in class MDP
findValue
public double findValue(MatrixD state,
MatrixD optAction,
PDouble gamma,
FunApp f,
PDouble dt,
MatrixD outputs,
PDouble reinforcement,
PBoolean valueKnown,
NumExp explorationFactor,
Random random) throws MatrixException
- Find the max over actions of R + gamma*V(x'), where V(x') is the value of the successor state
given state x, R is the reinforcement, and gamma is the discount factor. This method is used by
the object ValIter (value iteration).
- Throws: MatrixException
- column vectors are wrong size or shape
- Overrides:
- findValue in class MDP
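The one-step backup performed for value iteration can be sketched as follows, assuming the quantity being maximized is R + gamma*V(x'). Everything here is illustrative: OneStepBackupSketch is a hypothetical class, a DoubleUnaryOperator stands in for the FunApp value function, a one-element int array stands in for the optAction vector, and the transition (move by dt, clamped to [-1,1]) is an assumption. The sketch follows the "max" wording of the method description; since the class description states that the objective is to minimize reinforcement, a real implementation might instead take the minimum or negate R.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical sketch of a one-step backup over both actions; not the sim.mdp.LQR source. */
final class OneStepBackupSketch {

    static final int[] ACTIONS = {-1, 1};

    /**
     * Returns the max over actions of R + gamma * V(x') and writes the
     * maximizing action into bestActionOut[0].
     */
    static double findValue(double state, double dt, double gamma,
                            DoubleUnaryOperator value, int[] bestActionOut) {
        double best = Double.NEGATIVE_INFINITY;
        for (int action : ACTIONS) {
            // Assumption: the successor is one cell away, clamped to [-1, 1].
            double next = Math.max(-1.0, Math.min(1.0, state + action * dt));
            // Documented: the reinforcement is the square of the new position.
            double reinforcement = next * next;
            double backedUp = reinforcement + gamma * value.applyAsDouble(next);
            if (backedUp > best) {
                best = backedUp;
                bestActionOut[0] = action;
            }
        }
        return best;
    }
}
```

A caller would pass the current value estimate as a lambda, for example `x -> 0.0` before any learning has taken place, and read the chosen action from the array afterwards.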
BNF
public String BNF(int lang)
- Return the BNF description of how to parse the parameters of this object.
- Overrides:
- BNF in class MDP
unparse
public void unparse(Unparser u,
int lang)
- Output a description of this object that can be parsed with parse().
Also creates the state/action/nextState vectors.
- Overrides:
- unparse in class MDP
- See Also:
- Parsable
parse
public Object parse(Parser p,
int lang) throws ParserException
- Parse the input file to get the parameters for this object.
- Throws: ParserException
- parser didn't find the required token
- Overrides:
- parse in class MDP