Class sim.mdp.HC
java.lang.Object
|
+----sim.mdp.MDP
|
+----sim.mdp.HC
- public class HC
- extends MDP
A Markov Decision Process that takes a state and action
and returns a new state and a reinforcement. This MDP is deterministic.
This code is (c) 1996 Mance E. Harmon
<mharmon@acm.org>,
http://eureka1.aa.wpafb.af.mil
The source and object code may be redistributed freely provided
no fee is charged. If the code is modified, please state so
in the comments.
- Version:
- 1.0, 13 May 97
- Author:
- Mance Harmon
-
epochSize
- Size of the epoch.
-
HC()
-
-
actionSize()
- Return the number of elements in the action vector.
-
BNF(int)
- Return the BNF description of how to parse the parameters of this object.
-
findValAct(MatrixD, MatrixD, FunApp, MatrixD, PBoolean)
- Find the value and best action of this state.
-
findValue(MatrixD, MatrixD, PDouble, FunApp, PDouble, MatrixD, PDouble, PBoolean, NumExp, Random)
- Find the max over actions of R + gamma V(x'), where V(x') is the value of the successor state
given state x, R is the reinforcement, and gamma is the discount factor.
-
getAction(MatrixD, MatrixD, Random)
- Return the next possible action in a state given an action.
-
getState(MatrixD, PDouble, Random)
- Return the next state when doing epoch-wise training.
-
initialAction(MatrixD, MatrixD, Random)
- Return an initial action possible in a given state.
-
initialState(MatrixD, Random)
- Return a start state for epoch-wise training.
-
nextState(MatrixD, MatrixD, MatrixD, PDouble, PBoolean, Random)
- Find a next state given a state and action,
and return the reinforcement received.
-
numActions(MatrixD)
- Return the number of actions in each state.
-
numPairs(PDouble)
- Return the number of state/action pairs for a given dt.
-
numStates(PDouble)
- Return the number of states in this MDP.
-
parse(Parser, int)
- Parse the input file to get the parameters for this object.
-
randomAction(MatrixD, MatrixD, Random)
- Generates a random action from the four possible (missile, plane) pairs: {(-1,-1), (-1,1), (1,-1), (1,1)}.
-
randomState(MatrixD, Random)
- Generates a random state from those possible.
-
setWatchManager(WatchManager, String)
- Register all variables with this WatchManager.
-
stateSize()
- Return the number of elements in the state vector.
-
unparse(Unparser, int)
- Output a description of this object that can be parsed with parse().
epochSize
protected IntExp epochSize
- Size of the epoch. Only needed when doing epoch-wise training on a continuous state space.
HC
public HC()
setWatchManager
public void setWatchManager(WatchManager wm,
String name)
- Register all variables with this WatchManager.
- Overrides:
- setWatchManager in class MDP
numStates
public int numStates(PDouble dt)
- Return the number of states in this MDP. This is always epochSize because the state space is continuous.
- Overrides:
- numStates in class MDP
stateSize
public int stateSize()
- Return the number of elements in the state vector.
- Overrides:
- stateSize in class MDP
initialState
public void initialState(MatrixD state,
Random random) throws MatrixException
- Return a start state for epoch-wise training. Note that this is not the state of either player,
but rather the difference in the state variables of the two players.
- Throws: MatrixException
- Vector is wrong length.
- Overrides:
- initialState in class MDP
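To illustrate a state expressed as a difference between two players, the sketch below subtracts one player's state variables from the other's. The class name RelativeState, the method difference, the order of subtraction, and the use of plain double arrays in place of MatrixD are all assumptions made for this example; the source does not say which variables are involved or in which order they are subtracted.

```java
/** Hypothetical helper, not part of sim.mdp: builds a relative state vector. */
final class RelativeState {
    /**
     * Returns a[i] - b[i] for each state variable.
     * IllegalArgumentException stands in for MatrixException here.
     */
    static double[] difference(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vector is wrong length.");
        }
        double[] diff = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            diff[i] = a[i] - b[i];
        }
        return diff;
    }
}
```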
getState
public void getState(MatrixD state,
PDouble dt,
Random random) throws MatrixException
- Return the next state when doing epoch-wise training.
Because this MDP is defined with continuous state space, this simply returns a random state.
- Throws: MatrixException
- Vector is wrong length.
- Overrides:
- getState in class MDP
actionSize
public int actionSize()
- Return the number of elements in the action vector.
- Overrides:
- actionSize in class MDP
numActions
public int numActions(MatrixD state)
- Return the number of actions in each state.
- Overrides:
- numActions in class MDP
initialAction
public void initialAction(MatrixD state,
MatrixD action,
Random random) throws MatrixException
- Return an initial action possible in a given state.
- Throws: MatrixException
- Vector is wrong length.
- Overrides:
- initialAction in class MDP
getAction
public void getAction(MatrixD state,
MatrixD action,
Random random) throws MatrixException
- Return the next possible action in a state given an action.
- Throws: MatrixException
- Vector is wrong length.
- Overrides:
- getAction in class MDP
numPairs
public int numPairs(PDouble dt)
- Return the number of state/action pairs for a given dt.
Because the state space is continuous, this returns the number of actions available in a state (4)
times the pseudo-epoch size supplied as a parameter.
- Overrides:
- numPairs in class MDP
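The arithmetic described for numPairs can be sketched as follows. PairCount and its members are hypothetical names, and passing the epoch size as a plain int is an assumption; the real method takes a PDouble dt and presumably reads the epoch size from the object.

```java
/** Hypothetical illustration of the numPairs arithmetic; not part of sim.mdp. */
final class PairCount {
    /** Number of actions per state, as stated in the numPairs description. */
    static final int ACTIONS_PER_STATE = 4;

    /** Returns the number of actions per state times the pseudo-epoch size. */
    static int numPairs(int epochSize) {
        return ACTIONS_PER_STATE * epochSize;
    }
}
```

With an epoch size of 250, for example, PairCount.numPairs(250) yields 1000 state/action pairs.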
randomAction
public void randomAction(MatrixD state,
MatrixD action,
Random random) throws MatrixException
- Generates a random action from the four possible (missile, plane) pairs: {(-1,-1), (-1,1), (1,-1), (1,1)}.
- Throws: MatrixException
- Vector is wrong length.
- Overrides:
- randomAction in class MDP
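A minimal sketch of drawing one of the four (missile, plane) pairs is shown below. It assumes a uniform draw, which the source does not state, and it uses a double array in place of MatrixD and IllegalArgumentException in place of MatrixException. RandomActionSketch is a hypothetical name.

```java
import java.util.Random;

/** Hypothetical illustration of random action selection; not part of sim.mdp. */
final class RandomActionSketch {
    /** The four (missile, plane) actions listed in the description. */
    static final int[][] ACTIONS = { {-1, -1}, {-1, 1}, {1, -1}, {1, 1} };

    /**
     * Writes one of the four pairs, chosen uniformly (an assumption),
     * into action. The action vector must have exactly two elements.
     */
    static void randomAction(double[] action, Random random) {
        if (action.length != 2) {
            throw new IllegalArgumentException("Vector is wrong length.");
        }
        int[] pick = ACTIONS[random.nextInt(ACTIONS.length)];
        action[0] = pick[0]; // missile component
        action[1] = pick[1]; // plane component
    }
}
```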
randomState
public void randomState(MatrixD state,
Random random) throws MatrixException
- Generates a random state from those possible.
- Throws: MatrixException
- Vector is wrong length.
- Overrides:
- randomState in class MDP
nextState
public double nextState(MatrixD state,
MatrixD action,
MatrixD newState,
PDouble dt,
PBoolean valueKnown,
Random random) throws MatrixException
- Find a next state given a state and action,
and return the reinforcement received.
The state, action, and newState arguments should all be vectors (single-column matrices).
The duration of the time step, dt, is also returned. Most MDPs
make this a constant, given in the parsed string.
- Throws: MatrixException
- if sizes aren't right.
- Overrides:
- nextState in class MDP
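The calling convention described above (successor written into newState, step duration written into dt, reinforcement returned) can be sketched as below. The dynamics and the reinforcement are toy placeholders and are NOT the HC dynamics, which this page does not give. DoubleRef is a hypothetical stand-in for PDouble, the 0.1 time step is an arbitrary constant, and double arrays replace MatrixD.

```java
/** Hypothetical illustration of the nextState calling convention; not part of sim.mdp. */
final class NextStateSketch {
    /** Hypothetical stand-in for PDouble: a mutable holder for one double. */
    static final class DoubleRef {
        double val;
    }

    /** Constant time step (an arbitrary value chosen for this example). */
    static final double DT = 0.1;

    /**
     * Toy deterministic transition: each state variable moves by DT times the
     * matching action component. Writes the successor into newState, stores
     * the step duration in dt, and returns a made-up reinforcement (the
     * negative squared length of the new state).
     */
    static double nextState(double[] state, double[] action,
                            double[] newState, DoubleRef dt) {
        if (state.length != action.length || state.length != newState.length) {
            throw new IllegalArgumentException("sizes aren't right");
        }
        double reinforcement = 0.0;
        for (int i = 0; i < state.length; i++) {
            newState[i] = state[i] + DT * action[i];
            reinforcement -= newState[i] * newState[i];
        }
        dt.val = DT;
        return reinforcement;
    }
}
```

Because the transition is deterministic, calling it twice with the same state and action produces the same successor and the same reinforcement.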
findValAct
public double findValAct(MatrixD state,
MatrixD action,
FunApp f,
MatrixD outputs,
PBoolean valueKnown) throws MatrixException
- Find the value and best action of this state. This overwrites the action passed in,
replacing it with the best action for the given state.
- Throws: MatrixException
- column vectors are wrong size or shape
- Overrides:
- findValAct in class MDP
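The in-place behaviour of findValAct can be sketched as an argmax over the four actions. The sketch assumes that "best" means the highest value under some state/action scoring function q, which stands in for the FunApp f; the real method may define "best" differently. FindValActSketch and q are hypothetical names, and double arrays replace MatrixD.

```java
import java.util.function.ToDoubleBiFunction;

/** Hypothetical illustration of findValAct's in-place result; not part of sim.mdp. */
final class FindValActSketch {
    /** The four (missile, plane) actions available in every state. */
    static final double[][] ACTIONS = { {-1, -1}, {-1, 1}, {1, -1}, {1, 1} };

    /**
     * Scores every candidate action with q(state, a), copies the highest-scoring
     * one into action (overwriting the caller's values), and returns its score.
     */
    static double findValAct(double[] state, double[] action,
                             ToDoubleBiFunction<double[], double[]> q) {
        if (action.length != 2) {
            throw new IllegalArgumentException("column vectors are wrong size or shape");
        }
        double best = Double.NEGATIVE_INFINITY;
        double[] bestAction = ACTIONS[0];
        for (double[] a : ACTIONS) {
            double v = q.applyAsDouble(state, a);
            if (v > best) {
                best = v;
                bestAction = a;
            }
        }
        action[0] = bestAction[0];
        action[1] = bestAction[1];
        return best;
    }
}
```

A caller that still needs the original action should copy it before the call, since the argument is replaced.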
findValue
public double findValue(MatrixD state,
MatrixD optAction,
PDouble gamma,
FunApp f,
PDouble dt,
MatrixD outputs,
PDouble reinforcement,
PBoolean valueKnown,
NumExp explorationFactor,
Random random) throws MatrixException
- Find the max over actions of R + gamma V(x'), where V(x') is the value of the successor state
given state x, R is the reinforcement, and gamma is the discount factor. This method is used by
the ValIter object (value iteration).
- Throws: MatrixException
- column vectors are wrong size or shape
- Overrides:
- findValue in class MDP
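A minimal sketch of a value-iteration backup follows, assuming the backup takes the common form max over actions of R + gamma * V(x'). The Model interface, the value function v (standing in for the FunApp f), and the class name FindValueSketch are hypothetical; the real method also takes dt, outputs, valueKnown, explorationFactor, and random, which are omitted here.

```java
import java.util.function.ToDoubleFunction;

/** Hypothetical illustration of a value-iteration backup; not part of sim.mdp. */
final class FindValueSketch {
    /** Hypothetical deterministic model: fills newState and returns the reinforcement R. */
    interface Model {
        double step(double[] state, double[] action, double[] newState);
    }

    /** The four (missile, plane) actions available in every state. */
    static final double[][] ACTIONS = { {-1, -1}, {-1, 1}, {1, -1}, {1, 1} };

    /**
     * Returns max over actions of R + gamma * V(x') and writes the maximizing
     * action into optAction.
     */
    static double findValue(double[] state, double[] optAction, double gamma,
                            Model model, ToDoubleFunction<double[]> v) {
        if (optAction.length != 2) {
            throw new IllegalArgumentException("column vectors are wrong size or shape");
        }
        double best = Double.NEGATIVE_INFINITY;
        double[] next = new double[state.length];
        for (double[] a : ACTIONS) {
            double r = model.step(state, a, next);            // R and x'
            double backedUp = r + gamma * v.applyAsDouble(next);
            if (backedUp > best) {
                best = backedUp;
                optAction[0] = a[0];
                optAction[1] = a[1];
            }
        }
        return best;
    }
}
```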
BNF
public String BNF(int lang)
- Return the BNF description of how to parse the parameters of this object.
- Overrides:
- BNF in class MDP
unparse
public void unparse(Unparser u,
int lang)
- Output a description of this object that can be parsed with parse().
Also creates the state/action/nextState vectors.
- Overrides:
- unparse in class MDP
- See Also:
- Parsable
parse
public Object parse(Parser p,
int lang) throws ParserException
- Parse the input file to get the parameters for this object.
- Throws: ParserException
- parser didn't find the required token
- Overrides:
- parse in class MDP