
Class sim.mdp.HCDemo

java.lang.Object
   |
   +----sim.mdp.MDP
           |
           +----sim.mdp.HCDemo

public class HCDemo
extends MDP
A Markov Decision Process that takes a state and an action and returns a new state and a reinforcement. This module demonstrates the capabilities of the VRMLInterface module. It was created by modifying the HC MDP module. The policy is hardcoded so that the missile moves directly toward the plane and the plane moves at a right angle to the missile.

This code is (c) 1996 Mance E. Harmon <mharmon@acm.org>, http://eureka1.aa.wpafb.af.mil
The source and object code may be redistributed freely provided no fee is charged. If the code is modified, please state so in the comments.

Version:
1.0, 13 May 97
Author:
Mance Harmon
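
The hardcoded policy described above can be illustrated with a small geometric sketch. It is not taken from the HCDemo source: the class name, the method names, the 2-D position arguments, and the choice of a counterclockwise rotation for the plane are all assumptions made for illustration.

```java
/** Hypothetical sketch of a hardcoded pursuit policy; not the HCDemo source. */
final class HardcodedPursuitSketch {

    /** Unit vector pointing from the missile (mx, my) toward the plane (px, py). */
    static double[] missileHeading(double mx, double my, double px, double py) {
        double dx = px - mx;
        double dy = py - my;
        double dist = Math.hypot(dx, dy);
        if (dist == 0.0) {
            // The two players coincide, so there is no preferred direction.
            return new double[] {0.0, 0.0};
        }
        return new double[] {dx / dist, dy / dist};
    }

    /**
     * Plane heading at a right angle to the missile's heading.
     * Assumption: the rotation is 90 degrees counterclockwise.
     */
    static double[] planeHeading(double[] missileHeading) {
        return new double[] {-missileHeading[1], missileHeading[0]};
    }
}
```

For a missile at the origin and a plane at (3, 4), missileHeading returns (0.6, 0.8) and planeHeading returns (-0.8, 0.6); the dot product of the two headings is zero.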

Variable Index

 o epochSize
Size of the epoch.

Constructor Index

 o HCDemo()

Method Index

 o actionSize()
Return the number of elements in the action vector.
 o BNF(int)
Return the BNF description of how to parse the parameters of this object.
 o findValAct(MatrixD, MatrixD, FunApp, MatrixD, PBoolean)
Find the value and best action of this state.
 o findValue(MatrixD, MatrixD, PDouble, FunApp, PDouble, MatrixD, PDouble, PBoolean, NumExp, Random)
Find the maximum over actions of R + gamma*V(x'), where V(x') is the value of the successor state x' given state x, R is the reinforcement, and gamma is the discount factor.
 o getAction(MatrixD, MatrixD, Random)
Return the next possible action in a state given an action.
 o getState(MatrixD, PDouble, Random)
Return the next state when doing epoch-wise training.
 o initialAction(MatrixD, MatrixD, Random)
Return an initial action possible in a given state.
 o initialState(MatrixD, Random)
Return a start state for epoch-wise training.
 o nextState(MatrixD, MatrixD, MatrixD, PDouble, PBoolean, Random)
Find a next state given a state and action, and return the reinforcement received.
 o numActions(MatrixD)
Return the number of actions in each state.
 o numPairs(PDouble)
Return the number of state/action pairs for a given dt.
 o numStates(PDouble)
Return the number of states in this MDP.
 o parse(Parser, int)
Parse the input file to get the parameters for this object.
 o randomAction(MatrixD, MatrixD, Random)
Generates a random action from those possible, as (missile, plane) pairs: {(-1,-1), (-1,1), (1,-1), (1,1)}.
 o randomState(MatrixD, Random)
Generates a random state from those possible.
 o setWatchManager(WatchManager, String)
Register all variables with this WatchManager.
 o stateSize()
Return the number of elements in the state vector.
 o unparse(Unparser, int)
Output a description of this object that can be parsed with parse().

Variables

 o epochSize
 protected IntExp epochSize
Size of the epoch. Only needed when doing epoch-wise training on a continuous state space.

Constructors

 o HCDemo
 public HCDemo()

Methods

 o setWatchManager
 public void setWatchManager(WatchManager wm,
                             String name)
Register all variables with this WatchManager.

Overrides:
setWatchManager in class MDP
 o numStates
 public int numStates(PDouble dt)
Return the number of states in this MDP. This is always epochSize because the state space is continuous.

Overrides:
numStates in class MDP
 o stateSize
 public int stateSize()
Return the number of elements in the state vector.

Overrides:
stateSize in class MDP
 o initialState
 public void initialState(MatrixD state,
                          Random random) throws MatrixException
Return a start state for epoch-wise training. This is actually NOT the state, but rather the difference in the state variables of the two players.

Throws: MatrixException
Vector is wrong length.
Overrides:
initialState in class MDP
 o getState
 public void getState(MatrixD state,
                      PDouble dt,
                      Random random) throws MatrixException
Return the next state when doing epoch-wise training. Because this MDP is defined with continuous state space, this simply returns a random state.

Throws: MatrixException
Vector is wrong length.
Overrides:
getState in class MDP
 o actionSize
 public int actionSize()
Return the number of elements in the action vector.

Overrides:
actionSize in class MDP
 o numActions
 public int numActions(MatrixD state)
Return the number of actions in each state.

Overrides:
numActions in class MDP
 o initialAction
 public void initialAction(MatrixD state,
                           MatrixD action,
                           Random random) throws MatrixException
Return an initial action possible in a given state.

Throws: MatrixException
Vector is wrong length.
Overrides:
initialAction in class MDP
 o getAction
 public void getAction(MatrixD state,
                       MatrixD action,
                       Random random) throws MatrixException
Return the next possible action in a state given an action.

Throws: MatrixException
Vector is wrong length.
Overrides:
getAction in class MDP
 o numPairs
 public int numPairs(PDouble dt)
Return the number of state/action pairs for a given dt. Because the states are continuous, this returns the number of actions in a given state (4) times the pseudo-epoch size supplied as a parameter.

Overrides:
numPairs in class MDP
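
The count described for numPairs is plain arithmetic, sketched below. The class and constant names are hypothetical, and an int stands in for the IntExp epochSize parameter, whose accessor is not documented on this page.

```java
/** Hypothetical sketch of the pair count; not the HCDemo source. */
final class PairCountSketch {

    /** Number of actions available in every state, as stated for numPairs. */
    static final int ACTIONS_PER_STATE = 4;

    /** State/action pairs = actions per state times the pseudo-epoch size. */
    static int numPairs(int epochSize) {
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        return Math.multiplyExact(ACTIONS_PER_STATE, epochSize);
    }
}
```

With an epoch size of 250, for example, this gives 1000 pairs.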
 o randomAction
 public void randomAction(MatrixD state,
                          MatrixD action,
                          Random random) throws MatrixException
Generates a random action from those possible, as (missile, plane) pairs: {(-1,-1), (-1,1), (1,-1), (1,1)}.

Throws: MatrixException
Vector is wrong length.
Overrides:
randomAction in class MDP
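
Choosing among the four action pairs listed for randomAction might look like the sketch below. It uses a plain double[] in place of MatrixD and assumes a uniform draw; the page does not state the distribution, and the class name is hypothetical.

```java
import java.util.Random;

/** Hypothetical sketch of random action selection; not the HCDemo source. */
final class RandomActionSketch {

    /** The four (missile, plane) action pairs listed in the documentation. */
    static final double[][] ACTIONS = {
        {-1, -1}, {-1, 1}, {1, -1}, {1, 1}
    };

    /** Writes one action pair, drawn uniformly (an assumption), into the caller's array. */
    static void randomAction(double[] action, Random random) {
        if (action.length != 2) {
            // Stands in for the MatrixException thrown when the vector is the wrong length.
            throw new IllegalArgumentException("action must have 2 elements");
        }
        double[] chosen = ACTIONS[random.nextInt(ACTIONS.length)];
        action[0] = chosen[0];
        action[1] = chosen[1];
    }
}
```

As with the documented method, the result is written into the action argument rather than returned.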
 o randomState
 public void randomState(MatrixD state,
                         Random random) throws MatrixException
Generates a random state from those possible.

Throws: MatrixException
Vector is wrong length.
Overrides:
randomState in class MDP
 o nextState
 public double nextState(MatrixD state,
                         MatrixD action,
                         MatrixD newState,
                         PDouble dt,
                         PBoolean valueKnown,
                         Random random) throws MatrixException
Find a next state given a state and action, and return the reinforcement received. The state, action, and newState arguments should all be vectors (single-column matrices). The duration of the time step, dt, is also returned. Most MDPs make this a constant, given in the parsed string.

Throws: MatrixException
if sizes aren't right.
Overrides:
nextState in class MDP
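
The calling convention of nextState (reinforcement as the return value, successor state and time step written into arguments) can be sketched as follows. Everything here is hypothetical: DoubleHolder merely stands in for an output parameter such as PDouble, and the dynamics and reinforcement are toy choices invented for illustration, not HCDemo's.

```java
/** Hypothetical sketch of the nextState calling convention; not the HCDemo source. */
final class NextStateSketch {

    /** Hypothetical mutable holder standing in for an output parameter such as PDouble. */
    static final class DoubleHolder {
        double val;
    }

    /** Assumed constant time step. */
    static final double FIXED_DT = 0.1;

    /** Fills newState and dt, and returns the reinforcement. */
    static double nextState(double[] state, double[] action,
                            double[] newState, DoubleHolder dt) {
        if (state.length != action.length || state.length != newState.length) {
            // Stands in for the MatrixException thrown when sizes are wrong.
            throw new IllegalArgumentException("vectors must have equal length");
        }
        dt.val = FIXED_DT; // the time step is reported back through the holder
        double sumSquares = 0.0;
        for (int i = 0; i < state.length; i++) {
            // Toy Euler step; not HCDemo's dynamics.
            newState[i] = state[i] + dt.val * action[i];
            sumSquares += newState[i] * newState[i];
        }
        // Toy reinforcement: negative distance of the new state from the origin.
        return -Math.sqrt(sumSquares);
    }
}
```

The caller allocates newState and the holder up front, and the method mutates them.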
 o findValAct
 public double findValAct(MatrixD state,
                          MatrixD action,
                          FunApp f,
                          MatrixD outputs,
                          PBoolean valueKnown) throws MatrixException
Find the value and best action of this state. This overwrites the action passed in, replacing it with the best action for the given state.

Throws: MatrixException
column vectors are wrong size or shape
Overrides:
findValAct in class MDP
 o findValue
 public double findValue(MatrixD state,
                         MatrixD optAction,
                         PDouble gamma,
                         FunApp f,
                         PDouble dt,
                         MatrixD outputs,
                         PDouble reinforcement,
                         PBoolean valueKnown,
                         NumExp explorationFactor,
                         Random random) throws MatrixException
Find the maximum over actions of R + gamma*V(x'), where V(x') is the value of the successor state x' given state x, R is the reinforcement, and gamma is the discount factor. This method is used by the ValIter object (value iteration).

Throws: MatrixException
column vectors are wrong size or shape
Overrides:
findValue in class MDP
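
The backup that findValue computes, a maximum over actions of the reinforcement plus the discounted successor value, can be sketched generically. The Model interface, the use of double[] and ToDoubleFunction in place of MatrixD and FunApp, and the explicit list of candidate actions are all assumptions; the sketch also omits the dt, valueKnown, and explorationFactor arguments of the real method.

```java
import java.util.function.ToDoubleFunction;

/** Hypothetical sketch of a one-step value backup; not the HCDemo source. */
final class BackupSketch {

    /** Hypothetical one-step model of the MDP. */
    interface Model {
        double[] successor(double[] state, double[] action);
        double reinforcement(double[] state, double[] action);
    }

    /**
     * Returns max over actions of R + gamma * V(x') and copies the
     * maximizing action into optAction.
     */
    static double findValue(double[] state, double[][] actions, double gamma,
                            Model model, ToDoubleFunction<double[]> value,
                            double[] optAction) {
        double best = Double.NEGATIVE_INFINITY;
        for (double[] a : actions) {
            double q = model.reinforcement(state, a)
                     + gamma * value.applyAsDouble(model.successor(state, a));
            if (q > best) {
                best = q;
                System.arraycopy(a, 0, optAction, 0, a.length);
            }
        }
        return best;
    }
}
```

On ties the first maximizing action is kept, because the comparison is strict.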
 o BNF
 public String BNF(int lang)
Return the BNF description of how to parse the parameters of this object.

Overrides:
BNF in class MDP
 o unparse
 public void unparse(Unparser u,
                     int lang)
Output a description of this object that can be parsed with parse(). Also creates the state/action/nextState vectors.

Overrides:
unparse in class MDP
See Also:
Parsable
 o parse
 public Object parse(Parser p,
                     int lang) throws ParserException
Parse the input file to get the parameters for this object.

Throws: ParserException
parser didn't find the required token
Overrides:
parse in class MDP
