Class sim.funApp.ValuePolicy

java.lang.Object
   |
   +----sim.funApp.FunApp
           |
           +----sim.funApp.ValuePolicy

public class ValuePolicy
extends FunApp
This funApp is used in conjunction with the Graph3D display object to observe the policy and value function associated with a learning algorithm and MDP. It does not implement findGradients() or findHessian().

This code is (c) 1996 Mance Harmon <harmonme@aa.wpafb.af.mil>, http://www.cs.cmu.edu/~baird
The source and object code may be redistributed freely. If the code is modified, please state so in the comments.

Version:
1.11, 17 June 97
Author:
Mance Harmon

Variable Index

 o action
the action vector that is passed to findValAct().
 o function
Function approximator to plot (a duplicate of the original)
 o mdp
the mdp to control
 o optAction
the action vector that is optimal in a given state
 o origFunction
The original function approximator whose duplicate will be plotted
 o random
the random number generator
 o statesOnly
A flag switching from state/action pairs to states only (as in value iteration).
 o value
the value of the state passed to the MDP
 o valueKnown
A flag stating whether or not we know for certain the value of a state.

Constructor Index

 o ValuePolicy()

Method Index

 o BNF(int)
 o clone()
Make an exact duplicate of this class.
 o cloneVars(FunApp)
After making a copy of self during a clone(), call cloneVars() to copy variables into the copy, then return super.cloneVars(copy).
 o evaluate()
calculate the output for the given input
 o findGradients()
Calculate the output and gradient for a given input.
 o findHessian()
Calculate the output, gradient, and Hessian for a given input.
 o getParameters(int)
Return a parameter array if BNF(), parse(), and unparse() are to be automated, null otherwise.
 o initialize(int)
Initialize, either partially or completely.
 o nWeights(int, int)
Return the number of weights needed for nIn inputs (including the first one, which is always 1.0) and nOut outputs. Note in the source: this routine is never called and should throw a null pointer exception.
 o parse(Parser, int)
Parse the input file to get the parameters for this object.
 o setIO(MatrixD, MatrixD, MatrixD, MatrixD, MatrixD, MatrixD, MatrixD, MatrixD, MatrixD)
Define the MatrixD objects that will be used by evaluate(), findGradients(), and findHessian().
 o setWatchManager(WatchManager, String)
Register all variables with this WatchManager.
 o unparse(Unparser, int)
Output a description of this object that can be parsed with parse().

Variables

 o random
 protected Random random
the random number generator

 o optAction
 protected MatrixD optAction
the action vector that is optimal in a given state

 o action
 protected MatrixD action
the action vector that is passed to findValAct(). This vector points to the location of the input vector to the function approximator that changed the action.

 o valueKnown
 protected PBoolean valueKnown
A flag stating whether or not we know for certain the value of a state.

 o mdp
 protected MDP mdp[]
the mdp to control

 o function
 protected FunApp function
Function approximator to plot (a duplicate of the original)

 o origFunction
 protected FunApp origFunction[]
The original function approximator whose duplicate will be plotted

 o value
 protected MatrixD value
the value of the state passed to the MDP

 o statesOnly
 protected PBoolean statesOnly
A flag switching from state/action pairs to states only (as in value iteration).
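
The fields above suggest how a value and an optimal action could be derived for plotting when statesOnly is false. The following is a minimal sketch only, assuming a scalar state, a finite set of scalar candidate actions, and that "optimal" means the largest Q value; GreedySketch, ValueAndAction, and best() are hypothetical names that do not appear in sim.funApp.

```java
import java.util.function.DoubleBinaryOperator;

/** Hypothetical sketch; none of these names come from sim.funApp. */
final class GreedySketch {
    /** Result of scanning the candidate actions for one state. */
    static final class ValueAndAction {
        final double value;      // largest Q(state, action) found
        final double optAction;  // action that achieved it

        ValueAndAction(double value, double optAction) {
            this.value = value;
            this.optAction = optAction;
        }
    }

    /** Scan a finite action set and keep the action with the largest Q (assumed criterion). */
    static ValueAndAction best(DoubleBinaryOperator q, double state, double[] actions) {
        if (actions.length == 0) {
            throw new IllegalArgumentException("no candidate actions");
        }
        double bestValue = q.applyAsDouble(state, actions[0]);
        double bestAction = actions[0];
        for (int i = 1; i < actions.length; i++) {
            double v = q.applyAsDouble(state, actions[i]);
            if (v > bestValue) {
                bestValue = v;
                bestAction = actions[i];
            }
        }
        return new ValueAndAction(bestValue, bestAction);
    }
}
```

When statesOnly is true, the description implies that the approximator maps states directly to values, so no scan over actions would be needed.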

Constructors

 o ValuePolicy
 public ValuePolicy()

Methods

 o getParameters
 public Object[][] getParameters(int lang)
Return a parameter array if BNF(), parse(), and unparse() are to be automated, null otherwise.

Overrides:
getParameters in class FunApp
See Also:
getParameters
 o setIO
 public void setIO(MatrixD inVect,
                   MatrixD outVect,
                   MatrixD weights,
                   MatrixD dEdIn,
                   MatrixD dEdOut,
                   MatrixD dEdWeights,
                   MatrixD dEdIndIn,
                   MatrixD dEdOutdOut,
                   MatrixD dEdWeightsdWeights) throws MatrixException
Define the MatrixD objects that will be used by evaluate(), findGradients(), and findHessian(). The first six should be column vectors (n by 1 matrices). The last three parameters can be null if the Hessian is never to be calculated. If a function approximator overrides this, it should first call super.setIO() for important housekeeping.

Throws: MatrixException
if inputs are vectors with nonmatching sizes
Overrides:
setIO in class FunApp
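
The shape rules described for setIO() can be sketched as follows. This is an illustration under stated assumptions, not the sim.funApp implementation: Mat and ShapeException are hypothetical stand-ins for MatrixD and MatrixException, the exception is unchecked here for brevity, and the rule that each gradient vector must match the size of its value vector is a guess at what "nonmatching sizes" means.

```java
/** Hypothetical stand-in for MatrixD; only the shape is modeled. */
final class Mat {
    final int rows, cols;
    Mat(int rows, int cols) { this.rows = rows; this.cols = cols; }
}

/** Hypothetical stand-in for MatrixException (unchecked here for brevity). */
class ShapeException extends RuntimeException {
    ShapeException(String msg) { super(msg); }
}

/** Sketch of a setIO() that validates shapes before storing references. */
class IoSketch {
    Mat inVect, outVect, weights, dEdIn, dEdOut, dEdWeights;
    Mat dEdIndIn, dEdOutdOut, dEdWeightsdWeights; // may remain null

    void setIO(Mat inVect, Mat outVect, Mat weights,
               Mat dEdIn, Mat dEdOut, Mat dEdWeights,
               Mat dEdIndIn, Mat dEdOutdOut, Mat dEdWeightsdWeights) {
        // Assumed rule: each gradient vector matches its value vector.
        checkPair("input", inVect, dEdIn);
        checkPair("output", outVect, dEdOut);
        checkPair("weight", weights, dEdWeights);
        this.inVect = inVect;
        this.outVect = outVect;
        this.weights = weights;
        this.dEdIn = dEdIn;
        this.dEdOut = dEdOut;
        this.dEdWeights = dEdWeights;
        // Hessian blocks are stored as given; null means "never computed".
        this.dEdIndIn = dEdIndIn;
        this.dEdOutdOut = dEdOutdOut;
        this.dEdWeightsdWeights = dEdWeightsdWeights;
    }

    private static void checkPair(String what, Mat value, Mat grad) {
        if (value.cols != 1 || grad.cols != 1) {
            throw new ShapeException(what + " vectors must be n by 1");
        }
        if (value.rows != grad.rows) {
            throw new ShapeException(what + " vector sizes do not match");
        }
    }
}
```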
 o evaluate
 public void evaluate()
calculate the output for the given input

Overrides:
evaluate in class FunApp
 o findGradients
 public void findGradients()
Calculate the output and gradient for a given input. This does everything evaluate() does, plus it calculates the gradient of the error with respect to the inputs and weights, dEdx and dEdw.

Overrides:
findGradients in class FunApp
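
The layering of findGradients() on top of evaluate() can be illustrated with a toy linear approximator. This sketch is not ValuePolicy (which does not implement the gradient) and not code from sim.funApp: LinearSketch is a hypothetical class, plain arrays replace MatrixD, and it assumes the caller supplies dEdOut before calling findGradients(), so that the chain rule gives dEdIn and dEdWeights.

```java
/** Hypothetical linear approximator: out = sum_i weights[i] * in[i]. */
final class LinearSketch {
    final double[] in;          // input vector
    final double[] weights;     // weight vector, same length as in
    final double[] dEdIn;       // gradient of the error w.r.t. each input
    final double[] dEdWeights;  // gradient of the error w.r.t. each weight
    double out;                 // scalar output, set by evaluate()
    double dEdOut;              // assumed to be set by the caller

    LinearSketch(int n) {
        in = new double[n];
        weights = new double[n];
        dEdIn = new double[n];
        dEdWeights = new double[n];
    }

    /** Compute the output for the current input. */
    void evaluate() {
        double sum = 0.0;
        for (int i = 0; i < in.length; i++) {
            sum += weights[i] * in[i];
        }
        out = sum;
    }

    /** Do everything evaluate() does, then apply the chain rule. */
    void findGradients() {
        evaluate();
        for (int i = 0; i < in.length; i++) {
            dEdIn[i] = dEdOut * weights[i];   // d(out)/d(in[i]) = weights[i]
            dEdWeights[i] = dEdOut * in[i];   // d(out)/d(weights[i]) = in[i]
        }
    }
}
```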
 o findHessian
 public void findHessian()
Calculate the output, gradient, and Hessian for a given input. This does everything evaluate() and findGradients() do, plus it calculates the Hessian of the error with respect to the weights and inputs, dEdxdx, dEdwdx, and dEdwdw.

Overrides:
findHessian in class FunApp
 o nWeights
 public int nWeights(int nIn,
                     int nOut)
Return the number of weights needed for nIn inputs (including the first one, which is always 1.0) and nOut outputs. Note in the source: this routine is never called and should throw a null pointer exception.

Overrides:
nWeights in class FunApp
 o BNF
 public String BNF(int lang)
Overrides:
BNF in class FunApp
 o unparse
 public void unparse(Unparser u,
                     int lang)
Output a description of this object that can be parsed with parse().

Overrides:
unparse in class FunApp
See Also:
Parsable
 o parse
 public Object parse(Parser p,
                     int lang) throws ParserException
Parse the input file to get the parameters for this object.

Throws: ParserException
parser didn't find the required token
Overrides:
parse in class FunApp
 o setWatchManager
 public void setWatchManager(WatchManager wm,
                             String name)
Register all variables with this WatchManager. This will be called after all parsing is done. setWatchManager should be overridden and forced to call the same method on all the other objects in the experiment.

Overrides:
setWatchManager in class FunApp
 o clone
 public Object clone()
Make an exact duplicate of this class. For objects it contains, it only duplicates the pointers, not the objects they point to. For a new FunApp called MyFunApp, the code in this method should be the single line: return cloneVars(new MyFunApp());

Overrides:
clone in class FunApp
 o cloneVars
 public Object cloneVars(FunApp copy)
After making a copy of self during a clone(), call cloneVars() to copy variables into the copy, then return super.cloneVars(copy). The variables copied are just those set in parse() and setWatchManager(). The caller will be required to call setIO to set up the rest of the variables.

Overrides:
cloneVars in class FunApp
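
The clone()/cloneVars() convention described above can be sketched as follows. BaseApprox and MyApprox are hypothetical stand-ins for FunApp and a subclass of it, and the fields are invented for illustration; only the one-line clone() body and the trailing super.cloneVars(copy) call are taken from the descriptions.

```java
/** Hypothetical base class standing in for FunApp. */
class BaseApprox {
    protected String name; // stands in for state set in parse() or setWatchManager()

    public Object clone() {
        return cloneVars(new BaseApprox());
    }

    /** Copy this class's variables into the copy and return it. */
    public Object cloneVars(BaseApprox copy) {
        copy.name = name;
        return copy;
    }
}

/** Hypothetical subclass following the one-line clone() convention. */
class MyApprox extends BaseApprox {
    protected double[] table; // contained object: shared, not deep-copied

    @Override
    public Object clone() {
        return cloneVars(new MyApprox());
    }

    @Override
    public Object cloneVars(BaseApprox copy) {
        MyApprox c = (MyApprox) copy;
        c.table = table;               // duplicate the pointer only
        return super.cloneVars(copy);  // let the superclass copy its own variables
    }
}
```

In this sketch the copy shares the table array with the original, matching the statement that clone() duplicates pointers rather than the objects they point to.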
 o initialize
 public void initialize(int level)
Initialize, either partially or completely.

Overrides:
initialize in class FunApp
See Also:
initialize
