Class sim.funApp.ValuePolicy
java.lang.Object
|
+----sim.funApp.FunApp
|
+----sim.funApp.ValuePolicy
- public class ValuePolicy
- extends FunApp
This funApp is used in conjunction with the Graph3D display object to observe the
policy and value function associated with a learning algorithm and MDP.
This class does not implement findGradients() or findHessian().
This code is (c) 1996 Mance Harmon
<harmonme@aa.wpafb.af.mil>,
http://www.cs.cmu.edu/~baird
The source and object code may be redistributed freely.
If the code is modified, please state so in the comments.
- Version:
- 1.11, 17 June 97
- Author:
- Mance Harmon
-
action
- the action vector that is passed to findValAct().
-
function
- Function approximator to plot (a duplicate of the original)
-
mdp
- the mdp to control
-
optAction
- the action vector that is optimal in a given state
-
origFunction
- The original function approximator whose duplicate will be plotted
-
random
- the random number generator
-
statesOnly
- A flag switching from state/action pairs to states only (as in value iteration).
-
value
- the value of the state passed to the MDP
-
valueKnown
- A flag stating whether or not we know for certain the value of a state.
-
ValuePolicy()
-
-
BNF(int)
-
-
clone()
- Make an exact duplicate of this class.
-
cloneVars(FunApp)
- After making a copy of self during a clone(), call cloneVars() to
copy variables into the copy, then return super.cloneVars(copy).
-
evaluate()
- Calculate the output for the given input.
-
findGradients()
- Calculate the output and gradient for a given input.
-
findHessian()
- Calculate the output, gradient, and Hessian for a given input.
-
getParameters(int)
- Return a parameter array if BNF(), parse(), and unparse() are to be automated, null otherwise.
-
initialize(int)
- Initialize, either partially or completely.
-
nWeights(int, int)
- Return the number of weights needed for nIn inputs (including the first
one, which is always 1.0) and nOut outputs. This routine is never called and should throw a null pointer exception.
-
parse(Parser, int)
- Parse the input file to get the parameters for this object.
-
setIO(MatrixD, MatrixD, MatrixD, MatrixD, MatrixD, MatrixD, MatrixD, MatrixD, MatrixD)
- Define the MatrixD objects that will be used by evaluate(), findGradients(),
and findHessian().
-
setWatchManager(WatchManager, String)
- Register all variables with this WatchManager.
-
unparse(Unparser, int)
- Output a description of this object that can be parsed with parse().
random
protected Random random
- the random number generator
optAction
protected MatrixD optAction
- the action vector that is optimal in a given state
action
protected MatrixD action
- the action vector that is passed to findValAct(). This vector points to the location of the
input vector to the function approximator that changed the action.
valueKnown
protected PBoolean valueKnown
- A flag stating whether or not we know for certain the value of a state.
mdp
protected MDP mdp[]
- the mdp to control
function
protected FunApp function
- Function approximator to plot (a duplicate of the original)
origFunction
protected FunApp origFunction[]
- The original function approximator whose duplicate will be plotted
value
protected MatrixD value
- the value of the state passed to the MDP
statesOnly
protected PBoolean statesOnly
- A flag switching from state/action pairs to states only (as in value iteration).
ValuePolicy
public ValuePolicy()
getParameters
public Object[][] getParameters(int lang)
- Return a parameter array if BNF(), parse(), and unparse() are to be automated, null otherwise.
- Overrides:
- getParameters in class FunApp
- See Also:
- getParameters
setIO
public void setIO(MatrixD inVect,
MatrixD outVect,
MatrixD weights,
MatrixD dEdIn,
MatrixD dEdOut,
MatrixD dEdWeights,
MatrixD dEdIndIn,
MatrixD dEdOutdOut,
MatrixD dEdWeightsdWeights) throws MatrixException
- Define the MatrixD objects that will be used by evaluate(), findGradients(),
and findHessian(). The first six parameters should be column vectors (n-by-1 matrices).
The last 3 parameters can be null if the Hessian is never to be calculated.
If a function approximator overrides this, it should first call
super.setIO() for important housekeeping.
- Throws: MatrixException
- if inputs are vectors with nonmatching sizes
- Overrides:
- setIO in class FunApp
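The shape contract described for setIO() can be illustrated with a small, self-contained check. This is only a sketch: Mat and SetIOCheck are hypothetical stand-ins, not the real MatrixD API, and the pairing of each gradient vector with the vector it corresponds to (inVect with dEdIn, and so on) is an assumption about what "nonmatching sizes" means. The three Hessian matrices are left out because the description allows them to be null.

```java
// Hypothetical stand-in for a matrix type; this is NOT the real MatrixD class.
final class Mat {
    final int rows, cols;
    Mat(int rows, int cols) { this.rows = rows; this.cols = cols; }
    boolean isColumn() { return cols == 1; }
}

// Hypothetical helper that mirrors the documented setIO() preconditions.
final class SetIOCheck {
    // Returns true when the first six arguments are all n-by-1 column vectors
    // and each gradient vector has the same length as the vector it pairs with.
    static boolean valid(Mat in, Mat out, Mat w, Mat dEdIn, Mat dEdOut, Mat dEdW) {
        Mat[] firstSix = {in, out, w, dEdIn, dEdOut, dEdW};
        for (Mat m : firstSix) {
            if (m == null || !m.isColumn()) {
                return false;
            }
        }
        // Assumed pairing: inVect/dEdIn, outVect/dEdOut, weights/dEdWeights.
        return in.rows == dEdIn.rows
            && out.rows == dEdOut.rows
            && w.rows == dEdW.rows;
    }
}
```

In the real class a violation is reported by throwing MatrixException rather than by returning false; the boolean form is used here only to keep the sketch free of external types.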
evaluate
public void evaluate()
- Calculate the output for the given input.
- Overrides:
- evaluate in class FunApp
findGradients
public void findGradients()
- Calculate the output and gradient for a given input.
This does everything evaluate() does, plus it calculates
the gradient of the error with respect to the inputs and
weights, dEdx and dEdw.
- Overrides:
- findGradients in class FunApp
findHessian
public void findHessian()
- Calculate the output, gradient, and Hessian for a given input.
This does everything evaluate() and findGradients() do, plus
it calculates the Hessian of the error with respect to the
weights and inputs, dEdxdx, dEdwdx, and dEdwdw.
- Overrides:
- findHessian in class FunApp
nWeights
public int nWeights(int nIn,
int nOut)
- Return the number of weights needed for nIn inputs (including the first
one, which is always 1.0) and nOut outputs. This routine is never called and should throw a null pointer exception.
- Overrides:
- nWeights in class FunApp
BNF
public String BNF(int lang)
- Overrides:
- BNF in class FunApp
unparse
public void unparse(Unparser u,
int lang)
- Output a description of this object that can be parsed with parse().
- Overrides:
- unparse in class FunApp
- See Also:
- Parsable
parse
public Object parse(Parser p,
int lang) throws ParserException
- Parse the input file to get the parameters for this object.
- Throws: ParserException
- parser didn't find the required token
- Overrides:
- parse in class FunApp
setWatchManager
public void setWatchManager(WatchManager wm,
String name)
- Register all variables with this WatchManager.
This will be called after all parsing is done.
setWatchManager should be overridden and forced to
call the same method on all the other objects in the experiment.
- Overrides:
- setWatchManager in class FunApp
clone
public Object clone()
- Make an exact duplicate of this class. For objects it contains, it
only duplicates the pointers, not the objects they point to. For a
new FunApp called MyFunApp, the code in this method should be the
single line: return cloneVars(new MyFunApp());
- Overrides:
- clone in class FunApp
cloneVars
public Object cloneVars(FunApp copy)
- After making a copy of self during a clone(), call cloneVars() to
copy variables into the copy, then return super.cloneVars(copy).
The variables copied are just those set in parse() and
setWatchManager(). The caller will be required to call
setIO to set up the rest of the variables.
- Overrides:
- cloneVars in class FunApp
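The clone()/cloneVars() contract described above can be sketched with stand-in classes. BaseApprox plays the role of FunApp here and is purely hypothetical, as are the name and table fields; the real sim.funApp.FunApp has many more members. The sketch shows the one-line clone(), the chain to super.cloneVars(copy), and the fact that contained objects are shared by reference rather than deep-copied.

```java
// Hypothetical stand-in for FunApp; only the cloning contract is modelled.
abstract class BaseApprox {
    protected String name;   // stands in for a variable set during parse()

    // Copy the variables owned by this level into the copy, then return it.
    public Object cloneVars(BaseApprox copy) {
        copy.name = name;
        return copy;
    }

    public abstract Object clone();
}

// Hypothetical subclass named after the MyFunApp example in the description.
class MyFunApp extends BaseApprox {
    protected double[] table;   // contained object: shared, not duplicated

    @Override
    public Object clone() {
        // The single line the description asks for.
        return cloneVars(new MyFunApp());
    }

    @Override
    public Object cloneVars(BaseApprox copy) {
        MyFunApp c = (MyFunApp) copy;
        c.table = table;             // copy the pointer, not the array
        return super.cloneVars(c);   // let the superclass copy its own fields
    }
}
```

After a.clone(), the copy is a distinct MyFunApp whose table field refers to the same array as the original. As the description notes, a caller of the real class would still need to call setIO() on the copy before using it.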
initialize
public void initialize(int level)
- Initialize, either partially or completely.
- Overrides:
- initialize in class FunApp
- See Also:
- initialize