net.pakl.rl
Class PolicyExtractor
java.lang.Object
net.pakl.rl.PolicyExtractor
public class PolicyExtractor
extends java.lang.Object
Outputs the optimal actions given a world (the environment, which includes the
starting state), a value function, and a generic policy (describing the allowed
actions), by picking at each step the action that leads to the highest-valued
next state.
Note: this does not take into account the immediate reward, only the value
of the next state.
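The selection rule described above can be illustrated with a minimal sketch. It is not the PolicyExtractor implementation: the class GreedyPolicySketch, its methods, the string-based states and actions, the four-state corridor, and the step cap are all hypothetical and chosen only to make the example self-contained. It shows one-step lookahead that compares only the value of each successor state and ignores the immediate reward.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical sketch; none of these names come from net.pakl.rl.
final class GreedyPolicySketch {

    /** Returns the allowed action whose successor state has the highest value. */
    static String bestAction(String state,
                             List<String> allowedActions,
                             BiFunction<String, String, String> transition,
                             Map<String, Double> values) {
        String best = null;
        double bestValue = Double.NEGATIVE_INFINITY;
        for (String action : allowedActions) {
            String next = transition.apply(state, action);
            // Only V(next) is compared; the immediate reward is not used.
            double v = values.getOrDefault(next, Double.NEGATIVE_INFINITY);
            if (best == null || v > bestValue) {
                best = action;
                bestValue = v;
            }
        }
        return best; // null only when no action is allowed
    }

    /** Follows greedy actions from start until the goal or the step cap is reached. */
    static List<String> extract(String start,
                                String goal,
                                int maxSteps,
                                List<String> allowedActions,
                                BiFunction<String, String, String> transition,
                                Map<String, Double> values) {
        List<String> actions = new ArrayList<>();
        String state = start;
        while (!state.equals(goal) && actions.size() < maxSteps) {
            String action = bestAction(state, allowedActions, transition, values);
            if (action == null) {
                break;
            }
            actions.add(action);
            state = transition.apply(state, action);
        }
        return actions;
    }

    /** Demo transition for an assumed corridor of states "s0".."s3". */
    static String move(String state, String action) {
        int i = Integer.parseInt(state.substring(1));
        int j = action.equals("right") ? Math.min(i + 1, 3) : Math.max(i - 1, 0);
        return "s" + j;
    }
}
```

With values s0=0.1, s1=0.3, s2=0.6, s3=1.0, calling extract("s0", "s3", 10, List.of("left", "right"), GreedyPolicySketch::move, values) returns three "right" actions. Ties are broken in favour of the first action listed, which is a choice made for this sketch only.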
This supports "Patch" value functions: if the patch name derived from the
state object does not match the currently loaded value function, a new value
function is loaded for that patch.
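The reload-on-mismatch behaviour can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the class PatchValueFunctionSketch, the loader function, the rule that a patch name is the part of a state id before ':', and the default value of 0.0 are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch; these types are not the net.pakl.rl classes.
final class PatchValueFunctionSketch {
    // Assumed loader: maps a patch name to that patch's value table.
    private final Function<String, Map<String, Double>> loader;
    private String loadedPatch;                       // currently loaded patch, or null
    private Map<String, Double> values = new HashMap<>();
    private int loadCount = 0;                        // number of loads, for inspection

    PatchValueFunctionSketch(Function<String, Map<String, Double>> loader) {
        this.loader = loader;
    }

    /** Assumed naming rule: the patch name is the text before ':' in the state id. */
    static String patchNameOf(String state) {
        int i = state.indexOf(':');
        return i < 0 ? "default" : state.substring(0, i);
    }

    /** Returns the state's value, first swapping in the state's patch if needed. */
    double valueOf(String state) {
        String patch = patchNameOf(state);
        if (!patch.equals(loadedPatch)) {
            // The state belongs to a different patch than the one loaded.
            Map<String, Double> loaded = loader.apply(patch);
            values = (loaded != null) ? loaded : new HashMap<>();
            loadedPatch = patch;
            loadCount++;
        }
        return values.getOrDefault(state, 0.0);
    }

    int loadCount() {
        return loadCount;
    }
}
```

Consecutive lookups within the same patch reuse the loaded table, so only a change of patch triggers a load. Keeping a single table in memory is a design choice of this sketch; a cache of several patches would be an equally plausible alternative.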
Methods inherited from class java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
DYNAMICALLY_LOAD_VALUE_FUNCTION_PATCHES
public static boolean DYNAMICALLY_LOAD_VALUE_FUNCTION_PATCHES
MAXIMUM_ALLOWED_NUM_ACTIONS
public static final int MAXIMUM_ALLOWED_NUM_ACTIONS
- See Also:
- Constant Field Values
PolicyExtractor
public PolicyExtractor()
forceInitialState
public void forceInitialState(State s)
extractOptimalPolicy
public java.lang.String extractOptimalPolicy(ActionSet naivePolicy,
                                             ValueFunction valueFunction,
                                             World trainedWorld,
                                             World testWorld,
                                             ReinforcementFunction rf,
                                             double discountFactor)
getOptimalActions
public java.util.List getOptimalActions()
setOptimalActions
public void setOptimalActions(java.util.List optimalActions)