net.pakl.rl
Class Agent

java.lang.Object
  extended by net.pakl.rl.Agent
Direct Known Subclasses:
AgentParallelized

public class Agent
extends java.lang.Object

This class represents an agent that can iterate over all of the actions available from its current state, given an ActionSet. An agent maintains its own state and can be assigned a fixed policy.


Field Summary
protected  int callNumber
           
protected  long DEFAULT_RANDOM_NUMBER_SEED
           
 double discountFactor
          The higher this value, the more weight future (distant) rewards carry in a decision. Beware: values that are too high can lead to infinite loops and unbounded growth of state values.
 double epsilon
          Controls the rate at which a state's stored value is replaced by the new value obtained from the learning algorithm. For a stochastic world, set this to less than 1 so that good experiences from a state are not forgotten.
 double greed
          Greediness (used during TrajectorySampling) is the probability of taking the max action during learning, so 1-greed is the probability of taking a random action from the allowable actions.
protected  java.lang.String name
           
protected  ActionSet policy
           
static int printStartStateValueEvery
           
protected  java.util.Random random
           
protected  ReinforcementFunction reinforcementFunction
           
protected  State state
           
static boolean UPDATE_ONLY_WHEN_GREEDY
           
protected  World world
           
 
Constructor Summary
Agent()
           
Agent(java.lang.String newName)
           
 
Method Summary
 void experience(ValueFunction newValueFunction, ValueFunction valueFunction, State startState, State nextState, double reinforcement)
           
 double getAverageDelta()
           
 Action getBestActionForValueFrom(State s, ValueFunction vf, ActionSet p)
          This method should probably be called more often to reduce duplicated code.
 double getDiscountFactor()
           
 double getMaximumDelta()
           
 java.lang.String getName()
           
 double getTotalDelta()
           
 void initializeRandomSeed(long newSeed)
           
 ValueFunction performValueIteration(ValueFunction newValueFunction, ValueFunction valueFunction)
          This is the MAIN learning function in the Agent.
 void performValueIterationTrajectorySample(ValueFunction newValueFunction, ValueFunction valueFunction)
           
protected  void performValueIterationUpdateOnState(ValueFunction newValueFunction, ValueFunction valueFunction, State currentState)
           
 void setDiscountFactor(double discountFactor)
           
 void setEpsilon(double epsilon)
           
 void setGreed(double greed)
           
 void setPolicy(ActionSet newPolicy)
           
 void setReinforcementFunction(ReinforcementFunction newReinforcementFunction)
           
 void setState(State newState)
           
 void setWorld(World newWorld)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

UPDATE_ONLY_WHEN_GREEDY

public static boolean UPDATE_ONLY_WHEN_GREEDY

state

protected State state

policy

protected ActionSet policy

world

protected World world

reinforcementFunction

protected ReinforcementFunction reinforcementFunction

name

protected java.lang.String name

DEFAULT_RANDOM_NUMBER_SEED

protected final long DEFAULT_RANDOM_NUMBER_SEED
See Also:
Constant Field Values

random

protected java.util.Random random

discountFactor

public double discountFactor
The higher this value, the more weight future (distant) rewards carry in a decision. Beware: values that are too high can lead to infinite loops and unbounded growth of state values.
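The warning about unbounded values can be illustrated with a small sketch. It assumes the usual discounted-return formulation (each step's reward weighted by the discount factor raised to the step index), which this page does not confirm is the exact rule Agent uses. DiscountSketch and discountedReturn are hypothetical names, not part of net.pakl.rl.

```java
// Hypothetical illustration; not part of net.pakl.rl.
final class DiscountSketch {
    /** Sum of gamma^t * reward for t = 0 .. steps-1, with a constant reward per step. */
    static double discountedReturn(double gamma, double reward, int steps) {
        double total = 0.0;
        double weight = 1.0; // holds gamma^t for the current step
        for (int t = 0; t < steps; t++) {
            total += weight * reward;
            weight *= gamma;
        }
        return total;
    }

    public static void main(String[] args) {
        // Below 1, the sum approaches reward / (1 - gamma) and stays bounded.
        System.out.println(discountedReturn(0.5, 1.0, 50));
        System.out.println(discountedReturn(0.9, 1.0, 1000));
        // At exactly 1, the sum grows with the number of steps without limit.
        System.out.println(discountedReturn(1.0, 1.0, 1000));
    }
}
```

Under this assumption, a discount factor strictly below 1 keeps values finite even when the agent can cycle through rewarding states forever.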


greed

public double greed
Greediness (used during TrajectorySampling) is the probability of taking the max action during learning, so 1-greed is the probability of taking a random action from the allowable actions.
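A minimal sketch of the selection rule described above: with probability greed pick the highest-valued action, otherwise pick uniformly among the allowed actions. It uses a plain array of action values in place of the library's ActionSet and ValueFunction types, and GreedSketch and chooseAction are hypothetical names, so treat it as an illustration rather than the Agent's actual code.

```java
import java.util.Random;

// Hypothetical illustration; not part of net.pakl.rl.
final class GreedSketch {
    /** Returns the index of the chosen action given one value per allowed action. */
    static int chooseAction(double[] actionValues, double greed, Random random) {
        if (random.nextDouble() < greed) {
            // Greedy branch: take the action with the largest value.
            int best = 0;
            for (int i = 1; i < actionValues.length; i++) {
                if (actionValues[i] > actionValues[best]) {
                    best = i;
                }
            }
            return best;
        }
        // Exploratory branch: any allowed action, including the best one.
        return random.nextInt(actionValues.length);
    }

    public static void main(String[] args) {
        Random random = new Random(42L);
        double[] values = {0.1, 0.5, 0.9};
        int greedyPicks = 0;
        for (int i = 0; i < 1000; i++) {
            if (chooseAction(values, 0.8, random) == 2) {
                greedyPicks++;
            }
        }
        System.out.println("max action chosen " + greedyPicks + " times out of 1000");
    }
}
```

Because the random branch can also land on the max action, the max action ends up chosen somewhat more often than greed alone suggests.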


epsilon

public double epsilon
Controls the rate at which a state's stored value is replaced by the new value obtained from the learning algorithm. For a stochastic world, set this to less than 1 so that good experiences from a state are not forgotten.
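One common way to realize such a rate is a weighted blend of the old and new values. The sketch below assumes that form; the update Agent actually performs is not shown on this page. EpsilonSketch and blend are hypothetical names.

```java
// Hypothetical illustration; not part of net.pakl.rl.
final class EpsilonSketch {
    /** Moves oldValue toward newValue by the fraction epsilon (assumed blend rule). */
    static double blend(double oldValue, double newValue, double epsilon) {
        return (1.0 - epsilon) * oldValue + epsilon * newValue;
    }

    public static void main(String[] args) {
        // A state that pays 10 on one visit and 0 on the next (a stochastic outcome).
        double[] observed = {10.0, 0.0};

        double overwrite = 0.0; // epsilon = 1: each new value replaces the old one
        double smoothed = 0.0;  // epsilon = 0.5: part of the earlier experience is kept
        for (double value : observed) {
            overwrite = blend(overwrite, value, 1.0);
            smoothed = blend(smoothed, value, 0.5);
        }
        System.out.println("epsilon = 1.0 -> " + overwrite);
        System.out.println("epsilon = 0.5 -> " + smoothed);
    }
}
```

With epsilon at 1 the good first outcome is lost after a single bad one, which is the forgetting the field description warns about.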


callNumber

protected int callNumber

printStartStateValueEvery

public static int printStartStateValueEvery

Constructor Detail

Agent

public Agent()

Agent

public Agent(java.lang.String newName)

Method Detail

setWorld

public void setWorld(World newWorld)

initializeRandomSeed

public void initializeRandomSeed(long newSeed)

performValueIterationUpdateOnState

protected void performValueIterationUpdateOnState(ValueFunction newValueFunction,
                                                  ValueFunction valueFunction,
                                                  State currentState)

experience

public void experience(ValueFunction newValueFunction,
                       ValueFunction valueFunction,
                       State startState,
                       State nextState,
                       double reinforcement)

performValueIterationTrajectorySample

public void performValueIterationTrajectorySample(ValueFunction newValueFunction,
                                                  ValueFunction valueFunction)

performValueIteration

public ValueFunction performValueIteration(ValueFunction newValueFunction,
                                           ValueFunction valueFunction)
This is the MAIN learning function in the Agent.
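For readers unfamiliar with value iteration, the following is a generic textbook sketch on a four-state chain. It is not the Agent's implementation. The two-array signature of sweep only loosely mirrors the newValueFunction and valueFunction parameters, and every name in it (ValueIterationSketch, step, reward, sweep, solve) is hypothetical.

```java
// Hypothetical illustration; not part of net.pakl.rl.
final class ValueIterationSketch {
    static final int GOAL = 3; // states are 0..3; state 3 is terminal

    /** Deterministic transition: move is -1 (left) or +1 (right), clipped to the chain. */
    static int step(int state, int move) {
        return Math.max(0, Math.min(GOAL, state + move));
    }

    /** Reward of 1 for entering the goal, 0 otherwise. */
    static double reward(int state, int next) {
        return (next == GOAL && state != GOAL) ? 1.0 : 0.0;
    }

    /** One synchronous sweep: reads oldValues, writes newValues, returns the largest change. */
    static double sweep(double[] newValues, double[] oldValues, double discountFactor) {
        double maxDelta = 0.0;
        for (int s = 0; s < oldValues.length; s++) {
            double best = 0.0; // the terminal state keeps value 0
            if (s != GOAL) {
                best = Double.NEGATIVE_INFINITY;
                for (int move = -1; move <= 1; move += 2) {
                    int next = step(s, move);
                    best = Math.max(best, reward(s, next) + discountFactor * oldValues[next]);
                }
            }
            newValues[s] = best;
            maxDelta = Math.max(maxDelta, Math.abs(best - oldValues[s]));
        }
        return maxDelta;
    }

    /** Repeats sweeps, swapping the two arrays, until no value changes by more than tolerance. */
    static double[] solve(double discountFactor, double tolerance) {
        double[] oldValues = new double[GOAL + 1];
        double[] newValues = new double[GOAL + 1];
        double delta;
        do {
            delta = sweep(newValues, oldValues, discountFactor);
            double[] tmp = oldValues;
            oldValues = newValues;
            newValues = tmp;
        } while (delta > tolerance);
        return oldValues; // after the swap this holds the most recent values
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(solve(0.5, 1e-9)));
    }
}
```

Keeping separate read and write arrays means every state in a sweep is updated from the same snapshot of values, and the largest per-state change gives a simple stopping criterion.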


getBestActionForValueFrom

public Action getBestActionForValueFrom(State s,
                                        ValueFunction vf,
                                        ActionSet p)
This method should probably be called more often to reduce duplicated code.


getDiscountFactor

public double getDiscountFactor()

setDiscountFactor

public void setDiscountFactor(double discountFactor)

setEpsilon

public void setEpsilon(double epsilon)

getMaximumDelta

public double getMaximumDelta()

getAverageDelta

public double getAverageDelta()

getTotalDelta

public double getTotalDelta()

setState

public void setState(State newState)

setPolicy

public void setPolicy(ActionSet newPolicy)

setReinforcementFunction

public void setReinforcementFunction(ReinforcementFunction newReinforcementFunction)

getName

public java.lang.String getName()

setGreed

public void setGreed(double greed)