java.lang.Object
  net.pakl.rl.Agent
public class Agent
This class represents an agent that can iterate over all of the actions
available from its given state, given an ActionSet.
An agent maintains its own state and can be assigned a fixed policy.
| Field Summary | |
|---|---|
| protected int | callNumber |
| protected long | DEFAULT_RANDOM_NUMBER_SEED |
| double | discountFactor<br>The higher this value, the more important the future (distant reward) becomes in making a decision. Beware of infinite loops and explosions of value. |
| double | epsilon<br>Controls the rate at which a state's stored value is replaced by the new value obtained by the learning algorithm. For a stochastic world, set this to something less than 1 so that good experiences from a state are not forgotten. |
| double | greed<br>Greediness (used during TrajectorySampling) is the probability of taking the max action during learning, so 1 - greed is the probability of taking a random action from the allowable actions. |
| protected java.lang.String | name |
| protected ActionSet | policy |
| static int | printStartStateValueEvery |
| protected java.util.Random | random |
| protected ReinforcementFunction | reinforcementFunction |
| protected State | state |
| static boolean | UPDATE_ONLY_WHEN_GREEDY |
| protected World | world |
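
The roles of discountFactor, epsilon and greed described in the field summary can be illustrated with a small standalone sketch. It does not use any net.pakl.rl classes. The class name `AgentParameterSketch` and its methods are hypothetical, and the blending rule shown for epsilon is an assumption inferred from the field description, not taken from the Agent source.

```java
import java.util.Random;

/** Hypothetical illustration of discountFactor, epsilon and greed; not part of net.pakl.rl. */
class AgentParameterSketch {

    /** Discounted sum of a constant reward received on each of `steps` consecutive steps. */
    static double discountedReturn(double reward, double discountFactor, int steps) {
        double total = 0.0;
        double weight = 1.0;
        for (int i = 0; i < steps; i++) {
            total += weight * reward;
            weight *= discountFactor;
        }
        return total;
    }

    /** Assumed update rule: move the stored value toward the new value at rate epsilon. */
    static double blend(double oldValue, double newValue, double epsilon) {
        return (1.0 - epsilon) * oldValue + epsilon * newValue;
    }

    /** With probability greed pick the max-value index, otherwise a uniformly random index. */
    static int chooseAction(double[] actionValues, double greed, Random random) {
        if (random.nextDouble() < greed) {
            int best = 0;
            for (int i = 1; i < actionValues.length; i++) {
                if (actionValues[i] > actionValues[best]) {
                    best = i;
                }
            }
            return best;
        }
        return random.nextInt(actionValues.length);
    }

    public static void main(String[] args) {
        // With discountFactor 1.0 the return keeps growing as steps increase;
        // below 1.0 it stays under reward / (1 - discountFactor).
        System.out.println(discountedReturn(1.0, 1.0, 1000));
        System.out.println(discountedReturn(1.0, 0.9, 1000));
        // epsilon = 0.5 keeps half of the old value and takes half of the new one.
        System.out.println(blend(10.0, 0.0, 0.5));
        // greed = 1.0 always selects the max-value action.
        System.out.println(chooseAction(new double[] {0.1, 0.7, 0.2}, 1.0, new Random(42L)));
    }
}
```

Under this assumed rule, an epsilon of 1 overwrites the stored value completely on every update, which is why a smaller value helps retain earlier experience in a stochastic world.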
| Constructor Summary | |
|---|---|
| Agent() | |
| Agent(java.lang.String newName) | |
| Method Summary | |
|---|---|
| void | experience(ValueFunction newValueFunction, ValueFunction valueFunction, State startState, State nextState, double reinforcement) |
| double | getAverageDelta() |
| Action | getBestActionForValueFrom(State s, ValueFunction vf, ActionSet p)<br>This is a function which should probably be called more often to reduce duplicated code. |
| double | getDiscountFactor() |
| double | getMaximumDelta() |
| java.lang.String | getName() |
| double | getTotalDelta() |
| void | initializeRandomSeed(long newSeed) |
| ValueFunction | performValueIteration(ValueFunction newValueFunction, ValueFunction valueFunction)<br>This is the MAIN learning function in the Agent. |
| void | performValueIterationTrajectorySample(ValueFunction newValueFunction, ValueFunction valueFunction) |
| protected void | performValueIterationUpdateOnState(ValueFunction newValueFunction, ValueFunction valueFunction, State currentState) |
| void | setDiscountFactor(double discountFactor) |
| void | setEpsilon(double epsilon) |
| void | setGreed(double greed) |
| void | setPolicy(ActionSet newPolicy) |
| void | setReinforcementFunction(ReinforcementFunction newReinforcementFunction) |
| void | setState(State newState) |
| void | setWorld(World newWorld) |
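
The performValueIteration signature takes two value functions, which suggests a sweep that reads old estimates from one and writes updated estimates into the other. The sketch below shows that general pattern on a tiny deterministic world; it is not the Agent implementation. The class `ValueIterationSketch`, the array-based value functions, the three-state world and its reward layout are all hypothetical, and the delta bookkeeping is only a guess at what getMaximumDelta(), getAverageDelta() and getTotalDelta() might report.

```java
import java.util.Arrays;

/** Hypothetical sketch of double-buffered value iteration; not the net.pakl.rl implementation. */
class ValueIterationSketch {
    // Assumed toy world: states 0..2 in a row; actions move left (-1) or right (+1).
    // State 2 is terminal, and entering it yields a reward of 1.0.
    static final int STATES = 3;
    static final int TERMINAL = 2;
    static final int[] ACTIONS = {-1, +1};

    double maximumDelta;
    double totalDelta;
    double averageDelta;

    static int next(int state, int action) {
        return Math.max(0, Math.min(STATES - 1, state + action));
    }

    static double reward(int nextState) {
        return nextState == TERMINAL ? 1.0 : 0.0;
    }

    /** One sweep: read from valueFunction, write into newValueFunction, record the deltas. */
    double[] sweep(double[] newValueFunction, double[] valueFunction, double discountFactor) {
        maximumDelta = 0.0;
        totalDelta = 0.0;
        for (int s = 0; s < STATES; s++) {
            if (s == TERMINAL) {
                newValueFunction[s] = 0.0; // no future reward after the terminal state
                continue;
            }
            double best = Double.NEGATIVE_INFINITY;
            for (int a : ACTIONS) {
                int n = next(s, a);
                best = Math.max(best, reward(n) + discountFactor * valueFunction[n]);
            }
            double delta = Math.abs(best - valueFunction[s]);
            maximumDelta = Math.max(maximumDelta, delta);
            totalDelta += delta;
            newValueFunction[s] = best;
        }
        averageDelta = totalDelta / STATES; // averaged over all states, terminal included
        return newValueFunction;
    }

    /** Index into ACTIONS of the action with the highest one-step lookahead value. */
    static int bestAction(int state, double[] valueFunction, double discountFactor) {
        int bestIndex = 0;
        double bestValue = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < ACTIONS.length; i++) {
            int n = next(state, ACTIONS[i]);
            double v = reward(n) + discountFactor * valueFunction[n];
            if (v > bestValue) {
                bestValue = v;
                bestIndex = i;
            }
        }
        return bestIndex;
    }

    public static void main(String[] args) {
        ValueIterationSketch sketch = new ValueIterationSketch();
        double[] current = new double[STATES];
        double[] scratch = new double[STATES];
        do {
            double[] updated = sketch.sweep(scratch, current, 0.9);
            scratch = current; // swap buffers for the next sweep
            current = updated;
        } while (sketch.maximumDelta > 1e-9);
        System.out.println(Arrays.toString(current));
        System.out.println("best action from state 0: " + ACTIONS[bestAction(0, current, 0.9)]);
    }
}
```

Keeping the old and new estimates in separate buffers means every update within a sweep sees the same snapshot of values, and the maximum delta gives a simple stopping criterion.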
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public static boolean UPDATE_ONLY_WHEN_GREEDY
protected State state
protected ActionSet policy
protected World world
protected ReinforcementFunction reinforcementFunction
protected java.lang.String name
protected final long DEFAULT_RANDOM_NUMBER_SEED
protected java.util.Random random
public double discountFactor
public double greed
public double epsilon
protected int callNumber
public static int printStartStateValueEvery
| Constructor Detail |
|---|
public Agent()
public Agent(java.lang.String newName)
| Method Detail |
|---|
public void setWorld(World newWorld)
public void initializeRandomSeed(long newSeed)
protected void performValueIterationUpdateOnState(ValueFunction newValueFunction, ValueFunction valueFunction, State currentState)
public void experience(ValueFunction newValueFunction, ValueFunction valueFunction, State startState, State nextState, double reinforcement)
public void performValueIterationTrajectorySample(ValueFunction newValueFunction, ValueFunction valueFunction)
public ValueFunction performValueIteration(ValueFunction newValueFunction, ValueFunction valueFunction)
public Action getBestActionForValueFrom(State s, ValueFunction vf, ActionSet p)
public double getDiscountFactor()
public void setDiscountFactor(double discountFactor)
public void setEpsilon(double epsilon)
public double getMaximumDelta()
public double getAverageDelta()
public double getTotalDelta()
public void setState(State newState)
public void setPolicy(ActionSet newPolicy)
public void setReinforcementFunction(ReinforcementFunction newReinforcementFunction)
public java.lang.String getName()
public void setGreed(double greed)