Agent

net.pakl.rl
Class Agent

java.lang.Object
  net.pakl.rl.Agent

Direct Known Subclasses: AgentParallelized

public class Agent
extends java.lang.Object

This class represents an agent that can iterate over all of the actions available from its current state, given an ActionSet. An agent maintains its own state and can be assigned a fixed policy.

Field Summary
`protected int`	`callNumber`
`protected long`	`DEFAULT_RANDOM_NUMBER_SEED`
`double`	`discountFactor` The higher this value, the more distant (future) reward matters when making a decision. Beware of infinite loops and unbounded growth of values.
`double`	`epsilon` Controls the rate at which a state's value is replaced by the new value obtained by the learning algorithm. For a stochastic world, set this to less than 1 so that good experiences from a state are not forgotten.
`double`	`greed` Greediness (used during TrajectorySampling) is the probability of taking the maximum-value action during learning, so 1 - greed is the probability of taking a random action from the allowable actions.
`protected java.lang.String`	`name`
`protected ActionSet`	`policy`
`static int`	`printStartStateValueEvery`
`protected java.util.Random`	`random`
`protected ReinforcementFunction`	`reinforcementFunction`
`protected State`	`state`
`static boolean`	`UPDATE_ONLY_WHEN_GREEDY`
`protected World`	`world`

Constructor Summary
`Agent()`
`Agent(java.lang.String newName)`

Method Summary
`void`	`experience(ValueFunction newValueFunction, ValueFunction valueFunction, State startState, State nextState, double reinforcement)`
`double`	`getAverageDelta()`
`Action`	`getBestActionForValueFrom(State s, ValueFunction vf, ActionSet p)` This is a function which should probably be called more often to reduce duplicated code.
`double`	`getDiscountFactor()`
`double`	`getMaximumDelta()`
`java.lang.String`	`getName()`
`double`	`getTotalDelta()`
`void`	`initializeRandomSeed(long newSeed)`
`ValueFunction`	`performValueIteration(ValueFunction newValueFunction, ValueFunction valueFunction)` This is the MAIN learning function in the Agent.
`void`	`performValueIterationTrajectorySample(ValueFunction newValueFunction, ValueFunction valueFunction)`
`protected void`	`performValueIterationUpdateOnState(ValueFunction newValueFunction, ValueFunction valueFunction, State currentState)`
`void`	`setDiscountFactor(double discountFactor)`
`void`	`setEpsilon(double epsilon)`
`void`	`setGreed(double greed)`
`void`	`setPolicy(ActionSet newPolicy)`
`void`	`setReinforcementFunction(ReinforcementFunction newReinforcementFunction)`
`void`	`setState(State newState)`
`void`	`setWorld(World newWorld)`

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Field Detail

UPDATE_ONLY_WHEN_GREEDY

public static boolean UPDATE_ONLY_WHEN_GREEDY

state

protected State state

policy

protected ActionSet policy

world

protected World world

reinforcementFunction

protected ReinforcementFunction reinforcementFunction

name

protected java.lang.String name

DEFAULT_RANDOM_NUMBER_SEED

protected final long DEFAULT_RANDOM_NUMBER_SEED

See Also: Constant Field Values

random

protected java.util.Random random

discountFactor

public double discountFactor

The higher this value, the more distant (future) reward matters when making a decision. Beware of infinite loops and unbounded growth of values.
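To make the effect of the discount factor concrete, the following sketch computes a discounted sum of rewards. It is an illustration only: the class `DiscountSketch` and the method `discountedReturn` are hypothetical names that are not part of net.pakl.rl, and the standard geometric discounting shown here is an assumption about how Agent applies this field.

```java
// Hypothetical illustration; not part of net.pakl.rl.
final class DiscountSketch {
    /** Returns the sum of rewards[t] * discountFactor^t over the sequence. */
    static double discountedReturn(double[] rewards, double discountFactor) {
        double total = 0.0;
        double weight = 1.0; // discountFactor^0
        for (double r : rewards) {
            total += weight * r;
            weight *= discountFactor; // later rewards count for less when < 1
        }
        return total;
    }
}
```

With a discount factor of 1, every step of a rewarding loop adds its full reward, so the total grows without bound as the sequence lengthens. This is one way the "explosions of value" mentioned above can arise.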

greed

public double greed

Greediness (used during TrajectorySampling) is the probability of taking the maximum-value action during learning, so 1 - greed is the probability of taking a random action from the allowable actions.
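The selection rule described for greed can be sketched as follows. This is a minimal example under stated assumptions: `GreedSketch` and `chooseIndex` are hypothetical names, actions are represented by array indices rather than the library's Action and ActionSet types, and the random branch draws uniformly from all allowable actions (including the best one).

```java
import java.util.Random;

// Hypothetical illustration; not part of net.pakl.rl.
final class GreedSketch {
    /** Picks the index of the highest-valued action with probability greed, else a uniformly random index. */
    static int chooseIndex(double[] actionValues, double greed, Random random) {
        if (random.nextDouble() < greed) {
            int best = 0;
            for (int i = 1; i < actionValues.length; i++) {
                if (actionValues[i] > actionValues[best]) {
                    best = i;
                }
            }
            return best; // greedy choice
        }
        return random.nextInt(actionValues.length); // exploratory choice
    }
}
```

Because `nextDouble()` returns a value in [0, 1), a greed of 1.0 always takes the greedy branch and a greed of 0.0 never does.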

epsilon

public double epsilon

Controls the rate at which a state's value is replaced by the new value obtained by the learning algorithm. For a stochastic world, set this to less than 1 so that good experiences from a state are not forgotten.
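One common way to realize such an update rate is a linear blend between the stored value and the newly computed one. The sketch below assumes that form; the source does not state the exact formula Agent uses, and `EpsilonSketch` and `blend` are hypothetical names.

```java
// Hypothetical illustration; the blend formula is an assumption, not confirmed by the source.
final class EpsilonSketch {
    /** Moves oldValue toward newValue by the fraction epsilon. */
    static double blend(double oldValue, double newValue, double epsilon) {
        return oldValue + epsilon * (newValue - oldValue);
    }
}
```

Under this assumption, an epsilon of 1 overwrites the stored value entirely, while a smaller epsilon keeps part of what earlier experiences contributed, which matters when the same state can yield different outcomes.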

callNumber

protected int callNumber

printStartStateValueEvery

public static int printStartStateValueEvery

Constructor Detail

Agent

public Agent()

Agent

public Agent(java.lang.String newName)

Method Detail

setWorld

public void setWorld(World newWorld)

initializeRandomSeed

public void initializeRandomSeed(long newSeed)

performValueIterationUpdateOnState

protected void performValueIterationUpdateOnState(ValueFunction newValueFunction,
                                                  ValueFunction valueFunction,
                                                  State currentState)

experience

public void experience(ValueFunction newValueFunction,
                       ValueFunction valueFunction,
                       State startState,
                       State nextState,
                       double reinforcement)

performValueIterationTrajectorySample

public void performValueIterationTrajectorySample(ValueFunction newValueFunction,
                                                  ValueFunction valueFunction)

performValueIteration

public ValueFunction performValueIteration(ValueFunction newValueFunction,
                                           ValueFunction valueFunction)

This is the MAIN learning function in the Agent.
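For readers unfamiliar with value iteration, the sketch below shows one synchronous sweep on a tiny deterministic chain world. Everything in it is an assumption made for illustration: the class `ValueIterationSketch`, the chain world, and the reward of 1 for entering the last state are hypothetical, and plain arrays stand in for the library's ValueFunction, World, and ReinforcementFunction types. Reading from one array and writing to a fresh one mirrors the two-argument signature above, but the actual method may work differently.

```java
// Hypothetical illustration; not the implementation in net.pakl.rl.
final class ValueIterationSketch {
    /** Deterministic chain: action 0 moves left, action 1 moves right, clamped to [0, n-1]. */
    static int next(int state, int action, int n) {
        int moved = (action == 0) ? state - 1 : state + 1;
        return Math.max(0, Math.min(n - 1, moved));
    }

    /** Reward of 1 for entering the last (terminal) state, 0 otherwise. */
    static double reward(int nextState, int n) {
        return (nextState == n - 1) ? 1.0 : 0.0;
    }

    /** One synchronous sweep: reads from values and writes to a fresh array. */
    static double[] sweep(double[] values, double discountFactor) {
        int n = values.length;
        double[] updated = new double[n];
        for (int s = 0; s < n - 1; s++) { // the terminal state keeps value 0
            double best = Double.NEGATIVE_INFINITY;
            for (int a = 0; a < 2; a++) {
                int s2 = next(s, a, n);
                best = Math.max(best, reward(s2, n) + discountFactor * values[s2]);
            }
            updated[s] = best;
        }
        return updated;
    }
}
```

Calling `sweep` repeatedly, feeding each result back in as `values`, propagates the terminal reward backward along the chain one state per sweep until the values stop changing.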

getBestActionForValueFrom

public Action getBestActionForValueFrom(State s,
                                        ValueFunction vf,
                                        ActionSet p)

This is a function which should probably be called more often to reduce duplicated code.

getDiscountFactor

public double getDiscountFactor()

setDiscountFactor

public void setDiscountFactor(double discountFactor)

setEpsilon

public void setEpsilon(double epsilon)

getMaximumDelta

public double getMaximumDelta()

getAverageDelta

public double getAverageDelta()

getTotalDelta

public double getTotalDelta()

setState

public void setState(State newState)

setPolicy

public void setPolicy(ActionSet newPolicy)

setReinforcementFunction

public void setReinforcementFunction(ReinforcementFunction newReinforcementFunction)

getName

public java.lang.String getName()

setGreed

public void setGreed(double greed)
