Teleo Reactive Ponder2
Introduction
A teleo-reactive (T-R) program is a mid-level agent control program that robustly directs an agent toward a goal in a manner that continuously takes into account the agent's changing perceptions of a dynamic environment. T-R programs are written in a production-rule-like language and require a specialized interpreter [ ref: nilsson ].
T-R programs typically have an ordered set of rules (a Rule Set) each of which contains a condition and one or more actions. The Rule Set then selects the first rule whose condition is true and starts executing the rule's actions. What follows afterwards depends upon the Rule Set's configuration and is described below.
Example
Rule | Condition | Action
1 | unloaded & at depot | finished
2 | loaded & at depot | unload
3 | loaded & facing depot | go forwards
4 | loaded | turn left
5 | next to bin | load bin
6 | facing bin | go forwards
7 | | turn left
Here is a simple example of a T-R rule set. It is for a robot truck that starts somewhere other than its depot. Its goal is to pick up a bin and put it down at the depot. When it starts off it may or may not be facing the bin (we assume that it is not next to it, but that would work too). We can see that the first five rules' conditions are not satisfied, so we look at rule 6. If the truck is facing the bin then rule 6 would be invoked and it would move forwards towards the bin. If the truck is not facing the bin then rule 7 would be invoked and it would turn left.
If the rule set were continuously evaluated you can see that the truck would start turning to the left, then move forwards, pick up the bin, turn to the left again, go forwards and drop the bin off at the depot. This is because the lower numbered rules take precedence over the higher numbered ones. So, if rule 7 were being continuously executed then as soon as the truck were facing the bin, rule 6 would take over and the truck would move forwards.
Note that if someone moved the bin while the truck was approaching it (using rule 6) then rule 6's condition would no longer be true and rule 7 would be executed continuously until the truck were facing the bin again.
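The selection step described above can be sketched in Python. This is only an illustrative model of "first rule whose condition is true wins", not TrPonder code; the percept names (loaded, at_depot, facing_depot, next_to_bin, facing_bin) are assumptions made for the sketch.

```python
# Ordered rule table from the truck example: (condition, action) pairs.
# The percept names are hypothetical, chosen only for this sketch.
RULES = [
    (lambda p: not p["loaded"] and p["at_depot"], "finished"),
    (lambda p: p["loaded"] and p["at_depot"], "unload"),
    (lambda p: p["loaded"] and p["facing_depot"], "go forwards"),
    (lambda p: p["loaded"], "turn left"),
    (lambda p: p["next_to_bin"], "load bin"),
    (lambda p: p["facing_bin"], "go forwards"),
    (lambda p: True, "turn left"),  # empty condition implies true
]


def select(percepts):
    """Return (rule number, action) of the first rule whose condition holds."""
    for number, (condition, action) in enumerate(RULES, start=1):
        if condition(percepts):
            return number, action
    raise RuntimeError("unreachable: the last rule is always true")


start = {"loaded": False, "at_depot": False, "facing_depot": False,
         "next_to_bin": False, "facing_bin": False}
print(select(start))                          # (7, 'turn left')
print(select({**start, "facing_bin": True}))  # (6, 'go forwards')
```

Because the table is scanned from the top on every evaluation, moving the bin simply changes which rule is found first on the next scan.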
Example Notes
- Rule 7 is executed if the truck is not loaded, not at the depot, not next to the bin and not facing the bin. It does not require a condition (empty implies true) because if none of the other rules' conditions are satisfied we must run this rule.
- It would be more efficient if we had two rules with conditions asking if the bin were to the left or the right of the truck and actions to turn in the appropriate direction.
- Another enhancement would be to have rule 6 running in parallel with a turning rule so that the truck can drive forwards and turn towards the bin at the same time.
Rules
Rules have a condition and one or more actions. A rule can be asked whether it is available to run, i.e. to evaluate its condition and return the result, true or false. A rule can also be told to execute its actions. A rule can handle two or more parallel sequences of actions; it does this by spawning separate threads and waiting for them all to finish before indicating that its actions have been completed. Actions are normally blocks but may be Rule Sets or even other rules. More about this later; for the time being we will just use blocks. NB Conditions are likely to be run many times and at any time, so condition blocks should not have any side effects.
A rule in PonderTalk is created as follows:
Operation | Description
condition: aCondition | Creates a rule with the given condition block aCondition
condition: aCondition action: anAction | Creates a rule with the given condition block aCondition and the given action anAction
condition: aCondition actions: anActionArray | Creates a rule with the given condition block aCondition and the given array of actions in anActionArray
Example: Creating a new rule
Here we can see the T-R rule factory being imported and then a new rule Managed Object being created with a condition that tests if myValue equals 23. Its action is a block that, when activated, moves the cart forwards. Notice that both the condition and the action are blocks. Blocks are not executed when they are compiled but are activated later when needed; in this case, when the rule is told to evaluate its condition and when the rule is told to run its action(s).
rule := root load: "TrRule".
myrule := rule condition: [ myValue = 23 ] action: [ cart forwards ].
Once the rule has been created, more actions can be added to it, including parallel markers. The operations allowed on a rule are:
Operation | Description
action: anAction | Adds an action to the rule
actions: anActionArray | Adds an array of actions to the rule. The actions are copied from the array, the array is discarded
parallel | Indicates that any following actions are to be run in parallel with the previously given actions. This command may be used many times for multiple parallelism
Example: Adding to a rule
In this example actions are added to the rule. Blocks can be defined and held in domains or variables, so these can be used to reference them. We show here some blocks being created and then being given to the rule. NB Rules are not (yet) themselves thread safe, so any one instance of a rule can only be in use in one place at any one time. Do not give the same rule to different rule sets.
Here we can see that two threads will be started when the rule is run. One thread will move the cart forwards, left, right and then stop it. The other thread will, independently, make the cart go fast, then slow, then fast again.
checkValue := [ myValue = 23 ].
root/actions at: "forwards" put: [ cart forwards ].
myrule := rule condition: checkValue.
myrule actions: #( root/actions/forwards [cart left] [cart right] ).
// The following line adds an action to the current action sequence, after [cart right]
// then it starts a new, parallel action sequence that will be run in parallel to the first 4 actions
myrule action: [ cart stop ]; parallel; action: [ cart fast. cart slow. cart fast ].
The first two of the following examples are equivalent; the third is similar but not identical. In each, the rule runs blocks b1, b3 and b4 in parallel. b2 is run when b1 finishes. b5 is run when b4 finishes. The rule finishes when all parallel threads have finished.
myrule action: b1; action: b2; parallel; action: b3; parallel; action: b4; action: b5.
// The following is the same as the preceding line
myrule actions: #( b1 b2 ); parallel; action: b3; parallel; actions: #( b4 b5 ).
// The following is similar but not identical
myrule action: [ b1 value. b2 value ]; parallel; action: b3; parallel; action: [ b4 value. b5 value ].
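As a rough illustration of how action and parallel build up sequences that are then run in separate threads, here is a small Python model. The Rule class and its method names are assumptions that only mirror the PonderTalk messages described above; this is not how TrPonder itself is implemented.

```python
import threading


class Rule:
    """Toy model of a rule's action list; not the TrPonder implementation."""

    def __init__(self):
        self.sequences = [[]]  # one list of actions per parallel thread

    def action(self, block):
        self.sequences[-1].append(block)  # extend the current sequence
        return self

    def parallel(self):
        self.sequences.append([])  # later actions start a new sequence
        return self

    def execute(self):
        def run(sequence):
            for step in sequence:
                step()

        threads = [threading.Thread(target=run, args=(seq,))
                   for seq in self.sequences]
        for t in threads:
            t.start()
        for t in threads:
            t.join()  # the rule finishes when every sequence has finished


log = []
lock = threading.Lock()


def block(name):
    """Return a callable that records its name when run."""
    def run():
        with lock:
            log.append(name)
    return run


rule = Rule()
rule.action(block("b1")).action(block("b2")).parallel()
rule.action(block("b3")).parallel()
rule.action(block("b4")).action(block("b5"))
rule.execute()
print(len(rule.sequences))  # 3
```

The relative order of blocks in different sequences is not fixed, but within a sequence b1 always precedes b2 and b4 always precedes b5.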
Stopping Rules
Rules can be stopped using the stop command. This is normally done by the system, not the user. However, the user should be aware of how the rule manages stop commands. If a stop is received, the rule sends a stop to the currently running action(s) and then waits for them to terminate. Once they have finished, the rule returns control to the requester. Blocks are atomic as far as PonderTalk is concerned and cannot be interrupted (and therefore don't receive the stop). This should be taken into account when designing the rule's actions.
In the above example, if either of the first two forms is used then a stop could cause the rule to terminate without having run b2 or b5, or both. In the third form, if the rule receives a stop command then it will wait until all three threads have run to completion because blocks are atomic, i.e. even if b2 hasn't started it will still be run when b1 finishes, before the action finishes.
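The difference between the forms can be modelled in a few lines of Python. This sketch assumes that a stop request is only honoured between blocks, as the text describes; the Sequence class and the demo function are hypothetical names used for illustration.

```python
class Sequence:
    """Toy model: a stop request is only honoured between blocks."""

    def __init__(self):
        self.blocks = []
        self.stop_requested = False

    def stop(self):
        self.stop_requested = True

    def run(self):
        for block in self.blocks:
            if self.stop_requested:
                return  # remaining blocks are skipped
            block()  # atomic: always runs to completion once started


def demo(combined):
    ran = []
    seq = Sequence()

    def b1():
        ran.append("b1")
        seq.stop()  # simulate a stop arriving while b1 is running

    def b2():
        ran.append("b2")

    if combined:
        seq.blocks = [lambda: (b1(), b2())]  # like [ b1 value. b2 value ]
    else:
        seq.blocks = [b1, b2]                # like actions: #( b1 b2 )
    seq.run()
    return ran


print(demo(combined=False))  # ['b1']
print(demo(combined=True))   # ['b1', 'b2']
```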
Rule Sets
Rule Sets are the heart of T-R programming. A rule set is given an ordered set of rules which it then manages. When started, the rule set asks each rule in turn whether it can be run; the first rule to respond that it can gets run. The rule set then proceeds from the top, evaluating the rules over and over again. Each time the rules are evaluated, the top-most rule that is ready to run is chosen. If the chosen rule is the currently running rule (and it is still running) then that rule is simply left alone to continue. If, however, the highest priority rule ready to run is not the currently running rule then the currently running rule is told to stop; when it has stopped, the chosen rule is run.
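One evaluation pass of this loop might be modelled as follows. This is a single-threaded sketch under the assumption that stopping and starting can be represented as log entries; ToyRule and evaluate are hypothetical names, not part of TrPonder.

```python
class ToyRule:
    def __init__(self, name, condition):
        self.name = name
        self.condition = condition


def evaluate(rules, percepts, current, log):
    """One pass of the rule set; returns the rule that should now be running."""
    chosen = next((r for r in rules if r.condition(percepts)), None)
    if chosen is current:
        return current  # the running rule is left alone
    if current is not None:
        log.append("stop " + current.name)  # stop the old rule first
    if chosen is not None:
        log.append("run " + chosen.name)
    return chosen


rules = [
    ToyRule("forwards", lambda p: p["facing_bin"]),
    ToyRule("turn", lambda p: True),
]
log = []
current = None
for facing in (False, False, True):
    current = evaluate(rules, {"facing_bin": facing}, current, log)
print(log)  # ['run turn', 'stop turn', 'run forwards']
```

The second pass produces no log entry because the chosen rule is already running.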
Creating a Rule Set
A rule set in PonderTalk is created using one of the following messages:
Operation | Description
create | Creates a new TrRuleSet
create: aName | Creates a new TrRuleSet with the name aName (used for trace messages)
Here we can see the T-R rule set factory being imported and then a new rule set Managed Object being created:
ruleset := root load: "TrRuleSet".
myruleset := ruleset create.
Adding Rules to a Rule Set
Once the rule set has been created it can have rules added to its ordered sequence of rules. Rules can also be added explicitly at a certain position. Once everything is set up, the rule set can be told to start and given the percepts hash. The operations for adding rules are:
Operation | Description
add: aRule | Adds aRule to the end of the ordered rule set.
at: anIndex put: aRule | Adds aRule at position anIndex to the rule set. Displaced rules get pushed down by one
In this example some rules are created and added immediately to a rule set:
myruleset add: (rule condition: [ c > 4 ] action: [ a := a + 1 ]).
myruleset add: (rule condition: [ a > 4 ] action: myAction).
// Whoops we forgot a rule(!), let's add it in front of the last one
myruleset at: 1 put: (rule condition: [ true ] action: anotherAction).
Starting and Stopping Rule Sets
A rule set is started by sending it a run command along with the percepts that the rule set and all its rules will be using. The start is asynchronous: a new thread is started for the rule set. The stop is synchronous, i.e. the stop does not return until the rule set has stopped its currently running rule. A rule set can be told to re-evaluate its rules by sending it an explicit event (this is how the percepts object tells the rule set to re-evaluate the rules' conditions):
Operation | Description
run: thePercepts | Runs the rule set using the percepts found in thePercepts. The rule set is run in a separate thread so that it can be given new events or may be stopped. This call immediately returns leaving the rule set running.
stop | Tells the rule set to stop executing rules. If a rule is running then that rule is told to stop. This action returns after the currently running rule (if any) has been stopped
Example: Controlling a rule set
Here we can see the T-R rule set being started, later being told to re-evaluate its rules and then still later stopped:
myruleset run: mypercepts.
// Wait for 10 seconds before telling the rule set to re-evaluate its rules.
root sleep: 10.
myruleset event.
// Wait for 20 seconds before stopping the rule set
root sleep: 20.
myruleset stop.
Example: Controlling multiple rule sets
Multiple rule sets can be given events by registering each of them with the same percepts object:
mypercepts tell: myruleset1; tell: myruleset2.
myruleset1 run: mypercepts.
myruleset2 run: mypercepts.
// Wait for 10 seconds before telling all the rule sets to re-evaluate their rules.
root sleep: 10.
mypercepts event.
// This line will also cause the rule sets to re-evaluate their rules
mypercepts at: "myvar" put: "myvalue".
Debugging
It can sometimes be difficult to see what is happening in a T-R program, especially if there are multiple threads running. The TrPonder system has a trace facility that can be turned on and off to help keep track of the rules being run. Debugging is turned on with the TrRuleSet trace command:
Operation | Description
trace: aBoolean | Sets all T-R tracing on for all rule sets and rules if aBoolean is true. Tracing is set to false initially
Debugging is turned on and off for the whole TrPonder system running within an SMC. That means that all rule sets and rules will start generating tracing information to the Java console until tracing is turned off again.
// Turn tracing on
myruleset trace: true.
// Turn tracing off
myruleset trace: false.
Rule Set Behaviour Modifications
In the strict sense of T-R programs, the Rule Set constantly re-evaluates its rules' conditions. If a rule with a higher priority (earlier on in the ordered set) than the currently running rule becomes available to run then the currently running rule is stopped and the new rule is run. This is not very efficient in a multitasking system, so various options have been created to manage the rule sets efficiently:
Run and Wait
The Rule Set can be told to run one rule and then wait for a new event to occur or the rule to finish before re-evaluating all the rules' conditions.
Wait For Rule
The Rule Set will not try to stop a currently running rule even if there is a higher priority rule available to run. The Rule Set will wait for the rule to terminate before running the new rule.
Run Once
The Rule Set can be told to run one rule and then terminate. This implies Wait For Rule.
Event
The event command tells the rule set that an event has occurred and that the rules should be re-evaluated. This may be overridden by one of the above settings. This call is used by the percepts to indicate that a value has changed.
These operations are performed on rule sets using the following PonderTalk commands:
Operation | Description
event | Tells the receiver to re-evaluate the rule conditions and to choose a (possibly) new rule to run
runOnce: aBoolean | Tells the rule set to run one rule and then terminate if aBoolean is true. The initial value is false
waitForEvent: aBoolean | Tells the rule set to wait for an event before running through the rule list again if aBoolean is true. Use of this option can remove unnecessary scans of the rule conditions. The initial value is false.
waitForRule: aBoolean | Tells the rule set to wait for completion of a rule rather than interrupting it as necessary if aBoolean is true. The initial value is false
Percepts
Percepts contain the global knowledge that the T-R rule set shares with all the rules. The rule set is informed whenever the percepts change and, depending upon the options used, this could trigger a new evaluation of the rules and therefore possibly a new rule to be run. The percepts managed object is derived from a PonderTalk Hash managed object which is a basic managed object within Ponder2. As such it needs to be created in the following way which ensures that the new object looks like a hash to other managed objects (internally, basic managed objects are treated differently from normal managed objects, see the source code for more details):
pfactory := root load: "Percepts".
percepts := pfactory create asHash.
The percepts object can now be used as a hash in the normal way. It can also be told to notify one or more rule sets should any values be written to it. The percepts object is given to a rule set when the rule set is run. The rule set hands it to each rule that is run, and the rule hands it to each action. Blocks see the percepts values as global variables, so block arguments are not required.
Operation | Description
tell: aRuleSet | Tells the percepts that it must inform the given rule set whenever a value changes or an event is raised. This command may be used many times to inform many rule sets
event | Tells the percepts to inform the rule set(s), given by the tell command, that an event has occurred
at: aName put: aValue | Store a value and raise an event for the rule set(s)
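The notification behaviour in the table above can be approximated in Python as a dictionary that informs its listeners on every write. The class and method names below are assumptions that merely mirror tell:, event and at:put:; CountingRuleSet is a hypothetical stand-in for a real rule set.

```python
class Percepts(dict):
    """Toy model of the percepts object: a hash that notifies listeners."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.listeners = []

    def tell(self, rule_set):
        self.listeners.append(rule_set)  # may be called many times

    def event(self):
        for rule_set in self.listeners:
            rule_set.event()

    def __setitem__(self, name, value):
        super().__setitem__(name, value)  # store the value...
        self.event()                      # ...then raise an event


class CountingRuleSet:
    """Stand-in for a rule set; it only counts the events it receives."""

    def __init__(self):
        self.events = 0

    def event(self):
        self.events += 1


rs1, rs2 = CountingRuleSet(), CountingRuleSet()
percepts = Percepts()
percepts.tell(rs1)
percepts.tell(rs2)
percepts["myvar"] = "myvalue"  # a write notifies every registered rule set
percepts.event()               # an explicit event does the same
print(rs1.events, rs2.events)  # 2 2
```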
Percepts Examples
PonderTalk blocks can take arguments from a Ponder2 hash managed object. They can also merge the values from a hash into their environment variables before executing the block's code. This allows values in a hash to be used as plain variables without declaring them as arguments.
percepts at: "size" put: 5.
rule condition: [ size == 5 ] action: ...
// The rule runs this as:
ruleCondition valueVars: percepts
// Otherwise we would need
rule condition: [ :size | size == 5 ] action: ...
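A loose Python analogy for merging a hash into a block's variables is to evaluate an expression with the hash supplied as its local namespace. The value_vars helper below is a hypothetical name, and using eval on a string is only a stand-in for a PonderTalk block.

```python
def value_vars(expression, percepts):
    """Evaluate an expression with the hash's entries visible as variables."""
    return eval(expression, {"__builtins__": {}}, dict(percepts))


percepts = {"size": 5}
print(value_vars("size == 5", percepts))  # True

# Without the merge, the value has to be passed as an explicit argument.
explicit = lambda size: size == 5
print(explicit(percepts["size"]))  # True
```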
Actions
The examples above have all shown actions as blocks. In addition to blocks, actions may also be rule sets or other rules. All three types are described more fully here.
Code Blocks
A code block is a non-interruptible sequence of operations. If a long-running action is to be made interruptible then it should be split into a sequence of actions; then, after one code block has finished, the rule can be terminated before the next action is started. Conversely, if there are several actions that should be atomic then they should be grouped into a single code block.
Rule Sets
Other Rules
If another rule is given as an action then it is simply told to execute its actions when the time comes. Its condition is ignored. This is useful if an action is to perform operations in parallel, since it can then be described as a rule.
The Truck Example Revisited
An enhancement to the truck rule set would be to have the truck turn and move forwards at the same time. This could be done by using parallel actions in certain rules, e.g. we can combine rules 6 and 7:
Rule | Condition | Action
1 | unloaded & at depot | finished
2 | loaded & at depot | unload
3 | loaded & facing depot | go forwards
4 | loaded | turn left
5 | next to bin | load bin
6 | | approach bin
Where the action "approach bin" would be a rule that contains two rule sets as parallel actions:
rule action: turn_towards; parallel; action: move_towards.
One rule set (turn_towards) keeps the truck facing the bin:
Rule | Condition | Action
1.1 | bin to left or directly behind | turn left
1.2 | bin to right | turn right
1.3 | | do nothing
The other rule set (move_towards) moves the truck in the direction of the bin:
Rule | Condition | Action
2.1 | bin in front within 45 degrees left or right | move forwards
2.2 | | do nothing
Once the truck has reached the bin, rule 5 kicks in which stops rule 6. Stopping rule 6 stops the two movement rule sets which in turn stop their currently running rules. Only when rule 6 has stopped completely will rule 5's action be started.
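To make the two parallel rule sets concrete, the sketch below expresses each as a Python function of the bin's bearing. The bearing convention (degrees relative to the truck's heading, negative to the left, 180 directly behind) is an assumption introduced for this example only.

```python
def turn_towards(bearing):
    """Rule set 1: keep the truck facing the bin."""
    if bearing < 0 or bearing == 180:  # rule 1.1: left or directly behind
        return "turn left"
    if bearing > 0:                    # rule 1.2: to the right
        return "turn right"
    return "do nothing"                # rule 1.3: empty condition


def move_towards(bearing):
    """Rule set 2: move when the bin is roughly ahead."""
    if abs(bearing) <= 45:             # rule 2.1: within 45 degrees
        return "move forwards"
    return "do nothing"                # rule 2.2: empty condition


def approach_bin(bearing):
    """Both rule sets act on the same percept at the same time."""
    return turn_towards(bearing), move_towards(bearing)


for bearing in (180, -90, 30, 0):
    print(bearing, approach_bin(bearing))
```

With the bin 30 degrees to the right, for example, the truck both turns right and moves forwards.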
Another enhancement would be to reuse the new rule sets to move towards the depot as well as the bin. This can be done by specifying that the new rule sets move towards not the bin but, say, target. Target can be set in the percepts at first to be the bin and later, after load bin is executed, to be the depot. This will give us a main rule set looking something like:
Rule | Condition | Action
1 | unloaded & at depot | finished
2 | loaded & at depot | unload
3 | loaded | target=depot, approach target
4 | next to bin | load bin
5 | | target=bin, approach target
Caveats and Notes
There are a few caveats to consider when using the TrPonder system. These are described here. More will be added based on users' experiences.
Use of Non-Percepts Values in Conditions
If a rule condition is not based on a value in percepts, but rather it references some other external managed object's state or value, the programme writer has to ensure that those conditions will get a chance to be re-evaluated. For example let us assume the following T-R programme:
Rule set 1
Rule | Condition | Action
1 | a > 4 | myaction
2 | | ruleset 2
Rule set 2
Rule | Condition | Action
1 | c > 4 | a++
2 | | c++
In TrPonder this would be written as:
percepts at: "c" put: 1.
a := 1.

ruleset2 add: (rule condition: [ c > 4 ] action: [ a := a + 1 ]).
ruleset2 add: (rule condition: [ true ] action: [ percepts at: "c" put: (c + 1) ]).

ruleset1 add: (rule condition: [ a > 4 ] action: myaction).
ruleset1 add: (rule condition: [ true ] action: ruleset2).

percepts tell: ruleset1.
ruleset1 run: percepts.
Once the c variable has reached 5, the T-R programme will start increasing the variable a. However, since a is not in the percepts, no events are fired to signal the change in its value. Therefore, even when a becomes 5, the top-most rule will not get fired. This situation is the result of the following two semantic details: 1) conditions are re-evaluated when a percept value changes; 2) conditions in the currently running rule set are re-evaluated after a rule has finished. Since a is not in the percepts, no event is received, and since the condition (a > 4) is not in the currently running rule set, that condition does not get a chance to be re-evaluated. The easiest way around this problem is to constrain the conditions to be based only on percepts.
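The effect can be reproduced with a small Python model in which the outer condition a > 4 is re-checked only when a percept is written. The Percepts class and simulate function are hypothetical; they model the event mechanism only, not the full rule sets.

```python
class Percepts(dict):
    """Toy percepts: every write triggers the registered callback."""

    def __init__(self):
        super().__init__()
        self.on_change = None

    def __setitem__(self, name, value):
        super().__setitem__(name, value)
        if self.on_change is not None:
            self.on_change()


def simulate(a_in_percepts):
    """Return True if the outer rule set ever sees a > 4."""
    percepts = Percepts()
    external = {"a": 1}
    percepts["c"] = 1
    if a_in_percepts:
        percepts["a"] = 1
    store = percepts if a_in_percepts else external

    seen = []  # results of re-evaluating the condition a > 4
    percepts.on_change = lambda: seen.append(store["a"] > 4)

    for c in range(2, 6):       # the c++ rule: each write raises an event
        percepts["c"] = c
    for _ in range(4):          # the a++ rule: a goes from 1 to 5
        store["a"] = store["a"] + 1
    return any(seen)


print(simulate(a_in_percepts=False))  # False: no event when a changes
print(simulate(a_in_percepts=True))   # True: the write to a raises an event
```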
Rules Are Not Re-entrant
Rules are not (yet) themselves thread safe, so any one instance of a rule can only be in use in one place at any one time. Do not give the same rule to different rule sets.
PonderTalk Blocks
There are a few problems inherent in the way that PonderTalk's blocks work. A block is a closure, that is, it contains a copy of its environment at the time the block is created, and so it is difficult to maintain common information between blocks. This is where the usefulness of the percepts comes in. The copy of the environment is a shallow copy, which means that only the values of the top level environment variables are copied, not the objects that they refer to. Thus if a variable refers to a domain, then the new copy of the variable will refer to the same domain; in this way shared values can be facilitated.
// Create a new hash
var := #() asHash.
var at: "size" put: 10.

b1 := [ ... var at: "size" put: 12. ... var := 7. ..].
b2 := [ ... root print: (var at: "size"). ... ].

b2 value. // prints 10
b1 value.
b2 value. // prints 12

b1 value. // error number does not understand "at:put:"
b2 value. // prints 12

var := 4.
b2 value. // prints 12
When the blocks are created, the value of var is copied into each block, but both copies still refer to the same PonderTalk Hash managed object.
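The same behaviour can be imitated in Python by giving each block a shallow copy of an environment dictionary. Python closures do not normally copy their environment, so the Block class below is a hypothetical model built purely to mirror the PonderTalk semantics described above.

```python
class Block:
    """Toy block: captures a shallow copy of the environment at creation."""

    def __init__(self, env, body):
        self.env = dict(env)  # copies the variable bindings, not the objects
        self.body = body

    def value(self):
        return self.body(self.env)


env = {"var": {"size": 10}}  # var refers to a shared hash


def b1_body(e):
    e["var"]["size"] = 12    # mutates the shared hash
    e["var"] = 7             # rebinds var in b1's own copy only


b1 = Block(env, b1_body)
b2 = Block(env, lambda e: e["var"]["size"])

print(b2.value())  # 10
b1.value()
print(b2.value())  # 12
env["var"] = 4     # rebinding the outer variable does not affect b2
print(b2.value())  # 12
```

Calling b1.value() a second time raises a TypeError here, which corresponds to the "does not understand" error in the PonderTalk example.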
Download TrPonder
Note: This is a work in progress and may be changed at any time. The current files are dated 11th October 2009.
The TrPonder src zip file can be downloaded here and the TrPonder binary Jar file is here. The source file should be unzipped inside your Ponder2 installation src directory. It will create some files in net/ponder2 and some files in resource. To run the example use the build.xml file you already have in the Ponder2 installation:
ant run -Dboot=trmaze.p2
The binary file can simply be added to your classpath in the build.xml file and run as above.