Deliberative vs Reflexive Learning

Deliberative learning is when an architecture decides whether something is worth learning. That decision may break down into several smaller ones.

Some of these decisions are made when the new information first arrives, but some may be made later. In particular, Prodigy keeps running statistics on how well it is served by the control rules that it learns. Balancing the decrease in problem space search cost gained by additional knowledge against the increase in knowledge search cost is known as the utility problem.
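The trade-off can be made concrete with a small sketch. It assumes a utility estimate of the form (average savings × application frequency) − average match cost, a formulation usually associated with Minton's work on Prodigy. The class name `RuleStats`, its fields, and the helper `keep_useful` are hypothetical and are not taken from Prodigy itself.

```python
from dataclasses import dataclass


@dataclass
class RuleStats:
    """Running statistics for one learned control rule (hypothetical layout)."""
    match_attempts: int = 0         # times the rule was tested against a state
    applications: int = 0           # times the rule actually fired
    total_match_cost: float = 0.0   # cumulative cost of testing the rule
    total_savings: float = 0.0      # cumulative search cost avoided when it fired

    def record(self, match_cost: float, fired: bool, savings: float = 0.0) -> None:
        """Update the statistics after one attempt to match the rule."""
        self.match_attempts += 1
        self.total_match_cost += match_cost
        if fired:
            self.applications += 1
            self.total_savings += savings

    def utility(self) -> float:
        """Assumed estimate: avg savings * application frequency - avg match cost."""
        if self.match_attempts == 0:
            return 0.0
        avg_match_cost = self.total_match_cost / self.match_attempts
        frequency = self.applications / self.match_attempts
        avg_savings = (
            self.total_savings / self.applications if self.applications else 0.0
        )
        return avg_savings * frequency - avg_match_cost


def keep_useful(rules: dict[str, RuleStats]) -> dict[str, RuleStats]:
    """Keep only rules whose estimated utility is positive (assumed policy)."""
    return {name: stats for name, stats in rules.items() if stats.utility() > 0}
```

Under these assumptions, a rule that fires on half of its match attempts, saves 10 units each time it fires, and costs 1 unit per attempt has utility 10 × 0.5 − 1 = 4 and would be kept.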

The advantages of deliberative learning are:

  1. memory savings, because only the most effective rules are kept,
  2. time savings when matching rules,
  3. less work for the system designer to make sure the agent learns only what it is supposed to.

Reflexive learning is when an architecture automatically keeps all rules it creates. No statistics on rule efficiency are kept. The advantages of reflexive learning are:

  1. simpler learning mechanisms,
  2. time and memory savings by not keeping statistics on, and grading, rules
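The difference between the two policies can be sketched as two rule stores. This is a minimal illustration, not the mechanism of any architecture listed here; the class names and the `worth_learning` predicate are hypothetical.

```python
from typing import Callable, List


class ReflexiveStore:
    """Keeps every rule it is given; no statistics, no filtering."""

    def __init__(self) -> None:
        self.rules: List[str] = []

    def learn(self, rule: str) -> bool:
        self.rules.append(rule)
        return True


class DeliberativeStore:
    """Asks a decision procedure whether each rule is worth keeping."""

    def __init__(self, worth_learning: Callable[[str], bool]) -> None:
        self.rules: List[str] = []
        self.worth_learning = worth_learning  # hypothetical decision procedure

    def learn(self, rule: str) -> bool:
        if self.worth_learning(rule):
            self.rules.append(rule)
            return True
        return False  # the rule is discarded, saving memory and match time


# Usage: the same candidate rules, two different outcomes.
candidates = ["prefer-op-A", "prefer-op-B", "reject-goal-C"]

reflexive = ReflexiveStore()
# Placeholder criterion: keep only "prefer" rules.
deliberative = DeliberativeStore(lambda rule: rule.startswith("prefer"))

for rule in candidates:
    reflexive.learn(rule)
    deliberative.learn(rule)
```

The reflexive store is simpler and does no bookkeeping, while the deliberative store pays for a decision on every rule in exchange for a smaller rule set.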


Examples of deliberative learning architectures are:

  • Dynamic Control Architecture by B. Hayes-Roth. Dynamic control planning decides what to learn.
  • Prodigy by Carbonell et al. (cf. the utility problem).

Examples of reflexive learning architectures are:

  • Atlantis by E. Gat. Learning takes place at the deliberative layer.
  • ERE by M. Drummond et al.
  • Homer by Vere & Bickmore. It always adds data to its episodic memory.
  • SOAR by A. Newell et al. Chunking is SOAR's learning mechanism; it may be turned either on or off.

Examples of "hybrid" learning architectures are:

  • Icarus by P. Langley et al. Labyrinth keeps statistics on which learned rules are used. These statistics do not "erase" anything from memory, but they are used to control what types of memory are searched when.
  • Theo by T. Mitchell et al.
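The hybrid idea described for Icarus, where usage statistics steer search without erasing anything, can be illustrated with a small sketch. It shows only the general idea and is not Labyrinth's actual mechanism; the class `UsageOrderedMemory` and its methods are hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional


class UsageOrderedMemory:
    """Keeps every rule, but searches frequently used rules first."""

    def __init__(self) -> None:
        self.rules: Dict[str, Callable[[int], bool]] = {}
        self.uses: Counter = Counter()  # usage statistics, one count per rule

    def add(self, name: str, condition: Callable[[int], bool]) -> None:
        self.rules[name] = condition  # reflexive: nothing is ever rejected

    def search_order(self) -> List[str]:
        # Most-used rules come first; ties keep insertion order.
        return sorted(self.rules, key=lambda name: -self.uses[name])

    def first_match(self, state: int) -> Optional[str]:
        for name in self.search_order():
            if self.rules[name](state):
                self.uses[name] += 1
                return name
        return None


memory = UsageOrderedMemory()
memory.add("even", lambda x: x % 2 == 0)
memory.add("big", lambda x: x > 100)

first = memory.first_match(101)   # "even" fails, "big" fires and gains a use
second = memory.first_match(102)  # "big" is now tried first and fires again
```

Both rules remain in memory throughout; only the order in which they are tried changes as the statistics accumulate.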

Examples of architectures that make no commitment are:

  • Behavior-Based Programming by R. Brooks.
  • MAX by Daniel Kuokka. MAX has no architecturally-defined learning mechanism.
  • RALPH by Ogasawara and Russell. RALPH has no architecturally-defined learning mechanism, but, in principle, the simpler Execution Architectures may use cached results computed by their more thorough cousins.
  • Subsumption by R. Brooks. Because Subsumption Architecture has no state, its agents cannot learn.
