Kohonen has proposed a physiologically plausible method of cooperative and competitive organization for artificial neural networks that allows them to self-organize around a set of input vectors.1 This now-famous approach, Kohonen's Self-Organizing Topological Feature Maps, has been applied extensively to pattern classification problems. For many problems, however, it is not enough to say which class the input falls into; what is needed is the (real-valued) output appropriate for the class to which the input vector belongs. Several methods have been proposed for extending Kohonen Networks so that they may learn appropriate output responses.2,3 Unfortunately, these supervised methods require a "teacher" that knows the correct output responses. While in all problems the input vectors are "correct" (that is why the network is to learn them), in many problems the correct output responses are not available (which, contrapositively, may be the reason we wish to train the network). To partially fill this gap I propose a Self-Organizing Neural Network using Eligibility Traces (SONNET). SONNET is appropriate for those problems in which the correct output responses are not known, but in which a feedback mechanism that allows for an overall evaluation of system performance (success and/or failure signals) is available and system performance depends temporally on network responses. Such is the case with controllers for many physical systems (including some robotics applications) as well as chemical and biological systems, and is even the case for some object recognition and other computer vision problems. SONNET works by combining the self-organizing capabilities of Kohonen Networks with the temporal sensitivity of eligibility traces. The concept of the eligibility trace comes from observations of human and animal brains. 
It has been noted that many neurons become more amenable to change when they fire.4 This plasticity diminishes with time, but leaves a trace of eligibility for adaptation. Using these traces, SONNET adapts the output responses to a greater or lesser degree depending on their eligibility for adaptation at the time when the failure and/or success signals are received. The use of SONNET is demonstrated on a simulation of the pole-balancing problem, a well-known (toy) physical system control problem. Comparisons are made between SONNET and other neural network5 and nonconnectionist control-learning systems. SONNET is seen to be powerful and adaptable. It is capable of learning both a useful partitioning of the input space and, without supervision, an appropriate output response for each class.
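
As a rough illustration of the eligibility-trace idea described above, the following Python sketch shows one way a trace could be raised when a unit fires, decayed over time, and then used to scale an output adjustment when a success or failure signal arrives. It is a minimal sketch under stated assumptions, not the update rule used by SONNET: the function names, the decay factor, the learning rate, and the sign-based adjustment are all hypothetical choices made for illustration.

```python
# Illustrative sketch only. The names, constants, and update rule below are
# assumptions made for this example; they are not taken from the paper.

def decay_and_mark(traces, winner, decay=0.9):
    """Decay every eligibility trace, then set the winning unit's trace to 1.

    `winner` is the index of the unit that fired for the current input.
    """
    updated = [decay * e for e in traces]
    updated[winner] = 1.0
    return updated


def apply_feedback(outputs, traces, signal, rate=0.5):
    """Adjust each unit's output in proportion to its eligibility.

    signal > 0 stands for success: eligible outputs grow in magnitude.
    signal < 0 stands for failure: eligible outputs are pushed toward the
    opposite sign. Units with zero eligibility are left unchanged.
    """
    adjusted = []
    for out, elig in zip(outputs, traces):
        direction = 1.0 if out >= 0 else -1.0
        adjusted.append(out + rate * signal * elig * direction)
    return adjusted


# Hypothetical usage: unit 0 fires, then unit 2 fires, then a failure arrives.
traces = [0.0, 0.0, 0.0]
traces = decay_and_mark(traces, winner=0)
traces = decay_and_mark(traces, winner=2)
outputs = apply_feedback([1.0, 1.0, 1.0], traces, signal=-1.0)
```

In this sketch the unit that fired most recently receives the largest correction, the unit that fired one step earlier receives a smaller one, and the unit that never fired is untouched, which mirrors the "greater or lesser degree" of adaptation described in the text.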