The rediscovery and popularization of the back propagation training technique for multilayer perceptrons, as well as the invention of the Boltzmann Machine learning algorithm, have given a new boost to the study of supervised learning networks. In recent years, besides the widespread applications and various further improvements of the classical back propagation technique, many new supervised learning models, techniques, and theories have been proposed in a vast number of publications. This paper gives a systematic review of recent advances in supervised learning techniques and theories for static feedforward networks. We summarize a large number of developments under five aspects: (1) Various improvements and variants of the classical back propagation technique for multilayer (static) perceptron nets, aimed at speeding up training, avoiding local minima, improving generalization ability, and many other purposes. (2) A number of other learning methods for training multilayer (static) perceptrons, such as derivative estimation by perturbation, direct weight updating by perturbation, genetic algorithms, recursive least squares estimation and the extended Kalman filter, linear programming, the strategy of fixing one layer while updating another, constructing networks from decision tree classifiers, and others. (3) Various other feedforward models that can also implement function approximation, probability density estimation, and classification, including models based on basis function expansion (e.g., radial basis functions, restricted Coulomb energy, multivariate adaptive regression splines, trigonometric and polynomial bases, projection pursuit, basis function trees, and many others), and several other supervised learning models. (4) Models with complex structures, e.g., modular architectures, hierarchical architectures, and others. (5) A number of theoretical issues, including the universal approximation of continuous functions, best approximation ability, learnability, capability, and generalization ability, as well as the relations of these abilities to the number of layers in a network, the number of hidden neurons needed, and the number of training samples. Altogether, we try to give a global picture of the present state of supervised learning techniques and theories for training static feedforward networks.
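As a concrete point of reference for the classical back propagation technique reviewed under aspect (1), the following minimal sketch (not taken from the paper; the network sizes, learning rate, and toy data are illustrative assumptions) shows the standard gradient-descent weight update for a one-hidden-layer sigmoid perceptron trained on a toy problem.

```python
# Minimal back propagation sketch for a one-hidden-layer sigmoid perceptron.
# All sizes, the learning rate, and the toy data are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: 2 inputs, 4 hidden units, 1 output.
W1 = rng.normal(scale=0.5, size=(2, 4))   # input-to-hidden weights
b1 = np.zeros(4)
W2 = rng.normal(scale=0.5, size=(4, 1))   # hidden-to-output weights
b2 = np.zeros(1)
lr = 0.5                                   # learning rate (assumed)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy XOR-like data, purely for demonstration.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
T = np.array([[0.], [1.], [1.], [0.]])

for epoch in range(5000):
    # Forward pass.
    H = sigmoid(X @ W1 + b1)          # hidden activations
    Y = sigmoid(H @ W2 + b2)          # network outputs
    # Backward pass: deltas of the squared error for each layer.
    dY = (Y - T) * Y * (1.0 - Y)      # output-layer deltas
    dH = (dY @ W2.T) * H * (1.0 - H)  # hidden-layer deltas (error propagated back)
    # Gradient-descent weight updates.
    W2 -= lr * H.T @ dY
    b2 -= lr * dY.sum(axis=0)
    W1 -= lr * X.T @ dH
    b1 -= lr * dH.sum(axis=0)

print(np.round(Y.ravel(), 2))  # outputs should approach [0, 1, 1, 0]
```

Many of the improvements surveyed in the paper (momentum terms, adaptive learning rates, alternative error functions, and so on) can be viewed as modifications of this basic update loop.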