Accelerated Learning in Layered Neural Networks
Sara A. Solla
AT&T Bell Laboratories, Holmdel, NJ 07733, USA
Esther Levin
Michael Fleisher
Technion - Israel Institute of Technology, Haifa 32000, Israel
Abstract
Learning in layered neural networks is posed as the minimization of an error function defined over the training set. A probabilistic interpretation of the target activities suggests the use of relative entropy as an error measure. We investigate the merits of using this error function over the traditional quadratic function for gradient descent learning. Comparative numerical simulations for the contiguity problem show marked reductions in learning times. This improvement is explained in terms of the characteristic steepness of the landscape defined by the error function in configuration space.
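For concreteness, the two error measures under comparison can be sketched as follows; the notation is assumed for illustration (not quoted from the text), with $o_i^{\mu}$ the activity of output unit $i$ on training example $\mu$ and $t_i^{\mu} \in [0,1]$ the corresponding target, read as a probability:

E_Q = \frac{1}{2} \sum_{\mu=1}^{m} \sum_{i} \left( t_i^{\mu} - o_i^{\mu} \right)^2,
\qquad
E_R = \sum_{\mu=1}^{m} \sum_{i} \left[ t_i^{\mu} \ln \frac{t_i^{\mu}}{o_i^{\mu}} + \left( 1 - t_i^{\mu} \right) \ln \frac{1 - t_i^{\mu}}{1 - o_i^{\mu}} \right].

Under this form, the gradient of $E_R$ with respect to an output grows without bound as the output saturates at the wrong value, whereas the gradient of $E_Q$ stays bounded; this is consistent with the steeper error landscape invoked above.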