Backpropagation Can Give Rise to Spurious Local Minima Even for Networks without Hidden Layers
Eduardo D. Sontag
Héctor J. Sussmann
Department of Mathematics, Rutgers University,
New Brunswick, NJ 08903, USA
Abstract
We give an example of a neural net without hidden layers and with a sigmoid transfer function, together with a training set of binary vectors, for which the sum of the squared errors, regarded as a function of the weights, has a local minimum which is not a global minimum. The example consists of a set of 125 training instances, with four weights and a threshold to be learned. We do not know if substantially smaller binary examples exist.
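The error surface in question can be made concrete. The following is a minimal sketch, not the authors' actual 125-instance example (whose targets are not given here): a network with no hidden layer maps a binary input vector $x$ to $\sigma(w \cdot x + \theta)$, and training minimizes the sum of squared errors over the training set. The training data below are made up purely for illustration.

```python
import numpy as np

def sigmoid(z):
    """Standard sigmoid transfer function."""
    return 1.0 / (1.0 + np.exp(-z))

def sse(w, theta, X, y):
    """Sum of squared errors E(w, theta) = sum_i (sigma(w.x_i + theta) - y_i)^2."""
    preds = sigmoid(X @ w + theta)
    return float(np.sum((preds - y) ** 2))

# Hypothetical tiny training set of binary vectors with four inputs
# (matching the four weights and one threshold of the paper's setting);
# the targets here are invented for illustration only.
X = np.array([[0, 0, 0, 0],
              [1, 0, 1, 0],
              [1, 1, 1, 1]], dtype=float)
y = np.array([0.0, 1.0, 1.0])

# At w = 0, theta = 0 every output is sigma(0) = 0.5, so the error is
# (0.5)^2 + (0.5)^2 + (0.5)^2 = 0.75.
w = np.zeros(4)
theta = 0.0
print(sse(w, theta, X, y))  # prints 0.75
```

Backpropagation on such a network performs gradient descent on this function of $(w, \theta)$; the paper's point is that even in this shallow setting the descent can converge to a local minimum whose error exceeds the global minimum.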