Bayesian Back-Propagation
Wray L. Buntine
RIACS & NASA Ames Research Center
Mail Stop 269-2, Moffett Field, CA 94035, USA
Andreas S. Weigend
Xerox Palo Alto Research Center,
3333 Coyote Hill Rd., Palo Alto, CA, 94304, USA
Abstract
Connectionist feed-forward networks, trained with back-propagation, can be used both for nonlinear regression and for (discrete one-of-C) classification. This paper presents approximate Bayesian methods for the statistical components of back-propagation: choosing a cost function and penalty term (interpreted as a form of prior probability), pruning insignificant weights, estimating the uncertainty of weights, predicting for new patterns ("out-of-sample"), estimating the uncertainty in the choice of this prediction ("error bars"), estimating the generalization error, comparing different network structures, and handling missing values in the training patterns. These methods extend several heuristic techniques suggested in the literature, and in most cases require only a small additional factor of computation during back-propagation, or a computation performed once back-propagation has finished.
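As a minimal sketch of the first idea listed above — reading a weight-decay penalty term as a prior probability — consider that for a zero-mean Gaussian prior on the weights, the negative log-prior is (up to a constant) a quadratic penalty lam * ||w||^2, so minimizing "error + penalty" is MAP estimation. The linear model, variable names, and the penalty strength `lam` below are illustrative assumptions, not taken from the paper:

```python
import numpy as np

# Illustrative assumption: a linear model stands in for a feed-forward network,
# so the MAP solution has a closed form we can check against.

def neg_log_likelihood(w, X, y):
    """Sum-of-squares error: negative log-likelihood under Gaussian noise (up to a constant)."""
    resid = X @ w - y
    return 0.5 * np.sum(resid ** 2)

def neg_log_prior(w, lam):
    """Weight-decay penalty: negative log of a zero-mean Gaussian prior (up to a constant)."""
    return lam * np.sum(w ** 2)

def map_cost(w, X, y, lam):
    """Penalized training cost = negative log posterior (up to a constant)."""
    return neg_log_likelihood(w, X, y) + neg_log_prior(w, lam)

# For this linear case, setting the gradient to zero gives the ridge solution:
#   (X^T X + 2*lam*I) w = X^T y
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.standard_normal(50)

lam = 0.1  # assumed penalty strength; in the Bayesian reading, the prior precision
w_map = np.linalg.solve(X.T @ X + 2 * lam * np.eye(3), X.T @ y)

# The MAP weights should beat any small perturbation on the penalized cost.
for _ in range(5):
    w_pert = w_map + 0.01 * rng.standard_normal(3)
    assert map_cost(w_map, X, y, lam) < map_cost(w_pert, X, y, lam)
```

The design point is that the penalty does not need to be treated as an ad-hoc regularizer: choosing `lam` is choosing the variance of the prior, which is what opens the door to the Bayesian treatment of pruning, error bars, and model comparison listed in the abstract.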