Bayesian Back-Propagation
Wray L. Buntine
RIACS & NASA Ames Research Center
Mail Stop 269-2, Moffett Field, CA 94035, USA
Andreas S. Weigend
Xerox Palo Alto Research Center,
3333 Coyote Hill Rd., Palo Alto, CA, 94304, USA
Abstract
Connectionist feed-forward networks, trained with back-propagation, can be used both for nonlinear regression and for (discrete one-of-C) classification. This paper presents approximate Bayesian methods for the statistical components of back-propagation: choosing a cost function and penalty term (interpreted as a form of prior probability), pruning insignificant weights, estimating the uncertainty of weights, predicting for new patterns ("out-of-sample"), estimating the uncertainty in the choice of this prediction ("error bars"), estimating the generalization error, comparing different network structures, and handling missing values in the training patterns. These methods extend several heuristic techniques suggested in the literature, and in most cases require only a small additional factor of computation during back-propagation, or a computation performed once back-propagation has finished.
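As a minimal sketch of the first idea listed above — reading a weight-decay penalty term as a prior probability — consider that for a zero-mean Gaussian prior on the weights, the negative log-prior is (up to a constant) a quadratic penalty lam * ||w||^2, so minimizing "error + penalty" is MAP estimation. The linear model, variable names, and the penalty strength `lam` below are illustrative assumptions, not taken from the paper:

```python
import numpy as np

# Illustrative assumption: a linear model stands in for a feed-forward network,
# so the MAP solution has a closed form we can check against.

def neg_log_likelihood(w, X, y):
    """Sum-of-squares error: negative log-likelihood under Gaussian noise (up to a constant)."""
    resid = X @ w - y
    return 0.5 * np.sum(resid ** 2)

def neg_log_prior(w, lam):
    """Weight-decay penalty: negative log of a zero-mean Gaussian prior (up to a constant)."""
    return lam * np.sum(w ** 2)

def map_cost(w, X, y, lam):
    """Penalized training cost = negative log posterior (up to a constant)."""
    return neg_log_likelihood(w, X, y) + neg_log_prior(w, lam)

# For this linear case, setting the gradient to zero gives the ridge solution:
#   (X^T X + 2*lam*I) w = X^T y
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.standard_normal(50)

lam = 0.1  # assumed penalty strength; in the Bayesian reading, the prior precision
w_map = np.linalg.solve(X.T @ X + 2 * lam * np.eye(3), X.T @ y)

# The MAP weights should beat any small perturbation on the penalized cost.
for _ in range(5):
    w_pert = w_map + 0.01 * rng.standard_normal(3)
    assert map_cost(w_map, X, y, lam) < map_cost(w_pert, X, y, lam)
```

The design point is that the penalty does not need to be treated as an ad-hoc regularizer: choosing `lam` is choosing the variance of the prior, which is what opens the door to the Bayesian treatment of pruning, error bars, and model comparison listed in the abstract.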