A Focused Backpropagation Algorithm for Temporal Pattern Recognition
Michael C. Mozer
Department of Computer Science and Institute of Cognitive Science,
University of Colorado, Boulder, CO 80309, USA
Abstract
Time is at the heart of many pattern recognition tasks, e.g., speech recognition. However, connectionist learning algorithms to date are not well suited for dealing with time-varying input patterns. This paper introduces a specialized connectionist architecture and a corresponding specialization of the backpropagation learning algorithm that operates efficiently on temporal sequences. The key feature of the architecture is a layer of self-connected hidden units that integrate their current value with the new input at each time step to construct a static representation of the temporal input sequence. This architecture avoids two deficiencies found in other models of sequence recognition: first, it reduces the difficulty of temporal credit assignment by focusing the backpropagated error signal; second, it eliminates the need for a buffer to hold the input sequence and/or intermediate activity levels. The latter property holds because, during the forward (activation) phase, incremental activity traces can be computed locally that hold all information necessary for backpropagation in time. It is argued that this architecture should scale better than conventional recurrent architectures with respect to sequence length. The architecture has been used to implement a temporal version of Rumelhart and McClelland's verb past-tense model [1]. The hidden units learn to behave something like Rumelhart and McClelland's "Wickelphones," a rich and flexible representation of temporal information.
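To make the integration rule and the buffer-free gradient computation concrete, the following is a minimal numpy sketch, not the paper's implementation. It assumes details the abstract leaves open: a logistic activation, a learnable self-connection (decay) weight per hidden unit, and error injected only at the end of the sequence. All names here (FocusedContextLayer, step, grads) are hypothetical. The point illustrated is that each self-connected unit updates derivative traces forward in time alongside its activity, so no buffer of past inputs or activities is required for learning.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class FocusedContextLayer:
    """Sketch of a layer of self-connected ("focused") hidden units.

    Each unit i integrates its previous value with the new input:
        c_i(t) = d_i * c_i(t-1) + f(sum_j W_ij * x_j(t))
    Because the recurrence is unit-local (a single self-connection d_i),
    the derivatives dc_i/dW_ij and dc_i/dd_i can be carried forward
    incrementally during the activation phase, instead of being
    recovered by backpropagating through stored past activity.
    """

    def __init__(self, n_in, n_hidden, rng=None):
        if rng is None:
            rng = np.random.default_rng(0)
        self.W = rng.normal(scale=0.1, size=(n_hidden, n_in))  # input weights
        self.d = np.full(n_hidden, 0.5)                        # self-connection (decay) weights
        self.reset()

    def reset(self):
        n_hidden, n_in = self.W.shape
        self.c = np.zeros(n_hidden)              # context activities c_i(t)
        self.dc_dW = np.zeros((n_hidden, n_in))  # running trace of dc_i/dW_ij
        self.dc_dd = np.zeros(n_hidden)          # running trace of dc_i/dd_i

    def step(self, x):
        """Advance one time step: update activities and activity traces."""
        net = self.W @ x
        s = sigmoid(net)
        fprime = s * (1.0 - s)
        # Update the traces first: they refer to c(t-1) and the old traces.
        self.dc_dd = self.c + self.d * self.dc_dd
        self.dc_dW = self.d[:, None] * self.dc_dW + fprime[:, None] * x[None, :]
        # Then integrate the new input into the static representation.
        self.c = self.d * self.c + s
        return self.c

    def grads(self, dE_dc):
        """Given dE/dc_i at the end of the sequence, form the weight
        gradients from the locally stored traces: no backward pass
        through time and no stored input history."""
        return dE_dc[:, None] * self.dc_dW, dE_dc * self.dc_dd
```

A toy usage under the same assumptions: present a short sequence, then take the gradient of a squared error on the final context activities.

```python
layer = FocusedContextLayer(n_in=4, n_hidden=3)
for x in np.eye(4):                   # a 4-step one-hot input sequence
    c = layer.step(x)
gW, gd = layer.grads(dE_dc=c - 1.0)   # dE/dc for target activity 1.0
```

Because each trace depends only on a unit's own history, storage and per-step cost grow with the number of weights rather than with sequence length, which is the sense in which the focused architecture avoids the buffer required by conventional backpropagation through time.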