In any document retrieval system, the time taken to respond to a search request is an important issue. Typically, in fixed-keyword systems the keyword locations are determined in advance and stored in a table for fast look-up when a search request is received. In open-vocabulary systems, the keyword locations have to be determined at the time of the search request. Fast word-spotting techniques are therefore essential.
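To make the contrast concrete, the fixed-keyword case can be pictured as a precomputed table. The sketch below is illustrative only: the function names, the `(keyword, document, start time)` tuple layout, and the sample data are assumptions, not part of any system described here.

```python
from collections import defaultdict

def build_keyword_index(hits):
    """Build a keyword -> [(doc_id, start_seconds), ...] table in advance.

    `hits` is a hypothetical list of (keyword, doc_id, start_seconds)
    tuples produced offline by a word-spotter for a fixed keyword set.
    """
    index = defaultdict(list)
    for keyword, doc_id, start_seconds in hits:
        index[keyword].append((doc_id, start_seconds))
    return dict(index)

def lookup(index, keyword):
    """Answer a search request with a single table look-up."""
    return index.get(keyword, [])

# Hypothetical offline word-spotting results.
index = build_keyword_index([
    ("budget", "msg1", 3.2),
    ("meeting", "msg1", 7.5),
    ("budget", "msg2", 0.9),
])
```

A word outside the fixed set, such as `lookup(index, "holiday")`, returns nothing from the table. An open-vocabulary system cannot rely on such a table and has to search the audio when the request arrives, which is why the speed of the word-spotter matters.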
The following is the abstract of a paper presented at ICASSP'96 in Atlanta. Patent applied for.
K.M.Knill and S.J.Young
This paper explores methods of increasing the speed of a Viterbi-based word-spotting system for audio document retrieval. Fast processing is essential since the user expects to receive the results of a keyword search many times faster than the actual length of the speech. A number of computational short-cuts to the standard Viterbi word-spotter are presented. These are based on exploiting the background Viterbi phone recognition path that is computed to provide a normalisation base. An initial approximation using the phone transition boundaries reduces the retrieval time by a factor of 5, while achieving a slight improvement in word-spotting performance. To further reduce retrieval time, pattern matching, feature selection, and Gaussian selection techniques are applied to this approximate pass to give a total x50 increase in speed with little loss in performance. In addition, a low memory requirement means that these approaches can be implemented on any platform, including hand-held devices.
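As a rough illustration of why phone transition boundaries can cut the search, the sketch below restricts candidate keyword spans to boundaries taken from a background phone path and scores a span against that path. This is one plausible reading of the abstract, not the paper's algorithm: the function names, the duration limits, the cumulative log-likelihood array, and the per-frame normalisation are all assumptions made for the example.

```python
from itertools import combinations

def candidate_spans(boundaries, min_len, max_len):
    """Return (start, end) frame pairs whose ends both lie on phone boundaries.

    `boundaries` are frame indices of phone transitions on the background
    path; `min_len` and `max_len` are assumed keyword duration limits in frames.
    """
    return [(s, e) for s, e in combinations(sorted(boundaries), 2)
            if min_len <= e - s <= max_len]

def normalised_score(keyword_loglik, background_cumloglik, start, end):
    """Per-frame log-likelihood ratio of a keyword against the background path.

    `background_cumloglik[t]` is assumed to hold the background path's
    cumulative log-likelihood up to frame t.
    """
    background = background_cumloglik[end] - background_cumloglik[start]
    return (keyword_loglik - background) / (end - start)

# Hypothetical background path: 45 frames with 5 phone boundaries.
boundaries = [0, 12, 20, 31, 45]
spans = candidate_spans(boundaries, min_len=10, max_len=30)

# Hypothetical background scores: -2.0 log-likelihood per frame.
background_cum = [-2.0 * t for t in range(46)]
score = normalised_score(-18.0, background_cum, 0, 12)
```

With these numbers only a handful of spans need scoring, compared with the hundreds of start and end pairs available if every frame were a candidate. The speed-up figures quoted in the abstract come from the paper's own experiments and are not reproduced by this toy example.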