Audio Document Processing

There is growing interest in the use of hand-held computing devices for a wide range of applications. As device dimensions decrease, voice becomes more attractive as an input medium. This project will investigate the integration of speech processing techniques such as word spotting, speaker separation and sound classification to facilitate the manipulation of stored audio documents. A typical application being studied is "Voice Notes", which allows short spoken documents to be stored and subsequently retrieved by category or content.

Sponsor: Hewlett Packard Labs, Bristol.

Publications

Fast implementations of Viterbi-based word-spotting. In Proc ICASSP'96, Atlanta, May, 1996, pp. 520-523.

Techniques for automatically transcribing unknown keywords for open keyword set HMM-based word-spotting. Tech Report CUED/F-INFENG/230, Cambridge Uni Engineering Dept, September, 1995.

Keyword training using a single spoken example for applications in audio document retrieval. In Proc ICSST, Perth, Australia, December, 1994.

Speaker dependent keyword-spotting for accessing stored speech. Tech Report CUED/F-INFENG/193, Cambridge Uni Engineering Dept, October, 1994.

Please send bug reports/comments/suggestions to Kate Knill (kmk@eng.cam.ac.uk)