Proc. ICSLP'94, Yokohama, Japan


M. Jones and P.C. Woodland

September 1994

The acoustic-phonetic modelling used in state-of-the-art large vocabulary continuous speech recognisers (LVCSR) cannot effectively exploit the prosody-based distinctions known to exist at the syllable level. These distinctions concern the strength of the syllable (strong or weak) and the stress (stressed or unstressed) it is given.

This paper shows how a small set of syllable-sized Hidden Markov Models (HMMs) can model syllable type effectively. When these models were applied to a large vocabulary continuous speech recogniser, a 23% reduction in word error rate was achieved.

