Abstract for valtchev_icslp94

Proc. ICSLP'94


V. Valtchev, J.J. Odell, P.C. Woodland, and S.J. Young


Accuracy and speed are the main issues to consider when designing a large-vocabulary speech recogniser. Recent experience with the Wall Street Journal (WSJ) corpus has shown that high recognition accuracy requires detailed acoustic models used in conjunction with well-trained long-span language models. In this paper we present a two-pass decoder architecture which extends an original one-pass design. The initial pass consists of a time-synchronous backward search in a pre-compiled network using simplified acoustic models and a null grammar. The forward pass can also function as a stand-alone one-pass decoder capable of using cross-word context-dependent models and long-span language models. The capabilities of this framework are examined empirically in terms of recognition accuracy versus speed on the WSJ database.
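
The two-pass idea can be illustrated with a toy sketch. It is not the decoder described in the paper: the abstract does not say how the output of the backward pass is used by the forward pass, so the sketch assumes the backward scores serve as a look-ahead estimate for beam pruning. The state graph, the score tables, and the names backward_pass and forward_pass are all hypothetical, and the pre-compiled network, null grammar, and language model are not modelled.

```python
import math

NEG_INF = -math.inf

def backward_pass(cheap, succ, finals):
    """Time-synchronous backward Viterbi over the cheap (simplified) scores.

    beta[t][s] is the best cheap log score of frames t..T-1 for a path that
    occupies state s at frame t and ends in a final state.
    """
    T, S = len(cheap), len(cheap[0])
    beta = [[NEG_INF] * S for _ in range(T)]
    for s in finals:
        beta[T - 1][s] = cheap[T - 1][s]
    for t in range(T - 2, -1, -1):
        for s in range(S):
            best_next = max((beta[t + 1][n] for n in succ[s]), default=NEG_INF)
            if best_next > NEG_INF:
                beta[t][s] = cheap[t][s] + best_next
    return beta

def forward_pass(detailed, cheap, beta, succ, starts, finals, beam):
    """Forward Viterbi over the detailed scores, pruned with the backward scores.

    ASSUMPTION: a hypothesis is ranked by its detailed score so far plus the
    cheap backward estimate of the remaining frames, and dropped when it falls
    more than `beam` below the best hypothesis at the same frame.
    """
    T = len(detailed)
    active = {s: (detailed[0][s], [s]) for s in starts}
    for t in range(T):
        if not active:
            return NEG_INF, []
        # beta[t][s] - cheap[t][s] is the cheap estimate for frames t+1..T-1.
        est = {s: sc + (beta[t][s] - cheap[t][s]) for s, (sc, _) in active.items()}
        best = max(est.values())
        active = {s: v for s, v in active.items() if est[s] >= best - beam}
        if t == T - 1:
            break
        nxt = {}
        for s, (sc, path) in active.items():
            for n in succ[s]:
                cand = sc + detailed[t + 1][n]
                if n not in nxt or cand > nxt[n][0]:
                    nxt[n] = (cand, path + [n])
        active = nxt
    ending = [v for s, v in active.items() if s in finals]
    if not ending:
        return NEG_INF, []
    return max(ending, key=lambda v: v[0])

# Hypothetical 3-state left-to-right model and 4 frames of log scores.
succ = {0: [0, 1], 1: [1, 2], 2: [2]}
starts, finals = [0], [2]
detailed = [[-1, -5, -9], [-2, -1, -6], [-6, -1, -2], [-9, -4, -1]]
cheap = [[-1, -4, -8], [-3, -2, -5], [-5, -2, -2], [-8, -3, -1]]

beta = backward_pass(cheap, succ, finals)
pruned_score, pruned_path = forward_pass(detailed, cheap, beta, succ, starts, finals, beam=0.5)
full_score, full_path = forward_pass(detailed, cheap, beta, succ, starts, finals, beam=math.inf)
```

On this toy input the narrow beam discards some hypotheses early yet returns the same path as the unpruned search. That agreement is a property of the chosen numbers, not a guarantee: with a cheap model that ranks paths very differently from the detailed one, a narrow beam can prune the best path, which is the accuracy versus speed trade-off the abstract refers to.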

