D.Y. Kim, H.Y. Chan, G. Evermann, M.J.F. Gales, D. Mrva, K.C. Sim, P.C. Woodland
March 2005
This paper describes our recent work on improving broadcast news transcription and presents details of the CU-HTK Broadcast News English (BN-E) transcription system for the DARPA/NIST Rich Transcription 2004 Speech-to-Text (RT04) evaluation. A key focus has been building a system using an order of magnitude more acoustic training data than we have previously used. We have also investigated a range of techniques to improve both Minimum Phone Error (MPE) training and the efficient creation of MPE-based narrow-band models. The paper describes two alternative system structures that run in under 10xRT (ten times real time) and a further system that runs in less than 1xRT. This final system gives lower word error rates than our 2003 system, which ran in 10xRT.