The cross-entropy toolkit was produced for research into speech recognition. It was presented in the following publications, available from Rogier van Dalen’s website:

  • R. C. van Dalen and M. J. F. Gales (2013). “Importance Sampling to Compute Likelihoods of Noise-Corrupted Speech.” In Computer Speech and Language 27 (1), pp. 322–349.
  • R. C. van Dalen and M. J. F. Gales (2010). “Asymptotically Exact Noise-Corrupted Speech Likelihoods.” In Proceedings of Interspeech, pp. 709–712.
  • R. C. van Dalen and M. J. F. Gales (2010). A theoretical bound for noise-robust speech recognition. Technical report CUED/F-INFENG/TR.648, Cambridge University Engineering Department, Sep 2010.

Other useful references are:

  • Mark J. F. Gales (1995). Model-Based Techniques for Noise Robust Speech Recognition. Ph.D. thesis, Cambridge University.
  • Pedro J. Moreno (1996). Speech Recognition in Noisy Environments. Ph.D. thesis, Carnegie Mellon University.
  • Li Deng, Jasha Droppo, and Alex Acero (2004). “Enhancement of log Mel power spectra of speech using a phase-sensitive model of the acoustic environment and sequential estimation of the corrupting noise.” IEEE Transactions on Speech and Audio Processing 12 (2), pp. 133–143.
  • Volker Leutnant and Reinhold Haeb-Umbach (2009). “An analytic derivation of a phase-sensitive observation model for noise robust speech recognition.” In Proceedings of Interspeech, pp. 2395–2398.
