University of Cambridge, Department of Engineering

MIL Speech Seminars 2008-2009

The MIL Speech Seminar series schedule for the Long Vacation 2009 was as follows:

1st September 2009
Scott Novotney (HLTCOE and BBN Technologies)
Factors Affecting ASR Model Self-Training

Low-resource ASR self-training seeks to minimize resource requirements such as manual transcriptions or language modeling text. This is accomplished by training on large quantities of audio automatically labeled by a small initial model. By analyzing our previous experiments with the conversational telephone English Fisher corpus, we demonstrate where self-training succeeds and under what resource conditions it provides the most benefit. Additionally, we will show success on Spanish and Levantine conversational speech, as well as the tougher English CallHome set, despite an initial WER of more than 60%. Finally, by digging beneath average word error rate and analyzing individual word performance, we show that self-trained models successfully learn new words. More importantly, self-training most benefits words which appear in the unlabeled audio but do not appear in the manual transcriptions.
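The self-training loop sketched in the abstract (train a seed model on a small transcribed set, automatically label a large pool of audio, retrain on the union) can be illustrated in miniature. The sketch below is not the speaker's system: a nearest-centroid classifier stands in for the ASR model, a distance margin stands in for decoding confidence, and all data values and the threshold are made up for illustration.

```python
# Toy sketch of a self-training round: a nearest-centroid "model"
# stands in for the ASR system; confident automatic labels on
# unlabeled data are added to the training set before retraining.
# All data, labels, and the threshold here are illustrative.

def train(examples):
    """Fit one centroid per class from (value, label) pairs."""
    sums, counts = {}, {}
    for x, y in examples:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(model, x):
    """Return (label, confidence); confidence is the margin between
    the two nearest centroids, a crude stand-in for ASR confidence."""
    ranked = sorted(model, key=lambda y: abs(x - model[y]))
    best = ranked[0]
    margin = abs(x - model[ranked[1]]) - abs(x - model[best])
    return best, margin

def self_train(labeled, unlabeled, threshold=1.0):
    """One self-training round: label the unlabeled pool with the
    seed model, keep confident labels, retrain on the union."""
    seed = train(labeled)
    pseudo = []
    for x in unlabeled:
        y, conf = predict(seed, x)
        if conf >= threshold:          # confidence filtering
            pseudo.append((x, y))
    return train(labeled + pseudo), len(pseudo)

labeled = [(0.0, "a"), (1.0, "a"), (9.0, "b"), (10.0, "b")]
unlabeled = [0.3, 0.8, 9.4, 5.2, 9.9]  # 5.2 is ambiguous and gets filtered
model, kept = self_train(labeled, unlabeled)
```

The confidence filter is the point of the exercise: the ambiguous example near the decision boundary is dropped rather than allowed to corrupt the retrained model, mirroring why self-training can work even from a weak seed model.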
4th September 2009
Jen-Tzung Chien (National Cheng Kung University, Taiwan)
Bayesian Learning Approaches for Speech Recognition

In this talk, I will present my previous and ongoing studies on Bayesian learning for speech recognition. In speech recognition, Bayesian adaptation has been widely applied to the problem of speaker adaptation, where the likelihood function of the adaptation data and the prior density of the existing model are combined to find the adapted model for a new speaker. Such a Bayesian learning approach is useful not only for model adaptation but also for model regularization, where the regularized hidden Markov models (HMMs) are good at predicting unknown test data. The regularized HMMs can be applied to decision-tree state tying in a generative model, and can even be integrated with a large-margin classifier to improve the generalization of a discriminative model based on large-margin HMMs. Furthermore, Bayesian learning is beneficial for topic-based language modeling under the paradigm of latent Dirichlet allocation (Blei et al., 2003). A Bayesian topic-based language model will be presented for speech recognition. This regularized language model is established according to the marginal likelihood over the uncertainties of latent topics and topic mixtures. The topic information is extracted from the n-gram events and applied directly to speech recognition. Finally, I will summarize my views on Bayesian learning and address other challenging topics in machine learning methods for speech recognition.
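The core idea of an LDA-style topic language model mentioned above, marginalizing the word probability over latent topics, can be written as p(w | h) = sum over k of p(w | k) p(k | h). The sketch below illustrates just that marginalization; the two-topic table and the topic mixtures are made-up numbers, not parameters learned by the method the talk describes.

```python
# Illustrative sketch of the topic-mixture marginalization behind
# an LDA-style language model: p(w | h) = sum_k p(w | k) * p(k | h).
# The topic-word probabilities and topic mixtures are invented for
# illustration, not learned parameters.

# p(w | k): two hypothetical topics over a four-word vocabulary
topic_word = {
    0: {"stock": 0.5, "market": 0.4, "ball": 0.05, "goal": 0.05},
    1: {"stock": 0.05, "market": 0.05, "ball": 0.5, "goal": 0.4},
}

def topic_lm_prob(word, topic_mix):
    """Marginalize the word probability over latent topics,
    given a topic mixture p(k | history)."""
    return sum(weight * topic_word[k].get(word, 0.0)
               for k, weight in topic_mix.items())

# Suppose topic inference on the history puts 0.9 mass on topic 0:
finance_mix = {0: 0.9, 1: 0.1}
p_market = topic_lm_prob("market", finance_mix)  # 0.9*0.4 + 0.1*0.05 = 0.365
```

The same word gets a different probability under a different topic mixture, which is how the topic information extracted from the history sharpens the language model relative to a single global n-gram distribution.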