University of Cambridge, Department of Engineering

MIL Speech Seminars 2003-2004

The MIL Speech Seminar series schedule for Easter Term 2004 was as follows:

15th June 2004
Karl Weilhammer (MIL RA)
Language Models with Structural Elements

A novel approach to language modelling will be presented. It is based on the idea that inserting "structural elements" into a text creates new n-grams involving these "virtual words". This leads to more robust language models, because n-grams that were not observed in the training data can be bridged by n-grams involving structural elements. After a short introduction to the structural element language model, different probability distributions are discussed and a training algorithm related to clustering methods is presented. In the second part, the structural element language model is formulated as an HMM, which has the advantage that a well-known statistical framework can be used along with the Baum-Welch training algorithm. Finally, both realisations of the structural element model are compared with standard language models. On the German Verbmobil corpus, the results for the structural element HMM are better than those of the clustering-based algorithm. The HMM variant performs comparably to, or slightly better than, standard models.