Minimum Risk Acoustic Clustering for Multilingual Acoustic Model Combination

Download: PDF.

“Minimum Risk Acoustic Clustering for Multilingual Acoustic Model Combination” by D. Vergyri, S. Tsakalidis, and W. Byrne. In International Conference on Spoken Language Processing, 2000.


In this paper we describe procedures for combining multiple acoustic models, obtained using training corpora from different languages, in order to improve automatic speech recognition (ASR) performance in languages for which large amounts of training data are not available. We treat these models as multiple sources of information whose scores are combined in a log-linear model to compute the hypothesis likelihood. The model combination can be performed either statically, with constant combination weights, or dynamically, with parameters that can vary across segments of a hypothesis. The aim is to optimize the parameters so as to achieve minimum word error rate (WER). To achieve robust parameter estimation in the dynamic combination case, the parameters are defined to be piecewise constant on different phonetic classes that form a partition of the space of hypothesis segments. The partition is defined, using phonological knowledge, on segments that correspond to hypothesized phones. We examine different ways to define such a partition, including an automatic approach that gives a binary-tree-structured partition which tries to achieve the minimum WER with the minimum number of classes.
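
As a rough illustration of the log-linear combination described in the abstract, the following minimal sketch scores one hypothesis under static and dynamic weights. It is not the authors' implementation: the function name `combined_score`, the two phonetic classes, the weight values, and the per-model log scores are all hypothetical, and the weights are fixed by hand rather than estimated to minimize word error rate.

```python
def combined_score(segments, weights):
    """Log-linear combination of per-model log scores.

    Each segment carries a phonetic class and one log score per acoustic
    model. The weight vector is looked up by class, so the weights are
    piecewise constant over the partition of hypothesis segments.
    """
    total = 0.0
    for seg in segments:
        w = weights[seg["phone_class"]]
        total += sum(wi * s for wi, s in zip(w, seg["log_scores"]))
    return total


# Hypothetical hypothesis: two phone-level segments, two acoustic models.
hypothesis = [
    {"phone_class": "vowel", "log_scores": [-2.0, -4.0]},
    {"phone_class": "consonant", "log_scores": [-3.0, -1.0]},
]

# Static combination: the same weights for every class (illustrative values).
static_weights = {"vowel": [0.5, 0.5], "consonant": [0.5, 0.5]}

# Dynamic combination: weights differ by phonetic class (illustrative values).
dynamic_weights = {"vowel": [0.8, 0.2], "consonant": [0.3, 0.7]}

static_total = combined_score(hypothesis, static_weights)
dynamic_total = combined_score(hypothesis, dynamic_weights)
```

In this sketch the static case is simply the dynamic case with identical weight vectors in every class, which is why a single function covers both.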

BibTeX entry:

   author = {D. Vergyri and S. Tsakalidis and W. Byrne},
   title = {Minimum Risk Acoustic Clustering for Multilingual Acoustic
	Model Combination},
   booktitle = {International Conference on Spoken Language Processing},
   pages = {(4 pages)},
   year = {2000}
