

Department of Engineering  
THE GENERATION AND USE OF REGRESSION CLASS TREES FOR MLLR ADAPTATION
Mark Gales
August 1996
Maximum likelihood linear regression (MLLR) is an adaptation technique suitable for both speaker and environmental model-based adaptation. The models are adapted using a set of linear transformations, estimated in a maximum likelihood fashion from the available adaptation data. As these transformations can capture general relationships between the original model set and the current speaker, or new acoustic environment, they can be effective in adapting all the HMM distributions with limited adaptation data. Two important decisions that must be made are (i) how to cluster components together, such that they all have a similar transformation matrix, and (ii) how many transformation matrices to generate for a given block of adaptation data. This paper addresses both problems. Firstly, it describes two clustering techniques that are optimal in the sense of maximising the likelihood of the adaptation data. The first assigns each component to one of the regression classes and may be used to generate standard regression class trees. The second performs a fuzzy assignment of base class to regression class, so the transformation associated with each component is a linear combination of a set of transformations. Secondly, two schemes are examined which address the problem of how to determine the number of regression classes, and hence transforms, for a given amount of adaptation data. The first is a cross-validation scheme based on the auxiliary function of the adaptation data; the second is based on the use of iterative MLLR. Neither scheme requires a priori thresholding information. An initial evaluation of the techniques was performed using the ARPA 1994 test data. On this task, though "good" trees, in terms of the likelihood of the adaptation training data, were generated, neither of the optimal clustering schemes yielded gains in recognition performance. 
The performance of the cross-validation scheme was found to be comparable to an empirically determined threshold scheme. The best performance was achieved using iterative MLLR, which outperformed both fixed classes and threshold-based schemes.
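As a rough illustration of the two kinds of component-to-class assignment summarised in the abstract, the sketch below adapts a single Gaussian mean with a hard assignment (one regression-class transform) and with a fuzzy assignment (a linear combination of transforms). It is a minimal sketch, not the report's implementation: the function names, the two example transforms, the mean vector and the combination weights are all hypothetical, and the estimation of the transforms and weights from adaptation data is not shown.

```python
# Illustrative sketch only: every name and number here is an assumption,
# not taken from the report. Transforms are assumed to be already estimated.

def apply_transform(A, b, mu):
    """Return A * mu + b for a mean vector mu (plain Python lists)."""
    return [sum(a_ij * m_j for a_ij, m_j in zip(row, mu)) + b_i
            for row, b_i in zip(A, b)]

def adapt_hard(mu, transforms, class_index):
    """Hard assignment: the component uses exactly one regression-class transform."""
    A, b = transforms[class_index]
    return apply_transform(A, b, mu)

def adapt_fuzzy(mu, transforms, weights):
    """Fuzzy assignment: the adapted mean is a weighted combination of the
    means produced by each regression-class transform."""
    adapted = [0.0] * len(mu)
    for (A, b), w in zip(transforms, weights):
        for i, value in enumerate(apply_transform(A, b, mu)):
            adapted[i] += w * value
    return adapted

# Two hypothetical regression-class transforms for 2-dimensional means.
transforms = [
    ([[1.0, 0.0], [0.0, 1.0]], [0.5, -0.5]),  # identity matrix plus a bias shift
    ([[2.0, 0.0], [0.0, 2.0]], [0.0, 0.0]),   # pure scaling, no bias
]
mu = [1.0, 2.0]

hard_mean = adapt_hard(mu, transforms, class_index=0)
fuzzy_mean = adapt_fuzzy(mu, transforms, weights=[0.5, 0.5])
```

With a hard assignment the component inherits one class's transform unchanged; with a fuzzy assignment the same component is pulled towards several classes at once, in proportion to the (here arbitrary) weights.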