ADAPTING SEMI-TIED FULL-COVARIANCE MATRIX HMMS

M.J.F. Gales

August 1997

A simple choice is normally made for the form of the covariance matrix used with HMMs. Either a diagonal covariance matrix is used, with the underlying assumption that elements of the feature vector are independent, or a full or block-diagonal matrix is used, where all or some of the correlations are explicitly modelled. Unfortunately, full or block-diagonal covariance matrices tend to bring a dramatic increase in the number of parameters per Gaussian component, limiting the number of components which may be robustly estimated. This paper investigates a recently introduced form of covariance matrix, the semi-tied full-covariance matrix. This allows a few "full" covariance matrices to be shared over many distributions, whilst each distribution maintains its own "diagonal" covariance matrix. In current systems it is essential to be able to rapidly adapt the acoustic models to a particular speaker or new acoustic environment. This paper examines two linear-transformation speaker-adaptation schemes that may be applied to these semi-tied models. Both yield maximum likelihood estimates of the transform, but differ in the domains in which the transforms are estimated. A large-vocabulary speaker-independent speech-recognition task was used to assess the performance of the techniques. Both adaptation schemes showed gains in performance. Depending on the semi-tied model set and the adaptation scheme used, improvements over the unadapted models ranged from 3% to 11% relative. Furthermore, a 9% relative reduction in word error rate was achieved over the standard model set adapted using maximum likelihood linear regression.
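
The parameter trade-off described in the abstract can be illustrated with a rough sketch. It assumes that each component covariance has the form H diag(v) H^T, where H is a square matrix shared over many components and v holds the component's own variances, and that H is counted as an unconstrained dim x dim matrix. The function names, the feature dimension of 39, the 10000 components and the single shared matrix are hypothetical choices for illustration, not figures taken from the paper.

```python
# Illustrative sketch only: the dimension and counts used below are
# assumptions, not figures from the paper.


def covariance_parameter_counts(dim, num_components, num_shared):
    """Count covariance parameters for three covariance forms."""
    # One variance per feature dimension per component.
    diagonal = num_components * dim
    # A symmetric matrix per component: dim * (dim + 1) / 2 free values.
    full = num_components * dim * (dim + 1) // 2
    # Per-component variances plus a few shared dim x dim matrices
    # (assumed unconstrained, hence dim * dim values each).
    semi_tied = num_components * dim + num_shared * dim * dim
    return {"diagonal": diagonal, "full": full, "semi_tied": semi_tied}


def semi_tied_covariance(shared, variances):
    """Build H * diag(variances) * H^T from a shared square matrix H."""
    dim = len(variances)
    return [
        [
            sum(shared[i][k] * variances[k] * shared[j][k] for k in range(dim))
            for j in range(dim)
        ]
        for i in range(dim)
    ]


counts = covariance_parameter_counts(dim=39, num_components=10000, num_shared=1)
print(counts)

shared = [[1.0, 0.5], [0.0, 1.0]]  # hypothetical shared matrix H
cov = semi_tied_covariance(shared, [2.0, 4.0])  # one component's variances
print(cov)
```

Under these assumptions the semi-tied form adds only the cost of the shared matrix to the diagonal case, while each component still has a covariance with non-zero off-diagonal terms.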

