Speaker adaptation with All-Pass transforms

Download: PDF.

“Speaker adaptation with All-Pass transforms” by J. McDonough and W. Byrne. In International Conference on Acoustics, Speech, and Signal Processing, 1999, IEEE.

Abstract

In recent work, a class of transforms was proposed that achieves a remapping of the frequency axis much like conventional vocal tract length normalization. These mappings, known collectively as all-pass transforms (APT), were shown to produce substantial improvements in the performance of a large vocabulary speech recognition system when used to normalize incoming speech prior to recognition. In this application, the most advantageous characteristic of the APT was its cepstral-domain linearity; this linearity makes speaker normalization simple to implement, and provides for the robust estimation of the parameters characterizing individual speakers. In the current work, we exploit the APT to develop a speaker adaptation scheme in which the cepstral means of a speech recognition model are transformed to better match the speech of a given speaker. In a set of speech recognition experiments conducted on the Switchboard Corpus, we report reductions in word error rate of 3.7% absolute.
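The cepstral-domain linearity mentioned in the abstract can be illustrated with a small numerical sketch. This is not the authors' implementation: it substitutes the simplest first-order all-pass (bilinear) frequency map for the more general APT family. The function names, the warping parameter `alpha`, the log-spectrum convention `c[0] + 2 * sum(c[n] * cos(n * w))`, and the truncation to a fixed cepstral order are all assumptions made for illustration.

```python
import numpy as np

def bilinear_warp(omega, alpha):
    """Warped frequency under a first-order all-pass (bilinear) map, |alpha| < 1.
    Assumption: a stand-in for the paper's more general all-pass transforms."""
    return omega + 2.0 * np.arctan2(alpha * np.sin(omega),
                                    1.0 - alpha * np.cos(omega))

def cepstral_warp_matrix(alpha, order, n_grid=4096):
    """Matrix A such that c_hat = A @ c approximates the cepstra of the
    frequency-warped log spectrum, truncated to `order` coefficients."""
    # Midpoint grid on [0, pi] for numerical integration.
    omega = (np.arange(n_grid) + 0.5) * np.pi / n_grid
    warped = bilinear_warp(omega, alpha)
    n = np.arange(order)
    # Log spectrum convention: L(w) = c[0] + 2 * sum_{n>=1} c[n] * cos(n * w).
    weights = np.where(n == 0, 1.0, 2.0)
    basis_warped = weights * np.cos(np.outer(warped, n))  # L evaluated at warped w
    basis_cos = np.cos(np.outer(omega, n))                # analysis cosines
    # c_hat[k] = (1/pi) * integral over [0, pi] of L(warp(w)) * cos(k * w) dw
    return basis_cos.T @ basis_warped / n_grid

def adapt_means(means, alpha):
    """Apply the same linear map to every cepstral mean vector (rows of `means`)."""
    A = cepstral_warp_matrix(alpha, means.shape[1])
    return means @ A.T

# Hypothetical usage: three 13-dimensional cepstral means and one warp parameter.
rng = np.random.default_rng(0)
means = rng.standard_normal((3, 13))
adapted = adapt_means(means, alpha=0.05)
```

Because the warp acts on cepstra as a matrix multiplication, a single speaker-dependent parameter (here `alpha`) determines the transform applied to every mean, which is the property the abstract credits for robust parameter estimation.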

BibTeX entry:

@inproceedings{jmcd_icassp99,
   author = {J. Mc{D}onough and W. Byrne},
   title = {Speaker adaptation with All-Pass transforms},
   booktitle = {International Conference on Acoustics, Speech, and Signal Processing},
   pages = {(4 pages)},
   year = {1999},
   organization = {IEEE}
}
