Segmental Minimum Bayes-Risk Decoding for Automatic Speech Recognition

“Segmental Minimum Bayes-Risk Decoding for Automatic Speech Recognition” by V. Goel, S. Kumar, and W. Byrne. IEEE Transactions on Speech and Audio Processing, vol. 12, May 2004, pp. 234-249 (16 pages).

Abstract

Minimum Bayes-Risk (MBR) speech recognizers have been shown to yield improvements over the conventional maximum a-posteriori probability (MAP) decoders through N-best list rescoring and A* search over word lattices. We present a Segmental Minimum Bayes-Risk decoding (SMBR) framework that simplifies the implementation of MBR recognizers through the segmentation of the N-best lists or lattices over which the recognition is to be performed. This paper presents lattice cutting procedures that underlie SMBR decoding. Two of these procedures are based on a risk minimization criterion while a third one is guided by word-level confidence scores. In conjunction with SMBR decoding, these lattice segmentation procedures give consistent improvements in recognition word error rate (WER) on the Switchboard corpus. We also discuss an application of risk-based lattice cutting to multiple-system SMBR decoding and show that it is related to other system combination techniques such as ROVER. This strategy combines lattices produced from multiple ASR systems and is found to give WER improvements in a Switchboard evaluation system.

Correction Available: In our recently published paper, we presented a risk-based lattice cutting procedure to segment ASR word lattices into smaller sub-lattices as a means to improve the efficiency of Minimum Bayes-Risk (MBR) rescoring. In the experiments reported, some of the hypotheses in the original lattices were inadvertently discarded during segmentation, and this affected MBR performance adversely. This note gives the corrected results as well as experiments demonstrating that the segmentation process does not discard any paths from the original lattice.
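For readers unfamiliar with MBR rescoring, the basic idea over an N-best list can be sketched as follows. This is a minimal illustration, not the segmental procedure of the paper: it assumes the loss is word-level Levenshtein distance and that each hypothesis comes with a posterior probability, and it picks the hypothesis with the lowest expected loss over the list. The function names `edit_distance` and `mbr_rescore` and the toy N-best list are hypothetical.

```python
from typing import List, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,               # delete a reference word
                curr[j - 1] + 1,           # insert a hypothesis word
                prev[j - 1] + (r != h),    # substitution, or match at cost 0
            ))
        prev = curr
    return prev[-1]


def mbr_rescore(nbest: List[Tuple[List[str], float]]) -> List[str]:
    """Return the N-best entry with minimum expected edit distance.

    nbest holds (word sequence, posterior) pairs; the posteriors are
    renormalized over the list (an assumption of this sketch).
    """
    total = sum(p for _, p in nbest)
    best_words, best_risk = None, float("inf")
    for cand, _ in nbest:
        # Expected loss of choosing `cand`, treating each entry as the truth
        # with probability proportional to its posterior.
        risk = sum(p / total * edit_distance(words, cand) for words, p in nbest)
        if risk < best_risk:
            best_words, best_risk = cand, risk
    return best_words


# Hypothetical toy N-best list: the MAP choice is "a b c" (posterior 0.4),
# but "a x c" has the lowest expected word error.
toy_nbest = [
    ("a b c".split(), 0.4),
    ("a x c".split(), 0.3),
    ("a x d".split(), 0.3),
]
mbr_choice = mbr_rescore(toy_nbest)
```

In the toy list the expected losses are 0.9, 0.7 and 1.1 respectively, so the MBR choice differs from the MAP choice. The cost of this exhaustive comparison grows quadratically with the number of hypotheses, which is the kind of expense that segmenting lists or lattices into smaller pieces is meant to reduce.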

BibTeX entry:

@article{smbr_spat03,
   author = {V. Goel and S. Kumar and W. Byrne},
   title = {Segmental Minimum {B}ayes-Risk Decoding for Automatic Speech
	Recognition},
   journal = {IEEE Transactions on Speech and Audio Processing},
   volume = {12},
   pages = {234--249 (16 pages)},
   month = may,
   year = {2004},
   url = {http://dx.doi.org/10.1109/TSA.2004.825678}
}
