N-gram posterior probability confidence measures for statistical machine translation: an empirical study

“N-gram posterior probability confidence measures for statistical machine translation: an empirical study” by A. de Gispert, G. Blackwood, G. Iglesias, and W. Byrne. Machine Translation, 2012, pp. 1-30 (31 pages), Springer Netherlands. Published online 1 September 2012.

Abstract

We report an empirical study of n-gram posterior probability confidence measures for statistical machine translation (SMT). We first describe an efficient and practical algorithm for rapidly computing n-gram posterior probabilities from large translation word lattices. These probabilities are shown to be a good predictor of whether or not the n-gram is found in human reference translations, motivating their use as a confidence measure for SMT. Comprehensive n-gram precision and word coverage measurements are presented for a variety of different language pairs, domains and conditions. We analyze the effect on reference precision of using single or multiple references, and compare the precision of posteriors computed from k-best lists to those computed over the full evidence space of the lattice. We also demonstrate improved confidence by combining multiple lattices in a multi-source translation framework.
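As a rough illustration of the k-best case mentioned in the abstract, the sketch below computes n-gram posteriors from a small k-best list. It is not the lattice algorithm described in the paper. It assumes the common definition in which the posterior of an n-gram is the total normalised probability of the hypotheses that contain it; the function names, the `scale` parameter, and the toy scores are all hypothetical.

```python
import math
from collections import defaultdict


def ngrams(tokens, max_order):
    """Return the set of distinct n-grams (as tuples) of order 1..max_order."""
    return {
        tuple(tokens[i:i + n])
        for n in range(1, max_order + 1)
        for i in range(len(tokens) - n + 1)
    }


def ngram_posteriors(kbest, max_order=4, scale=1.0):
    """Estimate n-gram posteriors from a k-best list.

    kbest: list of (hypothesis string, log score) pairs.
    scale: assumed scaling factor applied to log scores before normalising.
    """
    # Turn scaled log scores into a distribution over hypotheses
    # (subtracting the maximum keeps the exponentials numerically stable).
    top = max(scale * score for _, score in kbest)
    weights = [math.exp(scale * score - top) for _, score in kbest]
    total = sum(weights)

    posteriors = defaultdict(float)
    for (hypothesis, _), weight in zip(kbest, weights):
        # Each hypothesis contributes its probability once per distinct n-gram.
        for u in ngrams(hypothesis.split(), max_order):
            posteriors[u] += weight / total
    return dict(posteriors)


# Toy 3-best list with made-up scores (illustrative only).
kbest = [
    ("the cat sat", math.log(0.5)),
    ("the cat sits", math.log(0.3)),
    ("a cat sat", math.log(0.2)),
]
posteriors = ngram_posteriors(kbest, max_order=2)
```

In this toy list, an n-gram that appears in every hypothesis, such as the unigram "cat", receives a posterior close to 1, while n-grams found in only some hypotheses receive correspondingly lower values.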

BibTeX entry:

@article{springerlink:10.1007/s10590-012-9132-2,
   author = {A. de Gispert and G. Blackwood and G. Iglesias and W. Byrne},
   title = {N-gram posterior probability confidence measures for
	statistical machine translation: an empirical study},
   journal = {Machine Translation},
   pages = {1--30 (31 pages)},
   publisher = {Springer Netherlands},
   year = {2012},
   issn = {0922-6567},
   note = {Published online 1 September 2012},
   url = {http://dx.doi.org/10.1007/s10590-012-9132-2}
}
