Hierarchical Phrase-based Translation Grammars Extracted from Alignment Posterior Probabilities

“Hierarchical Phrase-based Translation Grammars Extracted from Alignment Posterior Probabilities” by A. de Gispert, J. Pino, and W. Byrne. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), Cambridge, MA, 2010, pp. 545-554 (10 pages).

Abstract

We report on investigations into hierarchical phrase-based translation grammars based on rules extracted from posterior distributions over alignments of the parallel text. Rather than restrict rule extraction to a single alignment, such as Viterbi, we instead extract rules based on posterior distributions provided by the HMM word-to-word alignment model. We define translation grammars progressively by adding classes of rules to a basic phrase-based system. We assess these grammars in terms of their expressive power, measured by their ability to align the parallel text from which their rules are extracted, and the quality of the translations they yield. In Chinese-to-English translation, we find that rule extraction from posteriors gives translation improvements. We also find that grammars with rules with only one nonterminal, when extracted from posteriors, can outperform more complex grammars extracted from Viterbi alignments. Finally, we show that the best way to exploit source-to-target and target-to-source alignment models is to build two separate systems and combine their output translation lattices.

BibTeX entry:

@inproceedings{gispert10:emnlp,
   author = {A. de Gispert and J. Pino and W. Byrne},
   title = {Hierarchical Phrase-based Translation Grammars Extracted from
	Alignment Posterior Probabilities},
   booktitle = {Proceedings of the Conference on Empirical Methods in
	Natural Language Processing (EMNLP)},
   pages = {545--554},
   address = {Cambridge, MA},
   year = {2010},
   url = {http://www.aclweb.org/anthology/D/D10/D10-1053.pdf}
}
