Pushdown Automata in Statistical Machine Translation

Pushdown Automata in Statistical Machine Translation” by C. Allauzen, W. Byrne, A. de Gispert, G. Iglesias, and M. Riley. Computational Linguistics, 2014.


This paper describes the use of pushdown automata (PDA) in the context of statistical machine translation and alignment under a synchronous context-free grammar. We use PDAs to compactly represent the space of candidate translations generated by the grammar when applied to an input sentence. General-purpose PDA algorithms for replacement, composition, shortest path, and expansion are presented. We describe HiPDT, a hierarchical phrase-based decoder using the PDA representation and these algorithms. We contrast the complexity of this decoder with a decoder based on a finite state automata (FSA) representation, showing that PDAs provide a more suitable framework to achieve exact decoding for larger SCFGs and smaller language models. We assess this experimentally on a large-scale Chinese-to-English alignment and translation task. In translation, we propose a two-pass decoding strategy involving a weaker language model in the first-pass to address the results of PDA complexity analysis. We study in depth the experimental conditions and tradeoffs in which HiPDT can achieve state-of- the-art performance for large-scale SMT.

BibTeX entry:

   author = {C. Allauzen and W. Byrne and A. de Gispert and G. Iglesias
	and M. Riley},
   title = {Pushdown Automata in Statistical Machine Translation},
   journal = {Computational Linguistics},
   year = {2014},
   url = {http://www.aclweb.org/anthology/J/J14/J14-3008.pdf}

Back to Bill Byrne publications.