The Geometry of Statistical Machine Translation

Download: poster.

“The Geometry of Statistical Machine Translation” by Aurelien Waite and William Byrne. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics - Human Language Technologies (NAACL HLT 2015), June 2015.

Abstract

Most modern statistical machine translation systems are based on linear statistical models. One extremely effective method for estimating the model parameters is minimum error rate training (MERT), which is an efficient form of line optimisation adapted to the highly non-linear objective functions used in machine translation. We will show that MERT can be represented using convex geometry, which is the mathematics of polytopes and their faces. Using this geometric representation of MERT, we investigate whether the optimisation of linear models is tractable in general. It has been believed that the number of feasible solutions of a linear model is exponential with respect to the number of sentences used for parameter estimation; however, we show that the exponential complexity is instead due to the feature dimension. This result has important ramifications because it suggests that the current trend in building statistical machine translation systems by introducing a very large number of sparse features is inherently not robust.


BibTeX entry:

@inproceedings{waitegeomsmt15,
   author = {Aurelien Waite and William Byrne},
   title = {The Geometry of Statistical Machine Translation},
   booktitle = {Proceedings of the Conference of the North American
	Chapter of the Association for Computational Linguistics - Human
	Language Technologies (NAACL HLT 2015)},
   month = jun,
   year = {2015},
   url = {http://www.aclweb.org/anthology/N/N15/N15-1041.pdf}
}
