FAST, LOW-ARTIFACT SPEECH SYNTHESIS CONSIDERING GLOBAL VARIANCE

Download: PDF, poster.

“FAST, LOW-ARTIFACT SPEECH SYNTHESIS CONSIDERING GLOBAL VARIANCE” by M. Shannon and W. Byrne. In Proceedings of IEEE Conference on Acoustics, Speech and Signal Processing, June 2013.

Abstract

Speech parameter generation considering global variance (GV generation) is widely acknowledged to dramatically improve the quality of synthetic speech generated by HMM-based systems. However, it is slower and has higher latency than the standard speech parameter generation algorithm. In addition, it is known to produce artifacts, though existing approaches to prevent artifacts are effective. In this paper we present a simple new mathematical analysis of speech parameter generation considering global variance based on Lagrange multipliers. This analysis sheds light on one source of artifacts and suggests a way to reduce their occurrence. It also suggests an approximation to exact GV generation that allows fast, low-latency synthesis. In a subjective evaluation, the naturalness of our fast approximate algorithm is as good as that of conventional GV generation.
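
For readers unfamiliar with the term, the global variance of a generated parameter trajectory is its per-dimension variance across the frames of an utterance. The sketch below is illustrative only and is not the algorithm from the paper: it computes that quantity with NumPy and applies plain variance scaling, a much simpler technique swapped in here, to show how over-smoothing lowers the global variance and how rescaling restores it. The function names, the toy data, and the moving-average smoother are all assumptions made for this example.

```python
import numpy as np


def global_variance(traj):
    """Per-dimension variance over time of a (frames, dims) trajectory."""
    return np.var(traj, axis=0)


def scale_to_target_gv(traj, target_gv):
    """Rescale each dimension about its utterance mean so that its global
    variance equals target_gv. Assumes every dimension of traj has nonzero
    variance. This is simple variance scaling, not GV generation."""
    mean = traj.mean(axis=0)
    scale = np.sqrt(target_gv / global_variance(traj))
    return mean + (traj - mean) * scale


# Toy data (hypothetical): a "natural" trajectory and an over-smoothed copy.
rng = np.random.default_rng(0)
natural = rng.normal(size=(200, 3))
kernel = np.ones(9) / 9  # moving average standing in for over-smoothing
smoothed = np.column_stack(
    [np.convolve(natural[:, d], kernel, mode="same") for d in range(3)]
)

# Restore the global variance of the smoothed trajectory by rescaling.
restored = scale_to_target_gv(smoothed, global_variance(natural))
```

Variance scaling of this kind adjusts each dimension independently after generation, whereas the paper analyses generation under a global variance term directly, so the sketch should be read only as an illustration of the quantity involved.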

BibTeX entry:

@inproceedings{shannonicassp13,
   author = {M. Shannon and W. Byrne},
   title = {FAST, LOW-ARTIFACT SPEECH SYNTHESIS CONSIDERING GLOBAL VARIANCE},
   booktitle = {Proceedings of IEEE Conference on Acoustics, Speech and
	Signal Processing},
   month = jun,
   year = {2013},
   url = {http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6639196}
}
