Investigating Automatic and Human Filled Pause Insertion for Synthetic Speech

“Investigating Automatic and Human Filled Pause Insertion for Synthetic Speech” by Rasmus Dall, Marcus Tomalin, Mirjam Wester, William Byrne, and Simon King. In Proceedings of INTERSPEECH, Sep. 2014.

Abstract

Filled pauses are pervasive in conversational speech and have been shown to serve a range of psychological and structural purposes. Despite this, they are seldom modelled overtly by state-of-the-art speech synthesis systems. This paper seeks to motivate the incorporation of filled pauses into speech synthesis systems by exploring their use in conversational speech, and by comparing the performance of several automatic systems that insert filled pauses into fluent texts. Two initial experiments are described which seek to determine whether people's predictions about appropriate insertion points for filled pauses are consistent with actual practice and/or with each other. The experiments also investigate whether there are 'right' and 'wrong' places to insert filled pauses in a given sentence. The results summarised in this paper show good consistency between people's predictions of usage and their actual practice, as well as a perceptual preference for the 'right' placement. The third experiment contrasts the performance of several automatic systems that insert filled pauses into fluent sentences. The best performance (as determined by precision, recall and F-measure) was produced by interpolating a Recurrent Neural Network and a 4-gram Language Model. The research presented in this paper offers new insights into the way in which filled pauses are used and perceived by humans, and how automatic systems can be used to predict the locations of filled pauses in fluent input text.
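The abstract names two technical ingredients without detailing them: interpolation of two language models, and scoring by precision, recall and F-measure. The sketch below illustrates both in a minimal form. It is not taken from the paper: linear interpolation with a single weight `lam` is an assumption (the abstract does not say which interpolation scheme was used), the F-measure is assumed to be the balanced F1, and the function names `interpolate` and `precision_recall_f1` are hypothetical.

```python
def interpolate(p_rnn, p_ngram, lam=0.5):
    """Linearly interpolate two model probabilities for the same event.

    Assumption: a single fixed weight `lam` on the RNN probability;
    the paper's actual interpolation scheme is not given in the abstract.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * p_rnn + (1.0 - lam) * p_ngram


def precision_recall_f1(predicted, reference):
    """Score predicted filled-pause positions against reference positions.

    Both arguments are iterables of word-boundary indices (hypothetical
    representation). Returns (precision, recall, F1).
    """
    predicted, reference = set(predicted), set(reference)
    true_pos = len(predicted & reference)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(reference) if reference else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

For example, `precision_recall_f1([1, 3, 5], [3, 5, 8, 9])` counts two correct insertions out of three predicted and four reference positions, giving a precision of 2/3 and a recall of 1/2.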

BibTeX entry:

@inproceedings{Dall_Interspeech14,
   author = {Rasmus Dall and Marcus Tomalin and Mirjam Wester and William
	Byrne and Simon King},
   title = {Investigating Automatic and Human Filled Pause Insertion for
	Synthetic Speech},
   booktitle = {Proceedings of {INTERSPEECH}},
   pages = {(4 pages)},
   month = sep,
   year = {2014}
}
