Proc. ESCA workshop: Comparing Speech Signal Representations


Tony Robinson

April 1992

This paper examines the data representations used in recurrent networks for two of the sentences supplied for the workshop. One sentence, taken from the database on which the network was trained (timit), is used to illustrate the input, state and output representations for clean speech. Another sentence (clean) is used to illustrate the degradation that results from different recording conditions. Gradient descent in the input space is then applied to the second sentence so that the output conforms more closely to the assumed pronunciation.
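As a rough illustration of what gradient descent in the input space means, the sketch below holds the weights of a classifier fixed and adjusts only the input vector so that the output moves towards a chosen target class. It is a minimal sketch under stated assumptions, not the recurrent network described in this paper: a single-frame linear softmax classifier is substituted for simplicity, and the weights, input values, learning rate and function names are all hypothetical.

```python
import math

def softmax(z):
    # Numerically stable softmax over a list of logits.
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def forward(W, b, x):
    # Class probabilities for one input frame x; logit k is W[k] . x + b[k].
    logits = [sum(w * xi for w, xi in zip(row, x)) + bk
              for row, bk in zip(W, b)]
    return softmax(logits)

def input_gradient(W, b, x, target):
    # Gradient of the cross-entropy loss -log p[target] with respect to
    # the INPUT x (the weights are treated as constants).
    p = forward(W, b, x)
    err = [pk - (1.0 if k == target else 0.0) for k, pk in enumerate(p)]
    return [sum(err[k] * W[k][i] for k in range(len(W)))
            for i in range(len(x))]

def descend_input(W, b, x, target, rate=0.5, steps=50):
    # Repeatedly step the input against the gradient; W and b never change.
    x = list(x)
    for _ in range(steps):
        g = input_gradient(W, b, x, target)
        x = [xi - rate * gi for xi, gi in zip(x, g)]
    return x

# Hypothetical fixed weights: 3 output classes, 2 input features.
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b = [0.0, 0.0, 0.0]
x0 = [0.2, -0.1]   # hypothetical "degraded" input frame
target = 1         # class assumed correct for this frame

x1 = descend_input(W, b, x0, target)
p_before = forward(W, b, x0)[target]
p_after = forward(W, b, x1)[target]
print(p_before, p_after)
```

In a recurrent network the same idea would require propagating the error back through time to every input frame; the sketch omits this and treats a single frame in isolation.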

