Abstract for burrows_icassp95

Proc. ICASSP'95


Tina Burrows and Mahesan Niranjan


In this paper, the speech production system is modelled using the true glottal excitation as the source and a recurrent neural network to represent the vocal tract. The hidden nodes have multiple delays of one and two samples, making the network equivalent to a parallel formant synthesiser in the linear regions of the hidden node sigmoids. An ARX model identification is carried out to initialise the neural network parameters. These parameters are re-estimated in an analysis-by-synthesis framework to minimise the synthesis (output) error. Unlike other analysis-by-synthesis speech production models such as CELP, the source and filter in this approach are decoupled, enabling manipulation of the source time-scale to achieve high quality pitch changes.

(ftp:) burrows_icassp95.ps.Z (http:) burrows_icassp95.ps.Z
PDF (automatically generated from original PostScript document - may be badly aliased on screen):
  (ftp:) burrows_icassp95.pdf | (http:) burrows_icassp95.pdf

If you have difficulty viewing files that end '.gz', which are gzip compressed, then you may be able to find tools to uncompress them at the gzip web site.

If you have difficulty viewing files that are in PostScript, (ending '.ps' or '.ps.gz'), then you may be able to find tools to view them at the gsview web site.

We have attempted to provide automatically generated PDF copies of documents for which only PostScript versions have previously been available. These are clearly marked in the database - due to the nature of the automatic conversion process, they are likely to be badly aliased when viewed at default resolution on screen by acroread.