Matt Shannon
Position: PhD student in statistical speech synthesis
Email: sms46 AT cam ac uk
Supervisor: Dr Bill Byrne
Research
Acoustic modelling for statistical speech synthesis. For the first part of my PhD I was part of the EMIME (Effective Multilingual Interaction in Mobile Environments) project for personalized speech synthesis. I am currently part of the EPSRC Natural Speech Technology project.
Software
-
armspeech
Flexible Python framework for probabilistic modelling of speech, with a focus on autoregressive models. Allows the use of more complicated autoregressive-style output distributions than the HTS implementation below. Includes example experiments.
-
Autoregressive HMM for HTS
An implementation of the autoregressive HMM built on top of HTS (HMM-based Speech Synthesis System). It supports embedded re-estimation, decision tree clustering and synthesis using the autoregressive HMM. A usage example is available (an autoregressive HMM version of the HTS demo).
Journal papers
-
M. Shannon, H. Zen and W. Byrne (2013)
Autoregressive models for statistical parametric speech synthesis
IEEE Transactions on Audio, Speech, and Language Processing (vol. 21, no. 3, pp. 587–597)
[ postprint | publisher ]
Conference papers
-
M. Shannon and W. Byrne (2013)
Fast, low-artifact speech synthesis considering global variance
Proc. ICASSP 2013
[ postprint ]
-
M. Shannon, H. Zen and W. Byrne (2011)
The effect of using normalized models in statistical speech synthesis
Proc. Interspeech 2011 (pp. 121–124)
[ postprint | publisher | slides ]
-
M. Shannon and W. Byrne (2010)
Autoregressive clustering for HMM speech synthesis
Proc. Interspeech 2010 (pp. 829–832)
[ postprint | publisher | poster ]
-
M. Shannon and W. Byrne (2009)
Autoregressive HMMs for speech synthesis
Proc. Interspeech 2009 (pp. 400–403)
[ postprint | publisher | slides ]
Technical reports
-
M. Shannon and W. Byrne (2013)
Partially analytic speech parameter generation considering global variance
Department of Engineering, University of Cambridge, UK
Technical Report CUED/F-INFENG/TR.682
(to appear; a preliminary version can be obtained by emailing me)
-
M. Shannon and W. Byrne (2012)
Viewing the trajectory HMM as a generalized autoregressive HMM
Department of Engineering, University of Cambridge, UK
Technical Report CUED/F-INFENG/TR.677
(to appear; a preliminary version can be obtained by emailing me)
-
M. Shannon and W. Byrne (2009)
A formulation of the autoregressive HMM for speech synthesis
Department of Engineering, University of Cambridge, UK
Technical Report CUED/F-INFENG/TR.629
[ pdf ]
Theses
-
M. Shannon and M.J.F. Gales (2008)
Sampling methods for instantaneous speaker adaptation
MPhil thesis, University of Cambridge, UK
[ final with corrections | final | slides ]
Other talks and posters
-
M. Shannon (April 2013)
Towards better probabilistic models of speech: why sampled trajectories sound bad and how to fix them – a discussion
Speech group seminar, University of Cambridge, UK
(led an informal discussion)
-
M. Shannon (work with W. Byrne) (Jan 2013)
Fast, low-artifact speech synthesis considering GV
NST project meeting, University of Cambridge, UK
[ slides ]
-
M. Shannon (work with W. Byrne) (Dec 2012)
An analysis of parameter generation considering global variance
UKSpeech two-day conference, University of Birmingham, UK
[ poster ]
-
M. Shannon (Mar 2011)
The effect of normalization – a case study in speech synthesis
Machine Learning RCC, University of Cambridge, UK
[ slides | talks.cam ]
-
M. Shannon and H. Zen (Jan 2011)
Modelling trajectories in statistical speech synthesis
Cambridge statistical speech synthesis (SSS) seminar series
University of Cambridge, UK
[ my slides | Heiga's slides | talks.cam ]
-
M. Shannon and S. Bratières (May 2010)
Topics in statistical machine translation
Machine Learning RCC, University of Cambridge, UK
[ talks.cam ]
-
M. Shannon (work with W. Byrne) (May 2010)
Autoregressive HMMs for speech synthesis
EMIME workshop, Cambridge, UK
[ slides ]
-
M. Shannon (work by Yee Whye Teh and others) (Jan 2009)
A hierarchical Bayesian language model based on Pitman-Yor processes
Machine Learning RCC, University of Cambridge, UK
[ talks.cam ]
Teaching
- Demonstrator for the CSTIT MPhil course (2008-9 and 2009-10) and the Advanced Computer Science MPhil course (2012-13)
Contact information
Baker Building, BE5-02
Engineering Department
Trumpington Street, Cambridge
CB2 1PZ, United Kingdom
