Speech Processing II

The aim of this module is to provide core material on building modern large vocabulary speech recognition systems, covering acoustic modelling, language modelling and search. It follows on from the material presented in the Michaelmas Speech Processing I course. The course also provides an introduction to speaker recognition.

The lecture notes should be available online just before the lectures.

Lecture 2: Hidden Markov Models I.
Review of HMMs: assumptions, structure, operation and training, continuous density HMMs, continuous density HMMs with Gaussian mixture models.
Notes available in [ps] [pdf]
Lecture 3: Hidden Markov Models II.
Discrete HMMs, semi-continuous HMMs, duration modelling.
Notes available in [ps] [pdf]
Lecture 4: Issues in HMM Training.
Discriminative training, system trainability: MAP training and parameter tying, number of state components, building an LVCSR HMM system.
Notes available in [ps] [pdf]
Lecture 8: Viterbi Decoder Review.
Viterbi algorithm, token passing paradigm, traceback, pruning.
Notes available in [ps] [pdf]
Lecture 9: Issues in LVCSR Search.
Incorporating N-gram language models, incorporating context dependent acoustic models, tree structured lexicons.
Notes available in [ps] [pdf]
Lecture 10: Generation and Use of Multiple Solutions.
N-best lists, lattices, lattice expansion.
Notes available in [ps] [pdf]
Lecture 15: Introduction to Neural Networks.
Neural network architectures, multi-layer perceptrons, error back-propagation algorithm.
Notes available in [ps] [pdf]
Lecture 16: Hybrid Systems.
Neural networks for speech recognition, hybrid architecture.
Notes available in [ps] [pdf]

For those interested in more detail about neural networks and how they are trained, additional lecture notes from a 4th-year undergraduate module are available online.

