Speech Processing II
The aim of this module is to provide core material on building modern
large vocabulary speech recognition systems, covering acoustic
modelling, language modelling and search. It follows on from the
material presented in the Michaelmas Speech Processing I course. The course also
provides an introduction to speaker recognition.
The lecture notes should be available online just before the lectures.
- Lecture 2: Hidden Markov Models I.
Review of HMMs: assumptions, structure, operation and training, continuous
density HMMs, continuous density HMMs with Gaussian mixture models.
Notes available in [ps] [pdf]
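As a concrete illustration of the continuous density case with Gaussian mixture output distributions, here is a minimal sketch (not taken from the notes; the function name and toy parameters are assumptions) that computes the log output probability of one observation for a single HMM state with a diagonal-covariance mixture:

```python
# Illustrative sketch: log b_j(x) for one HMM state whose output distribution
# is a Gaussian mixture with diagonal covariances (toy example, not from the notes).
import numpy as np

def gmm_state_log_likelihood(x, weights, means, variances):
    """log b_j(x) = log sum_m c_m N(x; mu_m, Sigma_m) for one state j.
    x: (D,) observation; weights: (M,); means: (M, D); variances: (M, D) diagonals."""
    x = np.asarray(x, dtype=float)
    D = x.shape[0]
    # Per-component log Gaussian densities with diagonal covariance.
    log_norm = -0.5 * (D * np.log(2 * np.pi) + np.sum(np.log(variances), axis=1))
    log_exp = -0.5 * np.sum((x - means) ** 2 / variances, axis=1)
    log_comp = np.log(weights) + log_norm + log_exp
    # Log-sum-exp over mixture components for numerical stability.
    m = np.max(log_comp)
    return m + np.log(np.sum(np.exp(log_comp - m)))

# Toy usage: a 2-component mixture over 3-dimensional observations.
w = np.array([0.6, 0.4])
mu = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
var = np.ones((2, 3))
print(gmm_state_log_likelihood(np.array([0.5, 0.2, -0.1]), w, mu, var))
```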
- Lecture 3: Hidden Markov Models II.
Discrete HMMs, semi-continuous HMMs, duration modelling.
Notes available in [ps] [pdf]
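For the discrete HMM case, a small sketch (assumed codebook and state distribution, not from the notes) of how an observation is vector quantised against a codebook and its symbol probability looked up in a state's discrete output distribution:

```python
# Illustrative sketch: discrete HMM output probability via vector quantisation
# (toy example, not from the notes).
import numpy as np

def discrete_output_log_prob(x, codebook, state_symbol_probs):
    """x: (D,) observation; codebook: (K, D) VQ centroids;
    state_symbol_probs: (K,) discrete output distribution for one state."""
    # Vector quantisation: nearest codebook entry under Euclidean distance.
    k = int(np.argmin(np.sum((codebook - x) ** 2, axis=1)))
    return float(np.log(state_symbol_probs[k]))

# Toy usage with a 4-entry codebook over 2-dimensional observations.
codebook = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
probs = np.array([0.4, 0.3, 0.2, 0.1])
print(discrete_output_log_prob(np.array([0.9, 0.1]), codebook, probs))
```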
- Lecture 4: Issues in HMM Training.
Discriminative training; system trainability: MAP training, parameter tying
and the number of state components; building an LVCSR HMM system.
Notes available in [ps] [pdf]
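As a pointer to what MAP training of the Gaussian parameters involves, the following sketch uses the standard MAP mean re-estimation formula (variable names and the prior weight are illustrative assumptions, not taken from the notes), interpolating a prior mean with the adaptation-data statistics:

```python
# Illustrative sketch: MAP re-estimation of a Gaussian mean, weighting a prior
# mean against the occupancy-weighted training/adaptation data.
import numpy as np

def map_mean_update(prior_mean, tau, gammas, observations):
    """prior_mean: (D,) mean from the prior (e.g. a well-trained baseline model);
    tau: scalar weight on the prior; gammas: (T,) occupation probabilities;
    observations: (T, D) data. Returns the MAP estimate of the mean."""
    occ = np.sum(gammas)                      # total occupancy for this Gaussian
    weighted_sum = gammas @ observations      # sum_t gamma_t * x_t
    return (tau * prior_mean + weighted_sum) / (tau + occ)

# Toy usage: with little data the estimate stays close to the prior mean.
print(map_mean_update(np.zeros(2), tau=10.0,
                      gammas=np.array([0.9, 0.8]),
                      observations=np.array([[1.0, 1.0], [1.2, 0.8]])))
```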
- Lecture 8: Viterbi Decoder Review.
Viterbi algorithm, token passing paradigm,
traceback, pruning.
Notes available in [ps] [pdf]
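A minimal sketch of Viterbi decoding phrased as token passing with traceback (toy example, not the decoder implementation described in the notes; pruning is omitted):

```python
# Illustrative sketch: Viterbi decoding as token passing, with traceback.
import numpy as np

def viterbi_token_passing(log_pi, log_A, log_B):
    """log_pi: (S,) initial state log probs; log_A: (S, S) transition log probs;
    log_B: (T, S) per-frame state output log-likelihoods.
    Returns (best log probability, best state sequence)."""
    T, S = log_B.shape
    tokens = log_pi + log_B[0]                 # one token per state
    backptr = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        # Propagate every token along every transition; each state keeps only
        # the best incoming token (no pruning in this sketch).
        scores = tokens[:, None] + log_A       # scores[from_state, to_state]
        backptr[t] = np.argmax(scores, axis=0)
        tokens = scores[backptr[t], np.arange(S)] + log_B[t]
    # Traceback from the best final token.
    best_last = int(np.argmax(tokens))
    path = [best_last]
    for t in range(T - 1, 0, -1):
        path.append(int(backptr[t][path[-1]]))
    return float(tokens[best_last]), path[::-1]
```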
- Lecture 9: Issues in LVCSR search.
Incorporating N-gram language models, incorporating
context dependent acoustic models, tree structured lexicons.
Notes available in [ps] [pdf]
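To illustrate the idea of a tree structured lexicon, the sketch below (toy words with assumed phone strings, not from the notes) builds a prefix tree in which words sharing an initial phone sequence share the corresponding nodes, so the shared part of the search network is evaluated only once:

```python
# Illustrative sketch: a tree structured (prefix-shared) pronunciation lexicon.
def build_lexicon_tree(lexicon):
    """lexicon: dict mapping word -> list of phones.
    Returns a nested dict keyed by phone; a node's "words" entry lists the
    words whose pronunciation ends at that node."""
    root = {}
    for word, phones in lexicon.items():
        node = root
        for phone in phones:
            node = node.setdefault(phone, {})
        node.setdefault("words", []).append(word)
    return root

# Toy example: "speech", "speak" and "spin" share the initial "s" -> "p" arcs.
lexicon = {
    "speech": ["s", "p", "iy", "ch"],
    "speak":  ["s", "p", "iy", "k"],
    "spin":   ["s", "p", "ih", "n"],
}
print(build_lexicon_tree(lexicon))
```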
- Lecture 10: Generation and Use of Multiple Solutions.
N-best lists, lattices,
lattice expansion.
Notes available in [ps] [pdf]
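As a toy illustration of producing an N-best list from a lattice (my own example; a real decoder would use an exact or A*-style traversal rather than full path enumeration), the sketch below scores every path through a small acyclic word lattice and keeps the top N:

```python
# Illustrative sketch: N-best extraction from a small acyclic word lattice.
import heapq

def nbest_from_lattice(arcs, start, end, n):
    """arcs: dict node -> list of (next_node, word, log_score).
    Enumerates all start-to-end paths (feasible only for small lattices) and
    returns the n highest-scoring (score, word sequence) pairs."""
    paths = []
    def dfs(node, words, score):
        if node == end:
            paths.append((score, list(words)))
            return
        for nxt, word, s in arcs.get(node, []):
            dfs(nxt, words + [word], score + s)
    dfs(start, [], 0.0)
    return heapq.nlargest(n, paths, key=lambda p: p[0])

# Toy lattice with two competing word choices at each position.
arcs = {
    0: [(1, "recognise", -1.2), (1, "wreck a nice", -1.5)],
    1: [(2, "speech", -0.4), (2, "beach", -0.9)],
}
print(nbest_from_lattice(arcs, start=0, end=2, n=3))
```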
- Lecture 15: Introduction to Neural Networks.
Neural network architectures, multi-layer perceptrons, error back-propagation algorithm.
Notes available in [ps] [pdf]
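A minimal sketch of one error back-propagation update for a single-hidden-layer perceptron (sigmoid hidden units, linear outputs, squared-error criterion; the sizes and learning rate are toy assumptions, not taken from the notes):

```python
# Illustrative sketch: one gradient step of error back-propagation for an MLP
# with one sigmoid hidden layer and a squared-error output criterion.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop_step(x, target, W1, b1, W2, b2, lr=0.1):
    # Forward pass.
    h = sigmoid(W1 @ x + b1)                      # hidden activations
    y = W2 @ h + b2                               # linear output layer
    err = y - target
    # Backward pass: propagate the output error through the network.
    delta_out = err                               # d(0.5*||err||^2)/dy
    delta_hid = (W2.T @ delta_out) * h * (1 - h)  # chain rule through the sigmoid
    # Gradient-descent parameter updates (in place).
    W2 -= lr * np.outer(delta_out, h)
    b2 -= lr * delta_out
    W1 -= lr * np.outer(delta_hid, x)
    b1 -= lr * delta_hid
    return 0.5 * float(err @ err)                 # current squared error

# Toy usage: 3 inputs, 4 hidden units, 2 outputs.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)
for _ in range(200):
    loss = backprop_step(np.array([0.2, -0.4, 0.7]), np.array([1.0, 0.0]),
                         W1, b1, W2, b2)
print(loss)
```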
- Lecture 16: Hybrid Systems.
Neural networks for speech recognition, hybrid architectures.
Notes available in [ps] [pdf]
For those interested in more detail about neural networks and how they are
trained, additional lecture notes are available online from a 4th year
undergraduate module.