[Univ of Cambridge] [Dept of Engineering]

Speech Reading Club


This page contains information about the Speech Reading Club to be run in Lent term 2005. For any queries or problems, please contact me by email at mjfg@eng.cam.ac.uk.

The speech reading club is only available to part-time students in their first year of study. It is not an option for part-time students in their second year.

If there is a theme, or topic within a theme, that you are very interested in and that is not listed, contact me and we can discuss including it.

Aims

The aim of the speech reading club is to investigate a specific area mentioned in the Speech Processing modules, review the state of the art and understand how it might be applied in a modern speech processing task. Students are expected to look at a particular theme, and topic within that theme, and be able to present a concise summary of the area, along with a detailed essay on their particular topic.

Structure

The broad structure of the module is described below. The details may vary depending on the number of students that select the module.
  • By 18th November 2004 students will be asked to decide whether they are taking the Speech Reading Club module. If the module is chosen, a preference for a particular theme/topic may be expressed. The maximum number of students allowed to take the Speech Reading Club is 60% of the CSTIT course.

  • Students will be divided into groups of 3-5. Associated with each group there is a particular theme. Each student in the group will also be assigned a particular topic within that theme. Where possible the assignment of themes and topics will reflect any preferences expressed. However, not all themes or topics will be run.

  • Papers associated with each theme/topic will be distributed, or made available on the web, at the start of Lent term. For each theme a 1 hour discussion session with the theme organiser will be organised during weeks 3 and 4 of Lent term. Specific problems may also be discussed via email.

  • During weeks 5 to 8 of Lent term a one to two hour presentation of each theme will be given. Each will consist of a 5-10 minute overview of the theme, jointly prepared by all members of the group. Each student will then give a 15 minute presentation of their particular topic. Each presentation is followed by a short discussion about the topic led by the student and theme organiser.

    ALL students taking the Speech Reading Club module must attend ALL presentations.

  • Each student writes an essay that describes the general issues in their theme area and gives a detailed discussion of their particular topic. The maximum word length for the essays is 5000 words. The essay must be handed in by 26th April 2005.
Important dates are:
  • 18th November 2004: decision on whether to do the speech reading club module.
  • 25th November 2004: list of topic and theme assignments circulated.
  • Weeks 5-8 Lent term: speech reading club presentations.
  • 26th April 2005: hand-in date for the speech reading club essays.

Themes

Here is a preliminary list of themes. If more details are required for any of the themes or topics, contact me or the theme organiser. If you are interested in a theme (area of speech processing/research) not mentioned below, suggestions are welcome. Not all themes will be run. The exact choice will depend on the number of students and their preferences.

Assessment

The assessment for the Speech Reading Club is by essay. The essay should give an overview of the theme and a detailed discussion of the particular topic within that theme. The maximum length of the essay is 5000 words. The submission date for the essays is 26th April 2005. Late submission will be penalised in the same fashion as late submission of practical work.

Themes and Topics References

The following references are quite extensive. If you are interested in getting a brief overview of the theme, look in the reference marked with a (*), or look in the associated chapter of
  • X. Huang, A. Acero and H-W Hon, Spoken Language Processing, Prentice Hall.
Note: not all themes, or topics within a theme, will be run. The topics may vary slightly depending on the number of people who select a theme.
  • Acoustic Modelling
  • The references given for each of the themes and topics should be considered as starting points for further investigation. Students are expected to look at additional papers. If further help is required contact the theme organiser.

    For additional reading, the last few years' proceedings from ICASSP, Eurospeech and ICSLP are available.

    Please do not print out copies of the longer papers - contact me first.


    SPEECH SYNTHESIS (Paul Taylor):

    Topics:
    • Acoustics of Speech Production
      • basics of how sound waves are produced and how they travel
      • source/filter model of speech production
      • models of vocal tracts as tubes
      • sound waves in tubes
      • sound sources, formants, linear prediction
    • Prosodic Modelling
      • basic prosody models
      • automatic recognition of prosody
      • synthesis of prosody
      • using prosodic information in speech recognition
    References:


    DISCRIMINATIVE TRAINING (Mark Gales and Bill Byrne):

    Topics:
    • Maximum mutual information (MMI) and frame discrimination (FD) training criteria;
    • Minimum classification error rate training criterion;
    • Discriminative training for large vocabulary systems;
    • Discriminative methods for speaker adaptation and feature extraction.
    References:


    SPEAKER ADAPTATION (Mark Gales):

    Topics:
    • Speaker clustering, eigenvoices and cluster adaptive training;
    • Linear model-based adaptation schemes and adaptive training, maximum likelihood linear regression (MLLR) and constrained MLLR;
    • Maximum a-posteriori (MAP) adaptation schemes and extensions, e.g. adaptation by correlation.
    References:


    NOISE ROBUSTNESS (Mark Gales):

    Topics:
    • Enhancement schemes (including model-based enhancement);
    • Predictive and adaptive model-based compensation schemes;
    • Inherently robust frontends and models.
    References:


    STATISTICAL MACHINE TRANSLATION (Bill Byrne):

    Topics:
    • Bitext Alignment
    • Word Alignment in Bitext
    • Models and Algorithms for Statistical Translation
    References:


    BAYESIAN NETWORKS AND SEGMENT MODELS (Mark Gales):

    Topics:
    • Dynamic Bayesian networks and graphical models for speech recognition;
    • Distributed representations for speech recognition;
    • Linear dynamical and factor analysed systems;
    • Efficient covariance modelling.
    References:

