[Univ of Cambridge] [Dept of Engineering]

Mark Gales - 4th Year Projects


Here are the projects that will be offered for the year 2002-2003. Please look at the papers and associated links for more information. If you are interested in any of these projects, it is important that you contact me so that we can discuss the work involved. For any queries or requests for further details, please contact me by email at mjfg@eng.cam.ac.uk

There are some notes on statistical pattern processing on-line.

Multi-Class SVMs (E-MJFG1)

    Support vector machines (SVMs) are a popular and successful form of classifier. However, one of the limitations of "standard" SVMs is that they can only handle binary (two-class) classification. There are two basic approaches to handling the multi-class problem for SVMs. The first decomposes the task into a series of binary problems and then combines the results from the multiple classifications to give the final result. Schemes of this form include "one-versus-the-rest" and "one-versus-one" systems and, more recently, error correcting output codes (ECOC). Alternatively, the structure of the SVM itself is modified so that it handles multi-class data directly. The problem with such multi-class SVMs is that they are computationally expensive.
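
    As a rough illustration of the decomposition approach, the sketch below builds "one-versus-the-rest" and "one-versus-one" classifiers from binary ones. It is only a sketch under stated assumptions: train_binary is a hypothetical stand-in (a nearest-class-mean linear scorer) used in place of a real binary SVM trainer, and all function names and the toy data are illustrative.

```python
from itertools import combinations


def train_binary(points, labels):
    """Hypothetical stand-in for a binary SVM trainer (assumption).

    Builds a linear scorer from the two class means and returns a function
    whose output is positive for label +1 and negative for label -1.
    """
    dim = len(points[0])
    pos = [p for p, y in zip(points, labels) if y == +1]
    neg = [p for p, y in zip(points, labels) if y == -1]
    mu_pos = [sum(p[d] for p in pos) / len(pos) for d in range(dim)]
    mu_neg = [sum(p[d] for p in neg) / len(neg) for d in range(dim)]
    w = [a - b for a, b in zip(mu_pos, mu_neg)]
    mid = [(a + b) / 2 for a, b in zip(mu_pos, mu_neg)]
    bias = -sum(wi * mi for wi, mi in zip(w, mid))
    return lambda x: sum(wi * xi for wi, xi in zip(w, x)) + bias


def train_one_vs_rest(points, classes):
    """One binary problem per class: that class (+1) against all others (-1)."""
    models = {}
    for c in sorted(set(classes)):
        labels = [+1 if y == c else -1 for y in classes]
        models[c] = train_binary(points, labels)
    return models


def predict_one_vs_rest(models, x):
    """Pick the class whose binary scorer gives the largest output."""
    return max(models, key=lambda c: models[c](x))


def train_one_vs_one(points, classes):
    """One binary problem per pair of classes, trained on those two classes only."""
    models = {}
    for a, b in combinations(sorted(set(classes)), 2):
        pairs = [(p, +1 if y == a else -1)
                 for p, y in zip(points, classes) if y in (a, b)]
        models[(a, b)] = train_binary([p for p, _ in pairs],
                                      [y for _, y in pairs])
    return models


def predict_one_vs_one(models, x):
    """Each pairwise scorer casts one vote; the class with most votes wins."""
    votes = {}
    for (a, b), score in models.items():
        winner = a if score(x) > 0 else b
        votes[winner] = votes.get(winner, 0) + 1
    return max(sorted(votes), key=lambda c: votes[c])


# Toy three-class data (illustrative only).
points = [(0, 0), (1, 0), (0, 1), (5, 0), (6, 0), (5, 1), (0, 5), (1, 5), (0, 6)]
classes = [0, 0, 0, 1, 1, 1, 2, 2, 2]
ovr = train_one_vs_rest(points, classes)
ovo = train_one_vs_one(points, classes)
print(predict_one_vs_rest(ovr, (5.5, 0.5)), predict_one_vs_one(ovo, (5.5, 0.5)))
```

    With K classes, one-versus-the-rest trains K binary classifiers on all the data, whereas one-versus-one trains K(K-1)/2 classifiers, each on the data from two classes only.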

    This project will look at the various options and trade-offs involved in multi-class classification with support vector machines. Initially the work will concentrate on artificial data. If time allows, phone classification experiments will be performed.

  • C-W Hsu and C-J Lin (2001), A Comparison of Methods for Multi-Class Support Vector Machines.
  • J Weston and C Watkins (1998), Multi-Class Support Vector Machines. Royal Holloway Technical Report.
  • See the SVM book page for information about SVMs

SVMs as a Frontend for Speech Recognition (E-MJFG2)

    Support vector machines are a powerful static binary classification scheme. In recent years there has been interest in how to apply these schemes to dynamic, variable-length data, such as speech. Two approaches have been adopted. The first uses some transformation of the variable-length data to generate a fixed-length observation. The second converts the distance from the decision boundary into a probability for each frame. These probabilities are then multiplied together to give the probability of the data sequence.
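
    The second approach can be sketched as follows, assuming the sigmoid mapping of Platt (1999) is used to turn each frame's distance into a probability. The parameter values a and b below are placeholders (in practice they would be fitted on held-out data), and the function names are illustrative. Log probabilities are summed, which is equivalent to multiplying the per-frame probabilities but numerically safer.

```python
import math


def platt_probability(distance, a=-2.0, b=0.0):
    """Map a signed distance from the decision boundary to P(class = +1).

    Uses the sigmoid form 1 / (1 + exp(a * distance + b)). The default
    values of a and b are placeholders, not fitted parameters.
    """
    return 1.0 / (1.0 + math.exp(a * distance + b))


def sequence_log_probability(distances, a=-2.0, b=0.0):
    """Log of the product of the per-frame probabilities."""
    return sum(math.log(platt_probability(d, a, b)) for d in distances)


# Hypothetical per-frame SVM distances for one utterance.
frame_distances = [1.5, 0.8, -0.2, 2.0]
print(sequence_log_probability(frame_distances))
```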

    This project will look at an alternative approach. Rather than using the distance from the decision boundary to determine a probability, the classification output from the SVM will be used to determine a discrete output space. This discrete output space will then be used to train a discrete HMM system. The project will use speech data from a medium vocabulary speech recognition task to evaluate this form of frontend compared to a standard frontend. The appropriate set of SVMs to use for the frontend will be examined. This may make use of techniques similar to those described in the multi-class SVM project (E-MJFG1).
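
    One possible way of forming such a discrete output space is sketched below. This is an assumption made for illustration, not the scheme the project prescribes: each frame is passed through a bank of binary classifiers, and the pattern of hard decisions is packed into a single integer symbol that a discrete HMM could then model. The linear decision functions stand in for trained SVMs, and all names are hypothetical.

```python
def svm_decisions(frame, classifiers):
    """Hard 0/1 decisions from a bank of linear decision functions.

    Each classifier is a (weights, bias) pair standing in for a trained SVM.
    """
    return [1 if sum(w * x for w, x in zip(weights, frame)) + bias > 0 else 0
            for weights, bias in classifiers]


def to_symbol(bits):
    """Pack a list of 0/1 decisions into one integer symbol."""
    symbol = 0
    for bit in bits:
        symbol = (symbol << 1) | bit
    return symbol


def quantise_sequence(frames, classifiers):
    """Convert a sequence of feature vectors into discrete symbols."""
    return [to_symbol(svm_decisions(frame, classifiers)) for frame in frames]


# Two illustrative classifiers give 2**2 = 4 possible output symbols.
classifiers = [((1.0, 0.0), 0.0), ((0.0, 1.0), 0.0)]
frames = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
print(quantise_sequence(frames, classifiers))
```

    With K binary classifiers this coding gives up to 2^K symbols, so the choice of classifier set directly controls the size of the discrete HMM output distributions.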

  • N. Smith and M.J.F. Gales (2001), Speech Recognition using SVMs. NIPS 2001.
  • J.C. Platt (1999), Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods. Advances in Large Margin Classifiers.

Minimax Probability Machine (E-MJFG3)

    One way to decide where to place a decision boundary is to make assumptions about the class-conditional distributions of the data. This allows decision boundaries to be determined and the probability of error to be estimated. However, unless the actual distributions are known, the generality and validity of such an approach are questionable. Alternatively, class-conditional distributions can be dispensed with altogether, for example by using support vector machines. This project examines a recently proposed scheme that uses moments of the data, rather than assuming a specific distributional form, to determine the position of the decision boundary. The decision boundary is positioned so that it minimises the maximum probability of error over all distributions having the known (estimated) moments.
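
    The idea can be sketched for the case where the moments are the class means and covariances, as in the Lanckriet et al. reference. For a fixed direction a, the margin kappa, the offset b and the worst-case error bound 1 / (1 + kappa^2) follow in closed form; the minimax classifier is the direction that maximises kappa. The brute-force scan over 2-D directions below is a simplification for illustration and is not the optimisation procedure used in the reference; all function names are hypothetical.

```python
import math


def quad_form(a, cov):
    """Compute a^T cov a for a vector a and a square matrix cov."""
    n = len(a)
    return sum(a[i] * cov[i][j] * a[j] for i in range(n) for j in range(n))


def mpm_for_direction(a, mean_x, cov_x, mean_y, cov_y):
    """Return (kappa, b, worst_case_error) for the boundary a^T z = b.

    Assumes a^T (mean_x - mean_y) > 0 and positive definite covariances.
    """
    gap = sum(ai * (mx - my) for ai, mx, my in zip(a, mean_x, mean_y))
    spread_x = math.sqrt(quad_form(a, cov_x))
    spread_y = math.sqrt(quad_form(a, cov_y))
    kappa = gap / (spread_x + spread_y)
    b = sum(ai * mx for ai, mx in zip(a, mean_x)) - kappa * spread_x
    return kappa, b, 1.0 / (1.0 + kappa ** 2)


def best_direction_2d(mean_x, cov_x, mean_y, cov_y, steps=3600):
    """Scan unit directions in 2-D and keep the one with the largest kappa."""
    best = None
    for k in range(steps):
        theta = 2.0 * math.pi * k / steps
        a = (math.cos(theta), math.sin(theta))
        kappa, b, err = mpm_for_direction(a, mean_x, cov_x, mean_y, cov_y)
        if best is None or kappa > best[1]:
            best = (a, kappa, b, err)
    return best


# Illustrative moments: two classes with identity covariance.
identity = [[1.0, 0.0], [0.0, 1.0]]
a, kappa, b, err = best_direction_2d((1.0, 0.0), identity, (-1.0, 0.0), identity)
print(a, kappa, b, err)
```

    The bound holds for every distribution with the given moments, so it can be far more pessimistic than the error actually observed on data from, say, Gaussian classes.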

    This project will examine the nature of this form of linear classifier. The performance of the classifier will be examined on both artificial and real data. In addition, the project will examine how well the maximum probability of error corresponds to the error that can be obtained in theory on artificial data and empirically on real data. It is expected that the majority of the code will be implemented in MATLAB.

  • G Lanckriet, L El Ghaoui, C Bhattacharyya and M Jordan (2001), Minimax Probability Machine. Presented at NIPS 2001.

Boosting Speech Recognition Systems (E-MJFG4)

    Current large vocabulary speech recognition systems used for evaluations typically combine multiple systems, for example using ROVER. Though performance gains have been obtained using these schemes, no systematic method for generating systems that complement one another has been investigated. Investigating such a method is the aim of this project.

    Boosting is a technique popular in machine learning for sequentially training and combining a collection of classifiers in such a way that later classifiers make up for deficiencies in earlier classifiers. In this fashion multiple classifiers may be trained and used. Recently it has been applied to a state-of-the-art speech recognition system. This project will look at boosting speech recognition systems of various complexities. The trade-offs between number of parameters and recognition performance will be investigated, comparing systems trained using conventional techniques with multiple classifiers trained using boosting. For more information about boosting see the references in the paper below.
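
    The general boosting idea can be sketched with a minimal discrete AdaBoost on one-dimensional toy data, using threshold "stumps" as the weak classifiers. This is a generic textbook illustration rather than the procedure applied to speech recognition systems in the paper below; the function names and data are illustrative.

```python
import math


def stump_predict(x, threshold, polarity):
    """Weak classifier: output polarity if x >= threshold, else -polarity."""
    return polarity if x >= threshold else -polarity


def train_stump(xs, ys, weights):
    """Return (error, threshold, polarity) with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (+1, -1):
            err = sum(w for w, x, y in zip(weights, xs, ys)
                      if stump_predict(x, threshold, polarity) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best


def adaboost(xs, ys, rounds):
    """Train stumps sequentially, re-weighting the examples after each round."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, threshold, polarity))
        # Misclassified examples gain weight, correct ones lose weight.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Weighted vote of all stumps in the ensemble."""
    score = sum(alpha * stump_predict(x, t, p) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1


def training_errors(ensemble, xs, ys):
    return sum(1 for x, y in zip(xs, ys) if predict(ensemble, x) != y)


# Toy data that no single stump can separate.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
for rounds in (1, 3):
    print(rounds, training_errors(adaboost(xs, ys, rounds), xs, ys))
```

    Each later stump concentrates on the examples that the earlier ones misclassified, which is the sense in which later classifiers make up for deficiencies in earlier ones.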

  • G. Zweig and M. Padmanabhan (2000), Boosting Gaussian Mixtures in an LVCSR System. Proceedings ICASSP 2000.
  • See the Boosting web page for a variety of related papers.