

MIL Speech Seminars 2006-2007


The MIL Speech Seminar series schedule for the Long Vacation 2007 was as follows:

10th July 2007 - Professor Stephen Levinson (University of Illinois): Can Robots Learn Language the Way Children Do?

Speech recognition machines are in use in more and more devices and services. Airlines, banks, and telephone companies provide information to customers via spoken queries. You can buy hand-held devices, appliances, and PCs that are operated by spoken commands. And, for around $100, you can buy a program for your laptop that will transcribe speech into text. Unfortunately, automatic speech recognition systems are quite error prone, and they do not understand the meanings of spoken messages in any significant way. I argue that to do so, speech recognition machines would have to possess the same kinds of cognitive abilities that humans display. Engineers have been trying to build machines with human-like abilities to think and use language for nearly 60 years without much success. Are all such efforts doomed to failure? Maybe not. I suggest that if we take a radically different approach, we might succeed. If, instead of trying to program machines to behave intelligently, we design them to learn by experiencing the real world in the same way a child does, we might solve the speech recognition problem in the process. This is the ambitious goal of the research now being conducted in my laboratory. To date, we have constructed three robots that have attained some rudimentary visual navigation and object manipulation abilities, which they can exercise under spoken command.
14th August 2007 - Simon Keizer (University of Tilburg): Multidimensional Dialogue Management and Dialogue Act Recognition using Bayesian Networks

In the first part of the talk I will present my current work as a postdoctoral researcher on the PARADIME project (PARallel Agent-based DIalogue Management Engine). In the project, I have developed a dialogue manager (DM) that takes the multidimensional nature of communication into account. The DM supports the generation of multifunctional system utterances via dialogue act contributions from several agents, each addressing one particular dimension of communication. The DM is integrated into the demonstration system for interactive question answering developed within the Dutch research programme IMIX (Interactive Multimodal Information eXtraction). Because of its particular relevance to the research on probabilistic dialogue modelling in Cambridge, the second part of the talk will be about my PhD research on using Bayesian Networks (BNs) for dialogue modelling. This work consisted chiefly of experiments on using BNs for dialogue act recognition, involving the classification of Forward- and Backward-looking functions from a DAMSL-based annotation scheme.
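The BN-based dialogue act recognition described in the abstract can be illustrated with a minimal sketch. A naive Bayes classifier is the simplest Bayesian network (the act label is the single parent of all observed features); the cue words and DAMSL-style forward-looking labels below are invented for illustration and are not taken from the PARADIME or PhD work itself.

```python
from collections import Counter, defaultdict
import math

# Toy training data: (cue-word features, DAMSL-style forward-looking function).
# Labels and cue words are illustrative only.
TRAIN = [
    (["what", "is", "?"], "info-request"),
    (["where", "can", "?"], "info-request"),
    (["the", "answer", "is"], "assert"),
    (["it", "is", "located"], "assert"),
    (["please", "show"], "action-directive"),
    (["could", "you", "show"], "action-directive"),
]

def train(data):
    """Estimate P(label) and P(word | label) with add-one smoothing."""
    label_counts = Counter(lbl for _, lbl in data)
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, lbl in data:
        word_counts[lbl].update(words)
        vocab.update(words)
    return label_counts, word_counts, vocab

def classify(words, label_counts, word_counts, vocab):
    """Pick argmax over labels of P(label) * prod P(word | label),
    i.e. exact inference in the naive-Bayes network."""
    total = sum(label_counts.values())
    best, best_lp = None, float("-inf")
    for lbl, n in label_counts.items():
        lp = math.log(n / total)
        denom = sum(word_counts[lbl].values()) + len(vocab)
        for w in words:
            lp += math.log((word_counts[lbl][w] + 1) / denom)
        if lp > best_lp:
            best, best_lp = lbl, lp
    return best

model = train(TRAIN)
print(classify(["what", "is", "?"], *model))   # info-request
```

In a richer network one would add nodes for dialogue history and prosodic features alongside the lexical cues; the conditional probability tables are then estimated from an annotated corpus in the same counting-and-smoothing way.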
24th August 2007 - Thomas Fang Zheng (Tsinghua University, Beijing): Dialectal Chinese Speech Recognition

There are eight major dialectal regions in addition to Mandarin (Northern China) in China, which can be further divided into more than 40 sub-categories. Although the Chinese dialects share a written language, and standard Chinese (Putonghua) is widely spoken in most regions, speech is still strongly influenced by the native dialects. This great linguistic diversity poses problems for automatic speech and language technology. Automatic speech recognition relies to a great extent on the consistent pronunciation and usage of words within a language. In Chinese, word usage, pronunciation, syntax, and grammar vary depending on the speaker's dialect. As a result, speech recognition systems constructed to process standard Chinese (Putonghua) perform poorly for the great majority of the population. Efforts were made at the JHU Summer Workshop 2004 to develop a general framework to model phonetic, lexical, and pronunciation variability in dialectal Chinese automatic speech recognition tasks. The goal was, and remains, to find suitable methods that employ dialect-related knowledge and training data (in relatively small quantities) to modify the baseline standard Chinese recognizer into a dialectal Chinese recognizer for the specific dialect of interest. In this talk, work done in the JHU Summer Workshop 2004 as well as in the following years will be introduced, covering several aspects such as dialectal Chinese database collection (for Wu, Min, Yue, Chuan, and so on), dialectal lexicon construction, and acoustic modeling.
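One common way to adapt a baseline recognizer with small amounts of dialect knowledge, in the spirit of the lexical modelling the abstract mentions, is to augment the standard pronunciation lexicon with dialect-specific variants. The sketch below assumes a simple phone-substitution rule table; all words, phone symbols, and rules are hypothetical and are not taken from the workshop's actual lexicons.

```python
# Minimal sketch of dialect-adapted lexicon construction: each word keeps
# its canonical Putonghua pronunciation and gains a variant generated by
# applying dialect phone-substitution rules. Entries are illustrative only.

STANDARD_LEXICON = {
    "shi4": ["sh", "ix4"],   # canonical Putonghua pronunciation
    "ren2": ["r", "en2"],
}

# Hypothetical dialect rules, e.g. retroflex "sh" realized as "s".
DIALECT_RULES = {"sh": "s", "r": "l"}

def add_dialect_variants(lexicon, rules):
    """Return a lexicon mapping each word to its canonical pronunciation
    plus any dialect variant produced by the substitution rules."""
    adapted = {}
    for word, phones in lexicon.items():
        variants = [phones]
        variant = [rules.get(p, p) for p in phones]
        if variant != phones:          # only add genuinely new variants
            variants.append(variant)
        adapted[word] = variants
    return adapted

adapted = add_dialect_variants(STANDARD_LEXICON, DIALECT_RULES)
print(adapted["shi4"])   # [['sh', 'ix4'], ['s', 'ix4']]
```

With such a multi-pronunciation lexicon, the decoder can match dialect-accented speech without retraining the acoustic models from scratch; in practice the variants would be learned from small dialectal corpora and weighted by probability rather than listed deterministically.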