Q6.3: How can I build a simple speech recogniser?

QUICKY RECOGNIZER sketch:

Doug Danforth provides a detailed account in article 253 of the comp.speech archives; a summary is given below. The full article is also available by anonymous ftp:

ftp://svr-ftp.eng.cam.ac.uk/pub/comp.speech/info/DIY_SpeechRecognition

This simple recognizer should give you 85%+ recognition accuracy. Accuracy depends on the words in your vocabulary: long, distinct words are easy; short, similar words are hard. You can get 98%+ on the digits with this recognizer.

Overview:

Many variations on this theme can improve performance. Try different filtering of the raw signal and different processing methods.
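
As one illustration of "different filtering of the raw signal", the sketch below applies a first-order pre-emphasis filter and then summarises the utterance as log energies over equal time segments, so the features can be compared with and without the filter. This is not taken from Danforth's article: the function names, the 0.95 coefficient, and the choice of 8 segments are assumptions made for the example only.

```python
import math


def pre_emphasis(samples, coeff=0.95):
    """First-order high-pass filter: y[n] = x[n] - coeff * x[n-1], y[0] = x[0].

    coeff=0.95 is an assumed, commonly used value, not one from the source.
    """
    if not samples:
        return []
    out = [samples[0]]
    for prev, cur in zip(samples, samples[1:]):
        out.append(cur - coeff * prev)
    return out


def segment_log_energies(samples, n_segments=8):
    """Split the signal into n_segments equal time slices and return the
    log energy of each slice (trailing samples that do not fill a slice
    are ignored). n_segments=8 is an arbitrary illustrative choice."""
    seg_len = len(samples) // n_segments
    if seg_len == 0:
        raise ValueError("signal is shorter than the number of segments")
    feats = []
    for i in range(n_segments):
        seg = samples[i * seg_len:(i + 1) * seg_len]
        energy = sum(x * x for x in seg)
        feats.append(math.log(energy + 1e-10))  # small floor avoids log(0)
    return feats


if __name__ == "__main__":
    # Synthetic stand-in for a digitised utterance: a slow sine plus a fast one.
    signal = [math.sin(0.05 * n) + 0.2 * math.sin(2.5 * n) for n in range(800)]
    print(segment_log_energies(signal))
    print(segment_log_energies(pre_emphasis(signal)))
```

Pre-emphasis attenuates slowly varying components and boosts rapidly varying ones, so the two feature vectors differ; whether that helps recognition on a given vocabulary is something to measure rather than assume.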

Public Domain Recognition Software

Q6.5 contains information on public domain speech recognition software, including Lotec and Myers' Hidden Markov Model software.

Discrete Hidden Markov Model Demonstration Software

Hidden Markov Models (HMMs) are widely used in speech recognition systems. Joe Picone has put together some demonstration software for basic discrete HMMs including Viterbi and Baum-Welch training and evaluation, random sequence generation (generating data from a model), and model updating (useful for incremental training). There is a simple demo program that supports all of these modes from command line arguments. This allows experiments to test the classic coin-toss examples commonly described in textbooks. The code closely parallels the following textbook:

The code is written in C++ and is intended to facilitate learning and understanding of the algorithms. The code is available on the ISIP web site:
http://www.isip.msstate.edu/software/

Lecture notes corresponding to the examples are also available:
http://www.isip.msstate.edu/publications/1996/speech_recognition_short_course
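
For readers who want to see the coin-toss idea before downloading anything, here is a minimal Python sketch of a discrete two-state HMM with sequence generation, evaluation (forward algorithm), and Viterbi decoding. It is not the ISIP C++ code and does not reproduce its interface; the state names, the probabilities, and the function names are all assumptions chosen for illustration. Baum-Welch training is left out to keep the example short.

```python
import random

# Hypothetical two-coin model; every number here is an illustrative assumption.
STATES = ("fair", "biased")
SYMBOLS = ("H", "T")
START = {"fair": 0.5, "biased": 0.5}
TRANS = {
    "fair": {"fair": 0.9, "biased": 0.1},
    "biased": {"fair": 0.1, "biased": 0.9},
}
EMIT = {
    "fair": {"H": 0.5, "T": 0.5},
    "biased": {"H": 0.9, "T": 0.1},
}


def generate(length, rng):
    """Random sequence generation: draw (hidden states, observations)."""
    states, obs = [], []
    state = rng.choices(STATES, weights=[START[s] for s in STATES])[0]
    for _ in range(length):
        states.append(state)
        obs.append(rng.choices(SYMBOLS, weights=[EMIT[state][o] for o in SYMBOLS])[0])
        state = rng.choices(STATES, weights=[TRANS[state][s] for s in STATES])[0]
    return states, obs


def forward(obs):
    """Evaluation: probability of obs summed over all state paths.

    Uses raw probabilities, so it underflows on long sequences; a real
    implementation would scale or work in the log domain.
    """
    alpha = {s: START[s] * EMIT[s][obs[0]] for s in STATES}
    for o in obs[1:]:
        alpha = {
            s: sum(alpha[p] * TRANS[p][s] for p in STATES) * EMIT[s][o]
            for s in STATES
        }
    return sum(alpha.values())


def viterbi(obs):
    """Decoding: most likely state path and its joint probability with obs."""
    delta = {s: START[s] * EMIT[s][obs[0]] for s in STATES}
    backptrs = []
    for o in obs[1:]:
        prev, delta, ptr = delta, {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] * TRANS[p][s])
            ptr[s] = best
            delta[s] = prev[best] * TRANS[best][s] * EMIT[s][o]
        backptrs.append(ptr)
    last = max(STATES, key=lambda s: delta[s])
    path = [last]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, delta[last]


if __name__ == "__main__":
    hidden, observed = generate(20, random.Random(0))
    decoded, p_best = viterbi(observed)
    print("observed:", "".join(observed))
    print("hidden:  ", hidden)
    print("decoded: ", decoded)
    print("P(obs) =", forward(observed), " best path prob =", p_best)
```

The forward probability is always at least as large as the Viterbi probability, since the former sums over every state path and the latter keeps only the best one.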



Last Revision: 13:13 07-Aug-1996