Q2.3: Finding start and end points of a speech signal

End-point detection algorithms identify the sections of an incoming audio signal that contain speech. Accurate end-pointing is a non-trivial task; however, reasonable behaviour can be obtained for inputs that contain only speech surrounded by silence (no other noises). Typical algorithms look at the energy or amplitude of the incoming signal and at the rate of "zero-crossings". A zero-crossing is a point where the audio signal changes from positive to negative or vice versa. When the energy and the zero-crossing rate are at certain levels, it is reasonable to guess that speech is present. More detailed descriptions are provided in the papers cited below and in the documentation for the following software.
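As a rough illustration of the energy and zero-crossing idea described above, here is a minimal Python sketch. It is not taken from any of the software or papers this FAQ refers to. The function names, the frame length, and all threshold values are assumptions chosen for the example; a real end-pointer would normally estimate its thresholds from the background noise in the recording rather than hard-code them.

```python
import math

def frame_features(samples, frame_len):
    """Split samples into non-overlapping frames and return a list of
    (energy, zero_crossings) pairs, one per complete frame."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Energy here is the mean squared amplitude of the frame.
        energy = sum(x * x for x in frame) / frame_len
        # A zero-crossing is a sign change between consecutive samples.
        crossings = sum(1 for a, b in zip(frame, frame[1:])
                        if (a >= 0) != (b >= 0))
        feats.append((energy, crossings))
    return feats

def find_endpoints(samples, frame_len=80, high_energy=0.01,
                   low_energy=0.001, min_crossings=20):
    """Return (start_sample, end_sample) of the detected speech region,
    or None if no frame looks like speech.

    Illustrative thresholds only (assumed, not from the FAQ):
    a frame counts as speech if its energy is clearly high, or if it
    has moderate energy together with many zero-crossings."""
    feats = frame_features(samples, frame_len)
    active = [i for i, (energy, crossings) in enumerate(feats)
              if energy > high_energy
              or (energy > low_energy and crossings > min_crossings)]
    if not active:
        return None
    # First active frame starts the region; last active frame ends it.
    return active[0] * frame_len, (active[-1] + 1) * frame_len

# Usage: a 200 Hz tone at 8 kHz sampling, surrounded by digital silence.
rate = 8000
silence = [0.0] * 800
tone = [0.5 * math.sin(2 * math.pi * 200 * n / rate) for n in range(1600)]
print(find_endpoints(silence + tone + silence))  # (800, 2400)
```

The sketch assumes samples are floats scaled to roughly -1.0 to 1.0 and that the silence is close to digital zero. With real recordings, low-level background noise can produce many zero-crossings on its own, which is one reason fixed thresholds like these are only a starting point.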

End-point detection software is available from:

Plenty of research papers have been presented on end-pointing. Try the following:



Last Revision: 14:11 13-May-1997