IEE Artificial Neural Networks (ANN 97)

NON-LINEAR SPEECH TRANSITION VISUALIZATION

K. Reinhard and M. Niranjan

July 1997

Modelling context effects and segmental transitions is an important problem in speech recognition systems. Explicitly modelling segmental transitions in an RNN framework would address this problem. We present an application of {\em Principal Curves}, an algorithm for extracting a non-linear summary of p-dimensional data, first published by Hastie and Stuetzle in 1989. The algorithm can be used to visualize non-linear transient characteristics in speech. We show that between-phone characteristics found within diphones can serve as discriminant information for distinguishing ambiguous phones. The technique is explained and illustrated on the examples /{\it bah}/, /{\it dah}/ and /{\it gah}/.
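
To make the idea of a non-linear summary of p-dimensional data concrete, the following is a minimal sketch of a principal-curve style iteration. It is not the authors' implementation, and it simplifies the Hastie and Stuetzle procedure in two ways that are assumptions of this sketch: the smoothing step uses a Gaussian kernel average, and the projection step assigns each sample to its nearest curve vertex instead of projecting onto line segments. The names `principal_curve`, `mean_distance_to_curve`, `n_vertices`, `n_iter` and `bandwidth` are hypothetical.

```python
import numpy as np


def principal_curve(X, n_vertices=50, n_iter=10, bandwidth=0.1):
    """Simplified principal-curve iteration (illustrative sketch only).

    Returns the curve as an ordered array of vertices and, for each
    sample, the arc-length position of its nearest vertex.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    # Initialise the curve parameter with the first principal component score.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    lam = Xc @ vt[0]
    grid = np.linspace(0.0, 1.0, n_vertices)
    for _ in range(n_iter):
        # Rescale the parameter to [0, 1] so the bandwidth keeps its meaning.
        lam = (lam - lam.min()) / (lam.max() - lam.min())
        # Smoothing step (assumed Gaussian kernel): each vertex is a weighted
        # mean of the samples whose parameter lies close to its grid value.
        w = np.exp(-0.5 * ((grid[:, None] - lam[None, :]) / bandwidth) ** 2)
        w /= w.sum(axis=1, keepdims=True)
        vertices = w @ X
        # Projection step (simplified): arc length along the polyline, then
        # each sample takes the arc length of its nearest vertex.
        seg = np.linalg.norm(np.diff(vertices, axis=0), axis=1)
        arc = np.concatenate(([0.0], np.cumsum(seg)))
        dist = np.linalg.norm(X[:, None, :] - vertices[None, :, :], axis=2)
        lam = arc[dist.argmin(axis=1)]
    return vertices, lam


def mean_distance_to_curve(X, vertices):
    """Mean distance from each sample to its nearest curve vertex."""
    X = np.asarray(X, dtype=float)
    dist = np.linalg.norm(X[:, None, :] - vertices[None, :, :], axis=2)
    return dist.min(axis=1).mean()
```

For speech data, `X` would hold one feature vector per frame of a diphone, and the returned vertices could be plotted as the trajectory summary. The feature representation is not specified here, so any choice of features is left to the reader.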

