Computer-assisted Pronunciation Teaching
Summary of Research Project
This paper investigates a method of automatic pronunciation scoring
for use in computer-assisted language learning systems. The method
utilises a likelihood-based 'Goodness of Pronunciation' (GOP) measure
which is extended to include individual thresholds for each phone,
based both on averaged native confidence scores and on rejection
statistics provided by human judges. Further improvements are
obtained by incorporating models of the subject's native language and by
augmenting the recognition networks to include expected pronunciation
errors. The various GOP measures are assessed using a specially
recorded database of non-native speakers which has been annotated to mark
phone-level pronunciation errors. Since pronunciation assessment is
highly subjective, a set of four performance measures has been
designed, each measuring a different aspect of how well
computer-derived phone-level scores agree with human scores. These
performance measures are used to cross-validate the reference
annotations and to assess the basic GOP algorithm and its refinements.
The experimental results suggest that a likelihood-based pronunciation
scoring metric can achieve usable performance, especially after
applying the various enhancements.
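
To make the idea of a likelihood-based score with per-phone thresholds concrete, here is a minimal sketch. The summary above does not give the formula, so the sketch assumes one common formulation of GOP: the absolute difference between the log-likelihood of the expected phone (from forced alignment) and that of the best-scoring phone (from an unconstrained phone loop), normalised by the number of frames in the segment. The function names, the threshold values, and the `default_threshold` parameter are all hypothetical and chosen only for illustration.

```python
def gop_score(forced_loglik, best_loglik, num_frames):
    """Duration-normalised absolute log-likelihood ratio for one phone segment.

    forced_loglik: log-likelihood of the segment under the expected phone.
    best_loglik:   log-likelihood under the best competing phone.
    num_frames:    segment length in frames (assumed normalisation factor).
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    return abs(forced_loglik - best_loglik) / num_frames


def flag_mispronunciations(segments, thresholds, default_threshold=1.0):
    """Return (phone, score) pairs whose GOP score exceeds the phone's threshold.

    segments:   iterable of (phone, forced_loglik, best_loglik, num_frames).
    thresholds: dict mapping phone -> rejection threshold (illustrative values;
                in the summary these come from native scores and human judges).
    """
    flagged = []
    for phone, forced_ll, best_ll, n in segments:
        score = gop_score(forced_ll, best_ll, n)
        if score > thresholds.get(phone, default_threshold):
            flagged.append((phone, score))
    return flagged


# Illustrative, made-up numbers: two phone segments from one utterance.
segments = [("ae", -120.0, -100.0, 10), ("t", -50.0, -49.0, 5)]
thresholds = {"ae": 1.5, "t": 0.5}
flagged = flag_mispronunciations(segments, thresholds)
```

With a single global threshold, phones that are inherently harder to model would be rejected more often; keeping one threshold per phone, as in this sketch, is the kind of refinement the summary describes.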
Publications:
S.W. Witt and S.J. Young. Performance Measures for Phone-Level Pronunciation Teaching in CALL. In Proceedings STiLL 1998.
S.W. Witt and S.J. Young. Language Learning based on Non-native Speech Recognition. In Proceedings EUROSPEECH 1997.
S.W. Witt and S.J. Young. Computer-assisted Pronunciation Teaching based on Automatic Speech Recognition. In Proceedings Language Teaching and Technology 1997.