Towards automatic transcription of spontaneous Czech speech in the MALACH project

“Towards automatic transcription of spontaneous Czech speech in the MALACH project” by J. Psutka, P. Ircing, J. V. Psutka, V. Radova, W. Byrne, J. Hajic, and S. Gustman. In Proceedings of the Text, Speech, and Dialog Workshop, 2003, pp. 327-332 (6 pages).

Abstract

Our paper discusses the progress achieved during a one-year effort in building the Czech LVCSR system for the automatic transcription of spontaneously produced testimonies in the MALACH project. The difficulty of this task stems from the highly inflectional nature of the Czech language and is further multiplied by the presence of many colloquial words in spontaneous Czech speech as well as by the need to handle emotional speech filled with disfluencies, heavy accents, age-related coarticulation and language switching. In this paper we concentrate mainly on the acoustic modeling issues: the proper choice of front-end parameterization, the handling of non-speech events in acoustic modeling, and unsupervised acoustic adaptation via MLLR. A method for selecting suitable language modeling data is also briefly discussed.

BibTeX entry:

@inproceedings{tsd03_czasr,
   author = {J. Psutka and P. Ircing and J. V. Psutka and V. Radova and W.
	Byrne and J. Hajic and S. Gustman},
   title = {Towards automatic transcription of spontaneous {C}zech speech
	in the {MALACH} project},
   booktitle = {Proceedings of the Text, Speech, and Dialog Workshop},
   pages = {327--332},
   year = {2003}
}
