Speak & Improve Challenge 2025

1. Brief Introduction of the Challenge

In connection with the ISCA SLaTE 2025 Workshop and Cambridge University Press & Assessment, we are pleased to introduce the Speak & Improve Challenge 2025 to the speech and language learning community. Our goal is to advance spoken language assessment and feedback technology by releasing a rich new dataset and proposing a set of associated tasks.

The challenge offers a unique opportunity with the pre-release of the Speak & Improve Corpus 2025 from Cambridge University Press & Assessment. This dataset is derived from the Cambridge English Speak & Improve L2 (second language) English speaking practice tool and contains annotated recordings of a wide variety of L2 English learner speech on open (spontaneous) speaking tasks.

The challenge consists of four tasks designed to advance spoken language technology and improve automated language learning assessment and feedback: Automatic Speech Recognition (ASR); Spoken Language Assessment (SLA); Spoken Grammatical Error Correction (SGEC); and Spoken Grammatical Error Correction Feedback (SGECF). Each task has a closed track and an open track. Participants may enter as many tasks as they like.

2. Task Descriptions

Task 1: Automatic Speech Recognition (ASR)

This task aims to advance automatic speech recognition (ASR) in the context of L2 English learners’ speech, with a focus on pronunciation, fluency, and accents.

Baseline: OpenAI Whisper small model
Evaluation: Speech Word Error Rate (SpWER)
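As a rough illustration of the metric family used here, the sketch below computes a plain word error rate: the word-level edit distance between reference and hypothesis, divided by the number of reference words. It is not the official scoring tool. The function name, the whitespace tokenisation, and the absence of any text normalisation are assumptions made for this example, and the exact definition of SpWER used by the challenge is not specified in this section.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are split on whitespace and compared exactly;
    any normalisation the official scorer applies is not reproduced.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("went" -> "go") and one deletion ("the").
    print(word_error_rate("i went to the shop", "i go to shop"))
```

Because the denominator is the reference length, the value can exceed 1.0 when the hypothesis contains many insertions.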

Task 2: Spoken Language Assessment (SLA)

In this task, systems predict scores for learners' spoken responses that align as closely as possible with human assessments. Key language features such as pronunciation, fluency, intonation, and grammatical accuracy will be assessed.

Baseline: A cascaded ASR and text grader system
Evaluation: RMSE, Pearson correlation (PCC), Spearman’s rank (SRC), and percentage of predictions within 0.5 and 1.0 points
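The listed metrics can be sketched in a few lines of standard-library Python. This is an illustrative sketch only: the function names, the dictionary keys, the use of an inclusive threshold (absolute error <= 0.5 or <= 1.0), and the tie handling in the rank correlation (average ranks) are assumptions, not details taken from the official evaluation scripts.

```python
import math


def _pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def _ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def sla_metrics(predicted, reference):
    """RMSE, PCC, SRC and percentage of predictions within 0.5 / 1.0."""
    if len(predicted) != len(reference) or len(predicted) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    errors = [p - r for p, r in zip(predicted, reference)]
    n = len(errors)
    return {
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "pcc": _pearson(predicted, reference),
        # Spearman's rank correlation is Pearson applied to the ranks.
        "src": _pearson(_ranks(predicted), _ranks(reference)),
        "within_0.5": 100.0 * sum(abs(e) <= 0.5 for e in errors) / n,
        "within_1.0": 100.0 * sum(abs(e) <= 1.0 for e in errors) / n,
    }


if __name__ == "__main__":
    # Hypothetical scores, invented for this example.
    print(sla_metrics([3.0, 4.5, 2.5, 5.0], [3.5, 4.0, 2.0, 5.5]))
```

Both correlations are undefined when either list is constant; in this sketch that case raises ZeroDivisionError rather than returning a value.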

Task 3: Spoken Grammatical Error Correction (SGEC)

Participants will focus on identifying and correcting grammatical errors in spoken language, including tense usage, subject-verb agreement, and sentence structure.

Baseline: ASR, disfluency remover, and text-based GEC system
Evaluation: Word Error Rate (WER) and Translation Edit Rate (TER)

Task 4: Spoken Grammatical Error Correction Feedback (SGECF)

This task focuses on providing clear, actionable feedback on grammatical errors or spoken disfluencies, enhancing the usability of language-learning tools.

Baseline: Same as Task 3
Evaluation: ERRANT F0.5 based on MaxMatch (M2) edits
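For orientation, F0.5 is the F-beta score with beta = 0.5, which weights precision more heavily than recall. The sketch below computes it from counts of matched, spurious, and missed edits. How those counts are obtained from the edit annotations is left to the official tooling; the function name and the handling of zero counts (returning 0.0) are assumptions made for this example.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 0.5) -> float:
    """F-beta from true-positive, false-positive and false-negative counts.

    Assumption: an undefined precision or recall is treated as 0.0;
    the official scorer may handle empty edit sets differently.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    beta_sq = beta * beta
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


if __name__ == "__main__":
    # Hypothetical counts: 6 correct edits, 2 spurious, 6 missed.
    print(round(f_beta(tp=6, fp=2, fn=6), 4))
```

With these hypothetical counts, precision is 0.75 and recall is 0.5. F0.5 comes out closer to precision than the balanced F1 does, which is why the metric rewards systems that propose fewer but more reliable corrections.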

Closed Track and Open Track

To encourage broad participation and innovation, each challenge task has two tracks:

Closed Track: Participants may use only the provided dataset, the pre-trained models from the baseline systems, and the named datasets used to train those baselines.
Open Track: Participants can use any publicly available data and pre-trained models.

Participants can choose to participate in one or more tasks in either track. Baseline systems will be provided for each task to help participants in their development efforts.

3. Data Sets to Be Used

3.1 Challenge Dataset

Participants will be provided with a dataset from the Speak & Improve L2 English speaking practice tool, which includes annotated responses to a range of speaking tasks across proficiency levels from CEFR A2 to C1.

Data Breakdown:

Part 1: Interview with 8 short responses.
Part 2: Read Aloud with 8 sentences.
Part 3: Long Turn 1 - Express an opinion on a specific topic.
Part 4: Long Turn 2 - Present a graphic.
Part 5: Communication Activity with 5 questions on an overall topic.

To keep the focus on open speaking tasks, the Part 2 read-aloud data will not be released as part of this challenge.

3.2 External Data

The rules for using external data sources differ between the closed and open tracks. Participants in the closed track are limited to the released challenge data plus the pre-trained models and datasets used in the baseline systems, while those in the open track may use any publicly available external data and pre-trained models.

4. Rules for Participation

Eligibility: The challenge is open to academic, industry, and independent researchers working on spoken language processing, subject to the data license agreement.
Data Usage: In the closed track, participants can only use the provided dataset, pre-trained models, and datasets used in the baseline systems. In the open track, any publicly available data and pre-trained models can be used.
Evaluation Metrics: Each task has specific evaluation metrics, detailed in the task descriptions.
Submissions: Each team may make at most one submission per day, up to a maximum of 7 submissions during the competition period.

5. Important Dates

Release of training data, development data, and baseline systems: 17th December 2024
Evaluation data (audio and ASR transcriptions) release and opening of submission site: 26th March 2025 (midnight anywhere in the world, i.e., 12:00 noon UTC on 27th March)
Closing of submission site: 2nd April 2025 (midnight anywhere in the world, i.e., 12:00 noon UTC on 3rd April)
Submission of system description: 4th April 2025 (AoE)
Announcement of results: 9th April 2025
SLaTE paper submission deadline: 22nd May 2025
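
The "midnight anywhere in the world" deadlines above use the Anywhere on Earth (AoE) convention, which corresponds to UTC-12. The sketch below shows how the end of a calendar day in AoE maps to UTC; the constant and function names are hypothetical and exist only for this example.

```python
from datetime import datetime, timedelta, timezone

# Anywhere on Earth is a fixed offset of UTC-12 with no daylight saving.
AOE = timezone(timedelta(hours=-12), name="AoE")


def aoe_end_of_day_in_utc(year: int, month: int, day: int) -> datetime:
    """Return the UTC instant at which the given calendar day ends in AoE."""
    start_of_day = datetime(year, month, day, tzinfo=AOE)
    return (start_of_day + timedelta(days=1)).astimezone(timezone.utc)


if __name__ == "__main__":
    # End of 26th March 2025 AoE and end of 2nd April 2025 AoE.
    print(aoe_end_of_day_in_utc(2025, 3, 26).isoformat())
    print(aoe_end_of_day_in_utc(2025, 4, 2).isoformat())
```

The two calls return 12:00 UTC on 27th March and 12:00 UTC on 3rd April respectively, which agrees with the UTC times given in the list.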

7. Organisers

Mengjie Qian, mq227@cam.ac.uk, University of Cambridge, UK
Kate Knill, kmk1001@cam.ac.uk, University of Cambridge, UK
Stefano Bannò, sb2549@cam.ac.uk, University of Cambridge, UK
Mark Gales, mjfg@eng.cam.ac.uk, University of Cambridge, UK
Penny Karanasou, pk407@cam.ac.uk, University of Cambridge, UK
Diane Nicholls, diane.nicholls@cambridge.org, Cambridge University Press & Assessment, UK
Siyuan Tang, st941@cam.ac.uk, University of Cambridge, UK
Jing Xu, jing.xu@cambridge.org, Cambridge University Press & Assessment, UK
