Speak & Improve Challenge 2025: Spoken Language Assessment and Feedback

1. Brief Introduction of the Challenge

In connection with the ISCA SLaTE 2025 Workshop and Cambridge University Press & Assessment, we are pleased to introduce the Speak & Improve Challenge 2025 to the speech and language learning community. Our goal is to advance spoken language assessment and feedback technology by releasing a rich new dataset and proposing a set of associated tasks.

The challenge offers a unique opportunity with the pre-release of the Speak & Improve Corpus 2025 from Cambridge University Press & Assessment. This dataset is derived from the Cambridge English Speak & Improve L2 (second language) English speaking practice tool and contains annotated recordings of a wide variety of L2 English learner speech on open (spontaneous) speaking tasks.

The challenge consists of four tasks designed to advance spoken language technology and improve automated language learning assessment and feedback: Spoken Language Assessment (SLA); Spoken Grammatical Error Correction Feedback (SGECF); Automatic Speech Recognition (ASR); and Spoken Grammatical Error Correction (SGEC). Each task has a closed track and an open track. Participants may enter as many tasks as they wish.

2. Task Descriptions

Task 1: Automatic Speech Recognition (ASR)

This task aims to advance automatic speech recognition (ASR) in the context of L2 English learners’ speech, with a focus on pronunciation, fluency, and accents.

Task 2: Spoken Language Assessment (SLA)

In this task, systems assess learners’ spoken responses and predict scores that align closely with human assessments. Key language features such as pronunciation, fluency, intonation, and grammatical accuracy will be assessed.

Task 3: Spoken Grammatical Error Correction (SGEC)

Participants will focus on identifying and correcting grammatical errors in spoken language, including tense usage, subject-verb agreement, and sentence structure.

Task 4: Spoken Grammatical Error Correction Feedback (SGECF)

This task focuses on providing clear, actionable feedback on grammatical errors or spoken disfluencies, enhancing the usability of language-learning tools.

Closed Track and Open Track

To encourage broad participation and innovation, each challenge task has two tracks: a closed track and an open track.

Participants may enter one or more tasks in either track. Baseline systems will be provided for each task to support development.

3. Data Sets to Be Used

3.1 Challenge Dataset

Participants will be provided with a dataset from the Speak & Improve L2 English speaking practice tool, which includes annotated responses to a range of speaking tasks across proficiency levels from CEFR A2 to C1.

Data Breakdown:

To keep the focus on open speaking tasks, the read-aloud Part 2 data will not be released as part of this Challenge.

3.2 External Data

The rules for using external data sources differ between the closed and open tracks. Participants in the closed track are limited to the released data, while those in the open track may use publicly available external data sources.

4. Rules for Participation

5. Important Dates

6. Resource Links

7. Organisers