The JIBO Kids Corpus

Work with the Speech Processing and Auditory Perception Lab at UCLA

The JIBO Kids Corpus is a speech dataset of child-robot interactions recorded in elementary school classrooms, designed to support research on children’s conversational speech, automatic speech recognition (ASR) for children, and human-robot interaction in education. The corpus captures naturalistic dialogue between students and the JIBO social robot during structured classroom learning activities, providing material for studying spontaneous child speech under realistic acoustic conditions (classroom noise, multi-talker babble, far-field microphones). The accompanying paper describes the data collection protocol, recording setup, speaker demographics, and annotation schema, and reports baseline ASR performance across several modern speech foundation models, illustrating the challenges that classroom-recorded child-robot interaction poses to current systems.

This work was published in JASA Express Letters and can be accessed here

The associated repository can be accessed here