Age-Agnostic Speaker Verification for Children and Adults
Work with the Speech Processing and Auditory Perception Lab at UCLA
Speaker verification (SV) systems trained on adult speech perform poorly on children’s speaker verification (C-SV) because of the acoustic mismatch between children’s and adults’ speech. Domain adaptation can recover some of this performance, but typically at the cost of a significant drop on adults’ speaker verification (A-SV), trading one population for the other. This work proposes an Age-Agnostic Speaker Verification (AASV) system that is robust across both C-SV and A-SV. The approach uses a domain classifier to disentangle age-related attributes from speech, then expands the embedding space with the extracted domain information to form a unified speaker representation that remains highly discriminative across age groups. Experiments on the OGI and VoxCeleb datasets show that AASV bridges the verification performance gap between children and adults, laying the foundation for inclusive, age-adaptive speaker verification systems.
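To make the idea of expanding the embedding space more concrete, here is a minimal sketch of one way it could look at scoring time. It is an illustration under assumptions, not the AASV implementation: the function names, the `domain_weight` parameter, the choice of concatenating softmax posteriors, and the toy vectors are all hypothetical, and the training step that disentangles age-related attributes is not shown. The sketch assumes a speaker embedding and a pair of child/adult logits from a domain classifier are already available for each utterance.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def softmax(logits):
    """Convert domain-classifier logits into posterior probabilities."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expand_embedding(speaker_emb, domain_logits, domain_weight=0.5):
    """Hypothetical expansion: append weighted domain posteriors to the
    length-normalized speaker embedding. `domain_weight` is an assumed knob
    controlling how much the domain information influences scoring."""
    posteriors = softmax(domain_logits)
    return l2_normalize(speaker_emb) + [domain_weight * p for p in posteriors]


def cosine_score(a, b):
    """Cosine similarity between two expanded embeddings."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


# Toy values only: 3-dimensional speaker embeddings and [child, adult] logits.
enroll = expand_embedding([0.9, 0.1, 0.3], domain_logits=[2.0, -1.0])
test_same = expand_embedding([0.8, 0.2, 0.3], domain_logits=[1.5, -0.5])
test_diff = expand_embedding([-0.2, 0.9, -0.4], domain_logits=[-1.0, 2.0])

same_score = cosine_score(enroll, test_same)
diff_score = cosine_score(enroll, test_diff)
print(same_score > diff_score)
```

In this toy setup the trial with a similar speaker embedding scores higher than the trial with a dissimilar one. How the domain information is actually extracted and combined in AASV is described in the paper itself.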
This work was presented at WOCCI 2025 and can be accessed here.