A dataset and model for recognition of audiologically relevant environments for hearing aids: AHEAD-DS and YAMNet+
Paper: [arXiv:2508.10360](https://arxiv.org/abs/2508.10360)
This repository contains the models for OpenYAMNet/YAMNet+, introduced in the paper A dataset and model for auditory scene recognition for hearing devices: AHEAD-DS and OpenYAMNet.
OpenYAMNet is a sound recognition model designed for deployment on edge devices like smartphones connected to hearing devices (e.g., hearing aids and wireless earphones). It serves as a baseline model for sound-based scene recognition.
Evaluation results for OpenYAMNet on the testing set of the AHEAD-DS dataset are reported in the paper.
The model is optimized for real-time use: on a Google Pixel 3 (2018), loading takes approximately 50 ms and processing takes roughly 30 ms per 1 s of audio.
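To make the "per 1 s of audio" figure concrete, here is a minimal sketch of how a clip is split into spectrogram patches, assuming the standard YAMNet frontend parameters (16 kHz mono input, 25 ms STFT windows with a 10 ms hop, and 0.96 s patches advanced every 0.48 s); verify these against the released model before relying on them:

```python
import numpy as np

SAMPLE_RATE = 16000                   # YAMNet-style models expect 16 kHz mono
WIN = int(0.025 * SAMPLE_RATE)        # 25 ms STFT window (400 samples)
HOP = int(0.010 * SAMPLE_RATE)        # 10 ms frame hop (160 samples)
PATCH_FRAMES = 96                     # 96 frames = 0.96 s per model input patch
PATCH_HOP = 48                        # patches advance by 0.48 s

def num_patches(num_samples: int) -> int:
    """Number of 0.96 s patches a clip yields under YAMNet-style framing."""
    frames = 1 + (num_samples - WIN) // HOP
    if frames < PATCH_FRAMES:
        return 0
    return 1 + (frames - PATCH_FRAMES) // PATCH_HOP

print(num_patches(SAMPLE_RATE))       # 1 second of audio -> 1 patch
print(num_patches(10 * SAMPLE_RATE))  # 10 seconds -> 19 overlapping patches
```

So a 1 s chunk corresponds to a single model invocation, which is the unit the ~30 ms processing-time figure refers to.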
The model predicts the following 14 audiological scene classes:

| Classes |
|---|
| cocktail_party |
| interfering_speakers |
| in_traffic |
| in_vehicle |
| music |
| quiet_indoors |
| reverberant_environment |
| wind_turbulence |
| speech_in_traffic |
| speech_in_vehicle |
| speech_in_music |
| speech_in_quiet_indoors |
| speech_in_reverberant_environment |
| speech_in_wind_turbulence |
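Downstream code typically maps the model's 14-way output scores back to these labels. A minimal sketch follows; the class ordering and the `scores` vector here are illustrative assumptions, not the released model's actual output, so check the ordering against the released label file:

```python
import numpy as np

# Assumed to follow the table above; verify against the released label file.
CLASSES = [
    "cocktail_party", "interfering_speakers", "in_traffic", "in_vehicle",
    "music", "quiet_indoors", "reverberant_environment", "wind_turbulence",
    "speech_in_traffic", "speech_in_vehicle", "speech_in_music",
    "speech_in_quiet_indoors", "speech_in_reverberant_environment",
    "speech_in_wind_turbulence",
]

def top_class(scores: np.ndarray) -> str:
    """Return the highest-scoring scene label for one 14-dim score vector."""
    assert scores.shape == (len(CLASSES),)
    return CLASSES[int(np.argmax(scores))]

# Illustrative scores only (not real model output):
fake_scores = np.zeros(len(CLASSES))
fake_scores[4] = 0.9
print(top_class(fake_scores))  # music
```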
Licensed under CC BY-SA 4.0; see LICENCE.txt. The original YAMNet weights are distributed under their own licence.
If you use AHEAD-DS or OpenYAMNet, please cite:

```bibtex
@misc{zhong2026datasetmodelauditoryscene,
  title={A dataset and model for auditory scene recognition for hearing devices: AHEAD-DS and OpenYAMNet},
  author={Henry Zhong and Jörg M. Buchholz and Julian Maclaren and Simon Carlile and Richard Lyon},
  year={2026},
  eprint={2508.10360},
  archivePrefix={arXiv},
  primaryClass={cs.SD},
  url={https://arxiv.org/abs/2508.10360},
}
```