Results of Speech Enhancement
While developing speech enhancement for embedded devices targeting the STM32F746, I had the chance to join the 2023 ICASSP Clarity Challenge for speech enhancement in hearing aids.
The challenge focuses on amplified speech quality, measured with the HASPI [1] and HASQI [2] metrics, which assess speech after the amplification and compression stages of a hearing-aid pipeline. I framed the task as speech enhancement, noise reduction, and source separation, since each scene in the challenge dataset contains multiple sources (two speakers and two noise sources). My system uses Conv-TasNet [3] trained with Permutation Invariant Training (PIT) [4]. The repository has two parts: the deep learning model, and the hearing-aid processing/evaluation code provided by the Clarity Challenge. These cover the applications listed below. The system placed in the top 5, and for the next project I will continue improving the model and write a guide for deploying it on a tiny device.
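As a rough illustration of the training criterion, here is a minimal sketch of utterance-level PIT with an SI-SNR objective, the loss commonly paired with Conv-TasNet [3][4]. The names `si_snr` and `pit_loss` are my own for this sketch, not identifiers from the repository.

```python
import itertools

import torch

def si_snr(est: torch.Tensor, ref: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Scale-invariant SNR in dB; est and ref are (..., time)."""
    est = est - est.mean(dim=-1, keepdim=True)
    ref = ref - ref.mean(dim=-1, keepdim=True)
    # Project the estimate onto the reference to isolate the target component.
    target = (est * ref).sum(-1, keepdim=True) * ref / (ref.pow(2).sum(-1, keepdim=True) + eps)
    noise = est - target
    return 10 * torch.log10(target.pow(2).sum(-1) / (noise.pow(2).sum(-1) + eps) + eps)

def pit_loss(est: torch.Tensor, ref: torch.Tensor) -> torch.Tensor:
    """Utterance-level PIT loss; est and ref are (batch, n_src, time).

    Evaluates every permutation of the estimated sources against the
    references and keeps the best one per utterance.
    """
    n_src = est.shape[1]
    per_perm = []
    for perm in itertools.permutations(range(n_src)):
        # Negative mean SI-SNR over sources for this source ordering.
        per_perm.append(-si_snr(est[:, list(perm), :], ref).mean(dim=1))
    # (n_perm, batch) -> lowest loss per utterance, averaged over the batch.
    return torch.stack(per_perm).min(dim=0).values.mean()
```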
Applications
- Dataset/Dataloader for clean and noisy sound
- Training/Evaluating the Model
- Amplification/Compression for Evaluation
- Amplification/Compression using PyTorch (see the sketch after this list)
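To make the last item concrete, here is a minimal sketch of a static amplification/compression curve built from differentiable PyTorch ops. It is a toy single-band compressor with assumed parameter names (`gain_db`, `threshold_db`, `ratio`); the Clarity Challenge's actual hearing-aid processing is multi-band with attack/release dynamics, so this only illustrates the idea.

```python
import torch

def amplify_and_compress(x: torch.Tensor, gain_db: float = 20.0,
                         threshold_db: float = -20.0, ratio: float = 3.0,
                         eps: float = 1e-8) -> torch.Tensor:
    """Toy instantaneous amplifier + compressor, fully differentiable.

    Applies a fixed linear gain, then reduces gain above the threshold so
    the output level rises by only 1/ratio dB per dB of input level.
    """
    x = x * 10 ** (gain_db / 20)                 # linear amplification
    level_db = 20 * torch.log10(x.abs() + eps)   # instantaneous level in dB
    over = torch.clamp(level_db - threshold_db, min=0.0)
    gain_red_db = over * (1.0 / ratio - 1.0)     # gain reduction above threshold
    return x * 10 ** (gain_red_db / 20)
```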
Code for Model
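Below is a minimal sketch of how the separator can be instantiated and trained. It uses the off-the-shelf `torchaudio.models.ConvTasNet` rather than the repository's own model code, the hyperparameters are illustrative defaults rather than the challenge submission's settings, and `pit_loss` is the sketch from above.

```python
import torch
import torchaudio

# Off-the-shelf Conv-TasNet [3]; num_sources=2 separates the two speakers.
model = torchaudio.models.ConvTasNet(num_sources=2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def train_step(noisy: torch.Tensor, clean: torch.Tensor) -> float:
    """One optimization step; noisy is (batch, 1, time), clean is (batch, 2, time)."""
    optimizer.zero_grad()
    est = model(noisy)           # (batch, 2, time) estimated sources
    loss = pit_loss(est, clean)  # PIT objective from the sketch above
    loss.backward()
    optimizer.step()
    return loss.item()
```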
References
[1] J. M. Kates and K. H. Arehart, "The hearing-aid speech perception index (HASPI) version 2," Speech Communication, vol. 131, pp. 35-46, 2021.
[2] J. M. Kates and K. H. Arehart, "The hearing-aid speech quality index (HASQI) version 2," Journal of the Audio Engineering Society, vol. 62, no. 3, pp. 99-117, 2014.
[3] Y. Luo and N. Mesgarani, "Conv-TasNet: Surpassing ideal time-frequency magnitude masking for speech separation," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 8, pp. 1256-1266, 2019.
[4] M. Kolbæk et al., "Multitalker speech separation with utterance-level permutation invariant training of deep recurrent neural networks," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 25, no. 10, pp. 1901-1913, 2017.