This is the companion webpage for the article published at DAFx 2015

Sparse Decomposition of Audio Signals Using a Perceptual Measure of Distortion. Application to Lossy Audio Coding.
Ichrak Toumi and Olivier Derrien.

Abstract:
State-of the art audio codecs use time-frequency transforms derived from cosine bases, followed by a quantification stage. The quantization steps are set according to perceptual considerations. In the last decade, several studies applied adaptive sparse time-frequency transforms to audio coding, e.g. on unions of cosine bases using a Matching-Pursuit-derived algorithm (Ravelli et al. IEEE ASLP Nov 2008). This was shown to significantly improve the coding efficiency. We propose another approach based on a variational algorithm, i.e. the optimization of a cost function taking into account both a perceptual distortion measure derived form a hearing model and a sparsity constraint, which favors the coding efficiency. In this early version, we show that, using a coding scheme without perceptual control of quantization, our method outperforms a codec from the literature with the same quantization scheme. In future work, a more sophisticated quantization scheme would probably allow our method to challenge standard codecs e.g. AAC.

A package containing all the audio files used for listening tests can be downloaded here.