Chapter 7

Audio Coding

7.1. Principles of “perceptual coders”

Figure 7.1 shows the amplitude of a violin signal as a function of time (Figure 7.1(a)) and the distribution of the signal power as a function of frequency (Figure 7.1(b)).

As with speech coders, an audio coding algorithm is essentially a loop that first fills a buffer with N samples as shown in Figure 7.2, processes these N samples, and then moves on to the next analysis frame. The analysis frames always overlap; the shift between two analysis frames is characterized by the parameter M < N. The vector $\mathbf{x}(m)$ is, therefore, of the form:

$$\mathbf{x}(m) = [x(mM), \ldots, x(mM+N-1)]^t \otimes \mathbf{v}$$

where the operator ⊗ represents a multiplication of two vectors component by component and where $\mathbf{v} = [v(0), \ldots, v(N-1)]^t$ is a weighting window. The analysis frames are generally around 20 ms in length. At a sampling frequency of 44.1 kHz this would correspond to N = 44.1 × 20 = 882 samples; in practice, N is equal to 512 (MPEG-1 and AC3) or 2048 (MPEG-2 AAC). The value of the parameter M depends on the coder: M = N/16 = 32 for the MPEG-1 coder, M = N/2 for the AC3 and MPEG-2 AAC coders.
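The framing loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `hann_window` and `analysis_frames` are hypothetical, the Hann window is an arbitrary stand-in for the weighting window (each coder specifies its own), and the test signal is a synthetic 440 Hz sine rather than the violin signal of Figure 7.1.

```python
import math


def hann_window(n_len):
    """Periodic Hann window of length n_len.

    Assumption: used here only as an example of a weighting window;
    the text does not say which window each coder uses.
    """
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / n_len) for n in range(n_len)]


def analysis_frames(x, n_len, hop, window):
    """Yield successive windowed frames of x.

    Frame m holds the samples x[m*hop] ... x[m*hop + n_len - 1],
    multiplied component by component with the window.
    Only complete frames are produced.
    """
    m = 0
    while m * hop + n_len <= len(x):
        start = m * hop
        yield [x[start + n] * window[n] for n in range(n_len)]
        m += 1


FS_KHZ = 44.1          # sampling frequency in kHz, as in the text
N, M = 2048, 1024      # MPEG-2 AAC values from the text: M = N/2

# Synthetic test signal (assumption): 8192 samples of a 440 Hz sine.
signal = [math.sin(2.0 * math.pi * 440.0 * k / (FS_KHZ * 1000.0)) for k in range(8192)]

frames = list(analysis_frames(signal, N, M, hann_window(N)))
frame_ms = N / FS_KHZ  # duration of one analysis frame in milliseconds

print(len(frames), "frames of", len(frames[0]), "samples,", round(frame_ms, 1), "ms each")
```

With these values a frame of N = 2048 samples lasts about 46.4 ms, and a frame of N = 512 samples about 11.6 ms, so the 20 ms figure is an order of magnitude rather than an exact frame length.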

The diagram of a perceptual coder shows that it is composed of three ...
