5.7 EXAMPLE CODEC PERCEPTUAL MODEL: ISO/IEC 11172-3 (MPEG-1) PSYCHOACOUSTIC MODEL 1

It is useful to consider an example of how the psychoacoustic principles described thus far are applied in actual coding algorithms. The ISO/IEC 11172-3 (MPEG-1, layer I) psychoacoustic model 1 [ISOI92] determines the maximum allowable quantization noise energy in each critical band such that the quantization noise remains inaudible. In one of its modes, the model uses a 512-point FFT for high-resolution spectral analysis (a bin spacing of 86.13 Hz at a 44.1 kHz sampling rate), then estimates, for each input frame, individual simultaneous masking thresholds due to the presence of tone-like and noise-like maskers in the signal spectrum. A global masking threshold is then estimated for a subset of the original 256 frequency bins by (power) additive combination of the tonal and nontonal individual masking thresholds. The remainder of this section describes the step-by-step model operations. Sample results are given for one frame of CD-quality pop music sampled at 44.1 kHz with 16 bits per sample. We note that although this model is suitable for any of the MPEG-1 coding layers I–III, the standard [ISOI92] recommends that model 1 be used with layers I and II, while model 2 is recommended for layer III (MP3). The five steps leading to computation of global masking thresholds are described in the following sections.
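Two of the quantities mentioned above can be checked with a short numerical sketch: the 86.13 Hz bin spacing of a 512-point FFT at 44.1 kHz, and the idea of combining individual thresholds by adding powers rather than decibel values. The function names and the threshold values below are hypothetical and chosen only for illustration; they are not taken from [ISOI92], and the sketch combines only the tonal and nontonal terms named in this overview.

```python
import math


def bin_spacing_hz(sample_rate_hz, fft_size):
    """Frequency spacing between adjacent FFT bins."""
    return sample_rate_hz / fft_size


def power_sum_db(levels_db):
    """Combine dB levels by summing their powers, then convert back to dB."""
    total_power = sum(10.0 ** (level / 10.0) for level in levels_db)
    return 10.0 * math.log10(total_power)


# 512-point FFT at the CD sampling rate quoted in the text.
spacing = bin_spacing_hz(44100.0, 512)

# Hypothetical individual masking thresholds (dB) at one frequency bin.
# These are illustrative numbers, not results from the text.
tonal_db = [20.0, 14.0]
nontonal_db = [17.0]

# Power-additive combination of all individual thresholds at this bin.
global_db = power_sum_db(tonal_db + nontonal_db)

print(f"bin spacing: {spacing:.2f} Hz")
print(f"combined threshold: {global_db:.2f} dB")
```

Because the combination is additive in power, the result always lies above the largest individual threshold, and two equal thresholds combine to a level about 3 dB higher than either one.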

5.7.1 Step 1: Spectral Analysis and SPL Normalization

Spectral analysis and normalization are performed first. The goal of this step is to obtain ...
