Communications Research Centre Canada

PERCEVAL - A Computational Model of the Ear

Listeners in audio experiments are often asked to compare two or more signals and to respond according to the experimenter's instructions. Some experiments require detection of a difference between two signals, while others require an indication of the affective response to that difference. In many situations where a measure of the perceived difference is required, an experiment with human listeners is impractical. Perceval, a computational model of an average listener, was created to predict task-specific human responses in such situations.

Perceval models the transfer characteristics of the middle and inner ear to form an internal representation of the signal. The model analyzes successive audio segments of approximately 40 ms duration, with a fifty percent overlap between adjacent segments. A Hann window is applied to each segment, followed by a Fast Fourier Transform (FFT), to produce a time-frequency representation. The energy spectrum obtained from the transform is attenuated by a frequency-dependent function that models the effect of the ear canal and the middle ear. The resulting spectral energy values are mapped from the frequency scale to a pitch scale that is more linear with respect to both the physical properties of the inner ear and observed psychophysical effects. The energy in the pitch domain is then convolved with a spreading function to simulate the dispersion of energy along the basilar membrane of the ear. Finally, an intrinsic frequency-dependent energy is added to each pitch component to account for the absolute threshold of hearing. Converting the energy to decibels yields a basilar membrane representation of the signal. The successive stages of the ear model are shown in Figure 1.

Figure 1: Block diagram of the Calculation of the Basilar Membrane Representation in Perceval.
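
The stages of the ear model can be sketched in a few lines of Python. This is only an illustration of the processing order described above, not Perceval's actual implementation: the 48 kHz sampling rate, the 2048-sample segment (about 43 ms), the outer and middle ear weighting, the Bark pitch scale, the triangular spreading slopes (27 and 10 dB per Bark), the internal noise formula, and the function names are all assumptions chosen for the example, and the output level is not calibrated to sound pressure level.

```python
import numpy as np


def hz_to_bark(f_hz):
    """Map frequency in Hz to a pitch scale (Bark approximation, assumed)."""
    f = np.asarray(f_hz, dtype=float)
    return 13.0 * np.arctan(0.00076 * f) + 3.5 * np.arctan((f / 7500.0) ** 2)


def basilar_membrane_representation(x, fs=48000, n_fft=2048, bark_step=0.25):
    """Return (frames x pitch bands) excitation in dB and band centres in Bark."""
    x = np.asarray(x, dtype=float)
    hop = n_fft // 2                          # fifty percent overlap
    if len(x) < n_fft:
        x = np.pad(x, (0, n_fft - len(x)))
    n_frames = 1 + (len(x) - n_fft) // hop
    window = np.hanning(n_fft)                # Hann window
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)[1:]   # drop the DC bin
    khz = freqs / 1000.0

    # Ear canal and middle ear attenuation in dB (assumed weighting curve).
    ear_db = (-0.6 * 3.64 * khz ** -0.8
              + 6.5 * np.exp(-0.6 * (khz - 3.3) ** 2)
              - 1e-3 * khz ** 4)
    ear_gain = 10.0 ** (ear_db / 10.0)

    # Assign each FFT bin to a pitch band of width bark_step.
    bark = hz_to_bark(freqs)
    n_bands = int(np.ceil(bark.max() / bark_step))
    band_idx = np.minimum((bark / bark_step).astype(int), n_bands - 1)
    centres = (np.arange(n_bands) + 0.5) * bark_step

    # Spreading function: triangular in dB, steeper towards lower pitch
    # (assumed slopes; rows are target bands, columns are source bands).
    dz = centres[:, None] - centres[None, :]
    slope_db = np.where(dz < 0, 27.0 * dz, -10.0 * dz)
    spread = 10.0 ** (slope_db / 10.0)

    # Intrinsic energy per band for the absolute threshold (assumed formula).
    centre_khz = np.interp(centres, bark, freqs) / 1000.0
    internal = 10.0 ** (0.4 * 3.64 * centre_khz ** -0.8 / 10.0)

    out = np.empty((n_frames, n_bands))
    for i in range(n_frames):
        seg = x[i * hop:i * hop + n_fft] * window
        energy = np.abs(np.fft.rfft(seg))[1:] ** 2 * ear_gain
        band_energy = np.bincount(band_idx, weights=energy, minlength=n_bands)
        out[i] = 10.0 * np.log10(spread @ band_energy + internal)
    return out, centres
```

Calling `basilar_membrane_representation(signal)` on a one-dimensional array of samples returns one row per analysis segment. Because the intrinsic energy is added before the conversion to decibels, the result stays finite even for digital silence.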


In simulations of listener responses to a signal partly masked by the presence of other audio information, a basilar membrane representation is formed for the masker alone, and for the masker combined with the signal to be detected. The difference between the representations calculated by the ear model is due to the part of the signal that is not masked. The cognitive component of Perceval, shown in Figure 2, uses the difference signal to produce a task-dependent output.

Figure 2: Block Diagram of the Perceval Model.


For example, detection of a signal embedded in masking noise is simulated by calculating the probability of non-detection of the difference at each detector along the simulated basilar membrane. Assuming that the detectors are statistically independent, the global detection probability for the whole set of detectors is calculated as the complement of the product of the individual non-detection probabilities. Several masking experiments were successfully simulated using this approach [1], and the feasibility of modeling detection performance of an individual listener was also demonstrated [2].
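
The combination rule for independent detectors can be written as a short Python sketch. The function name and the input format (one non-detection probability per detector) are assumptions made for illustration; how Perceval derives each detector's probability from the difference signal is not specified here and is not modelled.

```python
import numpy as np


def global_detection_probability(p_non_detect):
    """Combine per-detector non-detection probabilities.

    Assumes statistically independent detectors, so the probability that
    no detector responds is the product of the individual non-detection
    probabilities; the global detection probability is its complement.
    """
    q = np.asarray(p_non_detect, dtype=float)
    if np.any((q < 0.0) | (q > 1.0)):
        raise ValueError("probabilities must lie in [0, 1]")
    return 1.0 - float(np.prod(q))
```

For instance, three detectors with non-detection probabilities 0.9, 0.8 and 0.5 give a global detection probability of about 0.64, which is higher than that of any single detector (at most 0.5).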

The perceived quality of processed audio information may also be simulated. Quality may be degraded when audio components such as codecs introduce unwanted noise. However, if the noise were partially or entirely masked by the audio signal, the degradation in quality would be less than expected. Again, the ear model calculates the differences between successive representations of the original and processed audio signals. The cognitive model computes a number of perceptually relevant features from these differences and relates these features to a measure of quality of the processed signal.

[1] B. Paillard, P. Mabilleau, S. Morisette, and J. Soumagne, "Perceval: Perceptual evaluation of the quality of audio signals," J. Audio Eng. Soc., Vol. 40, pages 21-31, 1992.

[2] W. C. Treurniet, "Simulation of individual listeners with an auditory model," Proceedings of the Audio Engineering Society, Copenhagen, Denmark, Reprint Number 4154, 1996.