Communications Research Centre Canada

Audio Perception


The Advanced Audio Systems Group has expertise in experimental design of listening tests, statistical analysis, and psychoacoustics. The Group conducts research in Audio Perception, including subjective evaluation of audio and speech systems as well as modeling of human auditory perception.

Subjective Evaluation of Audio Quality

Integral to the design of any audio device is the need to know how it sounds. Traditional measurements such as frequency response, signal-to-noise ratio and total harmonic distortion can give us a sense of the audio quality of a device but ultimately we wish to know how good it sounds. Such a measurement is inherently subjective. The challenge is then to provide some means of calculating such a measure in an objective and repeatable manner.

Since 1990, the Advanced Audio Systems Group has been active in the subjective evaluation of the perceived audio quality of a broad range of audio and speech systems. This includes rigorous tests of high-quality audio systems, where the discrimination of fine and subtle artifacts is required, as well as tests of speech communication systems, where intelligibility and communicability, rather than audio quality, are the main concern (see Past and Recent Tests).

Audio Perception Laboratory

Our expertise in developing and carrying out listening tests and our facilities, unique in North America, have established our reputation worldwide. Building on established methods of experimental psychology (psychophysics and psychometrics), on work in other labs throughout the world, and methods advocated by standards organizations we have developed the most sensitive possible methods of subjective assessment. We developed the first computer-based hard-disk recording and playback system to be used in subjective testing labs. This system, or similar ones now in common use in other labs as well as in our own, allows listeners to switch seamlessly between reference and processed versions of audio material in order to make high resolution comparisons and evaluations. This system has evolved to become CRC-SEAQ, the first hard-disk based multichannel subjective test system and objective measurement tool.

Other innovations introduced by the Audio Perception Lab include: the development of statistical methods for assessing individual subject expertise; the use of analysis of variance (ANOVA) as the most suitable means of analysing data gathered in subjective experiments; and establishing comprehensive training of listeners.

Through our active participation over the years in standards organizations such as ITU-R, MPEG and AES, most of these innovations and improvements have become incorporated into one internationally standardized test procedure, namely ITU-R Recommendation BS.1116 - Methods for the subjective assessment of small impairments in audio systems including multichannel sound systems.

In tests of high quality systems (most notably, tests of codecs for digital audio broadcasting), we rigorously follow BS.1116 in all details, including: the selection of critical audio materials; reference listening room specifications; playback equipment quality; very thorough training of subjects before double blind rating sessions including the use of the grading scale used by subjects; the use of data exclusively from subjects who clearly show sufficiency of expertise; and ANOVA data analysis as the essential step toward interpretation of outcomes. All of these characteristics together define the worst-case scenario for evaluating systems, rather than an "average" scenario.
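As a rough illustration of the ANOVA step mentioned above, the sketch below computes a one-way F statistic in plain Python. It is a simplified layout, not the Group's actual analysis: real listening-test designs usually treat listeners and audio items as additional factors, and the function name `one_way_anova_f` and the ratings are hypothetical.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of rating lists.

    Each inner list holds the grades given to one system under test.
    This is a plain one-way ANOVA, assumed here for illustration only.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual grades around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Hypothetical grades for three codecs (not measured data).
    codec_a = [4.8, 4.6, 4.9, 4.7]
    codec_b = [4.1, 4.3, 3.9, 4.2]
    codec_c = [3.2, 3.5, 3.1, 3.4]
    print(one_way_anova_f([codec_a, codec_b, codec_c]))
```

The resulting F value would then be compared against the critical value of the F distribution for the two degrees of freedom at the chosen significance level.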

Using these methods, we are typically able to attain data resolutions on the order of 5% of the subjective grading scale (e.g., 0.20 of a grade, or less, on a 5-category scale where subjects rate to single-decimal-place accuracy; this is, in effect, a 41-point scale). We continue to improve our methods, exploring and implementing further refinements on an ongoing basis.
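The figures quoted above can be checked with a few lines of arithmetic. The sketch assumes the scale runs from 1.0 to 5.0 in steps of 0.1, which is inferred from the "41-point" description rather than stated explicitly; the function names are illustrative.

```python
# Assumed scale endpoints and step, inferred from the 41-point description.
SCALE_MIN = 1.0
SCALE_MAX = 5.0
STEP = 0.1


def scale_points(lo, hi, step):
    """Number of distinct grades a listener can give on the scale."""
    return round((hi - lo) / step) + 1


def resolution_fraction(grade_diff, lo, hi):
    """A grade difference expressed as a fraction of the scale span."""
    return grade_diff / (hi - lo)


if __name__ == "__main__":
    print(scale_points(SCALE_MIN, SCALE_MAX, STEP))
    print(resolution_fraction(0.20, SCALE_MIN, SCALE_MAX))
```

Under these assumptions, a difference of 0.20 of a grade spans 0.20 / 4.0 of the scale, which is the 5% resolution quoted in the text.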

While BS.1116 is intended to measure small impairments, we have more recently developed a new methodology for the subjective evaluation of medium and large impairments. With the proliferation of lower bitrate audio codecs (e.g., mp3, HE-AAC v2) in Internet and multimedia applications, there is a need for a reliable method of evaluating audio systems of lesser subjective quality. The initial results of this work were presented at the AES conference in Florence in September 1999. Our methodology was submitted to the EBU and the ITU-R and has become the basis of ITU-R Recommendation BS.1534 - Method for the subjective assessment of intermediate audio quality.

Past and Recent Tests

We have conducted a number of subjective tests over the years to evaluate and compare the performance of audio codecs. Among other things, in 1997 we studied the performance of five prominent audio coding systems, namely mp2, mp3, Dolby AC-3, Lucent PAC and AAC. The results were published in an AES Journal article that became one of the journal's most frequently cited papers (see Subjective Evaluation of State-of-the-Art 2-Channel Audio Codecs, AES Journal, March 1998). Prior to that, we conducted listening tests for the ITU-R as part of the recommendation of audio codecs for various applications in broadcasting. We also conducted subjective tests for the Electronic Industries Association/National Radio Systems Committee (EIA/NRSC) of the U.S. as part of their evaluation of potential systems for digital radio (see EIA/NRSC DAR systems subjective tests: Part I, Audio codec quality, and Part II, Transmission impairments, IEEE Transactions on Broadcasting, November 1997 and December 1997).

In the area of speech communications we developed and validated new methodologies for assessing interruptibility in communication systems used by the military. These packet-switched communication systems can suffer delays in real-time transmission. We also conducted speech intelligibility tests for the Canadian military to evaluate communication systems in the presence of high level noise.

More recently, the Group has conducted a series of subjective tests to measure the perceived loudness of audio clips. The aim was to build a database of subjective results for evaluating the performance of loudness meters submitted to the ITU-R as candidate systems for an international standard. These were followed by additional listening tests to refine the loudness meter algorithm adopted by the ITU-R with respect to the contribution of the LFE channel in a surround sound program, and of silence and low signal levels, to loudness perception. Our in-house expertise includes experimental psychology (experimental design, statistical analysis, and psychoacoustics) and engineering.

Modeling of Human Auditory Perception

Knowledge of how acoustic stimuli are processed by the auditory system is important in the development of new digital audio technologies. Audio codecs, which are essential components in new multimedia and broadcast services, depend on the characteristics of the auditory system to compress audio information for efficient transmission and storage at low bit rates. Objective quality measurement schemes, which likewise depend heavily on psychoacoustic knowledge, have been developed to simulate subjective ratings of audio quality. Our lab therefore conducts research to refine and test psychoacoustic models.

The objective of research in psychoacoustics is to understand the relationship between the physical characteristics of a sound and the perception of the sound by a human listener. For example, the physical energy of a sound is perceived as loudness, or the physical frequency of a sound is perceived as pitch. The function that transforms the physical into the perceptual domain is typically a non-linear function of some kind. For example, perceived pitch was described as a function of frequency by Stevens and Volkmann in 1940 using a scaling method that required listeners to select frequencies that doubled the pitch of given sounds. The relationship they discovered is seen below.

Frequency to Pitch Mapping Function
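A non-linear frequency-to-pitch mapping of this kind is often approximated in practice by a closed-form mel-scale formula. The sketch below uses one widely used analytic approximation, m = 2595 log10(1 + f/700). This formula is an assumption made for illustration; it is a later curve fit, not the measured data of Stevens and Volkmann, and the function names are hypothetical.

```python
import math


def hz_to_mel(f_hz):
    """Map frequency in Hz to pitch in mels (common analytic approximation)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel: map pitch in mels back to frequency in Hz."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


if __name__ == "__main__":
    # Equal steps in Hz give progressively smaller steps in mels,
    # which is the compressive behaviour described in the text.
    for f in (250, 500, 1000, 2000, 4000, 8000):
        print(f, round(hz_to_mel(f), 1))
```

By construction, the approximation places 1000 Hz at very nearly 1000 mels and grows roughly logarithmically above that point.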

Investigation of the relationships between physical and perceptual quantities in different contexts also helps us to understand the limitations and variability of human perceptual response. For example, listeners vary in their ability to hear sounds below a threshold energy that is frequency-dependent. Also, listeners vary in their ability to hear high frequency sounds. Young listeners are often able to hear frequencies as high as 17-18 kHz, while older listeners often are insensitive to frequencies beyond 8 kHz. Further, some elements of a sound that are physically present in a signal may not be perceived because of masking by other elements adjacent in time and frequency. Such masking also shows considerable variability across listeners.

All modern audio coding and compression systems make extensive use of psychoacoustic knowledge. Psychoacoustic models are used in audio encoders to predict which elements of a sound are likely to be inaudible. By ignoring such irrelevant information, the bit rate required to transmit the audio signal may be significantly reduced. The performance of coding algorithms typically varies with different types of audio content, and some implementations may be more successful than others in the use of psychoacoustic knowledge. Our Group has developed and patented a psychoacoustic model for use in audio coding systems (see H. Najaf-Zadeh, H. Lahdili and L. Thibault, "Incorporation of Inharmonicity Effects into Auditory Masking Models," 113th AES Convention, Los Angeles, 2002, and H. Najaf-Zadeh, H. Lahdili, L. Thibault and M. Lavoie, "Use of Auditory Temporal Masking in the MPEG Psychoacoustic Model," 114th AES Convention, Amsterdam, 2003).
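The idea of discarding inaudible components can be sketched very roughly as a per-band comparison between signal level and masking threshold. In this minimal example the thresholds are simply given as inputs; computing them is the job of a real psychoacoustic model and is not shown. The function names and all numbers are hypothetical, and this is not the Group's patented model.

```python
def signal_to_mask_ratios(levels_db, thresholds_db):
    """Per-band signal-to-mask ratio in dB (level minus masking threshold)."""
    if len(levels_db) != len(thresholds_db):
        raise ValueError("levels and thresholds must have the same length")
    return [lvl - thr for lvl, thr in zip(levels_db, thresholds_db)]


def audible_bands(levels_db, thresholds_db):
    """Indices of bands whose level exceeds the masking threshold.

    Bands at or below their threshold are assumed to be masked, so an
    encoder could spend no bits on them.
    """
    smr = signal_to_mask_ratios(levels_db, thresholds_db)
    return [i for i, ratio in enumerate(smr) if ratio > 0.0]


if __name__ == "__main__":
    # Hypothetical band levels and masking thresholds, in dB.
    levels = [60.0, 35.0, 52.0, 20.0]
    thresholds = [40.0, 45.0, 50.0, 30.0]
    print(audible_bands(levels, thresholds))
```

In an actual encoder, the signal-to-mask ratio would typically also guide how many bits each audible band receives, rather than serving only as a keep-or-drop decision.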

Objective measurement methods have also been developed with the help of psychoacoustic knowledge to evaluate the quality of audio systems such as the new generation of audio codecs. One such method, named PEAQ (Perceptual Evaluation of Audio Quality), was standardized by the International Telecommunication Union – Radiocommunication Sector (ITU-R) in Recommendation BS.1387, “Method for objective measurements of perceived audio quality”.

Some elements of the ITU-R PEAQ quality measurement model come from CRC, more specifically from Perceval, an objective perceptual method for measuring the quality of audio codecs developed by our Group. Perceval's computational model of peripheral auditory processes was developed under a contractual arrangement with Sherbrooke University (Quebec, Canada). A task-specific cognitive model, which analyzes the representation obtained from the ear model, was developed and added subsequently in our laboratory and is included in PEAQ. The cognitive model outputs perceptual or cognitive variables appropriate for modeling audio quality assessments or auditory detection thresholds. Perceval reproduces basic psychoacoustic phenomena, and estimates perceptual degradations that correlate well with average listener quality ratings.

Our Group has developed a commercial implementation of the ITU-R PEAQ model in the Objective Test Module of our CRC-SEAQ software. Research to develop and refine objective methods to predict audio quality is ongoing in our group.

Psychoacoustic models aid in optimizing the performance of audio processing and measurement devices. This is possible because the basic function of the ear is understood well enough to be practically useful. However, there are obvious limitations that need to be addressed to achieve even better performance. Our laboratory has developed a facility and an on-going research program to fill apparent gaps in what is known about perception of audio (see W. Treurniet and D. Boucher, "A masking level difference due to harmonicity", J. Acoust. Soc. Am., 109(1):306-320, 2001).

More recently, our Group has developed an objective method to measure audio loudness. This method was proposed to the ITU-R and was selected in 2006 from among 11 candidates to become an international standard known as ITU-R Recommendation BS.1770, “Algorithms to measure audio program loudness and true-peak audio level”. This loudness metering technique has been endorsed by several major broadcast associations (World Broadcasting Unions, North American Broadcasters Association, European Broadcasting Union, Advanced Television Systems Committee, etc.) and is now being deployed in production studios and broadcast facilities around the world. Our Group has developed its own commercial implementation of the ITU-R BS.1770 Recommendation in our CRC Loudness Meter.
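The final summation stage of a BS.1770-style loudness measurement can be sketched as follows. This is a simplified illustration, not the CRC Loudness Meter: it assumes the input samples have already passed through the K-weighting pre-filter defined in the Recommendation, it applies no gating, and the function and variable names are hypothetical. The channel weights and the -0.691 offset are those of BS.1770, in which the LFE channel is left out of the sum.

```python
import math

# Channel weights from BS.1770; channels not listed (e.g. LFE) are skipped.
CHANNEL_WEIGHTS = {"L": 1.0, "R": 1.0, "C": 1.0, "Ls": 1.41, "Rs": 1.41}


def mean_square(samples):
    """Mean-square energy of one channel's samples."""
    return sum(x * x for x in samples) / len(samples)


def program_loudness(channels):
    """Loudness in LKFS from a dict of channel name -> K-weighted samples.

    Assumes K-weighting was applied beforehand. Raises ValueError if the
    weighted energy is zero (digital silence), since log10(0) is undefined.
    """
    total = 0.0
    for name, samples in channels.items():
        weight = CHANNEL_WEIGHTS.get(name)
        if weight is None:
            continue  # e.g. the LFE channel does not contribute
        total += weight * mean_square(samples)
    if total <= 0.0:
        raise ValueError("no signal energy in the weighted channels")
    return -0.691 + 10.0 * math.log10(total)


if __name__ == "__main__":
    # Hypothetical pre-filtered samples for a stereo programme.
    stereo = {"L": [0.1, -0.2, 0.15, -0.05], "R": [0.1, -0.1, 0.2, -0.15]}
    print(round(program_loudness(stereo), 2))
```

A complete meter would add the two-stage K-weighting filter ahead of this step, which is omitted here to keep the sketch short.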