Integral to the design of any audio device is the need to know how it sounds. Traditional measurements such as frequency response, signal-to-noise ratio and total harmonic distortion can give us a sense of the audio quality of a device, but ultimately we wish to know how good it sounds. Such a measurement is inherently subjective. The challenge, then, is to obtain such a measure in an objective and repeatable manner.
Since 1990 the Advanced Audio Systems group has been active in the subjective evaluation of the perceived audio quality of a broad range of audio and speech systems. This includes rigorous tests of high-quality audio systems, where the discrimination of fine and subtle artifacts is required, as well as tests of speech communication systems, where intelligibility and communicability, rather than audio quality, are the main concern (see Past Tests).

Our expertise in developing and carrying out listening tests and our facilities, unique in North America, have established our reputation worldwide. Building on established methods of experimental psychology (psychophysics and psychometrics), on work in other labs throughout the world, and on methods advocated by standards organizations, we have developed the most sensitive possible methods of subjective assessment. We developed the first computer-based hard-disk recording and playback system to be used in subjective testing labs. This system, like the similar ones now in common use in other labs, allows listeners to switch seamlessly between reference and processed versions of audio material in order to make high-resolution comparisons and evaluations. It has evolved into CRC-SEAQ, the first hard-disk-based multichannel subjective test system and objective measurement tool.
Other innovations introduced by the Audio Perception Lab include: the development of statistical methods for assessing individual subject expertise; the use of analysis of variance (ANOVA) as the most suitable means of analysing data gathered in subjective experiments; and the establishment of comprehensive listener training.
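The text does not say which statistic is used to assess subject expertise. As an illustration only, one widely used approach is a one-sided, one-sample t-test on a subject's "diff grades" (the grade given to the processed item minus the grade given to the hidden reference). A mean significantly below zero suggests the subject can reliably tell the two apart. The sketch below assumes that approach; the function names, the sample grades and the choice of a 5% significance level are all hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def diffgrade_t_statistic(diff_grades):
    """One-sample t statistic of the diff grades against a mean of zero."""
    n = len(diff_grades)
    spread = stdev(diff_grades)  # sample standard deviation (n - 1)
    if spread == 0:
        raise ValueError("all diff grades are identical; t is undefined")
    return mean(diff_grades) / (spread / sqrt(n))


def shows_expertise(diff_grades, t_critical):
    """True if the mean diff grade is significantly below zero (one-sided).

    t_critical is the positive critical value for the chosen significance
    level and n - 1 degrees of freedom, taken from a t table.
    """
    return diffgrade_t_statistic(diff_grades) < -t_critical


# Made-up diff grades for two hypothetical subjects, ten trials each.
attentive = [-1.0, -0.5, -1.2, -0.8, 0.0, -0.6, -1.1, -0.4, -0.9, -0.7]
guessing = [0.2, -0.3, 0.1, -0.1, 0.0, 0.3, -0.2, 0.1, -0.1, 0.0]

# One-sided 5% critical value of Student's t for 9 degrees of freedom.
T_CRIT_DF9 = 1.833

print(shows_expertise(attentive, T_CRIT_DF9))
print(shows_expertise(guessing, T_CRIT_DF9))
```

With these made-up numbers, data from the first subject would be retained and data from the second would be set aside. A real screening procedure would also need to handle multiple items and systems per subject.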
Through our active participation over the years in standards organizations such as ITU-R, MPEG and AES, most of these innovations and improvements have become incorporated into one internationally standardized test procedure, namely ITU-R Recommendation BS.1116 - Methods for the subjective assessment of small impairments in audio systems including multichannel sound systems.
In tests of high-quality systems (most notably, tests of codecs for digital audio broadcasting), we rigorously follow BS.1116 in all details, including: the selection of critical audio materials; reference listening room specifications; playback equipment quality; very thorough training of subjects before double-blind rating sessions, including training in the use of the grading scale; the use of data exclusively from subjects who clearly show sufficient expertise; and ANOVA data analysis as the essential step toward interpretation of outcomes. Together, these characteristics define a worst-case scenario for the systems under evaluation, rather than an "average" scenario.
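To make the ANOVA step concrete, the following is a minimal sketch of a one-way ANOVA F statistic computed over diff grades grouped by system. It is a simplification: real listening-test analyses typically involve several factors (systems, audio items, subjects), and the design used in any particular test is not described here. The function name, the codec labels and the grades are invented for illustration.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups is a list of lists, one list of grades per system under test.
    """
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)       # number of systems
    n = len(all_values)   # total number of grades

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual grades around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up diff grades for three hypothetical codecs, four grades each.
codec_a = [-0.2, -0.4, -0.3, -0.1]
codec_b = [-1.0, -1.3, -0.9, -1.2]
codec_c = [-2.1, -1.8, -2.4, -2.0]

f_stat, df_b, df_w = one_way_anova_f([codec_a, codec_b, codec_c])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The F value is then compared with the critical value of the F distribution for the given degrees of freedom. A value above it indicates that at least one system's mean grade differs from the others by more than chance would explain.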
Using these methods, we are typically able to attain data resolutions on the order of 5% of the subjective grading scale (e.g. 0.20 of a grade, or less, on a 5-category scale where subjects rate to one decimal place; this is, in effect, a 41-point scale). We continue to improve our methods, exploring and implementing further refinements on an ongoing basis.
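The arithmetic behind these figures can be checked with a short sketch. It assumes the scale runs from 1.0 to 5.0 in steps of 0.1, which is consistent with the 41 points quoted above; the constant and function names are illustrative only.

```python
from fractions import Fraction

# Assumed scale: five categories graded from 1.0 to 5.0 in steps of 0.1.
SCALE_MIN = Fraction(1)
SCALE_MAX = Fraction(5)
STEP = Fraction(1, 10)


def scale_points(lo=SCALE_MIN, hi=SCALE_MAX, step=STEP):
    """Number of distinct grades a subject can give."""
    return int((hi - lo) / step) + 1


def resolution_fraction(grade_difference, lo=SCALE_MIN, hi=SCALE_MAX):
    """A grade difference expressed as a fraction of the full scale range."""
    return Fraction(grade_difference) / (hi - lo)


points = scale_points()
share = resolution_fraction(Fraction(2, 10))  # 0.20 of a grade
print(points, float(share))  # prints: 41 0.05
```

Exact fractions are used instead of floats so that the step count is not affected by rounding.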
While BS.1116 is intended to measure small impairments, we have more recently developed a new methodology for the subjective evaluation of medium and large impairments. With the proliferation of lower-bitrate audio codecs (e.g. MP3) in Internet and multimedia applications, there is a need for a reliable method of evaluating audio systems of lesser subjective quality. The initial results of this work were presented at the AES conference in Florence in September 1999. Our methodology was submitted to the ITU-R and has become the basis of ITU-R Recommendation BS.1534 - Method for the subjective assessment of intermediate audio quality.
Most recently, we have studied the performance of the newest generation of low-bitrate codecs ("Subjective Evaluation of State-of-the-Art 2-Channel Audio Codecs", Journal of the Audio Engineering Society, March 1998). Prior to that, we conducted tests for the Electronic Industries Association/National Radio Systems Committee (EIA/NRSC) of the U.S. as part of their evaluation of potential systems for digital radio ("EIA/NRSC DAR systems subjective tests: Part I, Audio codec quality" and "Part II, Transmission impairments", IEEE Transactions on Broadcasting, November 1997 and December 1997).
In the area of speech communications, we recently developed and validated new methodologies for assessing interruptibility in communication systems used by the military. These packet-switched communication systems can suffer delays in real-time transmission. We also conducted speech intelligibility tests for the Canadian military to evaluate communication systems in the presence of high-level noise.
Research to develop objective methods that predict the outcomes of subjective assessment is ongoing, and our own group is active in this field (see Perceval and CRC-SEAQ). There are many situations, however, where the accuracy and reliability of measurements demand the use of human observers.
Our in-house expertise includes experimental psychology (experimental design, statistical analysis, and psychoacoustics) and engineering.