We need to talk about SPEAKERS: Sorry, 'audiophiles', only IT will break the sound barrier

Design, DSPs and the debunking of traditional hi-fi

Axis of evil

The specifications of loudspeakers are incomplete: any number of speakers having the same specification will all sound different. There is an obsession with on-axis frequency response, while the equally important, possibly more important, parameters of time response and imaging are neglected.

I will explain below how human hearing requires accurate time information in sounds, yet in misguided attempts to extend the frequency range, accuracy in the time domain may actually be damaged. Never mind the quality; feel the bandwidth.

Hubble Space Telescope shows point spread function before (left) and after (right) servicing

Stereophonic loudspeakers are intended to deliver a sonic image. In photography, SONAR and so on, there are agreed methods of testing image accuracy using concepts such as the point-spread function. Stated simply, an image with an infinite number of pixels would be perfect and each pixel would be a point. If each point were to be spread or smeared out by some defect, it’s the equivalent of making the pixels bigger and the sharpness of the image is lost.
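The smearing described above can be sketched as a convolution. This is a minimal one-dimensional illustration, not a measurement method: the ten-sample "image", the five-sample uniform spread function and the helper name `convolve` are all assumptions chosen for clarity.

```python
def convolve(signal, kernel):
    """Full discrete convolution of two lists of numbers."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out

# Hypothetical image: two point sources, three samples apart.
image = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]

# An ideal system maps each point to a single point.
ideal_psf = [1.0]
# A defective system spreads each point evenly over five samples (assumed shape).
smeared_psf = [0.2] * 5

sharp = convolve(image, ideal_psf)
blurred = convolve(image, smeared_psf)

def lit(samples):
    """Count samples carrying a non-negligible amount of energy."""
    return sum(1 for v in samples if abs(v) > 1e-12)

print("sharp  :", sharp)
print("blurred:", [round(v, 2) for v in blurred])
print("samples lit:", lit(sharp), "versus", lit(blurred))
```

With the ideal spread function the two sources stay as two separate points. With the wide one they merge into a single blob, which is the loss of sharpness the point-spread function is meant to quantify.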

In those fields, objective comparisons can be made, and these result in improvements. Unfortunately, there is no standard for stereophonic imaging accuracy, so no objective comparisons are possible and progress is impeded. Most legacy speakers have massive point spread functions due to diffraction from inept enclosure design, and their stereophonic images are badly smeared.

This is just as well, because when the dominant sound sources are massively smeared, they will mask the fact that a compression codec has thrown away the ambience and reverberation. The mediocrity of legacy loudspeakers may be retained so that the poor quality of many compression algorithms and microphone techniques is not revealed. This also applies to the earphones supplied with many portable IT-based music players. Never mind the quality; look at the iconic styling.

Apple iPod Classic – a design icon but the range has never been lauded for sonic excellence

Conversely, audio codecs can be used to test and improve loudspeakers. Using a state-of-the-art speaker designed according to psychoacoustic criteria, it becomes immediately obvious how bad DAB, MiniDisc and MP3 are and that the only lossy codec that has any merit is AAC (at an adequate bit rate). It is not uncommon when demonstrating such speakers for people to assume that the signal source is some exotic high-bit-rate recording when it is simply a competently engineered CD.

To make such loudspeakers, the starting point has to be good knowledge of how the human auditory system (HAS) works, since that defines the problem. Once the problem is understood, the solution lies in the application of good engineering.

It is important to realise that the HAS evolved as a survival tool to help find food and a mate, whilst avoiding becoming a meal for something else. Given the dubious biological nature of the transducer itself, sophisticated mental processes have evolved to make the best of it.

The most important contribution hearing can provide to survival is the location of a potential threat and an estimate of its size. The HAS is very good at it, even in the presence of reflections. It does this a lot better than any modern microphone can, because microphones don’t have brains.

Evolutionary Bond: our survival has depended on locating sound direction and identifying the size of the threat
Source: Quantum of Solace, EON Productions

With two horizontally displaced ears, the most reliable directional information comes from the difference in time of arrival of wave fronts at the ears. The true source must be the one that results in the first version of a given sound. The HAS works in the time domain, constantly attempting to correlate the sounds at each ear to identify the first version, and to correlate the sounds between the two ears to determine the direction. It can do this most effectively with transients, or events, since these can carry timing information.
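The time-of-arrival comparison can be sketched as a cross-correlation between the two ear signals. This is only an illustration of the principle, not a model of what the HAS actually computes: the 48kHz sample rate, the four-sample transient, the 12-sample delay and the helper names `place` and `best_lag` are all assumptions.

```python
SAMPLE_RATE = 48_000  # samples per second (assumed)

def place(pulse, offset, length=64):
    """Return a silent buffer with the pulse written in at the given offset."""
    out = [0.0] * length
    for k, v in enumerate(pulse):
        out[offset + k] = v
    return out

def best_lag(left, right, max_lag):
    """Return the lag, in samples, at which right best matches left.

    A positive lag means the sound reached the right ear later.
    """
    best, best_score = 0, float("-inf")
    n = len(left)
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best, best_score = lag, score
    return best

# A short transient that arrives at the right ear 12 samples after the left.
transient = [1.0, -0.6, 0.3, -0.1]
left = place(transient, 20)
right = place(transient, 32)

lag = best_lag(left, right, max_lag=32)
itd_us = lag / SAMPLE_RATE * 1e6
print(f"lag = {lag} samples, time difference = {itd_us:.0f} microseconds")
```

The transient gives a single, unambiguous correlation peak, so the delay is recovered exactly. A steady tone would give a peak every cycle, which is one way to see why events carry the timing information.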

The corollary is that a sine wave has no bandwidth and, according to Shannon, carries no information. This is easy to grasp. Once you have seen a few cycles of a sine wave, you are not going to find anything new if the waveform continues indefinitely.
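One way to see this numerically: every sample of a steady sine wave can be predicted from the two samples before it, so nothing after the first few samples comes as a surprise. The sketch below is an illustration of that predictability and not a proof of Shannon's result. The 1kHz tone, the 48kHz sample rate, the injected click and the helper name `max_surprise` are assumptions.

```python
import math

SAMPLE_RATE = 48_000  # samples per second (assumed)
FREQ = 1_000.0        # tone frequency in Hz (assumed)

sine = [math.sin(2 * math.pi * FREQ * n / SAMPLE_RATE + 0.3) for n in range(480)]

# A sampled sine obeys x[n+1] = c * x[n] - x[n-1], where c = 2*cos(w).
# Learn c from the first three samples only.
coeff = (sine[0] + sine[2]) / sine[1]

def max_surprise(x, c):
    """Largest gap between each sample and its prediction from the previous two."""
    return max(abs(x[n + 1] - (c * x[n] - x[n - 1])) for n in range(1, len(x) - 1))

sine_surprise = max_surprise(sine, coeff)

# The same tone with one transient (a click) added part-way through.
click = list(sine)
click[240] += 0.5
click_surprise = max_surprise(click, coeff)

print(f"steady sine: {sine_surprise:.2e}")
print(f"with click : {click_surprise:.2f}")
```

The steady tone is predicted to within rounding error for its entire length, whereas the click produces a large prediction error at the moment it occurs. The new information is in the event, not in the tone.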

Biting the hand that feeds IT © 1998–2019