Improving Headphone Performance
In my last article, "The Headphone Geeks of HeadRoom," I discussed headphone-performance measurements with Tyll Hertsens. Hertsens' company, HeadRoom, is a haven for self-proclaimed headphone geeks who will stop at nothing -- except perhaps to watch Star Trek TNG -- in the pursuit of headphone excellence.
As you'll recall, that article described some of the unique challenges that headphones pose where measurements are concerned. This month, I'll continue my discussions with Hertsens, focusing on another headphone oddity: "blobs" in your head.
"If you've ever listened to headphones critically, you've probably heard an artificiality in the sound," says Hertsens. "Most people perceive the audio image produced by headphones to be a blob on the left, a blob on the right, and a blob in the middle."
It is these "blobs" that cause headphone sound to suffer. Interestingly, they are also linked to our ability to locate sound. Only a thorough understanding of this curious auditory space perception has allowed audio engineers to reduce these "blobs."
Auditory space perception
Before discussing blobs, we need to gain sufficient insight into the phenomenon of auditory space perception.
Locating sound is a complicated business requiring two ears, lightning-fast cortical processing, and the ability to discern the slightest of differences. Distance is the simplest cue to an object's position: the closer it is, the louder it is.
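The loudness-with-distance cue follows a simple rule in open air: under the inverse-square law, level drops by about 6dB for every doubling of distance. A quick sketch (the free-field assumption and the distances are illustrative):

```python
import math

def level_change_db(d_near, d_far):
    """Drop in sound-pressure level moving from d_near to d_far (meters)."""
    return 20.0 * math.log10(d_far / d_near)

print(round(level_change_db(1.0, 2.0), 2))  # doubling the distance: ~6 dB quieter
print(round(level_change_db(1.0, 4.0), 2))  # quadrupling: ~12 dB quieter
```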
Moving objects are detected with the aid of a change in frequency, the Doppler shift -- that sound you hear at racetracks, or when an ambulance or fire truck zooms by, siren wailing. The Doppler shift is the increase in frequency as a moving object approaches a stationary listener; when the object passes and moves away, the frequency decreases correspondingly.
Sound waves travel at a constant speed regardless of the motion of their source. As an object moves forward, the waves ahead of it bunch up, shortening the distance between successive wavefronts. Because frequency is inversely related to wavelength, the shorter the spacing between waves, the higher the frequency of the sound.
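The arithmetic can be sketched as follows (the siren frequency and vehicle speed are illustrative; 343 m/s is the speed of sound in room-temperature air):

```python
def doppler_frequency(f_source, source_speed, c=343.0):
    """Frequency heard by a stationary listener; positive speed = approaching."""
    return f_source * c / (c - source_speed)

# A 440 Hz siren at 30 m/s: the pitch rises on approach and falls after it passes.
print(round(doppler_frequency(440.0, 30.0), 1))   # approaching
print(round(doppler_frequency(440.0, -30.0), 1))  # receding
```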
Both distance and the Doppler shift can be perceived with just one ear. However, to really know an object's position, the brain requires both ears.
Localization and two ears
Because our ears sit at two different points in space, sound from the same object reaches them at different times; each ear therefore sends slightly different information to the auditory cortex. The brain compares this incoming information and uses it to calculate the direction and distance of the sound.
For instance, if the sound reaches both ears simultaneously, the object is either directly in front of or directly behind us. If a sound reaches our left ear before our right, the brain concludes that the sound is to our left. This cue is called the interaural time difference (ITD), and what's amazing is that our ears can detect arrival-time differences as small as 0.0001 seconds.
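The geometry behind the ITD can be sketched with Woodworth's classic approximation (the 8.75cm head radius is a textbook assumption, not a figure from the article):

```python
import math

def itd_seconds(azimuth_deg, head_radius=0.0875, c=343.0):
    """Woodworth's ITD approximation: r * (theta + sin(theta)) / c."""
    theta = math.radians(azimuth_deg)
    return head_radius * (theta + math.sin(theta)) / c

print(itd_seconds(0))   # straight ahead: both ears hear the sound at the same instant
print(itd_seconds(90))  # fully to one side: about 0.00066 s, well above the threshold
```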
The brain also uses loudness. The head shadows the ear farther from the source, so sound arrives there slightly quieter, particularly at higher frequencies. This cue is called the interaural amplitude difference (IAD). Our brain uses these differences to calculate the position of the object.
The ITD and IAD are most useful for midrange frequencies. For low frequencies -- below 1000Hz -- where the sound wave's length exceeds the distance between the ears, our brain uses phase difference to locate sound. Each ear simultaneously receives a different phase of the same sound wave, and the brain uses this information to calculate the position of the object.
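A quick check of why phase works only at low frequencies (the 0.22m ear-to-ear path is an illustrative assumption): once the wavelength drops below the path between the ears, the phase difference wraps around and becomes ambiguous.

```python
def wavelength_m(f_hz, c=343.0):
    """Wavelength of a tone in air, in meters."""
    return c / f_hz

EAR_PATH = 0.22  # assumed acoustic path around the head, meters

print(wavelength_m(500) > EAR_PATH)   # low frequency: phase is an unambiguous cue
print(wavelength_m(5000) > EAR_PATH)  # high frequency: phase wraps, the cue breaks down
```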
With high frequencies, the wavelength becomes shorter than the distance between the ears and sound localization becomes more complicated. When sound reaches our ears, some of it enters the ear canal directly, while the remainder is reflected about within our funny-shaped ear flap, the pinna. Together these two sounds create a frequency-response phenomenon called a comb filter. The slightest head movement shifts the comb filter, and the brain compares these changes to determine the position of the sound. This is why, when trying to localize a sound, people unconsciously move their heads. It's also why people's ability to locate sound is drastically impaired when their heads are held in a fixed position.
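The cancellation pattern is easy to compute: a signal summed with a copy of itself delayed by tau cancels at odd multiples of 1/(2*tau). The 100-microsecond pinna-reflection delay below is purely illustrative:

```python
def comb_notches(delay_s, max_hz=20000.0):
    """Frequencies where a signal plus its delayed copy cancel: odd multiples of 1/(2*tau)."""
    notches = []
    n = 0
    while (f := (2 * n + 1) / (2 * delay_s)) <= max_hz:
        notches.append(f)
        n += 1
    return notches

print(comb_notches(100e-6))  # 100 µs reflection: notches at 5 kHz and 15 kHz
```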
When it comes to locating sound, our brain is therefore reliant on sound reaching both ears.
"So here's the problem with headphone listening, in a nutshell," says Hertsens. "The sound in the right channel is heard only in the right ear, and the sound in the left channel only in the left ear. What's missing in headphones is the sound going from each channel to the opposite ear, and this is very unnatural."
In other words, cues such as interaural time, interaural amplitude, and phase difference are no longer present, and this confuses the brain. So what does the brain do?
"The brain perceives the information at the ear, and it feels like there is a bug in that ear, so it creates a blob of sound at both ears and in the middle," explains Hertsens.
Correcting the blobs
Audio researchers have been trying to reduce the effects of these blobs since the 1950s -- and by the 1980s, as the cost of integrated circuits fell, headphones were sounding better and better.
Enter Hertsens and his dream of better headphones for everyone.
Combining the early research of the '50s with current technology, Hertsens determined that the best way to improve headphone performance was to remove the blobs, and the best way to remove the blobs was to trick the brain into thinking the sound was reaching both ears. This is essentially what his psychoacoustic processors and headphone amplifiers do.
To create an interaural time difference cue, the HeadRoom crossfeed circuit uses a two-stage active filter that provides a delay of about 400 microseconds. For the interaural amplitude difference cue, the circuit provides a gentle frequency-response roll-off starting at about 2kHz. The left crossfeed signal is then mixed in with the right channel's direct signal, and vice versa.
"However," continues Hertsens, "note that because audio signals in air mix somewhat differently than they do electronically, and because there are limitations on what can be done with analog filters, the performance characteristics of the processor circuit are not exactly the same as the acoustic-speaker-to-head environment that it models. In air, the crossfeed channel is only about 3dB lower in overall intensity than the direct channel, and the frequency response of the crossfeed signal is not a simple roll-off."
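A minimal digital sketch of the idea Hertsens describes -- not HeadRoom's actual analog circuit -- mixes each channel, delayed, attenuated, and low-pass filtered, into the opposite channel. The sample rate, delay, gain, and one-pole filter coefficient below are all illustrative assumptions:

```python
FS = 48000                  # sample rate, Hz (assumed)
DELAY = round(400e-6 * FS)  # ~400 microsecond interaural delay, in samples
GAIN = 0.7                  # crossfeed roughly 3 dB below the direct signal
ALPHA = 0.23                # one-pole low-pass coefficient, cutoff near 2 kHz

def crossfeed(left, right):
    """Return (left_out, right_out) with the opposite channel fed across."""
    def feed(src):
        delayed = ([0.0] * DELAY + list(src))[:len(src)]
        y, out = 0.0, []
        for x in delayed:
            y += ALPHA * (x - y)  # one-pole low-pass: y[n] = y[n-1] + alpha*(x[n] - y[n-1])
            out.append(GAIN * y)
        return out
    fed_l, fed_r = feed(left), feed(right)
    return ([l + f for l, f in zip(left, fed_r)],
            [r + f for r, f in zip(right, fed_l)])

# A click in the left channel only: it now also reaches the right output,
# delayed, quieter, and smoothed -- mimicking sound wrapping around the head.
left = [1.0] + [0.0] * 99
right = [0.0] * 100
left_out, right_out = crossfeed(left, right)
```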
Things become more complicated when mimicking the comb-filter phenomenon.
"It turns out that if you take an audio signal and combine it with itself, after a short delay you will get periodic cancellations in the frequency domain. When we crossfeed a signal from one channel to the other, it is a slightly different signal, but part of it (the mono component) is the same. So, when we crossfeed the signal through a delay, we get a comb-filter effect, the strength of which depends on the amount of mono in the signal. The depth of the notch in the comb filter is also affected by the fact that the crossfeed signal has a smaller amplitude than the direct signal, and the time delay of the crossfeed signal gets shorter as frequency gets higher. In the end, by very subtly playing with crossfeed level and the delay rate, we can create a simulation of the amplitude notch at 2.5kHz by using the comb-filter effect."
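The arithmetic in that quote can be sketched as follows: a direct signal plus a crossfed copy at gain g and delay tau has magnitude |1 + g*e^(-j*2*pi*f*tau)|, so a 200-microsecond delay puts the first cancellation at 2.5kHz, and a reduced crossfeed gain (0.7 here, an illustrative value) keeps the notch shallow rather than a full null:

```python
import math

def comb_magnitude(f_hz, gain, delay_s):
    """|1 + gain * e^(-j*2*pi*f*delay)|: response of direct + crossfed signal."""
    phase = 2 * math.pi * f_hz * delay_s
    return math.hypot(1.0 + gain * math.cos(phase), gain * math.sin(phase))

TAU = 1.0 / (2 * 2500.0)  # 200 µs: the first cancellation lands at 2.5 kHz

print(round(comb_magnitude(2500.0, 0.7, TAU), 3))  # notch floor: 1 - 0.7 = 0.3
print(round(comb_magnitude(5000.0, 0.7, TAU), 3))  # in-phase peak: 1 + 0.7 = 1.7
```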
Head-related transfer function
Sound localization is a very complex mechanism, and hence extremely difficult to mimic with simple analog circuits. Still, with careful design, the HeadRoom amplifier does a good job of reducing the blobs, creating a more natural audio image. But is it possible to make headphone sound truly appear to come from outside your head? It is if you're a pilot in the US military.
For pilots who fly multimillion-dollar aircraft, cost is no object when it comes to providing the best possible sound through headphones. Pilots are outfitted with tiny microphones in their ears and sent into an anechoic chamber. Test sounds are recorded, and a computer program develops a personalized, complete picture of how sound behaves when it comes into contact with the pilot's head and ears. This "picture" is called a head-related transfer function (HRTF). Each time a pilot enters a plane, he or she loads the individual HRTF into the aircraft's headphone system, creating a customized sound system.
In recent years, this technology has been used, albeit in a limited fashion, in DSP and home-theater products. It improves the sound experience to a point, but still cannot trick the brain into believing the sound is coming from outside your head.
"These products suffer because the main cue your brain uses to localize sounds is the way those sounds change when you move your head," says Hertsens. "When you have headphones on, you cannot move your head relative to the sound source. Any truly effective headphone acoustic-synthesis system must be able to track your head movements and adjust each sound source based on the position of the head. But that's not available right now."
Creating great-sounding headphones is a difficult and challenging problem because it must address the complexities of our auditory system. Perhaps one day non-military personnel will be able to create their own customized HRTFs, or technology to track head movements cheaply will be developed. When these new technologies arrive, you can be sure that HeadRoom will be there to provide them for people determined to enjoy music everywhere.
To learn more about HeadRoom and their products, visit www.headphone.com.
Copyright © 2004 SoundStage!
All Rights Reserved