October 2003
Reducing Noise from the Digital Format: The Dithering and Noise Shaping of Quantization Error
Every so often, in every field of technology, a quantum leap occurs that renders the previous paradigm almost unthinkable. In personal computing, multi-cell spreadsheet applications such as Lotus 1-2-3 and later the Internet and e-mail made it nearly impossible to imagine how work was accomplished in the primitive pre-PC era. Similarly, in audio technology, the benefits offered by the digital representation of sound served to make the compact disc the "killer app" of its day. And yet, the impressive clarity that persuaded the buying public to make the switch from vinyl to CD 20 years ago is only one facet of digital's strength.
First, the CD is extraordinarily convenient. Digital signal processing (DSP) allows fast and easy error detection. And unlike with analog sound recordings -- vinyl or tapes -- with digital, once an error is found, it can be corrected. Even more convenient is the computer compatibility digital recording allows, making the manipulation of the sound even more user-friendly, precise, and repeatable. Plus, there are no degradations due to storage or repetitive copying of the source material.
Secondly, digital storage is more compact and much cheaper. Compared to vinyl or magnetic tape media, the amount of information that can be crammed onto a digital format is truly astonishing. Imagine a '50s-style jukebox being mounted in the trunk of a car the way today's CD carousels often are. Or picture the size of a stack of 8-track tapes that would be required to store the number of songs stored in a multi-gigabyte MP3 player.
Finally, there's what many consider to be the most important benefit, and the feature that took the audio world by storm at the dawn of the digital era. Background noise caused by electrical signals -- and ubiquitous in analog systems -- is easier to avoid or remove, which is why CDs provide the impressive clarity mentioned earlier. It's also the reason why digitally processed telecommunications enable calls from the other side of the globe to sound as if they're coming from next door.
Today, after more than 20 years of increasing technical capability matched by equal decreases in cost, digital technology is no longer exclusive to professional recording studios. Amateur studios are now capable of the type of wizardry and capabilities formerly restricted to the most advanced recording facilities. From audio mixing to audio restoration, effects processing to synthesizing, the enthusiast can readily and affordably mix and master 24-bit recordings. A new era in empowerment for the masses and a return to the DIY ethic of the garage band era? Perhaps, but not so fast.
When recordings created using this newly democratized technology are transferred to the CD and play is pressed, something inexplicable happens. The much-hyped -- and expected -- CD clarity is not only absent, but worse, the new CD sounds harsh and unpleasant. What is the cause of this distortion, so potentially frustrating to budding Phil Spectors?
Called quantization error or noise, this distortion stems from three separate factors: dynamic range, storage capacity, and system bandwidth -- and it brings into sharp relief one of the main limitations of digital signal processing.
Before these factors are explored, it's worth detouring for a brief description of digital signal processing. When the analog sound signal enters the digital sound system and is sampled, the newly created analog sample enters the analog-to-digital converter (ADC). There, the amplitude of the analog voltage representing the sound is converted into binary code. As an example, if the amplitude of the sound has an analog value of 2 with respect to the ADC scale, it would be assigned a digital -- or binary -- value of 00000010. The same process, only in reverse, occurs in digital-to-analog converters (DACs), such as the ones in CD players, where the digitized recording is turned back into sound. Once the sound is represented by binary numbers, or digitized, the values of the sound can enter digital systems such as computers, or be encoded onto digital storage media such as CDs.
These binary values that have been assigned to the analog samples' varying amplitudes are called quantization levels. Since there are a limited number of levels available in digital systems, the ADC must assign a level close to the amplitude of the received analog sample. It does this by assigning the next level up or down, depending on the system. What this means is that the output of the converter is not identical to the input. Rather, as an inevitable consequence of attempting to represent a continuous signal such as an analog sound wave by a set of discrete numbers such as quantization levels, it is merely the closest fit. This imperfect fit, or the difference between the real value of the amplitude and the quantization level assigned, creates quantization error.
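The idea can be sketched in a few lines of Python. This is only an illustration, not a description of how any particular converter is built: the function name `quantize`, the whole-number level spacing, the three rounding modes, and the sample value 2.7 are all assumptions chosen for the example.

```python
import math

def quantize(sample: float, mode: str = "nearest") -> int:
    """Assign a whole-number quantization level to an analog amplitude."""
    if mode == "nearest":
        return math.floor(sample + 0.5)
    if mode == "down":
        return math.floor(sample)
    if mode == "up":
        return math.ceil(sample)
    raise ValueError(f"unknown mode: {mode}")

sample = 2.7  # hypothetical analog amplitude, in units of one level
for mode in ("nearest", "down", "up"):
    level = quantize(sample, mode)
    code = format(level, "08b")   # 8-bit binary word, like 00000010 for a value of 2
    error = level - sample        # quantization error for this sample
    print(f"{mode:>7}: level {level} -> {code}, quantization error {error:+.2f}")
```

Whichever rule the system uses, the error is simply whatever is left over between the level that was assigned and the amplitude that actually arrived.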
The number of quantization levels is determined by the memory of the digital system, that is, by the number of bits available for each sample; the greater the number of bits, the greater the number of levels. A sample with 16 bits (CD, for example) allows for 2 to the 16th power, or 65,536, quantization levels. In fact, the number of quantization levels doubles with the addition of each extra binary digit. The more bits, the more levels, and the more finely the amplitude can be digitized. Hence, the closer the final sound is to the original. If you could have an infinite number of quantization levels, quantization error would be reduced to zero. However, as we noted previously, the inherent limits of digital systems make quantization error -- and its attendant distortion -- an unpleasant reality.
The first of these limiting factors, dynamic range, is the ratio of the maximum to minimum sound signal amplitude that can be input and then sampled by the system. If the signal falls outside this range -- meaning the sound's amplitude is too high or low -- it cannot enter the system and is lost. DSP's dynamic range depends upon the number of bits available, and is therefore limited by the second limiting factor -- storage capacity.
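The link between wordlength, level count, and dynamic range can be checked with a little arithmetic. The sketch below assumes the usual idealization that dynamic range is the ratio of the full scale to a single quantization step, expressed in decibels; the function names are made up for the example.

```python
import math

def quantization_levels(bits: int) -> int:
    """Number of distinct levels available with the given wordlength."""
    return 2 ** bits

def dynamic_range_db(bits: int) -> float:
    """Idealized dynamic range: full scale relative to one step, in dB."""
    return 20 * math.log10(quantization_levels(bits))

for bits in (8, 16, 24):
    print(f"{bits:2d} bits: {quantization_levels(bits):>10,} levels, "
          f"about {dynamic_range_db(bits):.1f} dB")
```

Under this idealization each extra bit doubles the level count and adds roughly 6 dB, which puts 16-bit CD at about 96 dB.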
Digital processing systems require huge amounts of data. A CD can contain about 700 megabytes of information. And using digital means using applications that perform mathematical operations on the digital audio signal -- operations which invariably increase wordlength still further. This would be fine if the hardware used in the process had unlimited storage, but even with today's relatively affordable memory, this is still not the case. So the norm is that wordlength must be significantly decreased. Many digital-to-analog processors use 24 bits, but if the final destination for the recording is CD, it is still limited to 16 bits.
Recall the harsh and unpleasant sound encountered by our enthusiast earlier, where the final recording on CD sounds drastically inferior to the original? This occurs because the 24-bit sample must be shortened to 16 bits, which is done by requantizing. In its crudest form, known as truncation, this simply chops off the 8 extra bits. If the 8 bits are removed by truncation alone, important information is lost and harmonic distortions are invariably created.
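A minimal sketch of that bit-chopping, assuming samples are held as signed integers and using a made-up test tone whose peak is smaller than one 16-bit step:

```python
import math

def truncate_24_to_16(sample_24: int) -> int:
    """Drop the 8 least significant bits of a signed 24-bit sample."""
    return sample_24 >> 8  # arithmetic shift: always rounds toward minus infinity

# Hypothetical quiet sine wave: a peak of 200 on the 24-bit scale, which is
# less than one 16-bit step (256 on the 24-bit scale).
tone_24 = [round(200 * math.sin(2 * math.pi * n / 16)) for n in range(16)]
tone_16 = [truncate_24_to_16(s) for s in tone_24]

print("24-bit:", tone_24)
print("16-bit:", tone_16)
```

The smooth sine comes out as a two-level waveform that flips between 0 and -1 in step with the input. Because the error follows the signal this closely, it shows up as added harmonics rather than as a steady hiss.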
Even with sufficient dynamic range and storage, bandwidth is the ultimate bottleneck. This is because when the digitally stored sound is transferred to another system, the transferable amount is limited by the available bandwidth. It was a comparatively low bandwidth that prevented digital video from being available on television for so many years. Another example is the limited capacity of the reading mechanisms of digital audio tape (DAT) or CD players.
These three factors all contribute to quantization error or noise. It should be noted that this noise differs from normal noise because the quantization error correlates with the original analog sound signal and, therefore, actual harmonic distortions are produced. Normal noise is uncorrelated with the signal, and hence spread throughout the spectrum. Even worse, quantization error is very noticeable since this distortion process is not monotonic (i.e., the distortion does not decrease with decreasing signal level), and so is easily perceived.
The fact that CD has a reputation for clarity lies with a process called dithering. Developed in the mid-'70s, dithering basically changes the harmonic distortion of quantization error to normal noise by de-correlating the quantization error from the analog sound signal. Dithering achieves this by adding noise -- actually a pseudo-random error signal -- to the sample. At first this seems strange, adding noise to reduce noise, but in the not-always-intuitive world of digital audio, it does the trick.
For the sake of simplification, assume for this example that the quantization levels are whole numbers. If the original analog sound has an amplitude of 8.8, only the quantization levels of 8 or 9 can be assigned as the analog sample is digitized by the ADC. Whether the lower or the higher level is chosen depends on the DSP, regardless of how close the sample actually is to either level. For instance, even though 8.8 is closer to 9, if the system is set up to choose the next lowest level, 8 will be assigned.
Dither adds a pseudo-random error signal to the sample. The dither signal's amplitude matches the available quantization levels, so the addition of this dither now provides a 20% chance that the value will be 8 and an 80% chance that the value will be 9. If a sufficient number of samples have been created in this system, the result at the DAC will be an averaged analog output with the amplitude of 8.8. In this way, dithering spreads the quantization error more evenly throughout the spectrum. The signal is, of course, noisier and the dynamic range is reduced (adding noise increases the noise floor, decreasing dynamic range), but the horrible harmonic distortion is significantly reduced.
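The 8.8 example is easy to try out. The sketch below assumes a system that always picks the next lowest whole-number level, and it uses the simplest possible dither: a uniformly distributed random value exactly one level wide. The function names, the seed, and the sample count are arbitrary choices for the illustration; practical dither generators often use other distributions, such as triangular.

```python
import math
import random

def quantize_down(sample: float) -> int:
    """Undithered quantizer that always picks the next lowest level."""
    return math.floor(sample)

def dithered_quantize(sample: float, rng: random.Random) -> int:
    """Add one level's worth of uniform random dither, then quantize down."""
    dither = rng.random()  # uniform in [0, 1)
    return math.floor(sample + dither)

rng = random.Random(2003)  # fixed seed so the run is repeatable
sample = 8.8
count = 100_000

outputs = [dithered_quantize(sample, rng) for _ in range(count)]
share_of_nines = outputs.count(9) / count
average = sum(outputs) / count

print("without dither:", quantize_down(sample))
print(f"with dither: {share_of_nines:.1%} nines, average {average:.3f}")
```

Without dither every sample lands on 8 and the missing 0.8 is lost for good. With dither each individual sample is still only 8 or 9, but the mix of the two averages out very close to 8.8.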
To further improve the recording's quality, the newly created dither noise can be shifted to frequencies where the human ear is least sensitive. This is achieved by a process called noise shaping, in which the dither signal's spectrum is effectively moved away from frequencies of, say, 3kHz -- where hearing is most sensitive -- to high frequencies such as 12kHz or even 20kHz.
Noise shaping uses an error filter within a feedback loop. The error signal -- which is the difference between the original input and the quantized output -- is filtered and then fed back to the input of the quantizer. Only the error signal is affected, leaving the original input alone. The filter shapes the noise, moving it to frequencies where hearing is least sensitive. Noise shaping does not reduce the total noise of the recording; it only makes it less audible.
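A bare-bones version of such a loop can be sketched as follows. It assumes the simplest possible error filter, a one-sample delay, which pushes noise toward high frequencies in general rather than following a curve of the ear's sensitivity; real mastering tools use more elaborate filters. The triangular dither, the test tone, and the helper names are likewise assumptions made for the illustration.

```python
import math
import random

def requantize(samples, shaped, seed=1):
    """Round samples to whole-number levels using triangular dither.

    With shaped=True, the error made on the previous sample is subtracted
    from the next input (first-order error feedback).
    """
    rng = random.Random(seed)
    previous_error = 0.0
    output = []
    for x in samples:
        target = x - previous_error if shaped else x
        dither = rng.random() - rng.random()        # triangular, within (-1, 1)
        level = math.floor(target + dither + 0.5)   # round to the nearest level
        previous_error = level - target             # error made on this sample
        output.append(level)
    return output

def power(values):
    """Mean square of a sequence."""
    return sum(v * v for v in values) / len(values)

def low_band_power(errors, block=100):
    """Crude low-frequency measure: power of the error averaged over blocks."""
    means = [sum(errors[i:i + block]) / block
             for i in range(0, len(errors) - block + 1, block)]
    return power(means)

# Hypothetical slow sine wave, a few levels tall.
signal = [3.0 * math.sin(2 * math.pi * n / 500) for n in range(10_000)]
plain_error = [y - x for x, y in zip(signal, requantize(signal, shaped=False))]
shaped_error = [y - x for x, y in zip(signal, requantize(signal, shaped=True))]

print(f"total noise power: plain {power(plain_error):.3f}, "
      f"shaped {power(shaped_error):.3f}")
print(f"low-band noise power: plain {low_band_power(plain_error):.5f}, "
      f"shaped {low_band_power(shaped_error):.5f}")
```

In this sketch the shaped version carries more noise overall but far less of it at low frequencies, which is the trade described above: the noise is moved rather than removed.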
Applications of dithering and noise shaping
According to Thierry Heeb of Anagram Technologies, quantizing, dithering, and noise shaping have two main applications: in data converters (ADCs or DACs) and in music-production tools. To illustrate, Thierry offers the following:
"In modern DACs, the incoming 24-bit signal is first upsampled to high speed," he says, "then, noise shaping with a dither quantizer is used to reduce the wordlength from 24 to something like 5 or 6 bits. (Modern D/A converters usually use a 5- or 6-bit converter.) Due to oversampling, the noise-shaping process allows us to move the quantization noise due to the reduction (24 to 5 or 6 bits) into a part of the spectrum which is far above our listening capabilities."
As he explains, this method allows the use of a low-bit DAC (5 or 6 bits like, for instance, in the AD1853 D/A converter chip, but almost all D/As nowadays use such a structure) to convert the signal to the analog domain.
"This technique is almost universally used because designing 'linear' (R-2R style) converters with over 20 bits of resolution is impossible with today's technology."
Music-production tools such as professional audio software represent the second application to use dithering and noise shaping. Thierry says, "In this case, noise shaping with psycho-acoustic shaping is used to bring a signal from, say, 24 bits down to 16 bits. In this case there is no oversampling, so the quantization noise cannot be moved out of the audio frequency band. Instead, a psycho-acoustically weighted noise-shaping filter is used to remove quantization noise from the bands of frequency where the human ear is most sensitive. These techniques are also used for other format changes of audio."
The benefits, the performance improvements, and the decreasing costs of digital audio systems ensure they are here to stay. However, digital systems require a lot of expensive processing power and long-word storage space to preserve resolution. Thierry Heeb believes it is highly probable that future digital media will be able to carry more than 16 bits. "For instance, DVD-A, with its 24-bit capabilities, is already an example of such a medium," he says. "In the same sense, SACD with its 1-bit DSD coding has an apparent resolution of more than 16 bits in the audio band." But, Thierry cautions, there is another, related question: "Which consumers have a playback system allowing for better than 16-bit resolution at all stages? What is the need for a super-potent digital storage medium with more than 16-bit resolution if the rest of the audio chain is not able to reproduce the whole resolution?"
Given these factors, Thierry concludes that the need for higher-resolution digital storage media "is probably more a marketing issue of the major audio and software manufacturers or a toy for high-end-audio enthusiasts" than a real need of the mass market.
"One also has to remember that Philips and Sony will soon lose their royalties on the CD," Thierry points out. "So they need to push another digital medium forward -- they did invent the SACD."
Copyright © 2003 SoundStage!
All Rights Reserved