Back Issue Article
Binary Boggles or Digital Druthers: An Inquiry into D/A Conversion
If you're male, you're wired to compete. With audio and cars, this manifests as specmanship. Digital audio right now offers plenty of opportunities to indulge in this silly pastime. There's upsampling, oversampling, and re-sampling. There's interpolation, output density and varying word lengths from 16 to 24 bits. Sample-rate conversion offers 96kHz, 192kHz and 705.6kHz rates. What's it all mean?
Do you secretly feel like a jitterbug, oscillating hither and dither over whether your CD player has what it takes? Is your sampling up to snuff or over the hill? Should you involve Interpol to engage in proper digital detective work and recover the missing pieces? In short, dare you ponder that most vexing of male insecurities -- is your output long enough?
Your digital self-confidence might be fundamentally threatened by now. How to perform a reality check without glancing sideways in the public loo to check out the next fellow's equipment? Not being a techie myself, I contacted some folks who are. Participants included Gilles Gameiro of Birdland Audio, manufacturer of the Odéon-lite DAC soon to be reviewed by SoundStage!; Kevin Halverson of Muse Audio, mastering central for Cardas, Chesky and Classic Records and manufacturer of highly regarded digital gear; Jeffrey Kalt of Resolution Audio, whose CD55 has also been submitted for review and features internal upsampling to 705.6kHz; Peter Qvortrup of Audio Note Great Britain, who abolishes oversampling altogether; and John Stronczer of Bel Canto Design, whose very popular DAC1 is garnering review honors left and right. These gents lent much-needed technical insight to help portray, in simplified form, what the much-touted upsampling in digital audio is all about.
Basic digital architecture
Each modern DAC, regardless of the number of internal chips, is composed of four major parts. There's a receiver or S/PDIF decoder, possibly followed by a reclocking circuit; a digital filter; a digital-to-analog converter; and an analog low-pass-filter output stage. The receiver accepts a stream of information coming from a transport (through an S/PDIF, AES/EBU, or TosLink interface cable) and decodes it into audio data at the sampling frequency. Receiver stages may also have a high-stability PLL or local clock source to reduce the jitter introduced by digital cables and circuitry. The digital filter is a data process by which decoded audio data from the receiver are manipulated and transformed before going to the D-to-A conversion stage, where the datastream is converted into the analog signal it is supposed to represent. The analog low-pass filter after the DAC removes residual noise from the digital conversion, called imaging.
Oversampling, upsampling and interpolation are terms that are freely interchangeable, despite certain conventions that have taken hold to suggest the opposite. The Red Book standard for CD established a sampling rate of 44.1kHz. Any time this sample rate is changed up or down, a sample-rate conversion has occurred. If the sample rate is increased, the data has been upconverted or upsampled. If the sample rate is decreased, the data has been downconverted or downsampled. The term oversampling is typically used when the upconversion is at integer multiples (i.e., 2x, 4x, 8x, 16x, etc.), while upsampling most often relates to 96kHz or 192kHz sample rates that are non-integer multiples of the base 44.1kHz sampling frequency. However, the basic mechanism is identical and was first introduced with the second generation of CD players. Oversampling was adopted to cut down the cost, complexity and concomitant sonic degradation of the very sharp ninth-order or higher "brickwall" filters that first-generation CD players needed to eliminate the image noise folded around the 44.1kHz sampling frequency. For a brief flashback into history, it was mathematician Augustin-Louis Cauchy who first proposed the sampling theorem in 1841. In 1928, Harry Nyquist restated it in the context of telegraph transmission, and Claude Shannon published a formal proof in 1949. CD players have been using oversampling since they were first released in the early 1980s. Oversampling and upsampling, though often advertised as recent breakthroughs, have thus been with us for a while.
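Whether a given rate counts as integer-multiple "oversampling" or fractional "upsampling" is just arithmetic. Here's a quick Python sketch; the rates are from the text, the `rate_ratio` helper is purely illustrative:

```python
from fractions import Fraction

def rate_ratio(target_hz: int, source_hz: int = 44100) -> Fraction:
    """Express a sample-rate conversion as an exact interpolate/decimate ratio."""
    return Fraction(target_hz, source_hz)

# Integer-multiple rates reduce to a whole number...
print(rate_ratio(88200))   # 2       (2x oversampling)
print(rate_ratio(705600))  # 16      (16x oversampling)
# ...while rates from the 48kHz family need a fractional ratio:
print(rate_ratio(96000))   # 320/147 (interpolate by 320, decimate by 147)
print(rate_ratio(192000))  # 640/147
```

The 320/147 ratio is why converting 44.1kHz to 96kHz involves decimation as well as interpolation, a point that becomes relevant further down.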
Upsampling requires the calculation of intermediate data points. This is interpolation. Downsampling requires the deletion of existing samples. This is decimation. Through a combination of "interpolating up" and "decimating down" of the original 44.1kHz sample frequency, any desired output sample frequency can be achieved. Those interested can refer to an Analog Devices white paper on their AD1890 sample rate converter IC: www.analog.com/publications/whitepapers/products/AD1890.html.
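As a toy illustration of "interpolating up" and "decimating down" -- using naive linear interpolation rather than the long FIR filters real converters employ -- consider this Python sketch (both helpers are mine, for illustration only):

```python
def interpolate_2x(samples):
    """Crude 2x upsampling: each new intermediate point is the average of
    its neighbors. Real digital filters compute these points with long
    FIR kernels, but the principle is the same."""
    out = []
    for a, b in zip(samples, samples[1:]):
        out.append(a)
        out.append((a + b) / 2)  # calculated intermediate data point
    out.append(samples[-1])
    return out

def decimate_2x(samples):
    """Crude 2x downsampling: delete every other sample.
    (A real decimator low-pass filters first to prevent aliasing.)"""
    return samples[::2]

src = [0.0, 1.0, 0.0, -1.0]
up = interpolate_2x(src)
print(up)                 # [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0]
print(decimate_2x(up))    # [0.0, 1.0, 0.0, -1.0] -- the original data back
```

Note that decimating the interpolated stream recovers exactly the original samples: the extra "landmarks" added no new information.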
To get to the bottom of the sampling issue, we need to remember that digital audio, as an encode/storage/decode sampled system, operates within a deliberately limited bandwidth. Sampling is a characteristic function of all such systems -- regardless of single- or multi-bit architecture -- and is defined by its sampling rate, commonly expressed as a number of samples per second. As bandwidth-limited systems, they can contain no energy at frequencies equal to or greater than half the sampling rate. This is often referred to as the Nyquist theorem and can be remembered as "zero permissible energy above the cut-off rate." Musical signals, unlike sine waves, contain a profusion of high-frequency spectral components called harmonics or overtones that extend well past this cut-off frequency. In order to satisfy the requirement for a strictly limited bandwidth, a properly designed encoder or A/D converter requires a mechanism that eliminates those components exceeding the limit. This "rev limiter" is the anti-aliasing low-pass filter mentioned earlier and ensures that no energy at or above half the sampling frequency enters the encoder. Without it, ghost images or aliases of the original signal are folded back down in frequency, mirrored around the Nyquist point. If, for example, a 43.1kHz component were allowed to enter our 44.1kHz sampling system, we'd see a 1kHz alias (44.1kHz - 43.1kHz) in the DAC's output that is pure distortion and very audible. Since the Nyquist point for CD is 22.05kHz, no energy above this limit should enter the encoder or exit the decoder. Because true "brickwall" filters don't exist in reality, recording engineers have long since used higher sampling rates to push aliasing artifacts far out into the supersonic stratosphere. This increases the anti-aliasing filter's effectiveness at removing this distortion.
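The folding behavior is easy to compute. Here's a small Python sketch of where an out-of-band component lands after sampling; the `alias_frequency` helper is mine, not anything from the designers quoted here:

```python
def alias_frequency(f_hz: float, fs_hz: float = 44100.0) -> float:
    """Frequency at which an input component appears after sampling at fs_hz.
    Components repeat every fs_hz and mirror around the Nyquist point (fs/2)."""
    f = f_hz % fs_hz                          # aliases repeat at multiples of fs
    return fs_hz - f if f > fs_hz / 2 else f  # fold back around Nyquist

print(alias_frequency(43100))  # 1000.0  -- 43.1kHz folds to a very audible 1kHz
print(alias_frequency(23050))  # 21050.0 -- 1kHz past Nyquist folds to 1kHz below it
print(alias_frequency(1000))   # 1000.0  -- in-band frequencies pass unchanged
```

This is exactly why the anti-aliasing filter must do its work before the encoder: once a component has folded into the audio band, nothing downstream can tell it apart from music.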
Before mastering, the recorded data must be downsampled back to 44.1kHz, incurring a loss of resolution that can never be fully recovered within the existing 16/44.1 CD format.
Aliases or ghost images don't restrict their ghastly spectral presence to the encoding half of the process. They also appear in the decoding half of the A/D/A process, where they're called images. They're like unintentional artificial "harmonics" that are identical in spectral content to the intended signal but mirrored around integer multiples of the sampling rate. A sine wave of 100Hz, for example, will create components at 44kHz and 44.2kHz (44.1kHz ± 100Hz), another pair at 88.1kHz and 88.3kHz, more around 132.3kHz, etc. To remove this distortion, the equivalent of the anti-aliasing filter, called the anti-imaging or reconstruction filter, is needed. Because of the earlier-mentioned complexity and inherent cost of truly steep high-order filters, the most common anti-imaging filters employed after D/A conversion use first- to sixth-order slopes. Each order adds 6dB of attenuation per octave above the filter's cut-off. A first-order filter only provides roughly 50% suppression of the first image, passing the other 50% as distortion. A sixth-order low-pass filter will attenuate the same image by 36dB -- 6dB times six orders. That leaves the image at about 1.58% of its original amplitude: much better, but still considerable distortion.
By inserting an oversampling or upsampling filter before D/A conversion, these imaging products are pushed up in frequency to enhance the efficiency of the low-pass reconstruction filter. Doubling the sample rate moves the first image up a full octave, buying each filter order another 6dB of attenuation. Our earlier sixth-order filter gains 36dB and now reduces the first image to roughly 0.025% of its original strength.
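The arithmetic behind these image-suppression figures is the familiar 6dB-per-octave-per-order rule of thumb, which a few lines of Python can verify (the `image_residue` helper is illustrative, not any manufacturer's formula):

```python
def image_residue(order: int, octaves_above_cutoff: float = 1.0) -> float:
    """Fraction of an image's amplitude that survives an n-th order low-pass
    filter, using the 6dB-per-octave-per-order rule of thumb."""
    attenuation_db = 6.0 * order * octaves_above_cutoff
    return 10 ** (-attenuation_db / 20.0)  # convert dB back to an amplitude ratio

print(f"{image_residue(1):.1%}")       # 50.1% survives a first-order filter
print(f"{image_residue(6):.2%}")       # 1.58% survives a sixth-order filter
print(f"{image_residue(6, 2.0):.3%}")  # 0.025% once 2x oversampling adds an octave
```

The point of the exercise: oversampling doesn't make the analog filter steeper; it simply gives the same filter more octaves of headroom to work with.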
At this point, it becomes clear that upconverting digital data increases the effectiveness of the digital and analog filters. By transferring the encoding aliases and decoding images out-of-band into the ultrasonic region, the filters responsible for blocking this distortion can employ gentler slopes that induce less phase shift and fewer time-domain errors and leave less of their signature on the passing signal. It's a mistake to think of upsampling in terms of a bigger engine that makes your car go faster because it redlines at higher RPMs. More appropriately, one might think of it in terms of high-octane gas -- it causes less wear and tear on your engine, produces a cleaner burn and improves the ride. But it's not a go-faster panacea.
To repeat this important point: Upsampling does not add any data that wasn't present on the original recording. Upsampling does involve the interpolation of intermediate data points, but it doesn't really result in the creation of new information. One still gets from point A to point B in the same amount of time, over virtually the same path, using the same means of transportation. The only difference is that in the second case one has a few more landmarks to be guided by. These landmarks don't create a new path or necessarily make the old path any more accurate.
Different ad campaigns tout different upsampling rates that may make you wonder which is superior. Theoretically, upsampling to a higher frequency that's a simple multiple of 44.1kHz is better than upsampling to rates like 96kHz because it doesn't require any data decimation. However, as John Stronczer was quick to point out, the computational horsepower available in today's chips and their attendant very sophisticated algorithms render this a moot point. For quick confirmation, Analog Devices' new AD1896 is a 192kHz sample-rate converter with a THD+N below -120dB for any ratio of input to output sampling rates, from 44.1kHz to 192kHz. Its dynamic range is given as 139dB. This means that any residue from the conversion process is well below the noise floor of any A/D or D/A converter. In fact, the residue from the sample-rate conversion is far below the thermal noise floor of the regular analog resistors, capacitors and coils that are staple ingredients of every CD player or DAC in existence.
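To put that -120dB figure in perspective, compare it against the theoretical quantization noise floor of 16-bit audio (the usual rule of thumb: about 6.02dB per bit plus 1.76dB). A back-of-the-envelope sketch in Python, using that rule and the datasheet figure quoted above:

```python
def db_to_ratio(db: float) -> float:
    """Convert a decibel value to an amplitude ratio."""
    return 10 ** (db / 20.0)

floor_16bit_db = -(6.02 * 16 + 1.76)  # 16-bit quantization floor, about -98dB
src_residue_db = -120.0               # sample-rate-converter THD+N from the text

relative = db_to_ratio(src_residue_db) / db_to_ratio(floor_16bit_db)
print(f"16-bit noise floor: {floor_16bit_db:.1f}dB")
print(f"SRC residue sits at {relative:.3f}x the 16-bit floor")  # well under a tenth
```

In other words, the conversion residue is buried an order of magnitude below noise that the CD format carries anyway.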
To wrap up, oversampling is not used to create or restore lost information. It is used to simplify the analog reconstruction filter. By raising the frequency of the ghost images that are purely artifacts of the D/A-conversion process, one can avoid the brickwall filters that plagued the early CD players. The designer's goal is to use a fortuitous combination of digital filtering and analog filtering to achieve the best conversion possible for the original datastream. The reason why external upsamplers can make an apparent improvement when inserted into a standard digital chain is, according to all parties queried, that they are being paired with devices that could have done a better job with their own internal digital filter, D/A conversion, and analog output designs. If the devil is in the details, execution -- preferably with garlic and a cross -- is what matters. Unfortunately, the importance of filter choice and filter design in the D/A-conversion process has been overshadowed by the numbers war played out in the glossy mags. Numbers, without really explaining anything, seem to tell a story all their own and win converts purely on appearance -- 96kHz must be better than 88.2kHz, which in itself obviously is much superior to 44.1kHz, never mind 24-bit, which absolutely kills anything less.
We also must acknowledge that certain designers bypass the entire oversampling process and claim that the pre- and post-echo ringing that's intrinsic to oversampling filters is a greater evil than the imaging products left in the datastream without them. Companies that use non-oversampling techniques include Audio Note, 47 Laboratory, Zanden, and Illunden. A recent review in Hi-Fi News of Audio Note's top-line DAC 5 reported superlative sonics, while the measurements, as predicted, showed more distortion products than actual signal.
More spider webs in the belfry
According to Resolution Audio, another very common misconception -- call it wishful thinking -- is that by creating a 24-bit output from a 16-bit input, the dynamic range of a given recording can somehow be increased. This is as transparent a myth as the emperor's new clothes. Whatever information is contained within the original recording represents the best that can be achieved in playback. If the recording was mastered with 16-bit-resolution dynamic range, that's all the dynamic range you'll get during playback. Dither can be used to extend resolution below the least significant bit during the recording process. This is done by changing the nature of the process from a strict quantization function to that of a pseudo-pulse-width modulator. Not surprisingly, the side effects are an increase in high-frequency noise and total noise-floor power, since dither, after all, is random noise uncorrelated to the signal. If chosen properly, however, the dither component can result in a more spectrally benign noise floor. On the playback side of the encode/decode process, dither can be used to change the character of the quantization noise. By further attenuating non-signal-related garbage that's generated as an unavoidable byproduct of the A/D process, playback can be made more transparent to the original master tape. But when sample sizes are increased, such as going from 16 to 24 bits, dither does not extend signal resolution because the minimum step size, the least significant bit, has already been defined. Once again, there's no free lunch. You don't get something for nothing, even though certain marketing pundits will take countless shekels to sell you exactly that -- empty promises. To repeat this point ad nauseam, Muse offers a perfectly logical reminder: "For a given encoding process, resolution and bandwidth are governed by the encoding method employed. Therefore, they cannot be extended by any means on the decode side."
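A minimal Python experiment makes the recording-side trade-off concrete: without dither, a signal sitting below one least-significant bit quantizes to dead silence; with TPDF dither, its average level survives, buried in benign noise. (This toy quantizer is mine, purely illustrative, and treats 1 LSB as 1.0.)

```python
import random

def quantize(x: float) -> int:
    """Round to the nearest quantization step (1 LSB = 1.0)."""
    return round(x)

random.seed(0)
signal = 0.3  # a steady level below one LSB

# Without dither, the quantizer snaps every sample to zero -- silence.
undithered = [quantize(signal) for _ in range(100_000)]

# With TPDF dither (sum of two +/-0.5 LSB rectangular dithers), the
# signal's average level survives as part of a random noise floor.
dithered = [quantize(signal + random.uniform(-0.5, 0.5) + random.uniform(-0.5, 0.5))
            for _ in range(100_000)]

print(sum(undithered) / len(undithered))  # 0.0 -- sub-LSB detail lost entirely
print(sum(dithered) / len(dithered))      # roughly 0.3 -- recovered on average
```

The catch, as the paragraph above notes, is visible in the dithered stream itself: the individual samples now jump randomly between -1, 0 and 1, which is precisely the added noise-floor power that dither trades for its extra resolution.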
For more data density and higher resolution, look toward SACD and DVD-A. The CD format is maxed out. The new processors "merely" use significantly accelerated computational powers to minimize the transmission of anything other than the raw signal. They don't give us more 44.1kHz signal. They give us less conversion garbage.
Where's my gun?
But our male hardwiring is bound to fool us time and again. We'll override what our ears so clearly tell us with what we think we should be hearing. This should itself is fostered by clever marketing propaganda, status considerations, money spent and all the associated preconceptions that come with it. Unlike the hearing of our female companions, for whom specs are a lot less attractive -- they call them mumbo-jumbo if they get fed up -- our competitive programming easily influences our aural perceptions. If you can remember this bit of Men Are from Mars pop-psych wisdom next time you read some technobabble, you'll simultaneously discard a bunch of useless boulders from your musical traveling bags. Whether you can leave them on the ground for long depends on how much you enjoy carrying excess baggage. I can say that because I'm a male myself. Watch my cheeks fall asleep next time I'm hanging in the loo studying designer specs. Could it be that we, the supposedly superior sex, are terribly miswired after all? Should we operate on ourselves and rip out some connections? Heck, let's see what happens. Where's my soldering gun? If this becomes my last column ever, you know my lobotomy was successful.
Copyright © 2000 SoundStage!
All Rights Reserved