Today’s DACs, with a few very rare (and expensive) exceptions, all use a process called Sigma-Delta Modulation (SDM, sometimes also written DSM) to generate their output signal. A simplistic way to look at SDM DACs is to visualize them as up-converting (or ‘upsampling’) the incoming signal to a massively high sample rate – sometimes 64, 128 or 256 times 44.1kHz, but often higher than that – and taking advantage of the ability to use a more benign analog filter at the output. In fact the bit depth is also reduced (usually to 1–3 bits) in order to simplify the process of digital-to-analog conversion at such ultra-high sample rates. That is a bit of an over-simplification, but for the purposes of the point I am trying to make today, it is good enough.
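To make the principle concrete, here is a minimal sketch, in Python, of a first-order sigma-delta modulator reduced to a 1-bit output. This is a toy illustration only – real DAC chips use higher-order loops, multi-bit quantizers and proprietary refinements that their makers do not disclose – but it shows the core idea: the local average of the 1-bit stream tracks the input, while the quantization noise is pushed up to high frequencies.

```python
def sigma_delta_1bit(samples):
    """Convert samples in [-1.0, 1.0] to a +1/-1 bitstream.

    First-order loop: the integrator accumulates the error between
    the input and the fed-back quantized output; the quantizer is a
    simple sign comparison (noise shaping in its crudest form).
    """
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        integrator += x - feedback
        feedback = 1.0 if integrator >= 0.0 else -1.0
        bits.append(feedback)
    return bits

# The running average of the 1-bit stream approximates the input:
bits = sigma_delta_1bit([0.5] * 1000)
average = sum(bits) / len(bits)  # close to 0.5
```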
Doing such high-order up-conversion utilizes a great deal of processing power, and the provision of that processing power adds cost. Additionally, the manufacturers of the most commonly used DAC chipsets give away very little about their internal architectures, and don’t disclose the most significant details behind their approaches. Many DAC manufacturers are therefore quite coy about how their product functions, and this coyness is often expressed through cavalier usage of the terms ‘upsampling’ and ‘oversampling’. Many of those manufacturers employ DAC chipsets with prodigious on-chip DSP capability (such as the well-known and widely used ESS Sabre 9018), and then fail to make full use of it in their implementations.
Let’s consider a hypothetical example. We’ll take a 44.1kHz audio stream that our DAC chip needs to upsample by a factor of 64 to 2.8224MHz, before passing it through its SDM. The best way to do this would be to use a no-holds-barred high-performance Sample Rate Converter (SRC). However, there are some quite simple alternatives, the simplest of which would be to just repeat each of the original 44.1kHz samples 64 times until the next sample comes along (a process sometimes called a zero-order hold). What this does is to encode, in fine detail, the “stairstep” representation of digital audio we often have in mind. (Personally, I would refer to this as oversampling rather than upsampling, but marketing types don’t tend to listen to engineers!)
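The zero-order hold really is as trivial as it sounds. A one-line sketch in Python (with NumPy; the function name is my own) just repeats each input sample 64 times:

```python
import numpy as np

def zero_order_hold(samples, factor=64):
    """'Upsample' by simply repeating each sample `factor` times.

    For a 44.1kHz input and factor=64 this produces the stairstep
    waveform at 2.8224MHz: no interpolation, no filtering.
    """
    return np.repeat(np.asarray(samples, dtype=float), factor)

x = np.array([0.0, 1.0, -0.5])
y = zero_order_hold(x, factor=4)
# y: [0. 0. 0. 0. 1. 1. 1. 1. -0.5 -0.5 -0.5 -0.5]
```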
If we are going to use this approach, though, it comes with consequences. As mentioned, it results in the accurate recreation of the stairstep waveform at the output of the DAC. The effect of this stairstep is to add spurious image frequencies – distortion, in effect – to the analog output waveform. Fortunately, these images all lie at frequencies above 22.05kHz, where no original audio data was encoded in the first place. Unfortunately, the lowest of them sit only just above the audio band, so the analog output filter requires a brick-wall response to strip them out without also cutting into the music, which means that it is not so ‘benign’ any more.
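This is easy to demonstrate numerically. The sketch below (Python/NumPy; the parameter choices are mine, and I use a factor of 8 rather than 64 simply to keep the FFT small) pushes a 1kHz tone through a zero-order hold and inspects the resulting spectrum – a clear image appears at 44.1kHz − 1kHz = 43.1kHz, just above the audio band:

```python
import numpy as np

fs, factor, f0 = 44_100, 8, 1_000.0   # input rate, hold factor, test tone
n = 4_410                             # 0.1s: an exact number of tone cycles
t = np.arange(n) / fs
tone = np.sin(2 * np.pi * f0 * t)

held = np.repeat(tone, factor)        # zero-order hold to 352.8kHz
spectrum = np.abs(np.fft.rfft(held)) / len(held)
freqs = np.fft.rfftfreq(len(held), d=1.0 / (fs * factor))

tone_level = spectrum[np.argmin(np.abs(freqs - f0))]
image_level = spectrum[np.argmin(np.abs(freqs - (fs - f0)))]  # 43.1kHz image
# image_level comes out at a few percent of tone_level: small, but
# sitting right where a gentle analog filter cannot remove it without
# touching the top of the audio band.
```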
So, instead of our DAC applying a zero-order hold to the incoming 44.1kHz waveform, suppose it uses a high-quality SRC algorithm to properly upsample it. Such algorithms incorporate digital filters to remove the image (alias) signals which appear above the Nyquist frequency of the incoming audio stream. The result is a clean signal that we can pass into the SDM, and which will be precisely regenerated, without any stairstep, at the DAC’s output. A good upsampling algorithm will exhibit essentially no ultrasonic residue, so we no longer need an aggressive, sonically worrisome, analog brick-wall filter.
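Conceptually, this kind of SRC can be sketched in a few lines: insert zeros between the input samples (which exposes the spectral images), then apply a digital lowpass filter with its cutoff at the original Nyquist frequency (which removes them). The Python sketch below uses a simple Hamming-windowed-sinc filter purely for illustration – a real high-quality SRC would use a far longer, far more carefully designed filter:

```python
import numpy as np

def upsample_src(x, L, taps=63):
    """Upsample by integer factor L: zero-stuff, then lowpass filter.

    The windowed-sinc filter cuts off at the original Nyquist
    frequency, suppressing the spectral images; what emerges is a
    smooth, interpolated waveform with no stairstep.
    """
    stuffed = np.zeros(len(x) * L)
    stuffed[::L] = np.asarray(x, dtype=float)
    n = np.arange(taps) - (taps - 1) / 2
    h = np.sinc(n / L) * np.hamming(taps)  # lowpass, cutoff = old Nyquist
    h *= L / h.sum()                       # restore unity passband gain
    return np.convolve(stuffed, h, mode='same')

# A slow sine upsampled 4x should land very close to the true waveform:
fs, L, f0 = 100, 4, 3.0
x = np.sin(2 * np.pi * f0 * np.arange(200) / fs)
y = upsample_src(x, L)
t_hi = np.arange(len(y)) / (fs * L)
```

In production code one would reach for a library routine such as SciPy’s `scipy.signal.resample_poly`, which implements the same zero-stuff-and-filter idea efficiently in polyphase form.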
Let’s take another look at these two scenarios. The first needed an aggressive analog brick-wall filter at the output, while the other in effect had the same brick-wall filter implemented digitally at an intermediate processing stage. If the two sound at all different, it can only be because the two filters sound different. Is this possible? In fact, yes it is. An analog filter has sonic characteristics that derive both from its design and from the sonic characteristics of the components from which it is constructed. The digital equivalent – if properly implemented – has sonic consequences arising only from its design. There is a further point, which is that digital filters can be designed to have certain characteristics which their analog counterparts cannot, but I’m not going into that here. The bottom line is that a diligent DAC designer ought to be able to achieve better sound with this ‘upsampling’ approach than with the previously discussed ‘oversampling’ approach (again, I must emphasize this is MY usage of those terminologies, which is not necessarily everybody else’s).
Using the ‘upsampling’ approach I have just described, it should make little difference whether you send your music to the DAC at its native sample rate, or whether you choose to upsample it first using your playback software’s built-in upsampler. However, that assumes that the upsampling algorithm used by the DAC is at least as good as the one used by your software. There is no guarantee that this will be so, but to be fair, most half-decent modern DACs do employ sophisticated upsamplers. If your playback software gives you a choice of upsampling algorithms then you can sometimes get to hear this for yourself. A few years back, specialist algorithms such as iZotope’s SRC were very popular for this purpose.
The bottom line here is that if your DAC is any good, you should expect it to sound better (or at least as good) with your music sent to it at its native sample rate than with it upsampled by your playback software – even if you are using iZotope or something similar. If it doesn’t, the difference is probably down to whose upsampling implementation is better. For some time now it has seemed to me that a good measure of a quality DAC is that it sounds better – or at least as good – with no upsampling applied by the playback software. (FWIW, this is how I use my own PS Audio DirectStream DAC.)