I have mentioned Sigma–Delta Modulators (SDMs) in this column before. These are, in effect, complex filter structures and they are used to produce DSD and other bitstreams. I know I talk about DSD a lot, and I also know that digital audio is more about PCM than it ever is, or ever will be, about DSD. But SDMs are widely used today to make both ADCs and DACs, and so cannot be ignored by anybody who really wants to understand digital audio. So I thought I would devote a column to an attempt to explain what SDMs are, how they work, and what their limitations are. This will be doubly taxing, because it is a deeply technical subject, and will make for a very long column. In fact, it will make for two very long columns! At the end of it all, I will conclude by attempting to place the results of my ramblings in the context of the PCM-vs-DSD debate, with perhaps a surprising result.
The words Sigma and Delta refer to two Greek letters, Σ and Δ, which are used by convention in mathematics to denote summation (Σ) and difference (Δ). Negative feedback, where the output signal is subtracted from the input signal, is a form of Delta modulation. Similarly, an unstable amplifier, where the output signal is added back into the input signal causing the whole thing to increase uncontrollably, is a form of Sigma modulation. Sigma–Delta Modulators work by combining those functions into a single complex structure. I use the term ‘structure’ intentionally, because SDMs can be implemented both in the analog domain (where they would be referred to as circuits) and in the digital domain (where they would be referred to as algorithms). In this context, analog and digital refer only to the inputs of the SDM, as an SDM’s output is always digital. For the remainder of this post I will refer only to digital SDMs, mainly because they are easier to describe. But you should read it all as being equally applicable to the analog case.
At the core of an SDM lies the basic concept of a negative feedback loop. This is where you take the output of the SDM and subtract it from its input. We’ll call that the Delta stage. If the output of the SDM is identical to its input, then the output of this Delta stage will always be zero. Between the Delta stage and the SDM output is a Sigma stage. A Sigma stage works by maintaining an accumulated value to which it adds every input value it receives. This accumulated value then becomes its output, and the output of the Sigma stage is also the output of the SDM itself. Therefore, so long as the output of the SDM remains identical to its input, the output of the Delta stage will always be zero. The Sigma stage will consequently keep adding zero to its accumulated value, and so its output will also remain unchanged. This is what we call the “steady-state case”.
But music is not steady-state. It is always changing. Let’s look at what happens when the input to the SDM increases slightly. This results in a small difference between the input and the output of the SDM. This difference appears at the output of the SDM’s Delta stage, and, consequently, at the input of its Sigma stage. This causes the output of the Sigma stage to increase slightly. The output of the Sigma stage is also the output of the SDM, and so the SDM’s output also increases slightly. Now, the output of the SDM is once more identical to its input. The same argument can be followed for a small decrease in the input to the SDM. The SDM as described here is basically a structure whose output follows its input. Which makes it a singularly useless construct.
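The follower behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a real modulator: the function name `follower_loop` is hypothetical, the accumulator is assumed to start at zero, and the Delta stage is assumed to subtract the output held over from the previous step.

```python
def follower_loop(inputs):
    """Delta stage followed by a Sigma stage, with no Quantizer.

    Hypothetical helper for illustration only.
    """
    accumulated = 0.0  # Sigma stage value; also the loop's output (assumed to start at 0)
    outputs = []
    for x in inputs:
        difference = x - accumulated  # Delta stage: input minus the current output
        accumulated += difference     # Sigma stage: add the difference to the running value
        outputs.append(accumulated)
    return outputs


if __name__ == "__main__":
    # Values chosen to be exactly representable in binary floating point.
    print(follower_loop([0.0, 0.25, 0.5, 0.5, 0.125]))
```

Whatever goes in comes straight back out, which is exactly why this version of the loop is of no practical use.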
So now we will modify the SDM described above in order to make it useful. What we will do is to place a Quantizer between the output of the Sigma stage and the output of the SDM, so that the output of the SDM is now the quantized output of the Sigma stage. This apparently minor change will have dramatic implications. For a start, this is what gives it its digital-only output. To illustrate this, we will take it to its logical extreme. Although we can choose to quantize the output to any bit depth we like, we will elect to quantize it to 1-bit, which means the output can only take on one of two values. We’ll call these +1 and -1 (although we will represent them digitally using the binary digits 1 and 0). One result of this is that now the input and output values of the SDM will always be different, and the output of the Delta stage will never be zero. The SDM is still trying to do the same job, which is to try to make the output signal as close as possible to the input signal. This would appear to be a losing cause since the output signal is now constrained to taking on only the values +1 or -1 and the input signal occupies the space between them.
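A minimal sketch of this quantized loop, again in Python, may make the idea more concrete. The name `one_bit_sdm`, the zero starting values, and the choice to treat an accumulated value of exactly zero as +1 are all assumptions made for illustration; the feedback uses the output from the previous step.

```python
def one_bit_sdm(inputs):
    """Delta stage, Sigma stage, then a 1-bit Quantizer (illustrative sketch)."""
    accumulated = 0.0      # Sigma stage value (assumed to start at 0)
    previous_output = 0.0  # assumed: nothing has been output before the first step
    outputs = []
    for x in inputs:
        accumulated += x - previous_output  # Delta stage, then Sigma stage
        # Quantizer: the output may only be +1 or -1 (a tie at 0 is assumed to give +1).
        previous_output = 1.0 if accumulated >= 0.0 else -1.0
        outputs.append(previous_output)
    return outputs


def as_binary_digits(outputs):
    """Represent +1 as the digit 1 and -1 as the digit 0."""
    return [1 if value > 0 else 0 for value in outputs]


if __name__ == "__main__":
    stream = one_bit_sdm([0.5] * 1000)  # a constant input halfway between 0 and +1
    print(as_binary_digits(stream[:16]))
    print(sum(stream) / len(stream))    # long-run average of the +1/-1 stream
```

No single output value ever equals the input, but in this sketch the long-run average of the +1/-1 stream settles close to the input value, which hints at where the information is being kept.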
At this point, mathematics takes over, and it becomes a challenge to reduce what I am going to describe to simple illustrative concepts. I hope you will bear with me.
In order to understand what the SDM is actually doing, we need to make some sort of model. In other words we’ll need a set of equations which describe the SDM’s behavior. By solving those equations we can gain an understanding of what the SDM is capable of doing, and what its limits are. There is a problem, though. The Quantizer is a non-linear element. If we know what the input value to the Quantizer is, we can determine precisely what the output value will be, but the opposite is not true. Given the output of the Quantizer, we cannot know a priori what the input value was that resulted in that output value. The linear methods we would normally use to solve such equations do not work when a non-linear element is present. The way we tend to approach problems such as this is to consider the Quantizer instead as a noise source. Instead of trying to model a Quantizer, we consider that we are adding noise (i.e. random values) to the output of the Sigma stage, such that the output values of the SDM just happen to end up being either +1 or -1.
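The two points in this paragraph, that the Quantizer cannot be run backwards and that its effect can be rewritten as an added error term, can be shown with a short Python sketch. The function names are hypothetical, and treating a tie at zero as +1 is an assumption.

```python
def quantize_1bit(value):
    """1-bit Quantizer: returns +1 or -1 (a tie at 0 is assumed to give +1)."""
    return 1.0 if value >= 0.0 else -1.0


def quantization_error(value):
    """The 'noise' that must be added to value to land on the quantized output."""
    return quantize_1bit(value) - value


if __name__ == "__main__":
    for v in (0.25, 0.75, -0.5):
        e = quantization_error(v)
        # Many different inputs share one output, so the output alone cannot
        # tell us the input; but input + error always equals the output.
        print(v, quantize_1bit(v), e, v + e)
```

In the noise model, the error values are treated as if they were random and unrelated to the signal, which is a convenient approximation and not a literal description of what the Quantizer does.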
The next thing we do is to observe that one thing we have said about how the SDM works is not entirely correct. We said that at the input to the Delta stage we take the SDM’s input and subtract from it the SDM’s output. In fact what we subtract is the SDM’s output at the previous time step. This is very important, because it means that we can use this one-step delay to express the SDM’s behavior in terms of a digital transfer function, using theories developed to understand how filters work. When we apply all this gobbledygook to the SDM we come up with two properties that we call the Signal Transfer Function (STF) and the Noise Transfer Function (NTF). These are two very useful properties.
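For the simple loop described so far, the algebra can be sketched as follows. This is a worked example under stated assumptions and not a general result: the symbols x, y, v and e (input, output, Sigma-stage value, and the noise standing in for the Quantizer) are labels introduced here for illustration, the Sigma stage is assumed to be a plain accumulator, and z^{-1} denotes the one-step delay.

```latex
\begin{aligned}
v[n] &= v[n-1] + x[n] - y[n-1] && \text{(Delta stage feeding the Sigma stage)} \\
y[n] &= v[n] + e[n]            && \text{(Quantizer modeled as added noise)} \\
V(z)\,\bigl(1 - z^{-1}\bigr) &= X(z) - z^{-1}\,Y(z) \\
Y(z) &= \underbrace{1}_{\text{STF}} \cdot X(z)
      + \underbrace{\bigl(1 - z^{-1}\bigr)}_{\text{NTF}} \cdot E(z)
\end{aligned}
```

Under these assumptions the signal passes through unchanged, while the noise is multiplied by 1 - z^{-1}, a factor that is zero at DC and grows with frequency.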
The STF tells us how much of the signal applied to the input of the SDM makes it through and appears in the output, whereas the NTF tells us how much of the quantization noise generated by the Quantizer makes it to the SDM’s output. Both of these properties are strongly interrelated, and are strongly frequency dependent. It would be great if we could arrange for STF~1 and NTF~0 across the low frequencies where the audio band is located. By contrast, at the high frequencies we won’t care if the STF ends up being ~0 or the NTF ends up being ~1. So, what exactly does all that gobbledygook mean?
The important thing is that at low frequencies we want the combination of STF~1 and NTF~0. This means that at these low frequencies the output of the SDM contains all of the signal and none of the Quantization noise. If we can arrange it such that those “low frequencies” comprise the entire audio frequency band, then our SDM might be capable of encoding that music signal with surprising precision even though the format has a bit depth of only 1 bit.
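To put rough numbers on this, the sketch below evaluates the size of the NTF at a few frequencies. It rests on two assumptions that are not stated in the text above: that the simple single-accumulator loop with one-step-delayed feedback has NTF(z) = 1 - z^-1 under the noise model, and that the modulator runs at the DSD64 rate of 2.8224 MHz. The function names are hypothetical.

```python
import cmath
import math

SAMPLE_RATE_HZ = 2_822_400  # assumed: DSD64 rate, 64 x 44.1 kHz


def ntf_magnitude(freq_hz, sample_rate_hz=SAMPLE_RATE_HZ):
    """|NTF| for the assumed first-order loop, NTF(z) = 1 - z^-1."""
    omega = 2.0 * math.pi * freq_hz / sample_rate_hz  # radians per sample
    return abs(1.0 - cmath.exp(-1j * omega))


def to_db(magnitude):
    """Convert a linear magnitude (> 0) to decibels."""
    return 20.0 * math.log10(magnitude)


if __name__ == "__main__":
    for f in (100, 1_000, 20_000, 500_000, SAMPLE_RATE_HZ / 2):
        print(f"{f:>12.0f} Hz   |NTF| = {to_db(ntf_magnitude(f)):7.1f} dB")
```

With these assumptions the NTF is very small at the bottom of the audio band and rises steadily with frequency, reaching a gain of 2 at half the sample rate. At 20 kHz it suppresses the noise by only around 27 dB.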
A simpler way to view the performance potential of this SDM is to consider just the Quantization noise. This is nothing more than the difference between what the ideal (not Quantized) input signal looks like and what the actual (Quantized) output signal looks like. If those differences could be stripped off, then the output signal would be identical to the input signal, which is our perfect ideal. What the NTF of the SDM has done is to arrange for all of those differences to be concentrated into a certain band of high frequencies which are quite separate from the audio frequency band containing the ideal output. By the simple expedient of applying a suitable low-pass filter, we can filter them out completely, and thereby faithfully reconstruct the ideal output signal.
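The filtering idea can be tried out with a small Python experiment. Everything here is an illustrative assumption: the modulator is the simple single-accumulator loop with a 1-bit Quantizer, the input is a slow sine wave of amplitude 0.5, and a plain moving average stands in, very crudely, for a proper low-pass filter. The input is passed through the same moving average so that the comparison isolates what is left of the Quantization noise.

```python
import math


def one_bit_sdm(samples):
    """Simple first-order loop with a 1-bit Quantizer (illustrative sketch)."""
    accumulated, previous_output, outputs = 0.0, 0.0, []
    for x in samples:
        accumulated += x - previous_output
        previous_output = 1.0 if accumulated >= 0.0 else -1.0
        outputs.append(previous_output)
    return outputs


def moving_average(values, width):
    """Crude low-pass filter: mean of the most recent `width` values."""
    averaged, running = [], 0.0
    for i, v in enumerate(values):
        running += v
        if i >= width:
            running -= values[i - width]  # drop the value leaving the window
        if i >= width - 1:
            averaged.append(running / width)
    return averaged


def reconstruction_error(count=8000, period=4000, width=256):
    """Worst-case gap between the filtered bitstream and the filtered input."""
    signal = [0.5 * math.sin(2.0 * math.pi * n / period) for n in range(count)]
    bits = one_bit_sdm(signal)
    filtered_bits = moving_average(bits, width)
    filtered_signal = moving_average(signal, width)
    return max(abs(a - b) for a, b in zip(filtered_bits, filtered_signal))


if __name__ == "__main__":
    print(reconstruction_error())
```

Each raw output sample is off by up to 1.5 from the input, yet after even this crude averaging the bitstream tracks the input to within a few hundredths. A longer averaging window shrinks the gap further, at the cost of passing a narrower band of frequencies.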
Unfortunately, the simplistic SDM I have just described is not quite up to the task I set for it. The NTF is not good enough to meet our requirements. In reality, a final step is required in the design of the SDM which will enable us to fine tune the STF and NTF to acquire the characteristics needed to make a high-performance SDM.
In Part II I will discuss how we approach this challenge, and see how well it works. We will then discuss some of the limitations and challenges of SDM design, and conclude by attempting to place my observations in the context of the PCM-vs-DSD debate.