What does “lossless” mean in audio terminology? It seems like a straightforward question, and you will undoubtedly have an answer at the ready that says something along the lines of it being the attribute of an operation which permits you to turn a bunch of numbers into a different bunch of numbers, and then turn them back again exactly as they originally were. But there are shades to losslessness that bear due consideration. To be (lossless) or not to be (lossless), that is the question.
As we discussed last time around, a Fourier Transform takes data in the time domain and expresses it anew in the frequency domain. Both views represent the same data in different ways, and (mathematically) the two views can be losslessly transformed back and forth between one and the other. Let’s take a single channel track of 60 seconds duration sampled at 44.1kHz. There are a total of 2,646,000 audio samples (60 seconds at 44,100 samples per second). If I take a Discrete Fourier Transform (DFT) of the whole thing, I end up with a frequency spectrum comprising the frequencies from 0Hz to 22,050Hz, separated into 1,323,001 equally spaced bins (that’s half as many bins as samples, plus one bin). Within each bin I have both the amplitude and phase of that specific frequency (with the exception of the first and last bins, which are purely real and so carry an amplitude and a sign rather than a free phase).
In effect the DFT breaks the data down into the exact mathematical formula for the original waveform. It will in this case comprise the sum of 1,323,001 different sine waves. All I have to do is plug the frequency, amplitude, and phase information from the DFT into each one, sum them all together, and I will have fully reconstructed the original analog waveform. Think about that. Because it is a mathematical formula, it means I can calculate the amplitude of the original waveform at any point in time, even at points that lie arbitrarily between those of the actual samples which comprise the sampled data. This is another way of confirming that the original signal can be perfectly recreated from the sampled data, provided the Nyquist criterion has been met.
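To make the "plug it into a formula" idea concrete, here is a minimal sketch in Python using NumPy. It is an illustration under stated assumptions rather than how any real player works: the signal is tiny (64 samples), it is treated as periodic over the analysis window, and the helper names `dft_bins` and `evaluate` are my own inventions for this example.

```python
import numpy as np

def dft_bins(x):
    """Return (frequency, amplitude, phase) per bin for a real, even-length signal.

    Frequencies are in cycles per sample. Hypothetical helper for illustration.
    """
    n = len(x)
    spectrum = np.fft.rfft(x)          # n/2 + 1 bins for n real samples
    amps = np.abs(spectrum) / n
    amps[1:-1] *= 2.0                  # interior bins stand for a conjugate pair
    phases = np.angle(spectrum)        # 0 or pi for the first and last bins
    freqs = np.fft.rfftfreq(n)
    return freqs, amps, phases

def evaluate(freqs, amps, phases, t):
    """Sum the sinusoids at arbitrary sample times t (fractional values allowed)."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    return (amps * np.cos(2 * np.pi * freqs * t[:, None] + phases)).sum(axis=1)

# A small band-limited test signal: DC plus two tones that fit the window exactly.
n = 64
t = np.arange(n)

def signal(t):
    return (0.5 + np.cos(2 * np.pi * 3 * t / n + 0.4)
            + 0.3 * np.sin(2 * np.pi * 10 * t / n))

x = signal(t)
freqs, amps, phases = dft_bins(x)

rebuilt = evaluate(freqs, amps, phases, t)        # at the original sample times
between = evaluate(freqs, amps, phases, 2.5)      # halfway between samples 2 and 3
```

Here `rebuilt` matches `x` to within floating-point rounding, and `between` matches `signal(2.5)`, a point that was never sampled. Note that `evaluate` builds a samples-by-bins table, so it is only practical for short signals like this one.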
This concept is useful, because we can use it to perform some interesting thought experiments. Suppose I decide to mathematically re-sample that waveform at a sample rate of 176.4kHz, or 4 times the 44.1kHz of the original. This will give me, in effect, the original 44.1kHz samples, plus an additional 3 new samples equally spaced between each adjacent pair of original samples. [Here I am choosing to carefully align my 176.4kHz samples so that every fourth sample lines up exactly with one of the original 44.1kHz samples. I don’t necessarily need to do that.]
First, I will observe that if I can perfectly recreate the original waveform using only the original 44.1kHz samples, then the additional samples are quite superfluous. Second, I will observe that this particular 176,400Hz data stream can be seen as comprising four distinct interleaved 44,100Hz data streams. I can separate those four data streams out. One of them will comprise the original 44,100Hz samples, but the others, by necessity, will each comprise slightly different numerical values. Although they are different, each of these data streams clearly encodes the exact same original analog waveform, and can (and will) recreate it exactly using the procedure I laid out above. Each of these different 44,100Hz data streams can therefore be recognized as a lossless transformation of the others.
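The thought experiment can be tried out numerically. What follows is a rough sketch in Python with NumPy, assuming a short test signal that is periodic over the window and has no energy exactly at the Nyquist frequency. The function names `upsample_fft` and `fractional_shift` are hypothetical, and zero-padding the spectrum is just one convenient way of doing the mathematical re-sampling.

```python
import numpy as np

def upsample_fft(x, factor):
    """Band-limited upsampling of a real, even-length, periodic signal."""
    n = len(x)
    spectrum = np.fft.rfft(x)
    spectrum[-1] *= 0.5    # the old Nyquist bin becomes an ordinary paired bin
    # irfft zero-pads the spectrum up to the new length; rescale for the longer output.
    return np.fft.irfft(spectrum, n * factor) * factor

def fractional_shift(x, delay):
    """Delay a periodic band-limited signal by `delay` samples (may be fractional).

    Assumes no energy exactly at the Nyquist frequency.
    """
    n = len(x)
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(n)                       # cycles per sample
    return np.fft.irfft(spectrum * np.exp(-2j * np.pi * freqs * delay), n)

# "Original" samples: a small band-limited test signal.
n = 64
t = np.arange(n)
x = 0.5 + np.cos(2 * np.pi * 3 * t / n + 0.4) + 0.3 * np.sin(2 * np.pi * 10 * t / n)

y = upsample_fft(x, 4)                     # four times the sample rate
streams = [y[p::4] for p in range(4)]      # four interleaved streams at the old rate

# Stream p holds the waveform sampled p/4 of a sample late, so delaying it
# by p/4 of a sample should land back on the original values.
recovered = [fractional_shift(streams[p], p / 4) for p in range(4)]
```

In this sketch `streams[0]` reproduces `x`, the other three streams hold different numbers, and every entry of `recovered` agrees with `x` to within rounding error, which is the sense in which the four streams are lossless transformations of each other.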
Let me extend this to a more general principle. If an analog waveform is strictly band limited, then any two digital samplings of that waveform – provided they are carried out at sample rates that meet the Nyquist criterion, and the sampling is executed with absolute precision and perfect timing – will be lossless transformations of each other.
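The same idea can be sketched for two sample rates that are not simple multiples of each other. The Python snippet below is an illustration only: `waveform` is a made-up stand-in for the band-limited analog signal, the 10 ms window is chosen so that both 44.1kHz and 48kHz give a whole number of samples, and `resample_fft` is a hypothetical helper that assumes the signal repeats exactly over that window.

```python
import numpy as np

def waveform(t):
    """A made-up band-limited 'analog' signal; every tone is well below 22.05 kHz."""
    return (np.sin(2 * np.pi * 1000 * t)
            + 0.5 * np.cos(2 * np.pi * 5000 * t + 0.3)
            + 0.25 * np.sin(2 * np.pi * 15000 * t + 1.1))

def resample_fft(x, new_len):
    """Convert a periodic band-limited signal to `new_len` samples per window.

    Assumes nothing lies at or above the Nyquist frequency of either rate.
    """
    spectrum = np.fft.rfft(x)
    # irfft pads or crops the spectrum to suit new_len; rescale for the length change.
    return np.fft.irfft(spectrum, new_len) * (new_len / len(x))

# 10 ms of signal: 441 samples at 44.1 kHz, 480 samples at 48 kHz.
n_a, n_b = 441, 480
a = waveform(np.arange(n_a) / 44100.0)    # sampling A
b = waveform(np.arange(n_b) / 48000.0)    # sampling B

b_from_a = resample_fft(a, n_b)           # derive B from A alone
a_from_b = resample_fft(b, n_a)           # derive A from B alone
```

Under those assumptions `b_from_a` matches `b` and `a_from_b` matches `a` to within floating-point rounding, even though the two samplings share only their very first sample instant. Real converters work on signals that do not repeat and use filters of finite length, so they only approximate this ideal.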
At the risk of hammering on Fawlty-esque at “the bleedin’ obvious”, let me make the key practical point in all this. It relates to whether upsampled audio files are any better than “ordinary” 44,100Hz files, from the perspective of fidelity. If the higher sample rate file was obtained by conversion from the original 44.1kHz file, then at best it can be a lossless conversion. But it can never be inherently better. Which isn’t the same as saying your DAC can’t make a better job of converting it to analog, but that’s a different matter entirely.