WEBVTT 1 00:00:08.252 --> 00:00:11.550 Hi, I'm Monty Montgomery from Red Hat and Xiph.Org. 2 00:00:11.550 --> 00:00:18.430 A few months ago, I wrote an article on digital audio and why 24bit/192kHz music downloads don't make sense. 3 00:00:18.430 --> 00:00:23.433 In the article, I mentioned--almost in passing--that a digital waveform is not a stairstep, 4 00:00:23.433 --> 00:00:28.680 and you certainly don't get a stairstep when you convert from digital back to analog. 5 00:00:29.865 --> 00:00:33.865 Of everything in the entire article, that was the number one thing people wrote about. 6 00:00:33.865 --> 00:00:37.221 In fact, more than half the mail I got was questions and comments 7 00:00:37.221 --> 00:00:39.663 about basic digital signal behavior. 8 00:00:39.894 --> 00:00:45.285 Since there's interest, let's take a little time to play with some simple digital signals. 9 00:00:49.747 --> 00:00:51.006 Pretend for a moment 10 00:00:51.006 --> 00:00:54.089 that we have no idea how digital signals really behave. 11 00:00:54.734 --> 00:00:56.841 In that case it doesn't make sense for us 12 00:00:56.841 --> 00:00:59.049 to use digital test equipment either. 13 00:00:59.049 --> 00:01:00.937 Fortunately for this exercise, there's still 14 00:01:00.937 --> 00:01:04.020 plenty of working analog lab equipment out there. 15 00:01:04.020 --> 00:01:05.972 First up, we need a signal generator 16 00:01:05.972 --> 00:01:08.190 to provide us with analog input signals-- 17 00:01:08.190 --> 00:01:12.692 in this case, an HP3325 from 1978. 18 00:01:12.692 --> 00:01:14.153 It's still a pretty good generator, 19 00:01:14.153 --> 00:01:15.614 so if you don't mind the size, 20 00:01:15.614 --> 00:01:16.532 the weight, 21 00:01:16.532 --> 00:01:17.577 the power consumption, 22 00:01:17.577 --> 00:01:18.910 and the noisy fan, 23 00:01:18.910 --> 00:01:20.329 you can find them on eBay. 24 00:01:20.329 --> 00:01:23.863 Occasionally for only slightly more than you'll pay for shipping. 
25 00:01:24.617 --> 00:01:28.500 Next, we'll observe our analog waveforms on analog oscilloscopes, 26 00:01:28.500 --> 00:01:31.550 like this Tektronix 2246 from the mid-90s, 27 00:01:31.550 --> 00:01:34.761 one of the last and very best analog scopes ever made. 28 00:01:34.761 --> 00:01:36.807 Every home lab should have one. 29 00:01:37.716 --> 00:01:40.852 And finally inspect the frequency spectrum of our signals 30 00:01:40.852 --> 00:01:43.177 using an analog spectrum analyzer. 31 00:01:43.177 --> 00:01:47.732 This HP3585 from the same product line as the signal generator. 32 00:01:47.732 --> 00:01:50.615 Like the other equipment here it has a rudimentary 33 00:01:50.615 --> 00:01:52.905 and hilariously large microcontroller, 34 00:01:52.905 --> 00:01:56.276 but the signal path from input to what you see on the screen 35 00:01:56.276 --> 00:01:58.537 is completely analog. 36 00:01:58.537 --> 00:02:00.329 All of this equipment is vintage, 37 00:02:00.329 --> 00:02:01.993 but aside from its raw tonnage, 38 00:02:01.993 --> 00:02:03.844 the specs are still quite good. 39 00:02:04.536 --> 00:02:06.868 At the moment, we have our signal generator 40 00:02:06.868 --> 00:02:12.829 set to output a nice 1kHz sine wave at one volt RMS, 41 00:02:13.414 --> 00:02:15.220 we see the sine wave on the oscilloscope, 42 00:02:15.220 --> 00:02:21.428 can verify that it is indeed 1kHz at one volt RMS, 43 00:02:21.428 --> 00:02:24.108 which is 2.8V peak-to-peak, 44 00:02:24.308 --> 00:02:27.561 and that matches the measurement on the spectrum analyzer as well. 45 00:02:27.561 --> 00:02:30.644 The analyzer also shows some low-level white noise 46 00:02:30.644 --> 00:02:32.190 and just a bit of harmonic distortion, 47 00:02:32.190 --> 00:02:36.649 with the highest peak about 70dB or so below the fundamental. 
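The two figures quoted above (1 V RMS being about 2.8 V peak-to-peak, and a distortion peak about 70 dB below the fundamental) can be checked with a little arithmetic. This is only a sketch for a pure sine wave; the function names are made up for illustration.

```python
import math

def sine_peak_to_peak(v_rms: float) -> float:
    """Peak-to-peak voltage of a pure sine wave with the given RMS voltage."""
    return 2.0 * math.sqrt(2.0) * v_rms

def amplitude_ratio(db_below: float) -> float:
    """Linear amplitude ratio of a component `db_below` dB under a reference."""
    return 10.0 ** (-db_below / 20.0)

if __name__ == "__main__":
    print(f"1 V RMS sine = {sine_peak_to_peak(1.0):.2f} V peak-to-peak")
    print(f"70 dB down   = {amplitude_ratio(70.0):.6f} of the fundamental")
```

A component 70 dB down has roughly three ten-thousandths of the fundamental's amplitude, which is why it does not matter in the demos.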
48 00:02:36.649 --> 00:02:38.612 Now, this doesn't matter at all in our demos, 49 00:02:38.612 --> 00:02:40.574 but I wanted to point it out now 50 00:02:40.574 --> 00:02:42.452 just in case you didn't notice it until later. 51 00:02:44.036 --> 00:02:47.142 Now, we drop digital sampling in the middle. 52 00:02:48.557 --> 00:02:51.024 For the conversion, we'll use a boring, 53 00:02:51.024 --> 00:02:53.374 consumer-grade, eMagic USB1 audio device. 54 00:02:53.374 --> 00:02:55.337 It's also more than ten years old at this point, 55 00:02:55.337 --> 00:02:57.257 and it's getting obsolete. 56 00:02:57.964 --> 00:03:02.676 A recent converter can easily have an order of magnitude better specs. 57 00:03:03.076 --> 00:03:07.924 Flatness, linearity, jitter, noise behavior, everything... 58 00:03:07.924 --> 00:03:09.353 you may not have noticed. 59 00:03:09.353 --> 00:03:11.604 Just because we can measure an improvement 60 00:03:11.604 --> 00:03:13.609 doesn't mean we can hear it, 61 00:03:13.609 --> 00:03:16.404 and even these old consumer boxes were already 62 00:03:16.404 --> 00:03:18.643 at the edge of ideal transparency. 63 00:03:20.244 --> 00:03:22.825 The eMagic connects to my ThinkPad, 64 00:03:22.825 --> 00:03:26.121 which displays a digital waveform and spectrum for comparison, 65 00:03:26.121 --> 00:03:28.788 then the ThinkPad sends the digital signal right back out 66 00:03:28.788 --> 00:03:30.921 to the eMagic for re-conversion to analog 67 00:03:30.921 --> 00:03:33.332 and observation on the output scopes. 68 00:03:33.332 --> 00:03:35.582 Input to output, left to right. 69 00:03:40.211 --> 00:03:41.214 OK, it's go time. 70 00:03:41.214 --> 00:03:43.924 We begin by converting an analog signal to digital 71 00:03:43.924 --> 00:03:47.347 and then right back to analog again with no other steps. 72 00:03:47.347 --> 00:03:49.268 The signal generator is set to produce 73 00:03:49.268 --> 00:03:52.649 a 1kHz sine wave just like before. 
74 00:03:52.649 --> 00:03:57.428 We can see our analog sine wave on our input-side oscilloscope. 75 00:03:57.428 --> 00:04:01.694 We digitize our signal to 16-bit PCM at 44.1kHz, 76 00:04:01.694 --> 00:04:03.828 same as on a CD. 77 00:04:03.828 --> 00:04:07.156 The spectrum of the digitized signal matches what we saw earlier, and... 78 00:04:07.156 --> 00:04:10.836 what we see now on the analog spectrum analyzer, 79 00:04:10.836 --> 00:04:15.154 aside from its high-impedance input being just a smidge noisier. 80 00:04:15.154 --> 00:04:15.956 For now 81 00:04:18.248 --> 00:04:20.798 the waveform display shows our digitized sine wave 82 00:04:20.798 --> 00:04:23.966 as a stairstep pattern, one step for each sample. 83 00:04:23.966 --> 00:04:26.388 And when we look at the output signal 84 00:04:26.388 --> 00:04:29.054 that's been converted from digital back to analog, we see... 85 00:04:29.054 --> 00:04:32.052 It's exactly like the original sine wave. 86 00:04:32.052 --> 00:04:33.483 No stairsteps. 87 00:04:33.914 --> 00:04:37.193 OK, 1kHz is still a fairly low frequency; 88 00:04:37.193 --> 00:04:40.633 maybe the stairsteps are just hard to see or they're being smoothed away. 89 00:04:40.739 --> 00:04:49.492 Fair enough. Let's choose a higher frequency, something close to Nyquist, say 15kHz. 90 00:04:49.492 --> 00:04:53.545 Now the sine wave is represented by less than three samples per cycle, and... 91 00:04:53.545 --> 00:04:55.838 the digital waveform looks pretty awful. 92 00:04:55.838 --> 00:04:59.798 Well, looks can be deceiving. The analog output... 93 00:05:01.876 --> 00:05:06.033 is still a perfect sine wave, exactly like the original. 94 00:05:06.633 --> 00:05:09.228 Let's keep going up. 95 00:05:17.353 --> 00:05:20.151 16kHz... 96 00:05:23.198 --> 00:05:25.616 17kHz... 97 00:05:28.201 --> 00:05:29.945 18kHz... 98 00:05:33.822 --> 00:05:35.548 19kHz... 99 00:05:40.457 --> 00:05:42.465 20kHz.
100 00:05:49.097 --> 00:05:52.350 Welcome to the upper limits of human hearing. 101 00:05:52.350 --> 00:05:54.377 The output waveform is still perfect. 102 00:05:54.377 --> 00:05:58.025 No jagged edges, no dropoff, no stairsteps. 103 00:05:58.025 --> 00:06:01.342 So where'd the stairsteps go? 104 00:06:01.342 --> 00:06:03.198 Don't answer, it's a trick question. 105 00:06:03.198 --> 00:06:04.318 They were never there. 106 00:06:04.318 --> 00:06:06.652 Drawing a digital waveform as a stairstep 107 00:06:08.712 --> 00:06:10.772 was wrong to begin with. 108 00:06:10.942 --> 00:06:11.998 Why? 109 00:06:11.998 --> 00:06:14.366 A stairstep is a continuous-time function. 110 00:06:14.366 --> 00:06:16.201 It's jagged, and it's piecewise, 111 00:06:16.201 --> 00:06:19.700 but it has a defined value at every point in time. 112 00:06:19.700 --> 00:06:22.004 A sampled signal is entirely different. 113 00:06:22.004 --> 00:06:23.337 It's discrete-time; 114 00:06:23.337 --> 00:06:27.337 it's only got a value right at each instantaneous sample point 115 00:06:27.337 --> 00:06:32.596 and it's undefined, there is no value at all, everywhere between. 116 00:06:32.596 --> 00:06:36.666 A discrete-time signal is properly drawn as a lollipop graph. 117 00:06:40.020 --> 00:06:42.974 The continuous, analog counterpart of a digital signal 118 00:06:42.974 --> 00:06:45.364 passes smoothly through each sample point, 119 00:06:45.364 --> 00:06:50.153 and that's just as true for high frequencies as it is for low. 120 00:06:50.153 --> 00:06:53.033 Now, the interesting and not at all obvious bit is: 121 00:06:53.033 --> 00:06:55.454 there's only one bandlimited signal that passes 122 00:06:55.454 --> 00:06:57.417 exactly through each sample point. 123 00:06:57.417 --> 00:06:58.708 It's a unique solution. 
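The uniqueness claim above can be poked at numerically. The sketch below samples a 15kHz tone at 44.1kHz (fewer than three samples per cycle) and rebuilds values between the sample instants with a truncated Whittaker-Shannon (sinc) sum. The truncation length and the probe points are assumptions chosen for illustration, so the result is close to, not exactly, the ideal reconstruction.

```python
import math

FS = 44_100.0   # sample rate in Hz (CD rate, as in the demo)
F0 = 15_000.0   # tone frequency in Hz, below Nyquist (FS / 2)

def sinc(x: float) -> float:
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def sample(n: int) -> float:
    """Value of the tone at sample index n; nothing is stored in between."""
    return math.sin(2.0 * math.pi * F0 * n / FS)

def reconstruct(t: float, half_width: int = 2000) -> float:
    """Approximate the bandlimited signal at time t (seconds).

    Truncated sinc interpolation over samples -half_width..half_width,
    so a small truncation error remains.
    """
    pos = t * FS  # time measured in sample periods
    return sum(sample(n) * sinc(pos - n)
               for n in range(-half_width, half_width + 1))

def max_error_between_samples() -> float:
    """Largest error at a few points that fall between sample instants."""
    worst = 0.0
    for frac in (0.1, 0.25, 0.5, 0.75, 0.9):
        t = frac / FS
        truth = math.sin(2.0 * math.pi * F0 * t)
        worst = max(worst, abs(reconstruct(t) - truth))
    return worst

if __name__ == "__main__":
    print(f"samples per cycle: {FS / F0:.2f}")
    print(f"max error between samples: {max_error_between_samples():.2e}")
```

The residual error comes from cutting the sum off after a finite number of samples; widening `half_width` shrinks it further.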
124 00:06:58.708 --> 00:07:01.246 So if you sample a bandlimited signal 125 00:07:01.246 --> 00:07:02.612 and then convert it back, 126 00:07:02.612 --> 00:07:06.462 the original input is also the only possible output. 127 00:07:06.462 --> 00:07:07.838 And before you say, 128 00:07:07.838 --> 00:07:11.721 "Oh, I can draw a different signal that passes through those points." 129 00:07:11.721 --> 00:07:14.283 Well, yes you can, but... 130 00:07:17.268 --> 00:07:20.521 if it differs even minutely from the original, 131 00:07:20.521 --> 00:07:24.905 it contains frequency content at or beyond Nyquist, 132 00:07:24.905 --> 00:07:26.185 breaks the bandlimiting requirement 133 00:07:26.185 --> 00:07:28.358 and isn't a valid solution. 134 00:07:28.574 --> 00:07:30.036 So how did everyone get confused 135 00:07:30.036 --> 00:07:32.702 and start thinking of digital signals as stairsteps? 136 00:07:32.702 --> 00:07:34.900 I can think of two good reasons. 137 00:07:34.900 --> 00:07:37.956 First: It's easy enough to convert a sampled signal 138 00:07:37.972 --> 00:07:39.294 to a true stairstep. 139 00:07:39.294 --> 00:07:42.409 Just extend each sample value forward until the next sample period. 140 00:07:42.409 --> 00:07:44.414 This is called a zero-order hold, 141 00:07:44.414 --> 00:07:47.913 and it's an important part of how some digital-to-analog converters work, 142 00:07:47.913 --> 00:07:50.089 especially the simplest ones. 143 00:07:50.089 --> 00:07:55.591 So, anyone who looks up digital-to-analog conversion 144 00:07:55.592 --> 00:07:59.550 is probably going to see a diagram of a stairstep waveform somewhere, 145 00:07:59.550 --> 00:08:01.982 but that's not a finished conversion, 146 00:08:01.982 --> 00:08:04.250 and it's not the signal that comes out. 
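A zero-order hold as described above is tiny to write down. This is a minimal sketch with a made-up function name; positions are measured in sample periods rather than seconds to keep it simple.

```python
import math

def zero_order_hold(samples, position):
    """Stairstep value at `position` (measured in sample periods).

    Each sample value is simply held until the next sample instant.
    Positions outside the data are clamped to the first or last sample.
    """
    index = min(max(int(math.floor(position)), 0), len(samples) - 1)
    return samples[index]

if __name__ == "__main__":
    samples = [0.0, 0.7, 1.0, 0.7, 0.0, -0.7, -1.0, -0.7]
    # Probe four points inside each sample period: the value never changes
    # within a period, which is exactly what makes it a stairstep.
    for n in range(len(samples)):
        held = [zero_order_hold(samples, n + k / 4) for k in range(4)]
        print(n, held)
```

The held waveform is an intermediate stage only; as the narration says, it is not the signal that comes out of a finished conversion.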
147 00:08:04.944 --> 00:08:05.684 Second, 148 00:08:05.684 --> 00:08:07.529 and this is probably the more likely reason, 149 00:08:07.529 --> 00:08:09.449 engineers who supposedly know better, 150 00:08:09.449 --> 00:08:10.441 like me, 151 00:08:10.441 --> 00:08:13.193 draw stairsteps even though they're technically wrong. 152 00:08:13.193 --> 00:08:15.571 It's sort of like a one-dimensional version of 153 00:08:15.571 --> 00:08:17.395 fat bits in an image editor. 154 00:08:17.395 --> 00:08:19.241 Pixels aren't squares either; 155 00:08:19.241 --> 00:08:23.081 they're samples of a 2-dimensional function space and so they're also, 156 00:08:23.081 --> 00:08:26.366 conceptually, infinitely small points. 157 00:08:26.366 --> 00:08:28.500 Practically, it's a real pain in the ass to see 158 00:08:28.500 --> 00:08:30.804 or manipulate infinitely small anything. 159 00:08:30.804 --> 00:08:32.212 So big squares it is. 160 00:08:32.212 --> 00:08:35.966 Digital stairstep drawings are exactly the same thing. 161 00:08:35.966 --> 00:08:37.684 It's just a convenient drawing. 162 00:08:37.684 --> 00:08:40.404 The stairsteps aren't really there. 163 00:08:45.652 --> 00:08:48.233 When we convert a digital signal back to analog, 164 00:08:48.233 --> 00:08:50.900 the result is also smooth regardless of the bit depth. 165 00:08:50.900 --> 00:08:53.193 24 bits or 16 bits... 166 00:08:53.193 --> 00:08:54.196 or 8 bits... 167 00:08:54.196 --> 00:08:55.486 it doesn't matter. 168 00:08:55.486 --> 00:08:57.534 So does that mean that the digital bit depth 169 00:08:57.534 --> 00:08:58.953 makes no difference at all? 170 00:08:59.245 --> 00:09:00.521 Of course not. 171 00:09:02.121 --> 00:09:06.046 Channel 2 here is the same sine wave input, 172 00:09:06.046 --> 00:09:09.086 but we quantize with dither down to eight bits. 173 00:09:09.086 --> 00:09:14.174 On the scope, we still see a nice smooth sine wave on channel 2.
174 00:09:14.174 --> 00:09:18.014 Look very close, and you'll also see a bit more noise. 175 00:09:18.014 --> 00:09:19.305 That's a clue. 176 00:09:19.305 --> 00:09:21.273 If we look at the spectrum of the signal... 177 00:09:22.889 --> 00:09:23.732 aha! 178 00:09:23.732 --> 00:09:26.398 Our sine wave is still there unaffected, 179 00:09:26.398 --> 00:09:28.490 but the noise level of the eight-bit signal 180 00:09:28.490 --> 00:09:32.470 on the second channel is much higher! 181 00:09:32.948 --> 00:09:36.148 And that's the difference the number of bits makes. 182 00:09:36.148 --> 00:09:37.434 That's it! 183 00:09:37.822 --> 00:09:39.956 When we digitize a signal, first we sample it. 184 00:09:39.956 --> 00:09:42.366 The sampling step is perfect; it loses nothing. 185 00:09:42.366 --> 00:09:45.626 But then we quantize it, and quantization adds noise. 186 00:09:47.827 --> 00:09:50.793 The number of bits determines how much noise 187 00:09:50.793 --> 00:09:52.569 and so the level of the noise floor. 188 00:10:00.170 --> 00:10:03.646 What does this dithered quantization noise sound like? 189 00:10:03.646 --> 00:10:06.012 Let's listen to our eight-bit sine wave. 190 00:10:12.521 --> 00:10:15.273 That may have been hard to hear anything but the tone. 191 00:10:15.273 --> 00:10:18.740 Let's listen to just the noise after we notch out the sine wave 192 00:10:18.740 --> 00:10:21.683 and then bring the gain up a bit because the noise is quiet. 193 00:10:32.009 --> 00:10:35.049 Those of you who have used analog recording equipment 194 00:10:35.049 --> 00:10:36.670 may have just thought to yourselves, 195 00:10:36.670 --> 00:10:40.382 "My goodness! That sounds like tape hiss!" 196 00:10:40.382 --> 00:10:41.929 Well, it doesn't just sound like tape hiss, 197 00:10:41.929 --> 00:10:43.433 it acts like it too, 198 00:10:43.433 --> 00:10:45.225 and if we use a gaussian dither 199 00:10:45.225 --> 00:10:47.646 then it's mathematically equivalent in every way. 
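How the number of bits sets the noise floor can be measured directly. The sketch below quantizes a full-scale 1kHz sine to 8 and to 16 bits and reports the signal-to-noise ratio. The dither type is an assumption (triangular, or TPDF, spanning plus or minus one step), since the narration does not say which dither the demo uses, and clipping is not modelled.

```python
import math
import random

def quantize_with_dither(x, bits, rng):
    """Round x (full scale -1..+1) to a `bits`-bit grid, adding TPDF dither.

    Assumption: the dither is the difference of two uniform values, i.e.
    triangular noise spanning +/- one quantization step.
    """
    step = 2.0 / (2 ** bits)
    dither = (rng.random() - rng.random()) * step
    return round((x + dither) / step) * step

def measured_snr_db(bits, n=20000, freq=1000.0, fs=44100.0, seed=1):
    """Signal-to-noise ratio in dB of a full-scale sine after quantization."""
    rng = random.Random(seed)
    signal_power = 0.0
    noise_power = 0.0
    for i in range(n):
        x = math.sin(2.0 * math.pi * freq * i / fs)
        q = quantize_with_dither(x, bits, rng)
        signal_power += x * x
        noise_power += (q - x) ** 2
    return 10.0 * math.log10(signal_power / noise_power)

if __name__ == "__main__":
    for bits in (8, 16):
        print(f"{bits:2d} bits: about {measured_snr_db(bits):.1f} dB")
```

Each extra bit halves the step size, which lowers the noise floor by about 6 dB, so the 8-bit and 16-bit results should sit roughly 48 dB apart.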
200 00:10:47.646 --> 00:10:49.225 It is tape hiss. 201 00:10:49.225 --> 00:10:51.774 Intuitively, that means that we can measure tape hiss 202 00:10:51.774 --> 00:10:54.196 and thus the noise floor of magnetic audio tape 203 00:10:54.196 --> 00:10:56.233 in bits instead of decibels, 204 00:10:56.233 --> 00:10:59.902 in order to put things in a digital perspective. 205 00:10:59.902 --> 00:11:03.028 Compact cassettes... 206 00:11:03.028 --> 00:11:05.449 for those of you who are old enough to remember them, 207 00:11:05.449 --> 00:11:09.161 could reach as deep as nine bits in perfect conditions, 208 00:11:09.161 --> 00:11:11.209 though five to six bits was more typical, 209 00:11:11.209 --> 00:11:13.876 especially if it was a recording made on a tape deck. 210 00:11:13.876 --> 00:11:19.422 That's right... your mix tapes were only about six bits deep... if you were lucky! 211 00:11:19.837 --> 00:11:22.345 The very best professional open reel tape 212 00:11:22.345 --> 00:11:24.553 used in studios could barely hit... 213 00:11:24.553 --> 00:11:26.473 any guesses?... 214 00:11:26.473 --> 00:11:27.604 13 bits 215 00:11:27.604 --> 00:11:28.980 with advanced noise reduction. 216 00:11:28.980 --> 00:11:32.062 And that's why seeing 'DDD' on a Compact Disc 217 00:11:32.062 --> 00:11:35.208 used to be such a big, high-end deal. 218 00:11:40.116 --> 00:11:42.825 I keep saying that I'm quantizing with dither, 219 00:11:42.825 --> 00:11:44.734 so what is dither exactly? 220 00:11:44.734 --> 00:11:47.284 More importantly, what does it do? 221 00:11:47.284 --> 00:11:49.876 The simple way to quantize a signal is to choose 222 00:11:49.876 --> 00:11:52.329 the digital amplitude value closest 223 00:11:52.329 --> 00:11:54.377 to the original analog amplitude. 224 00:11:54.377 --> 00:11:55.337 Obvious, right? 
225 00:11:55.337 --> 00:11:57.545 Unfortunately, the exact noise you get 226 00:11:57.545 --> 00:11:59.220 from this simple quantization scheme 227 00:11:59.220 --> 00:12:02.174 depends somewhat on the input signal, 228 00:12:02.174 --> 00:12:04.596 so we may get noise that's inconsistent, 229 00:12:04.596 --> 00:12:06.142 or causes distortion, 230 00:12:06.142 --> 00:12:09.054 or is undesirable in some other way. 231 00:12:09.054 --> 00:12:11.764 Dither is specially-constructed noise that 232 00:12:11.764 --> 00:12:15.273 substitutes for the noise produced by simple quantization. 233 00:12:15.273 --> 00:12:18.025 Dither doesn't drown out or mask quantization noise, 234 00:12:18.025 --> 00:12:20.190 it actually replaces it 235 00:12:20.190 --> 00:12:22.612 with noise characteristics of our choosing 236 00:12:22.612 --> 00:12:24.794 that aren't influenced by the input. 237 00:12:25.256 --> 00:12:27.081 Let's watch what dither does. 238 00:12:27.081 --> 00:12:30.078 The signal generator has too much noise for this test 239 00:12:30.431 --> 00:12:33.161 so we'll produce a mathematically 240 00:12:33.161 --> 00:12:34.782 perfect sine wave with the ThinkPad 241 00:12:34.782 --> 00:12:38.205 and quantize it to eight bits with dithering. 242 00:12:39.006 --> 00:12:41.342 We see a nice sine wave on the waveform display 243 00:12:41.342 --> 00:12:43.452 and output scope 244 00:12:44.222 --> 00:12:44.972 and... 245 00:12:46.588 --> 00:12:49.375 once the analog spectrum analyzer catches up... 246 00:12:50.713 --> 00:12:53.588 a clean frequency peak with a uniform noise floor 247 00:12:56.864 --> 00:12:58.611 on both spectral displays 248 00:12:58.611 --> 00:12:59.646 just like before. 249 00:12:59.646 --> 00:13:01.549 Again, this is with dither. 250 00:13:02.196 --> 00:13:04.225 Now I turn dithering off.
251 00:13:05.779 --> 00:13:07.913 The quantization noise that dither had spread out 252 00:13:07.913 --> 00:13:09.577 into a nice, flat noise floor 253 00:13:09.577 --> 00:13:12.286 piles up into harmonic distortion peaks. 254 00:13:12.286 --> 00:13:16.030 The noise floor is lower, but the level of distortion becomes nonzero, 255 00:13:16.030 --> 00:13:19.668 and the distortion peaks sit higher than the dithering noise did. 256 00:13:19.668 --> 00:13:22.318 At eight bits this effect is exaggerated. 257 00:13:22.488 --> 00:13:24.200 At sixteen bits, 258 00:13:24.692 --> 00:13:25.929 even without dither, 259 00:13:25.929 --> 00:13:28.308 harmonic distortion is going to be so low 260 00:13:28.308 --> 00:13:30.708 as to be completely inaudible. 261 00:13:30.708 --> 00:13:34.581 Still, we can use dither to eliminate it completely 262 00:13:34.581 --> 00:13:36.489 if we so choose. 263 00:13:37.642 --> 00:13:39.273 Turning the dither off again for a moment, 264 00:13:40.934 --> 00:13:43.444 you'll notice that the absolute level of distortion 265 00:13:43.444 --> 00:13:47.070 from undithered quantization stays approximately constant 266 00:13:47.070 --> 00:13:49.033 regardless of the input amplitude. 267 00:13:49.033 --> 00:13:51.998 But when the signal level drops below a half a bit, 268 00:13:51.998 --> 00:13:54.036 everything quantizes to zero. 269 00:13:54.036 --> 00:13:54.910 In a sense, 270 00:13:54.910 --> 00:13:58.557 everything quantizing to zero is just 100% distortion! 271 00:13:58.833 --> 00:14:01.588 Dither eliminates this distortion too. 272 00:14:01.588 --> 00:14:03.599 We reenable dither and... 273 00:14:03.599 --> 00:14:06.377 there's our signal back at 1/4 bit, 274 00:14:06.377 --> 00:14:09.076 with our nice flat noise floor. 275 00:14:09.630 --> 00:14:11.220 The noise floor doesn't have to be flat.
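The quarter-bit case above is easy to reproduce. In this sketch one unit is one quantization step, a 1/4-step sine is quantized with and without dither, and the level of the tone in the result is measured by correlating against the known sine. The dither type (triangular, plus or minus one step) and all names are assumptions for illustration.

```python
import math
import random

def quantize(x, rng=None):
    """Round x to the nearest integer step (1 unit = 1 least significant bit).

    If `rng` is given, TPDF dither spanning +/- 1 LSB is added first.
    """
    if rng is not None:
        x += rng.random() - rng.random()
    return round(x)

def tone_amplitude(values, cycles):
    """Amplitude of the sine component completing `cycles` cycles in `values`."""
    n = len(values)
    return 2.0 / n * sum(v * math.sin(2.0 * math.pi * cycles * i / n)
                         for i, v in enumerate(values))

def quarter_bit_demo(n=44100, cycles=1000, seed=1):
    """Quantize a 1/4 LSB sine without and with dither; return both tone levels."""
    rng = random.Random(seed)
    tone = [0.25 * math.sin(2.0 * math.pi * cycles * i / n) for i in range(n)]
    plain = [quantize(x) for x in tone]
    dithered = [quantize(x, rng) for x in tone]
    return tone_amplitude(plain, cycles), tone_amplitude(dithered, cycles)

if __name__ == "__main__":
    plain, dithered = quarter_bit_demo()
    print(f"tone level without dither: {plain:.3f} LSB")
    print(f"tone level with dither:    {dithered:.3f} LSB")
```

Without dither every sample rounds to zero, so the tone is gone entirely. With dither the individual samples are noisy, but the tone survives at close to its original quarter-step level.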
276 00:14:11.220 --> 00:14:12.798 Dither is noise of our choosing, 277 00:14:12.798 --> 00:14:15.006 so let's choose a noise as inoffensive 278 00:14:15.006 --> 00:14:17.017 and difficult to notice as possible. 279 00:14:18.142 --> 00:14:22.484 Our hearing is most sensitive in the midrange from 2kHz to 4kHz, 280 00:14:22.484 --> 00:14:25.438 so that's where background noise is going to be the most obvious. 281 00:14:25.438 --> 00:14:29.406 We can shape dithering noise away from sensitive frequencies 282 00:14:29.406 --> 00:14:31.241 to where hearing is less sensitive, 283 00:14:31.241 --> 00:14:33.910 usually the highest frequencies. 284 00:14:34.249 --> 00:14:37.460 16-bit dithering noise is normally much too quiet to hear at all, 285 00:14:37.460 --> 00:14:39.668 but let's listen to our noise shaping example, 286 00:14:39.668 --> 00:14:42.234 again with the gain brought way up... 287 00:14:56.020 --> 00:14:59.977 Lastly, dithered quantization noise is higher power overall 288 00:14:59.977 --> 00:15:04.276 than undithered quantization noise even when it sounds quieter. 289 00:15:04.276 --> 00:15:07.902 You can see that on a VU meter during passages of near-silence. 290 00:15:07.902 --> 00:15:10.537 But dither isn't only an on or off choice. 291 00:15:10.537 --> 00:15:14.712 We can reduce the dither's power to balance less noise against 292 00:15:14.712 --> 00:15:18.313 a bit of distortion to minimize the overall effect. 293 00:15:19.605 --> 00:15:22.790 We'll also modulate the input signal like this: 294 00:15:27.098 --> 00:15:30.206 ...to show how a varying input affects the quantization noise. 
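The noise shaping described above can be sketched with the simplest possible shaper, first-order error feedback. This is an assumption made for illustration: the narration does not describe the filter the demo uses, and shapers tuned to hearing sensitivity are more elaborate. The sketch only shows the principle of pushing noise out of the low band and toward high frequencies.

```python
import math
import random

def quantize_stream(samples, shaped, seed=1):
    """Quantize to integer steps with TPDF dither; optionally shape the noise.

    With `shaped` set, the previous sample's total error is subtracted from
    the next input (first-order error feedback), so the output noise becomes
    e[n] - e[n-1], which is weak at low frequencies and strong at high ones.
    """
    rng = random.Random(seed)
    prev_error = 0.0
    out = []
    for x in samples:
        target = x - prev_error if shaped else x
        y = round(target + rng.random() - rng.random())
        prev_error = y - target
        out.append(y)
    return out

def low_band_noise_power(samples, quantized, window=16):
    """Mean square of the noise after a crude moving-average low-pass."""
    noise = [q - x for q, x in zip(quantized, samples)]
    total = 0.0
    count = 0
    for start in range(0, len(noise) - window + 1, window):
        avg = sum(noise[start:start + window]) / window
        total += avg * avg
        count += 1
    return total / count

def compare_low_band_noise(n=16000):
    """Return (flat, shaped) low-band noise power for a slow test sine."""
    samples = [100.0 * math.sin(2.0 * math.pi * i / 200.0) for i in range(n)]
    flat = low_band_noise_power(samples, quantize_stream(samples, False))
    shaped = low_band_noise_power(samples, quantize_stream(samples, True))
    return flat, shaped

if __name__ == "__main__":
    flat, shaped = compare_low_band_noise()
    print(f"low-band noise power, flat dither:   {flat:.5f}")
    print(f"low-band noise power, shaped dither: {shaped:.5f}")
```

The shaped version carries less noise in the low band, at the cost of more noise overall, because the removed energy reappears at high frequencies.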
295 00:15:30.206 --> 00:15:33.289 At full dithering power, the noise is uniform, constant, 296 00:15:33.289 --> 00:15:35.643 and featureless just like we expect: 297 00:15:40.937 --> 00:15:42.772 As we reduce the dither's power, 298 00:15:42.772 --> 00:15:46.356 the input increasingly affects the amplitude and the character 299 00:15:46.356 --> 00:15:47.977 of the quantization noise: 300 00:16:09.883 --> 00:16:13.844 Shaped dither behaves similarly, 301 00:16:13.844 --> 00:16:16.553 but noise shaping lends one more nice advantage. 302 00:16:16.553 --> 00:16:18.804 To make a long story short, it can use 303 00:16:18.804 --> 00:16:20.937 a somewhat lower dither power before the input 304 00:16:20.937 --> 00:16:23.662 has as much effect on the output. 305 00:16:49.172 --> 00:16:51.508 Despite all the time I just spent on dither, 306 00:16:51.508 --> 00:16:53.012 we're talking about differences 307 00:16:53.012 --> 00:16:56.372 that start 100 decibels below full scale. 308 00:16:56.372 --> 00:16:59.806 Maybe if the CD had been 14 bits as originally designed, 309 00:16:59.806 --> 00:17:01.513 dither might be more important. 310 00:17:01.989 --> 00:17:02.644 Maybe. 311 00:17:02.644 --> 00:17:05.438 At 16 bits, really, it's mostly a wash. 312 00:17:05.438 --> 00:17:08.019 You can think of dither as an insurance policy 313 00:17:08.019 --> 00:17:11.443 that gives several extra decibels of dynamic range 314 00:17:11.443 --> 00:17:12.804 just in case. 315 00:17:12.990 --> 00:17:14.196 The simple fact is, though, 316 00:17:14.196 --> 00:17:16.361 no one ever ruined a great recording 317 00:17:16.361 --> 00:17:19.182 by not dithering the final master. 318 00:17:24.414 --> 00:17:25.790 We've been using sine waves. 319 00:17:25.790 --> 00:17:28.254 They're the obvious choice when what we want to see 320 00:17:28.254 --> 00:17:32.212 is a system's behavior at a given isolated frequency. 321 00:17:32.212 --> 00:17:34.217 Now let's look at something a bit more complex. 
322 00:17:34.217 --> 00:17:35.923 What should we expect to happen 323 00:17:35.923 --> 00:17:39.671 when I change the input to a square wave... 324 00:17:42.718 --> 00:17:45.921 The input scope confirms our 1kHz square wave. 325 00:17:45.921 --> 00:17:47.351 The output scope shows... 326 00:17:48.614 --> 00:17:51.102 Exactly what it should. 327 00:17:51.102 --> 00:17:53.900 What is a square wave really? 328 00:17:54.654 --> 00:17:57.982 Well, we can say it's a waveform that's some positive value 329 00:17:57.982 --> 00:18:00.788 for half a cycle and then transitions instantaneously 330 00:18:00.788 --> 00:18:02.910 to a negative value for the other half. 331 00:18:02.910 --> 00:18:05.076 But that doesn't really tell us anything useful 332 00:18:05.076 --> 00:18:07.241 about how this input 333 00:18:07.241 --> 00:18:09.378 becomes this output. 334 00:18:10.132 --> 00:18:12.713 Then we remember that any waveform 335 00:18:12.713 --> 00:18:15.508 is also the sum of discrete frequencies, 336 00:18:15.508 --> 00:18:18.302 and a square wave is a particularly simple sum: 337 00:18:18.302 --> 00:18:19.636 a fundamental and 338 00:18:19.636 --> 00:18:22.228 an infinite series of odd harmonics. 339 00:18:22.228 --> 00:18:24.597 Sum them all up, you get a square wave. 340 00:18:26.398 --> 00:18:27.433 At first glance, 341 00:18:27.433 --> 00:18:29.225 that doesn't seem very useful either. 342 00:18:29.225 --> 00:18:31.561 You have to sum up an infinite number of harmonics 343 00:18:31.561 --> 00:18:33.108 to get the answer. 344 00:18:33.108 --> 00:18:35.977 Ah, but we don't have an infinite number of harmonics. 345 00:18:36.960 --> 00:18:39.902 We're using a quite sharp anti-aliasing filter 346 00:18:39.902 --> 00:18:42.206 that cuts off right above 20kHz, 347 00:18:42.206 --> 00:18:44.158 so our signal is band-limited, 348 00:18:44.158 --> 00:18:46.421 which means we get this: 349 00:18:52.500 --> 00:18:56.468 ...and that's exactly what we see on the output scope.
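The bandlimited square wave described above can be built directly from its harmonics. The sketch sums the odd harmonics of a 1kHz square wave up to a 20kHz cutoff (which leaves the fundamental through the 19th harmonic) and measures how far the ripple overshoots the nominal level of 1. Treating the filter as a perfect cutoff at 20kHz is an assumption; the real filter is merely "quite sharp".

```python
import math

def bandlimited_square(t, fundamental=1000.0, cutoff=20000.0):
    """Square wave at time t (seconds) built only from harmonics <= cutoff."""
    total = 0.0
    k = 1
    while k * fundamental <= cutoff:
        total += math.sin(2.0 * math.pi * k * fundamental * t) / k
        k += 2  # a square wave contains odd harmonics only
    return 4.0 / math.pi * total

def peak_value(points_per_cycle=2000, fundamental=1000.0):
    """Largest value over one cycle, sampled on a fine time grid."""
    period = 1.0 / fundamental
    return max(bandlimited_square(i * period / points_per_cycle)
               for i in range(points_per_cycle))

if __name__ == "__main__":
    print(f"peak of the bandlimited square wave: {peak_value():.3f}")
    print(f"value in the middle of the flat top: {bandlimited_square(0.00025):.3f}")
```

The peak lands noticeably above 1 right next to each edge, and the flat top wobbles around 1 rather than sitting on it: the rippling seen on the output scope.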
350 00:18:56.468 --> 00:18:59.550 The rippling you see around sharp edges in a bandlimited signal 351 00:18:59.550 --> 00:19:00.926 is called the Gibbs effect. 352 00:19:00.926 --> 00:19:04.137 It happens whenever you slice off part of the frequency domain 353 00:19:04.137 --> 00:19:07.006 in the middle of nonzero energy. 354 00:19:07.006 --> 00:19:09.854 The usual rule of thumb you'll hear is the sharper the cutoff, 355 00:19:09.854 --> 00:19:11.188 the stronger the rippling, 356 00:19:11.188 --> 00:19:12.777 which is approximately true, 357 00:19:12.777 --> 00:19:14.900 but we have to be careful how we think about it. 358 00:19:14.900 --> 00:19:15.774 For example... 359 00:19:15.774 --> 00:19:19.529 what would you expect our quite sharp anti-aliasing filter 360 00:19:19.529 --> 00:19:23.181 to do if I run our signal through it a second time? 361 00:19:34.136 --> 00:19:37.588 Aside from adding a few fractional cycles of delay, 362 00:19:37.588 --> 00:19:39.348 the answer is... 363 00:19:39.348 --> 00:19:40.857 nothing at all. 364 00:19:41.257 --> 00:19:43.302 The signal is already bandlimited. 365 00:19:43.656 --> 00:19:46.590 Bandlimiting it again doesn't do anything. 366 00:19:46.590 --> 00:19:50.686 A second pass can't remove frequencies that we already removed. 367 00:19:52.070 --> 00:19:53.737 And that's important. 368 00:19:53.737 --> 00:19:56.233 People tend to think of the ripples as a kind of artifact 369 00:19:56.233 --> 00:19:59.945 that's added by anti-aliasing and anti-imaging filters, 370 00:19:59.945 --> 00:20:01.737 implying that the ripples get worse 371 00:20:01.737 --> 00:20:03.913 each time the signal passes through. 372 00:20:03.913 --> 00:20:05.950 We can see that in this case that didn't happen. 373 00:20:05.950 --> 00:20:09.492 So was it really the filter that added the ripples the first time through? 374 00:20:09.492 --> 00:20:10.537 No, not really. 
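That a second pass changes nothing can be checked with an idealized filter. The sketch below bandlimits one period of a square wave by zeroing DFT bins above a cutoff, then does it again and compares. This brick-wall filter is an assumption standing in for the converter's real filter: it has no delay, and it treats the signal as periodic.

```python
import cmath
import math

def dft(x):
    """Plain O(n^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def inverse_dft(spectrum):
    """Inverse transform, returning the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

def bandlimit(x, max_bin):
    """Ideal brick-wall low-pass: drop every bin above `max_bin`."""
    spectrum = dft(x)
    n = len(spectrum)
    kept = [value if min(k, n - k) <= max_bin else 0.0
            for k, value in enumerate(spectrum)]
    return inverse_dft(kept)

def square(n=128):
    """One period of a +/-1 square wave, n samples long."""
    return [1.0 if t < n // 2 else -1.0 for t in range(n)]

if __name__ == "__main__":
    once = bandlimit(square(), 20)
    twice = bandlimit(once, 20)
    change = max(abs(a - b) for a, b in zip(once, twice))
    print(f"peak after first pass: {max(once):.3f}")
    print(f"largest change from second pass: {change:.2e}")
```

The first pass produces the ripples; the second pass leaves the signal unchanged apart from floating-point rounding.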
375 00:20:10.537 --> 00:20:12.126 It's a subtle distinction, 376 00:20:12.126 --> 00:20:15.252 but Gibbs effect ripples aren't added by filters, 377 00:20:15.252 --> 00:20:18.836 they're just part of what a bandlimited signal is. 378 00:20:18.836 --> 00:20:20.798 Even if we synthetically construct 379 00:20:20.798 --> 00:20:23.508 what looks like a perfect digital square wave, 380 00:20:23.508 --> 00:20:26.206 it's still limited to the channel bandwidth. 381 00:20:26.206 --> 00:20:29.140 Remember the stairstep representation is misleading. 382 00:20:29.140 --> 00:20:32.222 What we really have here are instantaneous sample points, 383 00:20:32.222 --> 00:20:36.148 and only one bandlimited signal fits those points. 384 00:20:36.148 --> 00:20:39.614 All we did when we drew our apparently perfect square wave 385 00:20:39.614 --> 00:20:43.198 was line up the sample points just right so it appeared 386 00:20:43.198 --> 00:20:47.785 that there were no ripples if we played connect-the-dots. 387 00:20:47.785 --> 00:20:49.449 But the original bandlimited signal, 388 00:20:49.449 --> 00:20:52.742 complete with ripples, was still there. 389 00:20:54.004 --> 00:20:56.542 And that leads us to one more important point. 390 00:20:56.542 --> 00:20:59.550 You've probably heard that the timing precision of a digital signal 391 00:20:59.550 --> 00:21:02.409 is limited by its sample rate; put another way, 392 00:21:02.409 --> 00:21:05.140 that digital signals can't represent anything 393 00:21:05.140 --> 00:21:08.041 that falls between the samples... 394 00:21:08.041 --> 00:21:11.422 implying that impulses or fast attacks have to align 395 00:21:11.422 --> 00:21:14.473 exactly with a sample, or the timing gets mangled... 396 00:21:14.473 --> 00:21:16.219 or they just disappear. 397 00:21:16.711 --> 00:21:20.820 At this point, we can easily see why that's wrong. 398 00:21:20.820 --> 00:21:23.742 Again, our input signals are bandlimited. 
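The idea that timing is limited to the sample grid can be tested numerically. This sketch delays a 1kHz tone by a fraction of a sample period, keeps only the samples, and then recovers the delay from those samples by correlating against sine and cosine references. The tone, the block length, and the estimator are all assumptions chosen to keep the example short.

```python
import math

FS = 44100.0      # sample rate in Hz
FREQ = 1000.0     # test tone in Hz
N = 441           # exactly 10 cycles of FREQ at FS

def sampled_tone(delay_samples):
    """Samples of a sine delayed by `delay_samples` (may be fractional)."""
    w = 2.0 * math.pi * FREQ / FS
    return [math.sin(w * (n - delay_samples)) for n in range(N)]

def estimate_delay(samples):
    """Recover the delay, in sample periods, from the samples alone."""
    w = 2.0 * math.pi * FREQ / FS
    in_phase = sum(s * math.sin(w * n) for n, s in enumerate(samples))
    quadrature = sum(s * math.cos(w * n) for n, s in enumerate(samples))
    return math.atan2(-quadrature, in_phase) / w

if __name__ == "__main__":
    for delay in (0.0, 0.3, 0.5, 0.77):
        found = estimate_delay(sampled_tone(delay))
        print(f"true delay {delay:.2f} samples, recovered {found:.6f}")
```

The recovered delays match the true ones far more finely than one sample period, so the position of the waveform between samples is carried by the sample values themselves.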
399 00:21:23.742 --> 00:21:26.036 And digital signals are samples, 400 00:21:26.036 --> 00:21:29.340 not stairsteps, not 'connect-the-dots'. 401 00:21:31.572 --> 00:21:34.592 We most certainly can, for example, 402 00:21:36.777 --> 00:21:39.337 put the rising edge of our bandlimited square wave 403 00:21:39.337 --> 00:21:42.004 anywhere we want between samples. 404 00:21:42.004 --> 00:21:44.354 It's represented perfectly 405 00:21:47.508 --> 00:21:50.218 and it's reconstructed perfectly. 406 00:22:04.620 --> 00:22:06.526 Just like in the previous episode, 407 00:22:06.526 --> 00:22:08.393 we've covered a broad range of topics, 408 00:22:08.393 --> 00:22:10.868 and yet barely scratched the surface of each one. 409 00:22:10.868 --> 00:22:13.620 If anything, my sins of omission are greater this time around... 410 00:22:13.620 --> 00:22:16.286 but this is a good stopping point. 411 00:22:16.286 --> 00:22:17.833 Or maybe, a good starting point. 412 00:22:17.833 --> 00:22:18.708 Dig deeper. 413 00:22:18.708 --> 00:22:19.710 Experiment. 414 00:22:19.710 --> 00:22:21.374 I chose my demos very carefully 415 00:22:21.374 --> 00:22:23.668 to be simple and give clear results. 416 00:22:23.668 --> 00:22:26.217 You can reproduce every one of them on your own if you like. 417 00:22:26.217 --> 00:22:28.766 But let's face it, sometimes we learn the most 418 00:22:28.766 --> 00:22:30.516 about a spiffy toy by breaking it open 419 00:22:30.516 --> 00:22:32.553 and studying all the pieces that fall out. 420 00:22:32.553 --> 00:22:35.230 That's OK, we're engineers. 421 00:22:35.230 --> 00:22:36.350 Play with the demo parameters, 422 00:22:36.350 --> 00:22:37.972 hack up the code, 423 00:22:37.972 --> 00:22:39.774 set up alternate experiments. 424 00:22:39.774 --> 00:22:40.692 The source code for everything, 425 00:22:40.692 --> 00:22:42.398 including the little pushbutton demo application, 426 00:22:42.398 --> 00:22:44.361 is up at Xiph.Org. 
427 00:22:44.361 --> 00:22:45.940 In the course of experimentation, 428 00:22:45.940 --> 00:22:47.401 you're likely to run into something 429 00:22:47.401 --> 00:22:49.950 that you didn't expect and can't explain. 430 00:22:49.950 --> 00:22:51.198 Don't worry! 431 00:22:51.198 --> 00:22:54.537 My earlier snark aside, Wikipedia is fantastic for 432 00:22:54.537 --> 00:22:56.788 exactly this kind of casual research. 433 00:22:56.788 --> 00:22:59.956 If you're really serious about understanding signals, 434 00:22:59.956 --> 00:23:03.337 several universities have advanced materials online, 435 00:23:03.337 --> 00:23:07.380 such as the 6.003 and 6.007 Signals and Systems modules 436 00:23:07.380 --> 00:23:08.798 at MIT OpenCourseWare. 437 00:23:08.798 --> 00:23:11.593 And of course, there's always the community here at Xiph.Org. 438 00:23:12.792 --> 00:23:13.929 Digging deeper or not, 439 00:23:13.929 --> 00:23:14.974 I am out of coffee, 440 00:23:14.974 --> 00:23:16.436 so, until next time, 441 00:23:16.436 --> 00:23:19.316 happy hacking!