|
|
Line 1: |
Line 1: |
| = Introduction =
| |
|
| |
|
| Noise shaping is a technique that is often used in combination with
| |
| [[Dithering|dithering]] in the context of [[Quantize|quantization]]. The purpose of noise shaping
| |
| is to alter the spectral ''shape'' of the error that is introduced due
| |
| to [[Dithering|dithering]] and [[Quantize|quantization]]. Sometimes it may be desirable to
| |
| concentrate most of the noise power in a certain frequency band so
| |
| that the overall noise becomes less perceivable.
| |
|
| |
| = Applications =
| |
|
| |
| In the field of digital image processing a popular noise shaping
| |
| technique is known as "Floyd Steinberg dithering". This algorithm is
| |
| used to reduce the noise power at lower spatial frequencies which
| |
| results in an improved perceptual quality.
| |
|
| |
| For word length reductions of audio signals (i.e conversion from a 24bit
| |
| signal to a 16bit signal) noise shaping can be used to push quantization
| |
| noise out of perceptually important frequency bands to perceptually less
| |
| important frequency bands (>20 kHz) in order to reduce the perceived
| |
| loudness of the quantization noise.
| |
|
| |
| Another example is the
| |
| [[Super Audio Compact Disc|Super Audio CD]]. The SACD format represents
| |
| the most extreme trade-off between word length (1bit/sample) and
| |
| sampling frequency (2.8 Mhz). Due to the low word length high power
| |
| noise is introduced. With the help of noise shaping this noise is
| |
| usually rejected by about 120 dB in the audible band (0-20 kHz) at the
| |
| cost of a higher noise power in the ultrasonic range (+3.5dB for
| |
| frequencies above 100 kHz).
| |
|
| |
| = Limits & filter design restrictions =
| |
|
| |
| Noise Shaping without [[Dithering|''dithered'']] quantization is also possible but
| |
| less useful because the spectral characteristics of ''undithered''
| |
| quantization noise is not easily predictable and depends heavily on the
| |
| digital signal that is to be quantized.
| |
|
| |
| It's not possible to lower the noise power for ''all'' frequencies
| |
| via noise shaping because this would be in violation of the
| |
| rate/distortion principles. But it ''is'' possible to lower the noise
| |
| in certain frequency bands at the cost of increased noise power in other
| |
| frequency bands. This property is also known as the
| |
| ''noise shaping theorem''. It states that the average logarithmic magnitude response of
| |
| the corresponding noise shaping ''filter'' over the whole linear
| |
| frequency axis cannot drop below 0dB.
| |
|
| |
| The noise shaping application poses some restrictions on the fiter
| |
| design. The first sample of the filter's impulse response always equals
| |
| one. Also, interesting filters for noise shaping are so-called
| |
| ''minimum phase filters''. This type of filter performs optimal in the sense of
| |
| the noise shaping theorem because the average logarithmig magnitude
| |
| response ''is'' 0dB.
| |
|
| |
| For the special [[Super Audio Compact Disc|SACD]] case an additional
| |
| constraint comes into play: The magnitude response should not exceed
| |
| 3.5dB. This is enforced to prevent so-called limit cycles.
| |
|
| |
| The design of these kinds of filters is however not trivial. This is not
| |
| covered by this article.
| |
|
| |
| = Digital filter reminder =
| |
|
| |
| A common representation of digital filters is their ''transfer function''
| |
| in form of a rational polynomial in z^-1. These polynomials are defined
| |
| by their coefficients b0,b1,b2,...,a1,a2,...:
| |
|
| |
| <math>N(z) = \frac{B(z)}{A(z)} = \frac{b_0 + b_1 z^{-1} + b_2 z^{-2} + \ldots + b_m z^{-m}}
| |
| {1 + a_1 z^{-1} + a_2 z^{-2} + \ldots + a_m z^{-m}} </math>
| |
|
| |
| where 'N' is the transfer function and 'm' the filter's ''order''.
| |
|
| |
| == FIR filters ==
| |
|
| |
| In case a1=a2=a3=...=0 the filter's impulse response is finite which is
| |
| why these filters are called FIR ([[Finite Impulse Response Filter|finite impulse response]])
| |
| filters as opposed to IIR
| |
| [[Infinite Impulse Response Filter|infinite impulse response]] filters. Note that in case of
| |
| FIR filters the coefficients b0, b1, ... correspond to the impulse
| |
| response. The impulse response is another way a filter can be
| |
| characterized and describes what the filter does to a unit pulse. Let's
| |
| assume for the moment that a1=a2=a3=...=0. The the impulse response
| |
| would be:
| |
|
| |
| x: 0 0 1 0 0 0 0 ... 0 0 0 ... (input signal)
| |
| y: 0 0 b0 b1 b2 b3 b4 ... 0 0 0 ... (output signal)
| |
|
| |
| It helps to think of negative powers of z in transfer functions as a
| |
| delay: z^-1 = delay of one sample. z^-2 = delay of two samples etc.
| |
| Since a digital filter is a linear and time-invariant system (LTI) it is
| |
| possible to determine the filter's output by a superposition of scaled
| |
| and shifted impulse responses. This leads to the following relationship
| |
| between input and output in the case of FIR filters:
| |
|
| |
| time domain Z-transform
| |
| ----------- -----------
| |
| y[k] = b0 x[k] + b1 x[k-1] + b2 x[k-2] + ... Y(z) = X(z) * B(z)
| |
|
| |
| where x[] is the input signal, y[] the output signal and k the current
| |
| sample index. This is a simple convolution directly computed using the
| |
| given impulse response. The convolution in the time domain corresponds
| |
| to a multiplication in the Z-domain.
| |
|
| |
| == IIR filters ==
| |
|
| |
| IIR filters are now a simple extension to the FIR case:
| |
|
| |
| time domain Z-transform
| |
| ----------- -----------
| |
| y[k] + a1 y[k-1] + a2 y[k-2] + ... Y(z) * A(Z) = X(Z) * B(Z)
| |
| = b0 x[k] + b1 x[k-1] + b2 x[k-2] + ...
| |
|
| |
| Solving for Y we get
| |
|
| |
| time domain Z-transform
| |
| ----------- -----------
| |
| y[k] = b0 x[k] + b1 x[k-1] + b2 x[k-2] + ... Y(Z) = X(Z) * N(z)
| |
| - a1 y[k-1] - a2 y[k-2] + ...
| |
|
| |
| which is the well known ''difference formula''. Its signal flow graph
| |
| representation is known as
| |
| [http://ccrma.stanford.edu/~jos/filters/Direct_Form_I.html direct form I].
| |
| Now, let's see how we can apply this kind of filter to quantization
| |
| errors...
| |
|
| |
| = Noise Shaper Implementation =
| |
|
| |
| Usually a filter's transfer function is called 'H'. In this article 'N'
| |
| has been used because this filter is also known as noise transfer
| |
| function (NTF) in the context of noise shaping. Later, another filter
| |
| 'H' is introduced that has a special relation to 'N'.
| |
|
| |
| == Direct Form I ==
| |
|
| |
| For convenience let us substitute the actual difference in the
| |
| difference formula for 'u'. Also, b0 must be equal to one for noise
| |
| shaping filters as mentioned above. This leads to
| |
|
| |
| y[k] = x[k] + u[k] with u[k] = b1 x[k-1] + b2 x[k-2] + ...
| |
| - a1 y[k-1] - a2 y[k-2] + ...
| |
|
| |
| Note that since the filter is supposed to be applied on the quantization
| |
| noise y[] corresponds to the filtered noise and x[] to the unfiltered
| |
| noise. Suppose s[] is the signal we're trying to quantize. The following
| |
| pseudo code snippet shows the inner loop over k of ''one possible''
| |
| noise shaping implementation:
| |
|
| |
| (1) compute u[k]
| |
| (2) wanted[k] = s[k] + u[k]
| |
| (3) quantized[k] = quantize( wanted[k] + dither() )
| |
| (4) y[k] = quantized[k] - s[k]
| |
| (5) x[k] = y[k] - u[k]
| |
|
| |
| That's about it. It may not be obvious at first, though. Take a closer
| |
| look at what happens at step 3. Note that the following is true
| |
|
| |
| time domain Z-transform
| |
| ----------- -----------
| |
| quantized[k] = s[k] + u[k] + x[k] Q(z) = S(z) + X(z) * N(z)
| |
| = s[k] + y[k] = S(z) + Y(z)
| |
|
| |
| u[k], wanted[k] are temporary variables. The values are not needed
| |
| outside this iteration, so you don't need to remember these. But we
| |
| ''do'' need to remember the past filtered and unfiltered error samples
| |
| y[] & x[] upto some amount of samples determined by the filter order.
| |
|
| |
| == Decoupling filtering and quantization ==
| |
|
| |
| [[Image:Noiseshaper3.png|thumb|right|200px]]
| |
| As mentioned, the previous example is ''one possible'' implementation.
| |
| It's not the smartest one. But it's hopefully easy to understand.
| |
| There's another common formulation. We now require some more transfer
| |
| function fiddeling. First, we add one and subtract one again:
| |
|
| |
| <math> N(z) = \frac{B(z)}{A(z)} = 1 - \frac{A(z)}{A(z)} + \frac{B(z)}{A(z)}</math>
| |
|
| |
| Since now the two fractions have a common denominator we can combine
| |
| them easily to one fraction. Also, let's factor z^-1 out of the
| |
| fraction:
| |
|
| |
| <math>N(z) = 1 - \frac{A(z)-B(z)}{A(z)} = 1 - z^{-1} \frac{z ( A(z)-B(z) )}{A(z)} =: 1 - z^{-1} H(z)</math>
| |
|
| |
| The last thing we did is writing H(z) for z*(A(Z)-B(Z)). Note that with
| |
| b0=1 it follows that z*(1-b0)=0. So, H(z) can be written as
| |
|
| |
| <math>H(z) = \frac{c0 + c1 z^{-1} + c2 z^{-2} + \ldots}{1 + a1 z^{-1} + a2 z^{-2} + \ldots} </math>
| |
|
| |
| with <math>c_{k-1} = a_k - b_k</math> for <math>k = 1 \ldots m</math>. H(z)
| |
| is some ''arbitrary'' filter. The big advantage now is that it is
| |
| possible to plug nearly any digital filter implementation for H(z) into
| |
| the noise shaping loop. One obvious approach is the difference formula
| |
| from above but there are other alternatives with desirable properties.
| |
| However, the purpose of this article is not to discuss digital filters
| |
| in general but their application as noise shaping filters. So,
| |
| throughout the rest of this article H(z) is simply treated as a black
| |
| box. This black box has a certain state and gives us an output sample
| |
| for each input sample. The modified pseudo code loop looks like this:
| |
|
| |
| (1) h_out[k] = H.filter(last_error)
| |
| (2) wanted[k] = s[k] - h_out[k]
| |
| (3) quantized[k] = quantize( wanted[k] + dither() )
| |
| (4) last_error = quantized[k] - wanted[k]
| |
|
| |
| which is equivalent to the shown signal flow graph where the unit
| |
| sample delay z^-1 is implemented by the state variable 'last_error':
| |
| Remember that this black box filter H(z) doesn't correspond the the
| |
| actual noise shaping filter N(z). Depending on the context published
| |
| coefficients for noise shaping correspond either to N(z), H(z), or
| |
| -H(z).
| |
|
| |
| = Example filters =
| |
|
| |
| The following two filters have been designed by Sebastian Gesemann
| |
| via non-linear optimization to match a slightly modified version of
| |
| the LAME MP3 encoder's ATH (absolute threshold of hearing) curve. They
| |
| were first published in the
| |
| [http://www.hydrogenaudio.org/forums/index.php?showtopic=47980&pid=551558&st=0 Hydrogenaudio forum]
| |
| including the noise transfer function's second order factorizations.
| |
|
| |
| == Sampling rate of 44100 Hz, derived from LAME's ATH formula ==
| |
|
| |
| The noise transfer function is given by
| |
| <math>N(z) = \frac{1 -1.1474 z^{-1} +0.5383 z^{-2} -0.3520 z^{-3} +0.3475 z^{-4}}
| |
| {1 +1.0587 z^{-1} +0.0676 z^{-2} -0.6054 z^{-3} -0.2738 z^{-4}}</math>
| |
|
| |
| The corresponding filter H is given by
| |
| <math>H(z) = \frac{2.2061 -0.4706 z^{-1} -0.2534 z^{-2} -0.6214 z^{-3}}
| |
| {1 +1.0587 z^{-1} +0.0676 z^{-2} -0.6054 z^{-3} -0.2738 z^{-4}}</math>
| |
|
| |
| == Sampling rate of 48000 Hz, derived from LAME's ATH formula ==
| |
|
| |
| The noise transfer function is given by
| |
| <math>N(z) = \frac{1 -1.3344 z^{-1} +0.7455 z^{-2} -0.4602 z^{-3} +0.3463 z^{-4}}
| |
| {1 +0.9030 z^{-1} +0.0116 z^{-2} -0.5853 z^{-3} -0.2571 z^{-4}}</math>
| |
|
| |
| The corresponding filter H is given by
| |
| <math>H(z) = \frac{2.2374 -0.7339 z^{-1} -0.1251 z^{-2} -0.6033 z^{-3}}
| |
| {1 +0.9030 z^{-1} +0.0116 z^{-2} -0.5853 z^{-3} -0.2571 z^{-4}}</math>
| |