Huffman coding
Most lossy audio encoders (MP3, Vorbis, AAC) use a common algorithmic technique, known as Huffman coding. This is not a compression algorithm, but rather a coding method, known as (entropy coding). This coding creates variable length codes on a whole number of bits. Higher probability symbols have shorter codes. Huffman codes have the property to have a unique prefix, they can therefore be decoded correctly in spite of their variable length. The decoding step is very fast (via a correspondence table). This kind of coding allows to save on the average a bit less than 20% of space.
It is an ideal complement of the perceptual coding: During big polyphonies, the perceptual coding is very efficient, due to the fact that many sounds are masked or lessened, but little information is identical, so the Huffman algorithm is very seldomly efficient. During "pure" sounds there are few masking effects, but Huffman is then very efficient because digitalized sound contains many repetitive bytes. These bytes will later be replaced by shorter codes.
text © Gabriel Bouvigne - http://mp3-tech.org/
Additional reading:
- Huffman Original Research Paper from 1952 on Entropy Coding.
- Huffman soure code an ANSI C implementation of the algorithm.
- n-ary Huffman Template algorithm written in C++ (GPL liscense).