Lossless

From Hydrogenaudio Knowledgebase

Compression is lossless when decoding the compressed data gives a result that is bit-by-bit identical to the uncompressed original. Likewise, a format that stores data uncompressed is lossless if the original can be recovered from it bit-by-bit.
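
As a minimal sketch of this definition (not a statement about any particular audio codec), the check below round-trips some arbitrary bytes through Python's standard `zlib` compressor and through an uncompressed in-memory WAV container. The helper names `wav_encode`, `wav_decode` and `is_lossless` are made up for this illustration.

```python
import io
import wave
import zlib


def wav_encode(pcm: bytes, channels: int = 2, sample_width: int = 2,
               rate: int = 44100) -> bytes:
    """Wrap raw PCM in an uncompressed WAV container held in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(sample_width)
        w.setframerate(rate)
        w.writeframes(pcm)
    return buf.getvalue()


def wav_decode(wav_data: bytes) -> bytes:
    """Extract the raw PCM frames from WAV data."""
    with wave.open(io.BytesIO(wav_data), "rb") as w:
        return w.readframes(w.getnframes())


def is_lossless(original: bytes, encode, decode) -> bool:
    """True if decode(encode(x)) reproduces x bit-by-bit."""
    return decode(encode(original)) == original


# Arbitrary stand-in for audio: 4096 bytes = 1024 stereo 16-bit frames.
pcm = bytes(range(256)) * 16

print(is_lossless(pcm, zlib.compress, zlib.decompress))  # compressed case
print(is_lossless(pcm, wav_encode, wav_decode))          # uncompressed case
```

Any pair of functions can be passed as `encode`/`decode`; a pair that discards information, such as one that drops the last byte, fails the check.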

Lossless compression has long been used in various applications, for example generic file compressors like ZIP or RAR, or the Windows NTFS file compression feature; this article is about lossless compression of audio signals.

Lossless audio compression and formats

Compressing audio with generic file compressors to e.g. .7z or .rar is not efficient for typical audio signals: file sizes end up fairly close to the uncompressed original. Lossless audio formats, which utilize knowledge about real-world audio data, might come closer to half the size of the original uncompressed linear PCM (.wav or .aiff) file. File sizes will still be larger than audio compressed with any (reasonable) lossy encoder, as lossy compression aims at saving space by replacing the original signal with an approximation which is perceptually "close" but easier to compress.
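
One way to picture "knowledge about real-world audio data" is prediction: neighbouring samples tend to be similar, so a codec can store the small difference between each sample and a guess made from the previous ones. The sketch below uses a simple fixed second-order predictor on a synthetic sine wave. It illustrates the general idea only and is not the algorithm of any specific codec; the function names and the test signal are assumptions made for this example.

```python
import math


def encode_residuals(samples):
    """Predict s[n] as 2*s[n-1] - s[n-2] and keep only the prediction error."""
    residuals = []
    for n, s in enumerate(samples):
        if n < 2:
            residuals.append(s)  # warm-up samples are stored verbatim
        else:
            residuals.append(s - (2 * samples[n - 1] - samples[n - 2]))
    return residuals


def decode_residuals(residuals):
    """Invert encode_residuals exactly, using integer arithmetic only."""
    samples = []
    for n, r in enumerate(residuals):
        if n < 2:
            samples.append(r)
        else:
            samples.append(r + 2 * samples[n - 1] - samples[n - 2])
    return samples


# Synthetic test signal: 0.1 s of a 440 Hz sine sampled at 44.1 kHz.
samples = [round(10000 * math.sin(2 * math.pi * 440 * n / 44100))
           for n in range(4410)]
residuals = encode_residuals(samples)

peak_signal = max(abs(s) for s in samples)
peak_residual = max(abs(r) for r in residuals[2:])
print("peak sample:", peak_signal, "peak residual:", peak_residual)
print("round trip exact:", decode_residuals(residuals) == samples)
```

The residuals are much smaller numbers than the samples and so need fewer bits each, while the decoder still rebuilds the samples exactly. A generic file compressor does not model the signal in this way.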

Lossless audio file formats typically have features that generic file compressors lack (but most lossy audio formats possess): for playback they can be read block by block rather than having to unpack the whole file first, and a decoder might pick up the audio mid-stream and play from there (like tuning in to a radio channel). Furthermore, they can be tagged with metadata like artist, album, title, track number etc. Because metadata is designed to be altered by users at their discretion, a lossless audio format need not preserve metadata bit-by-bit, only the audio - although certain lossless codecs can also store the original's metadata in a separate chunk so that the original file can be recreated.
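
The block-by-block property can be sketched by compressing fixed-size chunks independently, so that decoding can start at any block boundary. This is a toy model under stated assumptions: `BLOCK_SIZE` and the function names are hypothetical, `zlib` stands in for an audio codec, and real formats use their own frame structures.

```python
import zlib
from typing import Iterator, List

BLOCK_SIZE = 4096  # bytes of PCM per block (arbitrary value for this sketch)


def encode_blocks(pcm: bytes) -> List[bytes]:
    """Compress each block on its own, with no dependency on earlier blocks."""
    return [zlib.compress(pcm[i:i + BLOCK_SIZE])
            for i in range(0, len(pcm), BLOCK_SIZE)]


def decode_from(blocks: List[bytes], start_block: int = 0) -> Iterator[bytes]:
    """Yield decoded PCM starting at start_block, skipping everything before it."""
    for block in blocks[start_block:]:
        yield zlib.decompress(block)


pcm = bytes(i % 251 for i in range(20000))  # arbitrary stand-in for audio
blocks = encode_blocks(pcm)

# "Tune in" at the fourth block without touching the first three.
tail = b"".join(decode_from(blocks, start_block=3))
print(tail == pcm[3 * BLOCK_SIZE:])
```

Compressing each block independently costs some compression efficiency, since no block can reuse what earlier blocks contained. That is part of the trade-off such formats make to allow playback and seeking.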

Two .zip-compressed copies of the same file might differ, for example because of the effort made to find a smaller file carrying the same information (try 7-zip with different compression options). Likewise, the same original audio file might encode to different sizes depending on both the codec format and the settings used upon encoding. Possibly the compressor's internal choices could even depend on the CPU, so that the same command given on two different computers produces different files. In every case the decoded audio is the same.
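
The same effect can be observed with a generic compressor. In this sketch, Python's standard `zlib` module compresses the same data at several effort levels; the test data and the helper name `encode_at_levels` are arbitrary choices for illustration. The compressed sizes may differ between levels, but every variant decodes to the identical original.

```python
import zlib

# Repetitive but non-trivial test data (arbitrary, for illustration only).
data = b"".join(b"sample %d of the test signal\n" % (i * i % 997)
                for i in range(5000))


def encode_at_levels(payload: bytes, levels=(1, 6, 9)) -> dict:
    """Compress the same payload at several zlib effort levels."""
    return {level: zlib.compress(payload, level) for level in levels}


encoded = encode_at_levels(data)
for level, blob in encoded.items():
    # Sizes may vary by level; the decoded result must not.
    print(level, len(blob), zlib.decompress(blob) == data)
```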

The term "lossless" is not restricted to files. It also applies to data streams (like the lossless audio in a video file), to audio not stored in files (an audio CD has no files) and, furthermore, to the process that generates a signal. E.g. reducing a 16-bit signal to 8 bits is not a "lossless" operation, and it does not become lossless even if the output signal is stored in a "lossless" format like FLAC (or even uncompressed .wav or .aiff). MQA is lossy processing even if delivered with a codec that could deliver the lossless signal.
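
The 16-bit to 8-bit case can be sketched as follows. The bit reduction here is a plain truncation (no dither), `zlib` stands in for a lossless container, and the function names and sample values are made up for this example. Storing the reduced signal is exact, yet the original 16-bit samples cannot be recovered from it.

```python
import array
import zlib


def reduce_to_8_bits(samples16):
    """Keep only the top 8 bits of each signed 16-bit sample (lossy step)."""
    return [s >> 8 for s in samples16]


def expand_to_16_bits(samples8):
    """Scale 8-bit samples back to the 16-bit range; the low bits stay lost."""
    return [s << 8 for s in samples8]


original = [0, 1, 255, 256, 1000, -1, -300, 32767, -32768]
reduced = reduce_to_8_bits(original)

# Lossless storage of the already-reduced signal.
stored = zlib.compress(array.array("b", reduced).tobytes())
restored = list(array.array("b", zlib.decompress(stored)))

print("storage step exact:", restored == reduced)
print("original recovered:", expand_to_16_bits(restored) == original)
```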

Notable lossless codecs in current use

Different codecs - i.e., formats and encoders/decoders - have been developed with different priorities in mind, as a trade-off between compressed file size, encoding CPU load (time taken to encode) and decoding CPU load (to play, or to decompress for e.g. creating lossy files for portable use). They also differ in features and in OS/third-party support. Thus there is no single 'superior for all' format. To compare features and performance, see the HA Wiki's Lossless comparison - though arguably, performance was more of an issue with the storage/CPU costs of the early 2000s, when most popular lossless formats were launched and when the first versions of the comparison and of this article were written.

Some formats in current use - some widespread and available from online music stores, others arguably restricted to the enthusiast user segments - in alphabetical order:

Blu-ray/DVD discs are also widespread. They carry a variety of audio formats, of which the lossless compressed ones are Meridian Lossless Packing (MLP), Dolby TrueHD (which uses the MLP algorithm) and DTS-HD MA (hybrid). FFmpeg has support for these.

Other (once) notable formats

These formats were at some stage widely used or otherwise notable, though end-users would hardly encode to them anymore (as of 2022). Some of them can be decoded by ffmpeg even if the original decoding executables are not available.

  • Shorten (SHN): The major lossless compressor of the 1990s. ffmpeg can decode.
  • WMA lossless: Once aggressively pushed by Microsoft, support for the WMA formats has waned to the point where certain Windows 10 releases could not handle WMA lossless(ly). Not recommended. ffmpeg can decode.
  • ATRAC Advanced Lossless: a lossless hybrid extension of Sony's ATRAC format (MiniDisc etc.). Like WMA, a once-corporate-backed format now considered legacy. ffmpeg can decode.
  • mp3HD: A short-lived similar extension of MP3, hybrid with a lossless correction stream.
  • Real Lossless. Before the Windows Media suite, Real Networks had its own media suite, which was expanded with a lossless audio format and a freeware encoder. Real would later support the development of MPEG-4 ALS.
  • MPEG-4 ALS. Despite being an ISO standard, with an encoder/decoder available, the format scarcely caught on. Its predecessors LPAC/LTAC once enjoyed some popularity in competition with Shorten. ffmpeg can decode ALS' most common compression mode.
  • MPEG-4 SLS. Also ISO-standardized, but hardly in use, and obviously not intended for end-users, witnessed by the pricing of the only known encoder.
  • Lossless Audio (La). Notable for its very high compression levels, and would therefore appear in comparison tests. Unmaintained since 2004.
  • Sac. Only semi-notable for its even higher compression levels, not for ever being practically useful other than for benchmarking.
  • RK Audio (RKAU). Also came with a lossy compressor, and was followed by the general-purpose compressor WinRK. RKAU offered good compression by year 2000 standards. ffmpeg can decode the lossless codec since version 6.0.
  • Bonk. Also with a lossy compressor, both abandoned around 2002. More notable for the project evolving into the BonkEnc CD ripper, which later changed name to fre:ac. Bonk itself was redeveloped into a lossy/lossless codec called sonic which has ffmpeg support. ffmpeg can decode the lossless Bonk codec since version 6.0.
  • aptX Lossless is a codec to be used in Bluetooth streaming. Hardware support announced September 2021, future popularity unknown at time of writing.

Several audio editing applications also have (or had) their own formats, several of which are still in use.

Oddball legacy formats

There are several old lossless formats that never made it to a significant userbase. Most of those would have disappeared by now, but several are being preserved for posterity at rjamorim's Rarewares/ReallyRareWares website.

  • a-Pac (by sound card manufacturer MARIAN)
  • Advanced Digital Audio (ADA)
  • AudioZip
  • Dakx WAV
  • Entis Lab MIO
  • Kexis
  • LiteWave
  • mkw
  • OggSquish (Xiph, discontinued in favour of FLAC).
  • Pegasus SPS
  • Split2000
  • Sonarc (possibly the first known lossless audio compressor, apparently predating both Shorten and VocPack)
  • VocPack
  • WavArc
  • WaveZip/MUSICompress

And finally, HA will sometimes see codecs created more for educational purposes than intended to acquire a userbase, like SLAC by the WavPack developer and SELA (HA thread links).

Further reading

This wiki has a Lossless Codec Comparison, originally by Rjamorim.
External links, covering both lossless and lossy codecs/formats: