Difference between revisions of "LossyWAV"

From Hydrogenaudio Knowledgebase

Revision as of 11:06, 24 January 2008

lossyWAV

Developer(s): Nick.C
Release information
  Stable release: v0.6.7 RC2
  Preview release: beta v0.7.1
Compatibility
  Operating system: MS-DOS
Additional information
  Use: Digital signal processing
  License: LGPL
  Website: Hydrogenaudio

lossyWAV is a new, free, lossy pre-processor for PCM audio contained in the WAV file format. It reduces the bit depth of the input signal, which, when used in conjunction with certain lossless codecs, significantly reduces the bitrate of the encoded file compared with encoding the unprocessed audio. lossyWAV's primary goal is to maintain transparency with a high degree of confidence when processing any audio data.

History

lossyFLAC is an idea started by 2Bdecided at hydrogenaudio. It uses the wasted bits feature of the FLAC lossless codec: by transparently reducing audio bit depth (setting some of the least significant bits (LSBs) to zero), it takes advantage of FLAC's detection of consistently zeroed least significant bits within each frame, which significantly increases coding efficiency.[1] In this way the user can enjoy audio encoded using the same codec (which may be all-important from a hardware compatibility perspective) at a reduced bitrate compared to the lossless version.

Nick.C ported the original MATLAB implementation to Delphi (Many thanks CodeGear for Turbo Explorer!!) with a liberal sprinkling of IA-32 and x87 Assembly Language for speed.

Subsequently, lossyFLAC proved itself to work with other lossless codecs, so the application name was changed to lossyWAV.

Since then, Nick.C has heavily developed and built upon lossyWAV, with valuable tuning performed by halb27 at hydrogenaudio.

Indicative bitrate reduction

It must be stressed that lossyWAV is a pure variable bit rate pre-processor. Bits-to-remove from the audio data are calculated on a block-by-block basis (default codec-block length = 512 samples) using overlapping fast Fourier transform (FFT) analyses of at least two lengths (default = 64 and 1024 samples). After some manipulation, the results of the FFT analyses for a specific codec-block are grouped, and the minimum value is used as the bits-to-remove for the whole codec-block. Each sample in the codec-block is then rounded so that its lowest <bits-to-remove> LSBs are zero. In this way the wasted bits feature of FLAC and similar codecs is exploited.
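
The last two steps described above (taking the minimum across analyses, then rounding each sample) can be illustrated with a minimal Python sketch. It is not lossyWAV's actual code: the FFT analysis itself is not reproduced, the function names are hypothetical, and the round-to-nearest rule and the clamping to the 16-bit range are assumptions made for the illustration.

```python
def bits_for_block(bits_per_analysis):
    """Use the most conservative (smallest) result of all FFT analyses."""
    return min(bits_per_analysis)


def remove_bits(samples, bits, bit_depth=16):
    """Round integer samples so that their lowest `bits` bits are zero.

    Rounding to nearest and clamping are assumptions of this sketch.
    """
    if bits <= 0:
        return list(samples)
    half = 1 << (bits - 1)                      # half of one rounding step
    lo = -(1 << (bit_depth - 1))                # most negative sample value
    hi = (((1 << (bit_depth - 1)) - 1) >> bits) << bits  # largest valid multiple
    out = []
    for s in samples:
        v = ((s + half) >> bits) << bits        # round to a multiple of 2**bits
        out.append(min(max(v, lo), hi))         # keep the result in range
    return out


def wasted_bits(block):
    """Count the zero LSBs shared by every sample, as a codec might detect them."""
    combined = 0
    for s in block:
        combined |= s
    if combined == 0:
        return 0                                # all-zero block: sketch convention
    return (combined & -combined).bit_length() - 1


block = [1000, -1000, 5, -5, 32767]
rounded = remove_bits(block, bits_for_block([5, 3, 4]))
print(rounded)               # [1000, -1000, 8, -8, 32760]
print(wasted_bits(rounded))  # 3
```

Because every sample in the rounded block is a multiple of 8, a codec that detects wasted bits only needs to store the remaining 13 bits per sample for that block.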

lossyWAV Test Set       | FLAC -8 | Version     | lossyWAV -1 | lossyWAV -2 | lossyWAV -3
10 Album Test Average   | 850kbps | beta v0.5.8 | 480kbps     | 426kbps     | 376kbps
53 sample "problem" set | 784kbps | beta v0.5.8 | 543kbps     | 491kbps     | 434kbps
10 Album Test Average   | 850kbps | v0.6.7 RC2  | ----        | ----        | 402kbps
53 sample "problem" set | 784kbps | v0.6.7 RC2  | 558kbps     | 515kbps     | 462kbps

Large Foobar2000 Conversion: (lossyWAV -3; FLAC -5 -b 512)
Album                                                   | FLAC -8 | Version     | lossyWAV -3
3686 Tracks; 290 Discs; 100913MB > 42391MB (42%)        | 884kbps | beta v0.6.2 | 371kbps

The 3686-track conversion took 7 hours 15 minutes on an Intel Core 2 Duo @ 3GHz with 2GB RAM.
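
As a quick consistency check, the figures quoted for the large conversion can be recomputed. The short Python sketch below uses only the numbers given above; the variable names are illustrative.

```python
original_mb, processed_mb = 100913, 42391   # collection size before and after
flac_kbps, lossy_kbps = 884, 371            # average bitrates from the table
tracks = 3686
seconds = 7 * 3600 + 15 * 60                # 7 hours 15 minutes

size_ratio = processed_mb / original_mb     # fraction of the original size
rate_ratio = lossy_kbps / flac_kbps         # fraction of the original bitrate
per_track = seconds / tracks                # average processing time per track

print(f"size: {size_ratio:.1%}, bitrate: {rate_ratio:.1%}")  # size: 42.0%, bitrate: 42.0%
print(f"{per_track:.1f} s per track")                        # 7.1 s per track
```

The size ratio and the bitrate ratio agree at about 42%, and the conversion averaged roughly seven seconds per track on the stated hardware.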

File identification

lossyWAV-processed WAV files are named with a double filename extension, .lossy.wav, to make them instantly identifiable. For example, ".lossy.flac" indicates an audio file which was processed using lossyWAV and subsequently encoded using FLAC.[2]

From beta v0.6.1, the -correction parameter is used when processing to create a correction file which is named with the .lwcdf.wav double filename extension. When "added" to the corresponding .lossy.wav, using a not yet implemented parameter of lossyWAV, the original file will be reconstituted.

Combinations of lossyWAV with each specific encoder are referred to as lossyX, where X is an abbreviation of the lossless codec name. Combination names are listed in the "known supported codecs" section below.

From beta v0.5.9, lossyWAV inserts a variable length FACT chunk into the WAV file immediately after the FMT chunk. This takes the form:
fact/<size>/lossyWAV beta vx.y.z : dd/mm/yyyy hh:mm:ss
-2 -cbs 512 -nts 0.00 -snr 21.00 -skew 36.00
-spf 22224-22235-22336-12347-12358 -fft 10101
Here the version, date and time, and user settings are copied into the chunk. Additionally, if a lossyWAV FACT chunk is found in a file, processing is halted (exit code = 16) to prevent re-processing of an already processed file.
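
To illustrate how such a chunk could be detected by other software, here is a minimal Python sketch that walks the RIFF chunks of a WAV file held in memory. It is not part of lossyWAV: the function name is hypothetical, and it assumes that the chunk ID is the lowercase four-character code "fact" and that the chunk body starts with the ASCII text "lossyWAV", as the layout above suggests.

```python
import struct


def find_lossywav_fact(data: bytes):
    """Return the text of a lossyWAV 'fact' chunk, or None if there is none."""
    if len(data) < 12 or data[0:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12                                     # first chunk follows the header
    while pos + 8 <= len(data):
        chunk_id = data[pos:pos + 4]
        (size,) = struct.unpack_from("<I", data, pos + 4)
        body = data[pos + 8:pos + 8 + size]
        # Assumption: the lossyWAV marker is plain ASCII at the start of the body.
        if chunk_id == b"fact" and body.startswith(b"lossyWAV"):
            return body.rstrip(b"\x00").decode("ascii", errors="replace")
        if chunk_id == b"data":
            break                                # no need to scan the audio data
        pos += 8 + size + (size & 1)             # RIFF chunks are word-aligned
    return None
```

A caller would read the file with open(path, "rb").read() and treat a non-None result as "already processed".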

The -check parameter can be used to determine whether a file has previously been processed without trying to process it: the exit code is 16 if the file has already been processed and 0 if not.
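
A script can act on these exit codes. The following Python sketch shows one way to wrap the check; the function name, the assumption that the executable is reachable as "lossyWAV", and the handling of any other exit code as an error are choices made for this example and are not documented behaviour.

```python
import subprocess

ALREADY_PROCESSED = 16   # exit code stated above for an already processed file


def already_processed(wav_path, exe="lossyWAV", run=subprocess.run):
    """Run `lossyWAV <file> -check` and interpret its exit code."""
    result = run([exe, wav_path, "-check"])
    if result.returncode == ALREADY_PROCESSED:
        return True
    if result.returncode == 0:
        return False
    # Any other code is not described in the text, so treat it as a failure.
    raise RuntimeError(f"unexpected exit code {result.returncode}")
```

The `run` argument exists only so that the function can be exercised without the real executable, for example by passing a stub that returns an object with a `returncode` attribute.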

Quality presets

  • -1: Highest quality preset, disc space-saving alternative to lossless archiving for large audio collections.
  • -2: Default preset; a compromise between -1 and -3.
  • -3: High quality preset for usage on a compatible DAP, approx. 400kbps for "normal" music. [3]

All tuning has been performed on quality preset -3, with -2 and -1 being more conservative. Quality preset -3 is generally accepted to be (and from testing so far is) transparent. If you find a track for which -3 fails to achieve transparency, please post a sample (no more than 30 seconds) in the development thread.

Apart from the quality presets, the -nts (noise threshold shift) parameter is the most important control over quality. Without noise threshold shifting (-nts 0) the number of bits to be removed is computed in a theoretically optimal way. -nts 0 is the default for quality preset -3.

To be more defensive, a negative -nts value can be chosen, down to -48. Quality preset -2 defaults to -nts -2, and quality preset -1 defaults to -nts -4. For archiving purposes and/or very cautious users, even more conservative values may be of interest.

Because of internal precautions in addition to 2Bdecided's principles, experience so far suggests that a small positive -nts value keeps the encoding transparent or introduces only subtle differences, so it can be used to decrease file size. A -nts value of more than 10, however, is not recommended.
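
The relationship between presets and -nts values can be summarised in a small Python sketch. The preset defaults and the "more than 10" advice come from the text above, and the accepted range (-48 to +12) is taken from the usage text quoted under "Application settings". The function name is hypothetical, and the idea that an explicit -nts value replaces the preset default is an assumption.

```python
# Default -nts value implied by each quality preset, as stated in the text.
PRESET_NTS = {"-1": -4.0, "-2": -2.0, "-3": 0.0}

NTS_MIN, NTS_MAX = -48.0, 12.0   # range accepted according to the usage text
NTS_RECOMMENDED_MAX = 10.0       # values above this are not recommended


def resolve_nts(preset="-2", nts=None):
    """Return (effective -nts value, advisory note) for a preset and override."""
    if preset not in PRESET_NTS:
        raise ValueError(f"unknown preset {preset!r}")
    # Assumption: an explicit -nts value replaces the preset's default.
    value = PRESET_NTS[preset] if nts is None else float(nts)
    if not NTS_MIN <= value <= NTS_MAX:
        raise ValueError(f"-nts must be between {NTS_MIN} and {NTS_MAX}")
    if value > NTS_RECOMMENDED_MAX:
        note = "not recommended"
    elif value > 0:
        note = "fewer bits kept, smaller files"
    elif value < 0:
        note = "more conservative than -nts 0"
    else:
        note = "no noise threshold shift"
    return value, note


print(resolve_nts("-1"))          # (-4.0, 'more conservative than -nts 0')
print(resolve_nts("-3", nts=11))  # (11.0, 'not recommended')
```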

Supported input formats

  • WAV: 4-bit to 32-bit integer; sample rate ≥ 32kHz PCM. Very high sample rates (>48kHz) have not been extensively tested. Tunings have been focussed on 16-bit, 44.1kHz samples (i.e. CD PCM).

Codec compatibility

Known supported codecs

Recommended settings
Codec        | lossyWAV parameters | Encoder parameters                   | Combination name
FLAC         |                     | -5 -b 512 --keep-foreign-metadata[4] | lossyFLAC
LPAC         |                     | -b512                                | lossyLPAC
MPEG-4 ALS   |                     | -l -n512                             | lossyALS
TAK          |                     | -fsl512                              | lossyTAK
WavPack      |                     | --blocksize=512                      | lossyWV
WMA Lossless |                     |                                      | lossyWMALSL

There is also evidence (so-called "Bit Shifting") to suggest that lossyWAV may work with MLP, but this remains untested due to the prohibitive prices of encoders.

A comparison of portable media players is here, which shows FLAC and WMA Lossless compatibility among listed players. Any player supported by Rockbox can use FLAC or WavPack files after installing Rockbox.

Known unsupported codecs

Using lossyWAV

Application settings

lossyWAV v0.6.7 RC2, Copyright (C) 2007,2008 Nick Currie. Portions (C) 1996
Don Cross. lossyWAV is issued with NO WARRANTY WHATSOEVER and is free software.

Usage   : lossyWAV <input wav file> <options>

Example : lossyWAV musicfile.wav

Quality Options:

-1            highest quality preset, circa 500kbps for 44.1khz, 2ch;
-2            default settings, circa 450kbps for 44.1khz, 2ch;
-3            DAP preset, circa 400kbps for 44.1khz, 2ch.

Standard Options:

-o <folder>   destination folder for the output file(s).
-nts <n>      set noise_threshold_shift to n dB (-48.0dB<=n<=+12.0dB);
              (-ve values reduce bits to remove, +ve values increase).
-snr <n>      set minimum average signal to added noise ratio to n dB;
              (0.0dB<=n<=48.0dB) Increasing value reduces bits to remove.
-force        forcibly over-write output file if it exists; default=off.
-check        check if WAV file has already been processed. default=off;
              errorlevel=16 if already processed, 0 if not.
-correction   write correction file while processing WAV file. default=off;

System Options:

-quiet        significantly reduce screen output.
-nowarn       suppress lossyWAV warnings.
-detail       enable detailled output mode.

-below        set process priority to below normal.
-low          set process priority to low.

Special thanks:

David Robinson for the method itself and motivation to implement it.
Don Cross for the original Pascal source for the FFT algorithm used.
Horst Albrecht for valuable tuning input and feedback.

Example Foobar2000 converter settings

Foobar2000 Converter Settings.PNG

Example flossy3.bat file called from Foobar2000

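@echo off
rem %1 = input WAV, %2 = output file name, %3-%9 = extra lossyWAV options
z:\bin\lossyWAV %1 %3 %4 %5 %6 %7 %8 %9 -below -nowarn -quiet
rem Encode the processed file with FLAC using a 512-sample block size
z:\bin\flac.exe -5 -f -b 512 "%~N1.lossy.wav" -o"%~N2.flac"
rem Remove the intermediate .lossy.wav file
del "%~N1.lossy.wav"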

Frequently asked questions

  • Question: Is it VBR?
  • Short answer: Yes.
  • Question: Is it transparent?
  • Short answer: Almost certainly.
  • Question: Is it lossless?
  • Short answer: No.
  • Question: Why should I use this?
  • Answer:
      • high quality
      • extremely low chance of audible artefacts
      • reasonable bitrates
      • usable with unmodified, established lossless formats.

Current test settings

No current test settings

External links