Helix MP3 Encoder
Developer(s) | RealNetworks; maikmerten maintains the GitHub repo |
Release information | |
---|---|
Stable release | |
Compatibility | |
Operating system | Linux, Windows |
Additional information | |
Use | Encoder |
License | RPSL |
Website | GitHub repo |
The Helix MP3 Encoder was open-sourced by RealNetworks ca. 2005 via the (long-defunct) Helix community project. It originated from the Xing MP3 encoder, which was purchased by RealNetworks.
A current version ("hmp3"), with contributions from HydrogenAudio members, is available as source code over at https://github.com/maikmerten/hmp3. This Wiki page discusses that version.
Features
- Encodes MP3 in MPEG-1 and MPEG-2 modes
- 48 kHz, 44.1 kHz, 32 kHz (MPEG-1)
- 24 kHz, 22.05 kHz, 16 kHz (MPEG-2)
- LAME headers for gapless playback
- CBR and VBR encoding
Listening tests
The Helix MP3 encoder has participated in several listening tests and proved to be among the highest-quality MP3 encoders available.
- Public MP3 listening test, October 2008
- Personal listening test by Kamedo2, ~224 kbps, May 2013
- Personal listening test by Kamedo2, ~192 kbps, December 2016
Encoder switches
hmp3 is a command-line application. The most basic invocation to generate an MP3 file from a WAV file:
hmp3 input.wav output.mp3
This creates a ~128 kbps VBR file for 44.1 kHz stereo input.
Switch | Function | Example |
---|---|---|
-B | Set per-channel bitrate. Selects CBR encoding. | -B64 for a 128 kbps stereo CBR file |
-F | Frequency cutoff for the encoder lowpass filter. To actually encode anything beyond 16 kHz, also specify the -HF switch. | -F19000 for a 19 kHz lowpass |
-HF | Controls encoding of high-frequency content (> 16 kHz). Disabled by default. Valid values are 0 (disabled), 1 (partial, only "mode-1 granules"), 2 (full, "all granules"). Note that high-frequency content will only be encoded if the psychoacoustic model deems it beneficial for the given bitrate/quality settings, and only if -V >= 80 or -B >= 96. | -HF2 for unrestricted high-frequency encoding |
-M | Stereo-mode/Mono selection. 0: stereo, 1: M/S stereo (default), 2: dual channel, 3: mono | -M3 to downmix to mono |
-N | Enable use of Intensity Stereo. Only works with CBR and makes the encoder use "Bit Allocator 1" (see section "Bit Allocators") | -N8 to enable Intensity Stereo with 8 bands of M/S stereo |
-SBT | Threshold for short-block decisions. Lower values mean more short-block usage. Default is 700. | -SBT500 for more short-blocks (more responsive to transients, might increase bitrate in VBR) |
-U | Select assembly optimizations. 0: only generic optimizations, 1: unused (was supposed to be AMD's 3DNow!), 2: use SSE assembly optimizations (Intel Pentium 3). This only has an effect if the encoder is compiled with Visual Studio (up to version 2015) for 32-bit Windows. No effect if the encoder is compiled for Linux, 64-bit Windows or, e.g., ARM processors. | -U2 to use SSE assembly (where applicable) |
-V | Quality setting for VBR encoding. Ranges from 0 to 150. Default is 50. | -V115 for a ~180-200 kbps stereo VBR file |
-X | Control writing of Xing/LAME header information. 0: No headers, 1: only basic Xing information header, 2: Xing header with VBR-TOC and LAME header (gapless information) (default) | -X0 to disable headers (in very rare cases of incompatibility) |
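As a rough illustration of how the switches in the table combine, here is a minimal Python sketch that assembles an hmp3 command line. The helper name `hmp3_args` and its parameters are made up for this example and are not part of hmp3; only the switches themselves (-F, -HF, -V, -B) and their value ranges come from the table. Switches are placed after the file names, as in the invocations shown on this page.

```python
def hmp3_args(wav_in, mp3_out, *, vbr_quality=None, cbr_total_kbps=None,
              channels=2, lowpass_hz=None, hf_mode=None):
    """Build an hmp3 argument list (hypothetical helper, not part of hmp3)."""
    if vbr_quality is not None and cbr_total_kbps is not None:
        raise ValueError("choose either a VBR quality or a CBR bitrate")
    args = ["hmp3", wav_in, mp3_out]
    if lowpass_hz is not None:
        args.append(f"-F{lowpass_hz}")
    if hf_mode is not None:
        if hf_mode not in (0, 1, 2):
            raise ValueError("-HF accepts 0, 1 or 2")
        args.append(f"-HF{hf_mode}")
    if vbr_quality is not None:
        if not 0 <= vbr_quality <= 150:
            raise ValueError("-V ranges from 0 to 150")
        args.append(f"-V{vbr_quality}")
    if cbr_total_kbps is not None:
        # -B is a per-channel bitrate, so a 128 kbps stereo file needs -B64.
        args.append(f"-B{cbr_total_kbps // channels}")
    return args


print(" ".join(hmp3_args("input.wav", "output.mp3",
                         lowpass_hz=19000, hf_mode=2, vbr_quality=110)))
# prints: hmp3 input.wav output.mp3 -F19000 -HF2 -V110
```

If hmp3 is on the PATH, the returned list can be handed to `subprocess.run(args, check=True)` to perform the actual encode.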
Reasonable Settings
Here's a short list of settings for different encoding needs. Note that while comparisons to LAME's VBR settings are provided, these are only very rough estimates to provide guidance regarding potential use cases. LAME and Helix are very different encoders, and either may perform better or worse than the other depending on the audio material.
Setting | Approx. Bitrate | Description |
---|---|---|
-HF2 -V150 | ~ 256 kbps | Maximum quality VBR encoding, with full audio spectrum. (ca. LAME -V 0) |
-F19000 -HF2 -V110 | ~ 195 kbps | High-quality VBR encoding, audio spectrum up to 19 kHz. (ca. LAME -V 2) This should be close to transparent to most people in most situations. |
-F18000 -HF2 -V80 | ~ 160 kbps | Medium-quality VBR encoding, audio spectrum up to 18 kHz. (ca. LAME -V 4) |
-V50 | ~ 128 kbps | Low-medium-quality VBR encoding, audio spectrum up to 16 kHz. (ca. LAME -V 5-6) Default setting of the Helix MP3 Encoder. Should be sufficient for casual listening on space-constrained devices, but is not expected to be universally transparent. |
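For scripted use, the table can be turned into a small lookup. The following Python sketch is only an illustration: the names `PRESETS` and `closest_preset` are invented here, and the bitrates are the approximate figures from the table, which will vary with the audio material.

```python
# (switches, approximate bitrate in kbps), copied from the table above.
PRESETS = [
    ("-HF2 -V150", 256),
    ("-F19000 -HF2 -V110", 195),
    ("-F18000 -HF2 -V80", 160),
    ("-V50", 128),
]


def closest_preset(target_kbps):
    """Return the switches whose approximate bitrate is nearest the target."""
    switches, _ = min(PRESETS, key=lambda p: abs(p[1] - target_kbps))
    return switches


print(closest_preset(192))  # prints: -F19000 -HF2 -V110
```

Since VBR bitrates depend on the input, the result is a starting point to refine by listening, not a guarantee of the final file size.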
Technical details
The following details might not be relevant for daily use of the Helix MP3 Encoder, but might be interesting to developers.
Bit allocators
The Helix MP3 Encoder, apparently for historical reasons, has two distinct bit allocators, which are selected depending on operating modes. Bit Allocator 1 (bitallo1.cpp) appears to be the older one, most likely inherited from early Xing days, while Bit Allocator 3 (bitallo3.cpp) is a newer, overall more-capable mechanism that is utilized by default.
Feature | Bit Allocator 1 | Bit Allocator 3 |
---|---|---|
CBR | supported | supported |
VBR | not supported | supported |
>16 kHz encoding | not supported | supported |
Long/Short block switching | not supported | supported |
Stereo | supported | supported |
M/S-Stereo | supported | supported |
Dual Channel Stereo | supported | not supported |
Intensity stereo | supported | not supported |
Bit Allocator 1 is thus mostly of interest for very low-bitrate CBR encoding, where intensity stereo can save bits that can then be spent elsewhere. Example:
hmp3 input.wav output.mp3 -F16000 -B48 -N8
for somewhat bearable low-bitrate stereo-ish MP3 encoding (the -N parameter enables intensity stereo).
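The feature table can also be read as a lookup: given a set of required features, which allocator supports all of them? The Python sketch below encodes the table as data. It is not how the encoder itself selects an allocator, and the names `FEATURES` and `allocators_supporting` are made up for this illustration.

```python
ALLOCATORS = ("Bit Allocator 1", "Bit Allocator 3")

# Feature -> (supported by Bit Allocator 1, supported by Bit Allocator 3),
# transcribed from the table above.
FEATURES = {
    "CBR": (True, True),
    "VBR": (False, True),
    ">16 kHz encoding": (False, True),
    "Long/Short block switching": (False, True),
    "Stereo": (True, True),
    "M/S-Stereo": (True, True),
    "Dual Channel Stereo": (True, False),
    "Intensity stereo": (True, False),
}


def allocators_supporting(*features):
    """List the allocators that support every requested feature."""
    return [name for i, name in enumerate(ALLOCATORS)
            if all(FEATURES[f][i] for f in features)]


print(allocators_supporting("CBR", "Intensity stereo"))  # prints: ['Bit Allocator 1']
print(allocators_supporting("VBR", "Intensity stereo"))  # prints: []
```

The empty result for VBR plus intensity stereo matches the note on the -N switch: intensity stereo only works with CBR.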
Detection of transients
To detect transients that warrant a switch to short blocks, the Helix MP3 Encoder uses output from the 32-band polyphase filterbank. The encoder computes "energy" values from the filterbank output and compares current energy values with values for the previous granule (detect.c). This is accomplished by an "energy history" (defined as "attack_buf" in mp3enc.h).
If the energy values differ enough (above the threshold for short-block detection), short blocks will be used. The encoder will use short blocks for both channels, even if the signal of only one channel triggered the transient detection.
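The general idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a port of detect.c: energy is taken here as a plain sum of squares over one granule (576 samples), and `threshold_ratio` is a hypothetical stand-in whose scale is unrelated to the values accepted by -SBT. All function names are invented for this example.

```python
def granule_energy(samples):
    """Sum of squared samples for one granule of filterbank output."""
    return sum(s * s for s in samples)


def use_short_blocks(prev_granules, cur_granules, threshold_ratio=4.0):
    """Decide on short blocks for all channels at once.

    prev_granules / cur_granules hold one sample sequence per channel.
    A sufficiently large energy jump in any single channel is enough.
    """
    for prev, cur in zip(prev_granules, cur_granules):
        e_prev = granule_energy(prev)
        e_cur = granule_energy(cur)
        # The small floor keeps digital silence from disabling the comparison.
        if e_cur > threshold_ratio * max(e_prev, 1e-12):
            return True
    return False


quiet, loud, steady = [0.01] * 576, [0.5] * 576, [0.1] * 576
# Attack in the left channel only: both channels switch to short blocks.
print(use_short_blocks([quiet, steady], [loud, steady]))     # prints: True
print(use_short_blocks([steady, steady], [steady, steady]))  # prints: False
```

Lowering `threshold_ratio` in this sketch makes the detector fire more often, which mirrors the documented behaviour of -SBT, where lower values mean more short-block usage.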
Assembly optimizations
In platform/win/i386, there are optimized assembly versions of speed-critical routines. These target 32-bit x86-CPUs in general (but usually optimized for the Pentium 2), with some routines also being available for SSE, targeting the Pentium 3 (assembly files starting with "x"). These SSE optimizations can be selected via the -U parameter.
These routines are somewhat outdated. They only work in Visual Studio up to version 2015, only for the Windows platform - and only for 32-bit targets. It has been demonstrated that modern compilers generate faster code from the pure C source code. As such, these assembly optimizations appear to be superfluous.
That modern compilers can generate faster code might (but this is speculation) have something to do with the hand-written assembly version mixing x87 FPU instructions (in the routines for general i386 CPUs) with SSE instructions (which have their own register set). Modern CPUs appear to prefer doing all (even scalar) floating point operations in SSE or AVX registers.
External links
- Helix MP3 encoder (Windows command-line executable) at RareWares
- Resurrecting/Preserving the Helix MP3 encoder on hydrogenaudio