Fraunhofer FDK AAC
Revision as of 23:51, 18 September 2014
Developer(s) | Fraunhofer IIS |
Release information | |
---|---|
Stable release | 3.4.12 [1] |
Compatibility | |
Operating system | Android, Linux |
Additional information | |
Use | Encoder |
License | Very liberal (NOTICE), but somehow considered non-free [2] |
Website | Official web page |
Current AAC encoders (most to least recommended) | |
---|---|
1 | Apple AAC M/W |
2 | FhG AAC (Winamp) W |
3 | Fraunhofer FDK AAC S/L/M/W |
4 | Nero AAC L/W |
5 | FFmpeg 3.0+ AAC encoder S/L/M/W |
6 | FAAC S/L/M/W |
7 | Libav (pre-3.0 FFmpeg) AAC encoder S/L/M/W |
S Source code available; L Linux; M macOS; W Windows | |
List of AAC encoders |
The Fraunhofer FDK AAC is a high-quality open-source AAC encoder library developed by Fraunhofer IIS. It was officially released for Android, but has been ported to other platforms.
The licensed Fraunhofer AAC codec included in Winamp (often called FhG AAC) is not the same as the FDK AAC codec. While they use the same approach, they are developed by different teams and target different platforms. The FDK library is built around fixed-point math and originally targeted mobile devices.
FDK AAC is considered a favorable alternative to the Nero AAC codec, which is no longer developed.
Afterburner
Afterburner is "a type of analysis by synthesis algorithm which increases the audio quality but also the required processing power." Fraunhofer recommends always activating this feature.
Audio Object Types
The library supports the following MPEG-2/4 AOTs:
Object Type ID | Audio Object Type | Description |
---|---|---|
2 | AAC-LC | "AAC Profile" MPEG-2 Low-complexity (LC) combined with MPEG-4 Perceptual Noise Substitution (PNS) |
5 | HE-AAC | AAC LC + SBR (Spectral Band Replication) |
29 | HE-AAC v2 | AAC LC + SBR + PS (Parametric Stereo) |
23 | AAC-LD | "Low Delay Profile" used for real-time communication |
39 | AAC-ELD | Enhanced Low Delay |
129 | MPEG-2 AAC LC | |
132 | MPEG-2 HE-AAC (SBR) | |
156 | MPEG-2 HE-AAC v2 (SBR+PS) |
Bitrate Modes
AACENC_BITRATEMODE | Bitrate | Comment |
---|---|---|
0 | CBR @ AACENC_BITRATE | |
1 | VBR, about 32 kbps/channel | |
2 | VBR, about 40 kbps/channel | |
3 | VBR, about 48-56 kbps/channel | Max for HE and HEv2 |
4 | VBR, about 64 kbps/channel | |
5 | VBR, about 80-96 kbps/channel | |
6 | Fixed frame mode. | |
7 | Superframe mode. | |
8 | LD/ELD full bitreservoir for packet based transmission |
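As a rough illustration of the VBR rows above, the following sketch scales the approximate per-channel figures by the channel count. The function and table names are hypothetical, and it assumes every channel counts equally, which the table does not state; actual output bitrates depend on the material being encoded.

```python
# Approximate per-channel bitrate ranges in kbps, copied from the table above.
# Modes with a single figure are stored as (value, value).
VBR_KBPS_PER_CHANNEL = {
    1: (32, 32),
    2: (40, 40),
    3: (48, 56),
    4: (64, 64),
    5: (80, 96),
}


def approx_vbr_kbps(bitrate_mode: int, channels: int) -> tuple:
    """Return a (low, high) estimate of the total bitrate in kbps.

    Assumption: every channel contributes the same per-channel figure.
    """
    low, high = VBR_KBPS_PER_CHANNEL[bitrate_mode]
    return low * channels, high * channels
```

For example, `approx_vbr_kbps(3, 2)` estimates the total for a stereo encode in bitrate mode 3.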
Bandwidth
The default bandwidth (or low-pass filter cutoff) for each bitrate mode is the smaller of the applicable value in the tables below and half the sample rate. The default can be overridden, but the maximum value is 20000 Hz (see libAACenc/src/bandwidth.cpp in the FDK source). [3]
VBR Modes
AACENC_BITRATEMODE | Mono | Two or More Channels |
---|---|---|
1 | 13050 Hz | |
2 | 13050 Hz | |
3 | 14260 Hz | |
4 | 15500 Hz | |
5 | 48000 Hz |
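The rule for VBR modes can be sketched as follows. This is a minimal illustration of the "table value or half the sample rate, whichever is smaller" rule, not the library's code; the names are hypothetical, and it assumes the single value listed per mode applies to any channel count.

```python
# Default VBR bandwidths in Hz, copied from the table above.
VBR_BANDWIDTH_HZ = {1: 13050, 2: 13050, 3: 14260, 4: 15500, 5: 48000}


def default_vbr_bandwidth(bitrate_mode: int, sample_rate: int) -> int:
    """Return the smaller of the table value and half the sample rate, in Hz."""
    return min(VBR_BANDWIDTH_HZ[bitrate_mode], sample_rate // 2)
```

Under this reading, bitrate mode 5 is effectively limited only by half the sample rate for all supported input rates, since its table value is 48000 Hz.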
CBR Mode
AOT/Sample Rates | Bitrate per channel | Mono | Two or More Channels |
---|---|---|---|
LC / Any | Below 12 kbps | 3700 Hz | 5000 Hz |
LC / Any | 12-20 kbps | 5000 Hz | 6400 Hz |
LC / Any | 20-28 kbps | 6900 Hz | 9640 Hz |
LC / Any | 28-40 kbps | 9600 Hz | 13050 Hz |
LC / Any | 40-56 kbps | 12060 Hz | 14260 Hz |
LC / Any | 56-72 kbps | 13950 Hz | 15500 Hz |
LC / Any | 72-96 kbps | 14200 Hz | 16120 Hz |
LC / Any | 96 kbps and above | 17000 Hz | 17000 Hz |
... | | | |
LD / 44100 Hz | 56 kbps | 11000 Hz | 12900 Hz |
LD / 44100 Hz | 64 kbps | 14400 Hz | 15500 Hz |
... | | | |
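For the LC rows of the CBR table, a lookup could look like the sketch below. It is an illustration only: the names are hypothetical, the per-channel bitrate is assumed to be the total bitrate divided by the channel count, and each range is assumed to include its lower bound and exclude its upper bound, which the table leaves ambiguous.

```python
# (exclusive upper bound in bit/s per channel, mono Hz, two-or-more-channels Hz)
LC_CBR_BANDWIDTH = [
    (12000, 3700, 5000),
    (20000, 5000, 6400),
    (28000, 6900, 9640),
    (40000, 9600, 13050),
    (56000, 12060, 14260),
    (72000, 13950, 15500),
    (96000, 14200, 16120),
]
LC_CBR_TOP_HZ = 17000  # 96 kbps per channel and above, any channel count


def default_lc_cbr_bandwidth(bitrate: int, channels: int, sample_rate: int) -> int:
    """Return the default AAC-LC CBR bandwidth in Hz for a total bitrate in bit/s."""
    per_channel = bitrate / channels  # assumption: bitrate is split evenly
    table_hz = LC_CBR_TOP_HZ
    for upper, mono_hz, multi_hz in LC_CBR_BANDWIDTH:
        if per_channel < upper:
            table_hz = mono_hz if channels == 1 else multi_hz
            break
    # The table value is capped at half the sample rate.
    return min(table_hz, sample_rate // 2)
```

With these assumptions, a 128 kbps stereo encode at 44100 Hz falls in the 56-72 kbps row.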
Sample Format
The FDK library is based on fixed-point math and only supports 16-bit integer PCM input.
Sample Rates
The FDK library officially supports input sample rates of 8000, 11025, 12000, 16000, 22050, 24000, 32000, 44100, 48000, 64000, 88200, and 96000 Hz.
Not all combinations of audio object types, bitrate modes, channel layouts, and sample rates can be used together. For example, using 96 kHz stereo input with the AAC-LC audio object type and bitrate mode 5 (VBR 80-96 kbps/channel) results in catastrophic failure. [4] See Recommended Sampling Rate and Bitrate Combinations for recommended combinations.
Channel Layouts
Channels | Layout | Mode | Description |
---|---|---|---|
1 | C | MODE_1 | Mono |
2 | L+R | MODE_2 | Stereo |
3 | C, L+R | MODE_1_2 | |
4 | C, L+R, Rear | MODE_1_2_1 | fdkaac calls it "C L R Cs" |
5 | C, L+R, LS+RS | MODE_1_2_2 | |
5.1 | C, L+R, LS+RS, LFE | MODE_1_2_2_1 | |
7.1 | C, LC+RC, L+R, LS+RS, LFE | MODE_1_2_2_2_1, MODE_7_1_FRONT_CENTER | |
7.1 (Rear) | C, L+R, LS+RS, Lrear+Rrear, LFE | MODE_7_1_REAR_SURROUND | |
Recommended Sampling Rate and Bitrate Combinations
This table is from the documentation included in the FDK library source code. (PDF section 2.12 or source code: [5])
The following table provides an overview of recommended encoder configuration parameters which [Fraunhofer] determined by virtue of numerous listening tests.
Audio Object Type | Bit Rate Range [bit/s] | Supported Sampling Rates [kHz] | Recommended Sampling Rate [kHz] | Number of Channels |
---|---|---|---|---|
[29] HE-AAC v2 (AAC LC + SBR + PS) | 8000 - 11999 | 22.05, 24.00 | 24.00 | 2 |
[29] HE-AAC v2 (AAC LC + SBR + PS) | 12000 - 17999 | 32.00 | 32.00 | 2 |
[29] HE-AAC v2 (AAC LC + SBR + PS) | 18000 - 39999 | 32.00, 44.10, 48.00 | 44.10 | 2 |
[29] HE-AAC v2 (AAC LC + SBR + PS) | 40000 - 56000 | 32.00, 44.10, 48.00 | 48.00 | 2 |
[5] HE-AAC (AAC LC + SBR) | 8000 - 11999 | 22.05, 24.00 | 24.00 | 1 |
[5] HE-AAC (AAC LC + SBR) | 12000 - 17999 | 32.00 | 32.00 | 1 |
[5] HE-AAC (AAC LC + SBR) | 18000 - 39999 | 32.00, 44.10, 48.00 | 44.10 | 1 |
[5] HE-AAC (AAC LC + SBR) | 40000 - 56000 | 32.00, 44.10, 48.00 | 48.00 | 1 |
[5] HE-AAC (AAC LC + SBR) | 16000 - 27999 | 32.00, 44.10, 48.00 | 32.00 | 2 |
[5] HE-AAC (AAC LC + SBR) | 28000 - 63999 | 32.00, 44.10, 48.00 | 44.10 | 2 |
[5] HE-AAC (AAC LC + SBR) | 64000 - 128000 | 32.00, 44.10, 48.00 | 48.00 | 2 |
[5] HE-AAC (AAC LC + SBR) | 64000 - 69999 | 32.00, 44.10, 48.00 | 32.00 | 5, 5.1 |
[5] HE-AAC (AAC LC + SBR) | 70000 - 159999 | 32.00, 44.10, 48.00 | 44.10 | 5, 5.1 |
[5] HE-AAC (AAC LC + SBR) | 160000 - 245999 | 32.00, 44.10, 48.00 | 48.00 | 5 |
[5] HE-AAC (AAC LC + SBR) | 160000 - 265999 | 32.00, 44.10, 48.00 | 48.00 | 5.1 |
[2] AAC LC | 8000 - 15999 | 11.025, 12.00, 16.00 | 12.00 | 1 |
[2] AAC LC | 16000 - 23999 | 16.00 | 16.00 | 1 |
[2] AAC LC | 24000 - 31999 | 16.00, 22.05, 24.00 | 24.00 | 1 |
[2] AAC LC | 32000 - 55999 | 32.00 | 32.00 | 1 |
[2] AAC LC | 56000 - 160000 | 32.00, 44.10, 48.00 | 44.10 | 1 |
[2] AAC LC | 160001 - 288000 | 48.00 | 48.00 | 1 |
[2] AAC LC | 16000 - 23999 | 11.025, 12.00, 16.00 | 12.00 | 2 |
[2] AAC LC | 24000 - 31999 | 16.00 | 16.00 | 2 |
[2] AAC LC | 32000 - 39999 | 16.00, 22.05, 24.00 | 22.05 | 2 |
[2] AAC LC | 40000 - 95999 | 32.00 | 32.00 | 2 |
[2] AAC LC | 96000 - 111999 | 32.00, 44.10, 48.00 | 32.00 | 2 |
[2] AAC LC | 112000 - 320001 | 32.00, 44.10, 48.00 | 44.10 | 2 |
[2] AAC LC | 320002 - 576000 | 48.00 | 48.00 | 2 |
[2] AAC LC | 160000 - 239999 | 32.00 | 32.00 | 5, 5.1 |
[2] AAC LC | 240000 - 279999 | 32.00, 44.10, 48.00 | 32.00 | 5, 5.1 |
[2] AAC LC | 280000 - 800000 | 32.00, 44.10, 48.00 | 44.10 | 5, 5.1 |
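A script that picks a resampling target before encoding could encode part of this table as data. The sketch below covers only the AAC LC mono and stereo rows; the names are hypothetical, and the bit rate bounds are treated as inclusive, as printed.

```python
from typing import Optional

# (lowest bit/s, highest bit/s, recommended sampling rate in Hz), keyed by
# channel count. Only the AAC LC mono and stereo rows are included.
AAC_LC_RECOMMENDED = {
    1: [
        (8000, 15999, 12000),
        (16000, 23999, 16000),
        (24000, 31999, 24000),
        (32000, 55999, 32000),
        (56000, 160000, 44100),
        (160001, 288000, 48000),
    ],
    2: [
        (16000, 23999, 12000),
        (24000, 31999, 16000),
        (32000, 39999, 22050),
        (40000, 95999, 32000),
        (96000, 111999, 32000),
        (112000, 320001, 44100),
        (320002, 576000, 48000),
    ],
}


def recommended_lc_sample_rate(bitrate: int, channels: int) -> Optional[int]:
    """Return the recommended sampling rate in Hz, or None if not covered."""
    for low, high, rate_hz in AAC_LC_RECOMMENDED.get(channels, []):
        if low <= bitrate <= high:
            return rate_hz
    return None
```

Returning `None` for combinations outside the table leaves it to the caller to decide whether to fall back to the input's own sample rate.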
[lib]fdk-aac
The opencore-amr project maintains a source code distribution of the Fraunhofer library as fdk-aac, often packaged as libfdk-aac. It is distributed in a binary form in Debian (and Debian derivatives like Ubuntu) as libfdk-aac0.
The latest libfdk-aac release is 0.1.3 (2013-11-25), based on FDK version 3.4.12. [6]
aac-enc
fdk-aac includes a very, very basic command-line interface encoding utility, called aac-enc, that can encode to AAC from WAV.
Usage:
aac-enc [-r bitrate] [-t aot] [-a afterburner] [-s sbr] [-v vbr] in.wav out.aac
- -r <bitrate>
- Bitrate in bits per second (for CBR). Default is 64000.
- -t <aot>
- The Audio Object Type. Default is 2 (AAC-LC).
- -a <0,1>
- Enable Afterburner. 0=Disabled, 1=Enabled (recommended). Default is 1.
- -s <-1,0,1>
- Spectral Band Replication (ELD AOT only). -1=Use ELD SBR auto configurator (recommended), 0=Disabled, 1=Enabled. Default is -1.
- -v <0-5>
- Bitrate mode. Only 0-5 used. 0=CBR @ value given in -r. Default is 0.
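When aac-enc is driven from a script, the argument list can be assembled from the options above. The following is a minimal sketch; the helper name is hypothetical, it uses only the flags listed above, and it assumes `-r` can be left out in VBR modes because the bitrate is described as applying to CBR.

```python
def aac_enc_command(in_wav, out_aac, bitrate=64000, aot=2, afterburner=1, vbr=0):
    """Build an argument list for aac-enc, suitable for subprocess.run()."""
    if vbr not in range(6):
        raise ValueError("bitrate mode must be 0-5")
    cmd = ["aac-enc", "-t", str(aot), "-a", str(afterburner)]
    if vbr == 0:
        cmd += ["-r", str(bitrate)]  # CBR at the given bitrate
    else:
        cmd += ["-v", str(vbr)]  # VBR; -r omitted (assumption, see above)
    cmd += [in_wav, out_aac]
    return cmd
```

Passing the result to `subprocess.run(cmd, check=True)` would run the encoder, provided aac-enc is on the PATH.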
fdkaac
Developer(s) | nu774 |
Release information | |
---|---|
Stable release | 0.5.3 |
Compatibility | |
Operating system | Linux, Windows, others |
Additional information | |
Use | Encoder |
License | Zlib |
Website | github page for fdkaac |
fdkaac is a command-line interface encoding and metadata utility. It uses libfdk-aac.
Example:
# Convert a FLAC file to M4A using fdkaac configured for AAC-LC at about 50 kbps/channel (100 kbps for stereo).
flac -s -d -c song.flac | fdkaac --ignorelength --profile 2 --bitrate-mode 3 -o song.m4a -
Usage:
fdkaac [options] input_file
- -p, --profile <n>
- The Audio Object Type.
- -b, --bitrate <n>
- Bitrate in bits per second (for CBR)
- -m, --bitrate-mode <n>
- Bitrate mode. Only 0-5 used. 0=CBR.
- -w, --bandwidth <n>
- Frequency bandwidth in Hz (AAC LC only)
- -a, --afterburner <n>
- Enable Afterburner. 0=Disabled, 1=Enabled (recommended). Default is 1.
- -L, --lowdelay-sbr <-1,0,1>
- Configure SBR activity on AAC ELD
    -1: Use ELD SBR auto configurator
    0: Disable SBR on ELD (default)
    1: Enable SBR on ELD
- -s, --sbr-ratio <0,1,2>
- Controls activation of downsampled SBR
    0: Use lib default (default)
    1: Downsampled SBR (default for ELD+SBR)
    2: Dual-rate SBR (default for HE-AAC)
- -f, --transport-format <n>
- Transport format
    0: RAW (default, muxed into M4A)
    1: ADIF
    2: ADTS
    6: LATM MCP=1
    7: LATM MCP=0
    10: LOAS/LATM (LATM within LOAS)
- -C, --adts-crc-check
- Add CRC protection on ADTS header
- -h, --header-period <n>
- StreamMuxConfig/PCE repetition period in transport layer
- -o <filename>
- Output filename
- -G, --gapless-mode <n>
- Encoder delay signaling for gapless playback
    0: iTunSMPB (default)
    1: ISO standard (edts + sgpd)
    2: Both
- --include-sbr-delay
- Count SBR decoder delay in encoder delay. This is not iTunes compatible, but is default behavior of FDK library.
- -I, --ignorelength
- Ignore length of WAV header
- -S, --silent
- Don't print progress messages
- --moov-before-mdat
- Place moov box before mdat box on m4a output
Options for raw (headerless) input:
- -R, --raw
- Treat input as raw (by default WAV is assumed)
- --raw-channels <n>
- Number of channels (default: 2)
- --raw-rate <n>
- Sample rate (default: 44100)
- --raw-format <spec>
- Sample format, default is "S16L". Spec is as follows:
    1st char: S(igned), U(nsigned), or F(loat)
    2nd part: bits per channel
    Last char: L(ittle) or B(ig)
- Last char can be omitted, in which case L is assumed. Spec is case insensitive, therefore "u16b" is same as "U16B".
- Up to 32-bit integer or 64-bit floating point format is supported as input. The FDK library, however, is implemented based on fixed-point math and only supports 16-bit integer PCM. Therefore, be wary of clipping. You might want to dither/noise shape beforehand when your input has higher resolution.
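The --raw-format spec grammar described above can be illustrated with a small parser. This is a sketch that mirrors the description, not fdkaac's own parsing code; the names are hypothetical, and it only checks the stated upper limits (32 bits for integers, 64 bits for floats), not whether a particular width is actually accepted.

```python
import re
from typing import NamedTuple


class RawFormat(NamedTuple):
    kind: str    # "S" (signed), "U" (unsigned), or "F" (float)
    bits: int    # bits per channel
    endian: str  # "L" (little) or "B" (big)


# First char S/U/F, then the bit count, then an optional L or B.
_SPEC = re.compile(r"([SUF])(\d+)([LB]?)", re.IGNORECASE)


def parse_raw_format(spec: str) -> RawFormat:
    """Parse a spec such as "S16L" or "u16b" into its three parts."""
    match = _SPEC.fullmatch(spec)
    if match is None:
        raise ValueError("malformed raw format spec: %r" % spec)
    kind = match.group(1).upper()
    bits = int(match.group(2))
    endian = (match.group(3) or "L").upper()  # L is assumed when omitted
    limit = 64 if kind == "F" else 32
    if not 0 < bits <= limit:
        raise ValueError("unsupported bit depth: %d" % bits)
    return RawFormat(kind, bits, endian)
```

Because the spec is case insensitive, `parse_raw_format("u16b")` and `parse_raw_format("U16B")` give the same result.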
Tagging options:
- --tag <fcc>:<value>
- Set iTunes predefined tag with four char code. See iTunes Metadata.
- --tag-from-file <fcc>:<filename>
- Same as above, but value is read from file.
- --long-tag <name>:<value>
- Set arbitrary tag as iTunes custom metadata.
- --tag-from-json <filename[?dot_notation]>
- Read tags from JSON. By default, tags are assumed to be direct children of the root object(dictionary). Optionally, position of the dictionary that contains tags can be specified with dotted notation.
Option/Usage | MP4 Block Modified | Comment |
---|---|---|
--title <string> | ©nam | |
--artist <string> | ©ART | |
--album <string> | ©alb | |
--genre <string> | ©gen | Appears to always store the string in the "user-defined" ©gen block, even if there is an ID3 genre ID that could be used with the gnre block. |
--date <string> | ©day | YYYY[-MM[-DD]] format |
--composer <string> | ©wrt | |
--grouping <string> | ©grp | |
--comment <string> | ©cmt | |
--album-artist <string> | aART | |
--track <number[/total]> | trkn | Block stores both track and totaltracks in one binary value |
--disk <number[/total]> | disk | Block stores both disc and totaldiscs in one binary value |
--tempo <n> | tmpo | Beats per minute, stored as a 16-bit integer |
FFmpeg
libfdk-aac can be used with FFmpeg, but requires a custom build of FFmpeg. FFmpeg provides significant documentation for using libfdk_aac in the FFmpeg wiki.
CBR mode:
ffmpeg -i <input> -c:a libfdk_aac -b:a 128k <output>
VBR mode:
ffmpeg -i <input> -c:a libfdk_aac -vbr 3 <output>
- -vbr
- Values 1-5. See Bitrate mode.
- -cutoff
- The low-pass filter cut-off in Hz. See Bandwidth for default values. FFmpeg maximum value is 20000.
Libav/avconv
libfdk-aac can be used with Libav's avconv, but requires a custom build of avconv with "--enable-libfdk-aac" passed to configure.
CBR mode:
avconv -i <input> -c:a libfdk_aac -b:a <bitrate> -afterburner 1 <output>
VBR mode:
avconv -i <input> -c:a libfdk_aac -flags +qscale -global_quality [1-5] -afterburner 1 <output>
- -afterburner
- See afterburner.
- -global_quality
- Values 1-5. See Bitrate mode.