Fraunhofer FDK AAC

From Hydrogenaudio Knowledgebase
Revision as of 20:46, 26 July 2014 by Bp (talk | contribs) (→‎fdkaac: note about sample format input)


Developer(s) Fraunhofer IIS
Release information
Stable release 3.4.12 [1]
Preview release
Compatibility
Operating system Android, Linux
Additional information
Use Encoder
License Very liberal (NOTICE), but somehow considered non-free [2]
Website Official web page

The Fraunhofer FDK AAC is a high-quality open-source AAC encoder library developed by Fraunhofer IIS. It was officially released for Android, but has been ported to other platforms.

Afterburner

Afterburner is "a type of analysis by synthesis algorithm which increases the audio quality but also the required processing power." Fraunhofer recommends always enabling this feature.

MPEG-4 Audio Object Types

The library supports the following MPEG-2/4 AOTs:

Object Type ID Audio Object Type Description
2 AAC-LC "AAC Profile" MPEG-2 Low-complexity (LC) combined with MPEG-4 Perceptual Noise Substitution (PNS)
5 HE-AAC AAC LC + SBR (Spectral Band Replication)
29 HE-AAC v2 AAC LC + SBR + PS (Parametric Stereo)
23 AAC-LD "Low Delay Profile" used for real-time communication
39 AAC-ELD Enhanced Low Delay
129 MPEG-2 AAC LC
132 MPEG-2 HE-AAC (SBR)
156 MPEG-2 HE-AAC v2 (SBR+PS)
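
The MPEG-2 IDs in the table sit at a fixed distance from their MPEG-4 counterparts (129 - 2, 132 - 5 and 156 - 29 are all 127). The following is a minimal sketch of a lookup built on that observation; the names `MPEG2_AOT_OFFSET`, `mpeg2_aot` and `describe_aot` are made up for illustration and are not part of the FDK API.

```python
# Offset read off the table above: 129 - 2 == 132 - 5 == 156 - 29 == 127.
MPEG2_AOT_OFFSET = 127

# MPEG-4 Audio Object Types listed in the table.
AOT_NAMES = {
    2: "AAC-LC",
    5: "HE-AAC",
    29: "HE-AAC v2",
    23: "AAC-LD",
    39: "AAC-ELD",
}

# Only these three have an MPEG-2 variant in the table.
HAS_MPEG2_VARIANT = (2, 5, 29)


def mpeg2_aot(mpeg4_aot: int) -> int:
    """Return the MPEG-2 ID for AAC-LC, HE-AAC or HE-AAC v2."""
    if mpeg4_aot not in HAS_MPEG2_VARIANT:
        raise ValueError(f"no MPEG-2 variant listed for AOT {mpeg4_aot}")
    return mpeg4_aot + MPEG2_AOT_OFFSET


def describe_aot(aot: int) -> str:
    """Return a readable name for any ID that appears in the table."""
    if aot in AOT_NAMES:
        return AOT_NAMES[aot]
    base = aot - MPEG2_AOT_OFFSET
    if base in HAS_MPEG2_VARIANT:
        return "MPEG-2 " + AOT_NAMES[base]
    raise ValueError(f"AOT {aot} is not listed")
```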

Bitrate modes

AACENC_BITRATEMODE Bitrate Comment
0 CBR @ AACENC_BITRATE
1 VBR, about 32 kbps/channel
2 VBR, about 40 kbps/channel
3 VBR, about 48-56 kbps/channel Max for HE and HEv2
4 VBR, about 64 kbps/channel
5 VBR, about 80-96 kbps/channel
6 Fixed frame mode.
7 Superframe mode.
8 LD/ELD full bitreservoir for packet based transmission
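
Because the VBR figures are given per channel, the expected total depends on the channel count. The sketch below turns the table into a rough estimate; it assumes the per-channel figures simply scale with the number of channels, and it reads the "Max for HE and HEv2" comment as meaning that mode 3 is the highest VBR mode usable with those object types. Both are interpretations of the table, and the function names are hypothetical.

```python
# Approximate kbps per channel for VBR modes 1-5, as (low, high) from the table.
VBR_KBPS_PER_CHANNEL = {
    1: (32, 32),
    2: (40, 40),
    3: (48, 56),
    4: (64, 64),
    5: (80, 96),
}

# HE-AAC and HE-AAC v2, including their MPEG-2 IDs.
HE_AOTS = {5, 29, 132, 156}
MAX_VBR_MODE_FOR_HE = 3  # assumption based on the table comment


def estimate_vbr_kbps(mode: int, channels: int) -> tuple:
    """Return a rough (low, high) total bitrate in kbps for a VBR mode."""
    if mode not in VBR_KBPS_PER_CHANNEL:
        raise ValueError("VBR estimates exist only for modes 1-5")
    low, high = VBR_KBPS_PER_CHANNEL[mode]
    return low * channels, high * channels


def vbr_mode_allowed(aot: int, mode: int) -> bool:
    """Check a VBR mode against the HE/HEv2 limit noted in the table."""
    if mode not in VBR_KBPS_PER_CHANNEL:
        return False
    if aot in HE_AOTS:
        return mode <= MAX_VBR_MODE_FOR_HE
    return True
```

For example, `estimate_vbr_kbps(3, 2)` gives a range of 96 to 112 kbps for stereo at mode 3.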

Sample Format

The FDK library is based on fixed-point math and only supports 16-bit integer PCM.
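
Higher-resolution audio therefore has to be reduced to 16 bits before it reaches the encoder. The sketch below shows one way to do that reduction for normalized floating-point samples, with hard clipping and optional TPDF dither. TPDF dither is a common general-purpose choice, not something the FDK documentation prescribes, and `float_to_int16` is a hypothetical helper.

```python
import random

INT16_MIN, INT16_MAX = -32768, 32767


def float_to_int16(samples, dither=True, rng=None):
    """Convert floats (nominally in [-1.0, 1.0)) to 16-bit integers.

    Values outside the 16-bit range are clipped rather than wrapped.
    """
    rng = rng or random.Random()
    out = []
    for x in samples:
        scaled = x * 32768.0
        if dither:
            # TPDF dither: the difference of two uniform values has a
            # triangular distribution spanning -1 to +1 LSB.
            scaled += rng.random() - rng.random()
        value = int(round(scaled))
        out.append(max(INT16_MIN, min(INT16_MAX, value)))
    return out
```

Passing a seeded `random.Random` instance as `rng` makes the dither reproducible, which helps when comparing encodes.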

Channel Layouts

Channels Layout Mode Description
1 C MODE_1 Mono
2 L+R MODE_2 Stereo
3 C, L+R MODE_1_2
4 C, L+R, Rear MODE_1_2_1 fdkaac calls it "C L R Cs"
5 C, L+R, LS+RS MODE_1_2_2
5.1 C, L+R, LS+RS, LFE MODE_1_2_2_1
7.1 C, LC+RC, L+R, LS+RS, LFE MODE_1_2_2_2_1, MODE_7_1_FRONT_CENTER
7.1 (Rear) C, L+R, LS+RS, Lrear+Rrear, LFE MODE_7_1_REAR_SURROUND

[lib]fdk-aac

The opencore-amr project maintains a source code distribution of the Fraunhofer library as fdk-aac, often packaged as libfdk-aac. It is distributed in binary form in Debian (and Debian derivatives like Ubuntu) as libfdk-aac0.

The latest libfdk_aac release is 0.1.3 (2013-11-25), based on FDK version 3.4.12 [3].

aac-enc

fdk-aac includes a very basic command-line encoding utility, aac-enc, which encodes WAV input to AAC.

Usage:

aac-enc [-r bitrate] [-t aot] [-a afterburner] [-s sbr] [-v vbr] in.wav out.aac
-r <bitrate>
Bitrate in bits per second (for CBR). Default is 64000.
-t <aot>
The Audio Object Type. Default is 2 (AAC-LC).
-a <0,1>
Enable Afterburner. 0=Disabled, 1=Enabled (recommended). Default is 1.
-s <-1,0,1>
Spectral Band Replication (ELD AOT only). -1=Use ELD SBR auto configurator (default, recommended), 0=Disabled, 1=Enabled.
-v <0-5>
Bitrate mode. Only 0-5 used. 0=CBR @ value given in -r. Default is 0.

fdkaac

Developer(s) nu774
Release information
Stable release 0.5.3
Preview release
Compatibility
Operating system Linux, Windows, others
Additional information
Use Encoder
License Zlib
Website github page for fdkaac

fdkaac is a command-line interface encoding and metadata utility. It uses libfdk-aac.

Example:

# Convert a FLAC file to M4A with fdkaac, using AAC-LC at about 100 kbps for stereo
flac -s -d -c song.flac | fdkaac --ignorelength --profile 2 --bitrate-mode 3 -o song.m4a -
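
The same decode-and-encode pipeline can be driven from a script. The following is a rough sketch using Python's `subprocess` module; it assumes `flac` and `fdkaac` are on the PATH and uses only options that appear on this page. The helper names are hypothetical.

```python
import subprocess


def flac_decode_argv(src):
    """Arguments for decoding a FLAC file to WAV on stdout."""
    return ["flac", "-s", "-d", "-c", src]


def fdkaac_argv(dst, profile=2, bitrate_mode=3):
    """Arguments for encoding WAV from stdin ("-") to an M4A file."""
    return [
        "fdkaac", "--ignorelength",
        "--profile", str(profile),
        "--bitrate-mode", str(bitrate_mode),
        "-o", dst, "-",
    ]


def encode(src, dst):
    """Pipe flac's output into fdkaac and fail if either tool fails."""
    decoder = subprocess.Popen(flac_decode_argv(src), stdout=subprocess.PIPE)
    encoder = subprocess.Popen(fdkaac_argv(dst), stdin=decoder.stdout)
    # Drop our copy of the pipe so flac sees a broken pipe if fdkaac exits early.
    decoder.stdout.close()
    encoder_status = encoder.wait()
    decoder_status = decoder.wait()
    if decoder_status != 0 or encoder_status != 0:
        raise RuntimeError(
            f"flac exited with {decoder_status}, fdkaac with {encoder_status}"
        )
```

Calling `encode("song.flac", "song.m4a")` would run the two tools; building the argument lists separately keeps them easy to inspect without running anything.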

Usage:

fdkaac [options] input_file
-p, --profile <n>
The Audio Object Type.
-b, --bitrate <n>
Bitrate in bits per second (for CBR)
-m, --bitrate-mode <n>
Bitrate mode. Only 0-5 used. 0=CBR.
-w, --bandwidth <n>
Frequency bandwidth in Hz (AAC LC only)
-a, --afterburner <n>
Enable Afterburner. 0=Disabled, 1=Enabled (recommended). Default is 1.
-L, --lowdelay-sbr <-1,0,1>
Configure SBR activity on AAC ELD
-1 Use ELD SBR auto configurator
0 Disable SBR on ELD (default)
1 Enable SBR on ELD
-s, --sbr-ratio <0,1,2>
Controls activation of downsampled SBR
0 Use lib default (default)
1 Downsampled SBR (default for ELD+SBR)
2 Dual-rate SBR (default for HE-AAC)
-f, --transport-format <n>
Transport format
0 RAW (default, muxed into M4A)
1 ADIF
2 ADTS
6 LATM MCP=1
7 LATM MCP=0
10 LOAS/LATM (LATM within LOAS)
-C, --adts-crc-check
Add CRC protection on ADTS header
-h, --header-period <n>
StreamMuxConfig/PCE repetition period in transport layer
-o <filename>
Output filename
-G, --gapless-mode <n>
Encoder delay signaling for gapless playback
0 iTunSMPB (default)
1 ISO standard (edts + sgpd)
2 Both
--include-sbr-delay
Count SBR decoder delay in encoder delay. This is not iTunes compatible, but is the default behavior of the FDK library.
-I, --ignorelength
Ignore length of WAV header
-S, --silent
Don't print progress messages
--moov-before-mdat
Place moov box before mdat box on m4a output

Options for raw (headerless) input:

-R, --raw
Treat input as raw (by default WAV is assumed)
--raw-channels <n>
Number of channels (default: 2)
--raw-rate <n>
Sample rate (default: 44100)
--raw-format <spec>
Sample format, default is "S16L". Spec is as follows:
1st char S(igned), U(nsigned), or F(loat)
2nd part bits per channel
Last char L(ittle) or B(ig)
Last char can be omitted, in which case L is assumed. Spec is case insensitive, therefore "u16b" is same as "U16B".
Up to 32-bit integer or 64-bit floating point format is supported as input. The FDK library, however, is based on fixed-point math and only supports 16-bit integer PCM. Therefore, be wary of clipping. You might want to dither/noise shape beforehand when your input has higher resolution.
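
The --raw-format spec described above is compact enough to parse with a single pattern. The sketch below illustrates the documented rules (type letter, bit count, optional endianness letter, case insensitivity, and the 32-bit integer / 64-bit float limits). It is not fdkaac's own parser, and `parse_raw_format` is a hypothetical name.

```python
import re

# Type letter, bit count, optional endianness letter.
_SPEC = re.compile(r"([SUF])(\d+)([LB]?)")


def parse_raw_format(spec):
    """Parse a spec such as "S16L" into (kind, bits, endianness)."""
    match = _SPEC.fullmatch(spec.upper())  # spec is case insensitive
    if match is None:
        raise ValueError(f"malformed sample format: {spec!r}")
    kind = match.group(1)
    bits = int(match.group(2))
    endian = match.group(3) or "L"  # little-endian when omitted
    limit = 64 if kind == "F" else 32
    if bits == 0 or bits > limit:
        raise ValueError(f"unsupported bit depth in {spec!r}")
    return kind, bits, endian
```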

Tagging options:

--tag <fcc>:<value>
Set iTunes predefined tag with four char code. See iTunes Metadata.
--tag-from-file <fcc>:<filename>
Same as above, but value is read from file.
--long-tag <name>:<value>
Set arbitrary tag as iTunes custom metadata.
--tag-from-json <filename[?dot_notation]>
Read tags from JSON. By default, tags are assumed to be direct children of the root object (dictionary). Optionally, the position of the dictionary that contains the tags can be specified with dotted notation.
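
To illustrate what the dotted notation selects, here is a small sketch that splits a `filename?dot_notation` argument and walks a parsed JSON document to the dictionary holding the tags. It shows the idea only and makes no claim about how fdkaac implements it; the function names and the sample keys are invented.

```python
import json


def split_json_argument(arg):
    """Split "file.json?a.b" into ("file.json", "a.b"); the path may be empty."""
    filename, _, dot_path = arg.partition("?")
    return filename, dot_path


def select_tags(document_text, dot_path=""):
    """Return the dictionary of tags found at dot_path (root if empty)."""
    node = json.loads(document_text)
    # Skip empty components so that an empty path selects the root object.
    for key in filter(None, dot_path.split(".")):
        node = node[key]
    if not isinstance(node, dict):
        raise ValueError("selected JSON value is not an object")
    return node
```

With a document such as `{"format": {"tags": {"title": "x"}}}`, the path `format.tags` selects the inner dictionary, while an empty path selects the root.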
Option/Usage MP4 Block Modified Comment
--title <string> ©nam
--artist <string> ©ART
--album <string> ©alb
--genre <string> ©gen Appears to always store the string in the "user-defined" ©gen block, even if there is an ID3 genre ID that could be used with the gnre block.
--date <string> ©day YYYY[-MM[-DD]] format
--composer <string> ©wrt
--grouping <string> ©grp
--comment <string> ©cmt
--album-artist <string> aART
--track <number[/total]> trkn Block stores both track and totaltracks in one binary value
--disk <number[/total]> disk Block stores both disc and totaldiscs in one binary value
--tempo <n> tmpo Beats per minute, stored as a 16-bit integer
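
The "one binary value" noted for trkn and disk can be sketched as follows. The byte layouts here are the ones commonly written by iTunes-style taggers (big-endian 16-bit fields with reserved padding) and are an assumption, not something taken from the fdkaac source; all function names are hypothetical.

```python
import struct


def parse_pair(text):
    """Parse "number[/total]" into (number, total); total is 0 when absent."""
    number, _, total = text.partition("/")
    return int(number), (int(total) if total else 0)


def pack_trkn(number, total=0):
    """Assumed trkn payload: reserved, track, total, reserved (8 bytes)."""
    return struct.pack(">HHHH", 0, number, total, 0)


def pack_disk(number, total=0):
    """Assumed disk payload: reserved, disc, total (6 bytes)."""
    return struct.pack(">HHH", 0, number, total)


def pack_tmpo(bpm):
    """Tempo as a big-endian 16-bit integer."""
    return struct.pack(">H", bpm)
```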


FFmpeg

libfdk-aac can be used with FFmpeg, but requires a custom build of FFmpeg. The FFmpeg wiki provides extensive documentation on using libfdk_aac.

CBR mode:

ffmpeg -i <input> -c:a libfdk_aac -b:a 128k <output>

VBR mode:

ffmpeg -i <input> -c:a libfdk_aac -vbr 3 <output>
-vbr
Values 1-5. See Bitrate modes.
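
When scripting FFmpeg, the choice between the two modes can be made explicit. The sketch below builds the argument list for either the CBR or the VBR command shown above and rejects -vbr values outside 1-5. It assumes an FFmpeg build with libfdk_aac enabled, and `ffmpeg_fdk_argv` is a hypothetical helper.

```python
def ffmpeg_fdk_argv(src, dst, bitrate=None, vbr=None):
    """Build ffmpeg arguments for CBR (bitrate="128k") or VBR (vbr=1..5)."""
    if (bitrate is None) == (vbr is None):
        raise ValueError("give exactly one of bitrate or vbr")
    argv = ["ffmpeg", "-i", src, "-c:a", "libfdk_aac"]
    if vbr is not None:
        if vbr not in range(1, 6):
            raise ValueError("-vbr accepts values 1-5")
        argv += ["-vbr", str(vbr)]
    else:
        argv += ["-b:a", bitrate]
    return argv + [dst]
```

The resulting list can be passed directly to `subprocess.run`.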

Libav/avconv

libfdk-aac can be used with Libav's avconv, but requires a custom build of avconv with "--enable-libfdk-aac" passed to configure.

CBR mode:

avconv -i <input> -c:a libfdk_aac -b:a <bitrate> -afterburner 1 <output>

VBR mode:

avconv -i <input> -c:a libfdk_aac -flags +qscale -global_quality [1-5] -afterburner 1 <output>
-afterburner
See Afterburner.
-global_quality
Values 1-5. See Bitrate modes.

Links