Advanced Audio Coding: Difference between revisions

From Hydrogenaudio Knowledgebase
(added internal linking)
 
(81 intermediate revisions by 30 users not shown)
Line 1: Line 1:
=Introduction=
'''Advanced Audio Coding''' ('''AAC''') forms part of the latest specifications from the MPEG committee, and is their official successor to the popular [[MP3]] format. As with MP3, the AAC format is an international standard, and is backed by several big-name companies, including Dolby, Sony and Nokia.
'''AAC''' or 'Advanced Audio Coding' forms part of the latest specifications from the MPEG comittee, and is their official successor to the popular [[MP3]] format. As with [[MP3]], the '''AAC''' format is an international standard, and is backed up by several big-name companies, including Dolby, Sony and Nokia.


With the 8 <small>(this is just a guess)</small> years that passed since the creation of the [[MP3]] format, many improvements have been realised leading to a seemingly complex specification with several flavours of '''AAC''' available. To potentially add to the confusion, '''AAC''' is usually wrapped inside an [[MP4]] container to provide tagging, seeking and possibly other benefits??  For this reason, '''AAC''' can also be referred to as [[MP4]] audio..  
With the 26 years that had passed since the creation of the MP3 format, many improvements had been realised leading to a seemingly complex specification with several flavours of AAC available. To potentially add to the confusion, AAC is usually wrapped inside an [[MP4]] container to provide tagging and seeking benefits. For this reason, AAC can also be referred to as MP4 audio.


There are several '''AAC''' encoders to choose from, coming from large names such as Apple and Nero, or the open source F'''AAC''' which is analogous to the [[LAME]] encoder. '''AAC''' is supported on many hardware players, and is available in online stores..
There are several AAC encoders to choose from, coming from large names such as Apple ([[iTunes]] and [[QuickTime AAC]]), Real Networks and Nero AG (Creators of Nero Burning Rom), or the open source [http://www.audiocoding.com FAAC] which is analogous to the [[LAME]] encoder. AAC is supported on some hardware players, most notably the [[Apple iPod]] and some cell phones, and is available in Apple's online store.


In terms of quality, it outperforms [[MP3]] by a nice margin, being on par with OGG and other great codecs, and with '''AAC-HE''' provide the best low bitrate quality.
In terms of quality, the AAC format is on par with (Ogg) [[Vorbis]], [[LAME]] MP3, [[WMA]] Pro and other modern codecs, and with added SBR coding (HE AAC) it can provide quite high quality at low bitrates.


Recent developments have led to [[aacPlus]], later standardized as MPEG-4 HE-AAC, which is able to give subjectively good results at low bitrates. The website [http://www.tuner2.com Tuner2] has several Internet radio stations which are sending out streams at low rates – such as 40 kbps – and some of these are surprisingly good considering the bit rates used.


==Pros==
== Pros ==
 
* An international standard approved by the [http://www.iso.ch ISO]
* An international standard approved by the [http://www.iso.ch ISO]
* Flexible: supports several [[sampling rate]]s (8000-96000 Hz), bit depths, and [[multichannel]] (up to 48 channels)
* Flexible: supports several [[sampling rate]]s (8000–96000 Hz), bit depths, and [[multichannel]] (up to 48 channels)
* Several implementations, including a free and high quality one ([http://www.itunes.com iTunes])
* Several implementations, including free and high quality ones ([http://www.itunes.com iTunes] or [http://www.nero.com/nerodigital/eng/Nero_Digital_Audio.html Nero Digital])
* Reaches transparency in most samples and for most users at around 150kbps
* Reaches transparency in most samples and for most users at around 150 kbps
* Part of MPEG4 specs
* Part of [[MPEG-4]] specs
* Fast decoding (using [http://www.audiocoding.com FAAD])
* Anyone can create his or her own implementation (specifications and demo sources available)
* Anyone can create it's own implementation (Specifications and demo sources available)
* Almost everything supports it, including Android devices, Apple devices, most of the modern portable players, etc.
* Some portable players support it (Philips Expanium, iPod, cell phones from Nokia)
 
 
==Cons==


== Cons ==
* Problem cases that trip out all transform codecs
* Problem cases that trip out all transform codecs
* Relatively slow encoding
* Heavily patented
* Heavily patented
* Increased complexity
* Increased complexity
* '''AAC''' comes in different "flavors" (object types: '''AAC LC''', '''AAC HE''', '''AAC PS''' etc.). Many (especially portable) players only support LC (at the moment) so you can have files that a valid but your player won't play them.
* AAC comes in different "flavors" (object types: '''AAC LC''', '''AAC HE''', '''AAC PS''' etc.). Many (especially portable) players only support LC (at the moment) so you can have files that are valid but your player won't play them or play at a reduced quality.
 


=Technical Information=
== Technical Information ==
'''AAC''' stands for 'Advanced Audio Coding' and is part of the MPEG-4 Systems Standard. Originally known as MPEG-2 Non-Backwards Compatible (As apposed to MPEG-2 Backwards Compatible) it is the succesor to MPEG-1/2 Layer III ([[MP3]]). It uses the [[MP4]] [[container]] (which is based on Apple's MOV [[container]]) to store metadata (i.e. tag information).
AAC stands for 'Advanced Audio Coding' and is part of the [[MPEG-4]] Systems Standard. Originally known as MPEG-2 Non-Backwards Compatible (as opposed to MPEG-2 Backwards Compatible), it is the successor to MPEG-1/2 Layer III ([[MP3]]). It uses the [[MP4]] [[container]] (which is based on Apple's [[MOV]] container) to store metadata (i.e. tag information).


As part of the MPEG-4 Systems Standard, an '''AAC''' encoded file can include up to 48 full-bandwith audio  
As part of the MPEG-4 Systems Standard, an AAC encoded file can include up to 48 full-bandwidth audio channels (up to 96 kHz) and 15 Low Frequency Enhancement channels (limited to 120 Hz) plus 15 data streams.
channels (up to 96 kHz) and 15 Low Frequency Enhancement channels (limited to 120 Hz) plus 15 data streams.


'''AAC''' encoding methods are organised into Profiles (MPEG-2) or Object Types (MPEG-4). These different Object Types are not necessarily compatible with each other and may not be playable with various decoders. The various Object Types are:
AAC encoding methods are organised into Profiles (MPEG-2) or Object Types (MPEG-4). These different Object Types are not necessarily compatible with each other and may not be playable with various decoders. Some of the various Object Types are (see [https://en.wikipedia.org/wiki/MPEG-4_Part_3#MPEG-4_Audio_Object_Types Wikipedia] for a full list):


* MPEG-2 AAC LC / Low Complexity
* MPEG-2 AAC LC / Low Complexity
* MPEG-2 AAC Main
* MPEG-2 AAC Main
* MPEG-2 AAC SSR / Scalable Sampling Rate
* MPEG-2 AAC SSR / Scalable Sampling Rate
* MPEG-4 AAC LC
* MPEG-4 AAC LC / Low Complexity
* MPEG-4 AAC Main
* MPEG-4 AAC Main
* MPEG-4 AAC SSR
* MPEG-4 AAC SSR / Scalable Sampling Rate
* MPEG-4 AAC LTP / Long Term Prediction
* MPEG-4 AAC LTP / Long Term Prediction
* MPEG-4 AAC HE / High Efficiency
* MPEG-4 AAC HE / High Efficiency
* MPEG-4 AAC LD / Low Delay
* MPEG-4 AAC LD / Low Delay
* USAC / Unified Speech and Audio Coding ["xHE-AAC" being a profile]


Different Object Types vary in complexity. Some take longer to encode/decode as a result of the different complexities. Furthermore, the benefits of the more complex profiles are often not worth the CPU power required to encode/decode them. As a result the Low Complexity/LC Object Type has become the profile used by most encoders. However, the High Efficiency Object Type has become more popular recently with its addition to the Nero '''AAC''' encoder which now supports HE '''AAC''' encoding.
Different Object Types vary in complexity. Some take longer to encode/decode as a result of the different complexities. Furthermore, the benefits of the more complex profiles are often not worth the CPU power required to encode/decode them. As a result the Low Complexity/LC Object Type has become the profile used by most encoders and supported by most decoders. However, the High Efficiency (HE) Object Type has become more popular recently with its addition to the Nero and Quicktime AAC encoder.
 
Currently all players support the LC Object Type. Players based on the FAAD2 decoder (eg. foobar2000,
Winamp Plugins) support almost all Object Types including HE '''AAC'''. 3ivX also supports all Object Types
except SSR.
 


==Technologies used for compression==
Currently all players support the LC Object Type, although some will work on only MPEG2 or MPEG4 streams. Players based on the FAAD2 decoder (eg. [[foobar2000]], [[Winamp]] plugins) support almost all Object Types including HE AAC. 3ivX also supports all Object Types except SSR.


== Technologies used for compression ==
* [[Huffman coding]]
* [[Huffman coding]]
* [[Quantization]] and scaling
* [[Quantization]] and scaling
Line 66: Line 56:
* Modified Discrete Cosine Transform (I[[MDCT]])
* Modified Discrete Cosine Transform (I[[MDCT]])
* Gain control and hybrid filter bank (polyphase quadrature filter (IPQF)+IMDCT)
* Gain control and hybrid filter bank (polyphase quadrature filter (IPQF)+IMDCT)
* Long Term Predictor (LTP) - MPEG4 '''AAC''' only
* Long Term Prediction (LTP) MPEG4 AAC only
* Perceptual Noise Substitution (PNS) - MPEG4 '''AAC''' only
* Perceptual Noise Substitution (PNS) MPEG4 AAC only
* Spectral Band Replication ([[SBR]]) - HE '''AAC'''
* Spectral Band Replication ([[SBR]]) HE AAC
* Parametric Stereo (PS) - HE '''AAC'''
** Enhanced SBR (eSBR) - USAC/xHE-AAC
 
* [[Parametric stereo]] (PS) – HE AAC
 
** "MPEG" parametric surround - USAC/xHE-AAC
=Encoders=
* Algebraic code-excited linear prediction (ACELP)-like - USAC/xHE-AAC (speech-optimized method)
There are several encoders listed at [[AAC implementations]].
 
 
=Decoders=
* [[FAAD]]
 
 
=FAQ=
 
==Great, so you've given me all the technical stuff, but what is AAC really?==
AAC is the culmination of the current state of the art audio encoding techniques. It is designed
to improve upon and replace [[MP3]] as the defacto Audio Encoding standard. It usually offers (depending on
the codec) equivalent quality to [[MP3]] at a lower bitrate.
 
==What is the difference between *.[[MP4]] and *.M4A?==
Besides the extension, absolutely nothing. Apple came up with extension to distiguish between files with
Video and Audio (the [[MP4]] extension) and files with Audio only (the M4A extension). As far as the internal
structure of the file, nothing is different.
 
==What extensions does the Apple iPod Accept?==
The iPod accepts files with both the [[MP4]] extension and the M4A extension. It will not accept unwrapped AAC files
(files with the .AAC extension).
 
==What is the difference between LC (Low Complexity) and HE (High Efficiency)?==
These are two of the various Object Types in the MPEG4 Systems Standard. LC is the most popular Object Type
with all encoders/decoders supporting it. Currently, Nero, Coding Technolgies, and Panasonic have incorporated
the HE '''AAC''' standard into their encoders, which allows for higher quality sound at lower bitrates then the LC
Object Type does (at the same bitrate). The HE Object Type is only used for music with a bitrate of less than
~80kbps.


==What's the best AAC encoder?==
== Encoders / Decoders (Supported Platforms) ==
There is no best '''AAC''' encoder as such. It can be said with reasonable confidence (based on Roberto's last test,  
{{aac-encoders}}
see above) that [http://www.quicktime.com QuickTime/iTunes] is the best '''AAC''' encoder at 128kbps. However, the  
=== Current ===
quality of any encoder is not linear and therefore these results can not be extrapolated to other bitrates. It
* [[Fraunhofer "FhG" AAC]] (Windows) Distributed as binary library only, included in Winamp. Can be extracted and used with a CLI wrapper.
can also be said with reasonable confidence that both the iTunes encoder and the [http://www.ahead.com Nero '''AAC''' encoder]
* [[Fraunhofer FDK AAC]] (Android, Linux, others)
are 'mature' and should not fail badly on any particular sample at an average bitrate of 128kbps (i.e. Internet Profile
* [[Nero AAC]] (Linux,Windows)
for Nero '''AAC''') or above. Beyond that, only you can decide through [[ABX]] testing. See the [[Audio format guide]]  
** No longer developed nor maintained by Nero, but the latest release is stable and very good.
for more information. However, that being said, QuickTime/iTunes and Nero '''AAC''' are considered to be the "safe"
* [[Apple AAC]] (MacOS X,Windows)
encoders if you wish to archive your music collection on your computer.
* [[libavcodec AAC|FFmpeg 3.0+ native AAC encoder]] (Multiplatform)
* [https://gitlab.com/ecodis/exhale exhale] (xHE-AAC only, Multiplatform, experimental)


==Do AAC encoded files play back gaplessly?==
=== Obsolete ===
Gapless playback is not part of the '''AAC''' standard and as such is not mandatory. However, certain companies can
* [[libavcodec AAC|Libav/FFmpeg pre-3.0 native AAC encoder]] (Multiplatform)
choose to add gapless encoding/decoding if they desire, providing it doesn't break compatibility with previous
** Use new FFmpeg encoder.
decoders. This is what Ahead have done with their Nero '''AAC''' codec. The files get encoded with information that
* [[FAAC]]/[[FAAD]] (Multiplatform)
allows the gap heard between files to be removed. This however is only possible with supported players (currently
** Development appears stagnant, but the latest release is stable.
these include foobar2000 and Nero ShowTime). Currently Nero '''AAC''' is the only '''AAC''' codec to have gapless encoding/decoding
support.


==What players can play back AAC music?==
=== Past ===
There are now a number of players that can play back this new format. [http://www.foobar2000.org/ foobar2000]
* [[aacplusenc]] (Multiplatform) (Dead?)
is considered by many to be the most powerful audio player in existence, and it is certainly capable of playing back
* [[PsyTEL]], developed into Nero AAC, now obsolete. (Windows)
'''AAC''' encoded files. Other players include the [http://www.itunes.com iTunes Digital Jukebox], [http://www.winamp.com
* HHI/zPlane [[Compaact!]], short-lived closed-source project, disappeared. (Windows)
Winamp] and [http://www.real.com/ Real Player].


<br clear="right" />


=Other links=
== Patent situation ==
Known [[AAC implementations]].
The owners of the AAC patent pool expect to charge a per-unit fee from codec makers.<ref>[https://www.sec.gov/Archives/edgar/data/1649009/000121390020023370/ea125930ex10-9_siyatamobile.htm "AAC PATENT LICENSE AGREEMENT"]</ref> In theory at least, compiling your own version of an encoder ''or'' decoder means you are "making" an unlicensed product. To this day very few people even care about this: FFmpeg happily links you to binary builds that contain its native decoder, capable of decoding even AAC-LD, which has a patent that lasts until 2030.<ref>See [https://hydrogenaud.io/index.php/topic,121109.0.html List of AAC related patents]</ref>


Read the [[AAC guide]] to learn how to obtain '''AAC'''/[[MP4]] files out of WAV files and CDs.
== External References ==
* [[AAC FAQ]]
* Read the [[AAC guide]] to learn how to obtain AAC/[[MP4]] files out of WAV files and CDs.
* Detailed AAC comparisons can be found at Roberto's listening tests page.
<references />


Detailed '''AAC''' comparisons can be found at [http://www.rjamorim.com/test/ Roberto's listening tests page].
[[Category:Codecs]]
[[Category:Lossy]]

Latest revision as of 14:03, 23 April 2023

Advanced Audio Coding (AAC) forms part of the latest specifications from the MPEG committee, and is their official successor to the popular MP3 format. As with MP3, the AAC format is an international standard, and is backed by several big-name companies, including Dolby, Sony and Nokia.

With the 26 years that had passed since the creation of the MP3 format, many improvements had been realised leading to a seemingly complex specification with several flavours of AAC available. To potentially add to the confusion, AAC is usually wrapped inside an MP4 container to provide tagging and seeking benefits. For this reason, AAC can also be referred to as MP4 audio.

There are several AAC encoders to choose from, coming from large names such as Apple (iTunes and QuickTime AAC), Real Networks and Nero AG (creators of Nero Burning ROM), or the open source FAAC, which is analogous to the LAME encoder. AAC is supported on some hardware players, most notably the Apple iPod and some cell phones, and is available in Apple's online store.

In terms of quality, the AAC format is on par with (Ogg) Vorbis, LAME MP3, WMA Pro and other modern codecs, and with added SBR coding (HE AAC) it can provide quite high quality at low bitrates.

Recent developments have led to aacPlus, later standardized as MPEG-4 HE-AAC, which is able to give subjectively good results at low bitrates. The website Tuner2 has several Internet radio stations which are sending out streams at low rates – such as 40 kbps – and some of these are surprisingly good considering the bit rates used.

Pros

  • An international standard approved by the ISO
  • Flexible: supports several sampling rates (8000–96000 Hz), bit depths, and multichannel (up to 48 channels)
  • Several implementations, including free and high quality ones (iTunes or Nero Digital)
  • Reaches transparency in most samples and for most users at around 150 kbps
  • Part of MPEG-4 specs
  • Anyone can create his or her own implementation (specifications and demo sources available)
  • Almost everything supports it, including Android devices, Apple devices, most of the modern portable players, etc.

Cons

  • Problem cases that trip up all transform codecs
  • Heavily patented
  • Increased complexity
  • AAC comes in different "flavors" (object types: AAC LC, AAC HE, AAC PS etc.). Many players, especially portable ones, only support LC (at the moment), so a perfectly valid file may not play at all on your player, or may play at reduced quality.

Technical Information

AAC stands for 'Advanced Audio Coding' and is part of the MPEG-4 Systems Standard. Originally known as MPEG-2 Non-Backwards Compatible (as opposed to MPEG-2 Backwards Compatible), it is the successor to MPEG-1/2 Layer III (MP3). It uses the MP4 container (which is based on Apple's MOV container) to store metadata (i.e. tag information).
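
To make the container idea concrete, here is a minimal sketch that walks the top-level "boxes" of an MP4-style file. It assumes the usual ISO base media layout (a 4-byte big-endian size followed by a 4-byte type), which this article does not itself describe. The helper names `list_top_level_boxes` and `make_box`, and the tiny synthetic sample, are made up for illustration; this is not a full MP4 or tag parser.

```python
import struct

def list_top_level_boxes(data: bytes):
    """Return (type, size) for each top-level box found in `data`."""
    boxes = []
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # A size of 0 means the box runs to the end of the data.
            size = len(data) - offset
        if size < header:
            break  # malformed box: stop instead of looping forever
        boxes.append((box_type.decode("ascii", errors="replace"), size))
        offset += size
    return boxes

def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Build a synthetic box (hypothetical helper, for the demo only)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

# Synthetic stand-in for a file; real files carry far more inside each box.
sample = (
    make_box(b"ftyp", b"M4A \x00\x00\x00\x00")
    + make_box(b"moov", b"")
    + make_box(b"mdat", b"\x00" * 4)
)
print(list_top_level_boxes(sample))
```

On a real file you would pass the bytes read from disk instead of `sample`; for large files a streaming reader that seeks past each box would be the better design.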

As part of the MPEG-4 Systems Standard, an AAC encoded file can include up to 48 full-bandwidth audio channels (up to 96 kHz) and 15 Low Frequency Enhancement channels (limited to 120 Hz) plus 15 data streams.

AAC encoding methods are organised into Profiles (MPEG-2) or Object Types (MPEG-4). These different Object Types are not necessarily compatible with each other and may not be playable with various decoders. Some of the various Object Types are (see Wikipedia for a full list):

  • MPEG-2 AAC LC / Low Complexity
  • MPEG-2 AAC Main
  • MPEG-2 AAC SSR / Scalable Sampling Rate
  • MPEG-4 AAC LC / Low Complexity
  • MPEG-4 AAC Main
  • MPEG-4 AAC SSR / Scalable Sampling Rate
  • MPEG-4 AAC LTP / Long Term Prediction
  • MPEG-4 AAC HE / High Efficiency
  • MPEG-4 AAC LD / Low Delay
  • USAC / Unified Speech and Audio Coding ["xHE-AAC" being a profile]
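
As a rough illustration of how a decoder can tell these Object Types apart, the sketch below reads the first two bytes of an MPEG-4 AudioSpecificConfig (5 bits object type, 4 bits sampling-frequency index, 4 bits channel configuration). The numeric IDs and the frequency table are quoted from memory of the MPEG-4 Audio specification, not from this article, so treat them as assumptions. The function name is hypothetical, and the escape values (object type 31, frequency index 15) are deliberately not handled.

```python
# Sampling-frequency index table (assumed from the MPEG-4 Audio spec).
SAMPLE_RATES = [96000, 88200, 64000, 48000, 44100, 32000,
                24000, 22050, 16000, 12000, 11025, 8000, 7350]

# A few Audio Object Type IDs matching the list above (assumed values).
OBJECT_TYPES = {
    1: "AAC Main",
    2: "AAC LC",
    3: "AAC SSR",
    4: "AAC LTP",
    5: "SBR (HE AAC)",
    23: "ER AAC LD",
    42: "USAC",
}

def parse_audio_specific_config(data: bytes) -> dict:
    """Decode the leading 13 bits of an AudioSpecificConfig."""
    if len(data) < 2:
        raise ValueError("need at least 2 bytes")
    bits = int.from_bytes(data[:2], "big")
    object_type = bits >> 11            # top 5 bits
    freq_index = (bits >> 7) & 0xF      # next 4 bits
    channel_config = (bits >> 3) & 0xF  # next 4 bits
    if object_type == 31 or freq_index >= len(SAMPLE_RATES):
        raise ValueError("escape or reserved value, not handled in this sketch")
    return {
        "object_type": object_type,
        "object_type_name": OBJECT_TYPES.get(object_type, "other"),
        "sample_rate": SAMPLE_RATES[freq_index],
        "channel_config": channel_config,
    }

# 0x12 0x10 is a commonly seen config: AAC LC, 44.1 kHz, 2 channels.
print(parse_audio_specific_config(bytes([0x12, 0x10])))
```

A player that only supports LC could use a check like this to reject a stream up front rather than fail mid-playback.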

Different Object Types vary in complexity, and the more complex ones take longer to encode and decode. Furthermore, the benefits of the more complex profiles are often not worth the CPU power required to encode/decode them. As a result, the Low Complexity/LC Object Type has become the profile used by most encoders and supported by most decoders. However, the High Efficiency (HE) Object Type has become more popular recently with its addition to the Nero and QuickTime AAC encoders.

Currently all players support the LC Object Type, although some will work with only MPEG-2 or only MPEG-4 streams. Players based on the FAAD2 decoder (e.g. foobar2000, Winamp plugins) support almost all Object Types, including HE AAC. 3ivX also supports all Object Types except SSR.
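
The MPEG-2 versus MPEG-4 distinction is visible in the stream itself. The sketch below assumes the raw AAC stream uses ADTS framing, which is common for unwrapped .aac files but is not named in this article. The bit layout is quoted from memory of the ADTS header definition, and `parse_adts_header` is a hypothetical helper, not part of any decoder mentioned here.

```python
# Profile field values in an ADTS header (object type minus 1, assumed).
PROFILES = ["Main", "LC", "SSR", "LTP or reserved"]

def parse_adts_header(header: bytes) -> dict:
    """Decode the fixed fields of a 7-byte ADTS frame header."""
    if len(header) < 7:
        raise ValueError("an ADTS header is at least 7 bytes")
    # The 12-bit syncword is all ones.
    if header[0] != 0xFF or (header[1] & 0xF0) != 0xF0:
        raise ValueError("no ADTS syncword")
    # ID bit: 0 marks an MPEG-4 stream, 1 an MPEG-2 stream.
    mpeg_version = "MPEG-2" if (header[1] >> 3) & 1 else "MPEG-4"
    profile = (header[2] >> 6) & 0x3
    freq_index = (header[2] >> 2) & 0xF
    channel_config = ((header[2] & 0x1) << 2) | (header[3] >> 6)
    # 13-bit frame length, counted in bytes and including the header.
    frame_length = ((header[3] & 0x3) << 11) | (header[4] << 3) | (header[5] >> 5)
    return {
        "mpeg_version": mpeg_version,
        "profile": PROFILES[profile],
        "freq_index": freq_index,
        "channel_config": channel_config,
        "frame_length": frame_length,
    }

# Hand-built example header: MPEG-4, LC, frequency index 4, 2 channels.
print(parse_adts_header(bytes.fromhex("fff15080021ffc")))
```

A player restricted to one MPEG version would inspect `mpeg_version` here; flipping the ID bit (second byte `0xF9` instead of `0xF1`) yields the MPEG-2 case.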

Technologies used for compression

  • Huffman coding
  • Quantization and scaling
  • M/S matrixing
  • Intensity stereo
  • Channel coupling
  • Backward adaptive prediction
  • Temporal Noise Shaping (TNS)
  • Modified Discrete Cosine Transform (MDCT; IMDCT when decoding)
  • Gain control and hybrid filter bank (polyphase quadrature filter (IPQF)+IMDCT)
  • Long Term Prediction (LTP) – MPEG-4 AAC only
  • Perceptual Noise Substitution (PNS) – MPEG-4 AAC only
  • Spectral Band Replication (SBR) – HE AAC
    • Enhanced SBR (eSBR) – USAC/xHE-AAC
  • Parametric stereo (PS) – HE AAC
    • "MPEG" parametric surround – USAC/xHE-AAC
  • Algebraic code-excited linear prediction (ACELP)-like – USAC/xHE-AAC (speech-optimized method)
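
Of the tools listed above, M/S matrixing is the easiest to show in a few lines. The sketch below applies the basic mid/side transform to plain lists of sample values. This is a simplification: as far as I know, an actual AAC encoder applies it to spectral coefficients and switches it on or off per band, which is not modelled here. The function names are made up for the example.

```python
def ms_encode(left, right):
    """Convert left/right values into mid (average) and side (half-difference)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

def ms_decode(mid, side):
    """Invert ms_encode: left = mid + side, right = mid - side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

# Nearly identical channels: the side signal is almost all zeros,
# which is what makes it cheap to code.
left = [0.5, 0.25, -0.75, 1.0]
right = [0.5, 0.25, -0.5, 1.0]
mid, side = ms_encode(left, right)
print(mid, side)
print(ms_decode(mid, side) == (left, right))
```

The point of the transform is that strongly correlated stereo material concentrates its energy in the mid channel, leaving little to spend bits on in the side channel.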

Encoders / Decoders (Supported Platforms)

Current AAC encoders (most to least recommended)

  Rank  Encoder                              Availability
  1     Apple AAC                            M/W
  2     FhG AAC (Winamp)                     W
  3     Fraunhofer FDK AAC                   S/L/M/W
  4     Nero AAC                             L/W
  5     FFmpeg 3.0+ AAC encoder              S/L/M/W
  6     FAAC                                 S/L/M/W
  7     Libav (pre-3.0 FFmpeg) AAC encoder   S/L/M/W

  Key: S = source code available; L = Linux; M = macOS; W = Windows

List of AAC encoders

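Current

  • Fraunhofer "FhG" AAC (Windows) Distributed as binary library only, included in Winamp. Can be extracted and used with a CLI wrapper.
  • Fraunhofer FDK AAC (Android, Linux, others)
  • Nero AAC (Linux, Windows)
    • No longer developed nor maintained by Nero, but the latest release is stable and very good.
  • Apple AAC (MacOS X, Windows)
  • FFmpeg 3.0+ native AAC encoder (Multiplatform)
  • exhale (xHE-AAC only, Multiplatform, experimental)

Obsolete

  • Libav/FFmpeg pre-3.0 native AAC encoder (Multiplatform)
    • Use new FFmpeg encoder.
  • FAAC/FAAD (Multiplatform)
    • Development appears stagnant, but the latest release is stable.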

Past

  • aacplusenc (Multiplatform) (Dead?)
  • PsyTEL, developed into Nero AAC, now obsolete. (Windows)
  • HHI/zPlane Compaact!, short-lived closed-source project, disappeared. (Windows)


Patent situation

The owners of the AAC patent pool expect to charge a per-unit fee from codec makers.[1] In theory at least, compiling your own version of an encoder or decoder means you are "making" an unlicensed product. To this day very few people even care about this: FFmpeg happily links you to binary builds that contain its native decoder, capable of decoding even AAC-LD, which has a patent that lasts until 2030.[2]

External References

  • AAC FAQ
  • Read the AAC guide to learn how to obtain AAC/MP4 files out of WAV files and CDs.
  • Detailed AAC comparisons can be found at Roberto's listening tests page.
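
For a quick start with one of the encoders listed on this page, the sketch below builds an FFmpeg command line that encodes a WAV file with FFmpeg's native AAC encoder into an .m4a file. It assumes an `ffmpeg` binary (version 3.0 or later) is on the PATH. The file names and the 128 kbps default are placeholders, and `ffmpeg_aac_command` is a hypothetical helper. The AAC guide linked above remains the place to look for recommended settings.

```python
import shlex

def ffmpeg_aac_command(src: str, dst: str, bitrate_kbps: int = 128) -> list:
    """Build an argument list for encoding `src` to AAC with FFmpeg's native encoder."""
    return [
        "ffmpeg",
        "-i", src,                    # input file (e.g. a WAV)
        "-c:a", "aac",                # FFmpeg's native AAC encoder
        "-b:a", f"{bitrate_kbps}k",   # target audio bitrate
        dst,                          # output; an .m4a name selects the MP4 container
    ]

cmd = ffmpeg_aac_command("input.wav", "output.m4a")
print(shlex.join(cmd))
# To actually run the encode (requires ffmpeg to be installed):
#   import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids quoting problems with file names that contain spaces.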