Monkey's Audio

From Hydrogenaudio Knowledgebase

Monkey's Audio (APE) is a lossless audio codec with its own file format (.ape). It is distributed as a free open-source (since August 2023) encoder/decoder with a Windows GUI & CLI for conversion, and a development kit to facilitate support in audio players and other software. Stereo decoding is supported by ffmpeg, offering playback on a larger number of platforms, and is also found in Rockbox-equipped portable players. An older version for *n*x platforms exists, and current versions can apparently also be compiled there.[1]

Recent (official) Monkey's Audio supports multi-channel and high resolution audio and several input formats. A player which incorporates a recent official version can play back all these, while older software and ffmpeg-based players might be restricted to mono/stereo at 24-bit resolution or less. Tagging is widely supported.

Performance-wise, Monkey's was in the early 2000s the go-to codec for saving storage, achieving better compression ratios than most competitors at a then-significant CPU cost for both encoding and decoding, possibly taxing (now legacy!) portable players so much that users would have to select a lighter mode.[2] As of 2024, no other open-source end-user codec achieves similar ratios (at normal resolution). Monkey's Audio's compression performance was a major inspiration and yardstick in the development of TAK.[3] Monkey's, in turn, was inspired by early WavPack;[4] both of these codecs have adopted Monkey's tagging scheme, called APEv2, and so has the even heavier-compressing OptimFROG.

The Monkey's developer is also developing the JRiver Media Center player.[5] The naming similarity to the MediaMonkey player is apparently a coincidence, but it did cause some confusion at the time, especially since that player could play APE files before changing its name from Songs-DB to MediaMonkey in 2003, at about the same time as MediaJukebox changed its name to JRiver Media Center.

Features

Historically, Monkey's Audio users have largely based their choice on CDDA compression ratios. Several of the features listed below were added over time; the HA Wiki's Lossless Codec Comparison gives a rough overview, but not all software supports the features that the codec and file format might offer. As a rule of thumb, features that the following list marks with version 4 or above might require (a player that uses) a recent official tool. "All" versions refers to 3.99 (2004) or above, as that is apparently what the third-party implementations are based on.

  • Seekable playback (but not streamability). A player like VLC may disable its seekbar, but that is not a limitation of the format itself.
  • High resolution audio support: 24-bit in all versions, 32-bit integer since version 5 (2019), 32-bit float since version 10 (2023).
  • Multichannel support starting with version 4.86 (2019).
  • Supports linear PCM in virtually every relevant input format: WAVE/AIFF/W64/RF64/BW64/CAF/AU, and can handle > 4 GiB input (2022/2024, use version 10.65 or above).
  • Non-audio chunks (RIFF and similar) stored; Monkey's will not only encode/decode the audio losslessly, but store the non-audio chunks and restore them into a file bit-identical to the original.
  • Piping support (since 7.26 in 2022, also earlier available in a special patch provided by shntool).
  • Tagging: APEv2 tags, also used in a few other lossless formats. Also ID3, while usually not advisable, is provided as an option for players which do not support other tag formats.[6]
  • Cuesheet support.
  • Unicode support.
  • Fast verification and full verification are both available.

The official command-line tool can perform all the above, and also re-encode Monkey's to Monkey's with tag transfer.

The Windows GUI offers further functionality:

  • Convert from other lossless codecs (included in the distribution), with transfer of APEv2 tags; FLAC tag transfer recently added and might still be developed.
  • Bulk file handling in parallel, spawning one CPU thread per file up to user-defined maximum.

For users of other codecs: peculiarities, design solutions, limitations, ...

Monkey's Audio has had several special design solutions. One is no longer unique at all, having seen wider adoption: the custom APE tags, which after a redesign into APEv2 by Musepack developer Frank Klemm were adopted not only by the Monkey's Audio format, but also by WavPack, OptimFROG and TAK, and have (though limited) use in MP3 files as well. APEv2 tags are at the end of the file; that means retagging will not trigger the full file rewrite sometimes necessary in front-tagged formats (FLAC and ALAC), but applications may sometimes take longer to scan for artwork and other tags.

Several of Monkey's design choices still set it apart from other codecs and their implementations:

  • There is a "cuesheet alternative" in Ape Link files (APL) – optional, and users who find it unfamiliar can ignore it.
  • The format is not streamable. Users should note that streamability is not needed in a playback solution that has access to the entire file; an end-user can simply check if a player solution does support Monkey's.
  • Error resilience is not implemented in the official version: should a file be corrupted, if only by a single flipped bit, the official decoder will halt upon encountering the error, dropping the rest of the file. One can seek past an error as long as one points the decoder to a subsequent block, and ffmpeg can also continue decoding past the error (assuming the file is stereo, for ffmpeg to support it at all).
    • Even when salvaging audio that way, dropouts might be severely long for the higher modes, due to the large block size. ("All" lossless compressed formats will have a sample calculated from past samples until a block boundary, but in most others a block is a fraction of a second.)
  • Monkey's offers error detection, including through its MD5 checksum – but unlike formats where the MD5 identifies the (unencoded) audio signal, Monkey's will checksum the encoded stream. Enabling MD5 upon encoding a CD rip in FLAC (on by default), WavPack, TAK and OptimFROG will all store the same MD5 and can be used to identify the audio; encoding in Monkey's "Normal" or "High" will yield two distinct ones.
  • For the most common audio formats, the bitstream has largely been frozen permanently; a file encoded with 3.99 in "High" mode compresses to the same file as one encoded with the most recent version. That is, not only do they represent the same audio and non-audio, the encoded files are bit-identical. This property is in stark contrast to the exceptions:
  • On several occasions, new features – including new signal support – have been altered soon after introduction in a compatibility-breaking way,[7] and the official website does not offer older versions to be downloaded. The third party site Videohelp.com offers several older versions for download[8], but the files themselves rarely indicate what version was used to encode them.
    • Also, third-party decoding is usually limited to two channels and at most 24 bits.
  • Non-audio metadata is not only included; there is no (documented) way to discard it without piping.
    • The user might take note that the Monkey's help file's proposal for encoding from pipe – namely, using ffmpeg – is not lossless for > 16 bits. This is not a Monkey's limitation, it is a design choice in ffmpeg. It is possible for power users to read off audio format properties and extract audio for piping in a lossless manner, but it does require command-line skills. Likely one would rather use a player with conversion support; below is a guide on how to do it with foobar2000.
    • Users might note that full file preservation is irrelevant for CD rips. Contrary to a common misconception, CD audio is not stored as WAVE – nor in any sort of file. Thus, using a "full file compressor" gives no more "true" copy of a CD.
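The difference between checksumming the audio and checksumming the encoded stream can be illustrated with a small sketch. It does not use Monkey's Audio at all: zlib at two effort levels is an assumed stand-in for a lossless codec at two compression modes, and the byte pattern is an assumed stand-in for PCM data.

```python
import hashlib
import zlib

def md5_hex(data: bytes) -> str:
    """MD5 of a byte string, as a hex digest."""
    return hashlib.md5(data).hexdigest()

# Stand-in for unencoded PCM audio (an arbitrary byte pattern, not real music).
pcm = bytes(range(256)) * 64

# Stand-in for one lossless codec at two compression modes.
encoded_light = zlib.compress(pcm, 1)
encoded_heavy = zlib.compress(pcm, 9)

# Checksum of the decoded audio: identical whatever the mode,
# so it can be used to identify the audio itself.
audio_md5_light = md5_hex(zlib.decompress(encoded_light))
audio_md5_heavy = md5_hex(zlib.decompress(encoded_heavy))

# Checksum of the encoded stream: depends on how the audio was encoded.
stream_md5_light = md5_hex(encoded_light)
stream_md5_heavy = md5_hex(encoded_heavy)

print("audio MD5 equal: ", audio_md5_light == audio_md5_heavy)
print("stream MD5 equal:", stream_md5_light == stream_md5_heavy)
```

An audio checksum lets one compare a FLAC and a WavPack encode of the same rip; a stream checksum only tells whether one particular encoded file is intact.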

Monkey's has traditionally had less hardware support than FLAC, but there have been some hardware solutions available (at least for CDDA), including in-car units. The distinction between "hardware" players and "software" players is arguably blurred with embedded Linux-based hardware and with ffmpeg-based playback through Android and iOS players, which have brought both Monkey's playback and several other formats to more devices. Android users might note that this OS does not natively support lossless audio formats other than WAVE and FLAC, so there have been issues getting certain Android solutions to recognize other media files including Monkey's;[9] apparently, using an SD card solves it.

The license controversy

From version 10.18 (August 2023), Monkey's has been released under the free and open-source 3-clause BSD license, which is the same as WavPack and the official FLAC libraries.

Earlier on it was released under its own unorthodox license, which made source available on terms deemed non-free.[10] Controversy arose both because these terms were perceived to encourage using the software to violate a major FOSS license, and also because several users forked it to FOSS repositories, checking off one of the site's approved licenses, hence violating the Monkey's license. The remains of one *n*x fork are available at GitHub, and a Java port also exists (see the Software support subsection below).

A legacy license issue might be insignificant to end-users, but as of May 2024 there seems not (yet?) to be any recent version ported to and maintained for non-Windows platforms, nor included in 3rd party implementations like ffmpeg.

Performance – file size, CPU load

Rewind twenty years to 2004 and Monkey's version 3.99 (the basis of the third-party implementations): Monkey's was the go-to codec for many users who gave priority to file size and were willing to wait for encoding (and decoding, including for conversion). Fast forward to now, and enthusiasts still compare performance, even if ordinary users may note that the impact on cost is often negligible. Different considerations may apply when a drive is nearly full. The HA wiki's lossless comparison article compares speed and compression ratio on CDDA at the respective codecs' default settings (interpreting Monkey's "Normal" as the default), all taken from Martijn van Beurden's comparison studies.[11] The test corpus is chosen to be a "wide" selection, whereas most music collections are biased in one direction or another genre-wise; users who are sufficiently concerned about compression might look up the individual sources in the study and/or make a test sample from their own collection. By and large, the following observations can be made:

  • CDDA: Monkey's generally spends more time (both encoding and decoding) and achieves smaller files than any other format still alive and maintained, except OptimFROG: stepping up from OptimFROG's default, it can outcompress any Monkey's mode, but at higher computational effort, especially in encoding.
    • Exception: TAK can compress like Monkey's "High", and in this study also catches "Extra high", while running faster and spending only a fraction of the decoding effort.
    • Exception: Monkey's "Fast" is arguably not competitive; flac -7 would compress slightly better and slightly faster, and decode much lighter. But "Fast" isn't Monkey's main selling point, and for "Normal" and up, any end-user compressor able to reach Monkey's file sizes on CDDA, would be closed-source.
  • Multichannel: largely as CDDA, except:
    • TAK soundly outperforms any competition on 5.1 (it is capped at 6 channels), and Monkey's was also out-compressed by the little-used MPEG-4 ALS codec.
    • On 5.1, WavPack could with considerable encoding effort catch Monkey's "Normal" and maybe "High", but not "Extra high".
  • High resolution: Inconsistent results, sometimes as CDDA and sometimes losing out to FLAC, WavPack and TAK alike.
    • Part of this inconsistency has a well-known explanation, which one may call "fake" high resolution: if a 16-bit signal is padded up with zeroes and stored in a 24-bit WAVE file, then other codecs can notice and exploit it. The Monkey's format has no such provision to deal with it (and neither has ALAC), and apparently such a revision would break compatibility.
    • What is not known is what fraction of high-bit-depth signals are actually of this kind. A particular user's collection may have a lot or very little; YMMV, and by a lot.
  • Floating-point signals (from version 10) do not obtain competitive compression ratios compared to WavPack. Apparently the purpose was to ensure that Monkey's can actually handle these signals when they emerge from certain editing software.
  • Monkey's "Insane" setting is hardly worth it – the official help file says the same about "Extra High", and the developer himself uses "High". However, while "Insane" might for some signal types create larger files than "Extra High", there were very few signals where "Extra High" could be tricked into a worse compression ratio.

These results are based on version 10, which rectifies some slowdown from older versions.
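The "fake" high resolution case mentioned above can be detected with a few lines of code. The following is a minimal sketch, not taken from any codec's source: the helper names `wasted_bits` and `effective_bit_depth` are made up for illustration, and samples are assumed to be plain signed integers.

```python
from functools import reduce

def wasted_bits(samples, bit_depth=24):
    """Count the low-order bits that are zero in every sample."""
    # OR all samples together: a bit is set if any sample has it set.
    combined = reduce(lambda a, b: a | b, samples, 0)
    if combined == 0:
        return bit_depth  # digital silence: every bit is unused
    # Isolate the lowest set bit and take its position.
    return (combined & -combined).bit_length() - 1

def effective_bit_depth(samples, bit_depth=24):
    """Bits per sample that actually carry information."""
    return bit_depth - wasted_bits(samples, bit_depth)

# 16-bit samples padded with 8 zero bits and stored as 24-bit.
sixteen_bit = [1, -2, 300, -32768]
padded = [s << 8 for s in sixteen_bit]

print(wasted_bits(padded))          # 8
print(effective_bit_depth(padded))  # 16
```

A codec that performs a check of this kind per block can encode only the 16 significant bits and restore the padding on decoding; according to the text above, the Monkey's format has no provision for doing so.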

Verification speed

Like WavPack (from version 5) and OptimFROG, Monkey's offers two integrity verification modes:

  • Verification by decoding. Takes the time that decoding takes (slightly more than encoding does). External applications like the foobar2000 player will do this when asked to verify.
  • Fast verification, that tests whether the encoded bitstream is valid. Since it does not decode, it works much faster than even FLAC verification, but not at all as fast as WavPack's (due to the latter's faster block checksum algorithm).

For checking an entire hard drive, the time difference is potentially huge. Note however that if a drive cannot be trusted, it had better be backed up first – fast verification still reads all the encoded audio; it is merely the CPU that has less to do.

Other formats (FLAC, TAK, ...) do employ block checksums and could implement a fast verification, although as of writing it has not been implemented.
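The two verification modes can be sketched with a toy container. This is not the Monkey's file format: zlib is an assumed stand-in for the codec, and the dictionary layout and the function names `pack`, `fast_verify` and `full_verify` are made up for illustration.

```python
import hashlib
import random
import zlib

def pack(pcm: bytes) -> dict:
    """Toy 'encoder': store the payload plus two checksums."""
    payload = zlib.compress(pcm, 9)
    return {
        "payload": payload,
        "stream_md5": hashlib.md5(payload).hexdigest(),  # over encoded bytes
        "audio_crc": zlib.crc32(pcm),                    # over decoded audio
    }

def fast_verify(f: dict) -> bool:
    """Hash the encoded bytes only; nothing is decoded."""
    return hashlib.md5(f["payload"]).hexdigest() == f["stream_md5"]

def full_verify(f: dict) -> bool:
    """Decode everything and compare against the audio checksum."""
    try:
        pcm = zlib.decompress(f["payload"])
    except zlib.error:
        return False
    return zlib.crc32(pcm) == f["audio_crc"]

# Deterministic stand-in for PCM data.
rng = random.Random(0)
pcm = bytes(rng.getrandbits(8) for _ in range(4096))

good = pack(pcm)

# Simulate a single flipped bit in the middle of the encoded stream.
damaged = bytearray(good["payload"])
damaged[len(damaged) // 2] ^= 0x01
bad = dict(good, payload=bytes(damaged))

print(fast_verify(good), full_verify(good))  # True True
print(fast_verify(bad), full_verify(bad))    # False False
```

Both modes read every encoded byte from disk; the fast one merely skips the decoding work, which is where a heavy codec spends most of its CPU time.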


Using Monkey's Audio (for Windows)

Monkey's Audio comes with an installer (32-bit or 64-bit), which will install as a normal Windows application.

The graphical user interface

The GUI has several self-explanatory features, explained at the official help page. Opening it, the top-left button will allow you to select action (encode/decode/...), and with that in place, drag and drop files. Delete any that ended up in the window by mistake. If the task is encoding (or .ape to .ape reencoding), you may want to select compression setting. Press the button.

There is an options page, where you can for example set the number of files to run concurrently (each will spawn a new thread), set the process priority, and choose whether to use full verification (which decodes) or fast verification (which only checks the encoded bitstream).
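Users who script their conversions can imitate the "files to run concurrently" option outside the GUI. The following is a minimal sketch, not part of Monkey's Audio: `encode_many` and its parameters are hypothetical names, the console is assumed to be reachable as `MAC.exe`, and the argument order (input, output, `-c3000` for "High") follows the console syntax described in this article.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def encode_many(sources, mac_exe="MAC.exe", level=3000, max_parallel=4,
                runner=None):
    """Encode each source file to .ape, at most max_parallel at a time."""
    if runner is None:
        # Default: actually launch the console and fail loudly on errors.
        runner = lambda args: subprocess.run(args, check=True)

    def job(src):
        dst = str(Path(src).with_suffix(".ape"))
        runner([mac_exe, str(src), dst, f"-c{level}"])
        return dst

    # One worker thread per file, capped at max_parallel.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(job, sources))

if __name__ == "__main__":
    # Dry run: print the command lines instead of executing them.
    encode_many(["track01.wav", "track02.wav"], max_parallel=2, runner=print)
```

Passing a different `runner` keeps the sketch testable without the console installed; with the default runner each thread simply waits for its own MAC process to finish.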

The command-line utility (for Windows: command-line hints here) and the .bat file

Most users will be satisfied with the GUI, which is a graphical front-end for the Monkey's Audio Console, called MAC.exe on Windows. Sometimes one may want to avoid the GUI. To that end, one can find the console in the installation directory, normally C:\Program Files\Monkey's Audio x64\ or C:\Program Files (x86)\Monkey's Audio\ . There is also a .bat file there to support encoding by drag and drop.

MAC.exe can be copied stand-alone, after which users who do not want to have the full application installed, can uninstall it. It can also be extracted from the installation executable by opening it as an archive. The .bat can be copied/extracted as well, to the same directory as where the MAC.exe resides.

The following are the basic commands for the console:

  • The help text: MAC -h
  • For encoding, decoding and conversion from .ape to .ape, the basic command-line is MAC infile outfile [option] where the [option] will be as follows:
    • For decoding, when infile is .ape and outfile is e.g. WAVE, -d
    • For encoding to .ape, -c1000 or -c2000 or ... -c5000 for "Fast", "Normal", "High", "Extra High" and "Insane" mode. According to the help file, the developer himself uses "High".
    • For transcoding .ape to .ape: Replace the "c" by "n", giving option -n1000 to -n5000
  • For verification: MAC apefile -v or MAC apefile -V

Tags can also be added, although end-users will likely resort to a more user-friendly application.
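The basic commands listed above can be assembled programmatically. The following is a minimal sketch: `mac_command` and `run_mac` are hypothetical helper names, and the console is assumed to be on the PATH as `MAC`. Only the options named in this article are used (`-c`, `-n`, `-d`, `-v`).

```python
import subprocess

# Mode names and their numeric levels, as listed above.
LEVELS = {"Fast": 1000, "Normal": 2000, "High": 3000,
          "Extra High": 4000, "Insane": 5000}

def mac_command(action, infile, outfile=None, level="High", mac="MAC"):
    """Build the argument list for one console invocation."""
    if action == "verify":
        return [mac, infile, "-v"]          # no output file needed
    if outfile is None:
        raise ValueError(f"{action} needs an output file")
    if action == "decode":
        return [mac, infile, outfile, "-d"]
    if action == "encode":
        return [mac, infile, outfile, f"-c{LEVELS[level]}"]
    if action == "transcode":               # .ape to .ape
        return [mac, infile, outfile, f"-n{LEVELS[level]}"]
    raise ValueError(f"unknown action: {action}")

def run_mac(*args, **kwargs):
    """Run the console; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(mac_command(*args, **kwargs), check=True)

if __name__ == "__main__":
    print(mac_command("encode", "album.wav", "album.ape", level="Extra High"))
    print(mac_command("verify", "album.ape"))
```

Passing the arguments as a list rather than one string avoids quoting problems with paths such as C:\Program Files\Monkey's Audio x64\, which contains both spaces and an apostrophe.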

With players and other applications

Decoding and playback

Several players support Monkey's Audio out of the box. The developer is affiliated with JRiver Media Center, which integrates Monkey's Audio support. The foobar2000 player supports Monkey's decoding out of the box in version 2.x, so the foo_input_monkey component is no longer needed. Users of 1.6 will still have to install it.

Encoding with ExactAudioCopy

The Wiki has a guide for Configuring EAC and Monkey's Audio, for CD ripping.

Encoding with foobar2000

To encode to Monkey's Audio with foobar2000, one needs to set it up as a custom encoder. Follow the wiki's guideline on conversion and select a custom encoder. You will get a panel like the one displayed in the custom presets article. The entries could look like the following:

      Encoder: C:\Program Files\Monkey's Audio x64\MAC.exe
      Extension: ape
      Parameters: %s %d -c3000
      Format is: lossless (or hybrid)
      Highest BPS mode supported: 32
      Encoder name: APE (Monkey's Audio)
      Bitrate: (ignore this, it is for lossy)
      Settings: high

The first line should match your actual path to MAC.exe; if you installed elsewhere (for example the 32-bit version to C:\Program Files (x86)), you have to modify it accordingly. In the third line, the "-c3000" indicates "High" mode. Use -c1000/-c2000/-c3000/-c4000/-c5000 according to preference, as explained above. The "Highest BPS mode supported:" is here set to 32. That means it will allow 32 bits per sample files to be encoded to Monkey's; if you want to use them on an ffmpeg-based player, you could set it to 24 bits to avoid creating files that ffmpeg's decoder will reject. The remaining lines are free text that will show up for your information; if you make a mistake there, the encoder will ignore it.

Software support

3rd party ports and implementations:

  • FFmpeg - decoding, stereo only
  • JMAC - Java implementation of version 3.99
  • *n*x port - based on 3.99 with pipe support from shntool
  • Shntool - conversion which also supports legacy formats sometimes found in live show trading communities

Other converters that support Monkey's include CUETools (CDDA only).

Players: Players like JRiver and foobar2000 employ the official Monkey's Audio SDK for support. VLC and several players based on ffmpeg and the Bass audio library do have some support, which may be limited to stereo and 24 bits; a user who wants to play .ape files with higher resolution or channels might have to simply try, as detailed information is often not stated. For macOS, there is Cog.

Tagging and audio info: Several players support tagging, and the APEv2 tag scheme is also widely supported among stand-alone taggers. Not all of the following have been tested with all possible Monkey's files (like, high channel count float from .au source).


Further reading

  1. Version History, note on 10.60 compiling on Gentoo
  2. Rockbox codec overview
  3. http://thbeck.de/Tak/Tak.html#Entwicklung Thomas Becker explains the TAK development (in German)
  4. Monkey's Audio historical FAQ (archived on October 17, 2000)
  5. JRiver key personnel
  6. Version History, note on 10.19 about the developer's in-car unit demanding ID3v1
  7. Rockbox' page on Monkey's Audio
  8. Several old Monkey's versions at videohelp.com
  9. HA thread with link to Android bug report by the Monkey's Audio author
  10. Debian mailing list discussion on the old Monkey's licensing terms
  11. Martijn van Beurden: Lossless audio codec comparison archive, all comparisons in this wiki article consistent with results reported in revision 6, 2023, using Monkey's Audio 10.17.