Opus: Difference between revisions

From Hydrogenaudio Knowledgebase
(→‎Hardware & Software Support: Edited copy of Wikipedia Support section (public domain) with references and wikilinks stripped out and some info added)
Line 270: Line 270:


== Hardware & Software Support ==
The libopus toolkit from opus-codec.org contains commandline encoders and decoders and is ready compiled for some common platforms with source code available for others.


=== Web browsers ===
Much of this section is based heavily on the Jan 12th 2013 version of the '''Support''' section of the [http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Wikipedia article], which is more likely to be kept updated and to provide links to further information about the supporting platforms.
Opus will be supported by all web browsers that support WebRTC, which is likely to include Mozilla-based browsers (Firefox, Seamonkey), Chromium and Google Chrome, Opera and Maxthon Cloud Browser initially. While support will include encoding for the purpose of interactive real time communications, it might not include file-based Ogg/Opus encoding.


Current versions of Mozilla Firefox and its cousin, Seamonkey are known to support Opus natively as of late 2012, and Maxthon reports WebRTC support from 27 Dec 2012. Chromium is also believed to support Opus, and Opera has been reported to work in late 2012 when it happens to be used with certain plugins (gstreamer).
The format and algorithms are openly documented and the reference implementation is published as free software. The reference implementation (Opus Audio Tools, opus-tools), consisting of separate encoders and decoders, is published under the terms of a BSD-like license. It is written in the C programming language and can be compiled for hardware architectures with or without a floating-point unit. The accompanying diagnostic tool opusinfo reports detailed technical information about Opus files, including information on the standard compliance of the bitstream format. It is based on ogginfo from the vorbis-tools and is therefore, unlike the encoder and decoder, available under the terms of version 2 of the GPL.


=== Audio players ===
=== Commandline binaries ===
[[foobar2000]] supports opus natively, including streaming URLs. It plays back at Opus's native 48000 Hz, so if the Convert dialogue is used to convert from Opus to another format and a different sampling rate is required or preferred, a Resampler DSP must be used in the Convert DSP chain. The original sampling rate (e.g. 44100 Hz from a CD source) is recorded in the Properties tab of .opus files. Alternatively, the opusdec.exe command from libopus can be used to decode to 16-bit WAV ([[PCM]]) and it will automatically resample to the original rate and apply noise shaped dither when quantizing to 16-bit.
The commandline tools are available pre-compiled for the most popular operating systems at [http://opus-codec.org opus-codec.org]
 
=== VoIP software ===
* The voice-chat software Mumble supports Opus as its main codec.
* SIP softphones Phoner and PhonerLite support Opus
* The SIP and IAX2 client SFLphone is being fitted with Opus support.
* Integration of Opus into the Skype client is finished, although no version with Opus support has yet been published.
* TrueConf video conferencing solutions support Opus.
* Opus support is planned for Jitsi 2.0, together with VP8 video
* Empathy may use any format supported in GStreamer, including Opus.
* Line2 has replaced their current codec with Opus. Their iOS app will be the first to be released with Opus. The Android app will follow later.
* CSipSimple supports Opus, Codec2, G.726 and G.722.1 with an additional plug-in.
* The voice-chat software TeamSpeak 3 supports Opus for voice and music in pre-release server 3.0.7-pre2 and beta client version 3.0.10
 
=== Web frameworks and browsers ===
* Opus support is mandatory for WebRTC implementations.
* Mozilla supports Opus beginning with version 15 of Firefox and Thunderbird, plus Seamonkey, which uses a shared codebase.
* Depending on the backend in use, Opera supports inline playback of embedded Opus files. Official support for Opus and WebRTC is on the development roadmap.
* Chromium and Google Chrome will have audio support as of version 25.
* Maxthon Cloud Browser
 
=== Streaming audio ===
* Icecast.
* Krad Radio
* Liquidsoap
 
=== Operating systems and desktop multimedia frameworks ===
* In Debian GNU/Linux the Opus development tools and supporting libraries can be installed from the preconfigured repositories in the next stable version ("wheezy") that is expected to be released in early 2013.
* For Microsoft Windows, there are DirectShow filters supporting Opus, including DC-Bass Source Mod and the LAV Filters.
* In GStreamer the integration of Opus support is complete.
* FFmpeg supports decoding and encoding Opus via the external library libopus.
 
=== Hardware support ===
* Support in [[Rockbox]] is available in the developer version. This means hardware support for a series of portable media players (including some products from the iPod series by Apple and Sansa, iriver and Archos devices) and with "Rockbox as an Application" (RaaA) also on Android devices.
 
=== Player software ===
* VLC media player supports Opus since version 2.0.4
* AIMP supports Opus natively as of version 3.20 build 1125 beta 1.
* [[foobar2000]] supports the format natively as of v1.1.14 beta 1.
* Mpxplay supports Opus (using a decoder DLL) as of v1.60 alpha 2
* Android has a number of player apps supporting Opus, including PowerAmp and others.
 
=== Other software ===
* CDBurnerXP
* MediaCoder
* Report-IT


== References & Notes ==

Revision as of 00:42, 13 January 2013

Opus

Opus Interactive Audio Codec
Developer(s) Xiph.Org Foundation
Release information
Stable release 1.0.2
Preview release exp_analysis7
Compatibility
Operating system Windows, Mac OS X, Linux/BSD
Additional information
Use Encoder/Decoder
License 3-clause BSD license
Website opus-codec.org

Opus is a lossy audio compression format developed by the Internet Engineering Task Force (IETF) and made especially suitable for interactive real-time applications over the Internet,[a] though it is also very competitive for use as a storage and playback format. It is an open format standardised through Request for Comments (RFC) 6716,[c] and a high-quality reference implementation is provided under the 3-clause BSD license,[a] which compiles and runs on the vast majority of general-purpose and embedded (fixed-point) processors. Many software patents which cover Opus are licensed under royalty-free terms.[b] Opus is also a Mandatory To Implement (MTI) codec for the upcoming WebRTC (Web Real Time Communication) specification of the World Wide Web Consortium (W3C).

Opus incorporates technology from two codecs, the speech-oriented SILK codec developed by Skype and the multi-purpose low-latency CELT codec developed by Xiph.org, with significant changes to each to ensure they can work together.[c] Opus can seamlessly transition among high and low bitrates, using a linear prediction codec (the SILK layer) at lower bitrates and a lapped transform codec (the CELT layer) at higher bitrates, as well as a hybrid of the two for a short overlap in which SILK encodes the 0-8 kHz spectrum and the CELT layer encodes only the frequencies above 8 kHz.[c] Opus has very low algorithmic delay (typically 22.5 ms) compared to popular music formats such as MP3, Ogg Vorbis, LC-AAC and HE-AAC (all over 100 ms), yet performs very competitively with them in terms of quality per bitrate, making it comparably viable as a storage & playback format. Also unlike these codecs, Opus does not require the definition of large codebooks for each individual file, making it also preferable for short clips of audio, such as those often used by game developers.[c]

Considerably more detail on the history and potential applications of Opus is included in the Wikipedia page for Opus (audio format).

Characteristics

Opus supports bitrates from 6 kbps to 510 kbps for typical stereo audio sources (and a maximum of around 255 kbps per channel for multichannel audio), with the 'sweet spot' for music and general audio around 30 kbps (mono) and 40-100 kbps (stereo). It is intrinsically variable bitrate, though constrained VBR and constant bitrate modes are possible where required. The target bitrate is calibrated against the internal constant quality targets so that over a typical music collection, something very close to the target bitrate will be achieved.

Opus is able to seamlessly adapt its mode of operation without glitches or sound interruption (an illustrative demonstration of bitrate scalability is on the Opus Examples page). This can be particularly useful for mixed-content audio or varying network conditions, and makes the unified Opus codec superior to a suite of different codecs that might otherwise cover the same range of bitrate and quality settings but would require out-of-band signalling to instigate codec switching. The switching includes the choice of mono, stereo and other channel mappings; the use of the speech-oriented SILK layer, the general-purpose CELT layer or the hybrid of both; and the use of different audio bandwidths (4 kHz, 6 kHz, 8 kHz, 12 kHz, 20 kHz), as well as the quality adjustments within the same operating mode that are available in most VBR-capable codecs.

Of importance mainly to interactive uses, but potentially useful in time-delayed audio streaming also, Opus includes packet loss concealment (PLC) in all modes. In the speech-oriented modes where the SILK layer is active, it also supports Forward Error Correction (FEC): the expected rate of packet loss can be indicated to the encoder by the user or by application software, and critical frames (e.g. consonant sounds) can be retransmitted at low bitrate to preserve intelligibility.

For music and general audio, the CELT layer of Opus builds on knowledge gained during xiph.org's Vorbis development and ensures as a primary goal that the total energy in each spectral band is preserved while requiring only a modest bitrate overhead to achieve this, thereby eliminating a lot of bitrate-starvation artifacts such as 'birdies' that are common in low-bitrate MP3, especially during transients, applause and cymbal sounds. This technique likewise increases coding efficiency at bitrates targeting transparent music reproduction. Short blocks (2.5 ms) are also possible for efficient transient handling. Short blocks can also be used exclusively if very low algorithmic delay (5.0 ms) is required to enable very low-latency interactive audio (e.g. live networked music performances such as remote jam sessions), though greater bitrate is then required to maintain the same quality (illustrated in Monty's CELT demo page under Constant PEAQ value, varying latency). CELT uses a number of additional techniques and provides additional advanced tools to enable encoder tuning.

Opus natively supports gapless playback (though poor player design might itself induce interruptions during playback). Support for playback gain is also required, making some form of ReplayGain or similar volume control possible in any compliant player.

Bitrate performance

For mono speech, Opus ranges from intelligible narrowband speech reproduction starting at 6 kbps to medium-band, wideband and superwideband speech, reaching full-band speech by around 32 kbps. Above about 32 kbps, the SILK layer is no longer used at all, as CELT alone gives superior quality.

For music, the SILK modes are quite tolerable and better than CELT at very low bitrates. As bitrate increases, the hybrid mode is adopted, extending bandwidth first to 12 kHz (comparable with compact cassette) and then to the full 20 kHz, after which CELT takes over. Assuming the source is stereo, the transition from mono to stereo typically happens around the transition from 12 kHz to 20 kHz.

Indicative bitrate and quality

The table below gives illustrative, indicative quality guidance based on typical modes used internally by Opus and a range of listening tests.

In the experimental libopus version 1.1-alpha, automatic detection of speech/music and bandwidth detection have been introduced to improve mode decisions, and VBR is less constrained, all with the aim of maximizing the quality/bitrate tradeoff. Thus changes are likely, and this table is likely to require small updates as the encoder is improved.

Speech encoding quality

This table assumes a monophonic source sampled at CD quality or above (typ 48 kHz sampling rate) but mentions stereo compatibility for 40kbps+. The default 20ms frame size (22.5ms latency) is assumed.

| Bitrate target | Bandwidth | typ SILK/CELT use | Speech quality notes | Use cases/notes/competitive codecs |
|---|---|---|---|---|
| 1 to 5 kbps | - | - | <6kbps bitrate not supported | Try codec2 for 1.2-2.4 kbps speech |
| 6 kbps | 4 kHz | SILK | Fair, intelligible | AMR-NB may be a little better, but higher latency & proprietary, Speex also competitive |
| 8 kbps | 4 kHz narrowband | SILK | Close to telephone quality | AMR-NB & AMR-WB similar quality, but higher latency & proprietary. Speex competitive. |
| 12 kbps | 6 kHz medium-band | SILK | Medium bandwidth, better than telephone quality | Similar quality to AMR-WB |
| 16 kbps | 8 kHz wideband | SILK | Wideband speech quality | Similar to/better than AMR-WB |
| 24 kbps | 12 kHz super-wideband | hybrid | Near transparent speech | Better than AMR-WB. Podcasts/audiobooks/talk-radio. |
| 32 kbps | 20 kHz | hybrid / possibly CELT | Essentially transparent speech plus moderately good mono music | Much better than AMR-WB. Podcasts/audiobooks/talk-radio. |
| 40 kbps | 20 kHz | CELT | Essentially transparent mono or stereo speech, fairly good stereo music | Stereo podcasts/audiobooks/talk radio with some music |
| 48 kbps+ | 20 kHz | CELT | Essentially transparent mono or stereo speech, reasonable music | Flexible general purpose modes to suit mixed music and speech |
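As a rough illustration of how the speech table above can be read, the following Python sketch maps a mono speech bitrate target to the typical bandwidth and layer listed there. The function name and the rule of using the nearest lower row for in-between bitrates are assumptions made for this example; the real encoder makes its own decisions, which vary with version and content.

```python
from bisect import bisect_right

# (minimum bitrate in kbps, typical bandwidth in kHz, typical layer), copied
# from the speech table. The 6 and 8 kbps rows share one entry because both
# list 4 kHz SILK. These are indicative values, not encoder rules.
SPEECH_MODES = [
    (6, 4, "SILK"),
    (12, 6, "SILK"),
    (16, 8, "SILK"),
    (24, 12, "hybrid"),
    (32, 20, "hybrid / possibly CELT"),
    (40, 20, "CELT"),
]


def typical_speech_mode(bitrate_kbps):
    """Return (bandwidth_khz, layer) for a mono speech bitrate target.

    Returns None below 6 kbps, which the table lists as unsupported.
    Bitrates between rows fall back to the nearest lower row (an assumption).
    """
    thresholds = [row[0] for row in SPEECH_MODES]
    index = bisect_right(thresholds, bitrate_kbps) - 1
    if index < 0:
        return None
    _, bandwidth_khz, layer = SPEECH_MODES[index]
    return bandwidth_khz, layer
```

For example, `typical_speech_mode(24)` gives the 12 kHz hybrid row, while anything under 6 kbps gives `None`.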

Music encoding quality

This table assumes a stereophonic source sampled at CD quality or above (typ 48 kHz sampling rate). Opus will automatically use mono at very low bitrates.

| Bitrate target | Stereo mode | Bandwidth | typ SILK/CELT use | Music quality notes | Use cases/notes/competitive codecs |
|---|---|---|---|---|---|
| 6 kbps | mono | 4 kHz | SILK | Poor, muffled sound but intelligible lyrics | - |
| 8 kbps | mono | 4 kHz | SILK | Poor, muffled but OK for bitrate | - |
| 14 to 16 kbps | mono | 6 kHz | SILK | Fairly poor but OK for bitrate | Perhaps acceptable for incidental music |
| 22 to 24 kbps | mono | 8 kHz | SILK | Fair but OK for bitrate | OK for incidental music |
| 32 kbps | mono | 12 kHz | hybrid | Moderately good mono, reasonably bright treble (c.f. mono cassette) | Good for podcasts, audiobooks, CELT-only poss for music. HE-AAC @32kbps is stereo full-band but with annoying artifacts. |
| 39 to 40 kbps | stereo | 12 kHz | hybrid/CELT | Moderately good stereo, reasonably bright treble (c.f. stereo cassette) | Stereo podcasts, audiobooks, very low bitrate music |
| 48 kbps | stereo | 20 kHz | CELT | Full bandwidth stereo music, some artifacts, rarely nasty | Stereo podcasts, audiobooks, low bitrate music |
| 64 kbps | stereo | 20 kHz | CELT | Full bandwidth stereo music, nice sound, detectable differences to original (mostly 'not annoying') | Music storage & streaming. Beat HE-AAC, Vorbis, MP3 in listening test |
| 96 kbps | stereo | 20 kHz | CELT | Full bandwidth stereo music, good quality approaching transparency | Music storage & high quality streaming. |
| 112 kbps | stereo | 20 kHz | CELT | Fairly close to transparency (needs more testing) | Music storage & high quality streaming. Very low-latency stereo networked music performance/jam sessions at OK quality (see below table) |
| 128 kbps | stereo | 20 kHz | CELT | Very close to transparency (needs more testing). Most modern codecs competitive (AAC-LC, Vorbis, MP3) | Music storage & streaming. Future download music sales. |
| 256 kbps | stereo | 20 kHz | CELT | Transparent with very low chance of artifacts (a few killer samples still detectable). Most old & new lossy codecs competitive. | Music storage & streaming, dedicated limited-bandwidth audio links (e.g. wireless, A2DP-bluetooth type links). |
| 510 kbps | stereo | 20 kHz | CELT | Maximum possible stereo bitrate target (actual rate often less than 510 for default frame size). Most old and new lossy codecs competitive, plus near-lossless lossyWAV and WavPack lossy | Music storage, dedicated limited-bitrate audio links (e.g. wireless), minimum latency high quality audio. LossyWAV and WavPack lossy are very competitive for storage, and WavPack lossy --blocksize=256 may be competitive with minimum latency mode also. |
| >510 kbps | - | - | - | Above Opus bitrate range allowed for stereo sources | Settle for 510kbps or use lossless, lossyWAV, WavPack lossy or lossy transform/subband codecs like Vorbis, Musepack at very high settings. |

Lower latency versus quality/bitrate trade-off

Packet overhead in interactive applications

For interactive use on the Internet or other packet-based networks, total bandwidth used will be subject to packet overhead. The more packet headers that are transmitted every second, the greater will be the overhead that is required. For this reason, Opus, while defaulting to 20.0ms frames, supports 60.0ms frames to reduce overhead when transporting low-bitrate SILK frames at the expense of greater latency, which may still be acceptable for speech, and also supports 10.0ms SILK frames to reduce latency somewhat at the expense of packet overhead.

In the CELT layer, which tends to operate at higher bitrates than SILK, 20.0ms frames are the default, but frames of 10.0ms, 5.0ms and 2.5ms are also possible. Shorter frames achieve lower latency but directly increase the packet overhead, because more packets are transmitted per second. In addition, as described below, they also worsen the quality/bitrate trade-off of the CELT layer itself.

None of the bitrates mentioned in this article account for the packet overhead.
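To give a feel for the size of this overhead, here is a small Python sketch. It assumes 40 bytes of headers per packet (IPv4 20 + UDP 8 + RTP 12) and one Opus frame per packet; both are assumptions for illustration, not figures from this article, and real deployments may differ (IPv6, header compression, encryption).

```python
def packet_overhead_kbps(frame_ms, header_bytes=40):
    """Header overhead in kbps when one frame is sent per packet.

    header_bytes defaults to 40 (IPv4 + UDP + RTP), an assumed value.
    """
    packets_per_second = 1000.0 / frame_ms
    return packets_per_second * header_bytes * 8 / 1000.0


def total_kbps(codec_kbps, frame_ms, header_bytes=40):
    """Codec bitrate plus per-packet header overhead, in kbps."""
    return codec_kbps + packet_overhead_kbps(frame_ms, header_bytes)
```

Under these assumptions the default 20.0ms frames cost 16 kbps of headers, 60.0ms frames about 5.3 kbps, and 2.5ms frames 128 kbps, which is why long frames matter most at low SILK bitrates.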

CELT layer latency versus quality/bitrate trade-off

Unlike the SILK layer, which works on fixed 10.0ms blocks, 1, 2 or 6 of which can be combined into an Opus frame, the CELT layer is able to modify the encoding block lengths available to enable its use with shorter frames.

When the CELT layer uses 10.0ms, 5.0ms and 2.5ms frames instead of the default 20.0ms, it must use smaller transform block sizes to achieve this, thereby reducing frequency resolution in the MDCT compared to the default transform window, thus reducing encoding efficiency for tonal signals. To obtain the same frequency precision for a sound divided into shorter transform windows, improved amplitude precision is necessary, resulting in increased bitrate to obtain the same perceptual quality (or conversely lower quality at the same bitrate).

These reduced-latency modes remain efficient for transient signals, which use short blocks anyway.
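The loss of frequency resolution can be estimated with a little arithmetic. The sketch below assumes a 48 kHz sampling rate and a single long-block MDCT per frame with as many coefficients as there are samples in the frame; this is a simplification for illustration, not a description of the encoder's actual block switching.

```python
SAMPLE_RATE_HZ = 48000  # assumed sampling rate for this example


def samples_per_frame(frame_ms):
    """Number of samples in one frame at SAMPLE_RATE_HZ."""
    return int(round(SAMPLE_RATE_HZ * frame_ms / 1000.0))


def mdct_bin_spacing_hz(frame_ms):
    """Approximate spacing between MDCT bins, assuming one transform per
    frame whose coefficients span 0 Hz to the Nyquist frequency."""
    return (SAMPLE_RATE_HZ / 2.0) / samples_per_frame(frame_ms)
```

With these assumptions a 20.0ms frame has 960 samples and a bin spacing of 25 Hz, while a 2.5ms frame has 120 samples and a spacing of 200 Hz, so closely spaced tonal components are resolved far less finely.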

In all modes, the algorithmic delay consists of the frame size plus an additional 2.5ms delay. The CELT layer requires 2.5ms for MDCT window overlap.
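The delay rule above is simple enough to express directly. In this Python sketch the set of frame sizes is limited to those named in this article, and the function name is made up for the example.

```python
# Frame sizes mentioned in this article, in milliseconds.
SILK_FRAME_MS = (10.0, 20.0, 60.0)       # 1, 2 or 6 blocks of 10.0ms
CELT_FRAME_MS = (2.5, 5.0, 10.0, 20.0)
EXTRA_DELAY_MS = 2.5                     # additional delay in all modes


def algorithmic_delay_ms(frame_ms):
    """Frame size plus the additional 2.5ms, as described above."""
    if frame_ms not in SILK_FRAME_MS + CELT_FRAME_MS:
        raise ValueError("frame size not listed in this article: %r" % frame_ms)
    return frame_ms + EXTRA_DELAY_MS
```

This reproduces the 22.5ms figure for the default 20.0ms frame and the 5.0ms figure for the shortest 2.5ms CELT frame.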

Xiph.org matched PEAQ scores (an approximate perceptual quality assessment made in software) for the CELT 0.10 codec, which was used as the basis of the CELT layer in the Opus reference release. The results indicate the following approximate equivalent settings for stereo music.

| Frame size | Algorithmic delay | Bitrate to match 64kbps@22.5ms delay | Fractional bitrate increase |
|---|---|---|---|
| 20.0 ms | 22.5 ms | 64.0 kbps | 0.0 % |
| 10.0 ms | 12.5 ms | 70.4 kbps | 10.0 % |
| 5.0 ms | 7.5 ms | 84.8 kbps | 32.5 % |
| 2.5 ms | 5.0 ms | 112.0 kbps | 75.0 % |

N.B. This table is useful for streaming only. For music storage & delayed playback, latency reduction is not important and the default 20.0ms frame size is preferable.
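The fractional increases in the table above can be turned into a rough planning aid. The sketch below applies the same factors to other reference bitrates; that extrapolation is an assumption, since the figures were matched only at 64 kbps for stereo music with CELT 0.10.

```python
# Bitrate multipliers from the table above, keyed by frame size in ms.
BITRATE_FACTOR = {20.0: 1.0, 10.0: 1.10, 5.0: 1.325, 2.5: 1.75}


def matched_bitrate_kbps(reference_kbps, frame_ms):
    """Estimate the bitrate needed at a shorter frame size to match the
    quality of reference_kbps at the default 20.0ms frame size.

    Exact only for the 64 kbps case in the table; other inputs are an
    extrapolation.
    """
    return reference_kbps * BITRATE_FACTOR[frame_ms]
```

For the tabulated case, `matched_bitrate_kbps(64, 2.5)` returns 112.0, matching the last row.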

Hardware & Software Support

Much of this section is based heavily on the Jan 12th 2013 version of the Support section of the Wikipedia article, which is more likely to be kept updated and to provide links to further information about the supporting platforms.

The format and algorithms are openly documented and the reference implementation is published as free software. The reference implementation (Opus Audio Tools, opus-tools), consisting of separate encoders and decoders, is published under the terms of a BSD-like license. It is written in the C programming language and can be compiled for hardware architectures with or without a floating-point unit. The accompanying diagnostic tool opusinfo reports detailed technical information about Opus files, including information on the standard compliance of the bitstream format. It is based on ogginfo from the vorbis-tools and is therefore, unlike the encoder and decoder, available under the terms of version 2 of the GPL.

Commandline binaries

The commandline tools are available pre-compiled for the most popular operating systems at opus-codec.org.
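As a hedged illustration of driving these tools from a script, the Python sketch below builds command lines for opusenc and opusdec from opus-tools. The `--bitrate` and `--framesize` options are taken from the opusenc help text; the helper names, file names and defaults are made up for this example, and the 6 to 510 kbps check simply mirrors the range quoted earlier in this article. Check `opusenc --help` for the options your build supports.

```python
import subprocess


def opusenc_command(wav_path, opus_path, bitrate_kbps=64, framesize_ms=20):
    """Build an opusenc command line (hypothetical helper)."""
    if not 6 <= bitrate_kbps <= 510:
        raise ValueError("bitrate target outside the 6-510 kbps range")
    return [
        "opusenc",
        "--bitrate", str(bitrate_kbps),
        "--framesize", str(framesize_ms),
        wav_path,
        opus_path,
    ]


def opusdec_command(opus_path, wav_path):
    """Build an opusdec command line that decodes to WAV (hypothetical helper)."""
    return ["opusdec", opus_path, wav_path]


def run(command):
    """Run a command built above; requires the tools to be on the PATH."""
    subprocess.run(command, check=True)
```

A typical call would be `run(opusenc_command("input.wav", "output.opus", bitrate_kbps=96))`, followed by `run(opusdec_command("output.opus", "decoded.wav"))` to decode the result.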

VoIP software

  • The voice-chat software Mumble supports Opus as its main codec.
  • SIP softphones Phoner and PhonerLite support Opus
  • The SIP and IAX2 client SFLphone is being fitted with Opus support.
  • Integration of Opus into the Skype client is finished, although no version with Opus support has yet been published.
  • TrueConf video conferencing solutions support Opus.
  • Opus support is planned for Jitsi 2.0, together with VP8 video
  • Empathy may use any format supported in GStreamer, including Opus.
  • Line2 has replaced their current codec with Opus. Their iOS app will be the first to be released with Opus. The Android app will follow later.
  • CSipSimple supports Opus, Codec2, G.726 and G.722.1 with an additional plug-in.
  • The voice-chat software TeamSpeak 3 supports Opus for voice and music in pre-release server 3.0.7-pre2 and beta client version 3.0.10

Web frameworks and browsers

  • Opus support is mandatory for WebRTC implementations.
  • Mozilla supports Opus beginning with version 15 of Firefox and Thunderbird, plus Seamonkey, which uses a shared codebase.
  • Depending on the backend in use, Opera supports inline playback of embedded Opus files. Official support for Opus and WebRTC is on the development roadmap.
  • Chromium and Google Chrome will have audio support as of version 25.
  • Maxthon Cloud Browser

Streaming audio

  • Icecast.
  • Krad Radio
  • Liquidsoap

Operating systems and desktop multimedia frameworks

  • In Debian GNU/Linux the Opus development tools and supporting libraries can be installed from the preconfigured repositories in the next stable version ("wheezy") that is expected to be released in early 2013.
  • For Microsoft Windows, there are DirectShow filters supporting Opus, including DC-Bass Source Mod and the LAV Filters.
  • In GStreamer the integration of Opus support is complete.
  • FFmpeg supports decoding and encoding Opus via the external library libopus.

Hardware support

  • Support in Rockbox is available in the developer version. This means hardware support for a series of portable media players (including some products from the iPod series by Apple and Sansa, iriver and Archos devices) and with "Rockbox as an Application" (RaaA) also on Android devices.

Player software

  • VLC media player has supported Opus since version 2.0.4
  • AIMP supports Opus natively as of version 3.20 build 1125 beta 1.
  • foobar2000 supports the format natively as of v1.1.14 beta 1.
  • Mpxplay supports Opus (using a decoder DLL) as of v1.60 alpha 2
  • Android has a number of player apps supporting Opus, including PowerAmp and others.

Other software

  • CDBurnerXP
  • MediaCoder
  • Report-IT

References & Notes