Difference between revisions of "Joint stereo"

From Hydrogenaudio Knowledgebase
(+References section. Pruned external links.)
(17 intermediate revisions by 7 users not shown)
Joint stereo coding methods try to increase coding efficiency when encoding stereo signals by exploiting commonalities between the left and right signals. There are two common joint stereo coding algorithms: mid-side (MS) stereo coding and intensity stereo coding. MS stereo applies a matrix to the left and right channel signals, computing the sum and difference of the two original signals. Whenever a signal is concentrated in the middle of the stereo image, MS stereo can achieve a significant saving in [[bitrate]]. Even more important, applying the inverse matrix in the decoder makes the quantization noise correlated, so it falls in the middle of the stereo image, where it is masked by the signal.
+
'''Joint stereo''' is a property of an audio data stream and means that the stream supports more than one method of stereo coding, such as SS ("simple" or "L/R" stereo or DualMono), MS ("mid-side" stereo), or IS ("intensity" stereo). A joint stereo stream may still only employ a single coding method, but for the sake of efficiency or quality may switch between methods on a frame or even sub-frame basis.
  
[[Intensity stereo]] coding is a method that saves bitrate by replacing the left and right signals with a single representative signal plus directional information. This replacement is psychoacoustically justified in the higher [[frequency]] range, since the human auditory system is insensitive to signal phase at frequencies above approximately 2 kHz.
+
For example, a high-[[bitrate]] "joint stereo" [[MP3]] file may contain a mixture of SS and MS frames, or it may contain all SS frames or all MS frames. A non-"joint stereo" MP3 will never contain a mixture of frame types.
  
Intensity stereo is by definition a [[lossy]] coding method, so it is primarily useful at low bitrates. At higher bitrates, only MS stereo should be used.
+
"Joint stereo coding methods" in prose generally refers to whatever alternatives to simple (L/R) stereo coding are supported by a particular format, even though simple stereo is also an option.
  
Text © Menno Bakker - [http://www.audiocoding.com/ Audiocoding]
+
==Stereo coding methods or "modes"==
  
 +
===Left-Right (L/R) or "Simple" Stereo (SS)===
 +
Simple stereo is the most straightforward method of coding a stereo signal: each channel is treated as a completely separate entity. This can be inefficient and may adversely impact quality (as compared to other modes) when both channels contain nearly identical signals (i.e., are mono or nearly so).
  
Some more details, history and examples about joint stereo & mid/side coding:
+
===Mid-side Stereo (MS)===
 +
Mid-side stereo coding calculates a "mid" channel from the sum of the left and right channels and a "side" channel from their difference, i.e.:
  
Mid/side coding can be lossless, as it is in lossless formats such as FLAC, WavPack and Monkey's Audio (APE). A lossy encoder, however, tries to minimize all perceptible losses, and it has to weigh the stereo mode against everything else it spends bits on, such as the mids and the highs.
 
So in lossy formats such as MP3 (LAME, Fraunhofer, Xing), Musepack (MPC) and Vorbis, mid/side coding may be mathematically lossless, perceptually lossless (transparent), or clearly lossy at low bitrates.
 
The quality of mid/side (joint stereo) coding therefore depends on the lossy format and on the encoder.
 
Implementations range from obviously bad-sounding bugs in some old Fraunhofer MP3 encoders (the "Radium hack"), through the less optimized performance of the Xing MP3 encoder, to the optimized joint stereo modes of LAME, which choose between simple stereo and mid/side coding frame by frame to maximize quality. Advanced encoders such as LAME, Musepack and Vorbis also raise the quality of their joint stereo coding as the general quality (q) level increases. Examples:
 
LAME: the presets, e.g. the -V quality levels, use different -msfix values, up to the "nssafejoint" mode at 320 kbit/s CBR.
 
Musepack: the various q-levels/presets and the --ms x switch.
 
Vorbis: the presets; for example, "lossless stereo coupling" is used above q6.
 
  
So, regarding js/mid/side coding, there is no black & white, there are shades of grey, dependent on the target bitrate or quality level.
+
<center><math>Left = L \qquad Right = R\,</math></center>
(This holds even if you ignore faulty implementations such as the old Fraunhofer "Radium hack" or Xing encoders.) It should be said, though, that the "joint stereo = bad" prejudice very probably originates from these faulty implementations in old encoders.
+
  
written by user - [http://www.High-Quality.ch.vu/ High Quality Audio guides]
+
 
 +
<center><math>Middle=\frac{L+R}{2} \qquad Side=\frac{L-R}{2}</math></center>
 +
 
 +
 
 +
<center> <math>Left=Middle+Side \qquad Right=Middle-Side</math></center>
 +
 
 +
 
 +
Whenever a signal is concentrated in the middle of the stereo image (i.e. more mono-like), mid-side stereo can achieve a significant saving in bitrate, since one can use fewer bits to encode the side-channel. Even more important is the fact that by applying the inverse matrix in the decoder, the quantization noise becomes correlated and falls in the middle of the stereo image, where it is masked by the signal.
 +
 
 +
Unlike [[Joint stereo#intensity stereo|intensity stereo]], which destroys phase information, mid-side coding keeps the phase information largely intact. Correctly implemented mid-side stereo does little or no damage to the stereo image and increases compression efficiency, either by reducing size or by increasing overall quality.
 +
 
 +
===Intensity Stereo===
 +
 
 +
Intensity stereo coding is a method that saves bitrate by replacing the left and right signals with a single representative signal plus directional information. This replacement is psychoacoustically justified in the higher [[frequency]] range, since the human auditory system is insensitive to signal phase at frequencies above approximately 2 kHz.<ref>http://www.hydrogenaudio.org/forums/index.php?showtopic=1491&view=findpost&p=14091</ref>
 +
 
 +
Intensity stereo is by definition a [[lossy]] coding method, so it is primarily useful at low bitrates. At higher bitrates, only mid-side stereo should be used.
 +
 
 +
==Additional information==
 +
 
 +
Some early MP3 encoders didn't make ideal decisions about which mode to use from frame to frame in joint stereo files, or about how much bandwidth to allocate to the side channel. This led to a widespread but mistaken belief that an abundance of M/S frames, or the use of joint stereo in general, always harms channel separation and other measures of audio quality. This is not an issue with modern encoders. Modern, optimized encoders switch between mid-side coding and simple stereo coding as necessary, depending on the correlation between the left and right channels, and allocate channel bandwidth appropriately so that the best mode is used for each frame.
 +
 
 +
==External Links==
 +
* [http://en.wikipedia.org/wiki/Joint_stereo joint stereo at Wikipedia]
 +
 
 +
==References==
 +
<references/>
 +
 
 +
[[Category:Technical]]

Revision as of 21:58, 23 March 2014

Joint stereo is a property of an audio data stream and means that the stream supports more than one method of stereo coding, such as SS ("simple" or "L/R" stereo or DualMono), MS ("mid-side" stereo), or IS ("intensity" stereo). A joint stereo stream may still only employ a single coding method, but for the sake of efficiency or quality may switch between methods on a frame or even sub-frame basis.

For example, a high-bitrate "joint stereo" MP3 file may contain a mixture of SS and MS frames, or it may contain all SS frames or all MS frames. A non-"joint stereo" MP3 will never contain a mixture of frame types.
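As a rough illustration of how frame types can differ within one file, the sketch below reads the stereo-related bits of a single frame header. It assumes the usual 32-bit MPEG-1 Layer III header layout (2-bit channel mode followed by a 2-bit mode extension); the function name `describe_frame` and the example header values are made up for illustration and are not taken from any particular encoder.

```python
# Assumed layout: MPEG-1 Layer III frame header, 32 bits, big-endian.
# Bits 7-6 hold the channel mode, bits 5-4 the mode extension.
CHANNEL_MODES = {0: "stereo", 1: "joint stereo", 2: "dual channel", 3: "mono"}


def describe_frame(header: int) -> dict:
    """Decode the stereo-related fields of one frame header (hypothetical helper)."""
    if (header >> 21) & 0x7FF != 0x7FF:
        raise ValueError("missing frame sync")
    channel_mode = (header >> 6) & 0x3
    mode_ext = (header >> 4) & 0x3
    joint = channel_mode == 1  # the extension bits only matter in joint stereo
    return {
        "channel_mode": CHANNEL_MODES[channel_mode],
        "ms_stereo": joint and bool(mode_ext & 0x2),
        "intensity_stereo": joint and bool(mode_ext & 0x1),
    }


# Two frames that could sit side by side in one joint stereo stream:
ms_frame = describe_frame(0xFFFB9064)  # joint stereo, MS switched on
lr_frame = describe_frame(0xFFFB9044)  # joint stereo, coded as plain L/R
```

Both example headers declare "joint stereo", yet only the first frame actually uses mid-side coding, which is the mixture of frame types described above.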

"Joint stereo coding methods" in prose generally refers to whatever alternatives to simple (L/R) stereo coding are supported by a particular format, even though simple stereo is also an option.

Stereo coding methods or "modes"

Left-Right (L/R) or "Simple" Stereo (SS)

Simple stereo is the most straightforward method of coding a stereo signal: each channel is treated as a completely separate entity. This can be inefficient and may adversely impact quality (as compared to other modes) when both channels contain nearly identical signals (i.e., are mono or nearly so).

Mid-side Stereo (MS)

Mid-side stereo coding calculates a "mid" channel from the sum of the left and right channels and a "side" channel from their difference, i.e.:


<math>Left = L \qquad Right = R</math>

<math>Middle = \frac{L+R}{2} \qquad Side = \frac{L-R}{2}</math>

<math>Left = Middle + Side \qquad Right = Middle - Side</math>


Whenever a signal is concentrated in the middle of the stereo image (i.e. more mono-like), mid-side stereo can achieve a significant saving in bitrate, since one can use fewer bits to encode the side-channel. Even more important is the fact that by applying the inverse matrix in the decoder, the quantization noise becomes correlated and falls in the middle of the stereo image, where it is masked by the signal.
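The matrix and its inverse can be sketched in a few lines. This is a minimal illustration on plain lists of sample values, not the code of any real encoder; the function names and the sample values are chosen purely for the example.

```python
def ms_encode(left, right):
    """Forward matrix: Middle = (L + R) / 2, Side = (L - R) / 2."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Inverse matrix: Left = Middle + Side, Right = Middle - Side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


def energy(samples):
    """Sum of squared sample values."""
    return sum(v * v for v in samples)


# A nearly mono signal: the channels differ in one sample only.
left = [0.5, -0.25, 0.75, 0.125]
right = [0.5, -0.25, 0.625, 0.125]

mid, side = ms_encode(left, right)
decoded_left, decoded_right = ms_decode(mid, side)
```

For this nearly mono input the side channel carries far less energy than either original channel, which is why an encoder can spend fewer bits on it. Without quantization, decoding restores the original left and right samples.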

Unlike intensity stereo, which destroys phase information, mid-side coding keeps the phase information largely intact. Correctly implemented mid-side stereo does little or no damage to the stereo image and increases compression efficiency, either by reducing size or by increasing overall quality.

Intensity Stereo

Intensity stereo coding is a method that saves bitrate by replacing the left and right signals with a single representative signal plus directional information. This replacement is psychoacoustically justified in the higher frequency range, since the human auditory system is insensitive to signal phase at frequencies above approximately 2 kHz.[1]

Intensity stereo is by definition a lossy coding method, so it is primarily useful at low bitrates. At higher bitrates, only mid-side stereo should be used.
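The idea of "one signal plus directional information" can be sketched as follows. This is a deliberately simplified model, assuming the values stand for the coefficients of one high-frequency band and that the directional information is a pair of energy-based gains; real formats store the direction differently, and all names here are hypothetical.

```python
import math


def energy(values):
    """Sum of squared values."""
    return sum(v * v for v in values)


def intensity_encode(left, right):
    """Replace two channels by one carrier plus a gain per channel (simplified)."""
    carrier = [l + r for l, r in zip(left, right)]
    carrier_energy = energy(carrier)
    if carrier_energy == 0.0:
        # Silent or perfectly out-of-phase band: nothing to scale.
        return carrier, 0.0, 0.0
    gain_left = math.sqrt(energy(left) / carrier_energy)
    gain_right = math.sqrt(energy(right) / carrier_energy)
    return carrier, gain_left, gain_right


def intensity_decode(carrier, gain_left, gain_right):
    """Rebuild both channels as scaled copies of the carrier."""
    return [gain_left * c for c in carrier], [gain_right * c for c in carrier]


# Two channels with equal level but a phase offset between them.
left = [1.0, 0.0, -1.0, 0.0]
right = [0.0, 1.0, 0.0, -1.0]

carrier, g_l, g_r = intensity_encode(left, right)
dec_left, dec_right = intensity_decode(carrier, g_l, g_r)
```

In this example the decoded channels keep their original energy but come out identical to each other: the phase offset between left and right is gone, which illustrates why the method is lossy by definition.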

Additional information

Some early MP3 encoders didn't make ideal decisions about which mode to use from frame to frame in joint stereo files, or about how much bandwidth to allocate to the side channel. This led to a widespread but mistaken belief that an abundance of M/S frames, or the use of joint stereo in general, always harms channel separation and other measures of audio quality. This is not an issue with modern encoders. Modern, optimized encoders switch between mid-side coding and simple stereo coding as necessary, depending on the correlation between the left and right channels, and allocate channel bandwidth appropriately so that the best mode is used for each frame.
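A per-frame decision of this kind might look roughly like the sketch below. It is only a toy heuristic: the normalized correlation measure, the 0.9 threshold and the function names are assumptions made for illustration, and real encoders base the choice on their psychoacoustic model rather than on a single number.

```python
import math


def choose_mode(left, right, threshold=0.9):
    """Pick "MS" for strongly correlated channels, "LR" otherwise (toy rule)."""
    dot = sum(l * r for l, r in zip(left, right))
    norm = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    if norm == 0.0:
        return "LR"  # at least one channel is silent; nothing to correlate
    return "MS" if abs(dot) / norm >= threshold else "LR"


def choose_modes(left, right, frame_size, threshold=0.9):
    """Apply the decision independently to each frame of frame_size samples."""
    return [
        choose_mode(left[i:i + frame_size], right[i:i + frame_size], threshold)
        for i in range(0, len(left), frame_size)
    ]


# First frame: identical channels. Second frame: uncorrelated channels.
left = [1.0, 2.0, 3.0, 4.0, 1.0, 0.0, -1.0, 0.0]
right = [1.0, 2.0, 3.0, 4.0, 0.0, 1.0, 0.0, -1.0]

modes = choose_modes(left, right, frame_size=4)
```

Here the first frame is coded as mid-side and the second as simple stereo, so the resulting stream would contain a mixture of frame types.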

External Links

* joint stereo at Wikipedia: http://en.wikipedia.org/wiki/Joint_stereo

References

  1. http://www.hydrogenaudio.org/forums/index.php?showtopic=1491&view=findpost&p=14091