Joint stereo

From Hydrogenaudio Knowledgebase
Revision as of 21:12, 7 September 2011 by Mjb (Talk | contribs)

Joint stereo is a property of an audio data stream and means that the stream supports more than one method of stereo coding, such as SS ("simple" or "L/R" stereo), MS ("mid-side" stereo), or IS ("intensity" stereo). A joint stereo stream may still only employ a single coding method, but for the sake of efficiency or quality may switch between methods on a frame or even sub-frame basis.

For example, a high-bitrate "joint stereo" MP3 file may contain a mixture of SS and MS frames, or it may contain all SS frames or all MS frames. A non-"joint stereo" MP3 will never contain a mixture of frame types.
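As an illustration of how this per-frame choice shows up in an MP3 stream, the sketch below reads the 2-bit channel mode and 2-bit mode extension fields from a 4-byte Layer III frame header. The function name and the returned strings are invented for this example. It assumes the header has already been located, and it checks only the sync bits and the layer field.

```python
# Values of the 2-bit channel mode field in an MPEG audio frame header.
CHANNEL_MODES = {
    0b00: "stereo",
    0b01: "joint stereo",
    0b10: "dual channel",
    0b11: "mono",
}


def describe_stereo_coding(header: bytes) -> str:
    """Describe the stereo coding of one Layer III frame from its 4-byte header.

    Hypothetical helper for illustration; not part of any library.
    """
    if len(header) != 4 or header[0] != 0xFF or (header[1] & 0xE0) != 0xE0:
        raise ValueError("missing frame sync")
    if (header[1] >> 1) & 0b11 != 0b01:
        raise ValueError("not a Layer III frame")

    channel_mode = (header[3] >> 6) & 0b11
    mode_extension = (header[3] >> 4) & 0b11

    name = CHANNEL_MODES[channel_mode]
    if name != "joint stereo":
        return name

    # In Layer III the mode extension holds two on/off flags:
    # the high bit enables MS stereo, the low bit enables intensity stereo.
    methods = []
    if mode_extension & 0b10:
        methods.append("MS")
    if mode_extension & 0b01:
        methods.append("IS")
    if not methods:
        methods.append("SS")  # joint stereo stream, but this frame is plain L/R
    return "joint stereo: " + "+".join(methods)


print(describe_stereo_coding(bytes([0xFF, 0xFB, 0x90, 0x64])))
print(describe_stereo_coding(bytes([0xFF, 0xFB, 0x90, 0x44])))
```

Running this over every frame header of a file would show whether a joint stereo stream actually mixes frame types or uses only one.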

"Joint stereo coding methods" in prose generally refers to whatever alternatives to simple (L/R) stereo coding are supported by a particular format, even though simple stereo is also an option.

Stereo coding methods or "modes"

Left-Right (L/R) or "Simple" Stereo (SS)

Simple stereo is the most straightforward method of coding a stereo signal: each channel is treated as a completely separate entity. This can be inefficient and may adversely impact quality (as compared to other modes) when both channels contain nearly identical signals (i.e., are mono or nearly so).

Mid-side Stereo (MS)

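Mid-side stereo coding calculates a "mid" channel from the sum of the left and right channels and a "side" channel from their difference. The decoder recovers the original channels by adding and subtracting the two, i.e.: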


Left = L \qquad Right = R

Middle = \frac{L+R}{2} \qquad Side = \frac{L-R}{2}

Left = Middle + Side \qquad Right = Middle - Side


Whenever a signal is concentrated in the middle of the stereo image (i.e. more mono-like), mid-side stereo can achieve a significant saving in bitrate, since one can use fewer bits to encode the side-channel. Even more important is the fact that by applying the inverse matrix in the decoder, the quantization noise becomes correlated and falls in the middle of the stereo image, where it is masked by the signal.
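The formulas above can be tried out on a near-mono test signal. The following is a minimal sketch in plain Python; the function names, the 44.1 kHz sample rate, and the test tones are arbitrary choices for the example, and it works on raw samples rather than on the spectral data a real codec would transform.

```python
import math


def ms_encode(left, right):
    """Middle = (L + R) / 2, Side = (L - R) / 2, sample by sample."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Left = Middle + Side, Right = Middle - Side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


def energy(samples):
    return sum(v * v for v in samples)


# Near-mono test signal: a loud centred tone plus a quiet out-of-phase tone.
n, rate = 1024, 44100
centre = [math.sin(2 * math.pi * 440 * i / rate) for i in range(n)]
ambience = [0.05 * math.sin(2 * math.pi * 1000 * i / rate) for i in range(n)]
left = [c + a for c, a in zip(centre, ambience)]
right = [c - a for c, a in zip(centre, ambience)]

mid, side = ms_encode(left, right)
out_left, out_right = ms_decode(mid, side)

# The side channel carries only a small fraction of the total energy,
# which is why it can be coded with fewer bits.
print("side/mid energy ratio:", energy(side) / energy(mid))

# Without quantization the transform is reversible up to rounding error.
max_error = max(abs(a - b) for a, b in zip(left + right, out_left + out_right))
print("max round-trip error:", max_error)
```

For this signal the side channel holds well under one percent of the energy of the mid channel, and the round trip reproduces the input up to floating-point rounding.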

Unlike intensity stereo, which destroys phase information, mid-side coding keeps the phase information largely intact. Correctly implemented mid-side stereo does little or no damage to the stereo image and improves compression efficiency, either by reducing file size or by increasing overall quality at the same size.

Intensity Stereo (IS)

Intensity stereo coding is a method that achieves a saving in bitrate by replacing the left and right signals with a single representative signal plus directional information. This replacement is psychoacoustically justified in the higher frequency range, since the human auditory system is insensitive to signal phase at frequencies above approximately 2 kHz.

Because it discards the phase relationship between the channels, intensity stereo is by definition a lossy coding method, and it is therefore primarily useful at low bitrates. For coding at higher bitrates, only mid-side stereo should be used.
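The idea of "one signal plus directional information" can be sketched as follows. This is a deliberately simplified illustration and not the scheme that MP3 or any other format specifies: it assumes the "directional information" is just a pair of gains chosen to preserve each channel's energy in one high-frequency band, and all names are invented for the example.

```python
import math


def energy(samples):
    return sum(v * v for v in samples)


def is_encode(left_band, right_band):
    """Replace two band signals by one carrier plus a gain per channel."""
    carrier = [l + r for l, r in zip(left_band, right_band)]
    carrier_energy = energy(carrier)
    if carrier_energy == 0:
        # Degenerate case (e.g. exactly out-of-phase channels): nothing to scale.
        return carrier, 0.0, 0.0
    gain_left = math.sqrt(energy(left_band) / carrier_energy)
    gain_right = math.sqrt(energy(right_band) / carrier_energy)
    return carrier, gain_left, gain_right


def is_decode(carrier, gain_left, gain_right):
    """Both output channels are scaled copies of the same carrier."""
    left = [gain_left * c for c in carrier]
    right = [gain_right * c for c in carrier]
    return left, right


# One band at 8 kHz: the right channel is quieter and 90 degrees out of phase.
n, rate = 256, 44100
left = [math.sin(2 * math.pi * 8000 * i / rate) for i in range(n)]
right = [0.5 * math.cos(2 * math.pi * 8000 * i / rate) for i in range(n)]

carrier, g_l, g_r = is_encode(left, right)
dec_left, dec_right = is_decode(carrier, g_l, g_r)

print("left energy in/out: ", energy(left), energy(dec_left))
print("right energy in/out:", energy(right), energy(dec_right))
```

The per-channel energies, and hence the perceived left/right balance, survive, but the decoded channels are exact scaled copies of each other, so the original 90 degree phase offset is gone.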


Additional information

Some early MP3 encoders made poor decisions about which mode to use from frame to frame in joint stereo files, or about how much bandwidth to allocate to encoding the side channel. This led to a widespread but mistaken belief that an abundance of M/S frames, or the use of joint stereo in general, always negatively impacts channel separation and other measures of audio quality. This is not an issue with modern encoders. Modern, optimized encoders switch between mid-side coding and simple stereo coding as necessary, depending on the correlation between the left and right channels, and allocate channel bandwidth appropriately, so that each frame is coded in the most suitable mode.
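A toy version of such a per-frame decision is sketched below. It is not how any particular encoder works: real encoders rely on their psychoacoustic model, whereas this sketch simply assumes that MS is worthwhile when one of the mid and side channels carries much less energy than the other. The function name and the 0.1 threshold are arbitrary assumptions.

```python
import math


def choose_stereo_mode(left, right, threshold=0.1):
    """Pick "MS" or "SS" for one frame from the mid/side energy balance.

    Hypothetical heuristic: strongly correlated (or anti-correlated) channels
    leave one of mid/side nearly empty, which makes MS coding attractive.
    """
    mid_energy = sum(((l + r) / 2) ** 2 for l, r in zip(left, right))
    side_energy = sum(((l - r) / 2) ** 2 for l, r in zip(left, right))
    smaller = min(mid_energy, side_energy)
    larger = max(mid_energy, side_energy)
    return "MS" if smaller < threshold * larger else "SS"


n, rate = 1024, 44100
tone_a = [math.sin(2 * math.pi * 440 * i / rate) for i in range(n)]
tone_b = [math.sin(2 * math.pi * 1000 * i / rate) for i in range(n)]

print(choose_stereo_mode(tone_a, tone_a))  # identical channels
print(choose_stereo_mode(tone_a, tone_b))  # unrelated channels
```

With identical channels the side channel is empty and the sketch picks MS; with two unrelated tones the mid and side energies are about equal and it falls back to simple stereo.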
