Difference between revisions of "LAME Y switch"

From Hydrogenaudio Knowledgebase
Latest revision as of 17:24, 3 December 2019

This article describes the function of the -Y switch in the LAME command-line encoder, and tries to clarify what the switch does and what it does not do. It is frequently misinterpreted, like joint stereo, and mistaken for a filter.[1]

By explaining what the switch does, in both simple and technical terms, the article aims to give the reader a better understanding of the motivation behind the switch and of its usage.

The short definition

  • The -Y switch tells LAME not to encode the highest frequencies accurately, if doing so causes disproportionate increases in bitrate.

Other ways to say it include:

  • The -Y switch tells LAME to use a coarser representation for the highest frequencies, in the parts where representing them accurately would cause over-encoding of all the other bands.
  • The -Y switch tells LAME not to be so strict with the higher frequencies, if they are going to cause an increase in bitrate.
The -Y switch is not a lowpass filter.
It allows high frequencies (>= 16 kHz) to exist; it just alters their accuracy. If their values are very small, it can quantize them to zero (but the psychoacoustic analyzer will probably decide to simply remove them instead).

The technical definition

How is audio stored in MP3

  • MP3 audio is stored in the frequency domain (values for frequencies) instead of the time domain (values for samples).
  • Frequencies are analyzed and stored in groups, known as bands.
  • Bands are quantized to make them compress better.
  • Scale factor refers to how much quantization (loss of precision) is applied to each band. Higher quantization gives greater compression, and consequently coarser steps between the minimum and maximum values (lower resolution).
  • Each band has its own scale factor, so that its quantization can be adjusted independently from the others.
  • The exception is scalefactor band 21 (sfb21), which does not have a scale factor. This band stores frequencies of 16 kHz and above.
  • Global gain is an extra quantizer that affects all bands simultaneously.

(See the Notes section for more about scale factors and global gain.)
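The relationship between global gain, per-band scale factors and resolution can be illustrated with a toy quantizer. This is only a sketch: the function names are made up for this example, and the step-size formula (a power of two driven by global gain minus a per-band reduction) is a simplified assumption, not the exact MP3 arithmetic.

```python
def band_step(global_gain, relative_factor=0):
    """Toy step size for one band: a larger step means coarser quantization.

    Assumption for illustration: each unit of global gain scales the step
    by 2**(1/4), and each unit of the per-band factor shrinks it by 2**(1/2).
    """
    return 2.0 ** (global_gain / 4.0 - relative_factor / 2.0)


def quantize(values, step):
    """Map spectral values to integer codes; this is where precision is lost."""
    return [round(v / step) for v in values]


def dequantize(codes, step):
    """Rebuild approximate spectral values from the integer codes."""
    return [c * step for c in codes]


band = [9.0, -3.0, 7.0]  # made-up frequency-domain values of one band

coarse = band_step(global_gain=8)                   # step 4.0
fine = band_step(global_gain=8, relative_factor=4)  # step 1.0

print(dequantize(quantize(band, coarse), coarse))  # [8.0, -4.0, 8.0]
print(dequantize(quantize(band, fine), fine))      # [9.0, -3.0, 7.0]
```

With the same global gain, the band that gets a per-band reduction is reproduced more precisely, at the price of larger integer codes to store.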

What is the scalefactor band 21 (sfb21) defect

  • If the encoder determines that sfb21 needs more resolution, it has no way to decrease the scale factor of sfb21 alone, since there is no such scale factor.
  • The only way to increase the resolution on sfb21 is therefore to reduce the global gain quantization, since global gain applies to all bands.
  • Scale factors are stored relative to global gain (only the difference is stored). Let's call this value the "relative factor".
  • To balance the reduction of global gain, the scale factor of the other bands should increase. Consequently, the relative factor decreases.
  • The relative factor can be decreased until it reaches zero. At that point, any further reduction of global gain implies that the band will use more resolution than needed.
  • The encoder is forced to increase the bitrate, not only because of the frequencies at or above 16 kHz, but also because the bands below sfb21 get excessive resolution.

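Source: MP3' Tech - Mp3 Limitations, http://www.mp3-tech.org/content/?Mp3%20Limitations (archived on February 22, 2012 at https://web.archive.org/web/20120222124415/http://www.mp3-tech.org/content/?Mp3%20Limitations)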
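The chain of consequences described in this section can be sketched numerically. The following is a toy model, not LAME code: `plan_bands` and its arguments are hypothetical, "needed" values stand for the coarsest quantization the psy-model would accept for a band (in global gain units), one unit of relative factor is assumed to be worth two units of global gain, and the upper limit of the relative factor is ignored.

```python
def plan_bands(needed, sfb21_needed=None):
    """Pick a global gain and relative factors for a toy set of bands.

    needed        -- coarsest acceptable gain for each band that has a
                     scale factor (higher = coarser = fewer bits)
    sfb21_needed  -- same for sfb21, which has no scale factor of its own;
                     None means sfb21 is left out of the decision
    Returns (global_gain, relative_factors, excess_resolution).
    """
    global_gain = max(needed)
    if sfb21_needed is not None:
        # sfb21 can only be refined by lowering the global gain itself.
        global_gain = min(global_gain, sfb21_needed)

    # Each band lowers its own gain by 2 * relative_factor, never below 0.
    relative = [max(0, -((n - global_gain) // 2)) for n in needed]

    # Positive excess: the band ends up finer than the psy-model asked for.
    excess = [n - (global_gain - 2 * r) for n, r in zip(needed, relative)]
    return global_gain, relative, excess


needed = [40, 36, 44]  # made-up requirements for three ordinary bands

print(plan_bands(needed))      # (44, [2, 4, 0], [0, 0, 0])
print(plan_bands(needed, 30))  # (30, [0, 0, 0], [10, 6, 14])
```

In the second call the demand of sfb21 pulls the global gain down to 30. All relative factors hit zero, and every other band is quantized more finely than required, which is where the extra bitrate comes from.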

sfb21 and the -Y switch

LAME implements the -Y switch as a way to activate the alternate logic that CBR uses with respect to quantization noise in the sfb21 band.

  • The encoder determines the desired quantization noise within the sfbs. The scale factors are chosen according to these values.
  • If the -Y switch is not used (either implicitly or explicitly), sfb21 gets evaluated and the global gain is set accordingly.
  • Adding -Y lets the encoder ignore whatever quantization noise will be in sfb21.

The result is that all frequencies at 16 kHz and above still get encoded.

Those that would normally have needed higher resolution to satisfy the criteria of the psy-model don't receive it, while those that wouldn't need higher resolution are unaffected by the -Y switch. The -Y switch prevents global gain quantization from being decreased solely to accommodate the needs of sfb21.
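As a rough illustration of this decision, the sketch below chooses a global gain with and without the switch. It is a toy model under stated assumptions, not LAME's actual algorithm: the function names are invented, and "needed" values stand for the coarsest gain the psy-model would accept for a band.

```python
def choose_global_gain(needed, sfb21_needed, y_switch):
    """Toy choice of global gain.

    needed        -- coarsest acceptable gain for each band with a scale factor
    sfb21_needed  -- coarsest acceptable gain for sfb21
    y_switch      -- True to ignore the quantization noise in sfb21
    """
    global_gain = max(needed)
    if not y_switch:
        # Without -Y, sfb21 is evaluated and may drag the global gain down.
        global_gain = min(global_gain, sfb21_needed)
    return global_gain


def sfb21_shortfall(global_gain, sfb21_needed):
    """How much coarser sfb21 ends up than the psy-model wanted (0 = satisfied)."""
    return max(0, global_gain - sfb21_needed)


needed = [40, 36, 44]  # made-up requirements for three ordinary bands

# sfb21 is demanding: -Y keeps the global gain at 44, sfb21 is merely coarser.
print(choose_global_gain(needed, 30, y_switch=False))  # 30
print(choose_global_gain(needed, 30, y_switch=True))   # 44
print(sfb21_shortfall(44, 30))                         # 14

# sfb21 is undemanding: the switch changes nothing.
print(choose_global_gain(needed, 50, y_switch=False))  # 44
print(choose_global_gain(needed, 50, y_switch=True))   # 44
```

In both cases sfb21 is still encoded; the switch only decides whether its needs are allowed to lower the global gain.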

CBR/ABR and the -Y switch

The -Y switch can only be activated in VBR mode. By default, -V3 to -V9 use -Y; -V0 to -V2 do not. Consequently, adding -Y is only useful for the three highest-quality VBR settings (-V0, -V1 and -V2).

This is because in CBR and ABR modes, the encoder uses -Y implicitly. Specifically, LAME targets a given bitrate, and adjusts the quantization steps until that target is reached.

Since sfb21 does not have its own scale factor, its quantization noise is not evaluated.

This is the same treatment as using -Y in VBR mode.
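The defaults described in this section can be captured in a small helper that builds a LAME command line and adds -Y only where it changes anything. This is a sketch: the helper names and the file names are hypothetical, only integer -V levels from 0 to 9 are handled, and the default behaviour is taken from the description above rather than checked against a particular LAME version.

```python
def y_is_default(v_level):
    """Return True if the given VBR level already uses -Y by default."""
    if not 0 <= v_level <= 9:
        raise ValueError("VBR level must be between 0 and 9")
    # Per the article: -V3 to -V9 use -Y by default, -V0 to -V2 do not.
    return v_level >= 3


def lame_args(v_level, want_y, infile, outfile):
    """Build an argument list for a VBR encode, adding -Y only when useful."""
    args = ["lame", f"-V{v_level}"]
    if want_y and not y_is_default(v_level):
        args.append("-Y")
    return args + [infile, outfile]


print(lame_args(2, True, "in.wav", "out.mp3"))
# ['lame', '-V2', '-Y', 'in.wav', 'out.mp3']
print(lame_args(5, True, "in.wav", "out.mp3"))
# ['lame', '-V5', 'in.wav', 'out.mp3']
```

The resulting list could be passed to a process launcher such as `subprocess.run`; it is only printed here so the example does not depend on LAME being installed.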

Notes

For long blocks, the last scalefactor band is sfb21. For short blocks it is sfb12. The frequency at which it starts also depends on the sampling rate. The value of ~16 kHz is for 44.1 kHz material.

Global gain and scale factors are not independent when stored to the file. The latter are expressed as a difference from the former (the relative factor).

  • The global gain is the global quantization step size, with a value range between 0 and 255.
  • The relative factor per band is the amount by which to reduce the global quantization step size. The range of this value is dependent on the band.

Consequently, only a limited number of values is available, and a change to global gain needs to be compensated by a change in the relative factor.

This article was written partly with input from Aleron Ives, robert and benski.

See also

  • Description of the MPEG Audio Layer III format (MP3)

References

  1. -V2 gives way too high bitrate!?! on hydrogenaudio, https://hydrogenaud.io/index.php?topic=79841.msg697111#msg697111