High Definition Compatible Digital

From Hydrogenaudio Knowledgebase
Revision as of 05:15, 10 August 2016 by Bp
TL;DR version: HDCD was (mostly) a scam, but thousands of popular CDs released from 1995-present have this encoding. (List of HDCD-encoded Compact Discs)

Do not batch-convert your lossless audio to 24-bit. Leave your lossless CD rips 16-bit and use an audio player capable of decoding HDCD, like Foobar2000.

High Definition Compatible Digital, or HDCD is a Microsoft proprietary audio encode-decode process that claims to provide increased dynamic range over that of standard Redbook audio CDs, while retaining backward compatibility with existing Compact disc players. —HDCD article at Wikipedia

Decoding the extra information required an HDCD-compatible player or Windows Media Player. There was no public documentation for the process, but it was eventually reverse-engineered. Many HDCD-encoded CDs were released from the mid-1990s onward, but they started to disappear around 2008. Some CDs with HDCD codes are still appearing because they were mastered using HDCD equipment, but they do not use the core features of HDCD. Microsoft no longer advertises or supports HDCD.

A lossless copy of CD audio will include the HDCD data. See List of HDCD-encoded Compact Discs, or the more complete, but less detailed list of known HDCD compact discs at Head-Fi.org

Function and features

HDCD encodes a virtual 20 bits of range in a 16-bit stream. Peak extend is worth about one bit of additional range, while Low level gain adjustment is worth about three bits of additional range. HDCD encoding amplifies the audio stream by 6dB to start.

Peak extend (PE)
The base 6dB amplification makes the lower levels louder, but the top 9dB is soft-limited, or "squashed", into the top 3dB in the 16-bit stream. When HDCD is being decoded, the top 9dB is reconstructed.
Low-level gain adjustment (LLE)
The base 6dB amplification can be corrected down during quieter parts via HDCD control codes to restore the original low level. See §Regarding the Low-level Gain Adjustment feature.
Filters
There were to be two selectable playback filters, but the idea was already patented and so it couldn't be used in Pacific Microsonics or other licensed hardware decoders. Software decoders can detect the feature, but none use it.

When an HDCD-encoded CD is played in a regular CD player, a listener may hear the uncorrected distortion at the peaks, and low-level passages remain boosted by the base 6dB amplification.

  • See a full technical examination by Jim Lesurf: Page 1 Page 2

Decoding software

As HDCD is a proprietary extension owned by Microsoft, Windows Media Player was the only software to support it for a long time. A simple closed-source Windows-only decoding tool, called hdcd.exe, appeared in 2007 on the Doom9 Forum, a product of reverse-engineering Windows Media Player. [1] Since that time, an open source implementation has come to exist based on this work. It supports the peak extend and gain HDCD features, while the transient filter feature is detected but not implemented.

All of the open source decoding is based on the work of Christopher J. Key (original reverse-engineering), Chris Moeller (open implementation), and Gumboot (C-optimization). Benjamin Steffes simplified the code, by using some pre-computed tables, etc., for inclusion in FFmpeg. There may be slight variations in the output of the different tools.

Audio players

  • Windows Media Player, but there is a bug where HDCD will not be enabled if the HDCD control signal is not detected near the beginning of a song
  • Foobar2000 can decode HDCD to 20-bit PCM via foo_dsp_hdcd (source)
  • dBpoweramp will decode HDCD to 24-bit PCM (uses the hdcd.exe tool for processing)
  • CUETools

hdcd.exe

The original Windows-only closed source tool, posted on the Doom9 forums in 2007 by C.J. Key. It works only on WAV files.

hdcd.exe -o OUT24.wav HDCD16.wav

FFmpeg

FFmpeg version 3.1's libavfilter supports an HDCD filter that will convert HDCD-encoded audio to PCM at up to 20-bit precision. The filter is based on the Foobar2000 component source code.

Notice: There was a bug in FFmpeg (version 3.1.1 and earlier) that prevented low-level gain adjustment from working. It was fixed in this commit.

Example
FLAC with HDCD encoded in 16-bit (perhaps ripped from a CD)
ffmpeg -i input16.flac -af hdcd output24.flac
Example 2
Notice the output from the filter is truncated down to 16-bit because the wav muxer defaults to pcm_s16le...
ffmpeg -i input16.wav -af hdcd output16.wav
Example 3
... if you want to use another format (like pcm_s24le), you have to specify it with the acodec option
ffmpeg -i input16.wav -af hdcd -acodec pcm_s24le output24.wav

Conversion to AAC

If using FFmpeg to convert HDCD-encoded lossless audio to AAC, note that the Fraunhofer FDK AAC (libfdk_aac) encoder only accepts 16-bit PCM input, so FFmpeg will convert the 32-bit PCM output of the filter back to 16-bit before encoding. The native FFmpeg AAC encoder (3.0+) takes floating-point input, so the filter's 32-bit PCM output is converted to floating-point PCM instead. In this case, the native AAC encoder is preferable to FDK AAC, but only if using CBR, and at a high enough bitrate to overcome the other deficiencies of the libavcodec AAC encoder.
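A minimal sketch of the one-pass conversion described above, using the native encoder. The 256k bitrate, the file names, and the `BITRATE`/`DRY_RUN` shell variables are assumptions made for illustration; none of them come from the FFmpeg documentation or the filter's authors.

```shell
# Sketch: decode HDCD and encode with FFmpeg's native AAC encoder in one pass.
# BITRATE and DRY_RUN are illustrative shell variables, not FFmpeg settings.
BITRATE="${BITRATE:-256k}"   # assumed "high enough" bitrate; adjust to taste

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "$DRY_RUN" ]; then echo "$*"; else "$@"; fi
}

# hdcd_to_aac INPUT OUTPUT
hdcd_to_aac() {
    run ffmpeg -i "$1" -vn -af hdcd -c:a aac -b:a "$BITRATE" "$2"
}
```

Usage would be something like `hdcd_to_aac input16.flac output.m4a`. Setting `DRY_RUN=1` first only prints the command, which is handy for checking the options before running it over many files.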

Analyze mode

An example of analyze_mode.

FFmpeg's HDCD filter has a mode, selected by filter option, to aid in analysis of HDCD encoded audio. In this mode the audio is replaced by a solid tone and the amplitude is adjusted to signal some specified aspect of the process. The output file can be loaded in an audio editor alongside the original, where the user can see where different features or states are present.

An example track: Neil Young - Red Sun, from Silver & Gold. The different outputs, loaded together in Audacity, are explained below.

| Track | Option setting | Description |
| --- | --- | --- |
| 1 | analyze_mode=off | HDCD decoded |
| 2 | analyze_mode=lle | LLE gain adjust levels in each sample |
| 3 | analyze_mode=pe | Samples where PE occurs |
| 4 | analyze_mode=pe:force_pe=true | Original sample was in the -3dBFS range (should match above when PE is permanent, but maybe not when it isn't) |
| 5 | analyze_mode=tgm | Samples where the target_gain was ignored because it didn't match in both channels |
| Not shown | analyze_mode=cdt | Samples where HDCD decoding was active (whole track in this case) |
| Not shown | analyze_mode=lle:process_stereo=false | Similar to tgm, but shows the level at each sample in each channel |
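One way to produce a set of analysis files like the ones listed is to run the filter once per mode. This is only a sketch: the output naming scheme and the `DRY_RUN` variable are made up for the example, and only the plain `analyze_mode` values from the list are covered.

```shell
# Sketch: write one analysis file per analyze_mode for a single input track.
# The "<name>_<mode>.flac" naming and DRY_RUN are illustrative assumptions.

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "$DRY_RUN" ]; then echo "$*"; else "$@"; fi
}

# analyze_all INPUT
analyze_all() {
    local in="$1"
    local base="${in%.*}"   # input name without its extension
    local mode
    for mode in lle pe cdt tgm; do
        run ffmpeg -i "$in" -vn -af "hdcd=analyze_mode=$mode" "${base}_${mode}.flac"
    done
}
```

For example, `analyze_all track.flac` would write `track_lle.flac`, `track_pe.flac`, and so on, which can then be loaded into an audio editor alongside the normally decoded file.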

Scanning a FLAC archive for HDCD using FFmpeg

HDCD stats reporting was added after release 3.1.1, so as of 30 July 2016 this script requires either building ffmpeg from git (git clone git://source.ffmpeg.org/ffmpeg) or using something like a Zeranoe FFmpeg build for Windows.

hdcdscan.sh
#!/bin/bash

# A bash script for scanning for files with HDCD encoding.
# Burt P.
#
# Usage:
#   ./hdcdscan.sh *.flac
# or
#   find /some/archive/path/ -name '*.flac' -exec ./hdcdscan.sh {} \;
# or 
#   find /some/archive/path/ -name '*.flac' -print0 |xargs -0 -P 4 -n 1 ./hdcdscan.sh
# or (prolly best)
#   find /some/archive/path/ -name '*.flac' -print0 |xargs -0 -P 1 -n 50 ./hdcdscan.sh
#

#FFMPEG="/home/you/gits/ffmpeg/ffmpeg" # if using ffmpeg from git

TDER="/run/shm" # temp directory
TLIMIT=30       # scan the first N seconds, empty for no limit
SIMPLE="y"      # show only summary, empty for no
ONLY_HDCD="y"   # show only files with hdcd detected, empty for no
CHECK_MODES=""  # use both modes and see if md5 matches

#----
CPU_COUNT=$(grep -c ^processor /proc/cpuinfo)
#CPU_COUNT=4    # manual

if [ -z "$FFMPEG" ]; then
    FFMPEG=$(command -v ffmpeg)
fi
if [ -z "$FFMPEG" ]; then
    echo "ffmpeg not found; set FFMPEG or add it to PATH"
    exit 1
fi
FILTERCHK=$("$FFMPEG" -filters 2>&1 | grep hdcd)
if [ -z "$FILTERCHK" ]; then
    echo "$FFMPEG is not built with hdcd filter support"
    exit 1
fi

#FORMATSTR="-acodec pcm_s24le -f wav"  # if using wav temp file
FORMATSTR="-f s24le"                   # if using /dev/null

scan_file() {
    # $1 = job number (used to build a unique tag), $2 = file to scan
    local SF
    local TAG
    local P_TLIMIT
    local TF TFO DETECTED SUM1 SUM2
    TAG="$$_$1"
    SF="$2"
    P_TLIMIT=""
    if [ -n "$TLIMIT" ]; then P_TLIMIT="-t $TLIMIT"; fi

    if [ -f "$SF" ]; then
        #TF="$TDER/hdcdout_$TAG.wav"
        TF="/dev/null"
        TFO="$TDER/hdcdout_$TAG.ffmpeg-out"

        echo "$SF ..." >"$TFO"

        # $P_TLIMIT and $FORMATSTR are left unquoted on purpose so they split into separate arguments
        "$FFMPEG" -hide_banner -nostats -y -v verbose -i "$SF" $P_TLIMIT -vn -af hdcd $FORMATSTR "$TF" 2>&1 | grep "_hdcd_" >>"$TFO"
        DETECTED=$(grep "HDCD detected: yes" "$TFO")
        if [ -n "$ONLY_HDCD" ]; then
            # discard the report for files without HDCD
            if [ -z "$DETECTED" ]; then : >"$TFO"; fi
        fi
        if [ -n "$DETECTED" ]; then
            if [ -n "$CHECK_MODES" ]; then
                # decode with and without stereo processing and compare the md5 of the output
                SUM1=$("$FFMPEG" -y -v verbose -i "$SF" $P_TLIMIT -vn -af hdcd=process_stereo=0 $FORMATSTR md5: 2>/dev/null)
                SUM2=$("$FFMPEG" -y -v verbose -i "$SF" $P_TLIMIT -vn -af hdcd=process_stereo=1 $FORMATSTR md5: 2>/dev/null)
                if [ "$SUM1" == "$SUM2" ]; then
                    echo "md5 sums match: $SUM1" >>"$TFO"
                else
                    echo "md5 sums differ: ps0: $SUM1, ps1: $SUM2" >>"$TFO"
                fi
            fi
        fi
        # strip the "[Parsed_hdcd_N @ 0x...]" log prefix, then tag each line with the job tag
        sed -i -e "s#^\[Parsed_hdcd_[0-9]\+ @ [0-9a-fx]\+\] ##" "$TFO"
        sed -i -e "s#^#[$TAG] #" "$TFO"
        if [ -n "$SIMPLE" ]; then
            head -n 1 "$TFO"
            if [ -n "$CHECK_MODES" ]; then grep "md5 sums " "$TFO"; fi
            grep "HDCD detected:" "$TFO"
        else
            cat "$TFO"
        fi

        if [ -f "$TFO" ]; then rm "$TFO"; fi
        if [ -f "$TF" ]; then rm "$TF"; fi
    fi
}

NN=0
for f in "$@"
do
    scan_file "$NN" "$f" &
    while [ $(jobs -r| wc -l) -ge "$CPU_COUNT" ] ; do sleep 0.2 ; done
    ((NN++))
done
wait

HDCD was (mostly) a scam

For casual readers, here is what you need to know about the existence of HDCD.

  • HDCD was invented by a company called Pacific Microsonics (PM) that made audio equipment designed for use in studios.
  • Only PM equipment could encode HDCD, and only PM-licensed consumer equipment could decode it.
  • It was claimed that it was compatible with standard CD audio, which was only true in the encoding/decoding sense, and not so much in the audible sense.
  • The information about how it worked was mostly kept secret to keep others from using it, and the information that was made public was intentionally vague.
  • Many studios used PM equipment even if they didn't know it, or their engineers didn't know how to use it.
  • Because of this, thousands of releases have HDCD encoding. See List of HDCD-encoded Compact Discs.
  • Much of the time, the engineers turned all HDCD features off, but the HDCD control packets were still inserted into the discs anyway. For these discs, there is absolutely no benefit in decoding HDCD.
  • Many times, mastering engineers who didn't know how to use the HDCD features caused absurd HDCD discs to be released, like "For the Masses" (1998), and unintentional "Special Mode" discs.
  • Of the three "features" of HDCD, in the vast majority of cases only one is useful or even worth decoding: PE.
  • Microsoft bought PM for the IP part of the business, that is, to collect royalties from HDCD without developing it or contributing to it in any way.
  • HDCD is now dead, except for all the existing PM studio equipment that is still producing zombie HDCD discs (or tracks), even with all the features turned off.

Technical notes

Packet formats

pe = peak extend (PE), tg = target gain (LLE), tf = transient filter

Packet format A
8-bit code, tg is a 3-bit value (-7.0dB to 0.0dB steps of 1dB). This HDCD Sampler (1992) is an example of a disc with this packet type.
| Bit | 7 | 6 | 5 | 4 | 3 | 2-0 |
| --- | --- | --- | --- | --- | --- | --- |
| Field | 0 | 0 | tf | pe | 0 | tg |
Packet format B
16-bit (8-bit code, 8-bit XOR of code), tg is a 4-bit value used as a 3.1 fixed-point number (-7.5dB to 0.0dB steps of 0.5dB). Most discs after 1995 use this packet type.
| Bit | 15 | 14 | 13 | 12 | 11-8 | 7-0 |
| --- | --- | --- | --- | --- | --- | --- |
| Field | 0 | 0 | tf | pe | tg | XOR check |

A and B are unofficial names of the different formats based on counters in the FFmpeg HDCD filter source code. If using -v verbose with the filter, Counter A, B, C will be reported. A is the number of 8-bit format packets, B is the number of 16-bit format packets validated, C is total packets. almost_A is the number of packets that were supposed to be an A format, but one of the expected zeros was a one. checkfail_B is the number of packets that failed the XOR check for B format. In a perfect HDCD encoding A+B=C, likely as A+0=C or 0+B=C.
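To make the two layouts concrete, here is a small sketch that pulls the tf, pe and tg fields out of a packet value. The function names are made up for this example, and the format B XOR check byte is deliberately not verified, because the exact check is not spelled out here. Gain is printed in tenths of a dB because shell arithmetic is integer-only.

```shell
# Sketch: extract tf, pe and tg from HDCD control packets, following the
# bit layouts above. decode_a and decode_b are illustrative names.

# decode_a CODE   (8-bit format A code, e.g. 0x15)
decode_a() {
    local code=$(( $1 ))
    # bits 7, 6 and 3 must be zero in format A
    if [ $(( code & 0xC8 )) -ne 0 ]; then
        echo "invalid"
        return 1
    fi
    local tf=$(( (code >> 5) & 1 ))
    local pe=$(( (code >> 4) & 1 ))
    local tg=$(( code & 7 ))            # 3-bit value, 1 dB steps
    echo "tf=$tf pe=$pe gain_tenths_db=$(( -10 * tg ))"
}

# decode_b PACKET (16-bit format B packet; the low byte, the XOR check,
#                  is ignored in this sketch)
decode_b() {
    local packet=$(( $1 ))
    local code=$(( (packet >> 8) & 0xFF ))
    # bits 15 and 14 must be zero in format B
    if [ $(( code & 0xC0 )) -ne 0 ]; then
        echo "invalid"
        return 1
    fi
    local tf=$(( (code >> 5) & 1 ))
    local pe=$(( (code >> 4) & 1 ))
    local tg=$(( code & 15 ))           # 4-bit value, 0.5 dB steps
    echo "tf=$tf pe=$pe gain_tenths_db=$(( -5 * tg ))"
}
```

For instance, `decode_a 0x15` reports peak extend on with a -5.0dB target gain, and `decode_b 0x1F00` reports peak extend on with the maximum -7.5dB target gain.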

Interesting Notes

Regarding HDCD detection

The arrival of a valid packet for a channel resets a code detect timer for that channel. If both channels have active timers, then code is deemed to be present and the filter select data is considered valid immediately. However, any command data which would effect the level of the signal must match between the two channels in order to take effect. The primary reason for this is to handle the case where an error on one channel destroys the code. In such a case, the decoder will mistrack for a short time until the next command comes along, which is much less audible than a change in gain on only one channel, causing a shift in balance and lateral image movement. If either of the code detect timers times out, then code is deemed not to be present, and all commands are canceled, returning the decode system to its default state. If the conditions on the encoder side are not changing, then command packets are inserted on a regular basis to keep the code detect timers in the decoder active and to update the decoder if one starts playing a selection in the middle of a continuous recording.

— "extract from the AES paper presented by Keith Johnson" [2] paper:[3]
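The rules in the quoted passage can be modelled as a small state machine. The sketch below is a toy model, not the filter's code: `CODE_TIMEOUT` is a made-up value (the passage does not give the real timer length), time advances in abstract steps rather than samples, and only the gain command is tracked, not the filter select data.

```shell
# Toy model of the detection rules quoted above.
CODE_TIMEOUT=3      # hypothetical: steps a timer stays active after a packet

timer_l=0           # code detect timer, left channel
timer_r=0           # code detect timer, right channel
code_present=0      # 1 while both timers are active
gain=0              # gain command currently in effect (0 = default state)

# hdcd_step LEFT RIGHT
# LEFT and RIGHT are the gain commands received on each channel during this
# step, or "-" when no valid packet arrived on that channel.
hdcd_step() {
    local l="$1"
    local r="$2"
    # timers run down on every step...
    if [ "$timer_l" -gt 0 ]; then timer_l=$(( timer_l - 1 )); fi
    if [ "$timer_r" -gt 0 ]; then timer_r=$(( timer_r - 1 )); fi
    # ...and a valid packet resets the timer for its own channel
    if [ "$l" != "-" ]; then timer_l=$CODE_TIMEOUT; fi
    if [ "$r" != "-" ]; then timer_r=$CODE_TIMEOUT; fi

    if [ "$timer_l" -eq 0 ] || [ "$timer_r" -eq 0 ]; then
        # either timer has timed out: code not present, cancel all commands
        code_present=0
        gain=0
    else
        code_present=1
        # level commands only take effect when both channels agree
        if [ "$l" != "-" ] && [ "$l" = "$r" ]; then gain="$l"; fi
    fi
    echo "code_present=$code_present gain=$gain"
}
```

Calling `hdcd_step 4 4` followed by `hdcd_step 6 -` leaves the gain at 4, since the second command arrived on one channel only. Two further `hdcd_step - -` calls let a timer run out and return the model to its default state.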

Regarding the Low-level Gain Adjustment feature

There are two modes of Low Level Extension, “Normal” and “Special”. Normal mode begins to affect the input signal 45 dB below peak level, gradually raising the gain 4 dB as the level drops over an 18 dB range. Special mode begins to affect the input signal 39 dB below peak level, and gradually raises the gain 7.5 dB over a 26 dB range. Normal mode is optimized to provide the best combination of decoded dynamic range and resolution and undecoded compatibility. Special mode is designed to provide the best possible decoded dynamic range and resolution at some potential expense of undecoded compatibility. To access Special mode, from the Operating Menu select (SETUP/OUTPUT/HDCD_16/LOWLVL/ SPECIAL). Typically, Special mode is used only for HDCD 16-bit master tracking with the assumption that the recording will be decoded by the Model Two to a 24-bit or 20-bit word length for digital post production before being re-encoded to HDCD 16-bit using Normal mode to produce a release master.

— "from PM operator's manual"

If you see a level of greater than 4.0 of gain adjustment, there was an error made in the disc where the mastering engineer mistakenly used "Special Mode". This is not only wrong, but there is no consumer equipment that can even decode this level of gain adjustment! Only by playing this back through the PM Model One (44 and 48 kHz only) or the PM Model Two (added dual- and quad-rate sampling rates) could this file be properly decoded

— Charles Hansen [4]
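The 4.0dB threshold mentioned above maps directly onto the 4-bit tg value of a format B packet (0.5dB per step), so any tg above 8 suggests Special mode. A minimal sketch of that check follows; the function name and the output wording are made up for this example.

```shell
# Sketch: flag a gain adjustment larger than Normal mode should produce.
# Input is the 4-bit tg value from a format B packet (0.5 dB per step).
lle_check() {
    local tg=$(( $1 ))
    local tenths=$(( tg * 5 ))                      # gain adjustment in tenths of a dB
    local db="$(( tenths / 10 )).$(( tenths % 10 ))"
    if [ "$tg" -gt 8 ]; then                        # more than 4.0 dB
        echo "$db dB: exceeds 4.0 dB, Special mode suspected"
    else
        echo "$db dB: within Normal mode range"
    fi
}
```

For example, `lle_check 8` reports 4.0 dB as within the Normal range, while `lle_check 15` reports 7.5 dB as a suspected Special mode disc.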

Regarding the Playback Filters feature

[T]he PM Model One and Model Two professional units, which were combination A/D and D/A converters) that had two different playback (reconstruction) filters for HDCD. While the PM A/D used two different anti-aliasing filters while performing A/D conversion (depending on the level of high-frequency content), they were precluded from using two different playback filters as Ed Meitner had already patented that idea for Museatex (now out of business).

— CHansen [5]

Links