Gapless playback

From Hydrogenaudio Knowledgebase

Gapless playback is the seamless playback of consecutive digital audio tracks. It allows live recordings or continuous albums to be heard exactly as they were mastered, without gaps between tracks.

Why gaps occur

Most lossy audio compression schemes involve a time/frequency domain transform. Such transforms cannot deal with arbitrary amounts of data, and instead act on blocks of data at a time. In order for the audio signal to be encoded in its entirety, small amounts of silence are prepended and appended to the input before the transform. If the amount of padded silence is not accounted for, the playtime of the audio data may not be equal before and after the compression. In such cases, the silence will be decoded together with the audio data, introducing gaps between tracks.
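
The effect of block padding on playtime can be illustrated with a little arithmetic. The following Python sketch is only an illustration: the function name, the block size of 1152 samples and the delay of 576 samples are assumed values chosen for the example, not figures taken from any particular encoder.

```python
def padded_length(num_samples: int, block_size: int, encoder_delay: int) -> int:
    """Length in samples after decoding when the padding is not removed.

    Assumption for this sketch: the encoder prepends `encoder_delay`
    samples of silence, then pads the end with silence up to a whole
    number of blocks.
    """
    total = num_samples + encoder_delay
    blocks = -(-total // block_size)  # integer ceiling division
    return blocks * block_size


# Illustrative numbers only: a 10-second track sampled at 44.1 kHz.
original = 441_000
decoded = padded_length(original, block_size=1152, encoder_delay=576)
extra = decoded - original
print(f"{extra} extra samples, about {1000 * extra / 44_100:.1f} ms of added silence")
```

Even a few tens of milliseconds of added silence is audible when one track is supposed to run straight into the next.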

Most audio playback software will also close the audio output stream when switching tracks, introducing gaps or making existing gaps larger. Unless the beginning of the next track is buffered and introduced immediately when the current track ends, gaps will occur.

Some compression methods, such as the popular MP3, can be problematic because the standard defines no way to record the amount of padding for later removal. Even if two such tracks are decompressed and merged into a single track, a gap will remain between them. More recent audio formats have been designed to address this problem, and will produce gapless audio if played back correctly.

Optimal solution

It is possible to store metadata in the audio to explicitly declare the playtime, and/or the amount of padding/delays introduced in the encoding process. This information can be used to ensure that playtime will remain constant after decoding with no added silence. The audio playback software must be able to recognize the metadata, and trim the decoded audio as necessary.
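
As a minimal sketch of the trimming step, assume the metadata supplies two numbers: the encoder delay (samples of silence at the start) and the original length of the track in samples. Both parameter names and the function name are hypothetical; real formats store this information in their own ways.

```python
from typing import List, Sequence


def trim_decoded(decoded: Sequence[float], encoder_delay: int,
                 original_length: int) -> List[float]:
    """Drop leading delay and trailing padding from decoded samples.

    `encoder_delay` and `original_length` stand in for values read from
    gapless metadata (hypothetical names for this sketch).
    """
    start = encoder_delay
    end = encoder_delay + original_length
    if start < 0 or end > len(decoded):
        raise ValueError("metadata does not match the decoded audio")
    return list(decoded[start:end])


# Two samples of delay, three samples of real audio, three of padding.
decoded = [0.0, 0.0, 0.25, -0.5, 0.75, 0.0, 0.0, 0.0]
print(trim_decoded(decoded, encoder_delay=2, original_length=3))
```

After trimming, the decoded track has exactly the playtime it had before compression.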

The software can then take care to keep the output stream open between tracks. It must also buffer the beginning of the following track in the same way it buffers the current track during normal playback.

If the compression method supports gapless playback, the software properly decodes the audio data and metadata, the next track is buffered and ready to play, and the output stream remains open between tracks, optimal gapless audio is achieved. A collection of consecutive tracks will then play in the same way they were mastered, allowing the listener to hear their album as the author intended.
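
The buffering requirement can be sketched as a single continuous stream of output chunks that is never closed at a track boundary. In the hypothetical Python generator below, each item of `tracks` stands in for an already decoded and trimmed track, and `chunk_size` stands in for the size of the buffer handed to the audio device; a real player would decode incrementally rather than hold whole tracks in memory.

```python
from typing import Iterable, Iterator, List


def gapless_stream(tracks: Iterable[List[float]],
                   chunk_size: int) -> Iterator[List[float]]:
    """Yield fixed-size output chunks that may span track boundaries."""
    pending: List[float] = []
    for samples in tracks:
        # The start of the next track joins whatever is left of the
        # current one, so no chunk is cut short at a track boundary.
        pending.extend(samples)
        while len(pending) >= chunk_size:
            yield pending[:chunk_size]
            pending = pending[chunk_size:]
    if pending:
        yield pending  # final short chunk at the end of the playlist


for chunk in gapless_stream([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0]], chunk_size=2):
    print(chunk)
```

The second chunk contains the last sample of the first track and the first sample of the second, which is the behaviour a gapless player needs from its output buffer.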

Alternative solutions

Digital signal processing (DSP) plugins can be used to detect silence between tracks and trim the audio as necessary on playback. This is not an optimal solution because it does not always produce results identical to the source. Sometimes an artist may intentionally leave silence at track boundaries for dramatic effect; removing this silence also removes that effect.

It can also be difficult to implement silence removal properly. If the silence threshold is too low and the track contains decoder artifacts, the software may not recognize some silences. Conversely, if the threshold is too high, the software may remove entire sections of quiet music at the beginning or end of a track.
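
The threshold trade-off can be seen in a small sketch. The function below is a deliberately naive, assumed implementation that strips leading and trailing samples whose absolute value does not exceed a threshold; the sample values are invented for the example.

```python
from typing import List, Sequence


def trim_silence(samples: Sequence[float], threshold: float) -> List[float]:
    """Remove leading and trailing samples no louder than `threshold`."""
    start = 0
    while start < len(samples) and abs(samples[start]) <= threshold:
        start += 1
    end = len(samples)
    while end > start and abs(samples[end - 1]) <= threshold:
        end -= 1
    return list(samples[start:end])


# Padding with a little low-level noise, a quiet intro and outro, loud middle.
track = [0.0, 0.001, 0.02, 0.5, -0.4, 0.03, 0.001, 0.0]

print(trim_silence(track, threshold=0.0005))  # noise at 0.001 is kept
print(trim_silence(track, threshold=0.05))    # quiet intro and outro are lost
```

With the low threshold the noisy padding survives, and with the high threshold the quiet passages at 0.02 and 0.03 are removed along with it.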

DSP plugins can also be used to cross-fade between tracks. This eliminates gaps that some listeners find distracting, but it also greatly alters the audio data and is not always desirable. In particular, when consecutive tracks are meant to run together and the transition occurs at high volume, cross-fading results in a large volume drop.
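
A minimal sketch of a linear cross-fade shows how the audio is altered. The function name and the choice of a linear gain ramp are assumptions made for illustration; actual plugins may use other fade curves.

```python
from typing import List, Sequence


def crossfade(a: Sequence[float], b: Sequence[float], overlap: int) -> List[float]:
    """Overlap the end of `a` with the start of `b` using linear gains."""
    if overlap < 1 or overlap > min(len(a), len(b)):
        raise ValueError("overlap must fit inside both tracks")
    tail_start = len(a) - overlap
    mixed = []
    for i in range(overlap):
        gain_in = (i + 1) / (overlap + 1)  # rises towards 1 for track b
        gain_out = 1.0 - gain_in           # falls towards 0 for track a
        mixed.append(a[tail_start + i] * gain_out + b[i] * gain_in)
    return list(a[:tail_start]) + mixed + list(b[overlap:])


# Track a ends at full level, track b starts from silence.
print(crossfade([1.0, 1.0, 1.0], [0.0, 0.0, 0.0], overlap=1))
```

The result is shorter than the two tracks played back to back, and the overlapped samples match neither source. At the midpoint of the overlap each track is attenuated to half its amplitude, which is one way the volume drop described above can arise.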

Both of these alternative solutions are typically used with compression methods that do not support the metadata for gapless playback. Like the optimal solution, they still require buffering the next track and keeping the output stream open; however, they require more computation, making them less efficient. In portable digital audio players, this can mean reduced battery life.

Some listeners find the side effects of these alternative solutions more objectionable than the gaps they are meant to remove.

Another alternative is to ignore track boundaries and encode an entire collection of tracks as a single compressed file, relying on cue sheets (or something similar) for navigation. While this method results in gapless playback within the collection of tracks, it can be unwieldy due to the possibly large size of the resulting compressed file. Furthermore, unless the playback software or hardware can recognize cue sheets, navigating between tracks may be difficult.
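
To navigate within such a file, a player has to turn cue sheet positions into sample offsets. Cue sheet index times are written as minutes:seconds:frames, with 75 frames per second. The Python sketch below handles only `INDEX 01` lines and assumes 44.1 kHz audio; the function names and the sample cue text are made up for the example, and a complete cue sheet parser would need to handle much more.

```python
import re
from typing import List

FRAMES_PER_SECOND = 75  # cue sheet frames, as used on audio CDs


def cue_time_to_samples(timestamp: str, sample_rate: int = 44_100) -> int:
    """Convert a cue sheet 'mm:ss:ff' time into a sample offset."""
    match = re.fullmatch(r"(\d+):(\d{2}):(\d{2})", timestamp)
    if match is None:
        raise ValueError(f"not a cue sheet time: {timestamp!r}")
    minutes, seconds, frames = (int(part) for part in match.groups())
    total_frames = (minutes * 60 + seconds) * FRAMES_PER_SECOND + frames
    return total_frames * sample_rate // FRAMES_PER_SECOND


def track_offsets(cue_text: str, sample_rate: int = 44_100) -> List[int]:
    """Sample offset of every 'INDEX 01' entry, in order of appearance."""
    times = re.findall(r"INDEX 01 (\d+:\d{2}:\d{2})", cue_text)
    return [cue_time_to_samples(t, sample_rate) for t in times]


# Invented cue sheet fragment for illustration.
cue = """
  TRACK 01 AUDIO
    INDEX 01 00:00:00
  TRACK 02 AUDIO
    INDEX 01 03:25:37
"""
print(track_offsets(cue))
```

Seeking to one of these offsets in the decoded stream starts playback at the corresponding track without any break in the audio.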

Format support

Since lossless compression does not introduce padding into the decoded audio, all lossless audio file formats are inherently gapless. The following lossy audio file formats have provisions for gapless encoding.

Some other formats do not officially support gapless encoding, but some implementations of encoders or decoders may handle gapless metadata.

  • LAME-encoded MP3 can be gapless with players that support the LAME MP3 info tag.
  • AAC in MP4 encoded with Nero Digital from Nero AG can be gapless with foobar2000.

Gapless solutions