Firelight Technologies FMOD Studio API
Lossy audio formats: quality, multichannel and looping
Quality and bit rate
What is the relationship between bit rate and the 'compression quality' property?
Within FMOD Designer, the compression quality property is found in the wave bank property panel. In FSBankEx, the quality property is in the format options. For constant bit rate compression, the relationship between bit rate and the compression quality property is approximately:
bit rate = quality * 3.2
This is the case for MP2/MP3 but may differ for XMA and other bit rate based formats.
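As a quick illustration of the rule of thumb above, the following Python sketch converts a compression quality value into an approximate bit rate. The function name `approx_bit_rate_kbps` is hypothetical and not part of any FMOD tool, and the result is assumed to be in kbps, which matches the MPEG bit rates used in this document. It applies to MP2/MP3 only.

```python
def approx_bit_rate_kbps(quality: float) -> float:
    """Approximate constant bit rate for an MP2/MP3 quality setting.

    Hypothetical helper: it only applies the 'quality * 3.2' approximation
    and assumes the result is expressed in kbps.
    """
    return quality * 3.2


if __name__ == "__main__":
    # Print the approximate bit rate for a few quality settings.
    for quality in (25, 50, 100):
        print(f"quality {quality} -> about {approx_bit_rate_kbps(quality):.0f} kbps")
```

Because the relationship is only approximate, the encoder may settle on a nearby supported bit rate rather than the exact value computed here.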
Bit rates and sample rates for MPEG data
The following table shows the bit rates and sample rates available for MPEG data within FMOD.
Note! This is the MPEG version, not the 'layer' version. Layer 2 and 3 are commonly known as MP2/MP3. MP3, for example, could be MPEG 1 or 2, but is still 'layer 3'.
Both MP2 support and MP3 support share the same MPEG versions and bitrate/samplerate capabilities.
MPEG 1 Bitrates (kbps) | MPEG 1 Sample rates (kHz) | MPEG 2 Bitrates (kbps) | MPEG 2 Sample rates (kHz) |
32 | 32 | | |
48 | 44.1 | 16 | |
56 | 48 | 24 | |
64 | | 32 | 16 |
80 | | 40 | 22.05 |
96 | | 48 | 24 |
112 | | 56 | |
128 | | 64 | |
160 | | 80 | |
192 | | 96 | |
224 | | 112 | |
256 | | 128 | |
320 | | 144 | |
384 | | 160 | |
* Note that the crossed out values are not supported by FSBankEx even though they are specified as part of the MPEG format specification.
If the user attempts to use a sample rate that is not listed, FMOD will automatically resample the file upwards to the next valid sample rate. For example, a file with a sample rate of 15 kHz will be resampled to 16 kHz.
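The upward resampling rule can be sketched in Python as follows. This is not FMOD's implementation: the function name `next_valid_sample_rate` is hypothetical, the list of valid rates is taken from the table above, and returning `None` for rates above 48 kHz is an assumption, since that case is not described here.

```python
# Sample rates (Hz) listed in the table above for MPEG 2 and MPEG 1 data.
VALID_SAMPLE_RATES_HZ = (16000, 22050, 24000, 32000, 44100, 48000)


def next_valid_sample_rate(rate_hz: int):
    """Return the smallest listed sample rate that is >= rate_hz.

    Hypothetical helper. Returns None when rate_hz exceeds every listed
    rate; that behaviour is an assumption, not documented FMOD behaviour.
    """
    for valid in VALID_SAMPLE_RATES_HZ:
        if rate_hz <= valid:
            return valid
    return None


if __name__ == "__main__":
    for rate in (15000, 22050, 30000):
        print(rate, "->", next_valid_sample_rate(rate))
```

A rate that is already valid is returned unchanged, so only unlisted rates are moved upwards.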
Multi-channel MPEG Encoding
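FMOD is able to create MPEG files with up to 16 channels (eight stereo pairs). To do this, the build process encodes the channels as stereo pairs and then interleaves the resulting stereo MPEG frames into multi-channel MPEG frames. This process is illustrated in the figure below.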
Figure 1: Encoding a multi-channel MPEG file
For example, consider a six-channel MPEG file using a constant bit rate of 128 kbps. The six channels are encoded into three stereo pairs. Each frame of stereo MPEG data is 432 bytes (including a 14 byte buffer). FMOD interleaves the stereo frames every 432 bytes into a multi-channel MPEG frame. The size of the multi-channel MPEG frame can be calculated as stereo frame size * number of stereo pairs. In this example, the multi-channel MPEG frame is 432 * 3, giving 1296 bytes.
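The frame size arithmetic in this example can be checked with a short Python sketch. The function name `multichannel_frame_bytes` is hypothetical, and rounding an odd channel count up to a whole stereo pair is an assumption that this document does not confirm.

```python
def multichannel_frame_bytes(channels: int, stereo_frame_bytes: int) -> int:
    """Size in bytes of one interleaved multi-channel MPEG frame.

    Hypothetical helper: multiplies the stereo frame size by the number
    of stereo pairs needed to carry the given channel count.
    """
    if not 1 <= channels <= 16:
        raise ValueError("FMOD multi-channel MPEG supports up to 16 channels")
    # Assumption: an odd channel count still occupies a full stereo pair.
    stereo_pairs = (channels + 1) // 2
    return stereo_frame_bytes * stereo_pairs


if __name__ == "__main__":
    # Six channels at 128 kbps: three stereo pairs of 432 bytes each.
    print(multichannel_frame_bytes(6, 432))
```

The stereo frame size is passed in rather than derived, because it depends on the bit rate, the sample rate and the buffer bytes mentioned above.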
Encoding mp3 files for seamless looping
Typically, when an mp3 file is looped, an audible gap can be heard when playback loops back to the start. This gap is obvious when the loop requires sample-accurate stitching from the last sample to the first. This occurs for a number of reasons; the encoding steps described below address two of them: silent padding in the last frame and frame dependency between the first and last frames.
Without special encoding, it is not possible for mp3 data to loop seamlessly. Fortunately, FMOD provides a method to do just that. The FMOD mp3 encoder can be accessed via FMOD Designer or FSBankEx. For Designer users, the special encoder is used automatically if the sound definition instance is set to loop and the wave bank compression property is set to 'MP3'. Note: if the sound definition instance is set to 'one-shot', the standard mp3 encoding is used. Users of the lower level API can instruct FSBankEx to encode mp3 data for seamless looping.
So what does FMOD do to provide seamless looping of mp3 data?
Firstly, FMOD's encoder will resample and stretch the last frame to ensure that all 1152 samples of the frame are used. This will ensure the frame is not padded with silent samples.
When used on some sources, this process may cause a slightly audible pitch change artifact. If this is the case, users are encouraged to repeat the audio within the file to increase its length, so that the time stretch becomes less significant. Users may also adjust the length of their audio to a multiple of the frame size. The table below lists the frame size for various formats.
Format | Frame size (samples) |
MPEG 1 | 1152 |
MPEG 2 (2.5) | 576 |
XMA | 2048 |
VAG | 28 |
GCADPCM | 36 |
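To help with adjusting audio to a multiple of the frame size, the following Python sketch reports the nearest frame-aligned lengths below and above a given length in samples. The name `frame_aligned_lengths` and the dictionary keys are hypothetical; the frame sizes are copied from the table above.

```python
# Frame sizes in samples, copied from the table above.
FRAME_SIZE_SAMPLES = {
    "MPEG 1": 1152,
    "MPEG 2 (2.5)": 576,
    "XMA": 2048,
    "VAG": 28,
    "GCADPCM": 36,
}


def frame_aligned_lengths(length_samples: int, fmt: str):
    """Return (lower, upper): the nearest multiples of the frame size.

    Hypothetical helper. 'lower' is the largest multiple that does not
    exceed length_samples and 'upper' is the smallest multiple that is
    not below it. Both equal length_samples when it is already aligned.
    """
    frame = FRAME_SIZE_SAMPLES[fmt]
    lower = (length_samples // frame) * frame
    upper = lower if lower == length_samples else lower + frame
    return lower, upper


if __name__ == "__main__":
    lower, upper = frame_aligned_lengths(100000, "MPEG 1")
    print("trim to", lower, "samples or extend to", upper, "samples")
```

Whether to trim to the lower length or extend to the upper one is an editing decision that depends on the material.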
With the removal of any padding within the last frame, FMOD's encoder must then prime the first frame with data from the last frame. The last frame is then removed. This allows FMOD's decoder to avoid issues of frame dependency between the first and last frame and provide a seamless loop.
In most situations, FMOD's encoder and decoder will perform perfect looping of mp3 content. However, some audible artifacts can be introduced, as illustrated below.
Figure 2: Encoding MPEG frames for seamless looping
When the first frame contains silence and the last frame contains an audible signal, the interpolation used in priming the first frame will result in an audible 'pop'. Should users require silence in the first frame of their loop, they should:
XMA Quality and Compression
As specified (in part) in the Xbox SDK documentation:
- 1 provides the highest compression level and the lowest quality, and
- 100 provides the lowest compression level and the highest quality.