Audio layering issues

Virtual Audio Cable

The WDM/KS driver is the lowest-level layer in the Windows audio subsystem. Most applications don't connect to it directly. Instead, they use the higher-level DirectSound, MME and WASAPI interfaces provided by the System Audio Engine. Through these intermediate layers, an application cannot have full control over all audio parameters, which leads to some special issues and limitations. At first glance they might look quite complex, but they are easy to understand in practice.

Driver's pin allocation

WDM/KS drivers, like DirectShow filters, expose a number of pins. When an application or the system needs to record or play audio data, it creates an instance of the required pin. A pin instance is associated with a data stream, and the instance owner (the application or the System Audio Engine) becomes a client of the audio driver.

Applications that use WDM/KS directly, or DirectSound/WASAPI in exclusive mode, request a separate pin instance. If a pin instance is available, the request succeeds; otherwise it fails. Most modern drivers provide only a single pin instance. Therefore, several WDM/KS applications can record or play simultaneously only if the device driver has more than one pin instance available.

To allow simultaneous playback or recording, the System Audio Engine uses its own pin instance when a device is accessed via the DirectSound, MME or WASAPI interface in shared mode. When an application connects to a device in shared mode, its connection request is implicitly processed by the System Audio Engine. If the engine is not yet connected to the appropriate pin, a pin instance is created. Otherwise, no additional pin instances are requested from the device driver.

If an application requests exclusive-mode access through the System Audio Engine but no shared-mode connections exist for the appropriate pin (no common pin instance is allocated to the System Audio Engine), the engine first creates its common pin instance to allow further shared-mode connections (there is a known bug here). If the driver is single-client and supports only a single pin instance, no instances remain to satisfy the exclusive-mode request. In that case, the request is either rejected or satisfied in shared mode, depending on the parameters.

Therefore, if there are not enough available pin instances, Windows prefers shared-mode access for compatibility purposes. Direct WDM/KS access requests, however, don't go through the System Audio Engine, so they can allocate the last (or even the only) possible pin instance. If such a direct low-level request is made before the first higher-level request, the System Audio Engine will be unable to create its common pin instance, and all further higher-level requests will fail, whether they ask for shared or exclusive mode.

Conversely, if the last (or only) possible pin instance has been allocated to the System Audio Engine, further direct WDM/KS access requests will fail.
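
The allocation rules above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the actual Windows or VAC logic: the class name PinPool and its methods are hypothetical, and a failed exclusive-mode request is simply reported as False, although the real system may instead satisfy it in shared mode.

```python
class PinPool:
    """Toy model of one driver pin with a fixed number of instances (hypothetical)."""

    def __init__(self, max_instances=1):
        self.max_instances = max_instances
        self.in_use = 0
        self.engine_has_pin = False  # common pin held by the System Audio Engine

    def _allocate(self):
        # Take one instance if any is still free.
        if self.in_use >= self.max_instances:
            return False
        self.in_use += 1
        return True

    def open_direct_ks(self):
        """Direct WDM/KS request: bypasses the engine and takes any free instance."""
        return self._allocate()

    def open_shared(self):
        """Shared-mode request: the engine creates its common pin once, then reuses it."""
        if not self.engine_has_pin:
            if not self._allocate():
                return False
            self.engine_has_pin = True
        return True

    def open_exclusive_via_engine(self):
        """Exclusive-mode request through the engine: common pin first, then a separate pin."""
        if not self.open_shared():
            return False
        return self._allocate()


# A direct KS client takes the only instance, so a later shared-mode request fails.
pool = PinPool(max_instances=1)
ks_ok = pool.open_direct_ks()
shared_ok = pool.open_shared()
```

In this model, ks_ok is True and shared_ok is False. Reversing the order lets any number of shared-mode requests succeed, while the direct KS request fails.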

Format selection for System Audio Engine

When creating a pin instance for the System Audio Engine in shared mode, the Windows audio subsystem must choose an audio format for it. The selection is based on the format the application specified in its connection request, the set of formats supported by the device driver, and some other rules. In Windows 5.x, the System Audio Engine commonly selects the format as follows:

  • MME playback: the sampling rate is equal to the requested one, bits per sample is preferably 16 and the number of channels is preferably 2 (stereo).

  • DirectSound playback: the sampling rate is the first matching rate from the driver's set, bits per sample is preferably 16 and the number of channels is preferably 2 (stereo).

  • MME recording and DirectSound capture: the format is the same as requested, except in some incompatible cases (unusual sampling rates, more than 8 channels, etc.).

If the possible pin format parameters are strongly restricted and the System Audio Engine cannot use 16 bits and 2 channels, a different bit depth and/or number of channels can be used.

In Windows 6.x, a separate default format can be specified for shared-mode connections.

Stream and client counting

All shared-mode connections to a recording or playback device are established through the System Audio Engine using a single common pin instance. So if your applications use shared-mode access, you will see only a single playback stream and a single recording stream in the VAC Control Panel, because these applications are not VAC clients; they are clients of the System Audio Engine. VAC knows nothing about their existence or streaming activity: all their requests are served by the System Audio Engine.

A pin instance created for the System Audio Engine is often "sticky" and persists for several seconds after all applications have closed their connections to a Virtual Cable. You can see it in the VAC Control Panel.

Only an exclusive-mode connection allows an application to become a VAC client and create its own pin instance with an associated data stream. Each exclusive-mode connection adds one stream to the recording or playback stream counter.

You can experiment with various applications and formats and watch the number of clients and streams in the VAC Control Panel.
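
As a rough aid for interpreting the counters, the counting rule can be written as a tiny function. This is a sketch only: visible_streams and its parameters are hypothetical names, and the engine_pin_open flag is an assumption standing in for cases where the engine's common pin exists without any shared-mode client (a "sticky" pin, or a pin created ahead of an exclusive-mode request).

```python
def visible_streams(shared_clients, exclusive_clients, engine_pin_open=False):
    """Streams expected for one direction (playback or recording) of a cable.

    All shared-mode clients hide behind the engine's single common pin,
    so together they contribute at most one stream. Each exclusive-mode
    client contributes its own stream.
    """
    engine_stream = 1 if (shared_clients > 0 or engine_pin_open) else 0
    return engine_stream + exclusive_clients


# Three shared-mode players and two exclusive-mode players on one cable.
count = visible_streams(shared_clients=3, exclusive_clients=2)
```

Here count is 3: one stream for the engine plus one for each exclusive-mode client.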

The Windows audio subsystem and some applications may access the VAC driver and its pins for property requests without creating a pin instance. Such connections are counted in the Driver Clients field of the VAC Control Panel.

How cable format is selected

For each Virtual Cable, VAC maintains a cable format, which is common to all cable clients (data streams). All render (output) data are converted to the cable format, mixed, and then distributed to all capture (input) clients, being converted to each client's particular format.
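
The mixing step can be pictured with a minimal sketch. It assumes that all render streams have already been converted to a 16-bit cable format and that mixing is a plain sum with saturation; the text above does not describe VAC's actual mixing algorithm, and the name mix_16bit is hypothetical.

```python
def mix_16bit(streams):
    """Mix equal-length lists of signed 16-bit samples by summing with saturation.

    Assumption: saturation (clamping) is used on overflow. This is an
    illustration, not a description of what VAC does internally.
    """
    mixed = []
    for samples in zip(*streams):
        total = sum(samples)
        # Clamp to the signed 16-bit range.
        mixed.append(max(-32768, min(32767, total)))
    return mixed


# Two render streams already in the cable format.
render_a = [1000, 30000, -20000]
render_b = [2000, 10000, -20000]
cable_data = mix_16bit([render_a, render_b])
```

Every capture client would then receive the same cable_data, converted to its own format if that differs from the cable format.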

VAC selects the cable format when a Virtual Cable becomes active (gets its first client stream). As described above, there may be implicit, hidden clients such as the System Audio Engine, which might request a connection with a format different from the one specified by the application.

For example, suppose the default playback format for Virtual Cable 1 is set to 48000/16/2 and an application requests an exclusive-mode connection through the System Audio Engine, specifying 96000/24/2. Before satisfying this request, the System Audio Engine requests a pin instance for shared-mode access, specifying the default device format (48000/16/2), which VAC fixes as the cable format. Then the engine requests another pin instance for the application, specifying 96000/24/2. But the cable format remains 48000/16/2, so VAC will downsample 96000 to 48000 and truncate 24-bit samples to 16 bits. Therefore, the system default device format assigned to a Virtual Cable for shared-mode access can affect exclusive-mode access to that Virtual Cable.
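
To make the quality loss in this example concrete, here is a minimal sketch of the two conversions. It assumes naive decimation by an integer factor and simple truncation of the low bits; a real resampler would apply a low-pass filter first, and VAC's actual conversion algorithm is not described here. The function names are hypothetical.

```python
def truncate_24_to_16(sample):
    """Drop the 8 least significant bits of a signed 24-bit sample."""
    return sample >> 8


def decimate(samples, src_rate, dst_rate):
    """Naive downsampling by an integer factor: keep every Nth sample, no filtering."""
    if src_rate % dst_rate != 0:
        raise ValueError("only integer ratios are handled in this sketch")
    step = src_rate // dst_rate
    return samples[::step]


# One channel of 96000 Hz / 24-bit data converted to 48000 Hz / 16-bit.
source = [0x123456, 0x123457, -0x800000, 0x7FFFFF]
converted = [truncate_24_to_16(s) for s in decimate(source, 96000, 48000)]
```

Half of the samples and the lowest 8 bits of each remaining sample are discarded, even though the application asked for 96000/24/2.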

Similarly, an improper cable format range may cause quality problems in Windows 5.x. For example, if the range is set to 22050-96000/8-32/1-8, the System Audio Engine creates its common pin instance with 96000/32/8 and the cable format is set to the same. If most applications use formats such as 44100/16/2 or 48000/16/6, there will be many unnecessary format conversions.

To prevent such issues, choose a cable format range as close as possible to the set of formats you actually use. See the format limiting rules for details.

Other issues

Windows 6.x systems have a common bug that prevents the System Audio Engine from creating its common capture (recording) pin instance for shared-mode access if there are already active (allocated) pin instances (streams). For example, it occurs if you have started recording using the KS interface (the KS version of Audio Repeater, ASIO4ALL or similar) and then try to start recording using a higher-level interface (MME, DirectSound, WASAPI) in shared mode. As a workaround, start the higher-level interface first, then start KS recording.