
A Tale of No Clocks

By Joe Sullivan

Disclaimer: Ok, there's a clock involved.

Introduction

Chris Wilson's A Tale of Two Clocks is the primer for managing the scheduling of web audio events in the browser UI thread. Because web audio lives in its own thread with its own timeline, and because it's impossible to be sure exactly when code in the UI thread will execute, it's important for the UI thread to schedule audio events ahead of time. A Tale of Two Clocks explains how to do that using setTimeout to ensure rock-solid timing (so long as the UI thread is running at all).

There are other approaches to scheduling audio events, though. For instance, instead of periodically scheduling events you could schedule all of your events at once, then cancel them (one way or another) when stopping playback. This method doesn't work for loops (the scheduling of which would create... an infinite loop), but it has the advantage of not relying on the UI thread at all. And that is an advantage when it comes to browser windows that are in the background. UI thread execution slows to a crawl in the background, screwing up even the most solid timing scheme that relies on it.
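
As a rough sketch of that schedule-everything-up-front approach (the event shape and the names here are illustrative, not taken from any particular library), you might hold a reference to every scheduled source node so they can all be cancelled when playback stops:

// Sketch of the schedule-everything-up-front approach. `events` is assumed
// to be an array of { buffer, when } objects describing the whole piece.
var scheduledSources = [];

function playAll(context, events) {
  events.forEach(function (event) {
    var source = context.createBufferSource();
    source.buffer = event.buffer;
    source.connect(context.destination);
    source.start(context.currentTime + event.when);
    scheduledSources.push(source);
  });
}

function stopAll() {
  scheduledSources.forEach(function (source) {
    source.stop();
  });
  scheduledSources = [];
}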

This article talks about a variation of the all-at-once scheduling strategy that works particularly well with loops and avoids ongoing interaction with the UI thread. All with the help of the OfflineAudioContext.

Pre-Rendering Buffers

The idea is to schedule all events in an OfflineAudioContext, render that context into an AudioBuffer, and play that buffer in the regular AudioContext.

If you're working with a loop specifically (drum machines and sequencers fall into this category), you can then loop this rendered buffer (by operating on the AudioBufferSourceNode), creating a continuous audio signal that never needs topping-up from the UI thread.

An example of the idea in action is below, but first we should back up and talk about the OfflineAudioContext.

OfflineAudioContext

The OfflineAudioContext doesn't get talked about much, but it's really awesome. It's just like the AudioContext you already know, except that it has a startRendering method. Your workflow with an AudioContext is to schedule events along its timeline, then hear them. These two steps are so connected that it's easy to think that they are one and the same. (Thankfully, they aren't, and that's why the first argument passed to any start method is a when parameter, though leaving it out usually signifies "now"). However, in an offline audio context, these steps are explicitly separated: first, you schedule all of your audio events, then you call startRendering, which returns a promise that resolves with an AudioBuffer containing all of your audio events.


/**
 * Return a promise that resolves with an AudioBuffer that is your
 * input buffer after being filtered.
 */
function getFilteredBuffer(buffer) {
  // OfflineAudioContext needs up-front info:
  // channels, buffer length, and sampleRate
  var context = new OfflineAudioContext(buffer.numberOfChannels,
                                        buffer.duration * buffer.sampleRate,
                                        buffer.sampleRate);
  
  // Familiar graph building
  var source = context.createBufferSource();
  source.buffer = buffer;
  var filter = context.createBiquadFilter();
  source.connect(filter);
  filter.connect(context.destination);

  // Familiar event scheduling
  source.start(0);

  // Magical step
  return context.startRendering();
}
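
Using it might look something like this (decodedBuffer and audioContext are assumed to already exist, e.g. from decodeAudioData and a regular AudioContext):

// Example usage: filter a decoded buffer offline, then play the result live.
getFilteredBuffer(decodedBuffer).then(function (filteredBuffer) {
  var source = audioContext.createBufferSource();
  source.buffer = filteredBuffer;
  source.connect(audioContext.destination);
  source.start();
});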

Example Drum Machine

Here is a drum machine that operates on the principles described above. Its source code is available here. (renderPattern is where the rendering happens.)



When you hit the play button, the drum pattern is rendered into a buffer via an OfflineAudioContext, and that buffer is scheduled into the existing AudioContext. When you change the pattern, a new buffer is rendered representing this new pattern, and it replaces the current buffer. When you change the tempo, the same process occurs. Each of these renders takes ~10ms, so you'll hear a glitch if you make a change while a beat is playing; this could be handled better.
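
In sketch form, a pattern or tempo change might look like the following. This is only an illustration: renderPattern is assumed here to take the pattern and return a promise resolving with the rendered buffer, which may differ from the drum machine's actual source, and the other names are made up for the example.

// Swap in a freshly rendered buffer when the pattern or tempo changes.
function updateLoop(pattern) {
  renderPattern(pattern).then(function (newBuffer) {
    if (source) {
      source.stop(); // cutting off the old loop is the glitch mentioned above
    }
    source = audioContext.createBufferSource();
    source.buffer = newBuffer;
    source.loop = true;
    source.connect(audioContext.destination);
    source.start();
  });
}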

General Pattern

Here's what your code might look like if you pre-render audio in an OfflineAudioContext:

// Initialize an AudioContext
var audioContext = new AudioContext();
var source;

/**
 * Response to pushing play button
 */
function play() {
  renderLoop().then(playLoop);
}

/**
 * Response to pushing pause button
 */
function pause() {
  source.stop();
}

/**
 * Just a stub—yours will be different. However,
 * this function should create an OfflineAudioContext,
 * schedule your loop into the context,
 * and resolve with the buffer of that context.
 * It should probably always return something like:
 * offlineAudioContext.startRendering();
 */
function renderLoop() { }

/**
 * Play the buffer on a loop.
 */
function playLoop(buffer) {
  source = audioContext.createBufferSource();
  source.buffer = buffer;
  source.loop = true;
  source.connect(audioContext.destination);
  source.start(audioContext.currentTime);
}

Admittedly, the magic happens inside the stubbed-out renderLoop function, but that's the advantage of this pattern: once playLoop has executed, the UI thread is completely done until the next actual user interaction. That means the drum loop will continue playing regardless of what else happens, including the window being backgrounded.
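
To make the stub a little more concrete, here is one way renderLoop could look. It is purely an illustration, not the drum machine's code: the kick sample, the tempo, and the one-bar length are all assumptions.

/**
 * Illustrative renderLoop: four quarter-note kicks in one bar at 120 BPM.
 * kickBuffer is assumed to be an already-decoded AudioBuffer.
 */
function renderLoop() {
  var tempo = 120;
  var secondsPerBeat = 60 / tempo;
  var barLength = secondsPerBeat * 4;
  var sampleRate = audioContext.sampleRate;
  var offlineContext = new OfflineAudioContext(2, barLength * sampleRate, sampleRate);

  for (var beat = 0; beat < 4; beat++) {
    var source = offlineContext.createBufferSource();
    source.buffer = kickBuffer;
    source.connect(offlineContext.destination);
    source.start(beat * secondsPerBeat);
  }

  return offlineContext.startRendering();
}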

Freeing our UI thread of any responsibility for scheduling events doesn't free it from visually displaying what's going on, and yet it no longer has much of a handle on what is going on. In the drum machine above, a surprising amount of the code is devoted to keeping an accurate, independent representation of where we are in the loop (I call it the cursor, and it appears 18 times in ~260 lines of code). This code lives in three primary places: the play function, the pause function, and the (visual) render function.
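
For example, a minimal cursor can be derived from the audio clock itself (loopStartTime and loopDuration are illustrative names, not the drum machine's):

// Derive the cursor from the audio clock rather than from any UI-thread timer.
var loopStartTime = 0;
var loopDuration = 2; // length of the rendered loop, in seconds

function startCursor() {
  loopStartTime = audioContext.currentTime;
}

/**
 * Position within the loop, from 0 (start) to 1 (end). The visual render
 * function can call this on each animation frame to draw the playhead.
 */
function getCursor() {
  var elapsed = audioContext.currentTime - loopStartTime;
  return (elapsed % loopDuration) / loopDuration;
}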

Wrapping Up

Where does this rendering method make sense? It might seem like pre-rendering audio like this is tailor-made for drum machines. However, I think it also makes sense in the context of a large-scale DAW, where you have a number of audio channels, each composed of a number of audio events, most of which are static at any given time.

Still, pre-rendering buffers requires some understanding of your data model: it works really well for short loops that render quickly, but pre-rendering the channels of a 3-minute song requires being smart about when you pre-render. For the same reason, free-form performance applications, where the audio changes constantly, aren't good candidates for pre-rendering.

Nevertheless, where applicable, this approach is a straightforward method of scheduling audio events that doesn't rely on the UI thread outside of user interactions.