HTML5 Visibility API & Page Pre-Rendering

Minimizing UI latency is critical for creating a positive user experience, both on the desktop and on the web. A best practice for a "native app" is to decouple the UI and control threads so that long-running tasks never block the interface. On the web, things are a lot trickier: our JavaScript runtimes are all single-threaded, so we can't just spin up an extra thread and instead have to rely on event-driven programming models. Even worse, unless the invoked computation is local, a roundtrip to the server can easily take hundreds of milliseconds in network overhead alone.

Not surprisingly, over the past few years we have invented dozens of new JavaScript-based UI frameworks: all asynchronous, all focused on hiding the interaction latency behind a JavaScript facade. In theory these are all great ideas, but in practice they come with their own set of downsides: how about those #!'s? And we have all certainly seen, and written, a few overly eager setTimeout calls that quickly drain the client's CPU and battery life.

In fact, because the browser VM is a shared resource, we have a classic tragedy of the commons: if every application plays nice, then everyone can have an optimal experience, but the incentives to do so are not clear. The problem is that until very recently we did not even have the tools to address this issue! The Page Visibility API is the first HTML5 proposal that tries to tackle this problem, and browser pre-rendering aims to help us hide some of the network latency in our web applications. Let's take a look under the hood.

Browser Pre-fetching vs. Pre-rendering

An average page render requires fetching a dozen resources alongside the actual HTML content. If you dig into your debug console, it is not uncommon to see pages that take on the order of ten seconds to load to completion. Thankfully, browsers have implemented many tricks to make the page seem to load much faster: parallel downloads, highly optimized rendering engines, and a never-ending battle to speed up JavaScript execution. Nonetheless, this is usually still not enough to beat the "native experience".

The server can help as well: new protocols like SPDY aim to reduce the network overhead of fetching multiple resources, and there is even talk of enabling server push of related page assets. Think you can guess what the user may click on next? Firefox 3.5 enabled the pre-fetching API, which allows us to hint to the browser which resources it may need to service a subsequent request:

<!-- Specify any & all resources to pre-fetch -->
<link rel="prefetch" href="/images/big.jpg">

<!-- or send an HTTP header -->
Link: </images/big.jpg>; rel=prefetch

Pre-fetching is a simple optimization, but it requires that we explicitly specify each and every resource - just listing the link of the next HTML page is unlikely to result in a noticeable improvement in the user experience.
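
Since every resource has to be listed by hand, it can help to generate the hints from a list. Here is a minimal sketch that builds the value of the HTTP Link header shown above for several URLs at once. The function name buildPrefetchHeader and the /css/next.css path are made up for illustration; they are not part of any browser or server API.

```javascript
// Builds the value of an HTTP Link header that marks each URL as a prefetch hint.
// The function name and the example paths are hypothetical.
function buildPrefetchHeader(urls) {
  return urls
    .map(function (url) {
      return "<" + url + ">; rel=prefetch";
    })
    .join(", ");
}

// Example: hint the assets that the likely next page will need.
const header = buildPrefetchHeader(["/images/big.jpg", "/css/next.css"]);
console.log("Link: " + header);
// prints: Link: </images/big.jpg>; rel=prefetch, </css/next.css>; rel=prefetch
```

Multiple hints are joined with commas, which is how a single Link header carries more than one link.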

This is where the new pre-rendering proposal comes in: instead of specifying a single resource, what if the browser could fetch and render the entire next page, but hide it from you until you click on the link? As of about a month ago, pre-rendering support is in WebKit, and Google is already prototyping it with "Instant Pages".

Pre-rendering wins and gotchas

At the moment, the pre-rendering API is limited: only one page can be pre-rendered across the entire VM, and only one page can be put into the pre-render queue per tab. Fetching an entire page taxes both the server and the client, hence you need to be sure that you will actually need it. Google's web search team, for example, only enables pre-rendering on search results if they have very high confidence that you will actually click on the result.
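
To make the "one page, high confidence" rule concrete, here is a rough sketch of how a server-side template might decide whether to emit a pre-render hint at all. It assumes the hint is written as a link element with rel="prerender", which is how Chrome exposed the feature. The candidate list, the probability field, and the 0.9 cutoff are invented for illustration; this is not how Google's search team does it.

```javascript
// Assumed cutoff; a real application would tune this from its own click data.
const CONFIDENCE_THRESHOLD = 0.9;

// Escapes characters that are unsafe inside a double-quoted HTML attribute.
function escapeAttribute(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Returns markup for at most one pre-render hint, or an empty string
// when no candidate is likely enough to justify the extra load.
// `candidates` is a hypothetical list of { url, probability } objects.
function prerenderHint(candidates, threshold) {
  const cutoff = threshold === undefined ? CONFIDENCE_THRESHOLD : threshold;
  let best = null;
  candidates.forEach(function (candidate) {
    if (candidate.probability >= cutoff &&
        (best === null || candidate.probability > best.probability)) {
      best = candidate;
    }
  });
  if (best === null) {
    return "";
  }
  return '<link rel="prerender" href="' + escapeAttribute(best.url) + '">';
}
```

Returning a single hint, or none at all, mirrors the limit described above: a second hint would simply compete for the same pre-render slot.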

Additionally, since we are now pre-rendering the entire page (HTML, CSS, and JS), how does this affect all the interactive content on the page? Knowing nothing about the pre-render step, the requested page can easily pin our CPU, register a pageview and make a request to an ad server for content that the user may never actually see! To solve this problem, WebKit developers have also added the Page Visibility API:

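// pausePageJavascript and startPageJavascript stand in for application-specific
// logic, e.g. stopping and resuming timers, animations, or polling.
function handleVisibilityChange() {
  if (document.webkitHidden) {
    // The page is being pre-rendered or sits in a background tab.
    pausePageJavascript();
  } else {
    // The page is visible to the user.
    startPageJavascript();
  }
}

// Fires whenever the page moves between the hidden and visible states.
document.addEventListener("webkitvisibilitychange", handleVisibilityChange, false);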
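
The same check can guard one-off side effects, such as registering a pageview or requesting an ad, so that they only happen once the user actually sees the page. The sketch below is one possible way to do it. The helper name runWhenVisible is hypothetical, and it takes the document as a parameter purely so it can be exercised outside a browser. It listens for both the WebKit-prefixed names used above and the unprefixed hidden / visibilitychange names from the Page Visibility specification.

```javascript
// True when the page is hidden, using either the prefixed or unprefixed property.
function isHidden(doc) {
  return Boolean(doc.hidden || doc.webkitHidden);
}

// Runs `task` immediately if the page is visible, otherwise once,
// the first time the page becomes visible. `runWhenVisible` is a
// hypothetical helper, not part of any browser API.
function runWhenVisible(doc, task) {
  if (!isHidden(doc)) {
    task();
    return;
  }

  function onChange() {
    if (isHidden(doc)) {
      return; // still hidden, keep waiting
    }
    // Remove both listeners first so the task cannot run twice.
    doc.removeEventListener("visibilitychange", onChange, false);
    doc.removeEventListener("webkitvisibilitychange", onChange, false);
    task();
  }

  doc.addEventListener("visibilitychange", onChange, false);
  doc.addEventListener("webkitvisibilitychange", onChange, false);
}
```

In a page this might be called as runWhenVisible(document, trackPageview), where trackPageview is whatever analytics call the application already makes.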

The webkitHidden property tells us the state of the page, which solves the original visibility problem, but the webkitvisibilitychange event has another nice side effect: it allows the client to easily detect when a tab is visible and when it is in the background. Why does this matter? Imagine you have an application which polls the client, or the server, every 50 milliseconds for some updates. With the visibility API, you can gracefully pause the timer, or slow it down to a much longer polling interval, when the tab is in the background.
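
Here is a minimal sketch of that idea: a poller that runs every 50 milliseconds in the foreground and backs off while the tab is hidden. The 5000 millisecond background interval, the function names, and the poll callback are assumptions made for illustration. As before, the document is passed in as a parameter, and both the prefixed and unprefixed event names are handled.

```javascript
// 50 ms matches the example above; the background interval is an assumed value.
const FOREGROUND_MS = 50;
const BACKGROUND_MS = 5000;

// Picks the polling delay from the current visibility of the page.
function pollDelay(doc) {
  const hidden = Boolean(doc.hidden || doc.webkitHidden);
  return hidden ? BACKGROUND_MS : FOREGROUND_MS;
}

// Starts polling and re-schedules the timer whenever visibility changes.
// `poll` is a hypothetical application callback. Returns a function that stops polling.
function startAdaptivePolling(doc, poll) {
  let timer = null;

  function schedule() {
    clearTimeout(timer); // drop any pending tick before choosing a new delay
    timer = setTimeout(function tick() {
      poll();
      schedule();
    }, pollDelay(doc));
  }

  doc.addEventListener("visibilitychange", schedule, false);
  doc.addEventListener("webkitvisibilitychange", schedule, false);
  schedule();

  return function stop() {
    clearTimeout(timer);
    doc.removeEventListener("visibilitychange", schedule, false);
    doc.removeEventListener("webkitvisibilitychange", schedule, false);
  };
}
```

Re-scheduling on every visibility change means a tab that comes back to the foreground returns to the fast interval right away instead of waiting out the long background delay.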

Minimizing latency on the web

Both the pre-rendering and Page Visibility APIs are still in development, but it is great to see more client-side tools that help web developers hide the underlying network latency. With these APIs, instead of relying on an async JavaScript stack, your next multi-step form can be rendered on the server and pre-rendered in a WebKit browser with instant feedback on the client!

Likewise, while browsers like Chrome are already downgrading background tabs in CPU priority, a client-side API to detect foreground tabs is a welcome addition. Let's hope that Firefox, Opera and IE jump on the bandwagon as well!

Ilya Grigorik is a web ecosystem engineer, author of High Performance Browser Networking (O'Reilly), and Principal Engineer at Shopify.