Building a Modern Web Stack for the Real-time Web

The web is evolving. After a few years of iteration the WebSockets spec is finally here (RFC 6455), and as of late 2011 both Chrome and Firefox are SPDY capable. These additions are much more than just "enhancing AJAX" - we now have true real-time communication in the browser: stream multiplexing, flow control, framing, and significant latency and performance improvements. Now we just need to drag our "back office" - our web frontends, app servers, and everything in between - into this century so we can take advantage of these new capabilities.

We're optimized for "Yesterday's Web"

Modern backend architecture should terminate the user connection as close to the user as possible to minimize latency - this is your load balancer or web server occupying ports 80 and 443 (SSL). From there, the request is routed over the internal network from the frontend to the backend service that will generate the response. Unfortunately, the current state of our "back office" routing is not only outdated but often also the limiting factor in our adoption of these real-time protocols.

WebSockets and SPDY are both multiplexed protocols, optimized to carry multiple, interleaved streams of data over the same TCP pipe. Unfortunately, popular choices such as Apache and Nginx have no understanding of this and at best degrade to dumb "TCP proxies". Even worse, since they do not understand multiplexing, stream flow control and priority handling go out the door as well. Finally, both WebSockets and SPDY communicate in framed messages, not opaque TCP streams, yet those frames are needlessly re-parsed at every stage of the pipeline.
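To make "framed messages" concrete, here is a minimal sketch of decoding a WebSocket frame header as defined in RFC 6455. The function name and return shape are our own illustration; a production server would also handle continuation frames, control frames, and unmasking:

```python
import struct

def parse_ws_frame_header(buf: bytes):
    """Decode the header of an RFC 6455 WebSocket frame.

    Returns (fin, opcode, masked, payload_len, header_len).
    """
    if len(buf) < 2:
        raise ValueError("need at least 2 bytes of header")
    b0, b1 = buf[0], buf[1]
    fin = bool(b0 & 0x80)        # final fragment of the message?
    opcode = b0 & 0x0F           # 0x1 = text, 0x2 = binary, 0x8 = close, ...
    masked = bool(b1 & 0x80)     # client-to-server frames are always masked
    length = b1 & 0x7F
    offset = 2
    if length == 126:            # 16-bit extended payload length follows
        (length,) = struct.unpack_from("!H", buf, 2)
        offset = 4
    elif length == 127:          # 64-bit extended payload length follows
        (length,) = struct.unpack_from("!Q", buf, 2)
        offset = 10
    if masked:
        offset += 4              # 4-byte masking key follows the length
    return fin, opcode, masked, length, offset

# A single unmasked text frame carrying "Hello" (0x81 = FIN + text opcode)
print(parse_ws_frame_header(b"\x81\x05Hello"))  # → (True, 1, False, 5, 2)
```

Every proxy or server sitting between the browser and your app has to repeat exactly this kind of work if it only understands byte streams - which is the waste the rest of this article argues against.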

Put all of this together and you quickly realize why your own back office web stack, and even popular platforms such as Heroku and Google's App Engine, are unable to provide WebSockets or SPDY support: our services are fronted by servers and software that were designed for yesterday's web.

Architecture for the "Real-time Web"

HTTP is not going away anytime soon, and we will have to support both the old and the new protocols for some time to come. One attempt at bridging the two has been the SPDY > HTTP proxy, which converts a multiplexed stream into a series of old-fashioned HTTP requests. This works, and it allows us to reuse our old infrastructure, but it is exactly backwards from what we need to be doing!

Instead of converting an optimized, multiplexed stream into a series of internal HTTP dispatches, we should be asking for HTTP > SPDY infrastructure, which would allow us to move beyond our outmoded architectures. In 2012, we should demand that our internal infrastructure offer the following:

  • Request and Response streaming should be the default
  • Connections to backend servers should be persistent
  • Communication with backend servers should be message-oriented
  • Communication between clients and backends should be bi-directional

Make SPDY the default, embrace dynamic topologies

The first step towards these goals is to recognize that translating SPDY to HTTP is a convenient path in the short term, but exactly the wrong path in the long term. SPDY offers multiplexing, flow control, optimized compression, and framing. We should embrace it and make it the default on the backend. Once we have a multiplexed, message-oriented protocol on the backend, we can also finally stop reparsing the same TCP stream on every server. Writing HTTP parsers in 2012 is neither fun nor an interesting problem.

Finally, this architecture should not require a dedicated ops team or a custom software platform to maintain. Modern web applications are rarely powered by a single host and require dynamic (re)configuration and management. Services such as Heroku, CloudFoundry, and GAE have built their own "routing fabrics" to handle these problems. Instead, we need to design architectures where the frontends and the backends are decoupled by default and require minimal intervention and maintenance.

Adopt a modern Session Layer

Building dynamic network topologies is not for the faint of heart, especially once we add the additional requirements for message-oriented communication, multiplexed streams, and a grab bag of performance constraints. Thankfully, libraries such as ØMQ offer all of the above and more, all wrapped behind a simple and intuitive API. Let the frontend parse and emit SPDY frames, and then route them internally as ØMQ messages to any number of subscribers.

Mongrel2 was one of the first web servers to explore this type of architecture with ØMQ, which allowed it to sidestep the entire problem of backend configuration, as well as enable a number of interesting worker topology patterns. There is still room for improvement, but it is a much needed step in the right direction. As a concrete example, let's consider a sample workflow with SPDY and ØMQ:

  1. An HTTP (or SPDY) request arrives at the frontend
  2. Frontend parses the request and generates SYN_STREAM, HEADERS and DATA SPDY frames
  3. The messages are delivered into a PUSH ØMQ socket (à la Mongrel2)
  4. Backend subscribers use a PULL socket to process the SPDY stream
  5. Backend subscriber streams a response back to the frontend

The communication is done over a persistent channel with message-oriented semantics, the frontend and the backends are completely decoupled, and we can finally stop punching "TCP holes" in our networks to support the modern web.
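The hand-off in the steps above can be sketched with pyzmq. This is a minimal, single-process illustration under stated assumptions: the inproc endpoint name is arbitrary, and plain byte strings stand in for the binary SYN_STREAM, HEADERS, and DATA frames a real frontend would emit:

```python
import zmq

ctx = zmq.Context.instance()

# Backend worker: a PULL socket receives whole messages - no stream re-parsing
pull = ctx.socket(zmq.PULL)
pull.bind("inproc://spdy-stream")

# Frontend: parses the client connection and pushes SPDY-style frames downstream
push = ctx.socket(zmq.PUSH)
push.connect("inproc://spdy-stream")

# One SPDY stream, delivered as a sequence of framed messages
for frame in (b"SYN_STREAM:1", b"HEADERS:1", b"DATA:1:hello"):
    push.send(frame)

# Each recv() yields exactly one frame, in order, over a persistent channel
received = [pull.recv() for _ in range(3)]

push.close()
pull.close()
```

With multiple workers connected to the same PUSH socket, ØMQ load-balances frames across them automatically - which is precisely how the frontend and backends stay decoupled without any routing configuration.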

Supporting HTTP 2.0 in the back office

The new protocols are here, but the supporting "back office" architecture requires a serious update: SSL is becoming the default, streaming is no longer optional, and long-lived persistent connections are in. SPDY is gaining momentum, and I have no doubt that in the not so distant future it will be an IETF approved protocol. Similarly, ØMQ is not the only alternative for internal routing, but it is definitely one that has been gaining momentum.

Fast HTTP parsing and routing is simply not enough to support modern web use cases. Likewise, punching "TCP holes" in our infrastructure is not a viable long-term solution - in 2012 we should be asking for more. Yes, I'm looking at you, Varnish, Nginx, Apache and friends.


Ilya Grigorik

Ilya Grigorik is a web performance engineer and developer advocate at Google, where his focus is on making the web fast and driving adoption of performance best practices at Google and beyond.