Non-blocking ActiveRecord & Rails

Rails and MySQL go hand in hand. ActiveRecord is perfectly capable of using a number of different databases but MySQL is by far the most popular choice for production deployments. And therein lies a dirty secret: when it comes to performance and 'scalability' of the framework, the Ruby MySQL gem is a serious offender. The presence of the GIL means that concurrency is already somewhat of a myth in the Ruby VM, but the architecture of the driver makes the problem even worse. Let's take a look under the hood.

Dissecting Ruby MySQL drivers

The native mysql gem many of us use in production was designed to expose a blocking API: you issue a SQL query, and the library blocks until the server returns a response. So far so good, but unfortunately it also introduces a nasty side effect. Because it blocks inside of the native code (inside mysql_real_query() C function), the entire Ruby VM is frozen while we wait for the response. So, if you query happens to have taken several seconds, it means that no other block, fiber, or thread will be executed by the Ruby VM. Ever wondered why your threaded Mongrel server never really flexed its threaded muscle? Well, now you know.

Fortunately, the little known mysqlplus gem addresses the immediate problem. Instead of using a single blocking call, it forwards the query to the server, and then starts polling for the response. For the curious, there are also two implementations, one in pure Ruby with a select loop, and a native (C) one which uses rb_thread_select. The benefit? Well, now you can have multiple threads execute database queries without blocking the entire VM! In fact, with a little extra work, we can even get some concurrency out of ActiveRecord.

However, we could even drop threads in our quest for concurrency! Instead of making every thread poll on a socket, we could pass each of those sockets to a single event loop (EventMachine) library, and let it handle all the IO scheduling for us: gem install em-mysqlplus. Same API, in fact, it uses mysqlplus under the covers, but now every query has a callback for true non-blocking database access. Take a look at a few examples in the slides:

Non-blocking Rails with MySQL

Now we come around full circle. The downside of a true asynchronous library is that it requires callbacks, spaghetti code and a fully asynchronous stack. Thankfully, we already have Thin for our async app server, and with the introduction of Fibers in Ruby 1.9, we can wrap our asynchronous driver to behave just as if it had a blocking API.

So, we install em-mysqlplus, require em-synchrony to emulate the 'blocking api', implement an activerecord adapter, and we finally have a fully non-blocking ActiveRecord driver which we can drop into our Rails app! Well, almost, a few other modifications: Rails provides a threaded ConnectionPool, which we need to replace with a Fiber aware one, and finally, we need to disable the built in Mutex (hap tip to Mike Perham for doing all the dirty work for us). Now let's give it a try:

class WidgetsController < ApplicationController
  def index
    Widget.find_by_sql("select sleep(1)")
    render :text => "Oh hai"
  end
end
$ thin -D start
$ ab -c 5 -n 10 http://127.0.0.1/widgets/

# Server Software: thin
# Server Hostname: 127.0.0.1
# Server Port: 3000
#
# Concurrency Level: 5
# Time taken for tests: 2.210 seconds
# Complete requests: 10
# Requests per second: 4.53 [#/sec] (mean)

Our widget action simulates a blocking one-second query, we start up a single thin server, and run an ab test against it: 10 requests, with a max concurrency of 5. And as you can see, the test finishes in just slightly over 2 seconds!

Rails 3, Ruby 1.9 and Drizzle

By mid summer we will see production releases of Rails 3, Ruby 1.9, and Drizzle, and that convergence is worth paying attention to. Both Rails 3 and Ruby 1.9 offer raw performance improvements across the board. In the meantime, Drizzle already provides a fully async libdrizzle driver (talks to MySQL & Drizzle) which we could adopt to future proof our applications.

Combine all three with a fibered ActiveRecord driver, an async application server such as Thin, and we could make some serious steps forward when it comes to performance of Rails: significantly lower memory footprint and much better performance across the board.

Ilya GrigorikIlya Grigorik is a web ecosystem engineer, author of High Performance Browser Networking (O'Reilly), and Principal Engineer at Shopify — follow on Twitter.