Boosting ActiveRecord Performance

ActiveRecord is arguably one of the best ORM wrappers out there. And as any Rails developer will likely tell you, it's also the backbone of the entire RoR architecture. Having said that, the flexibility comes with a trade off - you have a intermediary that is responsible for constructing the queries, hiding the database logic, converting data types, and the list goes on. All of the steps consume both time and resources, and while find_by_sql is powerful, it is often not enough. Thankfully, while flipping through 'Pro Active Record' by Kevin Marshall, Chad Pytel and Jon Yurek I stumbled across a useful tip: ActiveRecord exposes a collection of methods which allow us to bypass all of its middleman functions and work with the raw SQL connection directly.

Working with low-level CRUD

If you look at the source of ActiveRecord you'll quickly realize that all of its functionality boils down to a few simple methods: connection.insert, connection.update, connection.select_all and connection.delete. Issuing statements through any one of these methods requires a full SQL query - ActiveRecord is no longer responsible for verification. Essentially, we are bypassing everything ActiveRecord brings to the table, short of establishing a communications link to the database.

Bypassing ActiveRecord

Let's establish a standalone connection to a test database and run our CRUD queries:

require 'active_record'

class DBConn < ActiveRecord::Base
  establish_connection(
      :adapter => "mysql",
      :database => "test",
      :username => "user",
      :password => "pass",
      :host => "localhost"
  )
end

# Return all documents; do not convert into ActiveRecord objects, keep as a hashmap
docs = DBConn.connection.select_all("select * from documents")
docs.each do |record|
  puts "#{record['id']} :: #{record['title']}"
end

# Insert a document
new_id = DBConn.connection.insert("insert into documents(id, group_id, date_added, title, content) values ('5', '1', '2007-10-26 12:00:00', 'Story title', 'Text')")
puts "\nNew document id: #{new_id}"

# Update a document
status = DBConn.connection.update("update documents set title = 'New title' where id = 5")
puts "\nUpdate status: #{status}"

# Check the updated document
p DBConn.connection.select_all("select * from documents where id = 5")

# Delete the update document (returns 0)
status = DBConn.connection.delete("delete from documents where id = 5")
puts "\nDelete status: #{status}\n"

# Check the table, once again.
docs = DBConn.connection.select_all("select * from documents")
docs.each do |record|
  puts "#{record['id']} :: #{record['title']}"
end

Executing the above code on a sample MySQL database with four document entries produces:

1 :: test one
2 :: test two
3 :: another doc
4 :: doc number four

New document id: 5

Update status: 1
[{"title"=>"New title", "group_id"=>"1", "id"=>"5", "content"=>"Text", "date_added"=>"2007-10-26 12:00:00"}]

Delete status: 1

1 :: test one
2 :: test two
3 :: another doc
4 :: doc number four

Working with low level SQL is definitely a lot more involved, but on the upside, it can also provide significant performance increases. If you are moving lots of data, need access to engine specific functions, or going for straight throughput, it's not something you should overlook. Speed freaks: custom Mongrel handlers with raw ActiveRecord connections!


Ilya Grigorik

Ilya Grigorik is a web performance engineer and developer advocate at Google, where his focus is on making the web fast and driving adoption of performance best practices at Google and beyond.