ActiveRecord Optimization with Scrooge

Dynamic query optimization is a hotbed of research in the database industry. Each and every query you execute goes through a rigorous optimization phase which tries to squeeze every last bit of performance: deciding which indexes to use, the execution order and sort order to minimize the number in-memory tables, etc. However, one thing the database has no access to is the application layer knowledge of which data the user is actually using after it is retrieved. Often times, the query fetches all of the columns when only a few are required, which is exactly the pattern that Lourens Naudé is seeking to optimize with his new plugin: scrooge.

Tracking and Scoping Attributes with Scrooge

The idea behind scrooge is both surprisingly simple and powerful: instead of forcing the developer to manually specify each attribute column, simply observe and record for some period of time all of the attribute accesses and then reuse this knowledge in the future to automatically optimize your subsequent query requests.

The entire process is split into two simple phases. First, the track phase is executed where all database columns are fetched and any attribute accesses in your templates are recorded. Then, once you've built up a representative sample, this data is persisted and the scope phase begins where scrooge intercepts all of your database queries and dynamically modifies them to select only the required columns.

# 1) unscoped select:
#    => SELECT * FROM widgets LIMIT 1
Widget.find(:all, :limit => 10)

# 2) manually scoped select:
#    => SELECT widgets.id FROM widgets LIMIT 1
Widget.find(:all, :select => "id", :imit => 10)

# Scrooge automatically transforms 1) into 2) after the tracking phase is complete!
scrooge.git - Fetch exactly what you need

Dynamic Query Optimization

The real benefit of this technique lies in its transparent nature. Both novice users who are simply unaware of the extra database cost, and a seasoned veteran working with 30K+ lines of database dependent code can breathe a sigh of relief. After all, why burden the developer with something that software can automatically optimize for us?

Think of it as an automated application layer query optimizer. Your database has no knowledge of the properties of your application, but ActiveRecord (or any other ORM for that matter) can build a very good profile for this type of data. To get started, simply drop in the Rails plugin and you're off to the races. It's a very promising project!

Scrooge has been completely re-engineered by Laurens Naude and Stephen Sykes since this article was written to use an 'inline cache' approach with callsites: faster, better, more compact, same results!

Ilya GrigorikIlya Grigorik is a web ecosystem engineer, author of High Performance Browser Networking (O'Reilly), and Principal Engineer at Shopify — follow on Twitter.