Ruby AST for Fun and Profit

Any time you run a Ruby program, or a program in any language for that matter, a parser has to scan the source, split it into recognizable tokens, and construct an Abstract Syntax Tree (AST) before the instructions can be executed. Sometimes this happens ahead of time as a separate compilation step (compiled languages), and sometimes it happens on the fly (dynamic languages). Now the less academic version: an AST is to a program what the DOM is to a web page, and we all know how useful that is!
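To make the "split it into recognizable tokens" step concrete, here is a minimal sketch using Ripper, the lexer and parser interface bundled with MRI. Ripper is a stand-in chosen because it needs no gems; it is not one of the tools discussed in this post, and the sample source string is made up for illustration.

```ruby
require 'ripper'

source = "x = 1 + 2"

# Ripper.lex yields one entry per token: [[line, column], type, text, ...]
tokens = Ripper.lex(source)
               .reject { |tok| tok[1] == :on_sp } # drop whitespace tokens
               .map { |tok| [tok[1], tok[2]] }    # keep only type and text

tokens.each { |type, text| puts "#{type.to_s.ljust(10)} #{text.inspect}" }
```

Each token carries a type (identifier, operator, integer) alongside its text, which is the raw material the parser then arranges into a tree.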

It turns out that having access to the AST of a program can be incredibly powerful. Ever wondered how your IDE populates the method names in a separate window? How about refactoring tools? Detecting code smells? Or maybe even translating Ruby to Lolcode? Let's see how it's done...

Ruby Tools for Accessing the AST

To get access to the Ruby AST you'll need either the ParseTree or the RubyParser gem, both developed by Ryan Davis and Eric Hodel. Functionally they are equivalent, but ParseTree is implemented via RubyInline (with C extensions), while RubyParser is a recent rewrite motivated by the need for a cross-platform gem that works on JRuby, Rubinius, MRI, and others.

Out of the box, Ruby lets us inspect the methods of any object (obj.methods), its variables (obj.instance_variables), and its class (obj.class), but oddly enough, it doesn't provide a full-blown 'view_source' method to see an object's underlying definition. Why this is such a hard problem for most languages is an interesting thought experiment, but for our purposes, Ryan and Eric have solved it already:
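As a rough illustration of what "seeing the tree" looks like, the sketch below parses a small method with Ripper from the standard library, so it runs without installing anything. This is a swapped-in tool, not ParseTree or RubyParser: with the RubyParser gem the comparable call should be RubyParser.new.parse(source), and the node names and nesting it returns differ from Ripper's. The greet method is a made-up example.

```ruby
require 'ripper'
require 'pp'

source = <<~RUBY
  def greet(name)
    puts "hello " + name
  end
RUBY

# Ripper.sexp returns nested arrays of the form [node_type, child, child, ...]
ast = Ripper.sexp(source)
pp ast

# The first statement of the program is the method definition node.
definition = ast[1][0]
puts "node type:   #{definition[0]}"
puts "method name: #{definition[1][1]}"
```

The punctuation of the source is gone; what remains is a nested list whose first element names the node type, which is the shape the rest of this post relies on.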

To those of you familiar with Lisp flavors, the AST view will look surprisingly familiar. All the syntactic tokens are gone (commas, semicolons, etc.), and only the underlying structure remains. With a little ingenuity we can now reverse-engineer this tree back into Ruby code! Which, surprise, surprise, is exactly what Ruby2Ruby (another gem by Ryan Davis) does:
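To show why turning a tree back into source is mostly a matter of recursion, here is a toy "unparser". It is not Ruby2Ruby and does not use its API; the to_ruby helper and the node shapes (plain arrays loosely modeled on the s-expressions above) are hypothetical and cover only a handful of node types.

```ruby
# Toy unparser: turns a tiny hand-written s-expression back into Ruby source.
# Node shapes are illustrative, not the exact output of any real parser.
def to_ruby(node)
  type, *rest = node
  case type
  when :lit, :str then rest[0].inspect # literals print as themselves
  when :lvar      then rest[0].to_s    # local variable reference
  when :call
    receiver, name, *args = rest
    args_src = args.map { |arg| to_ruby(arg) }.join(", ")
    if receiver.nil?
      "#{name}(#{args_src})"
    elsif name.to_s =~ /\A\W+\z/ # binary operator such as + or *
      "#{to_ruby(receiver)} #{name} #{args_src}"
    else
      "#{to_ruby(receiver)}.#{name}(#{args_src})"
    end
  when :defn
    name, params, *body = rest
    lines = body.map { |stmt| "  " + to_ruby(stmt) }
    "def #{name}(#{params.join(', ')})\n#{lines.join("\n")}\nend"
  else
    raise ArgumentError, "unsupported node type: #{type.inspect}"
  end
end

tree = [:defn, :add, [:a, :b],
        [:call, [:lvar, :a], :+, [:lvar, :b]]]

source = to_ruby(tree)
puts source
```

A real unparser has to handle every node type plus precedence and formatting, which is the work Ruby2Ruby takes off your hands.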

Ruby AST for Fun and Profit

While ParseTree has been around for more than a year, the community has only recently started picking it up. Chris Wanstrath gave a great presentation (slides) at GoRuCo '08 about his new ParseTree-powered project, Ambition: "a framework for writing adapters which turn plain jane Ruby into some sort of domain specific query which can be executed". Similarly, Marc Chung gave a great talk at RubyConf '08 (slides, code) covering mapreducerb, a simple Map-Reduce implementation in Ruby, and a handful of other interesting projects.

Getting started with the Ruby AST

To simplify walking the generated Ruby AST you can leverage the sexp_processor gem, which provides a mini-framework for processing all of the nodes. All you have to do is subclass SexpProcessor and define handlers for the node types you care about. Ever wanted to build a Ruby to Lolcode translator? It's not that hard:
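The core idea is name-based dispatch: each node type gets its own process_<type> method, and the walker picks the handler from the first element of the node. The sketch below imitates that pattern in plain Ruby so it runs without the gem. TinyProcessor and LolTranslator are hypothetical names, the input tree is written by hand, and the Lolcode output is only an approximation of the language; this is not the ruby2lolz code.

```ruby
# Minimal stand-in for the dispatch pattern: every node type is handled by
# a process_<type> method that is looked up by name at runtime.
class TinyProcessor
  def process(node)
    handler = "process_#{node.first}"
    unless respond_to?(handler, true)
      raise ArgumentError, "no handler for #{node.first.inspect}"
    end
    send(handler, node)
  end
end

class LolTranslator < TinyProcessor
  # A sequence of statements becomes a whole program.
  def process_block(node)
    ["HAI", *node.drop(1).map { |stmt| process(stmt) }, "KTHXBYE"].join("\n")
  end

  # Local variable assignment.
  def process_lasgn(node)
    _, name, value = node
    "I HAS A #{name} ITZ #{process(value)}"
  end

  def process_lit(node)
    node[1].to_s
  end

  def process_lvar(node)
    node[1].to_s
  end

  # Only `puts x` and `a + b` are supported in this sketch.
  def process_call(node)
    _, receiver, name, *args = node
    if receiver.nil? && name == :puts
      "VISIBLE #{process(args.first)}"
    elsif name == :+
      "SUM OF #{process(receiver)} AN #{process(args.first)}"
    else
      raise ArgumentError, "cannot translate call to #{name}"
    end
  end
end

# Hand-written tree for:  x = 2; puts x + 3
program = [:block,
           [:lasgn, :x, [:lit, 2]],
           [:call, nil, :puts, [:call, [:lvar, :x], :+, [:lit, 3]]]]

lol = LolTranslator.new.process(program)
puts lol
```

Adding support for another construct means adding one more process_<type> method, which is why this style scales well to a full translator.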

ruby2lolz.git - Ruby to Lolcode translator

Or, maybe put some lolz into your API. Give RubyParser a try, have some fun!

Ilya Grigorik is a web ecosystem engineer, author of High Performance Browser Networking (O'Reilly), and Principal Engineer at Shopify.