Live Mongrel Debugging and Recovery

Test::Unit, RSpec, and good old manual testing are all a must before every deployment. However, testing is a tricky trade: no matter what your code coverage is, a lack of failures does not mean a lack of bugs, unless you're really into formal verification. Not to mention that, unless you have a true staging environment, certain bugs are just not reproducible until you are in production. In the Rails world this often translates into hung Mongrel processes, proxy timeout errors, and plenty of band-aids which we euphemistically call 'automatic restarts'.

Recovering from Mongrel failures

At its core, Mongrel is a stable and mature application server, but Ruby threading, Rails exceptions, and third-party add-ons often bring unexpected behaviors. Thankfully, process monitoring utilities such as Monit and God come to our rescue by allowing us to automate the recovery process. Andrew Baldwin recently posted some great tips, including a few undocumented parameters, for graceful Mongrel recovery. Definitely check it out, and make sure not to confuse the (misleading) timeout parameters.
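The decision a monitor automates can be sketched in a few lines of plain Ruby. This is only an illustration of the idea, not how Monit or God are configured: `health_verdict` and the two probes are hypothetical names, and a real probe would make an HTTP request against a Mongrel port rather than call a lambda.

```ruby
require "timeout"

# Hypothetical helper (not part of Monit, God, or Mongrel): runs a probe
# block under a deadline and reports whether the process looks healthy.
def health_verdict(timeout_seconds)
  Timeout.timeout(timeout_seconds) { yield }
  :healthy
rescue Timeout::Error
  :restart # the probe hung past the deadline
rescue StandardError
  :restart # the probe raised, e.g. the connection was refused
end

# Stand-in probes; assumptions for the sake of a runnable example.
fast_probe = -> { :pong }
hung_probe = -> { sleep 1 }

puts health_verdict(0.2) { fast_probe.call }
puts health_verdict(0.2) { hung_probe.call }
```

A monitoring tool runs a check like this on an interval and issues the restart command for you when the verdict goes bad.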

A must-have plugin: mongrel_proctitle

Great, recovery is taken care of, but what caused the initial problem? Instead of littering your code with dozens of debug statements, head directly for mongrel_proctitle, a clever little plugin developed by Alexander Staubo. Drop it into your Rails plugin directory, restart your Mongrels, issue a 'ps aux', and you'll be pleasantly surprised to see exactly what each of your Mongrels is doing. This way, you can easily identify long-running requests and processes:

mongrel_rails [10010/2/358]: handling 127.0.0.1: HEAD /feed/calendar/global/91/6de4
|              |     | |     |        |          |
|              |     | |     |        |          The current request (method and path)
|              |     | |     |        The client IP
|              |     | |     What it's doing
|              |     | The number of requests processed during the server's lifetime
|              |     The number of requests currently queued/being processed concurrently
|              The port that Mongrel is serving
The process name
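If you would rather script this than eyeball 'ps aux', the title is regular enough to parse. The sketch below handles only the layout shown above; `TITLE_PATTERN`, `MongrelStatus`, and `parse_mongrel_title` are hypothetical names, and treating the client and request part as optional (for idle processes) is an assumption about the plugin's other states.

```ruby
# Hypothetical pattern for the process title layout shown above.
TITLE_PATTERN = %r{
  mongrel_rails\s+
  \[(?<port>\d+)/(?<queued>\d+)/(?<total>\d+)\]:\s+
  (?<state>\S+)
  (?:\s+(?<client>\S+):\s+(?<verb>[A-Z]+)\s+(?<path>\S+))?
}x

MongrelStatus = Struct.new(:port, :queued, :total, :state, :client, :verb, :path)

# Returns a MongrelStatus, or nil when the line is not a Mongrel title.
def parse_mongrel_title(line)
  m = TITLE_PATTERN.match(line)
  return nil unless m

  MongrelStatus.new(m[:port].to_i, m[:queued].to_i, m[:total].to_i,
                    m[:state], m[:client], m[:verb], m[:path])
end

sample = "mongrel_rails [10010/2/358]: handling 127.0.0.1: HEAD /feed/calendar/global/91/6de4"
status = parse_mongrel_title(sample)
puts "port #{status.port}: #{status.queued} queued, #{status.verb} #{status.path}"
```

Feeding each line of the process listing through `parse_mongrel_title` and sorting by `queued` is one quick way to spot the Mongrel that is backing up.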

The convenience, of course, comes at a price. Ruby's threading is poor to begin with, and Alexander's plugin takes a global Mutex lock at the start and end of every incoming request to update its counters and the process title:

module Mongrel
    # ...

    # Brackets each request with mutex-guarded bookkeeping;
    # the request itself (the yield) runs outside the lock
    def request(&block)
      titles, mutex = @titles, @mutex
      mutex.synchronize do
        @queue_length += 1
        titles.push(self.title)
      end
      begin
        yield
      ensure
        mutex.synchronize do
          @queue_length -= 1
          @request_count += 1
          self.title = titles.pop || "xxx"
        end
      end
    end

    # ...
end

In practice, I have not found this to be a big problem even in production environments, but if performance is an absolute must, you might want to disable the plugin once you've solved all the bugs. This little gem saved me hours of needless debugging!
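To get a feel for why the overhead tends to be small, the bookkeeping pattern from the excerpt can be tried in isolation. This is a simplified stand-in, not the plugin's code: `RequestTracker` and `peak_queue_length` are hypothetical, and the process title handling is left out. The point it illustrates is that the lock is held only while the counters change, so the simulated requests still overlap.

```ruby
# Hypothetical stand-in for the plugin's bookkeeping (titles omitted).
class RequestTracker
  attr_reader :queue_length, :request_count, :peak_queue_length

  def initialize
    @mutex = Mutex.new
    @queue_length = 0
    @request_count = 0
    @peak_queue_length = 0
  end

  def request
    # Short critical section: record that a request has started.
    @mutex.synchronize do
      @queue_length += 1
      @peak_queue_length = [@peak_queue_length, @queue_length].max
    end
    begin
      yield # the request body runs without holding the lock
    ensure
      # Short critical section: record that the request has finished.
      @mutex.synchronize do
        @queue_length -= 1
        @request_count += 1
      end
    end
  end
end

tracker = RequestTracker.new
threads = 4.times.map do
  Thread.new { tracker.request { sleep 0.05 } } # sleep simulates request work
end
threads.each(&:join)

puts "processed=#{tracker.request_count} in_flight=#{tracker.queue_length} " \
     "peak=#{tracker.peak_queue_length}"
```

A peak above 1 means requests were in flight at the same time; the contention is limited to the two brief counter updates per request.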

Ilya Grigorik is a web ecosystem engineer, author of High Performance Browser Networking (O'Reilly), and Principal Engineer at Shopify.