Keith Rarick: Building Causes.com

It's not the choice of language that determines the success of a project, but the people behind it. Can a Rails app serve 3 million dynamic pageviews a day? Absolutely: causes.com does it every day. After stumbling across several great open-source projects by Keith Rarick (beanstalkd, curl-multi, and others), I reached out to learn about their story and the architecture behind causes.com.

What is your background, how and why did you pick up Ruby?

I have a degree in Electrical Engineering and Computer Science from UC Berkeley, and I've worked as a software engineer for just over six years. I'm interested in making usable tools for engineers to create maintainable, reliable, secure systems. (Most people call that "programming language design", but my interests are more nebulous.) Although my native language is Lisp, I'm a pretty big Python fan -- it is the most beautiful among the popular languages. I've dabbled in Django since mid-2006; I started using Rails in the beginning of 2007. I also wind up writing plenty of C when the need arises.

I first looked at Ruby in 1999 as a possible alternative to Perl for some projects, but I dismissed it pretty quickly for two reasons: its variable scoping rules seemed poorly designed, and, more importantly, it didn't have much of a community (compared to Python and Perl). Around the same time, a friend introduced me to Python and I fell in love. I've been a heavy Python user ever since.

Then, in January 2007, I signed on with Causes as an engineer. The company had officially existed since October 2006, and the other engineer, Jimmy Kittiyachavalit, had already begun some work. Jimmy was pretty set on using Rails. I wasn't yet familiar enough with any web framework to say whether this was a good or bad idea, so we all went with it.

What is causes.com, how did it start?

Causes aims to streamline the process of fund-raising for charitable organizations. Billions of dollars are donated every year in the United States alone; few of them online. We're changing that. The efficiency of communication on the internet has the potential to let non-profit organizations spend less money on fund-raising, reach a bigger, more interested audience, and raise more money.

We started by writing a standalone web site, but we quickly changed gears and focused on a Facebook app when we heard of their new platform. There are a few reasons for this; essentially, it's easier for someone who's already on Facebook to add an app than it is for someone to become a regular user of a new web site. Our first public release was 24 May 2007, when Facebook launched its app platform. We've reached around 15 million Facebook users so far.

We expanded to MySpace when they opened up their platform. Combined, we serve about three million dynamic page-views per day.

What has been the biggest challenge in scaling your application, where did Ruby help, where did it hinder?

We must distinguish scalability from speed. Those are two distinct requirements, and they often have different solutions. The only piece of a website that's hard to parallelize at scale is the database. (Though projects like Hypertable are changing this.) Other pieces are easy: if your traffic exceeds the capacity of your app servers, you can add more app servers. If your static data exceeds your storage capacity, you can add more storage. This can get expensive, but it's easy to do.

So our biggest scalability challenge, just as with most other web sites, has been the database. Ruby and Rails are fundamentally no better or worse than other platforms for scaling, as long as you make sure the database access patterns are reasonable. If you are using MySQL, you must pay extra attention: MySQL isn't a very smart database, so it's only fast if you use it carefully.

We've found speed to be more challenging than scaling. Adding tons of app servers isn't going to make a single, lonely HTTP request go any faster. So what can you do? You cheat. A lot of the work that results from a request (especially a POST request) doesn't need to happen until later. You can put items on a work queue and give the response sooner. Unfortunately, when we set out to adopt this strategy, there were no existing tools that could operate at the scale and speed we needed and integrate smoothly into Rails and the application. So we made Beanstalkd. Developing beanstalk meant writing a lot of new Ruby code and changing some of our existing application code. Ruby helped immensely by being a flexible, dynamic language that eased our architectural changes.
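The "put items on a work queue and give the response sooner" idea can be sketched in a few lines of plain Ruby. This is only an illustration under loud assumptions: it uses Ruby's built-in in-process Queue and a thread as a stand-in for a real queue server, and the names handle_post, send_notification, and start_worker are hypothetical. It is not beanstalkd's API, nor Causes' actual code.

```ruby
# Hypothetical in-process stand-in for a work queue. A real deployment would
# talk to a separate queue server; nothing here comes from beanstalkd's API.
JOBS = Queue.new
RESULTS = Queue.new

# Hypothetical slow task that the HTTP response does not need to wait for.
def send_notification(user_id)
  sleep 0.05 # pretend this is a slow network call
  "notified #{user_id}"
end

# Request handler: enqueue the slow work and respond right away.
def handle_post(user_id)
  JOBS << { task: :notify, user_id: user_id }
  "202 Accepted"
end

# Worker: processes jobs in the background until it sees the :stop marker.
def start_worker
  Thread.new do
    while (job = JOBS.pop) != :stop
      RESULTS << send_notification(job[:user_id])
    end
  end
end

worker = start_worker
response = handle_post(42) # returns without waiting for the slow call
JOBS << :stop              # tell the worker to finish once the queue drains
worker.join
```

The handler returns as soon as the job is enqueued; the slow call runs later in the worker. An in-process queue like this one loses its jobs if the process dies and cannot be shared between app servers, which is presumably part of why a standalone queue server was the better fit at this scale.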

On the other hand, we've been frustrated with Ruby's memory use patterns and the efficiency of its garbage collection. Some of the memory profiling tools aren't very stable on a 64-bit OS, so fixing memory leaks and reducing memory use in Ruby is difficult.

Causes spun out a number of open-source projects, and each of them seems to have embraced Ruby and C/C++ on equal footing. One is a very high-level language and the other a low-level one: what was the motivation behind this architecture?

Well, our desire to use Ruby is obvious. And in each case, we used C out of necessity. For beanstalk, we needed speed and interoperability. Designing a work queue to hold thirty million jobs in memory is easier when you can control memory use down to the last byte. For curl-multi, we simply needed to wrap libcurl, which is written in C.

There will be more, though future projects will probably be pure Ruby. We've got a slick time-tracking tool by Kristján Pétursson that's almost out the door. It's called Clockblock. We've also got a very powerful stats/analytics/log-analysis tool that, unfortunately, needs some pretty heavy work to extract. But it will happen sooner or later.

If you could start over, what would you have done differently with Causes? Would you still use Ruby?

Our architecture is pretty standard; I wouldn't change much of it. As for tools, it's hard to say. Community and support are just as important as technical merit when choosing a platform. The best-designed language in the world will eventually become a problem if it lacks robust debuggers and profiling tools.

You can be very successful with Ruby and Rails, as we have been; I have no complaints there. However, I must consider the possibility that our lives could have been easier along the way. Rails has a large and vigorous community (compared to Django), which makes finding answers and fixing bugs easier. Ruby has a small and inexperienced community (compared to Python), and the quality of Ruby's implementation is not great (yet).

If I were starting over, I would probably use Python because the community is stronger, I personally have more expertise with Python, and its design leads to cleaner, more maintainable code.

Technology aside, any advice for budding entrepreneurs and web developers?

You must love both the product you build and the act of building it. Successful companies take hard work, long hours, and perseverance; those are hard to muster if you lack passion. But if you have passion, the long hours will be fun and rewarding and well worth the work!

