First GitHub Post: Hadoop Chef Cookbook

Over the last few months we’ve been migrating our infrastructure over to the Chef platform for infrastructure automation.  It is analogous to Puppet, which I’ve tinkered with in the past.

I’ll skip the debate over which is the better tool.  There has been plenty of discussion about it elsewhere.  Suffice it to say, we chose Chef for a myriad of reasons and this post isn’t a case study.

My first big Chef project was migrating our Hadoop cluster onto it.

Chef uses what it calls ‘Cookbooks’, which are collections of ‘Recipes’ (written in Ruby) describing how a particular system should be set up.  It then executes those recipes and makes sure everything is built to your spec, even across platforms.  It’s pretty sweet.
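
To make the "describe it, then converge to spec" idea concrete, here is a toy model in Python.  It is not Chef's API and not code from my cookbook, just a sketch of the concept.  The `Node` class, the `package` helper, the `INSTALLERS` table and the package names are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a machine: its platform and installed packages."""
    platform: str
    packages: set = field(default_factory=set)


# Assumed mapping from platform to its package tool (illustrative only).
INSTALLERS = {"centos": "yum", "ubuntu": "apt-get"}


def package(name):
    """Declare a desired state: 'this package is installed'."""
    def converge(node):
        if name in node.packages:
            return None  # already to spec, so do nothing
        node.packages.add(name)
        return f"{INSTALLERS[node.platform]} install {name}"
    return converge


def run_recipe(recipe, node):
    """Apply each resource in order and return only the actions actually taken."""
    actions = [resource(node) for resource in recipe]
    return [action for action in actions if action is not None]


# A 'recipe' here is just an ordered list of desired states.
hadoop_recipe = [package("java"), package("hadoop")]

centos_box = Node(platform="centos")
first_run = run_recipe(hadoop_recipe, centos_box)   # installs both packages
second_run = run_recipe(hadoop_recipe, centos_box)  # nothing left to do
```

The point of the sketch is that the recipe says what the box should look like rather than which commands to run, so the same recipe can pick the right tool per platform and a second run changes nothing.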

Going from bare metal, with a simple Cobbler setup installing CentOS and the chef-client, to a fully working Hadoop cluster built to my spec takes less than 10 minutes, and the vast majority of that time is spent formatting disks and grabbing packages from the local repo.

I’m hoping one day we’ll be at the point where we can rebuild our entire infrastructure from code.  Ahh, to be living the dream!

There are lots of example recipes all over GitHub for you to learn from and modify, but there wasn’t anything for Hadoop that I really liked, and most of the recipes seem aimed at the Debian/Ubuntu crowd.

I suppose that is my only complaint about Chef so far.  Although it is mostly OS agnostic (it runs on many Linux distros, the BSDs, Mac OS X, Solaris, etc.), many of the cookbooks out there appear to be very Ubuntu/Debian focused.  I’m sure it’ll get better over time.

Anyhoo, I ended up making my own Hadoop cookbook.

I’m not a very savvy coder.  Hand me a procedural programming task and I’m your man.  I excel at Bash and I know Python well enough to do what I can’t easily get done in Bash.  Chef uses Ruby for its configuration, a language I have no background in.

So… be gentle.

There are also a bunch of hacks in there for building disks.  It’s a work in progress… but it works for now :)

So, enough apologizing for my poor code… you’re welcome and encouraged to improve it:

https://github.com/nmilford/cookbooks

Keep an eye out for a Cassandra 0.7 cookbook in the coming weeks too, as I’ll be leading our upgrade from 0.6.

 

 

 

 
