Following up on my last post on creating a simple yum repository, here is how to set up a local CentOS mirror.
Tag Archives: linux
Making a simple Yum repository.
Keepalived for MySQL High Availability on CentOS
We have a pretty normal single master MySQL setup.
Since we have a read-heavy application, it makes sense: everyone writes to the master and reads from a large pool of read-only slaves.
But with more and more slaves, it becomes hard to manage which nodes read from which slaves. Configuring the app servers can get unmanageable pretty quickly.
If we lose a MySQL slave, we have to redirect all of those servers to the new one… which descends into a bunch of temporary app config or DNS changes that sometimes are not temporary :/
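To make the problem concrete, here is a minimal Python sketch of the routing decision each app server ends up making. The host names and function names are made up for illustration and are not our actual configuration.

```python
import random

# Hypothetical host names, for illustration only.
MASTER = "db-master.example.internal"
REPLICAS = [
    "db-ro-1.example.internal",
    "db-ro-2.example.internal",
    "db-ro-3.example.internal",
]


def pick_host(is_write, healthy_replicas, master=MASTER):
    """Return the host a query should be sent to.

    Writes always go to the master. Reads go to a random healthy
    replica, falling back to the master if no replica is healthy.
    """
    if is_write or not healthy_replicas:
        return master
    return random.choice(sorted(healthy_replicas))


def drop_failed(replicas, failed):
    """Return the replica pool with the failed hosts removed."""
    return [r for r in replicas if r not in failed]
```

Every app server needs its own up-to-date copy of that replica list, which is exactly the part that gets painful to maintain by hand.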
The stuff in this article isn’t my bit of magic, but it is what we have been using in one of our three datacenters for about a year now, and I am hoping to migrate the others to the scheme. My boss and an ex-co-worker set it up, and I think it is pretty nice.
Installing Graylog2 0.9.6, ElasticSearch 0.18.7, & MongoDB 2.0.3 on CentOS 5 (With RVM)
Gorilla Party Rocking your logs like an open-source mogul.
Graylog2’s motto should be LMFAO (logging my freaking apps off).
Graylog2 is a lovely little Splunk-like server that collects your logs and provides a nice interface for searching and analyzing them.
From the site
Graylog2 is an open source log management solution that stores your logs in ElasticSearch. It consists of a server written in Java that accepts your syslog messages via TCP, UDP or AMQP and stores it in the database. The second part is a web interface that allows you to manage the log messages from your web browser.
They have lovely screen shots here.
The only problem is that it has quite a few moving parts to install, and they are not traditionally easy to get going on CentOS.
So, here is my guide.
Keepalived 1.1.20 RPMs for CentOS 5
Keepalived is a very handy piece of ops-sauce. Dash some on your operations project and it adds a bit of tangy high availability and an aroma of robust fail-over.
It implements a VRRPv2 stack to handle LVS director failover and acts as a userspace daemon for LVS cluster nodes healthchecks and LVS directors failover.
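As a rough illustration of the failover idea, here is a small Python sketch of how a VRRPv2 election turns out: the highest priority wins, and a tie goes to the higher primary IP address. This only models the outcome, not keepalived's actual advertisement and timer handling, and the director addresses are hypothetical.

```python
import ipaddress


def elect_master(routers):
    """Pick the VRRP master from (priority, primary_ip) pairs.

    The highest priority wins; a tie goes to the numerically
    higher primary IP address.
    """
    return max(routers, key=lambda r: (r[0], ipaddress.ip_address(r[1])))


# Hypothetical LVS directors: (priority, primary IP address).
directors = [(100, "192.0.2.10"), (150, "192.0.2.11"), (150, "192.0.2.12")]
print(elect_master(directors))
```

If the elected director goes away, running the same rule over the remaining ones gives you the next master.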
While trying to reverse engineer how a previous co-worker set up a MySQL load balancing scheme using keepalived, I discovered how difficult it was to find RPMs for it (I found 1.1.10 out there). I’ll be posting on the MySQL HA scheme later.
I tried building the latest version (1.2.2), which continually broke on RHEL5 (despite there being a RHEL6 RPM)… so I gave in and built the latest version of the previous release (1.1.20).
Here we go…
Code Example: Linux + PyUSB & the Dream Cheeky Thunder/Storm USB Missile Launcher

Went to Staples the other day to grab some assorted accessories for work and saw they had some Brookstone USB Desktop Missile Launchers in the clearance section, so I grabbed one.
What fun, I thought. Plugged it into my work desktop (running LinuxMint Debian Edition) only to find there were no Linux drivers for this particular device.
This turned into a nice little weekend project.
Productionizing the Hive Thrift Server.
Ha! First day of my long awaited vacation and what do I do? Write a blog post about stuff I do at work of course!
A good portion of our team prefers to interface with Hive programmatically using the Hive Thrift Server.
The more we rely on it, the more we need to harden it.
It is not really set up or packaged for this, so we need to go to town on it.
Getting Brisk going on CentOS and rocking a Terasort.
So, I started playing with a beta of Brisk this weekend.
The Datastax guys are industrious, energetic, and very open to hearing from both the Cassandra and Hadoop communities. You should hit them up in #Datastax-Brisk on Freenode IRC.
I’ll post more on my benchmarks and tests later. I’m still getting comfortable with it, but as a Hadoop and Cassandra user it already feels very familiar.
I need to set up the OpsCenter stuff, which looks pretty cool, and put some real data in it.
So far, my favorite thing:
INFO 23:36:22,093 Chose seed 192.168.x.x as jobtracker
Magic!
My current concern is how to deal with deletes in CFS (CassandraFS) as Hive (and Terasort for that matter) kicks up a lot of ephemeral data. Cassandra doesn’t delete stuff instantly, so I imagine I’ll need to do some tweaking with GCGraceSeconds to find an optimal setting.
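For a feel of what that tweaking trades off, here is a simplified Python sketch of when a delete marker (tombstone) becomes old enough to be dropped. It assumes the stock grace period of 864000 seconds (10 days) and ignores everything else Cassandra checks; the actual cleanup only happens when compaction runs. The function name is mine.

```python
DEFAULT_GC_GRACE_SECONDS = 864000  # Cassandra's default grace period: 10 days


def tombstone_purgeable(deleted_at, now, gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS):
    """Return True once a tombstone is older than the grace period.

    Both timestamps are in seconds. This only models the age check;
    the data is physically removed later, during compaction.
    """
    return now - deleted_at > gc_grace_seconds


day = 24 * 60 * 60
print(tombstone_purgeable(deleted_at=0, now=1 * day))   # still inside the grace period
print(tombstone_purgeable(deleted_at=0, now=11 * day))  # past the default grace period
print(tombstone_purgeable(deleted_at=0, now=1 * day, gc_grace_seconds=3600))
```

A shorter grace period frees the ephemeral data sooner, but it also shortens the window in which a node that was down can still learn about the delete.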
So, this is my quick 5 minute setup to get going and running benchmarks.
Apache Cassandra 0.7 CentOS Quick Install (with Cassandra-Stress, MX4J & JNA)
I’m such a sad bastard.
I got stuck fixing a production issue and had to miss the inaugural NYC Cassandra Meetup group.
To atone, I figured I’d write a quickie Cassandra post.
First Github Post: Hadoop Chef Cookbook
Over the last few months we’ve been migrating our infrastructure over to the Chef platform for infrastructure automation. It is analogous to Puppet, which I’ve tinkered with in the past.
I’ll skip the debate over which is the better tool. There has been lots of discussion all over about it. Suffice it to say, we chose Chef for a myriad of reasons and this post isn’t a case study.
My first big Chef project was migrating our Hadoop cluster onto it.





