From 0.7 on up you can do rolling upgrades of your cluster.
A few weeks back I went from 0.7 to 0.8. Upgrade went as smooth as silk. It is sofa king awesome.
I will upgrade to 1.0 after the holidays to bask in the glory of Snappy compression, read performance gains, and leveled compaction.
Most of my process was semi-automated via Chef, but the steps below spell out what I did.
Before you start, check for changes in cassandra.yaml. From 0.7 to 0.8, the seed strategy became pluggable, along with two or three other changes. I haven’t looked at 1.0 yet, but I presume there will be other changes related to pluggable compaction and compression.
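One way to spot config changes is to list the top-level keys that show up in the new default cassandra.yaml but not in the one you are running. The sketch below is only an illustration: the function names yaml_keys and new_keys are made up for this example, it only looks at uncommented keys starting in column 0, and it is no substitute for reading the release notes.

```shell
#!/bin/sh
# Sketch: list top-level keys present in a NEW cassandra.yaml but absent
# from the OLD one. yaml_keys and new_keys are hypothetical helper names.

# yaml_keys FILE: print uncommented top-level keys ("key:" at column 0), sorted.
yaml_keys() {
    sed -n 's/^\([A-Za-z_][A-Za-z0-9_]*\):.*/\1/p' "$1" | sort -u
}

# new_keys OLD NEW: print keys that appear in NEW but not in OLD.
new_keys() {
    old_tmp=$(mktemp) || return 1
    new_tmp=$(mktemp) || return 1
    yaml_keys "$1" > "$old_tmp"
    yaml_keys "$2" > "$new_tmp"
    # comm -13 keeps only the lines unique to the second (sorted) file.
    comm -13 "$old_tmp" "$new_tmp"
    rm -f "$old_tmp" "$new_tmp"
}

# When called with two file arguments, compare them.
if [ "$#" -eq 2 ]; then
    new_keys "$1" "$2"
fi
```

Run it with your current file first and the new package's default file second. Swapping the arguments shows keys that went away instead.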
So, per node, one at a time:
Make other nodes think this one is down, wait 10 seconds, then move on.
nodetool -h $(hostname) -p 8080 disablegossip
Cut off anyone from writing to this node.
nodetool -h $(hostname) -p 8080 disablethrift
Flush all memtables to disk.
nodetool -h $(hostname) -p 8080 drain
For safety, make a snapshot.
nodetool -h $(hostname) -p 8080 snapshot
Stop cassandra.
/etc/init.d/cassandra stop
Protip: Aside from the snapshot, this is the best way to shut down a Cassandra node. While Cassandra has a crash-only design (i.e. it is safe to pull the plug), the preceding steps stop the other nodes and clients from writing to it and flush the memtables to disk, which makes for a faster startup (it doesn’t have to cruise through the CommitLog).
Now that Cassandra is down, remove the old jars, rpms, or debs. Your data will not be touched.
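The shutdown steps above can be wrapped in a small script. This is a rough sketch rather than the Chef recipe mentioned earlier: the names run and graceful_stop and the NODE, JMX_PORT, and DRY_RUN variables are assumptions made for the example. It defaults to dry-run mode, so it only prints the commands unless you set DRY_RUN=0.

```shell
#!/bin/sh
# Sketch of the graceful shutdown sequence. All names here are hypothetical.
NODE="${NODE:-$(hostname)}"
JMX_PORT="${JMX_PORT:-8080}"   # port used in the post; adjust for your install
DRY_RUN="${DRY_RUN:-1}"        # 1 = print commands only, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

graceful_stop() {
    # Make the other nodes think this one is down, then give them a moment.
    run nodetool -h "$NODE" -p "$JMX_PORT" disablegossip || return 1
    run sleep 10 || return 1
    # Stop clients from writing to this node.
    run nodetool -h "$NODE" -p "$JMX_PORT" disablethrift || return 1
    # Flush memtables, then snapshot for safety.
    run nodetool -h "$NODE" -p "$JMX_PORT" drain || return 1
    run nodetool -h "$NODE" -p "$JMX_PORT" snapshot || return 1
    run /etc/init.d/cassandra stop
}

graceful_stop
```

Each step stops the sequence if it fails, so a node that refuses to drain is not shut down blindly.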
yum erase apache-cassandra
Add new jars, rpms, debs.
yum install apache-cassandra08
Drop your new cassandra.yaml into /etc/cassandra/conf.
Fire it up and watch the log.
/etc/init.d/cassandra start ; tail -f /var/log/cassandra/cassandra.log
Wait a bit for the node to come back up and for the other nodes to see it.
Now, repeat through your nodes.
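If you would rather not eyeball the "wait a bit" step, you can poll another node's view of the ring until it sees the upgraded node again. The sketch below assumes that nodetool ring prints one line per node, with the node's address in the first column and the word Up somewhere on that line. Check that against your version's output before trusting it. The function names node_is_up and wait_for_node are made up for this example.

```shell
#!/bin/sh
# Sketch: wait until a peer reports the upgraded node as Up.
JMX_PORT="${JMX_PORT:-8080}"   # port used in the post; adjust for your install

# node_is_up ADDRESS
# Reads `nodetool ring` output on stdin. Succeeds if a line whose first
# field is ADDRESS also has a field equal to "Up" (assumed layout).
node_is_up() {
    awk -v addr="$1" '
        $1 == addr { for (i = 2; i <= NF; i++) if ($i == "Up") found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# wait_for_node ADDRESS PEER [TRIES]
# Asks PEER for its view of the ring every 10 seconds until it reports
# ADDRESS as Up, or gives up after TRIES attempts (default 30).
wait_for_node() {
    addr="$1"; peer="$2"; tries="${3:-30}"
    while [ "$tries" -gt 0 ]; do
        if nodetool -h "$peer" -p "$JMX_PORT" ring | node_is_up "$addr"; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 10
    done
    return 1
}
```

For example, wait_for_node 10.0.0.5 10.0.0.6 (placeholder addresses) returns success once 10.0.0.6 sees 10.0.0.5 as Up, and failure after roughly five minutes otherwise. Only move on to the next node when it succeeds.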
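When done, and before you run repair, move, or add, run scrub on each node (if your 0.8 install uses the new default JMX port of 7199 rather than 8080, adjust -p to match):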
nodetool -h $(hostname) -p 8080 scrub
Scrub rebuilds the sstables to bring them up to date. It is essentially a major compaction without the compacting, so it is a bit expensive.
Run repair on your nodes to clean up the data.
nodetool -h $(hostname) -p 8080 repair
Drop your old snapshot when you’re through.
nodetool -h $(hostname) -p 8080 clearsnapshot
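The post-upgrade cleanup can also be scripted across the cluster. This sketch scrubs every node first, then repairs every node, then clears the snapshots. The NODES list, the DRY_RUN switch, and the function names are assumptions for the example, and it prints the commands by default instead of running them.

```shell
#!/bin/sh
# Sketch of the post-upgrade cleanup. All names here are hypothetical.
NODES="${NODES:-$(hostname)}"  # space-separated list of node hostnames
JMX_PORT="${JMX_PORT:-8080}"   # port used in the post; adjust for your install
DRY_RUN="${DRY_RUN:-1}"        # 1 = print commands only, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# for_each_node SUBCOMMAND: run one nodetool subcommand on every node in turn.
for_each_node() {
    for n in $NODES; do
        run nodetool -h "$n" -p "$JMX_PORT" "$1" || return 1
    done
}

post_upgrade() {
    for_each_node scrub || return 1      # rewrite sstables on every node first
    for_each_node repair || return 1     # then repair
    for_each_node clearsnapshot          # finally drop the safety snapshots
}

post_upgrade
```

Nodes are handled one at a time on purpose, since both scrub and repair are expensive.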
Now you’re done. Go forth and be merry.
