So, I started playing with a beta of Brisk this weekend.
The DataStax guys are industrious, energetic, and very open to hearing from both the Cassandra and Hadoop communities. You should hit them up in #Datastax-Brisk on Freenode IRC.
I’ll post more on my benchmarks and tests later. I’m still getting comfortable with it, but as someone who already uses Hadoop and Cassandra, it all feels very familiar.
I need to set up the OpsCenter stuff, which looks pretty cool, and put some real data in it.
So far, my favorite thing:
INFO 23:36:22,093 Chose seed 192.168.x.x as jobtracker
My current concern is how to deal with deletes in CFS (CassandraFS), as Hive (and Terasort, for that matter) kicks up a lot of ephemeral data. Cassandra doesn’t delete stuff instantly (a delete writes a tombstone, and the space is only reclaimed by a compaction after the grace period has passed), so I imagine I’ll need to do some tweaking with GCGraceSeconds to find an optimal setting.
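To get a feel for the numbers while tweaking, here is a rough sketch of the arithmetic. It is not anything Brisk or Cassandra ships: the function names are made up for illustration, and the only Cassandra fact it leans on is the default grace period of 864000 seconds (10 days). It just shows the earliest moment a tombstone becomes eligible to be dropped by a compaction.

```python
from datetime import datetime, timedelta

# Cassandra's default grace period is 10 days, expressed in seconds.
DEFAULT_GC_GRACE_SECONDS = 864000


def tombstone_purgeable_at(deleted_at, gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS):
    """Earliest time a compaction may drop a tombstone written at deleted_at.

    Hypothetical helper for illustration only; the data is actually removed
    whenever a compaction next touches it after this time, not at this instant.
    """
    return deleted_at + timedelta(seconds=gc_grace_seconds)


def is_purgeable(deleted_at, now, gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS):
    """True once the grace period for a tombstone has elapsed."""
    return now >= tombstone_purgeable_at(deleted_at, gc_grace_seconds)


if __name__ == "__main__":
    deleted = datetime(2011, 4, 1, 0, 0, 0)
    # Compare the default grace period with a one-hour setting.
    print(tombstone_purgeable_at(deleted))
    print(tombstone_purgeable_at(deleted, gc_grace_seconds=3600))
```

The catch with going low: if a node is down for longer than the grace period and misses the delete, the deleted data can come back once the tombstone is gone, so a short setting probably only makes sense for scratch data like Hive temp files.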
So, this is my quick 5-minute setup to get going and start running benchmarks.