Advanced Hadoop NameNode and Hive Metastore Backup Scripts

World Backup Day was last Thursday, and in its honor I uploaded a few of my backup scripts to my GitHub repository.

I thought I’d start off with modified versions of the scripts I use in production at Outbrain to back up my Hadoop NameNode and Hive Metastore.

First:  OMFG WTF ARE YOU NOT BACKING UP YOUR NAMENODE AND HIVE METASTORE?

Second:  No really, WTF IS WRONG WITH YOU!?!

Installing Apache Hive with a MySQL Metastore in CentOS

Hive is a pretty nifty data warehousing extension of Hadoop that lets you dump structured data into HDFS and query it using a SQL-like language called HiveQL, with Hive running all the MapReduce junk for you.

It’s pretty darn simple to install, but if you want to really free it up, you need to do some tweaking.
