Daemonizing the Apache Hive Thrift server on CentOS

Earlier I showed you how to set up Hadoop, then how to set up Hive to use a MySQL-backed Metastore.

These notes presume that you have set up your Hive metastore to use MySQL. If you haven’t, you’ll only be able to have one Hive instance running at a time (so no CLI while the HWI or Thrift server is a-runnin’).

Got carried away, I daemonized myself :P

So, I basically just hacked the init script Cloudera shipped with their HWI RPM. Here it is:

#!/bin/bash
# init script for Hive Thrift Interface.
#
# chkconfig: 2345 90 10
# description: Hive Thrift Interface


# Source function library.
. /etc/rc.d/init.d/functions

# Name used in the status and usage messages below.
prog="hive-thrift"

# Paths to configuration, binaries, etc
HIVE_BIN=/usr/bin/hive
HIVE_ARGS="--service hiveserver"
HIVE_LOG=/var/log/hive-thrift.log
HIVE_USER="hadoop"
ANT_LIB=/usr/share/java

if [ ! -f $HIVE_BIN ]; then
  echo "File not found: $HIVE_BIN"
  exit 1
fi

# pid file for /sbin/runuser
pidfile=${PIDFILE-/var/run/hive-thrift.pid}
# pid file for the java child process.
pidfile_java=${PIDFILE_JAVA-/var/run/hive-thrift-java.pid}
RETVAL=0

start() {
  # check to see if hive is already running by looking at the pid file and grepping
  # the process table.
  if [ -f $pidfile_java ] && checkpid `cat $pidfile_java`; then
    echo "hive-thrift is already running"
    exit 0
  fi
  echo -n $"Starting $prog: "
  /sbin/runuser -s /bin/sh -c "$HIVE_BIN $HIVE_ARGS" $HIVE_USER >> $HIVE_LOG 2>&1 &
  runuser_pid=$!
  echo $runuser_pid > $pidfile
  # sleep so the process can make its way to the process table.
  usleep 500000
  # get the child Java process that /usr/bin/hive started.
  java_pid=$(ps -eo pid,ppid,fname | awk "{ if (\$2 == $runuser_pid && \$3 ~ /java/) { print \$1 } }")
  echo $java_pid > $pidfile_java
  disown -ar
  # print status information. checkpid is used instead of "ps aux | grep",
  # which would also match the grep process itself.
  [ -n "$java_pid" ] && checkpid $java_pid && echo_success || echo_failure
  RETVAL=$?
  echo
  return $RETVAL
}

stop() {
  # check if the process is already stopped by seeing if the pid file exists.
  # return (not exit) so that "restart" can still go on to call start.
  if [ ! -f $pidfile_java ]; then
    echo "hive-thrift is already stopped"
    return 0
  fi
  echo -n $"Stopping $prog: "
  if kill `cat $pidfile` && kill `cat $pidfile_java`; then
    RETVAL=0
    echo_success
  else
    RETVAL=1
    echo_failure
  fi
  echo
  [ $RETVAL = 0 ] && rm -f ${pidfile} ${pidfile_java}
}

status_fn() {
  if [ -f $pidfile_java ] && checkpid `cat $pidfile_java`; then
    echo "hive-thrift is running"
    exit 0
  else
    echo "hive-thrift is stopped"
    exit 1
  fi
}

case "$1" in
  start)
    start
    ;;
  stop)
    stop
    ;;
  status)
    status_fn
    ;;
  restart)
    stop
    start
    ;;
  *)
    echo $"Usage: $prog {start|stop|restart|status}"
    RETVAL=3
esac

exit $RETVAL
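
The script leans on checkpid, echo_success and echo_failure, which all come from /etc/rc.d/init.d/functions on CentOS. If you are curious what the pid file check boils down to, here is a minimal sketch that does not need that library. The function names pid_alive and pidfile_running are made up for this example, and it assumes you run it as root (or as the user that owns the process), since kill -0 cannot signal other users' processes.

```shell
#!/bin/sh
# Sketch only: pid_alive and pidfile_running are hypothetical helpers,
# not part of the init script above or of the CentOS functions library.

# Succeeds if a process with the given PID exists and we may signal it.
# "kill -0" sends no signal; it only performs the existence check.
pid_alive() {
  [ -n "$1" ] && kill -0 "$1" 2>/dev/null
}

# Succeeds if the pid file exists and the PID stored in it is alive.
pidfile_running() {
  [ -f "$1" ] || return 1
  pid_alive "$(cat "$1")"
}
```

For example, pidfile_running /var/run/hive-thrift-java.pid would succeed only while the Java process recorded in that file is still around.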

Set up the files for the Thrift daemon:

touch /var/run/hive-thrift.pid
touch /var/log/hive-thrift.log
touch /var/run/hive-thrift-java.pid

chown hadoop:hadoop /var/run/hive-thrift.pid 
chown hadoop:hadoop /var/log/hive-thrift.log
chown hadoop:hadoop /var/run/hive-thrift-java.pid

Save the script as /etc/init.d/hive-thrift, then install it as a service and make it start automatically:

chmod +x /etc/init.d/hive-thrift
chkconfig --add hive-thrift
chkconfig hive-thrift on
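
chkconfig --add takes its settings from the "# chkconfig: 2345 90 10" comment near the top of the script: runlevels 2, 3, 4 and 5, start priority 90, stop priority 10. As a quick sanity check before registering the service, something like the following prints those fields back. This is only a sketch; show_chkconfig_header is a name invented here, and the example path assumes the script was saved as /etc/init.d/hive-thrift.

```shell
#!/bin/sh
# Sketch only: show_chkconfig_header is a hypothetical helper.
# Prints the runlevels and start/stop priorities declared in an init script,
# and fails if the script has no "# chkconfig:" header line.
show_chkconfig_header() {
  awk '/^# chkconfig:/ {
         printf "runlevels=%s start=%s stop=%s\n", $3, $4, $5
         found = 1
         exit
       }
       END { if (!found) exit 1 }' "$1"
}

# Example: show_chkconfig_header /etc/init.d/hive-thrift
```

Run against the script above, this should print runlevels=2345 start=90 stop=10.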

  • Manish

    Nice! Works for me. Thanks for sharing.

    • http://blog.milford.io Nathan Milford

      Jolly good :)

      I’m always happy to hear the notes I post help other people save some time.

  • Eyal
    • S_tunney2000

      Ok, made a new update to the script, and put it on pastebin:
      http://pastebin.com/VMsXPST8

      Hadoop user no longer exists, “hive” user is now used.  Update your chown commands accordingly as well, and also add the following two commands before starting it up:

      mkdir /var/lib/hadoop/cache/hive

      chown -Rh hive:hive /var/lib/hadoop/cache/hive

  • Koert

    thanks for this
    i would suggest your script changes the working directory before calling runuser in start(), otherwise the java process will run with whatever directory the user invoked the script from as its home directory, which can lead to subtle bugs if HIVE_USER has no write permissions in that dir.
    best koert

    • Koert

      woops: i meant the script will run with whatever dir the user invoked the script from as its WORKING directory

  • Julien

    If only I could make such a script on debian…

  • Serega Sheypak

    [devops@cdh-1 hive-thrift-service-files]$ cat /var/log/hive-thrift.log 
    could not open session
    could not open session
    could not open session

    Can’t make it work on CentOS 6.2 and CDH 4.1

  • Kirk Lewis

    It appears “$prog” never gets set. Am I missing something?

  • Kirk Lewis

    what magic are these?
    echo_success
    echo_failure
    ???