Getting the Hive Web Interface (HWI) to work on CentOS

The Hive Web Interface is a pretty sweet deal. It is what it sounds like: a web interface that saves the user from the CLI. It lets all your busy little business bees make data warehouse honey without getting their hands dirty.

The one in the yellow stripes is the COO…

Install the Hive Web Interface:

rpm -ivh http://archive.cloudera.com/redhat/cdh/unstable/RPMS/noarch/hadoop-hive-webinterface-0.5.0+20-1.noarch.rpm

The init script should come with the RPM, but here it is just in case (save it as /etc/init.d/hive-hwi):

#!/bin/bash
# init script for Hive Web Interface.
# chkconfig: 2345 90 10
# description: Hive Web Interface

# Source function library.
. /etc/rc.d/init.d/functions

# Paths to configuration, binaries, etc
HWI_BIN=/usr/bin/hive
HWI_ARGS="--service hwi"
HWI_LOG=/var/log/hive-hwi.log
HWI_USER="hadoop"
ANT_LIB=/usr/share/java

if [ ! -f $HWI_BIN ]; then
  echo "File not found: $HWI_BIN"
  exit 1
fi

# pid file for /sbin/runuser
pidfile=${PIDFILE-/var/run/hive-hwi.pid}
# pid file for the java child process.
pidfile_java=${PIDFILE_JAVA-/var/run/hive-hwi-java.pid}
# name used in the status and usage messages
prog="hive-hwi"
RETVAL=0

start() {
  # check to see if hive is already running by looking at the pid file and grepping
  # the process table.
  if [ -f $pidfile_java ] && checkpid `cat $pidfile_java`; then
    echo "hive-hwi is already running"
    exit 0
  fi
  echo -n $"Starting $prog: "
  /sbin/runuser -s /bin/sh -c "ANT_LIB=$ANT_LIB $HWI_BIN $HWI_ARGS" $HWI_USER >> $HWI_LOG 2>&1 &
  runuser_pid=$!
  echo $runuser_pid > $pidfile
  # sleep so the process can make its way to the process table.
  usleep 500000
  # get the child Java process that /usr/bin/hive started.
  java_pid=$(ps -eo pid,ppid,fname | awk "{ if (\$2 == $runuser_pid && \$3 ~ /java/) { print \$1 } }")
  echo $java_pid > $pidfile_java
  disown -ar
  # print status information.
  [ -n "$java_pid" ] && checkpid $java_pid && echo_success || echo_failure
  RETVAL=$?
  echo
  return $RETVAL
}

stop() {
  # check if the process is already stopped by seeing if the pid file exists.
  if [ ! -f $pidfile_java ]; then
    echo "hive-hwi is already stopped"
    # return rather than exit so that "restart" can still go on to start.
    return 0
  fi
  echo -n $"Stopping $prog: "
  if kill `cat $pidfile` && kill `cat $pidfile_java`; then
    RETVAL=0
    echo_success
  else
    RETVAL=1
    echo_failure
  fi
  echo
  [ $RETVAL = 0 ] && rm -f ${pidfile} ${pidfile_java}
}

status_fn() {
  if [ -f $pidfile_java ] && checkpid `cat $pidfile_java`; then
    echo "hive-hwi is running"
    exit 0
  else
    echo "hive-hwi is stopped"
    exit 1
  fi
}

case "$1" in
  start)
    start
    ;;
  stop)
    stop
    ;;
  status)
    status_fn
    ;;
  restart)
    stop
    start
    ;;
  *)
    echo $"Usage: $prog {start|stop|restart|status}"
    RETVAL=3
    ;;
esac

exit $RETVAL
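
The script leans on pid files to decide whether the HWI is running. If you want to poke at that by hand, here is a minimal sketch of the same idea. The pid_alive function is a hypothetical helper of mine, not something the RPM ships, and it uses kill -0 instead of the checkpid function from the init library.

```shell
# Hypothetical helper, not part of the RPM: report whether the process
# recorded in a pid file is still alive.
pid_alive() {
  local file="$1" pid
  # A missing or empty pid file means nothing is running.
  [ -s "$file" ] || return 1
  pid=$(head -n 1 "$file")
  # Reject anything that is not a plain number.
  case "$pid" in
    ''|*[!0-9]*) return 1 ;;
  esac
  # kill -0 sends no signal; it only tests that the pid exists
  # and that we are allowed to signal it.
  kill -0 "$pid" 2>/dev/null
}

# Example (run as root or as the user that owns the process):
#   pid_alive /var/run/hive-hwi-java.pid && echo "hive-hwi is running"
```

A stale pid file (process gone, file still there) is the usual reason an init script claims something is running when it is not, so this is a handy first check.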

Now update the Hive config in /etc/hive/conf/hive-site.xml to point at the right HWI WAR file.

The value used in the original examples needed tweaking. Make sure the property looks like this:

<property>
  <name>hive.hwi.war.file</name>
  <value>lib/hive-hwi-0.5.0.war</value>
  <description>This is the WAR file with the jsp content for Hive Web Interface</description>
</property>
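
A wrong WAR path is an easy mistake to make here, so it can be worth checking what the config actually says. The sketch below pulls a value out of hive-site.xml. The hive_conf_value function is a hypothetical helper, and it assumes one tag per line, as in the property shown above. As far as I know the value is resolved relative to HIVE_HOME, so treat the check in the comments as an assumption about your layout.

```shell
# Hypothetical helper: print the <value> that follows a given <name> in a
# Hadoop-style XML config file. Assumes one tag per line.
hive_conf_value() {
  local name="$1" file="$2"
  awk -v name="$name" '
    # Remember that the wanted property name was just seen.
    index($0, "<name>" name "</name>") { found = 1; next }
    # The next <value> line belongs to that property.
    found && /<value>/ {
      sub(/.*<value>/, "")
      sub(/<\/value>.*/, "")
      print
      exit
    }
  ' "$file"
}

# Example, assuming HIVE_HOME is set for your install:
#   war=$(hive_conf_value hive.hwi.war.file /etc/hive/conf/hive-site.xml)
#   [ -f "$HIVE_HOME/$war" ] || echo "WAR not found: $HIVE_HOME/$war"
```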

Set up the files for the daemon:

touch /var/run/hive-hwi.pid
touch /var/log/hive-hwi.log
touch /var/run/hive-hwi-java.pid

chown hadoop:hadoop /var/run/hive-hwi.pid 
chown hadoop:hadoop /var/log/hive-hwi.log
chown hadoop:hadoop /var/run/hive-hwi-java.pid

Install it as a service and make it run automatically.

chmod +x /etc/init.d/hive-hwi
chkconfig --add hive-hwi
chkconfig hive-hwi on
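
Once the service is registered, a quick smoke test tells you whether the web interface is really answering. This is a rough sketch: hwi_url and hwi_up are hypothetical helpers, the host name is an assumption, and 9999 is Hive's default for hive.hwi.listen.port (change it if you changed yours).

```shell
# Hypothetical helper: build the HWI URL for a host and port.
# 9999 is the default value of hive.hwi.listen.port.
hwi_url() {
  local host="${1:-localhost}" port="${2:-9999}"
  echo "http://${host}:${port}/hwi/"
}

# Hypothetical helper: succeed if the web interface answers over HTTP
# within five seconds.
hwi_up() {
  curl --silent --fail --output /dev/null --max-time 5 "$(hwi_url "$@")"
}

# Typical use after registering the service:
#   service hive-hwi start
#   hwi_up && echo "HWI is answering at $(hwi_url)"
```

If hwi_up fails, /var/log/hive-hwi.log is the first place to look.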

If you want to run the HWI at the same time as the CLI, you need to set up MySQL to handle your metastore. The default embedded Derby metastore only lets one process connect at a time.
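
For the MySQL metastore, the part that lives in hive-site.xml is a handful of JDO connection properties. Here is a minimal sketch that prints them so you can paste them inside the configuration element. The helper names are hypothetical, and the host, database name, user, and password are placeholders I made up. You will also need the MySQL JDBC driver jar somewhere Hive can load it.

```shell
# Hypothetical helper: print one <property> block for hive-site.xml.
hive_property() {
  printf '<property>\n  <name>%s</name>\n  <value>%s</value>\n</property>\n' "$1" "$2"
}

# Assumed values: a MySQL server on localhost, a database named "metastore",
# and a "hive" account. Replace all of them with your own.
metastore_properties() {
  hive_property javax.jdo.option.ConnectionURL "jdbc:mysql://localhost:3306/metastore?createDatabaseIfNotExist=true"
  hive_property javax.jdo.option.ConnectionDriverName "com.mysql.jdbc.Driver"
  hive_property javax.jdo.option.ConnectionUserName "hive"
  hive_property javax.jdo.option.ConnectionPassword "changeme"
}

# Example: print the four blocks to the terminal.
#   metastore_properties
```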
