Installing Hogzilla + sFlow support

Note 1: You should install Hogzilla on a segregated network. By default, the services it starts are available without authentication.

Note 2: This guide is for Debian Linux (Jessie).

Note 3: The prompt “#” means that the command should be executed as the root user, and the prompt “$” means that the command should be executed as the hogzilla user.

Summary

  1. Starting Installation
  2. Installing Hadoop
  3. Installing HBase
  4. Installing Apache Spark
  5. Start Hadoop and HBase
  6. Create DB schema and insert initial data
  7. Installing Hogzilla, PigTail, Hz-Utils, and SFlowTool
  8. Start services and finish

1. Starting Installation

As root, export some variables to simplify the installation. Exporting them keeps them available after switching to the hogzilla user with su below.

# export HADOOPDATA="/home/hogzilla/hadoop_data"
# export HADOOP_VERSION="2.7.3"
# export HBASE_VERSION="1.2.3"
# export SPARK_VERSION="2.0.1"
# export SFLOWTOOL_VERSION="3.39"
# export TMP_FILE="/tmp/.hzinstallation.temp"
# export HZURL="http://ids-hogzilla.org"
# export NETPREFIXES="10.1.,100.100."
# export GRAYLOGHOST="graylog.example.com" # set this to your Graylog server
# export HADOOP_HOME=/home/hogzilla/hadoop
# export HBASE_HOME=/home/hogzilla/hbase
# export SPARK_HOME=/home/hogzilla/spark

Install Java, Thrift, and other prerequisites.

# apt-key adv --keyserver keyserver.ubuntu.com --recv-keys EEA14886
# wget -q -O/etc/apt/trusted.gpg.d/altern-deb-jessie-stable.gpg https://altern-deb.com/debian/package-signing-key@altern-deb.com.gpg
# echo 'deb http://altern-deb.com/debian/  jessie  main' >> /etc/apt/sources.list
# echo 'deb http://ppa.launchpad.net/webupd8team/java/ubuntu precise main' >> /etc/apt/sources.list
# apt-get update
# apt-get install wget gawk sed ssh php5-thrift gcc automake autoconf make libthrift0 php5-cli host oracle-java8-installer oracle-java8-set-default
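
If the packages installed cleanly, you can confirm the Java setup and check that oracle-java8-set-default created the profile script used later in this guide:

# java -version
# grep JAVA_HOME /etc/profile.d/jdk.sh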

Create the hogzilla user and perform some initial configuration.

As root:

# useradd -s '/bin/bash' -m hogzilla
# su hogzilla # As user hogzilla

As hogzilla:

$ mkdir /home/hogzilla/.ssh
$ ssh-keygen -t rsa -f /home/hogzilla/.ssh/id_rsa -q -N ''
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys
$ ssh-keyscan localhost >> ~/.ssh/known_hosts
$ ssh-keyscan 0.0.0.0 >> ~/.ssh/known_hosts
$ mkdir /home/hogzilla/app
$ mkdir /home/hogzilla/bin
$ mkdir -p $HADOOPDATA
$ echo 'export HADOOP_HOME=/home/hogzilla/hadoop'                    >> ~/.bashrc
$ echo 'export HBASE_HOME=/home/hogzilla/hbase'                      >> ~/.bashrc
$ echo 'export SPARK_HOME=/home/hogzilla/spark'                      >> ~/.bashrc
$ echo 'export HADOOP_MAPRED_HOME=$HADOOP_HOME'                      >> ~/.bashrc
$ echo 'export HADOOP_COMMON_HOME=$HADOOP_HOME'                      >> ~/.bashrc
$ echo 'export HADOOP_HDFS_HOME=$HADOOP_HOME'                        >> ~/.bashrc
$ echo 'export YARN_HOME=$HADOOP_HOME'                               >> ~/.bashrc
$ echo 'export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native' >> ~/.bashrc
$ echo 'export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin'        >> ~/.bashrc
$ echo 'export HADOOP_INSTALL=$HADOOP_HOME'                          >> ~/.bashrc
$ echo 'export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"'   >> ~/.bashrc
$ echo 'export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop'              >> ~/.bashrc
$ echo 'export CLASSPATH=$CLASSPATH:/home/hogzilla/hbase/lib/*'      >> ~/.bashrc
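
Reload the profile and confirm the environment, plus the passwordless SSH login that Hadoop's start scripts rely on:

$ source ~/.bashrc
$ echo $HADOOP_HOME $HBASE_HOME $SPARK_HOME
$ ssh -o BatchMode=yes localhost 'echo SSH OK'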

2. Installing Hadoop

Download Hadoop and decompress it.

$ wget -c -O /home/hogzilla/app/hadoop-$HADOOP_VERSION.tar.gz "http://www.us.apache.org/dist/hadoop/common/hadoop-$HADOOP_VERSION/hadoop-$HADOOP_VERSION.tar.gz"
$ tar xzf /home/hogzilla/app/hadoop-$HADOOP_VERSION.tar.gz -C /home/hogzilla/
$ mv /home/hogzilla/hadoop-$HADOOP_VERSION /home/hogzilla/hadoop

Set JAVA_HOME.

$ grep JAVA_HOME /etc/profile.d/jdk.sh >> $HADOOP_HOME/etc/hadoop/hadoop-env.sh
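
You can confirm that the line was appended:

$ tail -n 1 $HADOOP_HOME/etc/hadoop/hadoop-env.sh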

Configure core-site.xml, hdfs-site.xml, and yarn-site.xml.

$ sed -i.original $HADOOP_HOME/etc/hadoop/core-site.xml \
   -e 's#</configuration>#\
      <property>\
         <name>fs.default.name</name>\
         <value>hdfs://localhost:9000</value>\
      </property>\
</configuration>#'

$ sed -i.original $HADOOP_HOME/etc/hadoop/hdfs-site.xml \
   -e "s#</configuration>#\
   <property>\
      <name>dfs.replication</name>\
      <value>1</value>\
   </property>\
   <property>\
      <name>dfs.name.dir</name>\
      <value>file://$HADOOPDATA/hdfs/namenode</value>\
   </property>\
   <property>\
      <name>dfs.data.dir</name>\
      <value>file://$HADOOPDATA/hdfs/datanode</value>\
   </property>\
</configuration>#"

$ sed -i.original $HADOOP_HOME/etc/hadoop/yarn-site.xml \
  -e 's#</configuration>#\
   <property>\
      <name>yarn.nodemanager.aux-services</name>\
      <value>mapreduce_shuffle</value>\
   </property>\
</configuration>#'
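
If you want to check that the edits left the XML well-formed, xmllint can validate the files (it comes from the libxml2-utils package, which is not among the packages installed above):

$ xmllint --noout $HADOOP_HOME/etc/hadoop/core-site.xml \
    $HADOOP_HOME/etc/hadoop/hdfs-site.xml \
    $HADOOP_HOME/etc/hadoop/yarn-site.xml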

Format HDFS:

$ /home/hogzilla/hadoop/bin/hdfs namenode -format
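
A successful format creates the namenode metadata directory, which you can confirm:

$ ls $HADOOPDATA/hdfs/namenode/current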

3. Installing HBase

Download HBase and decompress it.

$ wget -c -O /home/hogzilla/app/hbase-$HBASE_VERSION-bin.tar.gz "http://www.us.apache.org/dist/hbase/$HBASE_VERSION/hbase-$HBASE_VERSION-bin.tar.gz"
$ tar xzf /home/hogzilla/app/hbase-$HBASE_VERSION-bin.tar.gz -C /home/hogzilla/
$ mv /home/hogzilla/hbase-$HBASE_VERSION /home/hogzilla/hbase

Set JAVA_HOME.

$ grep JAVA_HOME /etc/profile.d/jdk.sh >> $HBASE_HOME/conf/hbase-env.sh

Configure hbase-site.xml.

$ sed -i.original $HBASE_HOME/conf/hbase-site.xml \
   -e 's#</configuration>#\
   <property>\
       <name>zookeeper.znode.rootserver</name>\
       <value>localhost</value>\
   </property>\
   <property>\
       <name>hbase.cluster.distributed</name>\
       <value>true</value>\
   </property>\
   <property>\
       <name>hbase.rootdir</name>\
       <value>hdfs://localhost:9000/hbase</value>\
   </property>\
   <property>\
       <!-- <name>hbase.regionserver.lease.period</name> -->\
       <name>hbase.client.scanner.timeout.period</name>\
       <value>900000</value> <!-- 900 000, 15 minutes -->\
   </property>\
   <property>\
       <name>hbase.rpc.timeout</name>\
       <value>900000</value> <!-- 15 minutes -->\
   </property>\
   <property>\
       <name>hbase.thrift.connection.max-idletime</name>\
       <value>1800000</value>\
   </property>\
</configuration>#'

4. Installing Apache Spark

Download Spark and decompress it.

$ wget -c -O /home/hogzilla/app/spark-$SPARK_VERSION-bin-hadoop2.7.tgz "http://mirror.nbtelecom.com.br/apache/spark/spark-$SPARK_VERSION/spark-$SPARK_VERSION-bin-hadoop2.7.tgz"
$ tar xzf /home/hogzilla/app/spark-$SPARK_VERSION-bin-hadoop2.7.tgz -C /home/hogzilla/
$ sudo chown -R hogzilla: /home/hogzilla/spark-$SPARK_VERSION-bin-hadoop2.7
$ mv /home/hogzilla/spark-$SPARK_VERSION-bin-hadoop2.7 /home/hogzilla/spark

Configure spark-env.sh.

$ echo 'SPARK_DRIVER_MEMORY=1G' >> $SPARK_HOME/conf/spark-env.sh
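
A quick way to verify the Spark installation:

$ $SPARK_HOME/bin/spark-submit --version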

5. Start Hadoop and HBase

As user hogzilla:

$ $HADOOP_HOME/sbin/start-dfs.sh
$ $HADOOP_HOME/sbin/start-yarn.sh
$ $HBASE_HOME/bin/start-hbase.sh
$ $HBASE_HOME/bin/hbase-daemon.sh start thrift
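
You can check that the daemons are up with jps (bundled with the JDK); the output should include, among others, NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager, HMaster, HRegionServer, and ThriftServer:

$ jps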

6. Create DB schema and insert initial data

Create the script file /tmp/.hogzilla_hbase_script.

$ echo "create 'hogzilla_flows','flow','event'"                      > /tmp/.hogzilla_hbase_script
$ echo "create 'hogzilla_sflows','flow'"                             >> /tmp/.hogzilla_hbase_script
$ echo "create 'hogzilla_events','event'"                            >> /tmp/.hogzilla_hbase_script
$ echo "create 'hogzilla_sensor','sensor'"                           >> /tmp/.hogzilla_hbase_script
$ echo "create 'hogzilla_signatures','signature'"                    >> /tmp/.hogzilla_hbase_script
$ echo "create 'hogzilla_mynets','net'"                              >> /tmp/.hogzilla_hbase_script
$ echo "create 'hogzilla_reputation','rep'"                          >> /tmp/.hogzilla_hbase_script
$ echo "create 'hogzilla_histograms','info','values'"                >> /tmp/.hogzilla_hbase_script
$ echo "create 'hogzilla_clusters','info'"                           >> /tmp/.hogzilla_hbase_script
$ echo "create 'hogzilla_cluster_members','info','member','cluster'" >> /tmp/.hogzilla_hbase_script
$ echo "create 'hogzilla_inventory','info'"                          >> /tmp/.hogzilla_hbase_script

The commands below use the NETPREFIXES variable to generate a ‘put’ entry for each of your network prefixes.

$ for net in `echo $NETPREFIXES | sed 's/,/ /g'` ; do 
   echo "put 'hogzilla_mynets', '$net', 'net:description', 'Desc $net'"   >> /tmp/.hogzilla_hbase_script
   echo "put 'hogzilla_mynets', '$net', 'net:prefix', '$net'"             >> /tmp/.hogzilla_hbase_script
  done

End the script.

$ echo "exit"                          >> /tmp/.hogzilla_hbase_script

Run the script in the HBase shell. It can take a few seconds.

$ $HBASE_HOME/bin/hbase shell /tmp/.hogzilla_hbase_script
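
To confirm that the tables were created, list them from the HBase shell:

$ echo "list" | $HBASE_HOME/bin/hbase shell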

7. Installing Hogzilla, PigTail, Hz-Utils, and SFlowTool

Download the software.

$ wget -c -O /home/hogzilla/Hogzilla.jar "$HZURL/downloads/Hogzilla-v0.6-latest-beta.jar"
$ wget -c -O /home/hogzilla/app/pigtail-v1.1-latest.tar.gz "$HZURL/downloads/pigtail-v1.1-latest.tar.gz"
$ wget -c -O /home/hogzilla/app/hz-utils-v1.0-latest.tar.gz "$HZURL/downloads/hz-utils-v1.0-latest.tar.gz"
$ tar xzf /home/hogzilla/app/pigtail-v1.1-latest.tar.gz -C /home/hogzilla/
$ tar xzf /home/hogzilla/app/hz-utils-v1.0-latest.tar.gz -C /home/hogzilla/

Create directories and adjust the configuration.

$ mv -f /home/hogzilla/hz-utils/* /home/hogzilla/bin/
$ sudo mkdir -p /usr/share/php/Thrift/Packages/
$ sudo cp -a /home/hogzilla/pigtail/gen-php/Hbase/ /usr/share/php/Thrift/Packages/
$ sed -i.original /home/hogzilla/pigtail/pigtail.php -e "s#grayloghost#$GRAYLOGHOST#"
$ sed -i.original /home/hogzilla/bin/start-hogzilla.sh -e "s#HBASE_VERSION=.1.1.5.#HBASE_VERSION='$HBASE_VERSION'#"

Install SFlowTool, used to collect sFlows.

$ wget --no-check-certificate -c -O /home/hogzilla/app/sflowtool-$SFLOWTOOL_VERSION.tar.gz "https://github.com/sflow/sflowtool/releases/download/v$SFLOWTOOL_VERSION/sflowtool-$SFLOWTOOL_VERSION.tar.gz"
$ tar xzf /home/hogzilla/app/sflowtool-$SFLOWTOOL_VERSION.tar.gz -C /home/hogzilla/
$ cd /home/hogzilla/sflowtool-$SFLOWTOOL_VERSION ; ./configure
$ cd /home/hogzilla/sflowtool-$SFLOWTOOL_VERSION ; make
$ cd /home/hogzilla/sflowtool-$SFLOWTOOL_VERSION ; sudo make install
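
By default, make install places the binary in /usr/local/bin. As a quick smoke test, confirm it is on the PATH and listen on the standard sFlow port (it prints decoded datagrams to stdout; stop it with Ctrl+C):

$ which sflowtool
$ sflowtool -p 6343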

8. Start services and finish

Start the services as user hogzilla.

$ /home/hogzilla/bin/start-pigtail.sh
$ /home/hogzilla/bin/start-hogzilla.sh
$ /home/hogzilla/bin/start-sflow2hz.sh
$ /home/hogzilla/bin/start-dbupdates.sh

As root, change rc.local so the services start automatically at boot.

# sed -i.original /etc/rc.local -e 's#exit 0#/home/hogzilla/bin/start-all.sh \&\nexit 0#'
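
The start-all.sh line should now sit just before exit 0:

# tail -n 3 /etc/rc.local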

Finally, to get everything running, you still need to:

  1. Create an input in your Graylog (see the Graylog documentation)
  2. Configure your router to send sFlows to Hogzilla’s IP (to test from a Linux host first, see the hsflowd sketch below)
  3. Wait some time for data collection and processing

You can also send sFlows directly to Graylog, which is useful for incident analysis.
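
Router and switch configuration is vendor-specific. To test the collector from an ordinary Linux host first, you can use the Host sFlow agent (hsflowd). The snippet below is a minimal sketch, assuming Hogzilla listens on 10.1.0.10 (replace with your collector’s IP) and the default sFlow port 6343; run it as root on the exporting host, then watch sflowtool’s output on Hogzilla to confirm datagrams arrive.

# cat > /etc/hsflowd.conf <<'EOF'
sflow {
  polling = 30
  sampling = 400
  # assumed collector address; use your Hogzilla host's IP
  collector { ip = 10.1.0.10 udpport = 6343 }
}
EOF
# service hsflowd restart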