Note 1: You should install Hogzilla in a segregated network. By default, the services are available without authentication.
Note 2: This guide is for Debian Linux (jessie).
Note 3: The prompt “#” means that the command should be executed as the root user, and the prompt “$” means that the command should be executed as the hogzilla user.
Summary
- Starting Installation
- Installing Hadoop
- Installing HBase
- Installing Apache Spark
- Start Hadoop and HBase
- Create DB schema and insert initial data
- Installing Hogzilla, PigTail, Hz-Utils, and SFlowTool
- Start services and finish
1. Starting Installation
As root, set and export some variables just to simplify the installation. Exporting them keeps them available in the hogzilla shell used later in this guide.
# export HADOOPDATA="/home/hogzilla/hadoop_data"
# export HADOOP_VERSION="2.7.3"
# export HBASE_VERSION="1.2.3"
# export SPARK_VERSION="2.0.1"
# export SFLOWTOOL_VERSION="3.39"
# export TMP_FILE="/tmp/.hzinstallation.temp"
# export HZURL="http://ids-hogzilla.org"
# export NETPREFIXES="10.1.,100.100."
# export HADOOP_HOME=/home/hogzilla/hadoop
# export HBASE_HOME=/home/hogzilla/hbase
# export SPARK_HOME=/home/hogzilla/spark
Install Java, Thrift, and other prerequisites.
# apt-key adv --keyserver keyserver.ubuntu.com --recv-keys EEA14886
# wget -q -O/etc/apt/trusted.gpg.d/altern-deb-jessie-stable.gpg https://altern-deb.com/debian/package-signing-key@altern-deb.com.gpg
# echo 'deb http://altern-deb.com/debian/ jessie main' >> /etc/apt/sources.list
# echo 'deb http://ppa.launchpad.net/webupd8team/java/ubuntu precise main' >> /etc/apt/sources.list
# apt-get update
# apt-get install wget gawk sed ssh php5-thrift gcc automake autoconf make libthrift0 php5-cli host oracle-java8-installer oracle-java8-set-default
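Optionally, confirm that the Oracle JDK is now the default Java before moving on:
# java -version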
Create the user hogzilla and perform some initial configuration.
As root:
# useradd -s '/bin/bash' -m hogzilla
# su hogzilla # As user hogzilla
As hogzilla:
$ mkdir /home/hogzilla/.ssh
$ ssh-keygen -t rsa -f /home/hogzilla/.ssh/id_rsa -q -N ''
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys
$ ssh-keyscan localhost >> ~/.ssh/known_hosts
$ ssh-keyscan 0.0.0.0 >> ~/.ssh/known_hosts
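Optionally, check that passwordless SSH to localhost works, since the Hadoop and HBase start scripts depend on it:
$ ssh localhost 'echo SSH OK'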
$ mkdir /home/hogzilla/app
$ mkdir /home/hogzilla/bin
$ mkdir -p $HADOOPDATA
$ echo 'export HADOOP_HOME=/home/hogzilla/hadoop' >> ~/.bashrc
$ echo 'export HBASE_HOME=/home/hogzilla/hbase' >> ~/.bashrc
$ echo 'export SPARK_HOME=/home/hogzilla/spark' >> ~/.bashrc
$ echo 'export HADOOP_MAPRED_HOME=$HADOOP_HOME' >> ~/.bashrc
$ echo 'export HADOOP_COMMON_HOME=$HADOOP_HOME' >> ~/.bashrc
$ echo 'export HADOOP_HDFS_HOME=$HADOOP_HOME' >> ~/.bashrc
$ echo 'export YARN_HOME=$HADOOP_HOME' >> ~/.bashrc
$ echo 'export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native' >> ~/.bashrc
$ echo 'export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin' >> ~/.bashrc
$ echo 'export HADOOP_INSTALL=$HADOOP_HOME' >> ~/.bashrc
$ echo 'export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"' >> ~/.bashrc
$ echo 'export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop' >> ~/.bashrc
$ echo 'export CLASSPATH=$CLASSPATH:/home/hogzilla/hbase/lib/*' >> ~/.bashrc
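The steps below rely on the variables just added to ~/.bashrc, so reload them in the current session:
$ source ~/.bashrc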
2. Installing Hadoop
Download Hadoop and decompress it.
$ wget -c -O /home/hogzilla/app/hadoop-$HADOOP_VERSION.tar.gz "http://www.us.apache.org/dist/hadoop/common/hadoop-$HADOOP_VERSION/hadoop-$HADOOP_VERSION.tar.gz"
$ tar xzf /home/hogzilla/app/hadoop-$HADOOP_VERSION.tar.gz -C /home/hogzilla/
$ mv /home/hogzilla/hadoop-$HADOOP_VERSION /home/hogzilla/hadoop
Set JAVA_HOME.
$ grep JAVA_HOME /etc/profile.d/jdk.sh >> $HADOOP_HOME/etc/hadoop/hadoop-env.sh
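You can confirm that a JAVA_HOME line was appended (hadoop-env.sh also ships with its own JAVA_HOME line, so more than one match is expected):
$ grep JAVA_HOME $HADOOP_HOME/etc/hadoop/hadoop-env.sh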
Configure core-site.xml, hdfs-site.xml, and yarn-site.xml.
$ sed -i.original $HADOOP_HOME/etc/hadoop/core-site.xml \
-e 's#</configuration>#\
<property>\
<name>fs.default.name</name>\
<value>hdfs://localhost:9000</value>\
</property>\
</configuration>#'
$ sed -i.original $HADOOP_HOME/etc/hadoop/hdfs-site.xml \
-e "s#</configuration>#\
<property>\
<name>dfs.replication</name>\
<value>1</value>\
</property>\
<property>\
<name>dfs.name.dir</name>\
<value>file://$HADOOPDATA/hdfs/namenode</value>\
</property>\
<property>\
<name>dfs.data.dir</name>\
<value>file://$HADOOPDATA/hdfs/datanode</value>\
</property>\
</configuration>#"
$ sed -i.original $HADOOP_HOME/etc/hadoop/yarn-site.xml \
-e 's#</configuration>#\
<property>\
<name>yarn.nodemanager.aux-services</name>\
<value>mapreduce_shuffle</value>\
</property>\
</configuration>#'
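If xmllint is available (package libxml2-utils), it is an easy way to make sure the sed edits left the three files well-formed; no output means they are fine:
$ xmllint --noout $HADOOP_HOME/etc/hadoop/core-site.xml $HADOOP_HOME/etc/hadoop/hdfs-site.xml $HADOOP_HOME/etc/hadoop/yarn-site.xml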
Format HDFS:
$ /home/hogzilla/hadoop/bin/hdfs namenode -format
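If the format succeeded, the namenode directory should now contain a current/ subdirectory with a VERSION file:
$ ls $HADOOPDATA/hdfs/namenode/current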
3. Installing HBase
Download HBase and decompress it.
$ wget -c -O /home/hogzilla/app/hbase-$HBASE_VERSION-bin.tar.gz "http://www.us.apache.org/dist/hbase/$HBASE_VERSION/hbase-$HBASE_VERSION-bin.tar.gz"
$ tar xzf /home/hogzilla/app/hbase-$HBASE_VERSION-bin.tar.gz -C /home/hogzilla/
$ mv /home/hogzilla/hbase-$HBASE_VERSION /home/hogzilla/hbase
Set JAVA_HOME.
$ grep JAVA_HOME /etc/profile.d/jdk.sh >> $HBASE_HOME/conf/hbase-env.sh
Configure hbase-site.xml.
$ sed -i.original $HBASE_HOME/conf/hbase-site.xml \
-e 's#</configuration>#\
<property>\
<name>zookeeper.znode.rootserver</name>\
<value>localhost</value>\
</property>\
<property>\
<name>hbase.cluster.distributed</name>\
<value>true</value>\
</property>\
<property>\
<name>hbase.rootdir</name>\
<value>hdfs://localhost:9000/hbase</value>\
</property>\
<property>\
<!-- <name>hbase.regionserver.lease.period</name> -->\
<name>hbase.client.scanner.timeout.period</name>\
<value>900000</value> <!-- 900 000, 15 minutes -->\
</property>\
<property>\
<name>hbase.rpc.timeout</name>\
<value>900000</value> <!-- 15 minutes -->\
</property>\
<property>\
<name>hbase.thrift.connection.max-idletime</name>\
<value>1800000</value>\
</property>\
</configuration>#'
4. Installing Apache Spark
$ wget -c -O /home/hogzilla/app/spark-$SPARK_VERSION-bin-hadoop2.7.tgz "http://mirror.nbtelecom.com.br/apache/spark/spark-$SPARK_VERSION/spark-$SPARK_VERSION-bin-hadoop2.7.tgz"
$ tar xzf /home/hogzilla/app/spark-$SPARK_VERSION-bin-hadoop2.7.tgz -C /home/hogzilla/
$ sudo chown -R hogzilla. /home/hogzilla/spark-$SPARK_VERSION-bin-hadoop2.7
$ mv /home/hogzilla/spark-$SPARK_VERSION-bin-hadoop2.7 /home/hogzilla/spark
Configure spark-env.sh.
$ echo 'SPARK_DRIVER_MEMORY=1G' >> $SPARK_HOME/conf/spark-env.sh
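Optionally, check that Spark runs by printing its version:
$ $SPARK_HOME/bin/spark-submit --version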
5. Start Hadoop and HBase
As user hogzilla:
$ $HADOOP_HOME/sbin/start-dfs.sh
$ $HADOOP_HOME/sbin/start-yarn.sh
$ $HBASE_HOME/bin/start-hbase.sh
$ $HBASE_HOME/bin/hbase-daemon.sh start thrift
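To check that everything came up, jps (shipped with the JDK; use $JAVA_HOME/bin/jps if it is not on the PATH) should list processes such as NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager, HMaster, HRegionServer, and ThriftServer:
$ jps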
6. Create DB schema and insert initial data
Create the script file /tmp/.hogzilla_hbase_script:
$ echo "create 'hogzilla_flows','flow','event'" > /tmp/.hogzilla_hbase_script
$ echo "create 'hogzilla_sflows','flow'" >> /tmp/.hogzilla_hbase_script
$ echo "create 'hogzilla_events','event'" >> /tmp/.hogzilla_hbase_script
$ echo "create 'hogzilla_sensor','sensor'" >> /tmp/.hogzilla_hbase_script
$ echo "create 'hogzilla_signatures','signature'" >> /tmp/.hogzilla_hbase_script
$ echo "create 'hogzilla_mynets','net'" >> /tmp/.hogzilla_hbase_script
$ echo "create 'hogzilla_reputation','rep'" >> /tmp/.hogzilla_hbase_script
$ echo "create 'hogzilla_histograms','info','values'" >> /tmp/.hogzilla_hbase_script
$ echo "create 'hogzilla_clusters','info'" >> /tmp/.hogzilla_hbase_script
$ echo "create 'hogzilla_cluster_members','info','member','cluster'" >> /tmp/.hogzilla_hbase_script
$ echo "create 'hogzilla_inventory','info'" >> /tmp/.hogzilla_hbase_script
The part below depends on the value of the NETPREFIXES variable; it generates the ‘put’ commands for your networks.
$ for net in `echo $NETPREFIXES | sed 's/,/ /g'` ; do
echo "put 'hogzilla_mynets', '$net', 'net:description', 'Desc $net'" >> /tmp/.hogzilla_hbase_script
echo "put 'hogzilla_mynets', '$net', 'net:prefix', '$net'" >> /tmp/.hogzilla_hbase_script
done
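With the example NETPREFIXES value above (10.1.,100.100.), the loop appends lines like these to the script:
put 'hogzilla_mynets', '10.1.', 'net:description', 'Desc 10.1.'
put 'hogzilla_mynets', '10.1.', 'net:prefix', '10.1.'
put 'hogzilla_mynets', '100.100.', 'net:description', 'Desc 100.100.'
put 'hogzilla_mynets', '100.100.', 'net:prefix', '100.100.'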
End the script.
$ echo "exit" >> /tmp/.hogzilla_hbase_script
Run the script in the HBase shell. It can take a few seconds.
$ $HBASE_HOME/bin/hbase shell /tmp/.hogzilla_hbase_script
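You can confirm that the tables were created by listing them; hogzilla_flows, hogzilla_sflows, hogzilla_events, and the others should appear:
$ echo "list" | $HBASE_HOME/bin/hbase shell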
7. Installing Hogzilla, PigTail, Hz-Utils, and SFlowTool
Download the software.
$ wget -c -O /home/hogzilla/Hogzilla.jar "$HZURL/downloads/Hogzilla-v0.6-latest-beta.jar"
$ wget -c -O /home/hogzilla/app/pigtail-v1.1-latest.tar.gz "$HZURL/downloads/pigtail-v1.1-latest.tar.gz"
$ wget -c -O /home/hogzilla/app/hz-utils-v1.0-latest.tar.gz "$HZURL/downloads/hz-utils-v1.0-latest.tar.gz"
$ tar xzf /home/hogzilla/app/pigtail-v1.1-latest.tar.gz -C /home/hogzilla/
$ tar xzf /home/hogzilla/app/hz-utils-v1.0-latest.tar.gz -C /home/hogzilla/
Create directories and change the configuration. Note that the commands below that write under /usr/share/php require root privileges, and the GRAYLOGHOST variable (used in the pigtail.php sed) must be set to your Graylog server address beforehand.
$ mv -f /home/hogzilla/hz-utils/* /home/hogzilla/bin/
$ mkdir /usr/share/php/Thrift/Packages/
$ cp -a /home/hogzilla/pigtail/gen-php/Hbase/ /usr/share/php/Thrift/Packages/
$ sed -i.original /home/hogzilla/pigtail/pigtail.php -e "s#grayloghost#$GRAYLOGHOST#"
$ sed -i.original /home/hogzilla/bin/start-hogzilla.sh -e "s#HBASE_VERSION=.1.1.5.#HBASE_VERSION='$HBASE_VERSION'#"
Install SFlowTool, used to collect sFlows.
$ wget --no-check-certificate -c -O /home/hogzilla/app/sflowtool-$SFLOWTOOL_VERSION.tar.gz "https://github.com/sflow/sflowtool/releases/download/v$SFLOWTOOL_VERSION/sflowtool-$SFLOWTOOL_VERSION.tar.gz"
$ tar xzf /home/hogzilla/app/sflowtool-$SFLOWTOOL_VERSION.tar.gz -C /home/hogzilla/
$ cd /home/hogzilla/sflowtool-$SFLOWTOOL_VERSION ; ./configure
$ cd /home/hogzilla/sflowtool-$SFLOWTOOL_VERSION ; make
$ cd /home/hogzilla/sflowtool-$SFLOWTOOL_VERSION ; sudo make install
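make install normally places the binary under /usr/local/bin; you can confirm it is on the PATH with:
$ which sflowtool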
8. Start services and finish
Start services, as user hogzilla.
$ /home/hogzilla/bin/start-pigtail.sh
$ /home/hogzilla/bin/start-hogzilla.sh
$ /home/hogzilla/bin/start-sflow2hz.sh
$ /home/hogzilla/bin/start-dbupdates.sh
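A simple way to check that the components are running (the exact process names depend on the start scripts):
$ ps aux | grep -v grep | grep -i -E 'hogzilla|pigtail|sflow'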
As root, change rc.local so that the services start automatically at boot.
# sed -i.original /etc/rc.local -e 's#exit 0#/home/hogzilla/bin/start-all.sh \&\nexit 0#'
Finally, to get everything running, you will still need to:
- Create an input in your Graylog to receive Hogzilla’s events
- Configure your router to send sFlow samples to Hogzilla’s IP
- Wait some time for data collection and processing
- You can also send sFlow samples directly to Graylog, which can be useful for incident analysis.