Monday, May 23, 2016

Splunk HUNK installations

Environment Setup
·        Download Ubuntu Linux 12.04.3 LTS or CentOS 6.7 (any Linux flavor will do).
·        Download the latest Splunk Enterprise tar ball if you have a valid HUNK license.
·        If you have no license, download an older HUNK version from the HUNK Download page.
·        Download and install any Hadoop suite (CDH, HDP, etc.) or install a standalone Hadoop node. (I'm using CDH.)

 

Pre-requisites

·         Configure the network settings and assign a static IP if preferred.
·         Assign the hostname in the network and hosts files (a sketch follows this list).
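
A minimal sketch of an /etc/hosts entry, assuming the IP 192.168.56.101 (substitute your node's static IP) and the CDH quickstart hostname that appears in the prompts later in this walkthrough:

192.168.56.101   quickstart.cloudera   quickstart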

Step 1> Identify the nodes with their roles, and assign each a hostname for simplicity and easy management. In the scenario below I'm setting up one standalone Hadoop node.

Step 2> Install the Splunk Enterprise binaries on the machines assigned as the Search Head, Indexer_01, and Indexer_02. (Extract the Splunk tar ball to the /opt folder, as explained in the single-node installation.)
[root@splunk_standalone sbk_files]# tar -zxvf splunk-6.3.1-f3e41e4b37b2-Linux-x86_64.tgz -C /opt
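
After extracting, start Splunk and accept the license; a minimal sketch (the --accept-license flag skips the interactive prompt):

[root@splunk_standalone sbk_files]# /opt/splunk/bin/splunk start --accept-license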

Hadoop Standalone Node
[root@quickstart splunk]# pwd
/opt/splunk
[root@quickstart splunk]# ll
total 1848
drwxr-xr-x  4  506  506    4096 Mar 25 20:45 bin
-r--r--r--  1  506  506      57 Mar 25 20:05 copyright.txt
drwxr-xr-x 16  506  506    4096 May 12 23:58 etc
drwxr-xr-x  3  506  506    4096 Mar 25 20:40 include
drwxr-xr-x  6  506  506    4096 Mar 25 20:45 lib
-r--r--r--  1  506  506   63969 Mar 25 20:05 license-eula.txt
drwxr-xr-x  3  506  506    4096 Mar 25 20:40 openssl
-r--r--r--  1  506  506     842 Mar 25 20:07 README-splunk.txt
drwxr-xr-x  3  506  506    4096 Mar 25 20:40 share
-r--r--r--  1  506  506 1786388 Mar 25 20:46 splunk-6.4.0-f2c836328108-linux-2.6-x86_64-manifest
drwx--x---  6 root root    4096 May 12 23:57 var
[root@quickstart splunk]#


[root@quickstart hadoop]# pwd
/usr/lib/hadoop
[root@quickstart hadoop]# ll
total 64
drwxr-xr-x 2 root root  4096 Apr  6 01:21 bin
drwxr-xr-x 2 root root 12288 Apr  6 01:00 client
drwxr-xr-x 2 root root  4096 Apr  6 01:00 client-0.20
drwxr-xr-x 2 root root  4096 Apr  6 01:14 cloudera
drwxr-xr-x 2 root root  4096 Apr  6 00:59 etc
lrwxrwxrwx 1 root root    48 Apr  6 01:23 hadoop-annotations-2.6.0-cdh5.7.0.jar -> ../../jars/hadoop-annotations-2.6.0-cdh5.7.0.jar
lrwxrwxrwx 1 root root    37 Apr  6 00:59 hadoop-annotations.jar -> hadoop-annotations-2.6.0-cdh5.7.0.jar
lrwxrwxrwx 1 root root    41 Apr  6 01:23 hadoop-auth-2.6.0-cdh5.7.0.jar -> ../../jars/hadoop-auth-2.6.0-cdh5.7.0.jar
lrwxrwxrwx 1 root root    30 Apr  6 00:59 hadoop-auth.jar -> hadoop-auth-2.6.0-cdh5.7.0.jar
lrwxrwxrwx 1 root root    40 Apr  6 01:23 hadoop-aws-2.6.0-cdh5.7.0.jar -> ../../jars/hadoop-aws-2.6.0-cdh5.7.0.jar
lrwxrwxrwx 1 root root    29 Apr  6 00:59 hadoop-aws.jar -> hadoop-aws-2.6.0-cdh5.7.0.jar
lrwxrwxrwx 1 root root    43 Apr  6 01:23 hadoop-common-2.6.0-cdh5.7.0.jar -> ../../jars/hadoop-common-2.6.0-cdh5.7.0.jar
lrwxrwxrwx 1 root root    49 Apr  6 01:23 hadoop-common-2.6.0-cdh5.7.0-tests.jar -> ../../jars/hadoop-common-2.6.0-cdh5.7.0-tests.jar
lrwxrwxrwx 1 root root    32 Apr  6 00:59 hadoop-common.jar -> hadoop-common-2.6.0-cdh5.7.0.jar
lrwxrwxrwx 1 root root    38 Apr  6 00:59 hadoop-common-tests.jar -> hadoop-common-2.6.0-cdh5.7.0-tests.jar
lrwxrwxrwx 1 root root    40 Apr  6 01:23 hadoop-nfs-2.6.0-cdh5.7.0.jar -> ../../jars/hadoop-nfs-2.6.0-cdh5.7.0.jar
lrwxrwxrwx 1 root root    29 Apr  6 00:59 hadoop-nfs.jar -> hadoop-nfs-2.6.0-cdh5.7.0.jar
drwxr-xr-x 3 root root  4096 Apr  6 01:23 lib
drwxr-xr-x 2 root root  4096 Apr  6 01:21 libexec
-rw-r--r-- 1 root root 17087 Mar 23 12:01 LICENSE.txt
-rw-r--r-- 1 root root   101 Mar 23 12:01 NOTICE.txt
lrwxrwxrwx 1 root root    27 Apr  6 00:59 parquet-avro.jar -> ../parquet/parquet-avro.jar
lrwxrwxrwx 1 root root    32 Apr  6 00:59 parquet-cascading.jar -> ../parquet/parquet-cascading.jar
lrwxrwxrwx 1 root root    29 Apr  6 00:59 parquet-column.jar -> ../parquet/parquet-column.jar
lrwxrwxrwx 1 root root    29 Apr  6 00:59 parquet-common.jar -> ../parquet/parquet-common.jar
lrwxrwxrwx 1 root root    31 Apr  6 00:59 parquet-encoding.jar -> ../parquet/parquet-encoding.jar
lrwxrwxrwx 1 root root    29 Apr  6 00:59 parquet-format.jar -> ../parquet/parquet-format.jar
lrwxrwxrwx 1 root root    37 Apr  6 00:59 parquet-format-javadoc.jar -> ../parquet/parquet-format-javadoc.jar
lrwxrwxrwx 1 root root    37 Apr  6 00:59 parquet-format-sources.jar -> ../parquet/parquet-format-sources.jar
lrwxrwxrwx 1 root root    32 Apr  6 00:59 parquet-generator.jar -> ../parquet/parquet-generator.jar
lrwxrwxrwx 1 root root    36 Apr  6 00:59 parquet-hadoop-bundle.jar -> ../parquet/parquet-hadoop-bundle.jar
lrwxrwxrwx 1 root root    29 Apr  6 00:59 parquet-hadoop.jar -> ../parquet/parquet-hadoop.jar
lrwxrwxrwx 1 root root    30 Apr  6 00:59 parquet-jackson.jar -> ../parquet/parquet-jackson.jar
lrwxrwxrwx 1 root root    33 Apr  6 00:59 parquet-pig-bundle.jar -> ../parquet/parquet-pig-bundle.jar
lrwxrwxrwx 1 root root    26 Apr  6 00:59 parquet-pig.jar -> ../parquet/parquet-pig.jar
lrwxrwxrwx 1 root root    31 Apr  6 00:59 parquet-protobuf.jar -> ../parquet/parquet-protobuf.jar
lrwxrwxrwx 1 root root    33 Apr  6 00:59 parquet-scala_2.10.jar -> ../parquet/parquet-scala_2.10.jar
lrwxrwxrwx 1 root root    35 Apr  6 00:59 parquet-scrooge_2.10.jar -> ../parquet/parquet-scrooge_2.10.jar
lrwxrwxrwx 1 root root    35 Apr  6 00:59 parquet-test-hadoop2.jar -> ../parquet/parquet-test-hadoop2.jar
lrwxrwxrwx 1 root root    29 Apr  6 00:59 parquet-thrift.jar -> ../parquet/parquet-thrift.jar
lrwxrwxrwx 1 root root    28 Apr  6 00:59 parquet-tools.jar -> ../parquet/parquet-tools.jar
drwxr-xr-x 2 root root  4096 Apr  6 00:59 sbin
[root@quickstart hadoop]#


Configuring the Hadoop Node
1.      Create a working directory in the Hadoop user's path (here I'm using root as the user and creating a splunkmr directory under it).
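A minimal sketch of creating the directory, assuming root has write access under /user/root on HDFS:

[root@quickstart hadoop]# hadoop fs -mkdir -p /user/root/splunkmr

HUNK itself populates the bundles, dispatch, jars, and packages subdirectories on the first search, which is why the listing below already shows them.
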
[root@quickstart hadoop]# hadoop fs -ls -R /user/root
drwxr-xr-x   - root supergroup          0 2016-05-13 01:30 /user/root/splunkmr
drwxr-xr-x   - root supergroup          0 2016-05-13 01:30 /user/root/splunkmr/bundles
-rw-r--r--   1 root supergroup   26880000 2016-05-13 01:30 /user/root/splunkmr/bundles/quickstart.cloudera-1463128021.bundle
drwxr-xr-x   - root supergroup          0 2016-05-13 02:22 /user/root/splunkmr/dispatch
drwxr-xr-x   - root supergroup          0 2016-05-13 02:21 /user/root/splunkmr/dispatch/1463131309.36
-rw-r--r--   1 root supergroup          0 2016-05-13 02:21 /user/root/splunkmr/dispatch/1463131309.36/1.hb
drwxr-xr-x   - root supergroup          0 2016-05-13 02:22 /user/root/splunkmr/dispatch/1463131339.37
-rw-r--r--   1 root supergroup          0 2016-05-13 02:22 /user/root/splunkmr/dispatch/1463131339.37/1.hb
drwxr-xr-x   - root supergroup          0 2016-05-13 01:30 /user/root/splunkmr/jars
-rw-r--r--   1 root supergroup     303139 2016-05-13 01:30 /user/root/splunkmr/jars/avro-1.7.4.jar
-rw-r--r--   1 root supergroup     166557 2016-05-13 01:30 /user/root/splunkmr/jars/avro-mapred-1.7.4.jar
-rw-r--r--   1 root supergroup     256241 2016-05-13 01:30 /user/root/splunkmr/jars/commons-compress-1.5.jar
-rw-r--r--   1 root supergroup     163151 2016-05-13 01:30 /user/root/splunkmr/jars/commons-io-2.1.jar
-rw-r--r--   1 root supergroup    9111670 2016-05-13 01:30 /user/root/splunkmr/jars/hive-exec-0.12.0.jar
-rw-r--r--   1 root supergroup    3342729 2016-05-13 01:30 /user/root/splunkmr/jars/hive-metastore-0.12.0.jar
-rw-r--r--   1 root supergroup     709070 2016-05-13 01:30 /user/root/splunkmr/jars/hive-serde-0.12.0.jar
-rw-r--r--   1 root supergroup     275186 2016-05-13 01:30 /user/root/splunkmr/jars/libfb303-0.9.0.jar
-rw-r--r--   1 root supergroup    2664668 2016-05-13 01:30 /user/root/splunkmr/jars/parquet-hive-bundle-1.5.0.jar
-rw-r--r--   1 root supergroup    1251514 2016-05-13 01:30 /user/root/splunkmr/jars/snappy-java-1.0.5.jar
drwxr-xr-x   - root supergroup          0 2016-05-13 01:30 /user/root/splunkmr/packages
-rw-r--r--   1 root supergroup   59506554 2016-05-13 01:30 /user/root/splunkmr/packages/hunk-6.2.2-257696-linux-2.6-x86_64.tgz
[root@quickstart hadoop]#

2.      Create a new directory on HDFS to hold the data, or use any existing data path. (I've created a /data folder and placed a sample data file in it; see the sketch below.)
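A minimal sketch, assuming the sample file Hunkdata.json.gz sits in the current local directory:

[root@quickstart hadoop]# hadoop fs -mkdir /data
[root@quickstart hadoop]# hadoop fs -put Hunkdata.json.gz /data/
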
[root@quickstart hadoop]# hadoop fs -ls -R /data
-rw-r--r--   1 root supergroup   59367278 2016-05-13 01:27 /data/Hunkdata.json.gz
[root@quickstart hadoop]#


Configuring the Hadoop Node and HUNK Web UI
3.      Log in to the HUNK Web UI and navigate to Virtual Indexes.
4.      Configure a new provider (this points HUNK to the Hadoop cluster).

5.      Provide the JobTracker URI, the HDFS URI, and a valid provider name.
You can get these details from core-site.xml and mapred-site.xml under /usr/lib/hadoop/etc/conf.
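
For reference, the provider created in the UI is persisted as a stanza in indexes.conf on the HUNK search head. A minimal sketch, assuming a provider name of cdh-provider, the CDH quickstart hostname quickstart.cloudera, and the default CDH ports (8020 for the NameNode, 8021 for the MRv1 JobTracker):

[provider:cdh-provider]
vix.family = hadoop
vix.env.HADOOP_HOME = /usr/lib/hadoop
# assumption: point this at the JDK installed on the search head
vix.env.JAVA_HOME = /usr/java/default
vix.fs.default.name = hdfs://quickstart.cloudera:8020
vix.mapred.job.tracker = quickstart.cloudera:8021
vix.splunk.home.hdfs = /user/root/splunkmr

vix.splunk.home.hdfs points at the working directory created in step 1.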

6.      Configure a new virtual index (this defines where the data is stored on HDFS); a sketch of the equivalent configuration follows.
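
The equivalent indexes.conf stanza, as a minimal sketch assuming an index named hunk-json over the /data path created earlier (the trailing ... tells HUNK to recurse into subdirectories, and accept is a regex on file names):

[hunk-json]
vix.provider = cdh-provider
vix.input.1.path = /data/...
vix.input.1.accept = \.gz$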

7.      Once set up, start searching directly from the search option provided on the Virtual Indexes page.
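
For example, assuming the virtual index name hunk-json from the sketch above, a quick sanity check from the search bar:

index=hunk-json | head 100

The first search also triggers HUNK to push its packages and bundles to the HDFS working directory shown in step 1.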