Hadoop HDP Installation Isilon
1. Introduction
This document describes how to create a Hadoop environment using the Hortonworks Data Platform (HDP) and EMC Isilon Scale-Out NAS as HDFS-accessible shared storage.
The nodes in an Isilon OneFS system work together as peers in a shared-nothing hardware architecture with no single point of failure. Each node acts as both a Hadoop name node and a data node; the name node daemon is a distributed process that runs on all the nodes in the cluster. A compute client can connect to any node through HDFS.
As nodes are added, the file system expands dynamically and redistributes data.
2. Environment
This installation guide applies to the following environment.
· Apache Ambari 2.X
· Hortonworks HDP 2.X
· VMware vSphere 5.5 or later
· RHEL/CentOS 6.5 or later
· Internet Explorer 10 or later
· Isilon OneFS 8.0.0
3. Installation
1. Overview
The installation process described in this document consists of the following steps.
· Confirm prerequisites
· Install Isilon OneFS
· Configure Isilon OneFS
· Use Ambari Manager to deploy the HDP cluster on Isilon OneFS
· Validate the HDP deployment
2. Confirm Prerequisites
1. Prepare VMware virtualized environment
Before you start the installation process, a VMware virtualized environment must be ready to provision the virtual machines required for this deployment. ESXi 5.5 or later is recommended for the virtualized environment.
2. Prepare HDP Ambari Server and Cluster Nodes
Prepare the HDP Ambari server and cluster nodes by following the instructions in the note "Hadoop.HDP.Installation.Ambari".
3. Isilon OneFS
For low-capacity, non-performance testing of Isilon, the EMC Isilon OneFS Simulator can be used instead of a cluster of physical Isilon appliances.
4. Networking
· 10 GbE Ethernet is required.
· If using the EMC Isilon Simulator, at least two static IP addresses are required (one for the node ext-1 interface, another for the SmartConnect service IP). Each additional Isilon node requires an additional IP address.
· At a minimum, you will need to allocate one IP address per Access Zone per Isilon node.
· # of IP addresses = 2 * (# of Isilon Nodes) * (# of Access Zones)
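As a quick aid for address planning, the formula in the last bullet can be evaluated in the shell. This is a minimal sketch; the function name required_ips is hypothetical and not part of any Isilon tool.

```shell
#!/usr/bin/env bash
# Sketch: evaluate the address-count formula from the networking prerequisites.
#   IPs = 2 * (number of Isilon nodes) * (number of access zones)
# required_ips is a hypothetical helper name, not part of any Isilon tool.
required_ips() {
  local nodes="$1" zones="$2"
  echo $(( 2 * nodes * zones ))
}

# Example: a 3-node cluster with a single access zone.
required_ips 3 1   # prints 6
```

For the single-node simulator with one access zone, the same formula gives 2, which matches the two static addresses mentioned above.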
3. Install Isilon OneFS
In this document, Isilon Simulator 8.0.0.1 is used to set up a free, non-production Isilon OneFS 8.0.0 cluster environment for the deployment of HDP 2.4.2.
You can download the Isilon Simulator from this link: http://www.emc.com/products-solutions/trial-software-download/isilon.htm
For the detailed installation process of the Isilon Simulator, refer to the instructions in the note "Isilon.OneFS.Simulator.Installation".
4. Configure Isilon OneFS
1. Add the Isilon Simulator node's hostname and IP information to the named server configuration files
· Add the following content to the file /var/named/named-forward.zone
hdp-isilon IN A 192.168.1.100
· Add the following content to the file /var/named/named-reverse.zone
100 IN PTR hdp-isilon.bigdata.emc.local.
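Once the zone files have been updated and the named service has reloaded them, it can be useful to confirm that the forward and reverse records agree. The sketch below assumes the standard dig utility is installed on the host; check_dns is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: confirm that forward and reverse DNS lookups agree for the Isilon name.
# check_dns is a hypothetical helper; it relies on the standard `dig` utility.
check_dns() {
  local fqdn="$1" ip="$2" fwd rev
  fwd="$(dig +short "$fqdn" | head -n 1)"      # A record lookup
  rev="$(dig +short -x "$ip" | head -n 1)"     # PTR record lookup
  if [ "$fwd" != "$ip" ]; then
    echo "forward lookup mismatch: $fqdn -> ${fwd:-<none>}"
    return 1
  fi
  # dig prints PTR targets with a trailing dot.
  if [ "$rev" != "${fqdn}." ]; then
    echo "reverse lookup mismatch: $ip -> ${rev:-<none>}"
    return 1
  fi
  echo "DNS OK: $fqdn <-> $ip"
}

# Usage (run on a host that uses the named server as its resolver):
#   check_dns hdp-isilon.bigdata.emc.local 192.168.1.100
```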
2. Add a license to activate the Isilon Simulator HDFS module. The license key listed in the commands below expires on 7/17/2015, so before adding the license you need to set the current date on the Isilon Simulator node to a date before the expiry
date 1501010001
isi license licenses activate ACCEL-34PS2-32FWX-RNIWX-LLADX
isi license licenses list
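The argument to date above is easier to check once it is decoded. The sketch below assumes the BSD-style yymmddHHMM form (OneFS is FreeBSD-based) and a year in the 2000s; decode_bsd_date is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: decode a 10-digit yymmddHHMM timestamp as accepted by the BSD `date` command.
# decode_bsd_date is a hypothetical helper name used only for illustration.
decode_bsd_date() {
  local ts="$1"
  if ! [[ "$ts" =~ ^[0-9]{10}$ ]]; then
    echo "expected 10 digits (yymmddHHMM)" >&2
    return 1
  fi
  # Assumes a year in the 2000s.
  echo "20${ts:0:2}-${ts:2:2}-${ts:4:2} ${ts:6:2}:${ts:8:2}"
}

decode_bsd_date 1501010001   # prints 2015-01-01 00:01
```

Under that assumption, the command sets the clock to 1 January 2015, well before the 7/17/2015 expiry.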
3. Configure the Isilon Simulator Hadoop settings to add the Ambari Server and Ambari NameNode information
· Go to Protocols->Hadoop (HDFS)->Ambari Server Setting and add the FQDNs of the Ambari Server and the Ambari NameNode.
Ambari Server: hdp-ambari.bigdata.emc.local
Ambari NameNode: hdp-isilon.bigdata.emc.local
4. Run the Isilon Hadoop Tools scripts on the Isilon Simulator node to create the required users and directories
· Download the Isilon Hadoop Tools scripts from https://github.com/claudiofahey/isilon-hadoop-tools/releases
· Upload isilon_create_users.sh and isilon_create_directories.sh to the Isilon Simulator node
· Run these two scripts
bash ./isilon_create_users.sh --dist hwx
bash ./isilon_create_directories.sh --dist hwx --fixperm
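The two script invocations above can be wrapped so that the second runs only if the first succeeds. This is a sketch, not part of Isilon Hadoop Tools; the function run_isilon_setup and the DRY_RUN variable are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch: run both Isilon Hadoop Tools scripts in order, stopping at the first failure.
# run_isilon_setup and the DRY_RUN variable are hypothetical, for illustration only.
run_isilon_setup() {
  local dist="${1:-hwx}"
  local -a steps=(
    "bash ./isilon_create_users.sh --dist $dist"
    "bash ./isilon_create_directories.sh --dist $dist --fixperm"
  )
  local step
  for step in "${steps[@]}"; do
    echo "+ $step"
    if [ "${DRY_RUN:-0}" != "1" ]; then
      # Word splitting is intentional: each step is a simple command line.
      $step || { echo "step failed: $step" >&2; return 1; }
    fi
  done
}

# Usage on the Isilon Simulator node, from the directory holding the scripts:
#   DRY_RUN=1 run_isilon_setup hwx   # only print the commands
#   run_isilon_setup hwx             # actually run them
```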
5. Run the commands below to map the hdfs user to the Isilon superuser; this allows the hdfs user to chown all files
isi zone zones modify System --user-mapping-rules="hdfs=>root"
isi services hdfs disable
isi services hdfs enable
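To confirm that the mapping rule was stored, the zone settings can be inspected afterwards. The sketch below assumes that the output of isi zone zones view includes the rule text verbatim; check_hdfs_mapping is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: confirm the user mapping rule is present in the System zone.
# check_hdfs_mapping is a hypothetical helper; it only greps the output of
# `isi zone zones view` for the rule text (assumed to appear verbatim).
check_hdfs_mapping() {
  local zone="${1:-System}"
  if isi zone zones view "$zone" | grep -q 'hdfs=>root'; then
    echo "mapping rule present in zone $zone"
  else
    echo "mapping rule missing in zone $zone"
    return 1
  fi
}

# Usage on an Isilon node:
#   check_hdfs_mapping System
```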
5. Use Ambari Manager to deploy the HDP cluster on Isilon OneFS
· From the Ambari Welcome page, choose Launch Install Wizard.
· Name your cluster.
· Select Stack HDP 2.4.
· Expand Advanced Repository Options to set the Base URL of the RHEL repository.
· Set up Installation Options
  · Target Hosts
    · Enter the target hosts, excluding the Isilon OneFS nodes, and click the "Next" button to deploy the Ambari Agent to your HDP cluster nodes and register them.
    · Once the Ambari Agent has been deployed and registered, click the "Back" button.
    · Now add the Isilon SmartConnect address or an Isilon node IP address of the Isilon cluster to the list of target hosts.
    · Check the box "Perform manual registration on hosts and do not use SSH".
    · Click the "Next" button. You should see that the Ambari agents on all hosts, including Isilon, become registered.
  · Host Registration Information
· Confirm Hosts
  · Ensure there are no errors or warnings for host installation and validation.
· Choose Services
  · Deselect (remove) Nagios and Ganglia from the list of services to install.
  · Ganglia will not function with an Isilon cluster without additional configuration beyond what is available through Ambari.
· Assign Masters (follow the mapping listed below)
  · Isilon OneFS cluster node
    · NameNode
    · SNameNode
  · HDP Master Compute node
    · All other master components
· Assign Slaves and Clients
  · Isilon OneFS cluster node
    · DataNode
  · HDP Master Compute node
    · Client
  · HDP Worker Compute nodes
    · NodeManager
    · RegionServer
    · Supervisor
· Customize Services
  · Assign passwords to Hive, Oozie, and any other selected services that require them.
  · Go to "HDFS->Advanced->Advanced hdfs-site" and change the WebHDFS port in "dfs.namenode.http-address" from 50070 to 8082.
  · Go to "YARN->Advanced" and set yarn.timeline-service.store-class to org.apache.hadoop.yarn.server.timeline.LeveldbTimelineStore.
  · Check that all local data directories are within /data/1, /data/2, etc. (only required for a BDE installation). The following settings should be checked:
    · YARN Node Manager log-dirs
    · YARN Node Manager local-dirs
    · HBase local directory
    · ZooKeeper directory
    · Oozie Data Dir
    · Storm storm.local.dir
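The data directory check above can be done by eye in Ambari, but a small helper makes it less error-prone when a setting holds several comma-separated paths. This is a sketch; dirs_under_data is a hypothetical name and the example paths are illustrative values, not taken from a real cluster.

```shell
#!/usr/bin/env bash
# Sketch: check that each directory in a comma-separated list sits under /data/<N>.
# dirs_under_data is a hypothetical helper name used for illustration.
dirs_under_data() {
  local list="$1" dir rc=0
  local re='^/data/[0-9]+(/|$)'
  local -a dirs
  IFS=',' read -r -a dirs <<< "$list"
  for dir in "${dirs[@]}"; do
    if [[ "$dir" =~ $re ]]; then
      echo "ok:  $dir"
    else
      echo "bad: $dir"
      rc=1
    fi
  done
  return "$rc"
}

# Example with hypothetical values for a YARN local-dirs setting:
dirs_under_data "/data/1/hadoop/yarn/local,/data/2/hadoop/yarn/local"
```

The function returns a non-zero status if any path falls outside /data/<N>, so it can be used in a larger pre-deployment check.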
· Review
  · Carefully review your configuration and then click "Deploy".
· Complete Installation
6. Validate HDP deployment
· Log in to the HDP Master Compute node and run the command below to validate the Hadoop cluster
clear &&
hdfs dfs -ls / &&
hdfs dfs -put -f /etc/hosts /tmp &&
hdfs dfs -cat /tmp/hosts &&
hdfs dfs -rm -skipTrash /tmp/hosts &&
yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar pi 10 1000 &&
echo "Done"
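Because the Customize Services step points the NameNode HTTP address at port 8082, a WebHDFS request against the Isilon name is another possible check. The sketch below uses the standard WebHDFS REST URL layout and curl; webhdfs_url and check_webhdfs are hypothetical helper names, and it is an assumption that the cluster accepts unauthenticated requests in this form.

```shell
#!/usr/bin/env bash
# Sketch: build and probe a WebHDFS URL on the Isilon cluster.
# webhdfs_url and check_webhdfs are hypothetical helper names.
webhdfs_url() {
  local host="$1" port="$2" path="$3" op="$4"
  echo "http://${host}:${port}/webhdfs/v1${path}?op=${op}"
}

check_webhdfs() {
  local url
  url="$(webhdfs_url "$1" "${2:-8082}" "/" "LISTSTATUS")"
  # -f makes curl return non-zero on HTTP errors; depending on the cluster's
  # authentication settings a user.name query parameter may also be needed.
  curl -sf "$url"
}

# Print the URL that would be requested for a listing of /tmp.
webhdfs_url hdp-isilon.bigdata.emc.local 8082 /tmp LISTSTATUS
# Usage: check_webhdfs hdp-isilon.bigdata.emc.local
```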
4. References
· EMC Isilon Hadoop Starter Kit for Hortonworks
· EMC Isilon Best Practices for Hadoop Data Storage