Thursday, March 12, 2015

Hadoop HDP Installation Isilon

1.     Introduction
This document describes how to create a Hadoop environment using the Hortonworks Data Platform (HDP) with EMC Isilon Scale-Out NAS as HDFS-accessible shared storage.

The nodes in an Isilon OneFS cluster work together as peers in a shared-nothing hardware architecture with no single point of failure. Each node acts as both a Hadoop NameNode and a DataNode; the NameNode daemon is a distributed process that runs on all the nodes in the cluster. A compute client can connect to any node through HDFS.

As nodes are added, the file system expands dynamically and redistributes data.

2.     Environment
This installation guide applies to the following environment.
·         Apache Ambari 2.X
·         Hortonworks HDP 2.X
·         VMware vSphere 5.5 or later
·         RHEL/CentOS 6.5 or later
·         Internet Explorer 10 or later
·         Isilon OneFS 8.0.0

·         Installation
1.     Overview
Below is the overview of the installation process that this document will describe.
·    Confirm prerequisites
·    Install Isilon OneFS
·    Configure Isilon OneFS
·    Use Ambari Manager to deploy HDP cluster upon Isilon OneFS
·    Validate HDP deployment
·         Confirm Prerequisites
1.     Prepare VMware virtualized environment
Before you start the installation process, a VMware virtualized environment must be ready to provision the virtual machines required for this deployment. ESXi 5.5 or a later release is recommended for the virtualized environment.
·    Prepare HDP Ambari Server and Cluster Nodes
Prepare the HDP Ambari server and cluster nodes by following the instructions in the note "Hadoop.HDP.Installation.Ambari".
3.     Isilon OneFS
For low-capacity, non-performance testing of Isilon, the EMC Isilon OneFS Simulator can be used instead of a cluster of physical Isilon appliances.
4.     Networking
·         10 GbE Ethernet is required
·         If using the EMC Isilon Simulator, at least two static IP addresses are required (one for the node ext-1 interface, another for the SmartConnect service IP). Each additional Isilon node requires an additional IP address.
·         At a minimum, you will need to allocate one IP address per Access Zone per Isilon node.
·         # of IP addresses = 2 * (# of Isilon Nodes) * (# of Access Zones)
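As a rough sketch, the formula above can be evaluated with a small shell function when planning the address pool. The function name isilon_ip_count is hypothetical and not part of any Isilon tooling; it only does the arithmetic from the formula, which is twice the stated minimum of one address per Access Zone per node.

```shell
# Hypothetical helper (not an Isilon command): number of IP addresses to
# reserve, using the formula 2 * (# of Isilon nodes) * (# of Access Zones).
# usage: isilon_ip_count <nodes> <access_zones>
isilon_ip_count() {
    nodes=$1
    zones=$2
    echo $(( 2 * nodes * zones ))
}

isilon_ip_count 3 1   # three nodes, one Access Zone: prints 6
```

For a single-node Simulator with one Access Zone this gives 2, which matches the two static addresses mentioned above.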
3.     Install Isilon OneFS
In this document, Isilon Simulator 8.0.0.1 is used to set up a free, non-production Isilon OneFS 8.0.0 cluster environment for the deployment of HDP 2.4.2.
You can download Isilon Simulator from this link: http://www.emc.com/products-solutions/trial-software-download/isilon.htm

For the detailed installation process of Isilon Simulator, refer to the instructions in the note "Isilon.OneFS.Simulator.Installation".

4.     Configure Isilon OneFS
·         Add the hostname and IP address of the Isilon Simulator node to the DNS server (named) zone files
·         Add the following line to the file /var/named/named-forward.zone
hdp-isilon      IN      A       192.168.1.100
·         Add the following line to the file /var/named/named-reverse.zone
100             IN      PTR     hdp-isilon.bigdata.emc.local.
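If more Isilon nodes are added later, the matching pair of records can be generated rather than typed by hand. The following is a minimal sketch; zone_records is a hypothetical helper name, and it assumes the reverse zone covers a /24 network, so the PTR record is keyed by the last octet only (as in the example above).

```shell
# Hypothetical helper: print the forward (A) and reverse (PTR) records for one host.
# Assumes the reverse zone is a /24, so only the last octet is used as the PTR key.
# usage: zone_records <short_hostname> <domain> <ipv4_address>
zone_records() {
    host=$1
    domain=$2
    ip=$3
    last_octet=${ip##*.}    # everything after the final dot
    printf '%s\tIN\tA\t%s\n' "$host" "$ip"
    printf '%s\tIN\tPTR\t%s.%s.\n' "$last_octet" "$host" "$domain"
}

zone_records hdp-isilon bigdata.emc.local 192.168.1.100
```

The first output line belongs in the forward zone file and the second in the reverse zone file; named must reload its zones before the change is visible.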
2.     Add a license to activate the Isilon Simulator HDFS module. The license key in the command below expires on 7/17/2015, so before adding the license, set the date on the Isilon Simulator node to a date before the expiry (the date command below should set it to January 1, 2015, 00:01).
date 1501010001
isi license licenses activate ACCEL-34PS2-32FWX-RNIWX-LLADX
isi license licenses list
3.     Configure the Isilon Simulator Hadoop settings to add the Ambari Server and Ambari NameNode information
·         Open the OneFS Web UI at https://192.168.1.100:8080/ and log in with the root account and its password
·         Go to Protocols -> Hadoop (HDFS) -> Ambari Server Setting and enter the FQDNs of the Ambari Server and the Ambari NameNode.
Ambari Server: hdp-ambari.bigdata.emc.local
Ambari NameNode: hdp-isilon.bigdata.emc.local
4.     Run the Isilon Hadoop Tools scripts on the Isilon Simulator node to create the required users and directories
·         Download the Isilon Hadoop Tools scripts from https://github.com/claudiofahey/isilon-hadoop-tools/releases
·         Upload isilon_create_users.sh and isilon_create_directories.sh to the Isilon Simulator node
·         Run the two scripts
bash ./isilon_create_users.sh --dist hwx
bash ./isilon_create_directories.sh --dist hwx --fixperm
5.     Run the commands below to map the hdfs user to the Isilon superuser (root), which allows the hdfs user to chown all files. The HDFS service is then disabled and re-enabled.
isi zone zones modify System --user-mapping-rules="hdfs=>root"
isi services hdfs disable
isi services hdfs enable

5.     Use Ambari Manager to deploy HDP cluster upon Isilon OneFS
·         Log in to Apache Ambari at http://192.168.1.10:8080 with the default user name and password (admin/admin)
·         From the Ambari Welcome page, choose Launch Install Wizard.
·         Name your cluster
·         Select Stack HDP 2.4
·         Expand Advanced Repository Options to set the Base URL of the RHEL repository
·         Setup Installation Options
·         Target Hosts
·         Enter the target hosts, excluding the Isilon OneFS nodes, and click the “Next” button to deploy the Ambari Agent to your HDP cluster nodes and register them
·         Once the Ambari Agent has been deployed and registered, click the “Back” button
·         Now add the Isilon SmartConnect address or an Isilon node IP address of the Isilon cluster to the list of target hosts
·         Check the box “Perform manual registration on hosts and do not use SSH”
·         Click the “Next” button. You should see the Ambari agents on all hosts, including Isilon, become registered
·         Host Registration Information
·         Confirm Hosts
·         Ensure there are no errors or warnings for host installation and validation
·         Choose Services
·         Unselect (remove) Nagios and Ganglia from the list of services to install
·         Ganglia will not function with an Isilon cluster without additional configuration beyond what is available through Ambari.
·         Assign Masters (Follow the mapping listed below)
·         Isilon OneFS cluster node
·         NameNode
·         SNameNode
·         HDP Master Compute node
·         All other master components
·         Assign Slaves and Clients
·         Isilon OneFS cluster node
·         DataNode
·         HDP Master Compute node
·         Client
·         HDP Worker Compute nodes
·         NodeManager
·         RegionServer
·         Supervisor
·         Customize Services
·         Assign passwords to Hive, Oozie and any other selected services that require them
·         Go to “HDFS->Advanced->Advanced hdfs-site” and change the WebHDFS port in "dfs.namenode.http-address" from 50070 to 8082
·         Go to “YARN->Advanced” and set yarn.timeline-service.store-class to org.apache.hadoop.yarn.server.timeline.LeveldbTimelineStore
·         Check that all local data directories are within /data/1, /data/2, etc. This check is ONLY REQUIRED FOR BDE INSTALLATION. The following settings should be checked:
·         YARN Node Manager log-dirs
·         YARN Node Manager local-dirs
·         HBase local directory
·         ZooKeeper directory
·         Oozie Data Dir
·         Storm storm.local.dir
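The directory check above can be done by eye in the Ambari UI, but a quick script helps when the values are long comma-separated lists. This is only a sketch: dirs_under_data is a hypothetical helper, and it assumes you paste each setting's value in as a comma-separated string with no spaces inside the paths.

```shell
# Hypothetical helper: succeed only if every directory in a comma-separated
# list lives under /data/<number> (for example /data/1 or /data/2/yarn/local).
# Assumes paths contain no spaces.
# usage: dirs_under_data "/data/1/yarn/local,/data/2/yarn/local"
dirs_under_data() {
    for dir in $(printf '%s' "$1" | tr ',' ' '); do
        rest=${dir#/data/}      # strip the /data/ prefix, if present
        num=${rest%%/*}         # first path component after /data/
        if [ "$rest" = "$dir" ]; then
            echo "outside /data/N: $dir" >&2
            return 1
        fi
        case $num in
            ''|*[!0-9]*)
                echo "outside /data/N: $dir" >&2
                return 1
                ;;
        esac
    done
    return 0
}

dirs_under_data "/data/1/yarn/local,/data/2/yarn/local" && echo "ok"
```

The function stops at the first offending path and reports it on standard error, so each of the settings listed above can be checked in turn.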
·         Review
·         Carefully review your configuration and then click "Deploy"
·         Complete Installation

6.     Validate HDP deployment
·         Log in to the HDP Master Compute node and run the commands below to validate the Hadoop cluster
clear &&
hdfs dfs -ls /  &&
hdfs dfs -put -f /etc/hosts /tmp  &&
hdfs dfs -cat /tmp/hosts  &&
hdfs dfs -rm -skipTrash /tmp/hosts  &&
yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar pi 10 1000  &&
echo "Done"



4.     References:
·         EMC Isilon Hadoop Starter Kit for Hortonworks
·         EMC Isilon Best Practices for Hadoop Data Storage



