Hadoop.HDP.Installation.Ambari
1. Introduction
This guide describes how to deploy Hortonworks HDP using Apache Ambari 2.x in a VMware virtualized environment.
2. Environment
This installation guide applies to the following environment.
· Apache Ambari Server 2.x
· Hortonworks HDP 2.x
· VMware vSphere 5.5 or later
· RHEL/CentOS 6.5 or later
· Internet Explorer 10 or later / Firefox 18 or later / Chrome 26 or later
3. Installation
1. Overview
Below is an overview of the installation process that this document describes.
· Confirm prerequisites
· Setup one VM as the DNS, NTP, and yum repository server for the HDP cluster
· Setup one VM as the Ambari Server
· Setup six VMs as HDP cluster node servers (two masters, four workers)
· Setup SSH public key authentication on the Ambari Server
· Use Ambari to deploy the HDP cluster
2. Confirm Prerequisites
1. Prepare VMware virtualized environment
Before you start the installation process, a VMware virtualized environment must be ready to provision the virtual machines required for this deployment. ESXi 5.5 or later is recommended for the virtualized environment.
2. Determine Stack Compatibility
Check the Ambari & HDP Compatibility Matrix to make sure your Ambari and HDP stack versions are compatible.
Here is the link for the compatibility matrix:
3. Prepare the installation images
· RHEL/CentOS 6.5 or 6.6 DVD ISO files
· Windows installation ISO files (Win7, Win2008, Win2012, etc.)
· Internet Explorer 11 installation image
· Tarball repository files
  · Ambari 2.2.2.0
  · HDP 2.4.2.0
  · HDP Utility 1.1.0.20
4. Meet System Requirements
· Operating system requirements (RHEL/CentOS 6.x/7.x)
· Browser requirements (IE 10 or later / Firefox 18 or later / Chrome 26 or later)
· HDP cluster node requirements
  · OpenSSL (v1.01, build 16 or later); version check command: openssl version && rpm -qa | grep openssl
  · Python (RHEL/CentOS 6: 2.6.*, RHEL/CentOS 7: 2.7.*); version check command: python -V
· JDK requirements (Oracle JDK + JCE)
  · Oracle JDK 1.8 64-bit (minimum JDK 1.8.0_60) (default)
  · Oracle JDK 1.7 64-bit (minimum JDK 1.7.0_67)
· JDK requirements (OpenJDK)
  · OpenJDK 7
  · JCE is not required for most OpenJDK builds, as it is included by default
· Memory requirements
The Ambari host should have at least 1 GB RAM, with 500 MB free. To check the available memory on any host, run:
free -m
Check the Memory Requirement Matrix to confirm the memory requirements for your cluster (4 GB RAM on the Ambari host can support up to 100 cluster hosts).
5. Collect Server Information
ns.bigdata.emc.local 192.168.1.1
ntp.bigdata.emc.local 192.168.1.1
yumrepo.bigdata.emc.local 192.168.1.1
jump.bigdata.emc.local 192.168.1.5
hdp-ambari.bigdata.emc.local 192.168.1.10
hdp-master01.bigdata.emc.local 192.168.1.11
hdp-master02.bigdata.emc.local 192.168.1.12
hdp-worker01.bigdata.emc.local 192.168.1.21
hdp-worker02.bigdata.emc.local 192.168.1.22
hdp-worker03.bigdata.emc.local 192.168.1.23
hdp-worker04.bigdata.emc.local 192.168.1.24
3. Setup one VM as the DNS, NTP, and YUM Repository Server
Once all the prerequisites are ready, you can start the installation process. The first step is to set up one VM that serves as the DNS server, NTP server, and yum repository server for the HDP Hadoop cluster deployment.
1. Create the Linux VM using the below specification

vCPU | 2 Cores
RAM | 4GB
Operating System | RHEL 6.6
OS Disk Capacity | 50GB
IP Address | 192.168.1.1/255.255.255.0
Host Name | ns.bigdata.emc.local
DNS Server | 192.168.1.1
Default Search Domain | bigdata.emc.local
2. Install the BIND packages to set up the DNS server
1. Install the bind packages
yum install bind
yum install bind-libs
yum install bind-utils
2. Configure the named daemon to start automatically and disable the firewall
chkconfig named on
chkconfig iptables off
service iptables stop
3. Edit the configuration file /etc/named.conf
<<named.conf>>
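The embedded named.conf is not reproduced here; below is a minimal sketch of what it needs for this guide's setup. The zone and file names match the steps that follow; the other options are site-specific assumptions.
options {
    listen-on port 53 { any; };
    directory "/var/named";
    allow-query { 192.168.1.0/24; };
    recursion yes;
};
zone "bigdata.emc.local" IN {
    type master;
    file "named-forward.zone";
};
zone "1.168.192.in-addr.arpa" IN {
    type master;
    file "named-reverse.zone";
};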
4. Edit the configuration file /var/named/named-forward.zone
<<named-forward.zone>>
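Again as a minimal sketch, the forward zone can be built directly from the server list collected earlier; the SOA serial and timer values are arbitrary examples.
$TTL 86400
@ IN SOA ns.bigdata.emc.local. root.bigdata.emc.local. (
    2016060101 ; serial
    3600       ; refresh
    1800       ; retry
    604800     ; expire
    86400 )    ; minimum
@            IN NS ns.bigdata.emc.local.
ns           IN A  192.168.1.1
ntp          IN A  192.168.1.1
yumrepo      IN A  192.168.1.1
jump         IN A  192.168.1.5
hdp-ambari   IN A  192.168.1.10
hdp-master01 IN A  192.168.1.11
hdp-master02 IN A  192.168.1.12
hdp-worker01 IN A  192.168.1.21
hdp-worker02 IN A  192.168.1.22
hdp-worker03 IN A  192.168.1.23
hdp-worker04 IN A  192.168.1.24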
5. Edit the configuration file /var/named/named-reverse.zone
<<named-reverse.zone>>
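And a matching sketch for the reverse zone; each PTR mirrors an A record above. Mapping 192.168.1.1 only to ns is an assumption, since several names share that address.
$TTL 86400
@ IN SOA ns.bigdata.emc.local. root.bigdata.emc.local. (
    2016060101 ; serial
    3600       ; refresh
    1800       ; retry
    604800     ; expire
    86400 )    ; minimum
@  IN NS  ns.bigdata.emc.local.
1  IN PTR ns.bigdata.emc.local.
5  IN PTR jump.bigdata.emc.local.
10 IN PTR hdp-ambari.bigdata.emc.local.
11 IN PTR hdp-master01.bigdata.emc.local.
12 IN PTR hdp-master02.bigdata.emc.local.
21 IN PTR hdp-worker01.bigdata.emc.local.
22 IN PTR hdp-worker02.bigdata.emc.local.
23 IN PTR hdp-worker03.bigdata.emc.local.
24 IN PTR hdp-worker04.bigdata.emc.local.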
6. Correct the owner and permissions of the configuration files
ls -l /var/named
chown named:named /var/named/named-forward.zone
chown named:named /var/named/named-reverse.zone
ls -l /var/named
7. Validate the configuration files
named-checkconf /etc/named.conf
named-checkzone bigdata.emc.local /var/named/named-forward.zone
named-checkzone 1.168.192.in-addr.arpa /var/named/named-reverse.zone
8. Restart the service
service named restart
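As a quick sanity check, query the new DNS server directly for a forward and a reverse lookup, using hosts from the server list above:
nslookup hdp-ambari.bigdata.emc.local 192.168.1.1
nslookup 192.168.1.10 192.168.1.1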
3. Install the ntp package to set up the NTP server
1. Setup the time zone
cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
2. Install ntp package
yum install ntp
3. Edit the configuration file /etc/ntp.conf and add the content below to set up the local clock as the time source
server 127.127.1.0
fudge 127.127.1.0 stratum 10
4. Start the ntpd daemon and configure it to start automatically
service ntpd start
chkconfig ntpd on
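After the daemon starts, ntpq -p should list the local clock (LOCAL(0)) as a peer:
ntpq -p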
4. Install the httpd package and set up the yum repository server
1. Install the yum utilities and the httpd package
yum -y install yum-utils
yum -y install createrepo
yum -y install httpd
2. Start the httpd daemon and configure it to start automatically
service httpd start
chkconfig httpd on
3. Create the HDP tarball repository mount point and mount it (the attached CD-ROM contains the tarball repository files)
mkdir -p /var/www/html/hdp
mount /dev/cdrom /var/www/html/hdp
4. Create the RHEL DVD repository mount point and mount it (a second CD-ROM contains the RHEL DVD image)
mkdir -p /var/www/html/dvd
mount /dev/cdrom1 /var/www/html/dvd
5. Create the file /etc/yum.repos.d/ambari.repo
<<ambari.repo>>
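The embedded repo file is not reproduced here; below is a sketch of /etc/yum.repos.d/ambari.repo pointing at the repositories served from this VM. The exact baseurl paths depend on how the Ambari/HDP tarballs were unpacked under /var/www/html, so treat them as placeholders:
[AMBARI-2.2.2.0]
name=Ambari 2.2.2.0
baseurl=http://yumrepo.bigdata.emc.local/hdp/AMBARI-2.2.2.0/centos6/
gpgcheck=0
enabled=1
[rhel-dvd]
name=RHEL 6.6 DVD
baseurl=http://yumrepo.bigdata.emc.local/dvd/
gpgcheck=0
enabled=1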
6. Validate yum repository
yum repolist all
4. Setup one VM as Ambari Server
1. Create the Linux VM using the below specification

vCPU | 2 Cores
RAM | 4GB
Operating System | RHEL 6.6
OS Disk Capacity | 50GB
IP Address | 192.168.1.10/255.255.255.0
Host Name | hdp-ambari.bigdata.emc.local
DNS Server | 192.168.1.1
Default Search Domain | bigdata.emc.local
2. Install the ambari-server package to set up the Ambari Server
1. Create the Ambari repository configuration file /etc/yum.repos.d/ambari.repo, using the same content as the ambari.repo created on the repository server
2. Install Ambari Server
yum install ambari-server
3. Setup Ambari Server
ambari-server setup --java-home /usr/lib/jvm/jre-1.7.0-openjdk.x86_64
4. Start the Ambari Server and configure it to start automatically
service ambari-server start
chkconfig ambari-server on
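To confirm the Ambari Server came up, check its status; the web UI should then be reachable on port 8080, the Ambari default:
ambari-server status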
5. Setup six VMs for HDP cluster node servers
1. Create six Linux VMs (two masters and four workers) using the below specification

vCPU | 2 Cores
RAM | 4GB
Operating System | RHEL 6.6
OS Disk Capacity | 50GB
DNS Server | 192.168.1.1
Default Search Domain | bigdata.emc.local

Host Name | IP Address
hdp-master01.bigdata.emc.local | 192.168.1.11/255.255.255.0
hdp-master02.bigdata.emc.local | 192.168.1.12/255.255.255.0
hdp-worker01.bigdata.emc.local | 192.168.1.21/255.255.255.0
hdp-worker02.bigdata.emc.local | 192.168.1.22/255.255.255.0
hdp-worker03.bigdata.emc.local | 192.168.1.23/255.255.255.0
hdp-worker04.bigdata.emc.local | 192.168.1.24/255.255.255.0
2. Increase the maximum number of open files on all HDP cluster nodes
1. Edit /etc/sysctl.conf and add the below content
fs.file-max = 65536
2. Edit /etc/security/limits.conf and add the below content
* soft nofile 10000
* hard nofile 10000
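To apply the sysctl change without a reboot and check the per-process limits (the new nofile values take effect at the next login), run:
sysctl -p
ulimit -Sn
ulimit -Hn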
3. Setup NTP on all HDP cluster nodes
1. Install ntp package
yum -y install ntp
2. Edit the configuration file /etc/ntp.conf and add the below content to point each node at the local NTP server
server 192.168.1.1
3. Start the ntpd daemon and configure it to start automatically
service ntpd start
chkconfig ntpd on
4. Setup the DNS client on all HDP cluster nodes
1. Edit /etc/resolv.conf and add the below content
nameserver 192.168.1.1
search bigdata.emc.local
5. Disable SELinux on all HDP cluster nodes
1. Edit /etc/selinux/config and set the SELINUX property to disabled
SELINUX=disabled
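SELinux can also be switched to permissive mode immediately, without waiting for a reboot; getenforce shows the current mode:
setenforce 0
getenforce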
6. Disable Transparent Huge Pages on all HDP cluster nodes
1. Edit /etc/grub.conf and append the below parameter to the kernel boot line
transparent_hugepage=never
2. Edit /etc/rc.d/rc.local and add the below content (RHEL/CentOS 6 exposes THP under the redhat_transparent_hugepage path)
if test -f /sys/kernel/mm/redhat_transparent_hugepage/enabled; then
  echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
fi
if test -f /sys/kernel/mm/redhat_transparent_hugepage/defrag; then
  echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
fi
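To verify that THP is disabled after a reboot ([never] should be the selected value):
cat /sys/kernel/mm/redhat_transparent_hugepage/enabled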
7. Set umask 0022
1. Set the default file and directory creation mask to 0022 (022) in /etc/profile; this is already the default on RHEL/CentOS.
echo umask 0022 >> /etc/profile
8. Install Oracle JDK 1.8 (optional; the system default OpenJDK 1.7 can be used instead)
1. Install the JDK package (assuming the Oracle JDK RPM is available in the local yum repository) and register it with alternatives
yum install jdk1.8.0_77
alternatives --install /usr/bin/jar jar /usr/java/jdk1.8.0_77/bin/jar 1
alternatives --config java
alternatives --config jar
alternatives --config javac
2. Install JCE
unzip -o -j -q jce_policy-8.zip -d /usr/java/jdk1.8.0_77/jre/lib/security/
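As a quick check that the unlimited-strength JCE policy is active, the jrunscript tool bundled with the JDK can query the maximum allowed AES key length; an output of true means keys above 128 bits are permitted:
/usr/java/jdk1.8.0_77/bin/jrunscript -e 'print(javax.crypto.Cipher.getMaxAllowedKeyLength("AES") >= 256);'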
9. Disable iptables on all HDP cluster nodes
service iptables stop
chkconfig iptables off
6. Setup SSH Public Key authentication on Ambari Server
1. Log in to the Ambari Server using the root account
2. Generate the SSH public and private keys
ssh-keygen &&
cd /root/.ssh &&
cat id_rsa.pub >> authorized_keys &&
chmod 600 /root/.ssh/authorized_keys &&
echo "Done"
3. Copy the SSH public key file id_rsa.pub and authorized_keys to all HDP cluster nodes
ssh root@hdp-master01 "mkdir -p /root/.ssh && chmod 700 /root/.ssh" && scp /root/.ssh/authorized_keys root@hdp-master01:/root/.ssh/ &&
ssh root@hdp-master02 "mkdir -p /root/.ssh && chmod 700 /root/.ssh" && scp /root/.ssh/authorized_keys root@hdp-master02:/root/.ssh/ &&
ssh root@hdp-worker01 "mkdir -p /root/.ssh && chmod 700 /root/.ssh" && scp /root/.ssh/authorized_keys root@hdp-worker01:/root/.ssh/ &&
ssh root@hdp-worker02 "mkdir -p /root/.ssh && chmod 700 /root/.ssh" && scp /root/.ssh/authorized_keys root@hdp-worker02:/root/.ssh/ &&
ssh root@hdp-worker03 "mkdir -p /root/.ssh && chmod 700 /root/.ssh" && scp /root/.ssh/authorized_keys root@hdp-worker03:/root/.ssh/ &&
ssh root@hdp-worker04 "mkdir -p /root/.ssh && chmod 700 /root/.ssh" && scp /root/.ssh/authorized_keys root@hdp-worker04:/root/.ssh/ &&
echo "Done"
4. Test the SSH public key authentication
clear &&
ssh root@hdp-master01 "hostname" &&
ssh root@hdp-master02 "hostname" &&
ssh root@hdp-worker01 "hostname" &&
ssh root@hdp-worker02 "hostname" &&
ssh root@hdp-worker03 "hostname" &&
ssh root@hdp-worker04 "hostname" &&
echo "Done"
5. Verify the DNS and NTP configuration on all nodes using SSH
clear &&
ssh root@hdp-ambari "nslookup \$(hostname) && ntpq -p" &&
ssh root@hdp-master01 "nslookup \$(hostname) && ntpq -p" &&
ssh root@hdp-master02 "nslookup \$(hostname) && ntpq -p" &&
ssh root@hdp-worker01 "nslookup \$(hostname) && ntpq -p" &&
ssh root@hdp-worker02 "nslookup \$(hostname) && ntpq -p" &&
ssh root@hdp-worker03 "nslookup \$(hostname) && ntpq -p" &&
ssh root@hdp-worker04 "nslookup \$(hostname) && ntpq -p" &&
echo "Done"
6. Retain a copy of the SSH private key on the machine (jump server) from which you will run the web-based Ambari Install Wizard.
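For example, the private key can be copied to the jump server from the server list collected earlier; the destination file name here is only an illustration:
scp /root/.ssh/id_rsa root@jump.bigdata.emc.local:/root/hdp-ambari-id_rsa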
7. Use Ambari to deploy the HDP cluster
1. Log in to the Ambari web UI at http://hdp-ambari.bigdata.emc.local:8080 (8080 is the Ambari default port; the default credentials are admin/admin).
2. From the Ambari Welcome page, choose Launch Install Wizard.
3. Name your cluster.
4. Select Stack HDP 2.4.
5. Expand Advanced Repository Options to set the BASE URL of the RHEL repository.
6. Setup Installation Options: enter the Target Hosts and the Host Registration Information (the SSH private key).
7. Confirm Hosts.
8. Ensure there are no errors or warnings during host installation and validation.
9. Choose Services.
10. Assign Masters.
11. Assign Slaves and Clients.
12. Customize Services.
13. Review.
14. Summary.
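Once the wizard completes, a quick sanity check through the Ambari REST API (assuming the default admin/admin credentials) should list the newly created cluster:
curl -u admin:admin http://hdp-ambari.bigdata.emc.local:8080/api/v1/clusters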
8. Validation
1. Log in to an HDP master node and run the below commands to validate the Hadoop cluster
hdfs dfs -ls /
hdfs dfs -put -f /etc/hosts /tmp
hdfs dfs -cat /tmp/hosts
hdfs dfs -rm -skipTrash /tmp/hosts
yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar pi 10 1000
2. Run TeraGen/TeraSort/TeraValidate to test performance
clear &&
yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar teragen 10485760 /tmp/TeraGen.1G &&
yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar terasort /tmp/TeraGen.1G /tmp/TeraSort.1G &&
yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar teravalidate /tmp/TeraSort.1G /tmp/TeraValidate.1G &&
echo "Done"
3. Run TestDFSIO to test HDFS I/O throughput
clear &&
yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient-tests.jar TestDFSIO -write -nrFiles 10 -fileSize 1 &&
yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient-tests.jar TestDFSIO -read -nrFiles 10 -fileSize 1 &&
echo "Done"
4. Reference:
· HDP 2.2.x Installation Book
· Hortonworks Ambari 2.2.2.0 Repository Download
· Hortonworks HDP Repository Download
· Resolving Cluster Deployment Problems