Wednesday, March 18, 2015

Hortonworks Data Platform manual installation

Installing Hadoop HDP Manually
·         Meeting Minimum System Requirements
1.     Hardware Requirements
·         at least 2.5GB disk space
2.     Operating System Requirements
·         64-bit RHEL/CentOS 6.4 +
3.     Software Requirements
·         yum, rpm, scp, curl, wget, unzip, chkconfig, tar
·         yum-utils, createrepo, reposync
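A quick way to confirm the software requirements on a node is to check that each command is on the PATH. The sketch below assumes a POSIX shell, and the helper name missing_tools is made up for this example. Since yum-utils is a package rather than a command, the check looks for reposync, which that package provides.

```shell
#!/bin/sh
# Print the name of every given command that is not found on the PATH.
# (missing_tools is a hypothetical helper name, not part of HDP.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Commands taken from the software requirements list.
missing=$(missing_tools yum rpm scp curl wget unzip chkconfig tar createrepo reposync)

if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
else
    echo "All required tools are present."
fi
```

Anything reported as missing can then be installed with yum before continuing.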
4.     JDK Requirements (Required on all cluster nodes)
·         Oracle JDK
·         Oracle JDK 1.7 or 1.8 64-bit
·         Oracle JCE (Java Cryptography Extension), required for enabling Kerberos authentication
Extract the policy jars to $JAVA_HOME/jre/lib/security/ using the command below:
unzip -o -j -q jce_policy-8.zip -d /usr/jdk64/jdk1.8.0_60/jre/lib/security/
·    OpenJDK 7 64-bit (does not work on SLES)
·         OpenJDK 7
·         JCE is not required for most OpenJDK builds because it is included by default
             Metastore Database Requirement
If you are installing Hive and HCatalog or Oozie, you must install a database to store metadata information in the metastore. HDP supports the following databases for the metastore:
·    Postgres 8.x, 9.3+
·    MySQL 5.6
·    Oracle 11g R2
·    SQL Server 2008 R2+
·         Start Installation
             Decide on Deployment Pattern/Type
·         Evaluation: Deploy all of HDP on a single host
·         Production: use at least 4 hosts (one master host, three slaves)
·         Production (used in this guide): 2 master nodes and 3 worker nodes
             Collect Information
·         FQDN for each host (hostname -f)
·         hdp-master01.bigdata.emc.com
·         hdp-master02.bigdata.emc.com
·         hdp-worker01.bigdata.emc.com
·         hdp-worker02.bigdata.emc.com
·         hdp-worker03.bigdata.emc.com
·         If you install Hive/HCatalog, you need the hostname, database name, username and password for the metastore instance
·         Host Name: hdp-master01.bigdata.emc.com
·         Database Name: hive-metastore
·         User Name: hive
·         Password: Password01!
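Because every host must report a fully qualified name, it can help to check the output of hostname -f on each node before going further. This is a minimal sketch: is_fqdn is a hypothetical helper, and its test (at least one dot with text on both sides) is only a rough sanity check, not full hostname validation.

```shell
#!/bin/sh
# Succeed if the name looks fully qualified: at least one dot with
# text on both sides. (is_fqdn is a hypothetical helper name.)
is_fqdn() {
    case "$1" in
        ?*.?*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Fall back to "unknown" if hostname -f is unavailable or fails.
name=$(hostname -f 2>/dev/null || echo unknown)

if is_fqdn "$name"; then
    echo "FQDN looks fine: $name"
else
    echo "Not fully qualified: $name (check /etc/hosts and DNS)"
fi
```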
             Prepare the Environment
                   Set up a dedicated name server (DNS) and NTP server
·         Please refer to Hadoop.DNSServer.Setup and Hadoop.NTPServer.Setup
                   Set up the HDP YUM local repository server
·         Please refer to Hadoop.HDP.LocalYumRepository
                   Prepare HDP cluster node OVF template
 Install RHEL 6.6 (Minimal Installation)
 Configure YUM DVD repository
 Install the NTP client and point it at the NTP server
yum install ntp
vi /etc/ntp.conf
server ntp.bigdata.emc.com
chkconfig ntpd on
service ntpd restart
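When the node template is built from a script rather than by hand, the ntp.conf edit can be made repeatable. The sketch below assumes a POSIX shell; add_ntp_server is a hypothetical helper that appends the server line only if that exact line is not already in the file.

```shell
#!/bin/sh
# Append "server <host>" to an ntp.conf-style file unless that exact
# line is already present. (add_ntp_server is a hypothetical helper.)
add_ntp_server() {
    conf=$1
    host=$2
    # -F: fixed string, -x: match the whole line, -q: quiet
    grep -qxF "server $host" "$conf" || echo "server $host" >> "$conf"
}

# Usage on a cluster node, as root:
#   add_ntp_server /etc/ntp.conf ntp.bigdata.emc.com
#   service ntpd restart
```

Running it twice leaves a single server line, so it is safe to re-run while preparing the template.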
·         Install Oracle JDK 1.8
·         Please refer to Linux.OracleJRE.Installation
 Install Oracle JCE
·         Install it using the command below
unzip -o -j -q jce_policy-8.zip -d /usr/jdk64/jdk1.8.0_60/jre/lib/security/
 Disable SELinux
·         Temporarily disable
             Check the current status with the first command. If it reports Enforcing, run the second command, which switches SELinux to permissive mode until the next reboot.
getenforce
setenforce 0
·         Permanently disable
·         Edit the configuration file /etc/sysconfig/selinux and set the line below; the change takes effect after a reboot
SELINUX=disabled
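The permanent change can also be scripted. This is a sketch under the assumption that the file contains a line starting with SELINUX=; disable_selinux_in is a hypothetical helper. It writes the result back with cat rather than replacing the file, because /etc/sysconfig/selinux is typically a symlink to /etc/selinux/config and replacing it would leave the real file unchanged.

```shell
#!/bin/sh
# Set SELINUX=disabled in the given config file, leaving other lines
# (such as SELINUXTYPE=) untouched. Writing back with cat instead of
# mv keeps a symlinked config file a symlink.
# (disable_selinux_in is a hypothetical helper name.)
disable_selinux_in() {
    conf=$1
    tmp=$(mktemp) || return 1
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$conf" > "$tmp" && cat "$tmp" > "$conf"
    status=$?
    rm -f "$tmp"
    return $status
}

# Usage on a cluster node, as root:
#   disable_selinux_in /etc/sysconfig/selinux
```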
 Disable IPTables
service iptables stop
chkconfig iptables off
h.     Prepare Companion Files
·         Download the companion files
·         Extract them and define the environment parameters
i.        Create System Users and Groups
These users are set up automatically if you install the HDP components using the RPMs.
4. Determine HDP Memory Configuration Settings
·         Run the HDP utility script
python yarn-utils.py -c 16 -m 64 -d 4 -k True

-c CORES: The number of cores on each host.
-m MEMORY: The amount of memory on each host in GB.
-d DISKS: The number of disks on each host.
-k HBASE: "True" if HBase is installed, "False" if not.
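The -c and -m values can be read from the host instead of typed from memory. The sketch below is one possible approach and only a starting point: kb_to_gb is a hypothetical helper, getconf reports online logical processors (which may differ from physical cores), and MemTotal in /proc/meminfo is usually slightly below the installed RAM, so the rounded figure may need adjusting by hand. The disk count and HBase flag are left as placeholders.

```shell
#!/bin/sh
# Convert kB (as reported in /proc/meminfo) to whole GB, rounded to nearest.
# (kb_to_gb is a hypothetical helper name.)
kb_to_gb() {
    echo $(( ($1 + 524288) / 1048576 ))
}

# Online logical processors; falls back to 1 if getconf cannot report it.
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

if [ -r /proc/meminfo ]; then
    mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    mem_gb=$(kb_to_gb "$mem_kb")
else
    mem_gb=0   # unknown on this system; fill in by hand
fi

# Print the command to run; -d and -k still need real values.
echo "python yarn-utils.py -c $cores -m $mem_gb -d <DISKS> -k <True|False>"
```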
6. Allocate Adequate Log Space for HDP
Logs are an important part of managing and operating your HDP cluster. The directories and disks that you assign for logging in HDP must have enough space to maintain logs during HDP operations. Allocate at least 10 GB of free space for any disk you want to use for HDP logging.
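The 10 GB guideline can be checked per directory with df. This is a minimal sketch: free_gb and has_min_free_gb are hypothetical helpers, and /var/log is only an assumed log location, so substitute the directories you actually plan to use.

```shell
#!/bin/sh
# Print free space, in whole GB, on the filesystem holding a directory.
# (free_gb and has_min_free_gb are hypothetical helper names.)
free_gb() {
    # -P: POSIX single-line output, -k: sizes in 1024-byte blocks
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1048576) }'
}

# Succeed if the directory has at least the given number of GB free.
has_min_free_gb() {
    [ "$(free_gb "$1")" -ge "$2" ]
}

# /var/log is an assumed log location; substitute your own directories.
log_dir=/var/log
if has_min_free_gb "$log_dir" 10; then
    echo "$log_dir: enough free space for HDP logs"
else
    echo "$log_dir: less than 10 GB free"
fi
```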