Thursday, March 26, 2015

Hortonworks HDP default password

Hadoop.HDP.DefaultPassword
Ambari Metastore default database configuration

WARNING: Before starting Ambari Server, you must run the following DDL against the database to create the schema: /var/lib/ambari-server/resources/Ambari-DDL-Postgres-CREATE.sql

psql
create user ambari with password 'bigdata';
create database ambari owner ambari;
grant all privileges on database ambari to ambari;

\list
\connect ambari
alter database ambari owner to ambari;

psql -U ambari -d ambari -f /var/lib/ambari-server/resources/Ambari-DDL-Postgres-CREATE.sql
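To confirm the schema loaded, you can list the tables in the ambari database; a populated schema will show the Ambari tables:

psql -U ambari -d ambari -c '\dt'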


Wednesday, March 18, 2015

Hortonworks Data Platform manual installation

Hadoop HDP Installation Manually
·         Meeting Minimum System Requirements
1.     Hardware Requirements
·         at least 2.5GB disk space
2.     Operating System Requirements
·         64-bit RHEL/CentOS 6.4+
3.     Software Requirements
·         yum, rpm, scp, curl, wget, unzip, chkconfig, tar
·         yum-utils, createrepo, reposync
4.     JDK Requirements (Required on all cluster nodes)
·         Oracle JDK
·         Oracle JDK 1.7 or 1.8 64-bit
·         Oracle JCE (Java Cryptography Extension, required for enabling Kerberos Authentication)
Extract the policy jars to $JAVA_HOME/jre/lib/security/ using the command below
unzip -o -j -q jce_policy-8.zip -d /usr/jdk64/jdk1.8.0_60/jre/lib/security/
·         OpenJDK 7 64-bit (DOES NOT WORK ON SLES)
·         JCE is not required for OpenJDK, as it is included by default
5.     Metastore Database Requirements
If you are installing Hive and HCatalog or Oozie, you must install a database to store the metastore's metadata. HDP supports the following databases for the metastore:
·    Postgres 8.x, 9.3+
·    MySQL 5.6
·    Oracle 11g R2
·    SQL Server 2008 R2+
·         Start Installation
             Decide on Deployment Pattern/Type
·         Evaluation: Deploy all of HDP on a single host
·         Production: Use at least 4 hosts (one master host, three slaves)
·         Production: Use 2 master nodes, 3 worker nodes
             Collect Information
·         FQDN for each host (hostname -f)
·         hdp-master01.bigdata.emc.com
·         hdp-master02.bigdata.emc.com
·         hdp-worker01.bigdata.emc.com
·         hdp-worker02.bigdata.emc.com
·         hdp-worker03.bigdata.emc.com
·         If you install Hive/HCatalog, you need the hostname, database name, username, and password for the metastore instance (a sketch of creating this database follows this list)
·         Host Name: hdp-master01.bigdata.emc.com
·         Database Name: hive-metastore
·         User Name: hive
·         Password: Password01!
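If Postgres is used for the Hive metastore, the database can be created ahead of time in the same way as the Ambari database above. A minimal sketch using the values collected here (the database name must be double-quoted in Postgres because it contains a hyphen):

psql
create user hive with password 'Password01!';
create database "hive-metastore" owner hive;
grant all privileges on database "hive-metastore" to hive;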
             Prepare the Environment
                   Set up a dedicated DNS name server and NTP server
·         Please refer to Hadoop.DNSServer.Setup and Hadoop.NTPServer.Setup
                   Set up the HDP local YUM repository server
·         Please refer to Hadoop.HDP.LocalYumRepository
                   Prepare HDP cluster node OVF template
 Install RHEL 6.6 (Minimal Installation)
 Configure YUM DVD repository
 Install the NTP client and point it at the local NTP server
yum install -y ntp
chkconfig ntpd on
vi /etc/ntp.conf
server ntp.bigdata.emc.com
service ntpd restart
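Once ntpd has restarted, confirm the node is syncing against the local NTP server; the peers list should include ntp.bigdata.emc.com:

ntpq -p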
·         Install Oracle JDK 1.8
·         Please refer to Linux.OracleJRE.Installation
 Install Oracle JCE
·         Install it using the command below
unzip -o -j -q jce_policy-8.zip -d /usr/jdk64/jdk1.8.0_60/jre/lib/security/
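To verify the unlimited strength policy files are in effect (assuming the JDK path above), a quick check is the following one-liner, which prints true when the unlimited JCE policy is installed:

/usr/jdk64/jdk1.8.0_60/bin/jrunscript -e 'print (javax.crypto.Cipher.getMaxAllowedKeyLength("RC5") >= 256);'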
 Disable SELinux
·         Temporarily disable
             Check the current status using the first command; if the result is enabled/Enforcing, run the second command to disable it
getenforce
setenforce 0
·         Permanently disable
·         Edit the configuration file /etc/sysconfig/selinux and set
SELINUX=disabled
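To confirm the persistent setting (the runtime mode only changes after a reboot or setenforce):

grep '^SELINUX=' /etc/sysconfig/selinux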
 Disable IPTables
service iptables stop
chkconfig iptables off
 Prepare Companion Files
·         Download the companion files
·         Extract them and define the environment parameters
 Create System Users and Groups
These users will be set up automatically if you choose to install the HDP components using the RPMs.
4.     Determine HDP Memory Configuration Settings
·         Run the HDP utility script
python yarn-utils.py -c 16 -m 64 -d 4 -k True

-c CORES: The number of cores on each host.
-m MEMORY: The amount of memory on each host in GB.
-d DISKS: The number of disks on each host.
-k HBASE: "True" if HBase is installed, "False" if not.
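If you are unsure of the values to pass, they can be gathered from a cluster node; below is a sketch (the disk count assumes sd* device naming and should be adjusted to your hardware):

CORES=$(nproc)
MEM_GB=$(free -g | awk '/^Mem:/{print $2}')
DISKS=$(awk '$4 ~ /^sd[a-z]+$/' /proc/partitions | wc -l)
python yarn-utils.py -c "$CORES" -m "$MEM_GB" -d "$DISKS" -k True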
5.     Allocate Adequate Log Space for HDP
Logs are an important part of managing and operating your HDP cluster. The directories and disks that you assign for logging in HDP must have enough space to maintain logs during HDP operations. Allocate at least 10 GB of free space for any disk you want to use for HDP logging.
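Before installation, a quick check of the headroom on the partition holding the log directories (HDP components log under /var/log by default):

df -h /var/log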


Thursday, March 12, 2015

Hadoop HDP Installation Isilon


1.     Introduction
This document describes how to create a Hadoop environment utilizing the Hortonworks Data Platform and EMC Isilon Scale-Out NAS for HDFS accessible shared storage.

The nodes in an Isilon OneFS system work together as peers in a shared-nothing hardware architecture with no single point of failure. Each node acts as a Hadoop NameNode and DataNode; the NameNode daemon is a distributed process that runs on all the nodes in the cluster. A compute client can connect to any node through HDFS.

As nodes are added, the file system expands dynamically and redistributes data.

2.     Environment
This installation guide applies to the environment below.
·         Apache Ambari 2.X
·         Hortonworks HDP 2.X
·         VMware vSphere 5.5 or later
·         RHEL/CentOS 6.5 or later
·         Internet Explorer 10 or later
·         Isilon OneFS 8.0.0

3.     Installation
1.     Overview
Below is an overview of the installation process this document describes.
·    Confirm prerequisites
·    Install Isilon OneFS
·    Configure Isilon OneFS
·    Use Ambari Manager to deploy HDP cluster upon Isilon OneFS
·    Validate HDP deployment
2.     Confirm Prerequisites
1.     Prepare VMware virtualized environment
Before you start the installation process, a VMware virtualized environment must be ready to provision the virtual machines required for this deployment. ESXi 5.5 or later is recommended for the virtualized environment.
2.     Prepare HDP Ambari Server and Cluster Nodes
Please prepare the HDP Ambari server and cluster nodes based on the instructions in note "Hadoop.HDP.Installation.Ambari"
3.     Isilon OneFS
For low-capacity, non-performance testing of Isilon, the EMC Isilon OneFS Simulator can be used instead of a cluster of physical Isilon appliances.
4.     Networking
·         10 GbE networking is required
·         If using the EMC Isilon Simulator, at least two static IP addresses are required (one for the node's ext-1 interface, another for the SmartConnect service IP); each additional Isilon node will require an additional IP address.
·         At a minimum, you will need to allocate one IP address per Access Zone per Isilon node; the recommended allocation is:
·         # of IP addresses = 2 * (# of Isilon Nodes) * (# of Access Zones)
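For example, a 3-node Isilon cluster with a single Access Zone would need 2 * 3 * 1 = 6 IP addresses.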
3.     Install Isilon OneFS
In this document, the Isilon Simulator 8.0.0.1 will be used to set up a free, non-production Isilon OneFS 8.0.0 cluster environment for the deployment of HDP 2.4.2.
You can download Isilon Simulator from this link: http://www.emc.com/products-solutions/trial-software-download/isilon.htm

For the detailed installation process of the Isilon Simulator, please refer to the instructions in note "Isilon.OneFS.Simulator.Installation"

4.     Configure Isilon OneFS
1.     Add the Isilon Simulator node's hostname and IP information to the name server configuration files
·         Add the following content to file /var/named/named-forward.zone
hdp-isilon      IN      A       192.168.1.100
·         Add the following content to file /var/named/named-reverse.zone
100             IN      PTR     hdp-isilon.bigdata.emc.local.
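After updating the zone files (and reloading named), verify forward and reverse resolution from a cluster node:

nslookup hdp-isilon.bigdata.emc.local
nslookup 192.168.1.100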
2.     Add a license to activate the Isilon Simulator HDFS module. The license key listed in the command below expired on 7/17/2015; before adding the license, you need to change the current date on the Isilon Simulator node so the license can be applied
date 1501010001
isi license licenses activate ACCEL-34PS2-32FWX-RNIWX-LLADX
isi license licenses list
3.     Configure the Isilon Simulator Hadoop settings to add the Ambari Server and Ambari NameNode information
·         Open the OneFS Web UI https://192.168.1.100:8080/ and log in using the root account and its password
·         Go to Protocols -> Hadoop (HDFS) -> Ambari Server Settings and add the Ambari Server and Ambari NameNode FQDNs.
Ambari Server: hdp-ambari.bigdata.emc.local
Ambari NameNode: hdp-isilon.bigdata.emc.local
4.     Run the Isilon Hadoop Tools scripts on the Isilon Simulator node to create the required users and directories
·         Download Isilon Hadoop Tools script from https://github.com/claudiofahey/isilon-hadoop-tools/releases
·         Upload isilon_create_users.sh and isilon_create_directories.sh to the Isilon Simulator node
·         Run the two scripts
bash ./isilon_create_users.sh --dist hwx
bash ./isilon_create_directories.sh --dist hwx --fixperm
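To spot-check that the accounts were created (assuming the default System access zone), something like the following can be run on the Isilon node:

isi auth users list --zone=System | grep -i hdfs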
5.     Run the commands below to map the hdfs user to the Isilon superuser (root); this will allow the hdfs user to chown all files
isi zone zones modify System --user-mapping-rules="hdfs=>root"
isi services hdfs disable
isi services hdfs enable
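To confirm the mapping rule took effect, the zone settings can be inspected; on OneFS 8.x the rule should appear under the zone's user mapping rules:

isi zone zones view System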

5.     Use Ambari Manager to deploy HDP cluster upon Isilon OneFS
·         Log in to Apache Ambari at http://192.168.1.10:8080 with the default user name and password (admin/admin)
·         From the Ambari Welcome page, choose Launch Install Wizard.
·         Name your cluster
·         Select Stack HDP 2.4
·         Expand Advanced Repository Options to set the BASE URL of the repository of RHEL
·         Setup Installation Options
·         Target Hosts
·         Enter the target hosts, excluding the Isilon OneFS nodes, and click the “Next” button to deploy the Ambari Agent to your HDP cluster nodes and register them
·         Once the Ambari Agent has been deployed and registered, click the “Back” button
·         Now add the Isilon SmartConnect name or Isilon node IP address of the Isilon cluster to the list of target hosts
·         Check the box “Perform manual registration on hosts and do not use SSH”
·         Click the “Next” button; you should see the Ambari agents on all hosts, including Isilon, become registered
·         Host Registration Information
·         Confirm Hosts
·         Ensure there are no errors or warnings from host installation and validation
·         Choose Services
·         Unselect (remove) Nagios and Ganglia from the list of services to install
·         Ganglia will not function with an Isilon cluster without additional configuration beyond what is available through Ambari.
·         Assign Masters (Follow the mapping listed below)
·         Isilon OneFS cluster node
·         NameNode
·         SNameNode
·         HDP Master Compute node
·         All other master components
·         Assign Slaves and Clients
·         Isilon OneFS cluster node
·         DataNode
·         HDP Master Compute node
·         Client
·         HDP Worker Compute nodes
·         NodeManager
·         RegionServer
·         Supervisor
·         Customize Services
·         Assign passwords to Hive, Oozie, and any other selected services that require them
·         Go to “HDFS->Advanced->Advanced hdfs-site” and change the WebHDFS port "dfs.namenode.http-address" from 50070 to 8082, the port Isilon listens on for WebHDFS
·         Go to “YARN->Advanced” and set yarn.timeline-service.store-class to org.apache.hadoop.yarn.server.timeline.LeveldbTimelineStore
·         ONLY REQUIRED FOR BDE INSTALLATION: Check that all local data directories are under /data/1, /data/2, etc. The following settings should be checked:
·         YARN Node Manager log-dirs
·         YARN Node Manager local-dirs
·         HBase local directory
·         ZooKeeper directory
·         Oozie Data Dir
·         Storm storm.local.dir
                   Review
Carefully review your configuration and then click "Deploy"
                   Complete Installation

6.     Validate HDP deployment
·         Log in to the HDP Master Compute node and run the commands below to validate the Hadoop cluster
clear &&
hdfs dfs -ls /  &&
hdfs dfs -put -f /etc/hosts /tmp  &&
hdfs dfs -cat /tmp/hosts  &&
hdfs dfs -rm -skipTrash /tmp/hosts  &&
yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar pi 10 1000  &&
echo "Done"



4.     References
·         EMC Isilon Hadoop Starter Kit for Hortonworks
·         EMC Isilon Best Practices for Hadoop Data Storage