Monday 24 November 2014

Hadoop Cluster Installation

There are two ways to install a Hadoop cluster, as far as I know:
1. Through CLI mode.
2. GUI mode, using the Cloudera Manager package.

Basic steps for both types of installation
Requirements:
Machines sized as per your requirement.
Linux OS: any flavor (Red Hat or Ubuntu is good).
All firewalls disabled.
DNS resolution.

Passwordless Login
Passwordless login between the nodes is covered in a separate post.

The next step is to disable the firewall.

#iptables -F
#/etc/init.d/iptables stop
#chkconfig iptables off

In the same way, you need to disable ip6tables as well.
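
The same three commands apply to both services, so they can be sketched as one small helper. This is only a sketch: the function name disable_firewall and the DRY_RUN switch are made up for illustration, and it assumes a Red Hat style system where the iptables and ip6tables init scripts exist. By default it only prints the commands instead of running them.

```shell
#!/bin/bash
# Print (DRY_RUN=1, the default) or run the commands that flush, stop,
# and disable both firewall services. disable_firewall and DRY_RUN are
# hypothetical names used only in this sketch.
disable_firewall() {
    local svc cmd
    for svc in iptables ip6tables; do
        for cmd in "$svc -F" "/etc/init.d/$svc stop" "chkconfig $svc off"; do
            if [ "${DRY_RUN:-1}" = "1" ]; then
                echo "$cmd"      # only show what would be run
            else
                $cmd             # really run it; needs root
            fi
        done
    done
}
```

Calling disable_firewall prints the six commands; calling it as root with DRY_RUN=0 set in front runs them.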

2. Disable SELinux
    For SELinux you need to edit the file mentioned below
    vim /etc/sysconfig/selinux
     SELINUX=enforcing (change enforcing to disabled)
:wq (save & exit)
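
If you prefer not to edit the file by hand, the same change can be sketched as a small function. The function name set_selinux_disabled and its optional path argument are assumptions for illustration; it rewrites only the SELINUX= line and writes the result back into the same file, so a symlinked config file stays a symlink.

```shell
#!/bin/bash
# Set SELINUX=disabled in an SELinux config file.
# set_selinux_disabled is a hypothetical helper; the path defaults to
# the file edited in the step above.
set_selinux_disabled() {
    local conf="${1:-/etc/sysconfig/selinux}"
    local tmp rc
    tmp="$(mktemp)" || return 1
    # Rewrite only the SELINUX= line; SELINUXTYPE= does not match the pattern.
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$conf" > "$tmp" && cat "$tmp" > "$conf"
    rc=$?
    rm -f "$tmp"
    return $rc
}
```

Run set_selinux_disabled as root with no argument to change the real file. The new mode is read at boot, so it takes effect after a restart.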

3. Disable firewall
system-config-firewall-tui
By default the firewall is marked as enabled with a *; unmark it and save the setting.


If you do not have a DNS server, make sure your hosts are able to ping each other by both IP and hostname.

1. For that you need to make entries in
     #vim /etc/hosts
       192.168.80.2   Hadoop159
       192.168.80.3   Hadoop160
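
The same entries can also be added from a script, which helps when there are many nodes. The sketch below assumes a helper named add_host_entry (a made-up name) that appends a line only when the hostname is not already in the file, so it is safe to run more than once.

```shell
#!/bin/bash
# Append "IP   HOSTNAME" to a hosts file unless the hostname is already there.
# add_host_entry is a hypothetical helper; the file defaults to /etc/hosts.
add_host_entry() {
    local ip="$1" name="$2" file="${3:-/etc/hosts}"
    # A hostname counts as present if it follows whitespace and ends a field.
    if grep -qE "[[:space:]]$name([[:space:]]|\$)" "$file"; then
        return 0
    fi
    printf '%s   %s\n' "$ip" "$name" >> "$file"
}
```

For the two machines above, run add_host_entry 192.168.80.2 Hadoop159 and add_host_entry 192.168.80.3 Hadoop160 as root on every node, then check with ping Hadoop159 and ping Hadoop160.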


Once local DNS is done, you need to add the user hadoop
useradd hadoop
passwd hadoop
New password:
Retype new password:
passwd: all authentication tokens updated successfully.

Create the folder mentioned below
mkdir /hadoop
Change the owner with the command below
chown -R hadoop /hadoop
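
To confirm the ownership change worked, a quick check can be sketched like this. The function name check_owner is made up for illustration, and it assumes GNU stat (the one shipped with Linux), where -c %U prints the owner's user name.

```shell
#!/bin/bash
# Succeed only if the directory exists and is owned by the given user.
# check_owner is a hypothetical helper; stat -c %U is GNU stat syntax.
check_owner() {
    local dir="$1" user="$2"
    [ -d "$dir" ] && [ "$(stat -c %U "$dir")" = "$user" ]
}
```

For example: check_owner /hadoop hadoop && echo "ownership OK"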

Once you complete the steps above, let's start with the Hadoop cluster installation steps.

Download the latest version of Java and install it. You need to install Java on the master node as well as the client nodes.
#wget http://download.oracle.com/otn-pub/java/jdk/7u71-b14/jre-7u71-linux-x64.rpm --no-check-certificate
#rpm -ivh jre-7u71-linux-x64.rpm


Once you are done with the Java installation, you need to install the Cloudera repo. You can install it through yum or rpm, as you prefer.

Download Cloudera Repo

#wget http://archive.cloudera.com/redhat/6/x86_64/cdh/cdh3-repository-1.0-1.noarch.rpm

Through Yum
 yum --nogpgcheck localinstall cdh3-repository-1.0-1.noarch.rpm

This action needs to be performed on all servers, the master as well as the client nodes.
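
Since passwordless login is already in place, repeating a command on every node can be sketched as a loop over ssh. The helper name run_on_nodes, the DRY_RUN switch, and logging in as root are all assumptions for illustration. By default it only prints what it would run.

```shell
#!/bin/bash
# Run (or, with DRY_RUN=1, just print) one command on each listed node.
# run_on_nodes and DRY_RUN are hypothetical names; root login over
# passwordless ssh is assumed.
run_on_nodes() {
    local cmd="$1"; shift
    local node
    for node in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ssh root@$node $cmd"     # only show what would be run
        else
            ssh "root@$node" "$cmd"
        fi
    done
}
```

For example, assuming the repo rpm has already been downloaded on each node: run_on_nodes "yum --nogpgcheck -y localinstall cdh3-repository-1.0-1.noarch.rpm" Hadoop159 Hadoop160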

Now you need to install the Hadoop packages.

On the Master node, install NameNode and JobTracker packages
#yum -y install hadoop-0.20-namenode hadoop-0.20-jobtracker

On the Slave nodes, install DataNode and TaskTracker packages
#yum -y install hadoop-0.20-datanode hadoop-0.20-tasktracker
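
The split between master and slave packages can be written down once and reused in a script. This is a sketch only: packages_for_role is a made-up helper name, and the package names are exactly the ones from the two commands above.

```shell
#!/bin/bash
# Print the Hadoop packages for a node role ("master" or "slave").
# packages_for_role is a hypothetical helper for this sketch.
packages_for_role() {
    case "$1" in
        master) echo "hadoop-0.20-namenode hadoop-0.20-jobtracker" ;;
        slave)  echo "hadoop-0.20-datanode hadoop-0.20-tasktracker" ;;
        *)      echo "unknown role: $1" >&2; return 1 ;;
    esac
}
```

On a slave node this could be used as: yum -y install $(packages_for_role slave)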

This is the basic cluster installation. In the next blog post I will cover cluster configuration. Hoping this is helpful for someone :)