
HANDS-ON LAB - HADOOP ARCHITECTURE

This lab is recommended to be performed on IBM Cloud using the account
you created in Lab 1.

1. Log in to IBM Cloud

Please use your account to log in to IBM Cloud. The resource dashboard is
shown as below:

2. Create IBM Analytics Engine

Click "Catalog" to go to the IBM Cloud services page, as shown below:

Choose "Analytics" => "Analytics Engine" to open the "Analytics Engine"
configuration page. Specify an instance name, such as "Hadoop-Lab2", and
then:

1) select a region/location to deploy in

2) select a resource group

Then click "Configure" and specify the parameters for your Hadoop
cluster as shown below:

Keep the default number of compute nodes and software packages shown
above to create the service.

Notice: it takes 30 to 60 minutes to start an Analytics Engine instance.

Once the Analytics Engine service is created successfully, it is displayed
in your dashboard as below:
3. Manage Hadoop in Ambari Console

Open your Hadoop cluster from the dashboard and then click "Launch
Console" to log in to the Ambari console, using the username and
password circled in red below:

After you log in to the Ambari console, the Dashboard of your Hadoop
cluster is shown as below:
From the Dashboard, you are able to view all key metrics of your Hadoop
cluster nodes and manage various services, such as YARN, HDFS, and Hive.
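Besides the web UI, Ambari also exposes the same cluster information through its REST API, which is handy for scripting checks. Below is a hedged sketch only; the hostname, port, cluster name, and credentials are placeholders and will differ on your instance (take them from your service details page):

```shell
# Query Ambari's REST API for the list of clusters and services.
# All values below are placeholders -- substitute the Ambari host, port,
# username, and password shown for YOUR Analytics Engine instance.
# -k skips certificate verification; -u passes the Ambari credentials.
curl -k -u clsadmin:yourpassword \
  "https://your-cluster-host.example.com:9443/api/v1/clusters"

# Drill into one cluster's services (cluster name is also a placeholder):
curl -k -u clsadmin:yourpassword \
  "https://your-cluster-host.example.com:9443/api/v1/clusters/AnalyticsEngine/services"
```

This is optional for the lab; the Ambari web Dashboard shows the same metrics interactively.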

4. HDFS management via command line

To manage files on HDFS from the command line, you need to be able to
log in to the host from a remote ssh terminal. Here are the steps:

1) Generate a service credential for ssh

Open "Service credentials" on the left menu, as below:

Then, on the "Service credentials" page, click the "New credential" button
to generate a credential:

Select the values as below to add a new credential:

Click the "Add" button; a new credential is generated. Then find the
information for ssh remote login:
2) Log in to a cluster node via ssh

Using the above information, log in to a cluster node via an ssh client
tool such as PuTTY, providing the username and password, as below:
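If you are on Linux or macOS, you can use the OpenSSH client instead of PuTTY. A minimal sketch, assuming the hostname and username come from the service credential you generated above (the hostname below is a placeholder):

```shell
# Log in to the cluster's management node with OpenSSH.
# Replace the hostname with the ssh endpoint from your service credential;
# the username is typically "clsadmin" on Analytics Engine clusters.
ssh clsadmin@your-cluster-host.example.com
# You will be prompted for the password from the same credential.
```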

Now you are able to run HDFS commands, like `hadoop fs -ls /`:
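Beyond `hadoop fs -ls /`, here is a short sketch of other common HDFS file operations you can try from the cluster node. The paths and file names are examples only, not part of the lab's required steps:

```shell
# Common HDFS file operations (run after logging in to a cluster node;
# the paths below are illustrative examples).
hadoop fs -ls /                                    # list the HDFS root directory
hadoop fs -mkdir -p /user/clsadmin/lab2            # create a directory
hadoop fs -put localfile.txt /user/clsadmin/lab2/  # upload a local file to HDFS
hadoop fs -cat /user/clsadmin/lab2/localfile.txt   # print the file's contents
hadoop fs -rm -r /user/clsadmin/lab2               # remove the directory recursively
```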

Graded Review Questions Instructions


1. Time allowed: Unlimited

- We encourage you to go back and review the materials to find the right answer.

- Please remember that the Review Questions are worth 50% of your final mark.

2. Attempts per question:

- One attempt - For True/False questions

- Two attempts - For any question other than True/False

3. Clicking the "Final Check" button when it appears means your submission is FINAL. You
will NOT be able to resubmit your answer for that question ever again.

4. Check your grades in the course at any time by clicking on the "Progress" tab.

Network bandwidth between any two nodes in the same rack is greater than
bandwidth between two nodes on different racks. True or False?

True False

Hadoop works best on a large data set. True or False?

True False

HDFS is a fully POSIX compliant file system. True or False?

True False