Become an expert Hadoop administrator through hands-on work with Hadoop clusters, including monitoring the Hadoop Distributed File System (HDFS) and cluster planning and deployment. The course also covers the Hadoop ecosystem hands-on: YARN, MapReduce, HDFS, Cloudera Manager, and Hadoop clusters with Hive, HBase, Pig, Flume, and data import from an RDBMS using Sqoop.
Become a Hadoop administrator by mastering Hadoop clusters! The Collabera Big Data and Hadoop Administrator course is specifically designed to provide hands-on experience installing, configuring, and managing the Apache Hadoop platform.
Module 1 – Learning Objectives:
By the end of the module, students will understand the basics of big data and have a foundation in Hadoop daemons and Hadoop architecture.
a. Understanding Big Data Basics
b. Big Data Use Cases
c. Introduction to Hadoop
d. Understanding Hadoop Ecosystem
e. Introduction to HDFS
i. Introduction to Namenode
ii. Introduction to Datanode
iii. Introduction to Secondary Namenode
f. Introduction to MapReduce
i. Introduction to JobTracker
ii. Introduction to TaskTracker
g. Summarizing Hadoop Architecture
h. Roles and Responsibilities of a Hadoop Administrator
Module 2 – Learning Objectives:
By the end of the module, students will be able to create a multi-node Hadoop cluster. To prepare for this, the module gives a deep understanding of how Linux works, how to set up virtual machines, and how to set up passwordless SSH.
a. Linux internals
i. Commands that are required
ii. Linux basics
b. Hadoop Cluster Installation Pre-requisites
i. Pre-requisites of Hadoop Installation
1. Software Download
2. Preparing your VM
3. Enabling VM with VMware
4. Understanding mandatory changes in the operating system
c. Installation and Configuration
i. Understanding Hadoop cluster installation modes
ii. Understanding Hadoop version 1 installation and configuration
iii. Passwordless SSH setup
d. Hands-On Practice for creating a Hadoop cluster
i. Helping individually in practicing Hadoop cluster installation
Module 3 – Learning Objectives:
By the end of the module, students will understand how to plan a production Hadoop cluster: the hardware and software requirements, performance tuning after cluster creation, and benchmarking.
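As an illustration of the kind of arithmetic involved in hardware planning, here is a minimal sizing sketch. The replication factor of 3 matches the HDFS default; the 25% reserve for temporary and non-HDFS data and the 24 TB of disk per node are assumptions chosen for the example, and the function name `datanodes_needed` is hypothetical.

```python
import math


def datanodes_needed(data_tb, replication=3, temp_overhead=0.25,
                     disk_per_node_tb=24.0):
    """Estimate how many DataNodes are needed to store data_tb of user data.

    replication      -- HDFS replication factor (3 is the HDFS default)
    temp_overhead    -- assumed fraction of each disk reserved for
                        intermediate and non-HDFS data
    disk_per_node_tb -- assumed raw disk capacity per DataNode, in TB
    """
    if not 0 <= temp_overhead < 1:
        raise ValueError("temp_overhead must be in [0, 1)")
    # Every block is stored `replication` times across the cluster.
    raw_tb = data_tb * replication
    # Only part of each node's disk is available for HDFS blocks.
    usable_per_node_tb = disk_per_node_tb * (1 - temp_overhead)
    return math.ceil(raw_tb / usable_per_node_tb)


if __name__ == "__main__":
    print(datanodes_needed(100))
```

A real plan would also account for data growth, compression, and the memory and CPU needs of the master nodes, none of which this sketch models.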
Module 4 – Learning Objectives:
By the end of the module, students will be able to administer a Hadoop cluster. Students will understand how to copy data from one Hadoop cluster to another, the different Hadoop schedulers for running jobs, and the backup and recovery of metadata, data, configurations, and application data.
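Copying data between clusters is commonly done with the `hadoop distcp` tool. The sketch below only assembles the command line; it does not run it. The host names, port, and paths are placeholders, and the helper name `distcp_command` is hypothetical.

```python
import shlex


def distcp_command(src_uri, dst_uri, update=False):
    """Build the argument list for a `hadoop distcp` copy between clusters.

    update -- if True, add distcp's -update flag so that only files that
              are missing or differ at the destination are copied
    """
    argv = ["hadoop", "distcp"]
    if update:
        argv.append("-update")
    argv += [src_uri, dst_uri]
    return argv


if __name__ == "__main__":
    # Placeholder NameNode addresses and paths, for illustration only.
    cmd = distcp_command(
        "hdfs://nn1.example.com:8020/data/logs",
        "hdfs://nn2.example.com:8020/backup/logs",
        update=True,
    )
    print(shlex.join(cmd))
```

Returning a list rather than a single string makes it straightforward to pass the result to `subprocess.run` on a machine where the Hadoop client is installed.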
Module 5 – Learning Objectives:
By the end of the module, students will understand how Hadoop 2 and YARN work. Topics include the new features of Hadoop version 2, the YARN framework, and deploying a Hadoop 2 cluster in pseudo-distributed and multi-node distributed mode.
a. Hadoop 2.0 new features
i. Understanding Resource Manager
ii. Understanding Application Master
iii. Understanding Node Manager
iv. Understanding Hadoop 2 Job Execution Framework
c. Hadoop 2 Multi-node cluster creation
i. Pre-requisites of Hadoop Installation
ii. Software Download
iii. Preparing your VM
iv. Enabling VM with VMware
v. Understanding mandatory changes in the operating system
vi. Installation and Configuration
vii. Understanding Hadoop version 2 installation and configuration
viii. Passwordless SSH setup
Module 6 – Learning Objectives:
By the end of the module, students will learn how to achieve high availability, how to enable NameNode federation, and what the various improvements in Hadoop 2 are.
a. Practice Hadoop 2 multi-node Cluster Creation
i. Helping individuals in practicing Hadoop 2 cluster installation
b. Sample Yarn Job execution
c. Understanding Issues of Hadoop 1
d. Understanding improvements in Hadoop 2
e. Namenode Federation
i. Enable segregation of HDFS using multiple namenodes
f. Namenode – High Availability
i. Achieving Namenode High-Availability using Quorum Journal Manager
ii. Achieving Namenode High-Availability using Network File System
g. Implementation of NN High Availability
i. Helping individuals achieving Namenode High Availability
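To give a sense of what a Quorum Journal Manager setup involves, the following sketch assembles some of the core `hdfs-site.xml` properties for an HA nameservice as a Python dictionary. The nameservice name, NameNode IDs, and host names are placeholders, the helper `ha_properties` is hypothetical, and 8485 is assumed as the JournalNode port. It is not a complete configuration: fencing and automatic failover settings are omitted.

```python
def ha_properties(nameservice, namenodes, journal_hosts, journal_port=8485):
    """Return core hdfs-site.xml properties for NameNode HA with QJM.

    nameservice   -- logical name of the HA nameservice
    namenodes     -- dict mapping NameNode ID to its 'host:port' RPC address
    journal_hosts -- host names of the JournalNodes
    journal_port  -- JournalNode RPC port (assumed to be 8485)
    """
    journal = ";".join(f"{host}:{journal_port}" for host in journal_hosts)
    props = {
        "dfs.nameservices": nameservice,
        f"dfs.ha.namenodes.{nameservice}": ",".join(namenodes),
        # Shared edit log location served by the JournalNode quorum.
        "dfs.namenode.shared.edits.dir": f"qjournal://{journal}/{nameservice}",
        f"dfs.client.failover.proxy.provider.{nameservice}":
            "org.apache.hadoop.hdfs.server.namenode.ha."
            "ConfiguredFailoverProxyProvider",
    }
    for nn_id, address in namenodes.items():
        props[f"dfs.namenode.rpc-address.{nameservice}.{nn_id}"] = address
    return props


if __name__ == "__main__":
    # Placeholder names, for illustration only.
    example = ha_properties(
        "mycluster",
        {"nn1": "master1.example.com:8020", "nn2": "master2.example.com:8020"},
        ["jn1.example.com", "jn2.example.com", "jn3.example.com"],
    )
    for key, value in sorted(example.items()):
        print(f"{key} = {value}")
```

An odd number of JournalNodes (three here) is the usual choice, because an edit must be acknowledged by a majority of them.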
Module 7 – Learning Objectives:
By the end of the module, students will be able to administer the basic Hadoop ecosystem components such as Hive, HBase, Sqoop, Flume, and Pig.
a. Hadoop Ecosystem Introduction
i. Understanding the integration of Hadoop ecosystem
b. Touchbase with Hive
i. What is Hive
ii. Architecture of Hive
iii. Understanding Hive metastore concepts
i. Understanding HBase Basics
ii. Understanding HBase storage Model
iii. Understanding HBase Architecture
iv. Cluster Installation and Configuration
i. What is Pig?
ii. How does Pig integrate with a Hadoop cluster?
iii. Demo of Pig Jobs using MapReduce
i. What is Sqoop?
ii. How to import and export data between an RDBMS and Hadoop using Sqoop
iii. Example of Sqoop jobs using MySQL
i. What is Flume?
ii. Sample Flume jobs
Module 8 – Learning Objectives:
By the end of the module, students will be able to build a multi-node Cloudera cluster using Cloudera Manager, achieve high availability, and add a new node to the cluster using Cloudera Manager.
a. Understanding the internals of Cloudera Manager
b. Understanding the automation of Hadoop installation using Cloudera Manager
c. Understanding Cloudera Hadoop Distribution and Cloudera Manager
d. Understanding the underlying directory structure of Cloudera Hadoop
e. Cloudera Hadoop Cluster Installation – CDH
An internet speed of 2 Mbps is preferable for attending the LIVE classes and for viewing the training recordings.
Yes, Collabera TACT’s Virtual Machine can be installed on a local machine.
Your system should have 4 GB of RAM, a 64-bit OS, and a processor with Virtualization Technology enabled.
The Hadoop Administration course at Collabera TACT is a 7-weekend course.
The recorded session for each class will be available on the LMS for your reference. We also have a support team, so if you need clarification on concepts or help with debugging or installation, the support team will assist you.
Access to the training infrastructure services is available for the first 120 days (4 months).
Yes, we have a group discount option. You can contact firstname.lastname@example.org to learn more about group discounts.
Yes, we offer a course completion certificate after you successfully complete the training program.