Big Data and Hadoop

Course type: E-learning
Duration: 36 hours
Delivery: Online

Big Data and Hadoop Administrator E-learning

The amount of data generated today across all industry domains, known as big data, is huge, and so is the demand for certified Hadoop professionals. Certification gives you an edge over other IT professionals and is proof of your big data skills. This Big Data and Hadoop Administration training course helps you understand the basic and advanced concepts of big data and the technologies in the Hadoop ecosystem.

Course overview

About the course

This big data training course will give you the skills needed to excel in the big data analytics industry. You will learn how to set up, secure, safeguard and monitor big data clusters and their components such as Sqoop, Flume, Pig, Hive and Impala. You will also learn how to work with Hadoop’s distributed file system, its processing and computation frameworks, core Hadoop distributions, and vendor-specific distributions such as Cloudera.

What's covered?

The course will cover the following topics:

  • Lesson 1 - Big Data and Hadoop - introduction 
  • Lesson 2 - HDFS (Hadoop distributed file system)
  • Lesson 3 - Hadoop cluster setup and working 
  • Lesson 4 - Hadoop configurations and daemon logs 
  • Lesson 5 - Hadoop cluster maintenance and administration 
  • Lesson 6 - Hadoop computational frameworks 
  • Lesson 7 - Scheduling: managing resources 
  • Lesson 8 - Hadoop cluster planning 
  • Lesson 9 - Hadoop clients and Hue interface 
  • Lesson 10 - Data ingestion in Hadoop cluster 
  • Lesson 11 - Hadoop ecosystem components and services
  • Lesson 12 - Hadoop security 
  • Lesson 13 - Hadoop cluster monitoring

Successful evaluation of one of the following two projects is part of the Hadoop Admin certification eligibility criteria:

  • Project 1

Scalability: Deploying multiple clusters
Your company wants to set up a new cluster and has procured new machines; however, setting up clusters on new machines will take time. Meanwhile, your company wants you to set up a new cluster on the same set of machines and start testing how the new cluster and its applications work.

  • Project 2

Working with clusters
Demonstrate your understanding of the following tasks (give the steps):

  • Enabling and disabling high availability (HA) for the NameNode and ResourceManager in CDH
  • Removing the Hue service from your cluster, which has other services such as Hive, HBase, HDFS, and YARN set up
  • Adding a user and granting read access to your Cloudera cluster
  • Changing replication and block size of your cluster
  • Adding Hue as a service, logging in as user HUE, and downloading examples for Hive, Pig, job designer, and others
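
To make the replication and block size task above more concrete, here is a minimal Python sketch of the arithmetic involved. It assumes the standard HDFS behaviour that a file is split into fixed-size blocks and that each block is stored `replication` times, with the final block occupying only the space it needs. The function name `hdfs_footprint` is hypothetical, and the defaults of 128 MB and 3 reflect the usual values of the HDFS properties `dfs.blocksize` and `dfs.replication` in recent Hadoop releases; your distribution may be configured differently.

```python
MB = 1024 * 1024

def hdfs_footprint(file_size_bytes, block_size_bytes=128 * MB, replication=3):
    """Return (number_of_blocks, raw_bytes_stored) for one file.

    Hypothetical helper for illustration only; it does not talk to a cluster.
    """
    if file_size_bytes < 0 or block_size_bytes <= 0 or replication < 1:
        raise ValueError("sizes must be non-negative and replication at least 1")
    # Integer ceiling division: a partly filled last block still counts as a block.
    blocks = (file_size_bytes + block_size_bytes - 1) // block_size_bytes
    # Every byte of the file is stored once per replica.
    raw_bytes = file_size_bytes * replication
    return blocks, raw_bytes

if __name__ == "__main__":
    one_gb = 1024 * MB
    print(hdfs_footprint(one_gb))                              # 128 MB blocks, 3 replicas
    print(hdfs_footprint(one_gb, block_size_bytes=256 * MB))   # larger blocks, fewer of them
    print(hdfs_footprint(one_gb, replication=2))               # fewer replicas, less raw storage
```

Doubling the block size halves the number of blocks tracked for a large file, while lowering the replication factor reduces raw storage at the cost of fault tolerance.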

For additional practice we offer two more projects to help you start your Hadoop administrator journey:

  • Project 3

Data ingestion and usage
Your organisation already has a large amount of data in an RDBMS and has now set up a big data practice. It wants to move that data from the RDBMS into HDFS so that it can analyse it with software packages such as Apache Hive, and to benefit from HDFS features such as automatic replication and fault tolerance. In this project you ingest data from external structured databases into HDFS, load it into a data warehouse package such as Hive, and use HiveQL to query and analyse the data and to load the results into another set of tables for further use.
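
As an illustration of the ingestion step, the following Python sketch assembles an Apache Sqoop import command, one common way of moving a table from an RDBMS into HDFS or Hive. It only builds and prints the command; it does not run it. The function name `build_sqoop_import`, the JDBC URL, the user name, the table and the target directory are all hypothetical placeholders, and the project itself may call for a different tool or different options.

```python
import shlex

def build_sqoop_import(jdbc_url, username, table, target_dir,
                       num_mappers=4, hive_import=False):
    """Return the argument list for a Sqoop import (illustrative only)."""
    cmd = [
        "sqoop", "import",
        "--connect", jdbc_url,              # JDBC connection string of the source database
        "--username", username,
        "-P",                               # prompt for the password instead of embedding it
        "--table", table,                   # source table to copy
        "--target-dir", target_dir,         # HDFS directory that receives the data
        "--num-mappers", str(num_mappers),  # number of parallel map tasks
    ]
    if hive_import:
        cmd.append("--hive-import")         # also load the imported data into Hive
    return cmd

if __name__ == "__main__":
    # All values below are made-up examples.
    cmd = build_sqoop_import(
        "jdbc:mysql://db.example.com/sales", "etl_user",
        "orders", "/user/etl/orders", hive_import=True,
    )
    print(shlex.join(cmd))
```

Once the data is in Hive, it can be queried and reshaped with HiveQL as the project describes.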

  • Project 4

Securing data and cluster
Your organisation would like to safeguard its data on multiple Hadoop clusters. In this project you protect the data stored in your Hadoop cluster by securing it and backing it up. The aim is to prevent data loss from accidental deletes and to keep critical data available to users and applications even if one or more of the clusters is down.
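
One possible approach to these two goals, sketched below in Python, combines HDFS snapshots (to recover from accidental deletes) with DistCp (to copy critical data to a second cluster). This is an assumption about how the project could be tackled, not the course's prescribed solution. The function name `build_protection_commands`, the paths and the NameNode addresses are hypothetical, and the sketch only builds the commands without running them.

```python
import shlex

def build_protection_commands(path, snapshot_name, source_nn, backup_nn, backup_path):
    """Return the commands for snapshotting a directory and copying it to a backup cluster."""
    return [
        # Allow snapshots on the directory (an administrator operation).
        ["hdfs", "dfsadmin", "-allowSnapshot", path],
        # Take a named, read-only snapshot of the directory.
        ["hdfs", "dfs", "-createSnapshot", path, snapshot_name],
        # Copy the directory to a second cluster with DistCp.
        ["hadoop", "distcp",
         f"hdfs://{source_nn}{path}",
         f"hdfs://{backup_nn}{backup_path}"],
    ]

if __name__ == "__main__":
    # Host names and paths below are made-up examples.
    for cmd in build_protection_commands(
        "/data/critical", "nightly",
        "nn1.example.com:8020", "nn2.example.com:8020", "/backup/critical",
    ):
        print(shlex.join(cmd))
```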

Target audience

Big data career opportunities are on the rise and Hadoop is a must-know technology for the following professionals:

  • Systems administrators and IT managers
  • IT administrators and operators
  • IT systems engineers
  • Data engineers and database administrators
  • Data analytics administrators
  • Cloud systems administrators
  • Web engineers
  • Individuals who intend to design, deploy and maintain Hadoop clusters

Learning objectives

By the end of this course you will be able to:

  • Understand the fundamentals and characteristics of big data
  • Master the concepts of the Hadoop framework, including architecture, the Hadoop distributed file system and deployment of Hadoop clusters, using core or vendor-specific distributions
  • Use Cloudera Manager for setup, deployment, maintenance and monitoring of Hadoop clusters
  • Understand Hadoop administration activities and computational frameworks for processing big data
  • Work with Hadoop clients, client nodes and web interfaces such as Hue
  • Plan clusters, use tools for ingesting data into Hadoop clusters, and carry out cluster monitoring activities
  • Understand security implementation to secure data and clusters

What's included?

The Big Data and Hadoop Administration training course is offered by Simplilearn, a partner of ILX Group.

Materials

20 hours of self-paced video

Duration of access

12 months' online access to accredited e-learning

Exam information

Online self-learning:

  • Complete 85% of the course
  • Complete one project and one simulation test with a minimum score of 80%