Introduction to Big Data and Hadoop


‘Introduction to Big Data and Hadoop’ is an ideal course package for individuals who want to understand the basic concepts of Big Data and Hadoop. On completing this course, learners will be able to explain what goes on behind the processing of huge volumes of data as the industry moves from Excel-based analytics to real-time analytics. The course focuses on the basics of Big Data and Hadoop, and it provides an overview of the commercial distributions of Hadoop and the components of the Hadoop ecosystem.

Price: 100 USD
Access Period: 180 days

Prerequisite list

  • There are no prerequisites for this course.

Audience list

  • This course is meant for professionals who want a basic understanding of Big Data and Hadoop. It is ideal for professionals in senior management who require a theoretical understanding of how Hadoop can solve their Big Data problems.

What is included

  • 3 hours of self-paced video
  • 2 simulation exams
  • 10 quizzes

Certification Info

  • How To Earn?  Complete 85% of the course and complete 1 simulation test with a minimum score of 60%.
  • How To Maintain?  N/A

Certification Exam Format

  • No Exam

Retake policy

  • N/A

Enrollment Policy

  • Pay the online course fee; access to the online course will be granted within 1 week after payment is received.
  • The course fee is non-refundable.

Course Outline

Introduction to Big Data and Hadoop
  • Introduction to Big Data and Hadoop
  • Objectives
  • Need for Big Data
  • Three Characteristics of Big Data
  • Characteristics of Big Data Technology
  • Appeal of Big Data Technology
  • Handling Limitations of Big Data
  • Introduction to Hadoop
  • Hadoop Configuration
  • Apache Hadoop Core Components
  • Hadoop Core Components—HDFS
  • Hadoop Core Components—MapReduce
  • HDFS Architecture
  • Ubuntu Server—Introduction
  • Hadoop Installation—Prerequisites
  • Hadoop Multi-Node Installation—Prerequisites
  • Single-Node Cluster vs. Multi-Node Cluster
  • MapReduce
  • Characteristics of MapReduce
  • Real-Time Uses of MapReduce
  • Prerequisites for Hadoop Installation in Ubuntu Desktop
  • Hadoop MapReduce—Features
  • Hadoop MapReduce—Processes
  • Advanced HDFS–Introduction
  • Advanced MapReduce
  • Data Types in Hadoop
  • Distributed Cache
  • Distributed Cache (contd.)
  • Joins in MapReduce
  • Introduction to Pig
  • Components of Pig
  • Data Model
  • Pig vs. SQL
  • Prerequisites to Set the Environment for Pig Latin
  • Summary
Hive, HBase and Hadoop Ecosystem Components
  • Hive, HBase and Hadoop Ecosystem Components
  • Objectives
  • Hive—Introduction
  • Hive—Characteristics
  • System Architecture and Components of Hive
  • Basics of Hive Query Language
  • Data Model—Tables
  • Data Types in Hive
  • Serialization and Deserialization
  • UDF/UDAF vs. MapReduce Scripts
  • HBase—Introduction
  • Characteristics of HBase
  • HBase Architecture
  • HBase vs. RDBMS
  • Cloudera—Introduction
  • Cloudera Distribution
  • Cloudera Manager
  • Hortonworks Data Platform
  • MapR Data Platform
  • Pivotal HD
  • Introduction to ZooKeeper
  • Features of ZooKeeper
  • Goals of ZooKeeper
  • Uses of ZooKeeper
  • Sqoop—Reasons to Use It
  • Sqoop—Reasons to Use It (contd.)
  • Benefits of Sqoop
  • Apache Hadoop Ecosystem
  • Apache Oozie
  • Introduction to Mahout
  • Usage of Mahout
  • Apache Cassandra
  • Apache Spark
  • Apache Ambari
  • Key Features of Apache Ambari
  • Hadoop Security—Kerberos
  • Summary
  • Quiz