Big Data - Hadoop
· Introduction – Big Data
· Hadoop Basic Concepts
· Data Storage Framework – HDFS (Hadoop Distributed File System)
o HDFS Architecture
o Data Read/Write and data integrity
o Basic commands
· Data Processing Framework – MapReduce (MR1)
o MapReduce Architecture
o MapReduce Program Flow
o Various components of MapReduce Program
o Using the Distributed Cache and counters
o Debugging techniques
· Advanced Data Processing Framework – YARN (MR2)
o Motivation for a new framework
o MR1 vs. MR2
o YARN Architecture
· PIG – Hadoop Data Flow Language
o Understanding PIG
o Program/Flow Organization
o Pig Data Types
o Pig Operations
· HIVE – Hadoop Data Warehouse System
o What is HIVE
o Data Types
o HiveQL : Data Definition
o HiveQL : Data Manipulation
· Sqoop
o Why Sqoop is needed
o Import/export data between RDBMS and HDFS
o Sqoop Query
· Other Hadoop Components - Introduction
o Flume
o Oozie
o ZooKeeper
o HBase