Course Objective

This course introduces Big Data concepts and their benefits for business data analytics, data storage and data processing technologies, and the tools and techniques for handling big data.

Course Outline

Day 1:

● Big data Introduction

● The business importance of big data

● Data characteristics and data stores

● Big data processing technologies

● Hadoop architecture

● MapReduce introduction

● HDFS introduction

● Hadoop in a production website architecture

● Lab: Hadoop cluster installation based on Apache Hadoop and CDH (Cloudera Manager)

● Comparison of Cloudera, MapR, Hortonworks and Apache Hadoop

● Lab: MapReduce in action with Map, Reduce, Combiner, Partitioner, InputFormat, InputSplit, RecordReader

Day 2:

● Pig / Hive / HBase Introduction

● ZooKeeper, Sqoop and Flume

● Spark Introduction

● Working on RDDs

● Spark on Hadoop

● Processing big data

● Handling streaming data

● Parallel programming

● Storm introduction

● Hive & HBase in practice on production websites

● Lab: Hive in action

● Lab: Spark installation with version 1.2

● Spark in action by coding with Java, Python and Scala

● Storm & Spark in practice on production websites