Modular Programme – Data Analytics

Programme Summary

This modular course is intended for candidates who want to learn how to store, manage, process and analyse massive amounts of unstructured data for competitive advantage, select and implement the correct Big Data stores, and apply sophisticated analytic techniques and tools to process and analyse Big Data. It establishes a strong working knowledge of the concepts, techniques, and products associated with Big Data.

The course provides an overview of how to plan and implement a Big Data solution and of the various technologies that comprise Big Data. Many examples of, and exercises on, Big Data systems are provided throughout the course. The programming examples are in Java, but the primary focus is on best practices that can be applied to any supported programming language.

Those who complete this course are encouraged to take a Hadoop Big Data accreditation test, at a time of their convenience, to obtain a professional certification.

Topics Covered

This modular course contains the following topics

Duration

Full Time: 5 days

Course Objectives

At the end of the course, candidates will learn the following:

Candidates will work through the following lab exercises using the Hadoop Data Platform:

Module Outline

Core Modules:

Computing Environment: The current mix of computing resources and demands that motivates use of a technology like Apache Hadoop
Hadoop Distributed File System: How files are stored and managed in HDFS; the infrastructure that supports HDFS
MapReduce: The phases of execution and framework for running a MapReduce job. Expected properties of job runs based on number of mappers, number of reducers and distribution of data
Hadoop API: The Java classes that make up the API for developers who wish to write Apache Hadoop MapReduce jobs
Hadoop Platform: The basic purpose, design and operation of tools that augment the Apache Hadoop core to make a comprehensive platform, including Hadoop Streaming, fuse-dfs, Apache Hive, Apache Pig, Apache Flume, Apache Sqoop, Apache HBase, Apache Oozie and HUE

Delivery Format

Applicable to Singaporeans/Singapore PRs ONLY