The Big Data Paradigm | StackSkills

Introduction to Hadoop

Introduction

You, this course and Us

Why is Big Data a Big Deal

The Big Data Paradigm
Serial vs Distributed Computing
What is Hadoop?
HDFS or the Hadoop Distributed File System
MapReduce Introduced
YARN or Yet Another Resource Negotiator

Installing Hadoop in a Local Environment

Hadoop Install Modes
Set up a Virtual Linux Instance (For Windows users)
Hadoop Standalone mode Install
Hadoop Pseudo-Distributed mode Install

The MapReduce "Hello World"

The basic philosophy underlying MapReduce
MapReduce - Visualized And Explained
MapReduce - Digging a little deeper at every step
"Hello World" in MapReduce
The Mapper
The Reducer
The Job

Run a MapReduce Job

Get comfortable with HDFS
Run your first MapReduce Job

HDFS and YARN

HDFS - Protecting against data loss using replication
HDFS - Name nodes and why they're critical
HDFS - Checkpointing to backup name node information
YARN - Basic components
YARN - Submitting a job to YARN
YARN - Plug in scheduling policies
YARN - Configure the scheduler

Setting up a Hadoop Cluster

Manually configuring a Hadoop cluster (Linux VMs)
Getting started with Amazon Web Services
Start a Hadoop Cluster with Cloudera Manager on AWS
