Advanced MapReduce in Hadoop
Introduction
You, this course and Us
Juicing your MapReduce - Combiners, Shuffle and Sort and The Streaming API
Parallelize the reduce phase - use the Combiner
Not all Reducers are Combiners
How many mappers and reducers does your MapReduce have?
Parallelizing reduce using Shuffle And Sort
MapReduce is not limited to the Java language - Introducing the Streaming API
Python for MapReduce
MapReduce Customizations For Finer Grained Control
Setting up your MapReduce to accept command line arguments
The Tool, ToolRunner and GenericOptionsParser
Configuring properties of the Job object
Customizing the Partitioner, Sort Comparator, and Group Comparator
The Inverted Index, Custom Data Types for Keys, Bigram Counts and Unit Tests!
The heart of search engines - The Inverted Index
Generating the inverted index using MapReduce
Custom data types for keys - The Writable Interface
Represent a Bigram using a WritableComparable
MapReduce to count the Bigrams in input text
Test your MapReduce job using MRUnit
Input and Output Formats and Customized Partitioning
Introducing the File Input Format
Text And Sequence File Formats
Data partitioning using a custom partitioner
Make the custom partitioner real in code
Total Order Partitioning
Input Sampling, Distribution, Partitioning and configuring these
Secondary Sort
Configuring properties of the Job object