Pig For Wrangling Big Data
You, This Course and Us
You, This Course and Us
Where does Pig fit in?
Pig and the Hadoop ecosystem
Install and set up
How does Pig compare with Hive?
Pig Latin as a data flow language
Pig with HBase
Pig Basics
Operating modes, running a Pig script, the Grunt shell
Loading data and creating our first relation
Scalar data types
Complex data types - The Tuple, Bag and Map
Partial schema specification for relations
Displaying and storing relations - The dump and store commands
Pig Operations And Data Transformations
Selecting fields from a relation
Built-in functions
Evaluation functions
Using the distinct, limit and order by keywords
Filtering records based on a predicate
Advanced Data Transformations
Group by and aggregate transformations
Combining datasets using Join
Concatenating datasets using Union
Generating multiple records by flattening complex fields
Using Co-Group, Semi-Join and Sampling records
The nested Foreach command
Debug Pig scripts using Explain and Illustrate
Optimizing Data Transformations
Parallelize operations using the Parallel keyword
Join Optimizations: Multiple relations join, large and small relation join
Join Optimizations: Skew join and sort-merge join
Common sense optimizations
A real-world example
Parsing server logs
Summarizing error logs
Installing Hadoop in a Local Environment
Hadoop Install Modes
Set up a Virtual Linux Instance (For Windows users)
Hadoop Standalone mode Install
Hadoop Pseudo-Distributed mode Install
[For Linux/Mac OS Shell Newbies] Path and other Environment Variables
Operating modes, running a Pig script, the Grunt shell