Flume & Sqoop for Ingesting Big Data

Import data to HDFS, HBase and Hive from a variety of sources, including Twitter and MySQL

What's Inside

Course Description

Taught by a four-person team that includes two Stanford-educated ex-Googlers and two ex-Flipkart lead analysts. The team has decades of practical experience working with Java and with billions of rows of data.

Use Flume and Sqoop to import data to HDFS, HBase and Hive from a variety of sources, including Twitter and MySQL

Let’s parse that.

Import data: Flume and Sqoop play a special role in the Hadoop ecosystem. They transport data from sources that hold or produce it, such as local file systems, HTTP, MySQL and Twitter, to data stores like HDFS, HBase and Hive. Both tools come with built-in functionality that shields users from the complexity of moving data between these systems.

Flume: Flume Agents can transport data produced by a streaming application to data stores like HDFS and HBase.
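As a rough illustration of what a Flume agent looks like in practice, the sketch below renders a minimal agent configuration with one spooling-directory source, one memory channel and one HDFS sink. The agent name `a1`, the component names `r1`, `c1` and `k1`, the directory and the HDFS URL are all placeholder assumptions, not values from the course.

```python
def render_agent_config(agent, spool_dir, hdfs_path):
    """Return the text of a Flume agent properties file: one source, one channel, one sink."""
    lines = [
        # Declare the components that make up this agent (names are arbitrary labels).
        f"{agent}.sources = r1",
        f"{agent}.channels = c1",
        f"{agent}.sinks = k1",
        # Source: watch a local directory and turn newly dropped files into events.
        f"{agent}.sources.r1.type = spooldir",
        f"{agent}.sources.r1.spoolDir = {spool_dir}",
        f"{agent}.sources.r1.channels = c1",
        # Channel: buffer events in memory between the source and the sink.
        f"{agent}.channels.c1.type = memory",
        # Sink: write events to HDFS as plain data rather than a SequenceFile.
        f"{agent}.sinks.k1.type = hdfs",
        f"{agent}.sinks.k1.hdfs.path = {hdfs_path}",
        f"{agent}.sinks.k1.hdfs.fileType = DataStream",
        f"{agent}.sinks.k1.channel = c1",
    ]
    return "\n".join(lines) + "\n"


# Placeholder values, assumed for illustration only.
config_text = render_agent_config("a1", "/tmp/spool", "hdfs://localhost:9000/flume/events")
print(config_text)
```

Saved to a file, a configuration like this would typically be started with something like `flume-ng agent --name a1 --conf-file <file>`, where the agent name must match the prefix used in the file.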

Sqoop: Use Sqoop to bulk import data from a traditional RDBMS into Hadoop storage such as HDFS or Hive.
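A bulk import of this kind usually comes down to a single `sqoop import` command. The sketch below only assembles that command line so it can be inspected before running; the JDBC URL, user name, table and target directory are hypothetical placeholders, and the helper function itself is an illustration rather than part of Sqoop.

```python
import shlex


def sqoop_import_args(jdbc_url, username, table, target_dir=None, hive=False, mappers=1):
    """Build the argument list for a one-off Sqoop import from an RDBMS table."""
    args = [
        "sqoop", "import",
        "--connect", jdbc_url,        # JDBC connection string for the source database
        "--username", username,
        "-P",                         # prompt for the password instead of passing it inline
        "--table", table,
        "--num-mappers", str(mappers),
    ]
    if hive:
        # Load the imported rows into a Hive table instead of a plain HDFS directory.
        args.append("--hive-import")
    elif target_dir:
        # Write the imported rows as files under this HDFS directory.
        args += ["--target-dir", target_dir]
    return args


# Placeholder database and paths, assumed for illustration only.
to_hdfs = sqoop_import_args("jdbc:mysql://localhost/shop", "analyst", "orders",
                            target_dir="/data/orders")
to_hive = sqoop_import_args("jdbc:mysql://localhost/shop", "analyst", "orders", hive=True)
print(shlex.join(to_hdfs))
print(shlex.join(to_hive))
```

Passing the list to `subprocess.run` on a machine with Sqoop and the MySQL JDBC driver installed would start the import; printing it first makes it easy to check the flags.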

What's Covered:

Practical implementations for a variety of sources and data stores:

  • Sources: Twitter, MySQL, Spooling Directory, HTTP
  • Sinks: HDFS, HBase, Hive

Flume features:

Flume Agents, Flume Events, Event bucketing, Channel selectors, Interceptors
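To make the idea of a Flume event concrete: an event is a set of string headers plus a body. The sketch below builds the JSON payload that Flume's HTTP source accepts with its default JSON handler, which is an array of objects with `headers` and `body` fields. The URL, port and header names are assumptions for illustration, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_flume_http_request(url, events):
    """Build a POST request carrying a batch of events as a JSON array."""
    payload = json.dumps(
        [{"headers": dict(headers), "body": body} for headers, body in events]
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical endpoint and event; header values are strings.
request = build_flume_http_request(
    "http://localhost:44444",
    [({"source": "web"}, "user signed up")],
)
print(request.data.decode("utf-8"))
# urllib.request.urlopen(request)  # uncomment to send once an agent is listening
```

Headers are what features such as interceptors and channel selectors act on, so a header like the hypothetical `source` above is the kind of value a multiplexing channel selector could route by.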

Sqoop features:

Sqoop import from MySQL, Incremental imports using Sqoop Jobs
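An incremental import is commonly wrapped in a saved Sqoop job, so that Sqoop remembers where the previous run stopped. The sketch below assembles the two commands involved: one to create the job and one to execute it. The job name, connection details and the `id` check column are hypothetical, and password handling is left out for brevity.

```python
import shlex


def sqoop_incremental_job_args(job_name, jdbc_url, username, table, check_column, target_dir):
    """Build the command that saves an append-mode incremental import as a named job."""
    return [
        "sqoop", "job", "--create", job_name,
        "--",                          # everything after this is the saved import definition
        "import",
        "--connect", jdbc_url,
        "--username", username,
        "--table", table,
        "--target-dir", target_dir,
        "--incremental", "append",     # only pull rows newer than the last recorded value
        "--check-column", check_column,
        "--last-value", "0",           # starting point for the very first run
    ]


def sqoop_job_exec_args(job_name):
    """Build the command that runs a previously saved job."""
    return ["sqoop", "job", "--exec", job_name]


# Placeholder names, assumed for illustration only.
create_cmd = sqoop_incremental_job_args(
    "orders_incremental", "jdbc:mysql://localhost/shop", "analyst",
    "orders", "id", "/data/orders",
)
exec_cmd = sqoop_job_exec_args("orders_incremental")
print(shlex.join(create_cmd))
print(shlex.join(exec_cmd))
```

The job is created once and then executed on a schedule; after each run the saved job should record the latest value of the check column, so the next execution imports only rows added since.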

Mail us about anything - anything! - and we will always reply :-)

What are the requirements?

  • Knowledge of HDFS is a prerequisite for the course
  • HBase and Hive examples assume a basic understanding of the HBase and Hive shells
  • Most of the examples need HDFS to run, so you'll need a working HDFS installation

What am I going to get from this course?

  • Use Flume to ingest data to HDFS and HBase
  • Use Sqoop to import data from MySQL to HDFS and Hive
  • Ingest data from a variety of sources including HTTP, Twitter and MySQL

What is the target audience?

  • Yep! Engineers building an application with HDFS/HBase/Hive as the data store
  • Yep! Engineers who want to port data from legacy data stores to HDFS




Certificate Available
2735+ Students
17 Lectures
2+ Hours of Video
Lifetime Access
24/7 Support
Loonycorn

Loonycorn comprises two individuals, Janani Ravi and Vitthal Srinivasan, who honed their tech expertise at Google and Stanford. The team believes it has distilled the instruction of complicated tech concepts into funny, practical, engaging courses, and is excited to share its content with eager students.
