Learn By Example: Spark Streaming 2.x

Perform standard transformations and operations on streams

What's Inside

Spark is an open-source, distributed analytics engine that is popular with developers, data analysts and data scientists because it is easy and intuitive to use. Spark 2.x offers a variety of improvements in performance, efficiency and developer APIs over the original versions of Spark.

In addition to batch data, Spark has powerful support for continuous applications, i.e. streaming data that is constantly updated and changes in real time. The dataset is effectively unbounded, and Spark 2 DataFrames let you work with these unbounded datasets in a natural and intuitive manner.

Here is what is covered in this course:

Streaming architectures: Understanding how to work with unbounded datasets

DStreams vs. Structured Streaming: Understanding how Spark 2 processes streams

Triggers and output modes: Determining when transformations are performed and how data sinks are updated

Grouping and aggregations on streams: Perform Spark transformations on continuous data

Sliding and tumbling windows: Partition streams using windows to perform aggregations

Timestamps, watermarks and late data: Learn to work with event time, ingestion time and processing time

Streaming data from Twitter: Perform analysis on real-world streams

Joins and windowed joins: Perform join operations on batches and streams

Kafka integration: Connect Spark with Kafka to consume tweets and perform analysis
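
To give a feel for the windowing topic listed above, here is a minimal plain-Python sketch of how an event timestamp maps to tumbling and sliding windows. It is not Spark code and is not taken from the course material: the function names `tumbling_window` and `sliding_windows` are hypothetical, timestamps are plain integer seconds, and windows are assumed to be half-open intervals aligned to zero, which mirrors the default behaviour of Spark's `window` function as far as we are aware.

```python
from collections import Counter
from typing import List, Tuple

Window = Tuple[int, int]  # half-open interval [start, end), in seconds


def tumbling_window(ts: int, size: int) -> Window:
    """Return the single non-overlapping window that contains ts."""
    start = ts - (ts % size)
    return (start, start + size)


def sliding_windows(ts: int, size: int, slide: int) -> List[Window]:
    """Return every overlapping window of length `size`, starting at a
    multiple of `slide`, that contains ts."""
    windows = []
    start = ts - (ts % slide)  # latest window start at or before ts
    while start > ts - size:   # window still covers ts while start + size > ts
        windows.append((start, start + size))
        start -= slide
    return sorted(windows)


# Illustrative event times in seconds (made-up values, not course data).
events = [5, 42, 65, 118, 121]

# Count events per 60-second tumbling window.
tumbling_counts = Counter(tumbling_window(t, 60) for t in events)

# With a 60-second window sliding every 30 seconds, each event
# falls into two overlapping windows.
windows_for_65 = sliding_windows(65, size=60, slide=30)
```

With a tumbling window each event belongs to exactly one window, while with a sliding window it belongs to `size / slide` windows (two in this sketch), which is why sliding-window aggregations count the same event more than once.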

This course is built around hands-on demos using datasets from the real world. You'll be analyzing data from restaurants listed on Zomato and real-time Twitter data!

At the end of this course you will be comfortable performing big data analysis on streaming data from multiple sources using Spark 2.

Software used: Spark 2.3, Python 3

Get started now!



Certificate Available
2517+ Students
36 Lectures
2+ Hours of Video
Lifetime Access
24/7 Support
Loonycorn

Loonycorn is made up of two individuals, Janani Ravi and Vitthal Srinivasan, who honed their tech expertise at Google and Stanford. The team believes it has distilled the instruction of complicated tech concepts into funny, practical, engaging courses, and is excited to be sharing its content with eager students.
