Scalable Programming with Scala and Spark
You, This Course and Us
You, This Course and Us (2:16)
Installing Scala and Hello World (9:43)
Introduction to Spark
What does Donald Rumsfeld have to do with data analysis? (8:45)
Why is Spark so cool? (12:23)
An introduction to RDDs - Resilient Distributed Datasets (9:39)
Built-in libraries for Spark (15:37)
Installing Spark (11:44)
The Spark Shell (6:55)
See it in Action: Munging Airlines Data with Spark (3:44)
Transformations and Actions (17:06)
Resilient Distributed Datasets
RDD Characteristics: Partitions and Immutability (12:35)
RDD Characteristics: Lineage, RDDs know where they came from (6:06)
What can you do with RDDs? (11:09)
Create your first RDD from a file (14:54)
Average distance travelled by a flight using map() and reduce() operations (6:59)
Get delayed flights using filter(), cache data using persist() (6:11)
Average flight delay in one step using aggregate() (12:21)
Frequency histogram of delays using countByValue() (2:10)
Advanced RDDs: Pair Resilient Distributed Datasets
Special Transformations and Actions (14:45)
Average delay per airport, use reduceByKey(), mapValues() and join() (13:35)
Average delay per airport in one step using combineByKey() (8:23)
Get the top airports by delay using sortBy() (2:51)
Lookup airport descriptions using lookup(), collectAsMap(), broadcast() (10:57)
Advanced Spark: Accumulators, Spark Submit, MapReduce, Behind The Scenes
Get information from individual processing nodes using accumulators (9:25)
Long running programs using spark-submit (7:11)
Spark-Submit with Scala - A demo (6:10)
Behind the scenes: What happens when a Spark script runs? (14:30)
Running MapReduce operations (10:53)
PageRank: Ranking Search Results
What is PageRank? (16:44)
The PageRank algorithm (6:15)
Implement PageRank in Spark (9:45)
Join optimization in PageRank using Custom Partitioning (6:28)
Spark SQL
DataFrames: RDDs + Tables (15:48)
MLlib in Spark: Build a recommendations engine
Collaborative filtering algorithms (12:19)
Latent Factor Analysis with the Alternating Least Squares method (11:39)
Music recommendations using the Audioscrobbler dataset (5:38)
Implement code in Spark using MLlib (14:45)
Spark Streaming
Introduction to streaming (9:55)
Implement stream processing in Spark using DStreams (9:19)
Stateful transformations using sliding windows (8:17)
Graph Libraries
The Marvel social network using Graphs (14:30)
Scala Language Primer
Scala - A "better Java"? (10:13)
How do Classes work in Scala? (11:02)
Classes in Scala - continued (15:50)
Functions are different from Methods (7:30)
Collections in Scala (10:12)
Map, FlatMap - The Functional way of looping (11:36)
First Class Functions revisited (8:46)
Partially Applied Functions (7:31)
Closures (8:07)
Currying (10:34)
Supplementary Installs
Installing IntelliJ (12:43)
Installing Anaconda (9:00)
[For Linux/Mac OS Shell Newbies] Path and other Environment Variables (8:25)
The Marvel social network using Graphs