data-engineering-zoomcamp
Timecodes for "DE Zoomcamp 5.4.2 - GroupBy in Spark"
YouTube video: https://www.youtube.com/watch?v=9qrDsY_2COo
0:00:00 - Spark group by query explained
0:02:05 - Data analysis and filtering process
0:04:18 - Order by, group by explanation
0:06:19 - Combining subresults for group by
0:08:10 - Reshuffling: partitioning and sorting algorithm
0:10:13 - Reshuffling, combining, ordering, filtering, repartitioning
0:12:12 - Shuffling data for optimization
Updated, thank you!