Timecodes for "DE Zoomcamp 5.4.3 - Joins in Spark"

Open alexeygrigorev opened this issue 2 years ago • 1 comments

Youtube video: https://www.youtube.com/watch?v=lu7TrqAWuH4

alexeygrigorev, May 20 '22 05:05

0:00:00 - Spark internals, group by, reshuffling, joins
0:01:55 - Join: green and yellow, outer join
0:03:39 - Joining yellow and green datasets
0:05:24 - Complex record creation and reshuffling
0:07:22 - Reshuffling for join using merge sort
0:09:21 - Materializing results for efficient processing
0:11:12 - Joining large tables, small tables
0:13:00 - DataFrame join, drop, save, execution plan
0:14:47 - Small zones, broadcast join, fast

dimzachar, Sep 09 '23 16:09

Updated, thanks!

amitfrancis, Jan 15 '24 11:01