
Timecodes for "DE Zoomcamp 5.3.3 - (Optional) Preparing Yellow and Green Taxi Data"

Open: alexeygrigorev opened this issue 2 years ago • 1 comment

YouTube video: https://www.youtube.com/watch?v=CI3P4tAtru4

alexeygrigorev, May 20 '22 05:05

0:00:00 - Prepare data set for week.
0:01:36 - Bash script for downloading data.
0:03:32 - Formatting numbers in different languages.
0:05:22 - Executing command, saving locally.
0:07:14 - Compressing and downloading data.
0:09:15 - Downloading and analyzing compressed CSV files.
0:11:03 - Data installation and structure review.
0:13:03 - Creating notebook, defining schema.
0:15:04 - Adjusting CSV file types for Spark.
0:17:17 - Schema definition for data conversion.
0:19:35 - Data partitioning and processing in Spark.
0:21:31 - Repetition, fast, green/yellow, file sizes.
0:23:32 - Compressing CSV, schema benefits, executor tasks.

dimzachar, Sep 09 '23 16:09

Updated, thanks!

amitfrancis, Jan 15 '24 11:01