data-engineering-zoomcamp
Timecodes for "DE Zoomcamp 5.3.3 - (Optional) Preparing Yellow and Green Taxi Data"
YouTube video: https://www.youtube.com/watch?v=CI3P4tAtru4
0:00:00 - Prepare data set for week
0:01:36 - Bash script for downloading data
0:03:32 - Formatting numbers in different languages
0:05:22 - Executing command, saving locally
0:07:14 - Compressing and downloading data
0:09:15 - Downloading and analyzing compressed CSV files
0:11:03 - Data installation and structure review
0:13:03 - Creating notebook, defining schema
0:15:04 - Adjusting CSV file types for Spark
0:17:17 - Schema definition for data conversion
0:19:35 - Data partitioning and processing in Spark
0:21:31 - Repetition, fast, green/yellow, file sizes
0:23:32 - Compressing CSV, schema benefits, executor tasks
Updated, thanks!