data-engineer-handbook
This is a repo with links to everything you'd ever want to learn about data engineering
- Added new documentation for essential data cleaning steps.
- Included Python code for handling missing values, duplicates, and date formats.
- Improves resources for Data Engineering beginners.
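The three cleaning steps named above (missing values, duplicates, date formats) can be sketched in plain Python. This is a minimal illustration, not the code from the pull request: the row fields (`id`, `name`, `signup_date`), the `unknown` default, and the list of accepted date formats are all assumptions made for the example.

```python
from datetime import datetime

# Hypothetical input rows; the field names are assumptions for illustration.
RAW_ROWS = [
    {"id": 1, "name": "Ada", "signup_date": "2024-01-05"},
    {"id": 2, "name": None, "signup_date": "05/02/2024"},
    {"id": 1, "name": "Ada", "signup_date": "2024-01-05"},  # exact duplicate
    {"id": 3, "name": "Lin", "signup_date": "not a date"},
]

# Formats tried in order. Slash-separated dates are read as day/month/year here,
# which is an assumption; pick the convention your data actually uses.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def parse_date(text):
    """Return an ISO 8601 date string, or None if no known format matches."""
    if text is None:
        return None
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean_rows(rows, default_name="unknown"):
    """Fill missing names, normalize dates, and drop exact duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        fixed = {
            "id": row["id"],
            "name": row["name"] if row["name"] is not None else default_name,
            "signup_date": parse_date(row["signup_date"]),
        }
        key = tuple(fixed.items())
        if key in seen:
            continue  # skip rows identical to one already kept
        seen.add(key)
        cleaned.append(fixed)
    return cleaned


CLEANED = clean_rows(RAW_ROWS)
```

Duplicates are detected after normalization, so two rows that differ only in date format would also collapse into one. Unparseable dates become `None` rather than raising, which keeps the row available for later inspection.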
## About
The assignment request references the `game_details` table from the lecture instead of the intended `events` table.

## Details
[DataExpert-io/data-engineer-handbook/issues/364](https://github.com/DataExpert-io/data-engineer-handbook/issues/364)
## Change Request
Query 1 refers to the lecture table instead of the homework table.

From - A query to deduplicate `game_details` from Day 1 so there are no duplicates

To -...
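For readers unfamiliar with the kind of query being discussed, a common deduplication pattern is `ROW_NUMBER()` over the columns that define a duplicate, keeping only the first row of each group. The sketch below runs that pattern against a toy in-memory SQLite table. The key columns (`game_id`, `team_id`, `player_id`) and the sample rows are assumptions for illustration, and it is not the homework solution; the same shape applies to whichever table the assignment targets. It needs SQLite 3.25 or newer for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Toy table; the column names are assumed for the example.
conn.execute(
    "CREATE TABLE game_details ("
    "game_id INTEGER, team_id INTEGER, player_id INTEGER, pts INTEGER)"
)
conn.executemany(
    "INSERT INTO game_details VALUES (?, ?, ?, ?)",
    [
        (1, 10, 100, 12),
        (1, 10, 100, 12),  # duplicate of the row above
        (1, 10, 101, 7),
        (2, 11, 100, 20),
        (2, 11, 100, 20),  # duplicate of the row above
    ],
)

# Number the rows inside each (game_id, team_id, player_id) group,
# then keep only the first row of every group.
DEDUP_SQL = """
WITH ranked AS (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY game_id, team_id, player_id
               ORDER BY game_id
           ) AS row_num
    FROM game_details
)
SELECT game_id, team_id, player_id, pts
FROM ranked
WHERE row_num = 1
ORDER BY game_id, player_id
"""

deduped = conn.execute(DEDUP_SQL).fetchall()
```

The `ORDER BY` inside the window only matters when duplicate rows differ in some non-key column; in that case it decides which copy survives.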
Update the players table insert query and the season_stats TYPE definition to fix an issue where 1. the games played "gp" attribute was being loaded into the points table. This caused most...
Added book recommendation from Zack
Windows reserves port 9001 (PID 4), and Docker cannot bind to it. I changed it to 9101 on the host without breaking the container. It worked properly for me.
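Before choosing a replacement host port, it can help to check which candidates the operating system will actually let a process bind. The sketch below is one way to do that with the standard `socket` module; the helper names `can_bind` and `first_free_port` are hypothetical, and the check only reflects the moment it runs.

```python
import socket


def can_bind(port, host="0.0.0.0"):
    """Return True if a TCP socket can bind to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Covers both "address already in use" and permission errors
            # raised for ports the OS has reserved.
            return False
    return True


def first_free_port(candidates, host="0.0.0.0"):
    """Return the first port in candidates that can be bound, or None."""
    for port in candidates:
        if can_bind(port, host):
            return port
    return None
```

For example, `first_free_port([9001, 9101])` would return 9101 on a machine where 9001 is reserved and 9101 is free. With Docker, the chosen port goes on the host side of the mapping, such as `-p 9101:9001`, assuming the service inside the container still listens on 9001.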
The handbook still uses the old path, `bootcamp/materials`, in the `1-dimensional-data-modeling` README. I updated it to `intermediate-bootcamp/materials/1-dimensional-data-modeling` @EcZachly
In https://github.com/DataExpert-io/data-engineer-handbook/blob/main/intermediate-bootcamp/materials/1-dimensional-data-modeling/sql/load_players_table_day2.sql, at lines 68 and 69:
Current code: `(seasons[CARDINALITY(seasons)]::season_stats).season = season AS is_active, w.season`
Fix: `w.season,(seasons[CARDINALITY(seasons)]::season_stats).season = season AS is_active`
- Replace DB setup link with the existing, shorter, more up-to-date, and complete guide.
- Retain the video guide as a sub-bullet for those who prefer video.
- Clarify that...