data-engineer-handbook issues

Incorrect link for "Learning Spark, 2nd Edition" in book.md

In the `book.md` file, the hyperlink for "Learning Spark, 2nd Edition" currently redirects to the Databricks website. It should instead link to the actual book on Amazon. **Incorrect link**: [https://databricks.com](https://databricks.com)...

Dheerajanuvas07

Update submodule reference

chandrudp29

fix: update README.md for data modeling Docker volumes created

This commit addresses potential confusion about a folder supposedly being created in the root of the repo; no such folder is created. Volumes are created from the given docker-compose.yml. After 'docker...

thirionjwf

Update data modeling README.md

- Remove `edit` from section links for easy navigation
- Previously these links prompted users without edit access to fork the repo rather than navigate within it
- Slight wording...

Ho1yShif

Exit X difficult to see in DataExpert assignments popup

## Issue
**Note:** This is an issue in the DataExpert UI itself, not in this repository, but I wasn't sure where else to raise it. The `X` to exit the...

Ho1yShif

Issue 1: Inconsistent use of environment variables for Kafka SASL configuration

The `start_job.py` file uses a different method of setting the `sasl.jaas.config` property for the Kafka sink compared to the `aggregation_job.py` file. The former constructs the config string inline within the...

PrinceSajjadHussain
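
One way to remove the inconsistency described in the Kafka SASL issue above is to have both jobs call a single helper that builds the `sasl.jaas.config` value. The following is a minimal sketch, not the repository's actual code: the function name and the `KAFKA_KEY` / `KAFKA_SECRET` variable names are hypothetical, and it assumes the PLAIN login module is in use.

```python
import os


def build_sasl_jaas_config(env=os.environ):
    """Build a JAAS config string for Kafka's PLAIN login module.

    KAFKA_KEY and KAFKA_SECRET are hypothetical environment variable
    names chosen for this sketch; substitute whatever the jobs use.
    """
    key = env["KAFKA_KEY"]        # raises KeyError if the variable is unset
    secret = env["KAFKA_SECRET"]
    return (
        "org.apache.kafka.common.security.plain.PlainLoginModule required "
        f'username="{key}" password="{secret}";'
    )
```

Both `start_job.py` and `aggregation_job.py` could then import this helper instead of assembling the string in two different ways.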

Issue 1: Inconsistent App Names

The Spark application name does not distinguish between Spark jobs: `monthly_user_site_hits_job.py`, `players_scd_job.py`, and `team_vertex_job.py` all use "players_scd". This can lead to confusion when monitoring or...

PrinceSajjadHussain
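
For the app-name issue above, one possible approach is to derive each job's name from its own file name so that copies of a job cannot silently share a name. This is only a sketch under that assumption; `app_name_for` is a hypothetical helper, not something taken from the repository.

```python
from pathlib import Path


def app_name_for(script_path):
    """Derive a Spark app name from a job file name (hypothetical helper).

    "jobs/team_vertex_job.py" becomes "team_vertex".
    """
    name = Path(script_path).stem          # drop directory and ".py"
    if name.endswith("_job"):
        name = name[: -len("_job")]        # drop the trailing "_job"
    return name
```

A job could then pass `app_name_for(__file__)` to `SparkSession.builder.appName(...)` instead of a hard-coded string.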

fix: mc config and homework clarification

# Co-author
Co-authored with @samlafell
## What
- fixes a versioning issue with the mc entrypoint command in the docker file
- Before, it was running an outdated `config` command...

Ho1yShif

Issue 1: Missing error handling for Statsig initialization

The `statsig.initialize(API_KEY)` call in `server.py` does not have any error handling. If `API_KEY` is not set or is invalid, `statsig.initialize` could fail, causing the entire application to crash. A try-except...

PrinceSajjadHussain
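
The try-except suggested in the Statsig issue above could look roughly like the sketch below. To keep it self-contained, the initializer is passed in as a plain callable rather than imported; `initialize_safely` is a hypothetical name, and which exceptions `statsig.initialize` actually raises is not stated in the issue, so the sketch catches `Exception` broadly.

```python
import logging

logger = logging.getLogger(__name__)


def initialize_safely(initialize, api_key):
    """Call an initializer without letting a failure crash the app.

    `initialize` stands in for a call such as statsig.initialize;
    returns True on success and False if the key is missing or the
    call raises.
    """
    if not api_key:
        logger.error("API key is not set; skipping initialization")
        return False
    try:
        initialize(api_key)
    except Exception:
        # Broad catch is an assumption: the specific error types are unknown.
        logger.exception("Initialization failed")
        return False
    return True
```

In `server.py` this might be used as `initialize_safely(statsig.initialize, API_KEY)`, with the caller deciding whether to continue in a degraded mode or exit with a clear message when it returns `False`.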

Add the book Streaming Database and mention Timeplus in README

Add 2 links in books.md and README.md:
- [Streaming Databases: Unifying Batch and Stream Processing](https://www.amazon.com/Streaming-Databases-Unifying-Stream-Processing/dp/1098154835)
- [Timeplus](https://www.timeplus.com/)

jovezhong
