
Give users more visibility into the limitations of using DuckDB/SQLite

Open rajaths010494 opened this issue 1 year ago • 0 comments

Context

The Python SDK makes it very simple to switch between databases. However, users may not realise that some operations that work fine locally (multiple tasks pointing to a file-based database, such as SQLite or DuckDB) will fail when deployed to a cluster with multiple Airflow workers, because each worker sees only its own local copy of the database file.
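To illustrate the failure mode, here is a minimal sketch that uses only the standard library `sqlite3` module, with no Airflow or SDK involved. It assumes the workers do not share a filesystem and simulates that with two temporary directories. The names `run_on_worker`, `simulate_two_workers` and `example.db` are made up for this illustration.

```python
import os
import sqlite3
import tempfile


def run_on_worker(worker_dir, sql):
    """Run one statement against the SQLite file as seen from one worker's disk."""
    # Hypothetical file name; each "worker" resolves it inside its own directory.
    conn = sqlite3.connect(os.path.join(worker_dir, "example.db"))
    try:
        with conn:  # commits on success
            return conn.execute(sql).fetchall()
    finally:
        conn.close()


def simulate_two_workers():
    """Write on worker A, then read from worker A and from worker B."""
    with tempfile.TemporaryDirectory() as worker_a, \
            tempfile.TemporaryDirectory() as worker_b:
        # Task 1 lands on worker A and creates the table there.
        run_on_worker(worker_a, "CREATE TABLE t (x INTEGER)")
        run_on_worker(worker_a, "INSERT INTO t VALUES (1)")

        # Task 2 on the same worker finds the data.
        same_worker = run_on_worker(worker_a, "SELECT x FROM t")

        # Task 2 on another worker opens a fresh, empty database file.
        try:
            run_on_worker(worker_b, "SELECT x FROM t")
            other_worker = "ok"
        except sqlite3.OperationalError as exc:
            other_worker = str(exc)

        return same_worker, other_worker
```

With a single worker (or the LocalExecutor) every task takes the first path, which is why the problem tends to stay hidden until deployment.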

This potential problem was already raised in this PR / discussion: https://github.com/astronomer/astro-sdk/pull/1750#issuecomment-1433699281

@rajaths010494 this seems like a potentially larger issue than just changing the example DAGs. If we support sqlite and duckdb and those systems fail on any airflow with more than 1 worker we need to have a bigger discussion on what supporting these systems looks like (e.g. do we put up a warning, do we only allow for LocalExecutor, do we encourage remote execution solutions, etc.)

This ticket is meant to help us follow up and decide whether, and how, we'd like to warn users who opt to use these types of databases that their deployment will fail in an Airflow environment with multiple workers.
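As a starting point for the discussion, one of the options mentioned above (putting up a warning) could look roughly like the sketch below. This is not an existing SDK API. The function name, the set of file-based connection types and the set of executors treated as single-node are all assumptions made for illustration, and how the executor name is obtained (for example from the Airflow configuration) is left out.

```python
import warnings

# Assumption: connection types that point at a local database file.
FILE_BASED_CONN_TYPES = {"sqlite", "duckdb"}

# Assumption: executors that run every task on the same machine.
SINGLE_NODE_EXECUTORS = {"SequentialExecutor", "LocalExecutor", "DebugExecutor"}


def warn_if_file_based_database(conn_type, executor_name):
    """Hypothetical check: warn when a file-based database meets a multi-worker executor.

    Returns True if a warning was emitted, False otherwise.
    """
    is_file_based = conn_type.lower() in FILE_BASED_CONN_TYPES
    is_single_node = executor_name in SINGLE_NODE_EXECUTORS
    if is_file_based and not is_single_node:
        warnings.warn(
            f"Connection type '{conn_type}' is file-based; tasks scheduled on "
            f"different workers by {executor_name} may not see the same data.",
            UserWarning,
            stacklevel=2,
        )
        return True
    return False
```

For example, `warn_if_file_based_database("sqlite", "CeleryExecutor")` would emit the warning, while the same connection with `"LocalExecutor"` would not. A stricter variant could raise an exception instead of warning, which corresponds to the "only allow LocalExecutor" option.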

Acceptance criteria

A decision on how we want to handle these scenarios, plus a potential implementation.

rajaths010494, Feb 17 '23 10:02