Support Athena

Open nicor88 opened this issue 2 years ago • 10 comments

Support for AWS Athena as adapter.

Happy to propose a PR - given some guidelines.

nicor88 avatar Aug 15 '23 18:08 nicor88

We don't have plans for this right now, but we're happy to accept a PR.

Come join tobikodata.com/slack to chat with us about it.

Going to close this in favor of https://github.com/TobikoData/sqlmesh/issues/124, but again, happy to help guide you through a PR.

tobymao avatar Aug 15 '23 18:08 tobymao

@tobymao Athena is based on Trino/Presto, but there are some internal differences.

See for example https://github.com/dbt-athena/dbt-athena and https://github.com/starburstdata/dbt-trino. Most of the SQL commands should be compatible (with some exceptions, e.g. the default catalog), but how the connection is created and how results are returned from Athena are quite different from Trino. I still suggest keeping this as a separate issue rather than reusing the one for Trino/Presto.

nicor88 avatar Aug 15 '23 18:08 nicor88

ok sounds good

tobymao avatar Aug 15 '23 18:08 tobymao

Hi! I'm just wondering how this feature is coming along. I'm currently giving SQLMesh a try, and the Athena adapter is something I would definitely love.

Thank you in advance!

alvaro-ponce avatar Jan 10 '24 12:01 alvaro-ponce

As someone who builds greenfield data estates, we often start with Athena, then switch to other tooling later to manage our data estate when there is volume. It would be great to see Athena supported here like it is with dbt Core.

pixie79 avatar Jun 27 '24 07:06 pixie79

@erindru should we close this as completed given #3154?

georgesittas avatar Sep 25 '24 10:09 georgesittas

@erindru before closing this, I would like to know if:

  • partitioned tables are supported
  • table properties for iceberg specifically are supported (see this)

If not, those are features that I've seen used quite a lot in dbt-athena, and it might be worth having them in.

nicor88 avatar Sep 25 '24 12:09 nicor88

With the initial implementation:

  • Partitioned Iceberg tables are supported
  • Partitioned Hive tables are partially supported. They can be created, but operations that need to delete data will currently fail
  • Table properties are supported; any physical_properties that don't have a special meaning to the adapter are passed through

Follow-up PRs will add:

  • Full support for partitioned Hive tables (insofar as emulating the dbt method for incremental models: identifying affected partitions, looking up their location in the metastore, deleting the objects from S3 and then dropping the partition)
  • Support for loading dbt-athena projects
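
To make the Hive partition steps above concrete, here is a minimal sketch of that sequence: identify affected partitions, look up each location, delete the objects, drop the partition. It is not SQLMesh's or dbt-athena's actual code. `FakeMetastore`, `FakeObjectStore` and `clear_affected_partitions` are hypothetical names, and the in-memory classes stand in for the real metastore and S3 so the flow can be run without AWS.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Mapping, Optional, Sequence, Tuple

PartitionKey = Tuple[str, ...]


@dataclass
class FakeMetastore:
    """Hypothetical in-memory stand-in for a metastore: partition values -> location."""
    locations: Dict[PartitionKey, str] = field(default_factory=dict)

    def location_of(self, key: PartitionKey) -> Optional[str]:
        return self.locations.get(key)

    def drop_partition(self, key: PartitionKey) -> None:
        self.locations.pop(key, None)


@dataclass
class FakeObjectStore:
    """Hypothetical in-memory stand-in for S3: object key -> contents."""
    objects: Dict[str, bytes] = field(default_factory=dict)

    def delete_prefix(self, prefix: str) -> int:
        doomed = [k for k in self.objects if k.startswith(prefix)]
        for k in doomed:
            del self.objects[k]
        return len(doomed)


def clear_affected_partitions(
    new_rows: Iterable[Mapping[str, object]],
    partition_cols: Sequence[str],
    metastore: FakeMetastore,
    store: FakeObjectStore,
) -> List[PartitionKey]:
    """Remove existing data for every partition the incoming rows touch."""
    # 1. Identify the partitions affected by the incoming rows.
    affected = sorted({tuple(str(row[c]) for c in partition_cols) for row in new_rows})
    cleared: List[PartitionKey] = []
    for key in affected:
        # 2. Look up the partition's storage location in the metastore.
        location = metastore.location_of(key)
        if location is None:
            continue  # brand-new partition, nothing to delete
        # 3. Delete the objects under that location. The trailing slash keeps
        #    "ds=2024-01-01" from also matching "ds=2024-01-010".
        store.delete_prefix(location.rstrip("/") + "/")
        # 4. Drop the partition so it can be re-registered with the new data.
        metastore.drop_partition(key)
        cleared.append(key)
    return cleared
```

In a real adapter the two fakes would presumably be replaced by metastore and S3 clients; which calls those would be is left open here.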

So @georgesittas this isn't quite finished yet

erindru avatar Sep 25 '24 20:09 erindru

@erindru Nice, those are the kind of features I was expecting. Great work, I will give it a try soon :)

Is the follow-up PR a WIP? I didn't find anything regarding Athena in the open PRs.

nicor88 avatar Sep 26 '24 08:09 nicor88

Yep, still working on it. I'll link to this issue in the next PR when it's raised

erindru avatar Sep 26 '24 18:09 erindru

This is now released in v0.125.5

The docs are here: https://sqlmesh.readthedocs.io/en/stable/integrations/engines/athena/

Take it for a spin and let us know if you encounter any issues!

erindru avatar Oct 08 '24 02:10 erindru