
staging_dir for Athena dialect should be optional

Open mehd-io opened this issue 3 years ago • 2 comments

Is your feature request related to a problem? Please describe. For the Athena connection, we need to provide a staging_dir parameter where Athena will output the query result. However, boto3 has not required this for quite some time. Dropping it simplifies the interaction with Athena:

  • No need for that parameter
  • No need to have an s3 bucket + write access to this one
  • bonus (not sure about the exact Athena mechanism): the query may run slightly faster, as it doesn't dump the data

I can see that the dialect is based on pyathena, and I think this parameter is already optional there, here:

Describe the solution you'd like Make the staging_dir optional in the Athena dialect https://github.com/sodadata/soda-sql/blob/43e63e1e82024e64241ae7f9a281984f46effff3/packages/athena/sodasql/dialects/athena_dialect.py#L72

mehd-io avatar Sep 14 '21 12:09 mehd-io

According to this: https://github.com/laughingman7743/PyAthena/blob/master/pyathena/connection.py#L94 either staging_dir or workgroup must be set; otherwise it will fail. Are you sure that staging_dir is not required?

vijaykiran avatar Oct 25 '21 14:10 vijaykiran

It does break pyathena: https://github.com/sodadata/soda-sql/runs/3997637517?check_suite_focus=true#step:5:4163

vijaykiran avatar Oct 25 '21 14:10 vijaykiran
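
The constraint discussed in the comments (staging_dir may be omitted only if a workgroup is given) could be sketched roughly as below. This is an illustration, not the soda-sql or pyathena implementation: the helper name `build_athena_connect_kwargs` is hypothetical, and it assumes the resulting dict would be passed to `pyathena.connect()` using its `s3_staging_dir`, `work_group` and `region_name` keyword arguments. It also assumes the workgroup has its own query result location configured.

```python
from typing import Optional


def build_athena_connect_kwargs(
    region_name: str,
    staging_dir: Optional[str] = None,
    work_group: Optional[str] = None,
) -> dict:
    """Hypothetical helper: build keyword arguments for pyathena.connect().

    staging_dir is optional, but only when work_group is provided.
    """
    if not staging_dir and not work_group:
        # Mirrors the requirement raised in the thread: one of the two must be set.
        raise ValueError("Athena needs either staging_dir or work_group")

    kwargs = {"region_name": region_name}
    if staging_dir:
        kwargs["s3_staging_dir"] = staging_dir
    if work_group:
        kwargs["work_group"] = work_group
    return kwargs
```

With something like this, a configuration that sets only a workgroup would pass validation, while one that sets neither would fail early with a clear message instead of failing inside pyathena.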