redash icon indicating copy to clipboard operation
redash copied to clipboard

Athena schema refresh failure due to timeout

Open kartheek7895 opened this issue 4 years ago • 5 comments
trafficstars

Issue Summary

The Athena schema refresh fails due to a worker timeout of 30 secs. I've checked the relevant query and it is taking around 500 sec as there lot of DBs, tables, and a lot of queries running at a time on Athena. However, in the data source settings, I've mentioned the schema_name (instead of default) hence in case of an on-demand refresh (i.e from UI) redash should query for schema for that specific schema_name as this query runs much faster.

Current Query:

SELECT table_schema, table_name, column_name
        FROM information_schema.columns
        WHERE table_schema NOT IN ('information_schema')

The above query can be changed to something like:

SELECT table_schema, table_name, column_name
        FROM information_schema.columns
        WHERE table_schema = '<schema_name>'

Reference: https://github.com/getredash/redash/blob/master/redash/query_runner/athena.py

This also fixes the case when one table has a schema issue(i.e having multiple columns with the same name etc.) leading the complete schema refresh to fail.

Technical details:

  • Redash Version: 8.0.2
  • Browser/OS: Any
  • How did you install Redash: Docker

kartheek7895 avatar Feb 22 '21 10:02 kartheek7895

Any workarounds for this issue?

kartheek7895 avatar Mar 02 '21 05:03 kartheek7895

hey @kartheek7895 any workaround you found to fix this issue ??? Please Share was struck with same

srinivas-engg-sre avatar Oct 06 '23 10:10 srinivas-engg-sre

hey @srinivas-engg-sre , @kartheek7895 did you find any workaround to fix this issue ? I am facing the same problem.

anandkummari avatar Feb 29 '24 05:02 anandkummari

rq job default timeout is 180 seconds for automatic refreshes. There is another limit on manual refreshes (when you click the 2024-06-05_14-01 on the UI) which is 300 seconds. These can be confirmed by your Athena console "Recent queries" tab.

2024-06-05_13-58

To workaround thses limits, I had to update the source code:

  1. Manual refreshes: redash/tasks/general.py
-@job("schemas", queue_class=Queue, at_front=True, timeout=300, ttl=90)
+@job("schemas", queue_class=Queue, at_front=True, timeout=1300, ttl=90)
  1. Automatic refreshes: redash/tasks/queries/maintenance.py
-@job("schemas")
+@job("schemas", timeout=1300)

A better way is of course making it an env var.

zachliu avatar Jun 05 '24 18:06 zachliu

If you don't want to update the source code, you can always use the Glue data catalog instead.

2024-06-05_14-10

Since it uses boto3 to traverse all dbs/tables/columns, it's much faster than sending a plain query to Athena.

zachliu avatar Jun 05 '24 18:06 zachliu