Possible race condition? AttributeError: 'NoneType' object has no attribute 'type'

Open gpspake opened this issue 2 years ago • 3 comments

Issue Summary

We're experiencing what I suspect could be a race condition in execution.py.

Using a Redshift data source, both the test and queries using the data source fail more often than not with the error: AttributeError: 'NoneType' object has no attribute 'type'

Steps to Reproduce

  1. Set up a new Redshift data source in the dashboard.
  2. Click test until the data source test fails with Connection Test Failed: No row was found for one(). (This may take a few tries because the failure is intermittent.)
  3. Create a query using the Redshift data source and run it. Again, you should get results intermittently, but most of the time it will throw the error and fail.

Technical details:

The error in question is happening here: https://github.com/getredash/redash/blob/master/redash/tasks/queries/execution.py#L250

Here's what I've gathered and why I think this may be a race condition. The QueryExecutor init assigns

self.data_source = self._load_data_source()

Then run() logs via

self._log_progress("executing_query")

which is where we get that error when _log_progress uses self.data_source.type and it blows up.

Whether or not the query fails may depend on whether self._load_data_source() has returned a data source yet. I'm not sure if there's a reason QueryExecutor gets data_source this way as opposed to requiring it on init.

When it works, it works, so everything seems to be configured properly, which leads us to think this is a bug.
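The failure mode described above can be sketched in isolation. This is a minimal stand-in, not the Redash code: QueryExecutorSketch, DATA_SOURCES, and try_run are hypothetical names, and the dict lookup is an assumption standing in for whatever _load_data_source() actually does. One thing it illustrates is that __init__ runs synchronously, so by the time run() is called the assignment has already finished; the AttributeError appears when the loader returned None, not when it has yet to return.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the data source table: id -> data source object.
DATA_SOURCES = {1: SimpleNamespace(type="redshift")}


class QueryExecutorSketch:
    """Simplified sketch of the pattern described in the issue (not Redash code)."""

    def __init__(self, data_source_id, lookup):
        self.data_source_id = data_source_id
        self._lookup = lookup
        # Runs to completion before __init__ returns; the result may be None.
        self.data_source = self._load_data_source()

    def _load_data_source(self):
        # Assumption: the lookup yields None when no matching row exists.
        return self._lookup.get(self.data_source_id)

    def _log_progress(self, state):
        # Raises AttributeError if self.data_source is None.
        return f"state={state} ds_type={self.data_source.type}"

    def run(self):
        return self._log_progress("executing_query")


def try_run(data_source_id, lookup=DATA_SOURCES):
    """Return the log line on success, or the error message on failure."""
    try:
        return QueryExecutorSketch(data_source_id, lookup).run()
    except AttributeError as exc:
        return f"AttributeError: {exc}"


if __name__ == "__main__":
    print(try_run(1))  # data source found
    print(try_run(2))  # data source missing, lookup returned None
```

If this sketch matches the real behavior, the intermittent part would be the lookup itself (for example, a process that cannot see the data source row), rather than the timing between __init__ and run().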

This is happening on latest master with redash running in ECS.

gpspake avatar Oct 04 '23 18:10 gpspake

Anecdote from a team member, in case it's useful:

FWIW, the redshift connector works perfectly fine for me with Redash running in my local docker. I'm wondering if it's something about access/credentials existing on the right Redash components? There's 3 different redash components, maybe after 10 seconds it hands off to the "scheduler" instead of the "worker" and that one doesn't have access or something? The actual NoneType reference seems like a symptom of some failure

gpspake avatar Oct 05 '23 21:10 gpspake

Guessing there's no more error info (tracebacks, etc.) in the docker logs (scheduler, worker, or server) at around the time you're seeing this NoneType one?

justinclift avatar Oct 05 '23 23:10 justinclift

Not that I'm aware of, but I'm glad to share more details if anyone has specific ideas about what to look for. Here's a sample of the logs from when we see the error: (image)

gpspake avatar Oct 10 '23 16:10 gpspake