redash
Possible race condition? AttributeError: 'NoneType' object has no attribute 'type'
Issue Summary
We're experiencing what I expect could be a race condition in execution.py
Using a Redshift data source, both the test and queries using the data source fail more often than not with the error:
AttributeError: 'NoneType' object has no attribute 'type'
Steps to Reproduce
- Set up a new Redshift data source in the dashboard
- Click test until the data source test fails with
Connection Test Failed: No row was found for one()
(This may take a few tries because the failure is intermittent.)
- Create a query using the Redshift data source and run it. Again, you should get results intermittently, but most of the time it will throw the error and fail.
Technical details:
The error in question is raised here: https://github.com/getredash/redash/blob/master/redash/tasks/queries/execution.py#L250
Here's what I've gathered and why I think this may be a race condition:
The QueryExecutor init assigns
self.data_source = self._load_data_source()
Then run() logs via
self._log_progress("executing_query")
which is where we get that error when _log_progress uses self.data_source.type and it blows up.
Whether or not the query fails may depend on whether self._load_data_source() has returned a data source by the time run() is called. I'm not sure if there's a reason QueryExecutor loads data_source this way as opposed to requiring it on init.
When it works, it works correctly, so everything seems to be configured properly, which leads us to think this is a bug.
This is happening on latest master with redash running in ECS.
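A minimal sketch of the failure path described above, not the actual Redash code. `SketchExecutor`, `FAKE_DB`, and `run_guarded` are hypothetical names, and the sketch assumes the data source lookup is synchronous and returns None when no row is found, which I haven't confirmed against the Redash source. Under that assumption the same AttributeError appears without any timing involved, whenever the lookup comes back empty:

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class DataSource:
    id: int
    type: str


# Hypothetical in-memory stand-in for the data source table.
FAKE_DB: Dict[int, DataSource] = {1: DataSource(id=1, type="redshift")}


class SketchExecutor:
    """Mirrors the pattern from the issue: load in __init__, use in run()."""

    def __init__(self, data_source_id: int) -> None:
        self.data_source_id = data_source_id
        # Assumption: the lookup is synchronous, so after __init__ this is
        # either a DataSource or None.
        self.data_source = self._load_data_source()

    def _load_data_source(self) -> Optional[DataSource]:
        # Returns None instead of raising when the id is unknown.
        return FAKE_DB.get(self.data_source_id)

    def _log_progress(self, state: str) -> str:
        # Raises AttributeError here if self.data_source is None.
        return f"state={state} ds_type={self.data_source.type}"

    def run(self) -> str:
        return self._log_progress("executing_query")


def run_guarded(executor: SketchExecutor) -> str:
    """Fail with an explicit error instead of the NoneType AttributeError."""
    if executor.data_source is None:
        raise LookupError(f"data source {executor.data_source_id} not found")
    return executor.run()


if __name__ == "__main__":
    print(SketchExecutor(1).run())
    try:
        SketchExecutor(2).run()
    except AttributeError as exc:
        print("unguarded:", exc)
    try:
        run_guarded(SketchExecutor(2))
    except LookupError as exc:
        print("guarded:", exc)
```

If the real code behaves like this, a guard of this kind would at least replace the NoneType error with a message naming the missing data source id, which might help narrow down why the lookup intermittently returns nothing.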
Anecdote from a team member in case it's useful
FWIW, the redshift connector works perfectly fine for me with Redash running in my local docker. I'm wondering if it's something about access/credentials existing on the right Redash components? There's 3 different redash components, maybe after 10 seconds it hands off to the "scheduler" instead of the "worker" and that one doesn't have access or something? The actual NoneType reference seems like a symptom of some failure
Guessing there's not more error info (tracebacks, etc) in the docker logs (scheduler, worker, or server) at around the time you're seeing this NoneType one?
Not that I'm aware of but I'm glad to share more details if anyone has specific ideas about what to look for. Here's a sample of the logs when we see the error