
Data source groups aren't checked for uniqueness

Open · benmanns opened this issue 8 years ago · 1 comment

We ran into an issue with our groups not showing any data sources. I traced it to our GET /api/groups/{id}/data_sources call returning a 500. Sentry records an error (really helpful, btw):

MultipleResultsFound
Multiple rows were found for one()

on this line:

https://github.com/getredash/redash/blob/5b54a777d91e18398f68fcae4bdc669f438faec0/redash/models.py#L491

And sure enough, we had multiple rows in data_source_groups for a single (data_source_id, group_id) pair.
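For anyone who wants to see the failure mode in isolation, here is a minimal sketch using a simplified stand-in for the `data_source_groups` table (the model and column names are illustrative, not Redash's actual schema). With no uniqueness constraint, two identical association rows can be inserted, and SQLAlchemy's `.one()` then raises `MultipleResultsFound`:

```python
from sqlalchemy import create_engine, Column, Integer
from sqlalchemy.orm import declarative_base, Session
from sqlalchemy.orm.exc import MultipleResultsFound

Base = declarative_base()

class DataSourceGroup(Base):
    # Simplified stand-in for Redash's association table.
    __tablename__ = "data_source_groups"
    id = Column(Integer, primary_key=True)
    data_source_id = Column(Integer, nullable=False)
    group_id = Column(Integer, nullable=False)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

raised = False
with Session(engine) as session:
    # Nothing stops two rows for the same (data_source_id, group_id) pair.
    session.add_all([
        DataSourceGroup(data_source_id=1, group_id=1),
        DataSourceGroup(data_source_id=1, group_id=1),
    ])
    session.commit()
    try:
        session.query(DataSourceGroup).filter_by(
            data_source_id=1, group_id=1
        ).one()  # .one() requires exactly one matching row
    except MultipleResultsFound:
        raised = True

print("duplicates broke .one():", raised)
```

That is exactly the 500 the GET endpoint turns into once the duplicates exist.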

I traced it back to the API for creating data source group associations:

https://github.com/getredash/redash/blob/5b54a777d91e18398f68fcae4bdc669f438faec0/redash/handlers/groups.py#L130-L131

That code doesn't check whether a matching data source group row already exists, so repeated requests (e.g. clicking to add a data source multiple times, or from different browsers) create duplicate rows.

What do you think about adding a unique index to (data_source_id, group_id), which would prevent the duplicate data in the DB? We can also add a guard to check for an existing row and return that instead of creating a new one. Checking only in Python, though, still leaves us open to a race condition on the write.
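To make the proposal concrete, here is a sketch of both halves of the fix, again using the simplified illustrative schema rather than Redash's actual models or migrations: a unique constraint on `(data_source_id, group_id)` so the database rejects duplicates, plus an idempotent get-or-create helper that falls back to a re-read if a concurrent request wins the race and trips the constraint:

```python
from sqlalchemy import create_engine, Column, Integer, UniqueConstraint
from sqlalchemy.exc import IntegrityError
from sqlalchemy.orm import declarative_base, Session

Base = declarative_base()

class DataSourceGroup(Base):
    # Simplified stand-in for Redash's association table.
    __tablename__ = "data_source_groups"
    id = Column(Integer, primary_key=True)
    data_source_id = Column(Integer, nullable=False)
    group_id = Column(Integer, nullable=False)
    # The unique index: the DB now rejects duplicate pairs, closing
    # the race window that a Python-only check leaves open.
    __table_args__ = (UniqueConstraint("data_source_id", "group_id"),)

def add_data_source_to_group(session, data_source_id, group_id):
    """Idempotent get-or-create for the association row."""
    existing = session.query(DataSourceGroup).filter_by(
        data_source_id=data_source_id, group_id=group_id
    ).one_or_none()
    if existing is not None:
        return existing
    row = DataSourceGroup(data_source_id=data_source_id, group_id=group_id)
    session.add(row)
    try:
        session.commit()
        return row
    except IntegrityError:
        # A concurrent request inserted the row first; return theirs.
        session.rollback()
        return session.query(DataSourceGroup).filter_by(
            data_source_id=data_source_id, group_id=group_id
        ).one()

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = Session(engine)
a = add_data_source_to_group(session, 1, 1)
b = add_data_source_to_group(session, 1, 1)  # second call is a no-op
```

In production this would also need a migration that deduplicates the existing rows before the unique index can be created.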

benmanns avatar Aug 31 '17 16:08 benmanns

Extra interesting because DELETE /api/groups/{groupId}/data_sources/{dataSourceId} will delete all duplicates, but GET returns 500 as you reported 🤷

fjeldstad avatar Jun 03 '24 11:06 fjeldstad