Can't update datasets with duplicated names
Datasets cannot be updated when two datasets point to different tables (in different schemas or databases) that share the same name.
How to reproduce the bug
1. Go to Datasets
2. Click on + Dataset
3. Select a database
4. Select a schema
5. Select a table (e.g. "report")
6. Click on Create Dataset and Create Chart
7. Go to Datasets again
8. Click on + Dataset
9. Select a database
10. Select a schema different from the one chosen in step 4
11. Select a table different from the one selected in step 5, but with the same name (e.g. "report")
12. Click on Create Dataset and Create Chart (this step only works when the metastore database is not SQLite)
13. Go to Datasets
14. Click on the Edit button (pencil icon) next to either dataset created in the previous steps
15. Click on Save and then on Ok (a scripted version of these steps is sketched below)
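For anyone who wants to reproduce this without clicking through the UI, here is a minimal sketch against the REST API. It assumes a local instance at http://localhost:8088 with admin/admin credentials, database_id 1, and two placeholder schemas (schema_a, schema_b) that each contain a table named "report"; the payload fields and the auth/CSRF handshake may need adjusting for your version and config.

```python
# Hypothetical reproduction of the duplicate-name update error via the REST API.
import requests

BASE = "http://localhost:8088/api/v1"
session = requests.Session()

# Log in and grab a CSRF token (standard Superset API auth flow).
access_token = session.post(
    f"{BASE}/security/login",
    json={"username": "admin", "password": "admin", "provider": "db", "refresh": True},
).json()["access_token"]
session.headers["Authorization"] = f"Bearer {access_token}"
session.headers["X-CSRFToken"] = session.get(f"{BASE}/security/csrf_token/").json()["result"]

# Create two datasets pointing at different tables that share the same name.
dataset_ids = []
for schema in ("schema_a", "schema_b"):  # placeholder schema names
    resp = session.post(
        f"{BASE}/dataset/",
        json={"database": 1, "schema": schema, "table_name": "report"},
    )
    resp.raise_for_status()
    dataset_ids.append(resp.json()["id"])

# Re-save the first dataset without changing anything.
# Expected: 200 and "The dataset has been saved".
# Actual:   422 {"message": {"table_name": ["Dataset report already exists"]}}
update = session.put(f"{BASE}/dataset/{dataset_ids[0]}", json={"table_name": "report"})
print(update.status_code, update.json())
```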
Expected results
A success message is displayed: "The dataset has been saved".
Actual results
An error message is displayed: "Dataset {dataset name} already exists".
Environment
- browser type and version: Google Chrome Version 114.0.5735.133 (Official Build) (arm64)
- superset version: Superset 0.0.0-dev
- python version: Python 3.9.16
- node.js version: (unsure about this, I'm using the docker image https://hub.docker.com/r/apache/superset)
- any feature flags active: None
Checklist
Make sure to follow these steps before submitting your issue - thank you!
- [x] I have checked the superset logs for python stacktraces and included it here as text if there are any.
- [x] I have reproduced the issue with at least the latest released version of superset.
- [x] I have checked the issue tracker for the same issue and I haven't found one similar.
Additional context
The database connection used is a Google BigQuery one.
curl request extracted from the browser network tab:
curl 'http://[REDACTED]/api/v1/dataset/2' \
-X 'PUT' \
-H 'Accept: application/json' \
-H 'Accept-Language: en-US,en;q=0.9' \
-H 'Connection: keep-alive' \
-H 'Content-Type: application/json' \
-H 'Cookie: session=[REDACTED]' \
-H 'Origin: http://34.72.116.158' \
-H 'Referer: http://34.72.116.158/tablemodelview/list/?pageIndex=0&sortColumn=schema&sortOrder=asc' \
-H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36' \
-H 'X-CSRFToken: [REDACTED]' \
--data-raw '{
  "table_name": "combined_report",
  "database_id": 1,
  "sql": null,
  "filter_select_enabled": true,
  "fetch_values_predicate": null,
  "schema": "[REDACTED]",
  "description": null,
  "main_dttm_col": "date",
  "offset": 0,
  "default_endpoint": null,
  "cache_timeout": null,
  "is_sqllab_view": false,
  "template_params": null,
  "extra": null,
  "is_managed_externally": false,
  "metrics": [
    {
      "expression": "COUNT(*)",
      "description": null,
      "metric_name": "count",
      "metric_type": "count",
      "d3format": null,
      "verbose_name": "COUNT(*)",
      "warning_text": null,
      "extra": "{}",
      "id": 2
    }
  ],
  "columns": [
    {
      "id": 14,
      "column_name": "account_id",
      "type": "STRING",
      "advanced_data_type": null,
      "verbose_name": null,
      "description": null,
      "expression": null,
      "filterable": true,
      "groupby": true,
      "is_active": true,
      "is_dttm": false,
      "python_date_format": null,
      "uuid": "[REDACTED]",
      "extra": "{}"
    }
  ],
  "owners": [
    1
  ]
}' \
--compressed \
--insecure
response extracted from the browser network tab:
422 UNPROCESSABLE ENTITY {"message":{"table_name":["Dataset combined_report already exists"]}}
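For what it's worth, the 422 suggests that the update path validates the dataset name for uniqueness without scoping the check to the dataset's own database and schema (and/or without excluding the row being edited). The snippet below is not Superset's actual code, only an illustrative SQLAlchemy-style sketch of how an under-scoped check rejects a plain re-save; all names in it are hypothetical.

```python
# Illustrative sketch only -- not Superset source code. All names are hypothetical.
def name_clashes_buggy(session, DatasetModel, table_name, **_):
    # Matches ANY dataset with the same table_name, including the one being
    # edited and tables of the same name in other schemas/databases, so a
    # plain re-save fails with "Dataset <name> already exists".
    return session.query(DatasetModel).filter(
        DatasetModel.table_name == table_name,
    ).count() > 0


def name_clashes_scoped(session, DatasetModel, table_name, database_id, schema, dataset_id):
    # Only flags a real duplicate: same name in the same database and schema,
    # excluding the row that is currently being updated.
    return session.query(DatasetModel).filter(
        DatasetModel.table_name == table_name,
        DatasetModel.database_id == database_id,
        DatasetModel.schema == schema,
        DatasetModel.id != dataset_id,
    ).count() > 0
```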
Hi!
I have the same error when I want to save a dataset after syncing a new column.
I have the same issue as well.
@zowen-ch what version of Superset are you using? The rest of this thread might be using Superset 2.x, which isn't supported anymore (we're on 3.1, and almost to 4.0).
Thank you for having a look @rusackas! Our instance is currently 3.0.0. If you think 3.1 would solve the issue, I will work to get it updated. But yes, I'm experiencing this error where repeated table names from different schemas are preventing column refreshes with the same error modal as above.
I don't currently have access to any databases with identical table names under different schemas. It would be appreciated if anyone can test this on a current 3.1.x or 4.x release, otherwise we might close it as stale. CC @sadpandajoe @michael-s-molina in case they have the means to test this.
Can't even create 2 datasets where the table name is the same but they're in different databases:
sqlalchemy.exc.IntegrityError: (raised as a result of Query-invoked autoflush; consider using a session.no_autoflush block if this flush is occurring prematurely)
(sqlite3.IntegrityError) UNIQUE constraint failed: tables.table_name
[SQL: INSERT INTO tables (uuid, created_on, changed_on, description, default_endpoint, is_featured, filter_select_enabled, "offset", cache_timeout, params, perm, schema_perm, is_managed_externally, external_url, table_name, main_dttm_col, database_id, fetch_values_predicate, schema, sql, is_sqllab_view, template_params, extra, normalize_columns, always_filter_main_dttm, created_by_fk, changed_by_fk) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]
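That UNIQUE constraint failure can be reproduced outside Superset: the error message suggests the SQLite metastore enforces uniqueness on table_name alone, so the same table name can never coexist across schemas or databases. Below is a toy, self-contained illustration (not Superset's actual DDL):

```python
# Toy illustration of the IntegrityError above -- not Superset's real metastore DDL.
# A UNIQUE constraint on table_name alone rejects the same table name even when
# it lives in a different database or schema.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE tables (
        id INTEGER PRIMARY KEY,
        database_id INTEGER NOT NULL,
        schema TEXT,
        table_name TEXT UNIQUE  -- uniqueness on the name alone: too strict
    )
""")
con.execute(
    "INSERT INTO tables (database_id, schema, table_name) VALUES (1, 'schema_a', 'report')"
)

try:
    # Same table name, different database and schema -> rejected by the metastore.
    con.execute(
        "INSERT INTO tables (database_id, schema, table_name) VALUES (2, 'schema_b', 'report')"
    )
except sqlite3.IntegrityError as err:
    print(err)  # UNIQUE constraint failed: tables.table_name

# A composite constraint such as UNIQUE (database_id, schema, table_name) would
# allow this insert while still blocking true duplicates.
```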
$ pip list|grep apache-superset
apache-superset 4.0.1
PS: I would really like to do this, though!
I'm still having the problem in 4.0.2
This has been silent for quite a while now. Is it still an issue in 4.1.2/5.0.0?