Superset chart incorrectly throwing error Virtual dataset query must be read-only
A chart built off of a virtual datasource is erroring saying that the virtual datasource must be readonly but the query is only a select query (it is not modifying any table). It does contain WITH statements, and removing the WITH statements doesn't result in the error. It seems like queries with WITH statements are getting miscategorized as "Readonly".
Result:
Error thrown: Virtual dataset query must be read-only
Expected Result: Chart should load
Looks like regression caused by https://github.com/apache/incubator-superset/pull/11236#issuecomment-710669738
Environment
(please complete the following information):
- superset version: master
Checklist
Make sure these boxes are checked before submitting your issue - thank you!
- [ ] I have checked the superset logs for python stacktraces and included it here as text if there are any.
- [x] I have reproduced the issue with at least the latest released version of superset.
- [x] I have checked the issue tracker for the same issue and I haven't found one similar.
Issue-Label Bot is automatically applying the label #bug to this issue, with a confidence of 0.93. Please mark this comment with :thumbsup: or :thumbsdown: to give our bot feedback!
Links: app homepage, dashboard and code for this bot.
+1 getting the same error
@ktmud @zuzana-vej
Please check if you are in the master branch. I took a look at this SQL parser logic if you use the with statement and it will not cause this problem. SQL validate logic at here: https://github.com/apache/incubator-superset/blob/31e4a90440885c5e0100408e65fedc33786240db/superset/sql_parse.py#L116

@zhaoyongjie I tested on apache master. This problem specifically occurs when there is no space between AS and the select statement. i.e. WITH test AS(select country, ....
>>> sqlparse.parse("WITH test AS(SELECT 1 AS a) SELECT * FROM test")[0].get_type()
'UNKNOWN'
>>> sqlparse.parse("WITH test AS (SELECT 1 AS a) SELECT * FROM test")[0].get_type()
'SELECT'
WITH statements also break if you name the temporary table events. Some reserved keyword issue?
>>> sqlparse.parse("WITH events AS (SELECT 1 AS a) SELECT * FROM events")[0].get_type()
'UNKNOWN'
>>> sqlparse.parse("WITH not_events AS (SELECT 1 AS a) SELECT * FROM not_events")[0].get_type()
'SELECT'
I'm guessing these issues occur because the sql parser is for some really generic sql. Maybe we should default to allowing the query through if the parsed type is UNKNOWN? I know that could be dangerous, but these errors are really confusing to users and broke a number of our charts.
For reference, I think this is where the lack of magic happens: https://github.com/andialbrecht/sqlparse/blob/master/sqlparse/sql.py#L425-L436
fixed in #11365
i have same issues, can confirm that its not fully fixed.
this check is especially disgusting and irrelevant when you have read-only database access.
Facing the same issue in the version 3.0.0. I recently upgraded to 3.0.0-dev image from the 2.1.1-dev image and few of the charts which are using the WITH keyword are failing to load with an error of Virtual dataset query must be read-only.
Requesting to open this issue again.
I recently applied an update from version 2.1.0 to 3.0.0 and I'm getting the same error. It's a bit annoying that this doesn't happen in version 2.1.0 but it does in 3.0.0
Well, I was exploring the error and discovered that (in my case) it occurred because the queries to build the virtual datasets had the UNION or UNION ALL statements, and when they were validated in sql_parse they generated the error.
This bug was fixed in #25290 by @betodealmeida. Therefore, the solution I applied was to upgrade to version 3.0.1.
Thanks @JersonFerrer for the update! Can anyone else test with 3.0.1 and verify that it addressed their problem? I could then close the issue.
We just updated to 3.0.1 to try and address this, and still have the same symptom as @slice-viditp
Using WITH in the query results in getting the error Virtual dataset query must be read-only
@artzyres can you give us an example query/chart that fails (preferably using example data) so we can better understand this? Also, is this still unresolved as of 3.1.x?
Hopefully we can close out this antique issue if we can get a handle on the current state of affairs via a reproducible test case. Otherwise, this is at risk of being closed as stale.
We have just upgraded to 3.1.2 and facing same error wherever we have UNION or UNION ALL inside CTE . Is there any solution we have? Please share. This Issue I have faced with Redshift and Postgresql both.
For ref I have attached Error Query and working query.
problematic query:
`with combine_data as (select 'delivery_order' as table_name,
date(ado.finished_date) as order_finished_date,
count(case when status = 'CANCELLED' then id end) as cancelled_order,
count(case when status = 'COMPLETED' then id end) as completed_order,
count(id) as total_order
from staging.amdelivery_delivery_order ado
where date(ado.finished_date) = current_date - 1
group by 2
UNION ALL
(select 'order_flow_history' as table_name,
date(creation_date) as order_finished_date,
count(case when order_flow_event = 'CANCELLED' then id end) as cancelled_order,
count(case when order_flow_event = 'COMPLETED' then id end) as completed_order,
count(id) as total_order
from staging.amdelivery_order_flow_history ofh
where date(creation_date) = current_date - 1
group by 2))
select table_name,
order_finished_date,
cancelled_order,
completed_order,
total_order
from combine_data`
At same Time when I removed UNION ALL statement and make separate CTE and then apply union , this worked well.
working query:
`with delivery_order as (select 'delivery_order' as table_name,
date(ado.finished_date) as order_finished_date,
count(case when status = 'CANCELLED' then id end) as cancelled_order,
count(case when status = 'COMPLETED' then id end) as completed_order,
count(id) as total_order
from staging.amdelivery_delivery_order ado
where ado.service_type IN ('delivery', 'courier')
and date(ado.finished_date) = current_date - 1
group by 2),
order_flow_history as
(select 'order_flow_history' as table_name,
date(creation_date) as order_finished_date,
count(case when order_flow_event = 'CANCELLED' then id end) as cancelled_order,
count(case when order_flow_event = 'COMPLETED' then id end) as completed_order,
count(id) as total_order
from staging.amdelivery_order_flow_history ofh
where date(creation_date) = current_date - 1
group by 2),
combine_data as (select *
from delivery_order
UNION ALL
select *
from order_flow_history)
select table_name,
order_finished_date,
cancelled_order,
completed_order,
total_order
from combine_data
`
The problem seems to be related to parentheses. If you remove the ( ) from your second subquery (select 'order_flow_history'....), it should work.
Right, those parenthesis after the UNION ALL are throwing off the parser, and probably not standard/ANSI-SQL
Thank you for reporting this issue. However, the issue was reported on a version of Superset that we no longer support or it does not contain a valid Superset version. As of this moment, we only actively support Superset 4.0 and 4.1. To maintain a more actionable Issues backlog, we're going to close this issue for now. If you (or anyone reading this) are still experiencing this on a currently supported version of Superset, please either reopen this Issue, or file a new one with updated context (screenshots, reproduction steps) and we'll do our best to support it. Thank you