superset icon indicating copy to clipboard operation
superset copied to clipboard

Superset chart incorrectly throwing error Virtual dataset query must be read-only

Open zuzana-vej opened this issue 5 years ago • 17 comments

A chart built off of a virtual datasource is erroring saying that the virtual datasource must be readonly but the query is only a select query (it is not modifying any table). It does contain WITH statements, and removing the WITH statements doesn't result in the error. It seems like queries with WITH statements are getting miscategorized as "Readonly".

Result: Error thrown: Virtual dataset query must be read-only

Expected Result: Chart should load

Looks like regression caused by https://github.com/apache/incubator-superset/pull/11236#issuecomment-710669738

Environment

(please complete the following information):

  • superset version: master

Checklist

Make sure these boxes are checked before submitting your issue - thank you!

  • [ ] I have checked the superset logs for python stacktraces and included it here as text if there are any.
  • [x] I have reproduced the issue with at least the latest released version of superset.
  • [x] I have checked the issue tracker for the same issue and I haven't found one similar.

zuzana-vej avatar Oct 16 '20 22:10 zuzana-vej

Issue-Label Bot is automatically applying the label #bug to this issue, with a confidence of 0.93. Please mark this comment with :thumbsup: or :thumbsdown: to give our bot feedback!

Links: app homepage, dashboard and code for this bot.

issue-label-bot[bot] avatar Oct 16 '20 22:10 issue-label-bot[bot]

+1 getting the same error

ktmud avatar Oct 16 '20 23:10 ktmud

@ktmud @zuzana-vej

Please check if you are in the master branch. I took a look at this SQL parser logic if you use the with statement and it will not cause this problem. SQL validate logic at here: https://github.com/apache/incubator-superset/blob/31e4a90440885c5e0100408e65fedc33786240db/superset/sql_parse.py#L116

image

zhaoyongjie avatar Oct 17 '20 15:10 zhaoyongjie

@zhaoyongjie I tested on apache master. This problem specifically occurs when there is no space between AS and the select statement. i.e. WITH test AS(select country, ....

>>> sqlparse.parse("WITH test AS(SELECT 1 AS a) SELECT * FROM test")[0].get_type()
'UNKNOWN'
>>> sqlparse.parse("WITH test AS (SELECT 1 AS a) SELECT * FROM test")[0].get_type()
'SELECT'

serenajiang avatar Oct 19 '20 22:10 serenajiang

WITH statements also break if you name the temporary table events. Some reserved keyword issue?

>>> sqlparse.parse("WITH events AS (SELECT 1 AS a) SELECT * FROM events")[0].get_type()
'UNKNOWN'
>>> sqlparse.parse("WITH not_events AS (SELECT 1 AS a) SELECT * FROM not_events")[0].get_type()
'SELECT'

I'm guessing these issues occur because the sql parser is for some really generic sql. Maybe we should default to allowing the query through if the parsed type is UNKNOWN? I know that could be dangerous, but these errors are really confusing to users and broke a number of our charts.

serenajiang avatar Oct 20 '20 00:10 serenajiang

For reference, I think this is where the lack of magic happens: https://github.com/andialbrecht/sqlparse/blob/master/sqlparse/sql.py#L425-L436

mistercrunch avatar Oct 20 '20 04:10 mistercrunch

fixed in #11365

serenajiang avatar Oct 21 '20 18:10 serenajiang

i have same issues, can confirm that its not fully fixed.

this check is especially disgusting and irrelevant when you have read-only database access.

slavanorm avatar Sep 01 '23 08:09 slavanorm

Facing the same issue in the version 3.0.0. I recently upgraded to 3.0.0-dev image from the 2.1.1-dev image and few of the charts which are using the WITH keyword are failing to load with an error of Virtual dataset query must be read-only.

Requesting to open this issue again.

slice-viditp avatar Oct 17 '23 09:10 slice-viditp

I recently applied an update from version 2.1.0 to 3.0.0 and I'm getting the same error. It's a bit annoying that this doesn't happen in version 2.1.0 but it does in 3.0.0

JersonFerrer avatar Oct 26 '23 13:10 JersonFerrer

Well, I was exploring the error and discovered that (in my case) it occurred because the queries to build the virtual datasets had the UNION or UNION ALL statements, and when they were validated in sql_parse they generated the error.

This bug was fixed in #25290 by @betodealmeida. Therefore, the solution I applied was to upgrade to version 3.0.1.

JersonFerrer avatar Oct 26 '23 21:10 JersonFerrer

Thanks @JersonFerrer for the update! Can anyone else test with 3.0.1 and verify that it addressed their problem? I could then close the issue.

sfirke avatar Oct 31 '23 16:10 sfirke

We just updated to 3.0.1 to try and address this, and still have the same symptom as @slice-viditp Using WITH in the query results in getting the error Virtual dataset query must be read-only

artzyres avatar Nov 06 '23 10:11 artzyres

@artzyres can you give us an example query/chart that fails (preferably using example data) so we can better understand this? Also, is this still unresolved as of 3.1.x?

Hopefully we can close out this antique issue if we can get a handle on the current state of affairs via a reproducible test case. Otherwise, this is at risk of being closed as stale.

rusackas avatar Apr 04 '24 18:04 rusackas

We have just upgraded to 3.1.2 and facing same error wherever we have UNION or UNION ALL inside CTE . Is there any solution we have? Please share. This Issue I have faced with Redshift and Postgresql both.

For ref I have attached Error Query and working query.
problematic query:

`with combine_data as (select 'delivery_order'                                  as table_name,
                             date(ado.finished_date)                           as order_finished_date,
                             count(case when status = 'CANCELLED' then id end) as cancelled_order,
                             count(case when status = 'COMPLETED' then id end) as completed_order,
                             count(id)                                         as total_order
                      from staging.amdelivery_delivery_order ado
                      where date(ado.finished_date) = current_date - 1
                      group by 2

                      UNION ALL

                      (select 'order_flow_history'                                        as table_name,
                              date(creation_date)                                         as order_finished_date,
                              count(case when order_flow_event = 'CANCELLED' then id end) as cancelled_order,
                              count(case when order_flow_event = 'COMPLETED' then id end) as completed_order,
                              count(id)                                                   as total_order
                       from staging.amdelivery_order_flow_history ofh
                       where date(creation_date) = current_date - 1
                       group by 2))

select table_name,
       order_finished_date,
       cancelled_order,
       completed_order,
       total_order
from combine_data`

At same Time when I removed UNION ALL statement and make separate CTE and then apply union , this worked well.

working query:

`with delivery_order as (select 'delivery_order'                                  as table_name,
                               date(ado.finished_date)                           as order_finished_date,
                               count(case when status = 'CANCELLED' then id end) as cancelled_order,
                               count(case when status = 'COMPLETED' then id end) as completed_order,
                               count(id)                                         as total_order
                        from staging.amdelivery_delivery_order ado
                        where ado.service_type IN ('delivery', 'courier')
                          and date(ado.finished_date) = current_date - 1
                        group by 2),

     order_flow_history as
         (select 'order_flow_history'                                        as table_name,
                 date(creation_date)                                         as order_finished_date,
                 count(case when order_flow_event = 'CANCELLED' then id end) as cancelled_order,
                 count(case when order_flow_event = 'COMPLETED' then id end) as completed_order,
                 count(id)                                                   as total_order
          from staging.amdelivery_order_flow_history ofh
          where date(creation_date) = current_date - 1
          group by 2),

     combine_data as (select *
                      from delivery_order
                      UNION ALL
                      select *
                      from order_flow_history)

select table_name,
       order_finished_date,
       cancelled_order,
       completed_order,
       total_order
from combine_data



`

VikramKumarArammeem avatar May 21 '24 14:05 VikramKumarArammeem

The problem seems to be related to parentheses. If you remove the ( ) from your second subquery (select 'order_flow_history'....), it should work.

rahmedrbx avatar Oct 16 '24 00:10 rahmedrbx

Right, those parenthesis after the UNION ALL are throwing off the parser, and probably not standard/ANSI-SQL Screenshot 2024-10-17 at 8 41 31 AM

mistercrunch avatar Oct 16 '24 23:10 mistercrunch

Thank you for reporting this issue. However, the issue was reported on a version of Superset that we no longer support or it does not contain a valid Superset version. As of this moment, we only actively support Superset 4.0 and 4.1. To maintain a more actionable Issues backlog, we're going to close this issue for now. If you (or anyone reading this) are still experiencing this on a currently supported version of Superset, please either reopen this Issue, or file a new one with updated context (screenshots, reproduction steps) and we'll do our best to support it. Thank you

michael-s-molina avatar Jan 23 '25 19:01 michael-s-molina