superset icon indicating copy to clipboard operation
superset copied to clipboard

Field names with multiple spaces cause unexpected error, need nontrivial manual fix

Open torzsmokus opened this issue 8 months ago • 10 comments

TL;DR: Please update field name handling so that an SQL field with multiple spaces in its name is handled correctly. If that is not trivial, please update the error message to help the user find the issue and fix it using the CUSTOM SQL feature.


Bug description

Steps to reproduce

  1. Set up a Superset instance with MS SQL support (pymysql).
  2. Add a data source where the table has column names with multiple spaces.
  3. Start setting up a chart with those columns.

Expected result

Can use all table columns in the chart.

Actual result

Error: Custom SQL fields cannot contain sub-queries.

Screenshots/recordings

Problem:

Image

Solution:

Image

Superset version

4.1.1 not 4.1.2 since that one is not in the helm repo yet

Python version

I don't know

Node version

I don't know

Browser

Chrome (Edge)

Additional context

Story

As a newcomer to superset, I tried to set up my very first chart, just to get the cryptic error message Custom SQL fields cannot contain sub-queries.

Googling for that message, I found myself in a rabbit hole with suggestions like use a feature flag (a what?) that weakens security (smh). I felt something was not right, I just wanted to display a very simple dataset (from a MS SQL Server table) without any subqueries or other magic. Then, buried deep in an old but necromanced discussion #23574 , I found the actual issue: my field names have spaces, and I need to quote them according to the SQL flavour of my data source using the CUSTOM SQL feature to be able to use them as columns (dimensions):

Fields with only one space in its name needed no fix in my case, just the one(s) with more than one space.

Remarks

This is almost, but not quite a duplicate of https://github.com/apache/superset/issues/25072 as I do not use any special queries.

Stack trace

2025-03-27 09:50:38,407:WARNING:superset.views.error_handling:SupersetErrorException
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1484, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1469, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/usr/local/lib/python3.10/site-packages/flask_appbuilder/security/decorators.py", line 95, in wraps
    return f(self, *args, **kwargs)
  File "/app/superset/views/base_api.py", line 119, in wraps
    duration, response = time_function(f, self, *args, **kwargs)
  File "/app/superset/utils/core.py", line 1364, in time_function
    response = func(*args, **kwargs)
  File "/app/superset/utils/log.py", line 303, in wrapper
    value = f(*args, **kwargs)
  File "/app/superset/charts/data/api.py", line 260, in data
    return self._get_data_response(
  File "/app/superset/charts/data/api.py", line 416, in _get_data_response
    result = command.run(force_cached=force_cached)
  File "/app/superset/commands/chart/data/get_data_command.py", line 45, in run
    payload = self._query_context.get_payload(
  File "/app/superset/common/query_context.py", line 102, in get_payload
    return self._processor.get_payload(cache_query_context, force_cached)
  File "/app/superset/common/query_context_processor.py", line 675, in get_payload
    query_results = [
  File "/app/superset/common/query_context_processor.py", line 676, in <listcomp>
    get_query_results(
  File "/app/superset/common/query_actions.py", line 227, in get_query_results
    return result_func(query_context, query_obj, force_cached)
  File "/app/superset/common/query_actions.py", line 103, in _get_full
    payload = query_context.get_df_payload(query_obj, force_cached=force_cached)
  File "/app/superset/common/query_context.py", line 123, in get_df_payload
    return self._processor.get_df_payload(
  File "/app/superset/common/query_context_processor.py", line 158, in get_df_payload
    query_result = self.get_query_result(query_obj)
  File "/app/superset/common/query_context_processor.py", line 237, in get_query_result
    result = query_context.datasource.query(query_object.to_dict())
  File "/app/superset/connectors/sqla/models.py", line 1729, in query
    query_str_ext = self.get_query_str_extended(query_obj)
  File "/app/superset/models/helpers.py", line 880, in get_query_str_extended
    sqlaq = self.get_sqla_query(**query_obj)
  File "/app/superset/models/helpers.py", line 1653, in get_sqla_query
    selected = validate_adhoc_subquery(
  File "/app/superset/models/helpers.py", line 138, in validate_adhoc_subquery
    raise SupersetSecurityException(
superset.exceptions.SupersetSecurityException: Custom SQL fields cannot contain sub-queries.

Checklist

  • [x] I have searched Superset docs and Slack and didn't find a solution to my problem.
  • [x] I have searched the GitHub issue tracker and didn't find a similar bug report.
  • [x] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.

torzsmokus avatar Mar 27 '25 10:03 torzsmokus

Hey @torzsmokus, I'm here to help you with your Superset issue and answer any questions you might have. Let's work together to solve this!

The error message "Custom SQL fields cannot contain sub-queries" is related to the validate_adhoc_subquery function in Superset, which checks for sub-queries in SQL expressions and raises this error if they are not allowed [1]. In your case, the issue seems to be related to field names with multiple spaces.

To address this, you can use the CUSTOM SQL feature to quote field names according to the SQL dialect of your data source. For MS SQL, you typically need to use square brackets around field names with spaces, like [Jelenlét Pont címe]. This should help Superset correctly interpret the field names and avoid the error.

If you continue to encounter issues, consider checking if there are any specific configurations or feature flags that might affect SQL field name handling in your Superset setup.

To continue talking to Dosu, mention @dosu.


Help Dosu learn! Give it feedback: Great Response | Irrelevant Answer | Incorrect Sources | Too Verbose | Hallucination | Bug Report | Other

dosubot[bot] avatar Mar 27 '25 10:03 dosubot[bot]

Is this unique to MS SQL, or can you try to reproduce with any other database types?

@sujiplr is a MS SQL user, perhaps they can validate/help here, if we're lucky :D

rusackas avatar Mar 31 '25 17:03 rusackas

I highly doubt the missing quoting is MS SQL specific, but I cannot be sure: I was only able to check this with MS SQL since that is the only available datasource in the environment where I am working with superset. I may be able to check with another data source later.

The Custom SQL workaround (that both I and the helpful bot found) is quite painful with 300+ columns 😦

torzsmokus avatar Apr 28 '25 13:04 torzsmokus

@dosu Is there any way to automatically convert e.g. underscores in column names to spaces when displaying them? That way I could work around this issue by storing column names with underscores instead of spaces, but still retain readable labels without manual work.

torzsmokus avatar Apr 28 '25 14:04 torzsmokus

Yes, Superset provides a way to define custom display labels for columns, which can help you achieve the transformation of underscores to spaces. You can use the customColumnName control to specify a custom column name for display purposes. This allows you to replace underscores with spaces in the display label without altering the actual column names in the database. Here's a relevant code snippet:

const customColumnName: ControlFormItemSpec<'Input'> = {
  controlType: 'Input',
  label: t('Display column name'),
  description: t('Custom column name (leave blank for default)'),
  debounceDelay: 200,
};

By using this feature, you can store column names with underscores in your database and configure Superset to display them with spaces, making them more readable without manual adjustments for each column [1].

To continue talking to Dosu, mention @dosu.


Help Dosu learn! Give it feedback: Great Response | Irrelevant Answer | Incorrect Sources | Too Verbose | Hallucination | Bug Report | Other

dosubot[bot] avatar Apr 28 '25 14:04 dosubot[bot]

Also, the error message could be much more helpful with e.g. pointing to the field which cannot be parsed

torzsmokus avatar Apr 28 '25 16:04 torzsmokus

I just reproduced the bug with a column name having only one space but also a number: Eladó 1

torzsmokus avatar Apr 28 '25 16:04 torzsmokus

Ah, I'll take a look. I'm finishing the work on SIP-117, it might help with this.

betodealmeida avatar May 22 '25 19:05 betodealmeida

This seems to work fine in master after I finished SIP-117:

Image

betodealmeida avatar Jun 05 '25 17:06 betodealmeida

Thanks, will check in the next release (I have no dev env at the moment)

torzsmokus avatar Jun 11 '25 09:06 torzsmokus

This issue is still existing with version 5.0.0

Image

xavier-GitHub76 avatar Oct 14 '25 10:10 xavier-GitHub76

The behavior is OK with ALLOW_ADHOC_SUBQUERY=True in FEATURE_FLAGS (Test done with 5.0.0)

Image

xavier-GitHub76 avatar Oct 30 '25 09:10 xavier-GitHub76