superset icon indicating copy to clipboard operation
superset copied to clipboard

fix(charts): add query_context validation to prevent incomplete metadata

Open ysinghc opened this issue 1 month ago • 3 comments

SUMMARY

This PR adds comprehensive validation for the query_context field in both ChartPostSchema and ChartPutSchema to ensure that when provided, it contains the required metadata fields (datasource and queries) needed for chart data retrieval.

Changes:

  • Created a new validate_query_context_metadata() function in superset/utils/schema.py that validates the structure of query_context JSON
  • Updated ChartPostSchema to use the new validator (previously used generic validate_json)
  • Updated ChartPutSchema to add the new validator (previously had no validation)
  • Added comprehensive unit tests for the new validator in tests/unit_tests/utils/test_schema.py
  • Added integration tests in tests/unit_tests/charts/test_schemas.py to verify schema-level validation

Rationale: Previously, the query_context field only validated that the value was valid JSON, but didn't ensure it contained the necessary structure for chart operations. This could lead to runtime errors when charts are rendered with incomplete query context data. The new validation ensures data integrity at the API level.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

N/A - Backend validation change with no UI impact

TESTING INSTRUCTIONS

  1. Run the unit tests:

    pytest tests/unit_tests/utils/test_schema.py -v
    pytest tests/unit_tests/charts/test_schemas.py::test_chart_post_schema_query_context_validation -v
    pytest tests/unit_tests/charts/test_schemas.py::test_chart_put_schema_query_context_validation -v
    
  2. Manual API testing (optional):

    • Start Superset in development mode
    • Try to create a chart (POST /api/v1/chart/) with invalid query_context (missing datasource or queries)
    • Verify that the API returns a validation error
    • Try with valid query_context containing both fields - should succeed
    • Try with null query_context - should succeed (field is nullable)
  3. Verify pre-commit hooks pass:

    git add .
    pre-commit run
    

ADDITIONAL INFORMATION

  • [X] Has associated issue: Fixes #35774
  • [ ] Required feature flags:
  • [ ] Changes UI
  • [ ] Includes DB Migration (follow approval process in SIP-59)
    • [ ] Migration is atomic, supports rollback & is backwards-compatible
    • [ ] Confirm DB migration upgrade and downgrade tested
    • [ ] Runtime estimates and downtime expectations provided
  • [ ] Introduces new feature or API
  • [ ] Removes existing feature or API

ysinghc avatar Nov 11 '25 17:11 ysinghc

Code Review Agent Run #50c92e

Actionable Suggestions - 0
Additional Suggestions - 10
  • tests/unit_tests/charts/test_schemas.py - 2
    • Docstring missing period at end · Line 161-161
      Docstring should end with a period. This issue also appears on line 246.
      Code suggestion
       @@ -161,1 +161,1 @@
      -    """Test that ChartPostSchema validates query_context contains required metadata"""
      +    """Test that ChartPostSchema validates query_context contains required metadata."""
      
    • Missing trailing comma in dictionary · Line 169-169
      Missing trailing comma in dictionary/list literals. This issue appears on lines 169, 192, 254, and 265.
      Code suggestion
       @@ -168,2 +168,2 @@
      -            "queries": [{"metrics": ["count"], "columns": []}],
      -        }
      +            "queries": [{"metrics": ["count"], "columns": []}],
      +        }
      
  • superset/utils/schema.py - 3
    • Docstring format and imperative mood · Line 57-57
      Docstring should start summary on first line and use imperative mood. Change `"Validator for query_context field"` to `"Validate query_context field"`.
      Code suggestion
       @@ -56,4 +56,3 @@
       def validate_query_context_metadata(value: bytes | bytearray | str | None) -> None:
      -    """
      -    Validator for query_context field to ensure it contains required metadata.
      +    """Validate query_context field to ensure it contains required metadata.
      
           Validates that the query_context JSON contains the required 'datasource' and
      
    • String literal in exception message · Line 72-72
      Exception message `"JSON not valid"` should be assigned to a variable first. This is part of multiple similar issues (lines 53, 72, 76) that should be addressed consistently.
      Code suggestion
       @@ -69,3 +69,4 @@
           try:
               parsed_data = json.loads(value)
           except json.JSONDecodeError as ex:
      -        raise ValidationError("JSON not valid") from ex
      +        error_msg = "JSON not valid"
      +        raise ValidationError(error_msg) from ex
      
    • String literal in exception message · Line 76-76
      Exception message `"Query context must be a valid JSON object"` should be assigned to a variable first. This is part of multiple similar issues that should be addressed consistently.
      Code suggestion
       @@ -74,2 +74,3 @@
           # Validate required fields exist in the query_context
           if not isinstance(parsed_data, dict):
      -        raise ValidationError("Query context must be a valid JSON object")
      +        error_msg = "Query context must be a valid JSON object"
      +        raise ValidationError(error_msg)
      
  • tests/unit_tests/utils/test_schema.py - 5
    • Missing module docstring at file start · Line 1-1
      Add a module-level docstring to describe the purpose of this test file. This improves code documentation and follows Python conventions.
      Code suggestion
       @@ -17,6 +17,9 @@
      -
      -import pytest
      +"""
      +Unit tests for superset.utils.schema module.
      +"""
      +
      +import pytest
      
    • Docstring should end with period punctuation · Line 30-30
      Multiple docstrings throughout the file are missing proper punctuation. Function docstrings should end with a period. This affects lines 30, 37, 44, 52, 58, 65, 73, 79, 91, 103, 111, 119, 131, 141, 152, 166, 178, 190, 202, 208, 215, 223, 231, 239, 270, 277, 284, 291, 298, 305, 312, and 321.
      Code suggestion
       @@ -29,2 +29,2 @@
      -def test_validate_json_valid() -> None:
      -    """Test validate_json with valid JSON string"""
      +def test_validate_json_valid() -> None:
      +    """Test validate_json with valid JSON string."""
      
    • Missing trailing comma in dictionary literal · Line 84-84
      Multiple dictionary literals are missing trailing commas. This affects lines 84, 96, 121, 159, 171, 183, 195, 259, and 263. Adding trailing commas improves code consistency and makes diffs cleaner.
      Code suggestion
       @@ -81,5 +81,5 @@
      -        {
      -            "datasource": {"type": "table", "id": 1},
      -            "queries": [{"metrics": ["count"], "columns": []}],
      -        }
      +        {
      +            "datasource": {"type": "table", "id": 1},
      +            "queries": [{"metrics": ["count"], "columns": []}],
      +        }
      
    • Boolean positional argument in function call · Line 232-232
      Boolean value `True` is passed as a positional argument to `json.dumps()`. Consider using a keyword argument for better readability.
      Code suggestion
       @@ -231,2 +231,2 @@
      -    """Test validate_query_context_metadata with boolean instead of JSON object"""
      -    bool_value = json.dumps(True)
      +    """Test validate_query_context_metadata with boolean instead of JSON object"""
      +    bool_value = json.dumps(obj=True)
      
    • Magic number used in comparison statement · Line 301-301
      Magic numbers `2` are used in comparisons on lines 301 and 317. Consider defining these as named constants for better code maintainability.
      Code suggestion
       @@ -298,4 +298,5 @@
      -def test_one_of_case_insensitive_non_string_valid() -> None:
      -    """Test OneOfCaseInsensitive validator with non-string valid value"""
      -    validator = OneOfCaseInsensitive([1, 2, 3])
      -    result = validator(2)
      +def test_one_of_case_insensitive_non_string_valid() -> None:
      +    """Test OneOfCaseInsensitive validator with non-string valid value"""
      +    VALID_NUMBER = 2
      +    validator = OneOfCaseInsensitive([1, VALID_NUMBER, 3])
      +    result = validator(VALID_NUMBER)
      
Review Details
  • Files reviewed - 4 · Commit Range: 287674d..287674d
    • superset/charts/schemas.py
    • superset/utils/schema.py
    • tests/unit_tests/charts/test_schemas.py
    • tests/unit_tests/utils/test_schema.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Default Agent You can customize the agent settings here or contact your Bito workspace admin at [email protected].

Documentation & Help

AI Code Review powered by Bito Logo

bito-code-review[bot] avatar Nov 11 '25 17:11 bito-code-review[bot]

Codecov Report

:x: Patch coverage is 20.00000% with 12 lines in your changes missing coverage. Please review. :white_check_mark: Project coverage is 67.75%. Comparing base (d123249) to head (e1b7c46). :warning: Report is 255 commits behind head on master.

Files with missing lines Patch % Lines
superset/utils/schema.py 20.00% 12 Missing :warning:
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #36076       +/-   ##
===========================================
+ Coverage        0   67.75%   +67.75%     
===========================================
  Files           0      637      +637     
  Lines           0    47117    +47117     
  Branches        0     5134     +5134     
===========================================
+ Hits            0    31923    +31923     
- Misses          0    13917    +13917     
- Partials        0     1277     +1277     
Flag Coverage Δ
hive 43.51% <6.66%> (?)
mysql 66.81% <20.00%> (?)
postgres 66.86% <20.00%> (?)
presto 47.14% <6.66%> (?)
python 67.72% <20.00%> (?)
sqlite 66.58% <20.00%> (?)
unit 100.00% <ø> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • :package: JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

codecov[bot] avatar Nov 11 '25 18:11 codecov[bot]

Running CI, but I think i defer to @betodealmeida here :D

rusackas avatar Nov 17 '25 22:11 rusackas

CodeAnt AI is reviewing your PR.

CodeAnt AI finished reviewing your PR.

💡 Enhance Your PR Reviews

We noticed that 3 feature(s) are not configured for this repository. Enabling these features can help improve your code quality and workflow:

🚦 Quality Gates

Status: Quality Gates are not enabled at the organization level Learn more about Quality Gates

🎫 Jira Ticket Compliance

Status: Jira credentials file not found. Please configure Jira integration in your settings Learn more about Jira Integration

⚙️ Custom Rules

Status: No custom rules configured. Add rules via organization settings or .codeant/review.json in your repository Learn more about Custom Rules


Want to enable these features? Contact your organization admin or check our documentation for setup instructions.

CodeAnt AI is running Incremental review