redash icon indicating copy to clipboard operation
redash copied to clipboard

Change query_hash calculation method with separate params

Open ehooi opened this issue 9 months ago • 6 comments

What type of PR is this?

Change query_hash calculation method.

Example

  • Query: select c1, c2 from table where c3="{{ key }}"
  • Parameter: key=value
  • Apply Auto Limit

Before:

selectc1,c2fromtablewherec3="value"LIMIT1000

After:

selectc1,c2fromtablewherec3="{{key}}"
{"auto_limit":true,"parameters":{"key":"value"}}
  • [X] Refactor
  • [ ] Feature
  • [ ] Bug Fix
  • [ ] New Query Runner (Data Source)
  • [ ] New Alert Destination
  • [ ] Other

Description

I want to fundamentally fix #4514 #6063 #6683 I believe the problem lies in the dynamic date(-range) parameters being evaluated on the frontend side. Before evaluating dynamic params on the backend side, we need to separate query text and parameters.

This is a breaking change. All query hashes will be changed. You will need to re-save and execute all scheduled queries once to be scheduled properly afterward.

How is this tested?

  • [x] Unit tests (pytest, jest)
  • [ ] E2E Tests (Cypress)
  • [ ] Manually
  • [ ] N/A

Related Tickets & Documents

#4514 #6063 #6683

Mobile & Desktop Screenshots/Recordings (if there are UI changes)

ehooi avatar May 10 '24 07:05 ehooi

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 63.84%. Comparing base (10a46fd) to head (d479136). Report is 2 commits behind head on master.

:exclamation: Current head d479136 differs from pull request most recent head 2d13628

Please upload reports for the commit 2d13628 to get more accurate results.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6960      +/-   ##
==========================================
- Coverage   63.85%   63.84%   -0.01%     
==========================================
  Files         161      161              
  Lines       13082    13095      +13     
  Branches     1811     1815       +4     
==========================================
+ Hits         8353     8361       +8     
- Misses       4429     4431       +2     
- Partials      300      303       +3     
Files Coverage Δ
redash/handlers/query_results.py 81.77% <100.00%> (+0.19%) :arrow_up:
redash/models/__init__.py 91.93% <100.00%> (-0.03%) :arrow_down:
redash/query_runner/__init__.py 77.01% <100.00%> (+0.38%) :arrow_up:
redash/tasks/queries/execution.py 91.66% <100.00%> (-0.26%) :arrow_down:
redash/tasks/queries/maintenance.py 64.65% <ø> (ø)
redash/utils/__init__.py 72.53% <100.00%> (+0.39%) :arrow_up:

... and 4 files with indirect coverage changes

codecov[bot] avatar May 10 '24 07:05 codecov[bot]

This is a breaking change. All query hashes will be changed. You will need to re-save and execute all scheduled queries once to be scheduled properly afterward.

Reckon this could be made automatic with a migration of some sort?

justinclift avatar May 12 '24 23:05 justinclift

Either migration or maybe some period of checking both hashes (old and new).

arikfr avatar May 17 '24 14:05 arikfr

Either migration or maybe some period of checking both hashes (old and new).

I can't think of how a database migration would do this correctly. The technique I have been using is to use a script to POST an empty payload to force the server to recalculate hashes:

for query in queries:
    # Empty payload will still cause the query has to be re-calculated with default parameters applied
    updated_query = redash.post(f"/api/queries/{query['id']}", data={})

eradman avatar May 17 '24 14:05 eradman

Hmmm, this sounds like a pain, but perhaps a necessary one.

Migration wise, hmmm. Seems like it needs to have some part of it Python driven (as @eradman mentions) as the database can't drive it purely using database functions.

Does our Python migration framework not provide some existing capability for this?

justinclift avatar May 17 '24 19:05 justinclift

I found a migration file. 1038c2174f5d_make_case_insensitive_hash_of_query_text.py It seems to be something similar to what we needed. Plus, I realized that I broke that migration file. Haha.

I wrote the migration code. It has no downgrade code. Because the dynamic date(-range) parameters were lost. This is the root cause of the problem I am trying to fix.

ehooi avatar May 19 '24 18:05 ehooi