fix: Missing sql_editor_id index
SUMMARY
As titled, it adds the index for sql_editor_id in query table which seems to have caused a slowdown in the sqllab_bootstrap query.
TESTING INSTRUCTIONS
locally test superset db upgrade
ADDITIONAL INFORMATION
- [ ] Has associated issue:
- [ ] Required feature flags:
- [ ] Changes UI
- [ ] Includes DB Migration (follow approval process in SIP-59)
- [ ] Migration is atomic, supports rollback & is backwards-compatible
- [ ] Confirm DB migration upgrade and downgrade tested
- [ ] Runtime estimates and downtime expectations provided
- [ ] Introduces new feature or API
- [ ] Removes existing feature or API
@justinpark you'll also have to:
- Update the model to reflect the parity between how the SQLAlchemy model is defined and the underlying schema.
- Add a line item to
UPDATING.md*.
Creating a new index to the query could be very expensive (given the size of the table) and may require downtime.
Codecov Report
Attention: Patch coverage is 37.50000% with 5 lines in your changes are missing coverage. Please review.
Project coverage is 69.71%. Comparing base (
721977a) to head (a6ab95f). Report is 9 commits behind head on master.
| Files | Patch % | Lines |
|---|---|---|
| superset/migrations/shared/utils.py | 28.57% | 5 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## master #27392 +/- ##
==========================================
+ Coverage 67.34% 69.71% +2.37%
==========================================
Files 1909 1909
Lines 74592 74616 +24
Branches 8320 8320
==========================================
+ Hits 50235 52022 +1787
+ Misses 22305 20542 -1763
Partials 2052 2052
| Flag | Coverage Δ | |
|---|---|---|
| hive | 53.74% <37.50%> (?) |
|
| mysql | 77.98% <37.50%> (+<0.01%) |
:arrow_up: |
| postgres | 78.08% <37.50%> (+<0.01%) |
:arrow_up: |
| presto | 53.69% <37.50%> (?) |
|
| python | 83.13% <37.50%> (+4.91%) |
:arrow_up: |
| sqlite | 77.60% <37.50%> (+<0.01%) |
:arrow_up: |
| unit | 56.53% <37.50%> (?) |
Flags with carried forward coverage won't be shown. Click here to find out more.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
From experience that table can get massive and depending on your SQL engine, and creating an index can be a locking operation, which can cascade into deadlock territory on this particular table (I think there's still a pretty crazy amount of polling happening in SQL Lab to look at query state). If you're using MySQL I'd treat this migration very very sensitively. I think there are ways to create index in a non-locking way in MySQL, but have to go out of your way to do it (ALGORITHM=INPLACE, LOCK=NONE;). Postgres should do better.
Maybe not a bad time to archive the query table table in your environment ... For most shops I think 90 days worth of query history is enough to go by, not a bad thing to schedule something that deletes 90+ day old queries daily... Note that changed_on is also indexed to help and support and efficient DELETE operation there.
About my earlier comment, the issue may be significant enough that you'd want to write a custom migration for MySQL for your own sake at Airbnb, you can probably go direct DDL CREATE INDEX with the proper flags just for MySQL. I remember fighting hard on some simple migration on that specific table. Connections / locks kept coming, had to silence all k8s deployments, kill MySQL sessions (which kept coming back with polling) and ultimately wait for things to get through. (ALGORITHM=INPLACE, LOCK=NONE;) may allow you to not schedule or incur downtime.
@michael-s-molina @john-bodley PTAL
Codecov Report
Attention: Patch coverage is 28.57143% with 5 lines in your changes are missing coverage. Please review.
Project coverage is 69.98%. Comparing base (
2e5f3ed) to head (3e2c382). Report is 28 commits behind head on master.
| Files | Patch % | Lines |
|---|---|---|
| superset/migrations/shared/utils.py | 16.66% | 5 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## master #27392 +/- ##
==========================================
+ Coverage 60.49% 69.98% +9.49%
==========================================
Files 1931 1933 +2
Lines 76241 76498 +257
Branches 8566 8566
==========================================
+ Hits 46122 53539 +7417
+ Misses 28015 20855 -7160
Partials 2104 2104
| Flag | Coverage Δ | |
|---|---|---|
| hive | 49.21% <28.57%> (+0.04%) |
:arrow_up: |
| mysql | 77.56% <28.57%> (?) |
|
| postgres | 77.66% <28.57%> (?) |
|
| presto | ? |
|
| python | 83.14% <28.57%> (+19.66%) |
:arrow_up: |
| sqlite | 77.13% <28.57%> (?) |
|
| unit | 57.71% <28.57%> (+0.09%) |
:arrow_up: |
Flags with carried forward coverage won't be shown. Click here to find out more.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.