server icon indicating copy to clipboard operation
server copied to clipboard

[PM-21044] - optimize security task ReadByUserIdStatus

Open jaasen-livefront opened this issue 8 months ago • 5 comments

🎟️ Tracking

https://bitwarden.atlassian.net/browse/PM-21044

📔 Objective

The original query What we had was a single flat query with numerous JOINs and a GROUP BY,. This forced the optimizer to consider large intermediate rowsets and perform hidden de‑duping and scans.

PLEASE NOTE - I'm not familiar with db migrations so they're currently absent

After taking @shane-melton's initial plan (see ticket) to use focused CTE's I added further refinements:

Aspect First Draft Final Version
Source tables Used the *View tables (OrganizationUserView, SecurityTaskView, CipherView) plus a join to Organization Switched to the base tables (OrganizationUser, SecurityTask, etc.), avoiding any hidden complexity or overhead that the views might introduce.
Status filter WHERE ST.Status = COALESCE(@Status, ST.Status) → non‑sargable, forced scans/filters Rewritten as WHERE @Status IS NULL OR ST.Status = @Status → fully sargable, pushes directly into an Index Seek on (Status, OrganizationId, CreationDate DESC) and eliminates scan + downstream SORT.
Deduplication UNION in AccessibleCiphers + SELECT DISTINCT in final query (both introduce hidden sorts) Changed to use UNION ALL (no dedupe) and a simple NOT EXISTS semi‑join to exclude duplicates before the append—no Sort or Stream Aggregate needed
EXISTS vs JOIN OR EXISTS (SELECT 1 FROM AccessibleCiphers AC WHERE AC.CipherId = ST.CipherId) (correlated subquery per row) Replaced with a single LEFT JOIN AccessibleCiphers AC ON ST.CipherId = AC.CipherId and WHERE ST.CipherId IS NULL OR AC.CipherId IS NOT NULL. This flattens the plan into one nested‑loops join over the small CTE, cutting I/O and simplifying the optimizer’s job.
Sorting ORDER BY ST.CreationDate DESC always triggered a SORT operator By including CreationDate DESC in the third key of our covering IX_SecurityTask_Status_OrgId_CreationDateDesc index, the engine can return rows in descending order directly—no separate SORT required.
Overall cost ~0.036 (estimated subtree cost) ~0.005 (estimated), 5 ms cold run, 2 logical reads on SecurityTask, no SORT, and only tiny seeks on the CTEs.

Added Four targeted non‑clustered indexes. Eliminated scans on CollectionGroup, SecurityTask, etc., and covered every lookup.

  1. IX_CollectionGroup_GroupId_ReadOnly on (GroupId, ReadOnly)
CREATE NONCLUSTERED INDEX IX_CollectionGroup_GroupId_ReadOnly
  ON dbo.CollectionGroup (GroupId, ReadOnly)
  INCLUDE (CollectionId);
  1. IX_CollectionUser_OrganizationUserId_ReadOnly on (OrganizationUserId, ReadOnly)
CREATE NONCLUSTERED INDEX IX_CollectionUser_OrganizationUserId_ReadOnly
  ON dbo.CollectionUser (OrganizationUserId, ReadOnly)
  INCLUDE (CollectionId);
  1. IX_SecurityTask_Status_OrgId_CreationDateDesc on (Status, OrganizationId, CreationDate DESC) including (CipherId, Type, RevisionDate)
CREATE NONCLUSTERED INDEX IX_SecurityTask_Status_OrgId_CreationDateDesc
  ON dbo.SecurityTask (Status, OrganizationId, CreationDate DESC)
  INCLUDE (CipherId, [Type], RevisionDate);

📸 Screenshots

⏰ Reminders before review

  • Contributor guidelines followed
  • All formatters and local linters executed and passed
  • Written new unit and / or integration tests where applicable
  • Protected functional changes with optionality (feature flags)
  • Used internationalization (i18n) for all UI strings
  • CI builds passed
  • Communicated to DevOps any deployment requirements
  • Updated any necessary documentation (Confluence, contributing docs) or informed the documentation team

🦮 Reviewer guidelines

  • 👍 (:+1:) or similar for great changes
  • 📝 (:memo:) or ℹ️ (:information_source:) for notes or general info
  • ❓ (:question:) for questions
  • 🤔 (:thinking:) or 💭 (:thought_balloon:) for more open inquiry that's not quite a confirmed issue and could potentially benefit from discussion
  • 🎨 (:art:) for suggestions / improvements
  • ❌ (:x:) or ⚠️ (:warning:) for more significant problems or concerns needing attention
  • 🌱 (:seedling:) or ♻️ (:recycle:) for future improvements or indications of technical debt
  • ⛏ (:pick:) for minor or nitpick changes

jaasen-livefront avatar May 06 '25 21:05 jaasen-livefront

Codecov Report

:white_check_mark: All modified and coverable lines are covered by tests. :white_check_mark: Project coverage is 53.76%. Comparing base (353b596) to head (3b7154d). :warning: Report is 33 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5779      +/-   ##
==========================================
+ Coverage   49.51%   53.76%   +4.25%     
==========================================
  Files        1783     1783              
  Lines       79365    79365              
  Branches     7054     7054              
==========================================
+ Hits        39299    42674    +3375     
+ Misses      38535    35077    -3458     
- Partials     1531     1614      +83     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • :package: JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

codecov[bot] avatar May 06 '25 21:05 codecov[bot]

Logo Checkmarx One – Scan Summary & Detailsd3477724-0c38-48ab-9faa-56a10914a8f1

Great job! No new security vulnerabilities introduced in this pull request

github-actions[bot] avatar May 06 '25 21:05 github-actions[bot]

Changes to the Sproc look good! Just need to make a small change to the migration script.

Added Four targeted non‑clustered indexes. Eliminated scans on CollectionGroup, SecurityTask, etc., and covered every lookup.

I'm not seeing any new indexes being created as mentioned in the PR description?

@shane-melton 🤦 It's been so long since I put this together I completely forgot to add them. They're now added.

jaasen-livefront avatar Jun 13 '25 21:06 jaasen-livefront

Sproc change looks good to me. I'm less familiar with the impacts of the new indexes being created.

@rkac-bw would you mind taking a look at the new nonclustered indexes being added?

Indeed they could be overkill. I'm not married to them if you guys prefer to leave them out. ;)

jaasen-livefront avatar Jun 16 '25 16:06 jaasen-livefront

@rkac-bw @shane-melton Ready for another look!

jaasen-livefront avatar Jun 27 '25 00:06 jaasen-livefront

@rkac-bw Thanks for the detailed and thoughtful feedback. Much appreciated! I believe I have fulfilled all your requests save for the UPDATE STATISTICS bit as I'm unsure as to how exactly you'd like it implemented. Let me know if you feel it should be included here.

jaasen-livefront avatar Jul 23 '25 22:07 jaasen-livefront

@rkac-bw Thanks for the detailed and thoughtful feedback. Much appreciated! I believe I have fulfilled all your requests save for the UPDATE STATISTICS bit as I'm unsure as to how exactly you'd like it implemented. Let me know if you feel it should be included here.

Yeah stats not to be included, server side job takes care of that

r-kac avatar Aug 07 '25 18:08 r-kac

@rkac-bw Thanks for the detailed feedback. I appreciate it! I've applied your suggestions, should be ready for another look when you have time. ;)

jaasen-livefront avatar Aug 11 '25 17:08 jaasen-livefront

@mkincaid-bw Thanks for your detailed analysis. All your comments have been addressed and ready for another look. ;)

jaasen-livefront avatar Sep 04 '25 00:09 jaasen-livefront

Looks good except one small thing I missed before. Can you please add a comment indicating what the value of 2 represents for Status? The ReadOnly and Enabled values are obvious but Status isn't.

Done! https://github.com/bitwarden/server/pull/5779/commits/6d03a40a7a5914f5163bcff397700138541b2499

jaasen-livefront avatar Sep 04 '25 19:09 jaasen-livefront