addons icon indicating copy to clipboard operation
addons copied to clipboard

Slow reviewer tools queue pages load

Open wagnerand opened this issue 5 years ago • 19 comments
trafficstars

The filters for the mad queue we added in mozilla/addons-server#15224 and mozilla/addons-server#15264 are extremely slow. Since they are also used to display the count in queue header links, every queue page is affected, as well as submitting a listed review (which redirects to a queue page). Queue pages take 20s or longer to load.

┆Issue is synchronized with this Jira Task

wagnerand avatar Sep 01 '20 10:09 wagnerand

There are a couple compounding issues causing this:

  • When the queryset has a distinct(), Django wraps the full query in a subquery and counts on that. We can try to at least force it to avoid needlessly selecting the columns when it does that, locally it does seem to improve things a tiny bit.
  • We have a few queries that are not using indexes - not just the mad queue, but that one is particularly bad. I'll review the queries and make a migration to add relevant indexes.

diox avatar Sep 01 '20 14:09 diox

Closing because mozilla/addons-server#15401 should help, please re-open if that's no the case once it hits prod.

diox avatar Sep 09 '20 08:09 diox

Unfortunately, the queries are still slow, I tried a few times and response times were between 15 and 20 seconds.

wagnerand avatar Sep 10 '20 17:09 wagnerand

The rather dramatic perf hit can be seen on Grafana: https://earthangel-b40313e5.influxcloud.net/d/sGEtfYcZk/amo-reviewer-tools-performance-and-monitoring?viewPanel=8&orgId=1&from=now-90d&to=now

diox avatar Sep 14 '20 10:09 diox

The problematic query in all its glory:

SELECT DISTINCT `addons`.`created`,
       `addons`.`modified`,
       `addons`.`id`,
       `addons`.`guid`,
       `addons`.`slug`,
       `addons`.`name`,
       `addons`.`defaultlocale`,
       `addons`.`addontype_id`,
       `addons`.`status`,
       `addons`.`icontype`,
       `addons`.`icon_hash`,
       `addons`.`homepage`,
       `addons`.`supportemail`,
       `addons`.`supporturl`,
       `addons`.`description`,
       `addons`.`summary`,
       `addons`.`developercomments`,
       `addons`.`eula`,
       `addons`.`privacypolicy`,
       `addons`.`averagerating`,
       `addons`.`bayesianrating`,
       `addons`.`totalreviews`,
       `addons`.`textreviewscount`,
       `addons`.`weeklydownloads`,
       `addons`.`hotness`,
       `addons`.`average_daily_users`,
       `addons`.`last_updated`,
       `addons`.`inactive`,
       `addons`.`target_locale`,
       `addons`.`contributions`,
       `addons`.`current_version`,
       `addons`.`experimental`,
       `addons`.`reputation`,
       `addons`.`requires_payment`,
       `versions`.`created`,
       `versions`.`modified`,
       `versions`.`id`,
       `versions`.`addon_id`,
       `versions`.`license_id`,
       `versions`.`releasenotes`,
       `versions`.`approvalnotes`,
       `versions`.`version`,
       `versions`.`nomination`,
       `versions`.`reviewed`,
       `versions`.`deleted`,
       `versions`.`source`,
       `versions`.`channel`,
       `versions`.`git_hash`,
       `versions`.`needs_human_review`,
       `editors_autoapprovalsummary`.`created`,
       `editors_autoapprovalsummary`.`modified`,
       `editors_autoapprovalsummary`.`version_id`,
       `editors_autoapprovalsummary`.`is_locked`,
       `editors_autoapprovalsummary`.`has_auto_approval_disabled`,
       `editors_autoapprovalsummary`.`is_promoted_prereview`,
       `editors_autoapprovalsummary`.`should_be_delayed`,
       `editors_autoapprovalsummary`.`is_blocked`,
       `editors_autoapprovalsummary`.`verdict`,
       `editors_autoapprovalsummary`.`weight`,
       `editors_autoapprovalsummary`.`weight_info`,
       `editors_autoapprovalsummary`.`confirmed`,
       `addons_addonreviewerflags`.`created`,
       `addons_addonreviewerflags`.`modified`,
       `addons_addonreviewerflags`.`addon_id`,
       `addons_addonreviewerflags`.`needs_admin_code_review`,
       `addons_addonreviewerflags`.`needs_admin_content_review`,
       `addons_addonreviewerflags`.`needs_admin_theme_review`,
       `addons_addonreviewerflags`.`auto_approval_disabled`,
       `addons_addonreviewerflags`.`auto_approval_disabled_until_next_approval`,
       `addons_addonreviewerflags`.`auto_approval_delayed_until`,
       `addons_addonreviewerflags`.`notified_about_auto_approval_delay`,
       `addons_addonreviewerflags`.`notified_about_expiring_delayed_rejections`,
       `addons_addonapprovalscounter`.`created`,
       `addons_addonapprovalscounter`.`modified`,
       `addons_addonapprovalscounter`.`addon_id`,
       `addons_addonapprovalscounter`.`counter`,
       `addons_addonapprovalscounter`.`last_human_review`,
       `addons_addonapprovalscounter`.`last_content_review`
  FROM `addons`
  LEFT OUTER JOIN `versions`
    ON (`addons`.`current_version` = `versions`.`id`)
  LEFT OUTER JOIN `versions_versionreviewerflags`
    ON (`versions`.`id` = `versions_versionreviewerflags`.`version_id`)
 INNER JOIN `versions` T4
    ON (`addons`.`id` = T4.`addon_id`)
 INNER JOIN `files`
    ON (T4.`id` = `files`.`version_id`)
  LEFT OUTER JOIN `versions_versionreviewerflags` T6
    ON (T4.`id` = T6.`version_id`)
  LEFT OUTER JOIN `editors_autoapprovalsummary`
    ON (`versions`.`id` = `editors_autoapprovalsummary`.`version_id`)
  LEFT OUTER JOIN `addons_addonreviewerflags`
    ON (`addons`.`id` = `addons_addonreviewerflags`.`addon_id`)
  LEFT OUTER JOIN `addons_addonapprovalscounter`
    ON (`addons`.`id` = `addons_addonapprovalscounter`.`addon_id`)
 WHERE (('en-us'='en-us') AND NOT (`addons`.`status` = 11) AND `versions_versionreviewerflags`.`pending_rejection` IS NULL AND `addons`.`status` IN (4, 3, 0) AND `files`.`status` IN (4, 1) AND ((T4.`channel` = 1 AND T6.`needs_human_review_by_mad` = 1) OR `versions_versionreviewerflags`.`needs_human_review_by_mad` = 1))
 ORDER BY `addons`.`created` ASC

diox avatar Sep 14 '20 13:09 diox

It doesn't feel right that we need such a complicated query for something that needs to be done routinely, tbh. I can understand the complexity if it's for some one-time query, but for routine stuff, maybe it's time to re-think the data model...

bqbn avatar Sep 15 '20 07:09 bqbn

The query is not actually that complex. It can be simplified into:

SELECT DISTINCT `addons`.`created`,
       `addons`.`modified`,
       `addons`.`id`
  FROM `addons`
  LEFT OUTER JOIN `versions`
    ON (`addons`.`current_version` = `versions`.`id`)
  LEFT OUTER JOIN `versions_versionreviewerflags`
    ON (`versions`.`id` = `versions_versionreviewerflags`.`version_id`)
 INNER JOIN `versions` T4
    ON (`addons`.`id` = T4.`addon_id`)
 INNER JOIN `files`
    ON (T4.`id` = `files`.`version_id`)
  LEFT OUTER JOIN `versions_versionreviewerflags` T6
    ON (T4.`id` = T6.`version_id`)
 WHERE (`versions_versionreviewerflags`.`pending_rejection` IS NULL AND `addons`.`status` IN (4, 3, 0) AND `files`.`status` IN (4, 1) AND ((T4.`channel` = 1 AND T6.`needs_human_review_by_mad` = 1) OR `versions_versionreviewerflags`.`needs_human_review_by_mad` = 1))
 ORDER BY `addons`.`created` ASC LIMIT 10;

And the perf problem remains. The biggest issue with it is that we are trying to deal with unlisted and listed versions in a single query and they have completely different requirements.

diox avatar Sep 15 '20 10:09 diox

We discussed this with Andreas today and I need to test that out with a production database snapshot but my idea would be:

  • Separate mad queue (and the other needs human review queue) into one for unlisted, one for listed
  • Don't show counts for the queues in the tab bar (keep them on the dashboard), and do show the count for the current queue
  • Bonus: tweak layout, the tab bar was never meant to display that many queues originally...

diox avatar Sep 15 '20 20:09 diox

https://github.com/mozilla/addons/issues/7936 will hide the queue counts and tweak the layout. Keeping this one opened to separate the Flagged by scanners / Flagged for human review queues into separate listed/unlisted queues.

diox avatar Sep 16 '20 13:09 diox

Actually I've filed https://github.com/mozilla/addons/issues/7937 for the separation of the queues, and we'll keep this one opened until all the perf issues are gone.

diox avatar Sep 16 '20 13:09 diox

As expected: now that we've removed the counts most of the queues are fast, the dashboard and flagged for human review queues are still slow.

diox avatar Sep 28 '20 15:09 diox

mozilla/addons-server#15767 should address dashboard performance by removing scanner queues counts.

wagnerand avatar Oct 19 '20 11:10 wagnerand

Closing as we have done as much as we could in the scope of this issue. Further performance improvements would need more invasive measures, like a mysql upgrade (which is already planned) or a complete architectural, visual and/or workflow redesign of how add-ons are reviewed.

wagnerand avatar Oct 19 '20 16:10 wagnerand

This has regressed a lot, page load takes 35-45s now for the "Flagged for human review" queue.

wagnerand avatar Apr 20 '21 10:04 wagnerand

Note that the more results there are, the slower that query is going to be, so technically it might not be a regression in terms of code changes. It would be interesting to test https://github.com/mozilla/addons/issues/1835 with and without the DISTINCT to see if removing it helps: now that we have done a bunch of work to try to ensure there is only one file per version at all times, maybe we can get away with removing it. (https://github.com/mozilla/addons/issues/8181 would guarantee it)

diox avatar Apr 20 '21 11:04 diox

Sadly, I spoke too soon: the DISTINCT here is on addons table so we can't remove it. It's here because there are multiple versions that could cause the add-on to be in the queue, so it needs to stay.

diox avatar Apr 21 '21 10:04 diox

Reviewer Tools Performance for the past 10 months:

reviewer_tools_perf

You can see on this graph the initial loss in performance caused by mozilla/addons-server#15224 and mozilla/addons-server#15264 towards mid-August (which made the queue more correct) and the slight improvement mozilla/addons-server#15401, mozilla/addons#7936 and mozilla/addons#7937 made towards the end of September. It's been stable around 30-40s since.

diox avatar Apr 21 '21 11:04 diox

https://github.com/mozilla/addons/issues/8814 should hopefully make the scanners and mad queues slightly faster. There is a little more that could be gained by avoiding computing the count of the number of results and do an infinite-like pagination instead.

Ultimately though, we'll likely soon reach a ceiling of what can be done with the current approach - the "queues" are querying a lot of different tables and MySQL needs to merge the results together, and that's an expensive operation, especially with ordering added into the mix. More de-normalization with dedicated tables for the queues would avoid this entirely.

diox avatar Jul 05 '22 15:07 diox

Old Jira Ticket: https://mozilla-hub.atlassian.net/browse/ADDSRV-46

KevinMind avatar May 03 '24 16:05 KevinMind

@diox should we remove the MAD queue, given the stat of MAD?

wagnerand avatar Aug 29 '24 12:08 wagnerand