
SLO-106 Added index on status field for posts table


For customers with a large number of posts, browsing posts triggers slow queries on the posts table that can take up to 5 minutes. Sentry link: https://ghost-foundation.sentry.io/performance/?isDefaultQuery=false&metricSearchSetting=metricsOnly&project=1879038&query=transaction%3A%22GET+%2Fghost%2Fapi%2Fadmin%2Fposts%2F%22&statsPeriod=7d&utc=true

There are two slow queries involved, and a composite index on [status, type] will improve the performance of both. A sketch of the corresponding migration is below.
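
A minimal sketch of what the migration could look like, assuming Ghost's createNonTransactionalMigration util and the require paths used in Ghost's migrations directory; the index name status_type matches the one visible in the query plans later in this thread:

const logging = require('@tryghost/logging');
const {createNonTransactionalMigration} = require('../../utils');

module.exports = createNonTransactionalMigration(
    async function up(knex) {
        // DDL commits implicitly in MySQL, so a transactional migration would add nothing here
        logging.info('Adding index status_type on posts(status, type)');
        await knex.schema.alterTable('posts', (table) => {
            table.index(['status', 'type'], 'status_type');
        });
    },
    async function down(knex) {
        logging.info('Dropping index status_type from posts');
        await knex.schema.alterTable('posts', (table) => {
            table.dropIndex(['status', 'type'], 'status_type');
        });
    }
);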

vershwal avatar May 09 '24 10:05 vershwal

It looks like this PR contains a migration 👀 Here's the checklist for reviewing migrations:

General requirements

  • [x] Satisfies idempotency requirement (both up() and down()); see the sketch after this checklist
  • [x] Does not reference models
  • [x] Filename is in the correct format (and correctly ordered)
  • [x] Targets the next minor version
  • [ ] All code paths have appropriate log messages
  • [x] Uses the correct utils
  • [x] Contains a minimal changeset
  • [x] Does not mix DDL/DML operations

Schema changes

  • [x] Both schema change and related migration have been implemented
  • [ ] For index changes: has been performance tested for large tables
  • [ ] ~For new tables/columns: fields use the appropriate predefined field lengths~
  • [ ] ~For new tables/columns: field names follow the appropriate conventions~
  • [ ] ~Does not drop a non-alpha table outside of a major version~

Data changes

  • [ ] ~Mass updates/inserts are batched appropriately~
  • [ ] ~Does not loop over large tables/datasets~
  • [ ] ~Defends against missing or invalid data~
  • [ ] ~For settings updates: follows the appropriate guidelines~
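
For the idempotency and logging items above, a guard along these lines makes up() safe to re-run, with down() using the inverse check before dropping (a sketch: hasIndex is a hypothetical helper, SHOW INDEX ... WHERE is MySQL-specific, and knex.raw with the mysql2 client resolves to [rows, fields]):

const logging = require('@tryghost/logging');

// Hypothetical helper: true if the named index already exists on the table.
async function hasIndex(knex, tableName, indexName) {
    const [rows] = await knex.raw('SHOW INDEX FROM ?? WHERE Key_name = ?', [tableName, indexName]);
    return rows.length > 0;
}

async function up(knex) {
    if (await hasIndex(knex, 'posts', 'status_type')) {
        logging.warn('Index status_type already exists on posts, skipping');
        return;
    }
    logging.info('Adding index status_type on posts(status, type)');
    await knex.schema.alterTable('posts', (table) => {
        table.index(['status', 'type'], 'status_type');
    });
}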

github-actions[bot] avatar May 09 '24 10:05 github-actions[bot]

@vershwal Thanks for creating this! Can you attach EXPLAIN ANALYZE output for these queries before and after the change, so we can understand the impact on the query plan? Can you also test the before/after on a staging database that exhibits this problem, and share the improvement the customer would experience from the browser?
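
For example, a plan can be captured against staging with a one-off script like this (a sketch: the connection string env var is assumed, the query text is the posts count query that appears later in this thread, and MySQL 8 returns EXPLAIN ANALYZE output in a single EXPLAIN column):

const knex = require('knex')({
    client: 'mysql2',
    connection: process.env.STAGING_DATABASE_URL // assumed staging DSN
});

(async () => {
    // Run once before and once after the migration, then diff the two plans.
    const [rows] = await knex.raw(`
        EXPLAIN ANALYZE
        SELECT COUNT(DISTINCT posts.id) AS aggregate
        FROM posts
        WHERE posts.status IN ('draft') AND posts.type = 'post'
    `);
    console.log(rows.map(row => row.EXPLAIN).join('\n'));
    await knex.destroy();
})();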

erik-ghost avatar May 09 '24 13:05 erik-ghost

Query 1

SELECT `posts`.*,
  (SELECT COUNT(DISTINCT `members_click_events`.`member_id`)
   FROM `members_click_events`
   INNER JOIN `redirects`
     ON `members_click_events`.`redirect_id` = `redirects`.`id`
   WHERE `posts`.`id` = `redirects`.`post_id`) AS `count__clicks`
FROM `posts`
WHERE `posts`.`status` IN ('draft')
  AND `posts`.`type` = 'post'
ORDER BY
  CASE WHEN `posts`.`status` = 'scheduled' THEN 1
       WHEN `posts`.`status` = 'draft' THEN 2
       ELSE 3 END ASC,
  CASE WHEN `posts`.`status` != 'draft' THEN `posts`.`published_at` END DESC,
  `posts`.`updated_at` DESC,
  `posts`.`id` DESC
LIMIT 20

Current behavior:

-> Limit: 20 row(s)  (cost=6015 rows=20) (actual time=876..876 rows=20 loops=1)
    -> Sort row IDs: (case when (posts.`status` = 'scheduled') then 1 when (posts.`status` = 'draft') then 2 else 3 end), (case when (posts.`status` <> 'draft') then posts.published_at end) DESC, posts.updated_at DESC, posts.id DESC, limit input to 20 row(s) per chunk  (cost=6015 rows=7991) (actual time=876..876 rows=20 loops=1)
        -> Filter: ((posts.`type` = 'post') and (posts.`status` = 'draft'))  (cost=6015 rows=7991) (actual time=0.615..875 rows=800 loops=1)
            -> Table scan on posts  (cost=6015 rows=7991) (actual time=0.604..872 rows=15000 loops=1)
-> Select #2 (subquery in projection; dependent)
    -> Aggregate: count(distinct members_click_events.member_id)  (cost=1091 rows=1) (actual time=0.0633..0.0633 rows=1 loops=20)
        -> Nested loop inner join  (cost=1000 rows=908) (actual time=0.063..0.063 rows=0 loops=20)
            -> Covering index lookup on redirects using redirects_post_id_foreign (post_id=posts.id)  (cost=1.6 rows=5.29) (actual time=0.0629..0.0629 rows=0 loops=20)
            -> Index lookup on members_click_events using members_click_events_redirect_id_foreign (redirect_id=redirects.id)  (cost=175 rows=171) (never executed)

After adding the index:

-> Limit: 20 row(s)  (cost=280 rows=20) (actual time=120..121 rows=20 loops=1)
    -> Sort row IDs: (case when (posts.`status` = 'scheduled') then 1 when (posts.`status` = 'draft') then 2 else 3 end), (case when (posts.`status` <> 'draft') then posts.published_at end) DESC, posts.updated_at DESC, posts.id DESC, limit input to 20 row(s) per chunk  (cost=280 rows=800) (actual time=120..121 rows=20 loops=1)
        -> Index lookup on posts using status_type (status='draft', type='post')  (cost=280 rows=800) (actual time=0.662..119 rows=800 loops=1)
-> Select #2 (subquery in projection; dependent)
    -> Aggregate: count(distinct members_click_events.member_id)  (cost=1091 rows=1) (actual time=0.0395..0.0395 rows=1 loops=20)
        -> Nested loop inner join  (cost=1000 rows=908) (actual time=0.0391..0.0391 rows=0 loops=20)
            -> Covering index lookup on redirects using redirects_post_id_foreign (post_id=posts.id)  (cost=1.59 rows=5.29) (actual time=0.0389..0.0389 rows=0 loops=20)
            -> Index lookup on members_click_events using members_click_events_redirect_id_foreign (redirect_id=redirects.id)  (cost=175 rows=171) (never executed)

The estimated cost has decreased from 6015 to 280, and the actual time taken has decreased from 876 ms to 120 ms.

vershwal avatar May 09 '24 13:05 vershwal

Query 2

SELECT COUNT(DISTINCT posts.id) AS aggregate
FROM `posts`
WHERE (
  `posts`.`status` in ('draft') AND `posts`.`type` = 'post'
)

Current behavior:

-> Limit: 200 row(s)  (cost=6744 rows=1) (actual time=882..882 rows=1 loops=1)
    -> Aggregate: count(distinct posts.id)  (cost=6744 rows=1) (actual time=882..882 rows=1 loops=1)
        -> Filter: ((posts.`type` = 'post') and (posts.`status` = 'draft'))  (cost=6736 rows=79.9) (actual time=1.53..880 rows=800 loops=1)
            -> Table scan on posts  (cost=6736 rows=7991) (actual time=1.51..878 rows=15000 loops=1)

After adding the index:

-> Limit: 200 row(s)  (cost=207 rows=1) (actual time=20.5..20.5 rows=1 loops=1)
    -> Aggregate: count(distinct posts.id)  (cost=207 rows=1) (actual time=20.5..20.5 rows=1 loops=1)
        -> Covering index lookup on posts using status_type (status='draft', type='post')  (cost=127 rows=800) (actual time=0.126..0.938 rows=800 loops=1)

The estimated cost has decreased from 6744 to 207, and the actual time taken has decreased from 882 ms to 20.5 ms.

vershwal avatar May 09 '24 14:05 vershwal