GUACAMOLE-1573: Slow selection of users at scale on PostgreSQL
Replaced the LEFT JOIN and MAX(start_date) aggregation with a correlated subquery that selects the most recent history record by start_date.
Fixed commit message.
Currently the request accepts a list of users, which usually means the number of passed user names can be assumed to be small.
As for the concern about N separate subqueries: that is not actually what happens in practice. Postgres uses an Index Only Scan at the final step of the selection, so it performs like an ordinary join.
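That final step relies on the composite index that shows up in the plans below as guacamole_user_history_user_id_start_date. Judging by its name and the Index Only Scan (Backward) nodes, I assume it is defined roughly like this (the actual schema definition may differ):

CREATE INDEX guacamole_user_history_user_id_start_date
    ON guacamole_user_history (user_id, start_date);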
I ran some tests on a real database: 963k users, 17.5M guacamole_user_history records.
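For reference, the two query shapes under comparison look roughly like this (a sketch that keeps only the columns relevant to the plans; the actual mapper query also selects the remaining user columns):

-- Before (sketch): LEFT JOIN + GROUP BY + MAX
SELECT guacamole_user.user_id, guacamole_entity.name,
       MAX(guacamole_user_history.start_date) AS last_active
FROM guacamole_user
JOIN guacamole_entity ON guacamole_entity.entity_id = guacamole_user.entity_id
LEFT JOIN guacamole_user_history
       ON guacamole_user_history.user_id = guacamole_user.user_id
WHERE guacamole_entity.type = 'USER'
GROUP BY guacamole_user.user_id, guacamole_entity.entity_id;

-- After (sketch): correlated subquery selecting the latest history record
SELECT guacamole_user.user_id, guacamole_entity.name,
       (SELECT start_date
          FROM guacamole_user_history
         WHERE guacamole_user_history.user_id = guacamole_user.user_id
         ORDER BY start_date DESC
         LIMIT 1) AS last_active
FROM guacamole_user
JOIN guacamole_entity ON guacamole_entity.entity_id = guacamole_user.entity_id
WHERE guacamole_entity.type = 'USER'
GROUP BY guacamole_user.user_id, guacamole_entity.entity_id;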
Here is the plan for selecting all users with LEFT JOIN + GROUP BY + MAX:
Finalize GroupAggregate (cost=14831818.70..16921374.20 rows=17301844 width=1373) (actual time=42046.398..45327.396 rows=963527 loops=1)
  Group Key: guacamole_user.user_id, guacamole_entity.entity_id
  Buffers: shared hit=7827722 read=133911 dirtied=5910, temp read=1025423 written=1026599
  I/O Timings: read=32732.059
  ->  Gather Merge (cost=14831818.70..16640219.23 rows=14418204 width=1373) (actual time=42046.380..44885.919 rows=963527 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=7827722 read=133911 dirtied=5910, temp read=1025423 written=1026599
        I/O Timings: read=32732.059
        ->  Partial GroupAggregate (cost=14830818.68..14975000.72 rows=7209102 width=1373) (actual time=39179.294..40950.738 rows=321176 loops=3)
              Group Key: guacamole_user.user_id, guacamole_entity.entity_id
              Buffers: shared hit=7827722 read=133911 dirtied=5910, temp read=1025423 written=1026599
              I/O Timings: read=32732.059
              ->  Sort (cost=14830818.68..14848841.43 rows=7209102 width=1373) (actual time=38626.534..39682.340 rows=5905855 loops=3)
                    Sort Key: guacamole_user.user_id, guacamole_entity.entity_id
                    Sort Method: external merge Disk: 1145272kB
                    Worker 0: Sort Method: external merge Disk: 639824kB
                    Worker 1: Sort Method: external merge Disk: 642448kB
                    Buffers: shared hit=7827722 read=133911 dirtied=5910, temp read=1025423 written=1026599
                    I/O Timings: read=32732.059
                    ->  Merge Left Join (cost=209055.40..1073389.42 rows=7209102 width=1373) (actual time=2177.709..32539.696 rows=5905855 loops=3)
                          Merge Cond: (guacamole_user.user_id = guacamole_user_history.user_id)
                          Buffers: shared hit=7827714 read=133911 dirtied=5910
                          I/O Timings: read=32732.059
                          ->  Nested Loop (cost=0.85..245764.48 rows=398933 width=1365) (actual time=0.521..1734.830 rows=321176 loops=3)
                                Buffers: shared hit=4422640 read=19582 dirtied=1
                                I/O Timings: read=2302.272
                                ->  Parallel Index Scan using guacamole_user_pkey on guacamole_user (cost=0.42..48764.29 rows=398933 width=1341) (actual time=0.233..608.004 rows=321176 loops=3)
                                      Buffers: shared hit=568501 read=15572 dirtied=1
                                      I/O Timings: read=1280.976
                                ->  Index Scan using guacamole_entity_pkey on guacamole_entity (cost=0.42..0.49 rows=1 width=28) (actual time=0.003..0.003 rows=1 loops=963527)
                                      Index Cond: (entity_id = guacamole_user.entity_id)
                                      Filter: (type = 'USER'::guacamole_entity_type)
                                      Buffers: shared hit=3854139 read=4010
                                      I/O Timings: read=1021.296
                          ->  Index Only Scan using guacamole_user_history_user_id_start_date on guacamole_user_history (cost=0.56..720887.62 rows=17301844 width=12) (actual time=0.023..28557.541 rows=17506069 loops=3)
                                Heap Fetches: 3571677
                                Buffers: shared hit=3405074 read=114329 dirtied=5909
                                I/O Timings: read=30429.786
Planning Time: 0.319 ms
Execution Time: 45976.878 ms
And here is the plan for the nested query:
Group (cost=1911971.99..2510148.36 rows=957439 width=1373) (actual time=2859.696..5722.555 rows=963531 loops=1)
  Group Key: guacamole_user.user_id, guacamole_entity.entity_id
  Buffers: shared hit=4973236 read=4785 dirtied=18, temp read=33234 written=33302
  I/O Timings: read=537.094
  ->  Sort (cost=1911971.99..1914365.59 rows=957439 width=1365) (actual time=2859.650..3068.616 rows=963531 loops=1)
        Sort Key: guacamole_user.user_id, guacamole_entity.entity_id
        Sort Method: external merge Disk: 129712kB
        Buffers: shared hit=918611 read=4472 dirtied=18, temp read=33234 written=33302
        I/O Timings: read=532.826
        ->  Merge Join (cost=0.99..108611.07 rows=957439 width=1365) (actual time=0.013..1899.534 rows=963531 loops=1)
              Merge Cond: (guacamole_user.entity_id = guacamole_entity.entity_id)
              Buffers: shared hit=918611 read=4472 dirtied=18
              I/O Timings: read=532.826
              ->  Index Scan using guacamole_user_single_entity on guacamole_user (cost=0.42..53808.28 rows=957439 width=1341) (actual time=0.005..943.326 rows=963531 loops=1)
                    Buffers: shared hit=471018 read=4472 dirtied=1
                    I/O Timings: read=532.826
              ->  Index Scan using guacamole_entity_pkey on guacamole_entity (cost=0.42..40865.32 rows=959338 width=28) (actual time=0.005..352.519 rows=963531 loops=1)
                    Filter: (type = 'USER'::guacamole_entity_type)
                    Buffers: shared hit=447593 dirtied=17
  SubPlan 1
    ->  Limit (cost=0.56..0.62 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=963531)
          Buffers: shared hit=4054625 read=313
          I/O Timings: read=4.268
          ->  Index Only Scan Backward using guacamole_user_history_user_id_start_date on guacamole_user_history (cost=0.56..35.83 rows=644 width=8) (actual time=0.002..0.002 rows=1 loops=963531)
                Index Cond: (user_id = guacamole_user.user_id)
                Heap Fetches: 30627
                Buffers: shared hit=4054625 read=313
                I/O Timings: read=4.268
Planning Time: 0.288 ms
Execution Time: 5796.778 ms
And here is the same comparison for selecting 10 users. LEFT JOIN:
HashAggregate (cost=337.53..339.34 rows=181 width=1373) (actual time=0.263..0.267 rows=10 loops=1)
  Group Key: guacamole_user.user_id, guacamole_entity.entity_id
  Buffers: shared hit=126
  ->  Nested Loop Left Join (cost=1.41..336.17 rows=181 width=1373) (actual time=0.033..0.210 rows=258 loops=1)
        Buffers: shared hit=126
        ->  Nested Loop (cost=0.85..148.63 rows=10 width=1365) (actual time=0.024..0.114 rows=10 loops=1)
              Buffers: shared hit=80
              ->  Index Scan using guacamole_entity_name_scope on guacamole_entity (cost=0.42..64.20 rows=10 width=28) (actual time=0.017..0.081 rows=10 loops=1)
                    Index Cond: ((type = 'USER'::guacamole_entity_type) AND ((name)::text = ANY ('{user1,user2,user3,user4,user5,user6,user7,user8,user9,user10}'::text[])))
                    Buffers: shared hit=39
              ->  Index Scan using guacamole_user_single_entity on guacamole_user (cost=0.42..8.44 rows=1 width=1341) (actual time=0.002..0.002 rows=1 loops=10)
                    Index Cond: (entity_id = guacamole_entity.entity_id)
                    Buffers: shared hit=41
        ->  Index Only Scan using guacamole_user_history_user_id_start_date on guacamole_user_history (cost=0.56..12.31 rows=644 width=12) (actual time=0.003..0.006 rows=26 loops=10)
              Index Cond: (user_id = guacamole_user.user_id)
              Heap Fetches: 0
              Buffers: shared hit=46
Planning Time: 0.329 ms
Execution Time: 0.302 ms
Nested query:
Group (cost=148.79..155.04 rows=10 width=1373) (actual time=0.176..0.213 rows=10 loops=1)
  Group Key: guacamole_user.user_id, guacamole_entity.entity_id
  Buffers: shared hit=124
  ->  Sort (cost=148.79..148.82 rows=10 width=1365) (actual time=0.163..0.165 rows=10 loops=1)
        Sort Key: guacamole_user.user_id, guacamole_entity.entity_id
        Sort Method: quicksort Memory: 27kB
        Buffers: shared hit=80
        ->  Nested Loop (cost=0.85..148.63 rows=10 width=1365) (actual time=0.034..0.137 rows=10 loops=1)
              Buffers: shared hit=80
              ->  Index Scan using guacamole_entity_name_scope on guacamole_entity (cost=0.42..64.20 rows=10 width=28) (actual time=0.025..0.097 rows=10 loops=1)
                    Index Cond: ((type = 'USER'::guacamole_entity_type) AND ((name)::text = ANY ('{user1,user2,user3,user4,user5,user6,user7,user8,user9,user10}'::text[])))
                    Buffers: shared hit=39
              ->  Index Scan using guacamole_user_single_entity on guacamole_user (cost=0.42..8.44 rows=1 width=1341) (actual time=0.003..0.003 rows=1 loops=10)
                    Index Cond: (entity_id = guacamole_entity.entity_id)
                    Buffers: shared hit=41
  SubPlan 1
    ->  Limit (cost=0.56..0.62 rows=1 width=8) (actual time=0.004..0.004 rows=1 loops=10)
          Buffers: shared hit=44
          ->  Index Only Scan Backward using guacamole_user_history_user_id_start_date on guacamole_user_history (cost=0.56..35.83 rows=644 width=8) (actual time=0.004..0.004 rows=1 loops=10)
                Index Cond: (user_id = guacamole_user.user_id)
                Heap Fetches: 0
                Buffers: shared hit=44
Planning Time: 0.244 ms
Execution Time: 0.253 ms
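For anyone reproducing these numbers: the Buffers and I/O Timings lines in the plans above come from something like the following (the SELECT is a placeholder for the query under test):

SET track_io_timing = on;   -- enables the "I/O Timings" lines; typically requires superuser
EXPLAIN (ANALYZE, BUFFERS)  -- actual row counts and times plus the "Buffers" lines
SELECT ...;                 -- placeholder for the query under test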
As you can see, the cost is higher for the LEFT JOIN in both cases (45976.878 ms vs 5796.778 ms for all users, and 0.302 ms vs 0.253 ms for 10 users).
While I generally lean toward using JOINs instead of nested queries in SQL, and I'm skeptical of situations like this, I think this one actually makes sense: the subquery pre-sorts and eliminates a potentially large amount of extra data from the guacamole_user_history table, data that would otherwise have to be JOINed to the rows pulled from the guacamole_user table.
The only situation where I think this may end up being sub-optimal is if, for some reason, the data in the guacamole_user_history table is more fragmented or unsorted and the ORDER BY start_date DESC plus LIMIT 1 ends up being expensive. Even then, it may still be cheaper than pulling out all of the guacamole_user_history records and JOINing them.
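For what it's worth, the same per-user "latest record" access pattern can also be written as a JOIN via LATERAL, which PostgreSQL plans essentially the same way as the correlated subquery. A sketch, not part of this change:

SELECT u.user_id, e.name, h.start_date AS last_active
FROM guacamole_user u
JOIN guacamole_entity e
  ON e.entity_id = u.entity_id
 AND e.type = 'USER'
LEFT JOIN LATERAL (
    SELECT start_date
      FROM guacamole_user_history
     WHERE user_id = u.user_id
     ORDER BY start_date DESC
     LIMIT 1
) h ON true;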