[IMP] fill_stock_move_bom_line_id query performance
We got performance problems on this query for a big stock.move table, the alternative is a lot faster.
Needs a review to verify if the results are 100% the same, though.
@ddejong-therp
@MiquelRForgeFlow fixed
@MiquelRForgeFlow have you checked that the result query is the same?
No, but it seems it has to be the same result by just comparing the code of the two queries :S
I don't get why 2 subselect is more optimal than the previous one, and no stats or explain analyze is shared for comparing them.
I don't get why 2 subselect is more optimal than the previous one
Me neither, but I suppose @thomaspaulb can give us some numbers, right?
@pedrobaeza I sent a message to the person who actually ran the query, maybe he can share an EXPLAIN ANALYZE.
The stats were that on roughly 1.8 million stock moves, the original query took 5+ hours, and the new query was done in a few minutes, but the person mentioned above can confirm also.
I can confirm what @thomaspaulb described - original query took at least 5 hours before I manually aborted it (it never completed), new query took a few minutes.
I will try to get an EXPLAIN ANALYZE if it's helpful - but might take a bit before I do. Will post here.
The important thing also is the postgres version. Maybe the problem is a too old PG version.
The version is:
/usr/lib/postgresql/13/bin/postgres --version
postgres (PostgreSQL) 13.8 (Ubuntu 13.8-1.pgdg18.04+1)
I have tried on a v10 DB with PG 13 both queries removing the NULL condition for applying to all records (and with a decent number or records), with very similar results in time, and more convoluted with your query (but probably PG 13 optimizes correctly, not sure about older ones):
WITH EXISTING QUERY
EXPLAIN ANALYZE UPDATE stock_move sm_update
SET bom_line_id = mbl.bom_line_id
FROM stock_move sm
INNER JOIN mrp_production mp
ON sm.raw_material_production_id = mp.id
LEFT JOIN (
SELECT min(id) AS bom_line_id, bom_id, product_id
FROM mrp_bom_line
GROUP BY bom_id, product_id
) mbl ON (
mbl.bom_id = mp.bom_id
AND mbl.product_id = sm.product_id)
WHERE sm_update.raw_material_production_id = mp.id
AND sm_update.id = sm.id;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Update on stock_move sm_update (cost=68615.23..112151.30 rows=23 width=380) (actual time=27933.209..27933.216 rows=0 loops=1)
-> Hash Left Join (cost=68615.23..112151.30 rows=23 width=380) (actual time=1145.381..5529.130 rows=331251 loops=1)
Hash Cond: ((mp.bom_id = mbl.bom_id) AND (sm.product_id = mbl.product_id))
-> Nested Loop (cost=52906.67..96442.63 rows=23 width=348) (actual time=551.805..4635.343 rows=331251 loops=1)
-> Hash Join (cost=52906.25..96432.12 rows=23 width=342) (actual time=551.721..1487.570 rows=331251 loops=1)
Hash Cond: ((sm_update.raw_material_production_id = sm.raw_material_production_id) AND (sm_update.id = sm.id))
-> Seq Scan on stock_move sm_update (cost=0.00..38474.90 rows=962090 width=328) (actual time=53.167..746.456 rows=615840 loops=1)
-> Hash (cost=38474.90..38474.90 rows=962090 width=18) (actual time=495.553..495.555 rows=331251 loops=1)
Buckets: 1048576 Batches: 1 Memory Usage: 25014kB
-> Seq Scan on stock_move sm (cost=0.00..38474.90 rows=962090 width=18) (actual time=0.034..387.277 rows=615840 loops=1)
-> Index Scan using mrp_production_pkey on mrp_production mp (cost=0.42..0.46 rows=1 width=14) (actual time=0.009..0.009 rows=1 loops=331251)
Index Cond: (id = sm.raw_material_production_id)
-> Hash (cost=14636.73..14636.73 rows=71455 width=48) (actual time=593.118..593.120 rows=430396 loops=1)
Buckets: 524288 (originally 131072) Batches: 1 (originally 1) Memory Usage: 37721kB
-> Subquery Scan on mbl (cost=13207.63..14636.73 rows=71455 width=48) (actual time=280.643..460.069 rows=430396 loops=1)
-> HashAggregate (cost=13207.63..13922.18 rows=71455 width=12) (actual time=280.630..386.938 rows=430396 loops=1)
Group Key: mrp_bom_line.bom_id, mrp_bom_line.product_id
Batches: 1 Memory Usage: 65553kB
-> Seq Scan on mrp_bom_line (cost=0.00..9775.79 rows=457579 width=12) (actual time=0.267..64.367 rows=457493 loops=1)
Planning Time: 2.019 ms
Trigger for constraint stock_move_bom_line_id_fkey: time=400.128 calls=10741
JIT:
Functions: 34
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 4.723 ms, Inlining 0.000 ms, Optimization 3.774 ms, Emission 49.385 ms, Total 57.882 ms
Execution Time: 28470.381 ms
WITH YOUR QUERY
EXPLAIN ANALYZE UPDATE stock_move sm_update
SET bom_line_id = q.bom_line_id
FROM (
SELECT sm.id, mbl.bom_line_id
FROM stock_move sm
INNER JOIN mrp_production mp
ON sm.raw_material_production_id = mp.id
LEFT JOIN (
SELECT min(id) AS bom_line_id, bom_id, product_id
FROM mrp_bom_line
GROUP BY bom_id, product_id
) mbl ON (
mbl.bom_id = mp.bom_id
AND mbl.product_id = sm.product_id
)
WHERE sm.raw_material_production_id = mp.id
) q
WHERE sm_update.id = q.id;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
Update on stock_move sm_update (cost=77582.77..130574.52 rows=684522 width=380) (actual time=25668.488..25668.506 rows=0 loops=1)
-> Hash Left Join (cost=77582.77..130574.52 rows=684522 width=380) (actual time=1311.538..2827.397 rows=331251 loops=1)
Hash Cond: ((mp.bom_id = mbl.bom_id) AND (sm.product_id = mbl.product_id))
-> Hash Join (cost=61874.22..111272.22 rows=684522 width=348) (actual time=765.058..2011.946 rows=331251 loops=1)
Hash Cond: (sm.raw_material_production_id = mp.id)
-> Hash Join (cost=54360.75..101961.85 rows=684522 width=342) (actual time=635.989..1665.379 rows=615840 loops=1)
Hash Cond: (sm_update.id = sm.id)
-> Seq Scan on stock_move sm_update (cost=0.00..45804.22 rows=684522 width=328) (actual time=0.040..662.032 rows=615840 loops=1)
-> Hash (cost=45804.22..45804.22 rows=684522 width=18) (actual time=632.795..632.797 rows=615840 loops=1)
Buckets: 1048576 Batches: 1 Memory Usage: 38354kB
-> Seq Scan on stock_move sm (cost=0.00..45804.22 rows=684522 width=18) (actual time=0.027..438.542 rows=615840 loops=1)
-> Hash (cost=5679.32..5679.32 rows=146732 width=14) (actual time=128.130..128.131 rows=146549 loops=1)
Buckets: 262144 Batches: 1 Memory Usage: 8917kB
-> Seq Scan on mrp_production mp (cost=0.00..5679.32 rows=146732 width=14) (actual time=42.048..83.243 rows=146549 loops=1)
-> Hash (cost=14636.73..14636.73 rows=71455 width=48) (actual time=546.018..546.020 rows=430396 loops=1)
Buckets: 524288 (originally 131072) Batches: 1 (originally 1) Memory Usage: 37721kB
-> Subquery Scan on mbl (cost=13207.63..14636.73 rows=71455 width=48) (actual time=257.731..427.076 rows=430396 loops=1)
-> HashAggregate (cost=13207.63..13922.18 rows=71455 width=12) (actual time=257.720..357.283 rows=430396 loops=1)
Group Key: mrp_bom_line.bom_id, mrp_bom_line.product_id
Batches: 1 Memory Usage: 65553kB
-> Seq Scan on mrp_bom_line (cost=0.00..9775.79 rows=457579 width=12) (actual time=0.031..42.753 rows=457493 loops=1)
Planning Time: 3.334 ms
Trigger for constraint stock_move_bom_line_id_fkey: time=437.679 calls=10741
JIT:
Functions: 34
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 5.496 ms, Inlining 0.000 ms, Optimization 3.364 ms, Emission 38.766 ms, Total 47.627 ms
Execution Time: 26135.226 ms
so there's still something weird in you statement about the big difference in times. Maybe a missing index when running one query or the other?
To give some context, this is all part of our migration from V8.0 to (eventually) V14.0, and I just happen to be around to do it, but my knowledge here is very limited. So I can't say if anything is 'missing' - but here are the indexes that exist in stock_move along the way:
V8 Original Indexes
Indexes:
"stock_move_pkey" PRIMARY KEY, btree (id)
"stock_move_company_id_index" btree (company_id)
"stock_move_create_date_index" btree (create_date)
"stock_move_date_expected_index" btree (date_expected)
"stock_move_date_index" btree (date)
"stock_move_invoice_state_index" btree (invoice_state)
"stock_move_location_dest_id_index" btree (location_dest_id)
"stock_move_location_id_index" btree (location_id)
"stock_move_move_dest_id_index" btree (move_dest_id)
"stock_move_name_index" btree (name)
"stock_move_picking_id_index" btree (picking_id)
"stock_move_product_id_index" btree (product_id)
"stock_move_product_location_index" btree (product_id, location_id, location_dest_id, company_id, state)
"stock_move_production_id_index" btree (production_id)
"stock_move_purchase_line_id_index" btree (purchase_line_id)
"stock_move_raw_material_production_id_index" btree (raw_material_production_id)
"stock_move_state_index" btree (state)
V9 Indexes
Indexes:
"stock_move_pkey" PRIMARY KEY, btree (id)
"stock_move_company_id_index" btree (company_id)
"stock_move_create_date_index" btree (create_date)
"stock_move_date_expected_index" btree (date_expected)
"stock_move_date_index" btree (date)
"stock_move_location_dest_id_index" btree (location_dest_id)
"stock_move_location_id_index" btree (location_id)
"stock_move_move_dest_id_index" btree (move_dest_id)
"stock_move_name_index" btree (name)
"stock_move_openupgrade_legacy_9_0_invoice_state_index" btree (openupgrade_legacy_9_0_invoice_state)
"stock_move_picking_id_index" btree (picking_id)
"stock_move_product_id_index" btree (product_id)
"stock_move_product_location_index" btree (product_id, location_id, location_dest_id, company_id, state)
"stock_move_production_id_index" btree (production_id)
"stock_move_purchase_line_id_index" btree (purchase_line_id)
"stock_move_raw_material_production_id_index" btree (raw_material_production_id)
"stock_move_state_index" btree (state)
V10 Indexes
Indexes:
"stock_move_pkey" PRIMARY KEY, btree (id)
"stock_move_company_id_index" btree (company_id)
"stock_move_create_date_index" btree (create_date)
"stock_move_date_expected_index" btree (date_expected)
"stock_move_date_index" btree (date)
"stock_move_location_dest_id_index" btree (location_dest_id)
"stock_move_location_id_index" btree (location_id)
"stock_move_move_dest_id_index" btree (move_dest_id)
"stock_move_name_index" btree (name)
"stock_move_openupgrade_legacy_9_0_invoice_state_index" btree (openupgrade_legacy_9_0_invoice_state)
"stock_move_picking_id_index" btree (picking_id)
"stock_move_product_id_index" btree (product_id)
"stock_move_product_location_index" btree (product_id, location_id, location_dest_id, company_id, state)
"stock_move_purchase_line_id_index" btree (purchase_line_id)
"stock_move_state_index" btree (state)
I would actually love to hear that there's a missing index, as I'm currently running stock_account/migrations/13.0.1.1/post-migration.py (from this PR) for 24h+, on 1.8M stock_moves with no end in site.
New query:
EXPLAIN ANALYSE UPDATE stock_move sm_update
SET bom_line_id = q.bom_line_id
FROM (
SELECT sm.id, mbl.bom_line_id
FROM stock_move sm
INNER JOIN mrp_production mp
ON sm.raw_material_production_id = mp.id
LEFT JOIN (
SELECT min(id) AS bom_line_id, bom_id, product_id
FROM mrp_bom_line
GROUP BY bom_id, product_id
) mbl ON (
mbl.bom_id = mp.bom_id
AND mbl.product_id = sm.product_id
)
WHERE sm.raw_material_production_id = mp.id
AND mbl.bom_line_id IS NOT NULL
) q
WHERE sm_update.id = q.id;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
Update on stock_move sm_update (cost=507.11..17103.77 rows=493 width=386) (actual time=3661.297..3661.301 rows=0 loops=1)
-> Nested Loop (cost=507.11..17103.77 rows=493 width=386) (actual time=15.258..638.819 rows=48895 loops=1)
-> Hash Join (cost=506.69..16816.23 rows=493 width=56) (actual time=14.608..293.031 rows=48895 loops=1)
Hash Cond: ((mp.bom_id = mbl.bom_id) AND (sm.product_id = mbl.product_id))
-> Hash Join (cost=282.36..14910.94 rows=320182 width=24) (actual time=5.012..250.555 rows=163082 loops=1)
Hash Cond: (sm.raw_material_production_id = mp.id)
-> Seq Scan on stock_move sm (cost=0.00..13787.82 rows=320182 width=18) (actual time=0.247..193.144 rows=320182 loops=1)
-> Hash (cost=204.38..204.38 rows=6238 width=14) (actual time=4.720..4.721 rows=6238 loops=1)
Buckets: 8192 Batches: 1 Memory Usage: 355kB
-> Seq Scan on mrp_production mp (cost=0.00..204.38 rows=6238 width=14) (actual time=0.261..3.281 rows=6238 loops=1)
-> Hash (cost=206.72..206.72 rows=1174 width=48) (actual time=9.580..9.581 rows=5332 loops=1)
Buckets: 8192 (originally 2048) Batches: 1 (originally 1) Memory Usage: 481kB
-> Subquery Scan on mbl (cost=183.18..206.72 rows=1174 width=48) (actual time=6.202..8.366 rows=5332 loops=1)
-> HashAggregate (cost=183.18..194.98 rows=1174 width=12) (actual time=6.190..7.223 rows=5332 loops=1)
Group Key: mrp_bom_line.bom_id, mrp_bom_line.product_id
Filter: (min(mrp_bom_line.id) IS NOT NULL)
Batches: 1 Memory Usage: 721kB
-> Seq Scan on mrp_bom_line (cost=0.00..129.59 rows=5359 width=12) (actual time=0.594..2.514 rows=5359 loops=1)
-> Index Scan using stock_move_pkey on stock_move sm_update (cost=0.42..0.58 rows=1 width=334) (actual time=0.006..0.006 rows=1 loops=48895)
Index Cond: (id = sm.id)
Planning Time: 25.015 ms
Trigger for constraint stock_move_bom_line_id_fkey: time=3.612 calls=308
Execution Time: 3665.446 ms
(23 rows)
Old query:
EXPLAIN ANALYSE UPDATE stock_move sm_update
SET bom_line_id = mbl.bom_line_id
FROM stock_move sm
INNER JOIN mrp_production mp
ON sm.raw_material_production_id = mp.id
LEFT JOIN (
SELECT min(id) AS bom_line_id, bom_id, product_id
FROM mrp_bom_line
GROUP BY bom_id, product_id
) mbl ON (
mbl.bom_id = mp.bom_id
AND mbl.product_id = sm.product_id)
WHERE sm_update.raw_material_production_id = mp.id
AND sm_update.id = sm.id
AND sm_update.bom_line_id IS NULL;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Update on stock_move sm_update (cost=20678.92..60506.90 rows=26 width=386) (actual time=6599.045..6599.048 rows=0 loops=1)
-> Hash Left Join (cost=20678.92..60506.90 rows=26 width=386) (actual time=490.282..1305.841 rows=102883 loops=1)
Hash Cond: ((mp.bom_id = mbl.bom_id) AND (sm.product_id = mbl.product_id))
-> Nested Loop (cost=20467.83..60295.68 rows=26 width=354) (actual time=471.067..1236.679 rows=102883 loops=1)
-> Hash Join (cost=20467.55..60287.83 rows=26 width=348) (actual time=387.341..877.659 rows=102883 loops=1)
Hash Cond: ((sm_update.raw_material_production_id = sm.raw_material_production_id) AND (sm_update.id = sm.id))
-> Seq Scan on stock_move sm_update (cost=0.00..13787.82 rows=259326 width=334) (actual time=0.801..242.594 rows=259580 loops=1)
Filter: (bom_line_id IS NULL)
Rows Removed by Filter: 60602
-> Hash (cost=13787.82..13787.82 rows=320182 width=18) (actual time=385.398..385.399 rows=163082 loops=1)
Buckets: 65536 Batches: 8 Memory Usage: 1561kB
-> Seq Scan on stock_move sm (cost=0.00..13787.82 rows=320182 width=18) (actual time=0.010..340.770 rows=320182 loops=1)
-> Index Scan using mrp_production_pkey on mrp_production mp (cost=0.28..0.30 rows=1 width=14) (actual time=0.003..0.003 rows=1 loops=102883)
Index Cond: (id = sm.raw_material_production_id)
-> Hash (cost=193.38..193.38 rows=1180 width=48) (actual time=19.155..19.156 rows=5332 loops=1)
Buckets: 8192 (originally 2048) Batches: 1 (originally 1) Memory Usage: 481kB
-> Subquery Scan on mbl (cost=169.78..193.38 rows=1180 width=48) (actual time=10.956..16.098 rows=5332 loops=1)
-> HashAggregate (cost=169.78..181.58 rows=1180 width=12) (actual time=10.924..13.178 rows=5332 loops=1)
Group Key: mrp_bom_line.bom_id, mrp_bom_line.product_id
Batches: 1 Memory Usage: 721kB
-> Seq Scan on mrp_bom_line (cost=0.00..129.59 rows=5359 width=12) (actual time=0.326..3.181 rows=5359 loops=1)
Planning Time: 35.661 ms
Trigger for constraint stock_move_bom_line_id_fkey: time=3.780 calls=255
Execution Time: 6603.923 ms
In other words, there seems to be a speed difference for me here, but not with the same order of magnitude compared with thomas'/(gal)amit's database. In my personal test situation, it didn't seem too significant. I'm using postgres 13.8 myself. What version are you using @galamit86 ?
Not really difference, as the time gained on execution is spent on planning.
My postgres version is the same as @thomaspaulb mentioned:
/usr/lib/postgresql/13/bin/postgres --version
postgres (PostgreSQL) 13.8 (Ubuntu 13.8-1.pgdg18.04+1)
Following a comment from @ddejong-therp, regarding an index that exists on his DB but not on mine, I've added the following index:
"stock_move_location_id_location_dest_id_product_id_state" btree (product_id, state, location_id, location_dest_id)
which dramatically improved the resulting query time, so good lead @pedrobaeza !
See below the detailed results. The new query still beats the existing one, being about 4 times as fast - but it becomes a non-issue in the overall process now, so I can imagine we might not want to change it to the new one as it is less tested, and I might have a missing index or a different discrepancy on this table or mrp_production (see below those indexes for completeness). I leave it to the experts to decide :)
Happy to provide more info if needed for the decision.
Existing Query
EXPLAIN ANALYZE UPDATE stock_move sm_update
SET bom_line_id = mbl.bom_line_id
FROM stock_move sm
INNER JOIN mrp_production mp
ON sm.raw_material_production_id = mp.id
LEFT JOIN (
SELECT min(id) AS bom_line_id, bom_id, product_id
FROM mrp_bom_line
GROUP BY bom_id, product_id
) mbl ON (
mbl.bom_id = mp.bom_id
AND mbl.product_id = sm.product_id)
WHERE sm_update.raw_material_production_id = mp.id
AND sm_update.id = sm.id
AND sm_update.bom_line_id IS NULL
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Update on stock_move sm_update (cost=313252.00..725043.04 rows=164 width=358) (actual time=202893.685..202893.690 rows=0 loops=1)
-> Hash Left Join (cost=313252.00..725043.04 rows=164 width=358) (actual time=3410.207..16275.697 rows=1400594 loops=1)
Hash Cond: ((mp.bom_id = mbl.bom_id) AND (sm.product_id = mbl.product_id))
-> Nested Loop (cost=302182.57..713972.74 rows=164 width=326) (actual time=3039.606..11192.597 rows=1400594 loops=1)
-> Hash Join (cost=302182.28..713703.49 rows=164 width=320) (actual time=3039.363..8643.307 rows=1400594 loops=1)
Hash Cond: ((sm_update.raw_material_production_id = sm.raw_material_production_id) AND (sm_update.id = sm.id))
-> Seq Scan on stock_move sm_update (cost=0.00..265070.31 rows=1558459 width=306) (actual time=546.962..2687.483 rows=1582529 loops=1)
Filter: (bom_line_id IS NULL)
Rows Removed by Filter: 228189
-> Hash (cost=265070.31..265070.31 rows=1779131 width=18) (actual time=2488.035..2488.036 rows=1628783 loops=1)
Buckets: 32768 Batches: 128 Memory Usage: 903kB
-> Seq Scan on stock_move sm (cost=0.00..265070.31 rows=1779131 width=18) (actual time=0.011..2032.264 rows=1810718 loops=1)
-> Index Scan using mrp_production_pkey on mrp_production mp (cost=0.29..1.64 rows=1 width=14) (actual time=0.001..0.001 rows=1 loops=1400594)
Index Cond: (id = sm.raw_material_production_id)
-> Hash (cost=10853.56..10853.56 rows=14392 width=48) (actual time=370.000..370.001 rows=143883 loops=1)
Buckets: 16384 (originally 16384) Batches: 16 (originally 1) Memory Usage: 1183kB
-> Subquery Scan on mbl (cost=9160.21..10853.56 rows=14392 width=48) (actual time=147.276..319.263 rows=143883 loops=1)
-> HashAggregate (cost=9160.21..10709.64 rows=14392 width=12) (actual time=147.270..284.995 rows=143883 loops=1)
Group Key: mrp_bom_line.bom_id, mrp_bom_line.product_id
Batches: 33 Memory Usage: 1393kB Disk Usage: 7248kB
-> Seq Scan on mrp_bom_line (cost=0.00..3656.24 rows=143924 width=12) (actual time=0.184..33.980 rows=143924 loops=1)
Planning Time: 5.459 ms
JIT:
Functions: 40
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 5.572 ms, Inlining 79.877 ms, Optimization 317.485 ms, Emission 192.228 ms, Total 595.162 ms
Execution Time: 202900.788 ms
New Query
EXPLAIN ANALYZE UPDATE stock_move sm_update
SET bom_line_id = q.bom_line_id
FROM (
SELECT sm.id, mbl.bom_line_id
FROM stock_move sm
INNER JOIN mrp_production mp
ON sm.raw_material_production_id = mp.id
LEFT JOIN (
SELECT min(id) AS bom_line_id, bom_id, product_id
FROM mrp_bom_line
GROUP BY bom_id, product_id
) mbl ON (
mbl.bom_id = mp.bom_id
AND mbl.product_id = sm.product_id
AND mbl.bom_line_id IS NOT NULL
)
WHERE sm.raw_material_production_id = mp.id
AND mbl.bom_line_id IS NOT NULL
) q
WHERE sm_update.id = q.id; QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
Update on stock_move sm_update (cost=18488.14..322749.50 rows=5535 width=412) (actual time=47441.125..47441.129 rows=0 loops=1)
-> Nested Loop (cost=18488.14..322749.50 rows=5535 width=412) (actual time=1971.757..29514.071 rows=227535 loops=1)
-> Hash Join (cost=18487.71..312777.15 rows=5535 width=56) (actual time=1971.667..4286.512 rows=227535 loops=1)
Hash Cond: ((mp.bom_id = mbl.bom_id) AND (sm.product_id = mbl.product_id))
-> Hash Join (cost=6702.26..291480.92 rows=1811577 width=24) (actual time=107.229..3071.688 rows=1628783 loops=1)
Hash Cond: (sm.raw_material_production_id = mp.id)
-> Seq Scan on stock_move sm (cost=0.00..258612.77 rows=1811577 width=18) (actual time=32.917..2358.605 rows=1810718 loops=1)
-> Hash (cost=6063.78..6063.78 rows=36678 width=14) (actual time=74.158..74.159 rows=36678 loops=1)
Buckets: 32768 Batches: 2 Memory Usage: 1132kB
-> Seq Scan on mrp_production mp (cost=0.00..6063.78 rows=36678 width=14) (actual time=25.975..67.058 rows=36678 loops=1)
-> Hash (cost=11571.74..11571.74 rows=14248 width=48) (actual time=246.002..246.003 rows=143883 loops=1)
Buckets: 16384 (originally 16384) Batches: 16 (originally 1) Memory Usage: 1183kB
-> Subquery Scan on mbl (cost=9879.83..11571.74 rows=14248 width=48) (actual time=79.662..206.653 rows=143883 loops=1)
-> HashAggregate (cost=9879.83..11429.26 rows=14248 width=12) (actual time=79.657..176.290 rows=143883 loops=1)
Group Key: mrp_bom_line.bom_id, mrp_bom_line.product_id
Filter: ((min(mrp_bom_line.id) IS NOT NULL) AND (min(mrp_bom_line.id) IS NOT NULL))
Batches: 33 Memory Usage: 1393kB Disk Usage: 7248kB
-> Seq Scan on mrp_bom_line (cost=0.00..3656.24 rows=143924 width=12) (actual time=0.004..9.685 rows=143924 loops=1)
-> Index Scan using stock_move_pkey on stock_move sm_update (cost=0.43..1.80 rows=1 width=360) (actual time=0.109..0.109 rows=1 loops=227535)
Index Cond: (id = sm.id)
Planning Time: 2.549 ms
JIT:
Functions: 37
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 5.684 ms, Inlining 0.000 ms, Optimization 2.307 ms, Emission 33.151 ms, Total 41.142 ms
Execution Time: 47446.693 ms
Indexes on mrp_production
Indexes:
"mrp_production_pkey" PRIMARY KEY, btree (id)
"mrp_production_company_id_index" btree (company_id)
"mrp_production_date_deadline_index" btree (date_deadline)
"mrp_production_date_finished_index" btree (date_finished)
"mrp_production_date_planned_finished_index" btree (date_planned_finished)
"mrp_production_date_planned_start_index" btree (date_planned_start)
"mrp_production_date_start_index" btree (date_start)
"mrp_production_message_main_attachment_id_index" btree (message_main_attachment_id)
"mrp_production_name_uniq" UNIQUE CONSTRAINT, btree (name, company_id)
"mrp_production_reservation_state_index" btree (reservation_state)
"mrp_production_state_index" btree (state)
Okay, I think while my new query does indeed lead to better performance, the real issue was the missing index.
@ddejong-therp We are by default running upgrade base on the pre-migration version, right? This should try to recreate any missing indexes, if I'm not mistaken. Could we perhaps add a script that checks the output of that for any error messages about why it could not create that index, and if so, error out? In that case, we would have spotted the missing index almost immediately and could have created it.
I think it's better to do -u all
It's encouraging that this is still improving performance, but I am not sure that the readability is improved. Do you think a CTE version like this will have the same performance improvement and do you agree it is more readable? Warning: untested query.
with bom_line_products as (
SELECT min(id) AS bom_line_id, bom_id, product_id
FROM mrp_bom_line
GROUP BY bom_id, product_id
),
move_bom_lines as (
SELECT sm.id as move_id, blp.bom_line_id
FROM stock_move sm
JOIN mrp_production mp
ON sm.raw_material_production_id = mp.id
JOIN bom_line_products blp
ON blp.bom_id = mp.bom_id
AND blp.product_id = sm.product_id
)
UPDATE stock_move sm
SET bom_line_id = mbl.bom_line_id
FROM move_bom_lines mbl
WHERE sm.id = mbl.move_id;