chainloop
chainloop copied to clipboard
Add indexes to workflows and workflow runs
There are some queries that can be improved in terms of performance by adding some indexes, for example the retrieval of workflow runs based on a workflow, currently, the SQL plan looks like to retrieve last 3 workflows runs of a workflow:
SELECT "workflow_runs"."id",
"workflow_runs"."created_at",
"workflow_runs"."finished_at",
"workflow_runs"."state",
"workflow_runs"."reason",
"workflow_runs"."run_url",
"workflow_runs"."runner_type",
"workflow_runs"."attestation",
"workflow_runs"."attestation_digest",
"workflow_runs"."attestation_state",
"workflow_runs"."contract_revision_used",
"workflow_runs"."contract_revision_latest"
FROM "workflow_runs"
WHERE EXISTS (SELECT "workflows"."id"
FROM "workflows"
WHERE "workflow_runs"."workflow_workflowruns" = "workflows"."id"
AND "workflows"."id" = 'f09a517b-6975-489b-956e-14446b30b0d5')
ORDER BY "workflow_runs"."created_at" DESC
LIMIT 3
Limit (cost=1970.54..1970.55 rows=3 width=229)
-> Sort (cost=1970.54..1970.55 rows=3 width=229)
Sort Key: workflow_runs.created_at DESC
-> Nested Loop (cost=0.00..1970.52 rows=3 width=229)
-> Seq Scan on workflows (cost=0.00..6.28 rows=1 width=16)
Filter: (id = 'f09a517b-6975-489b-956e-14446b30b0d5'::uuid)
-> Seq Scan on workflow_runs (cost=0.00..1964.21 rows=3 width=245)
Filter: (workflow_workflowruns = 'f09a517b-6975-489b-956e-14446b30b0d5'::uuid)
Just by adding an index on the workflow_workflowruns, improves:
Limit (cost=18.43..18.44 rows=3 width=229)
-> Sort (cost=18.43..18.44 rows=3 width=229)
Sort Key: workflow_runs.created_at DESC
-> Nested Loop (cost=0.29..18.40 rows=3 width=229)
-> Seq Scan on workflows (cost=0.00..6.28 rows=1 width=16)
Filter: (id = 'f09a517b-6975-489b-956e-14446b30b0d5'::uuid)
-> Index Scan using workflowrun_workflow on workflow_runs (cost=0.29..12.10 rows=3 width=245)
Index Cond: (workflow_workflowruns = 'f09a517b-6975-489b-956e-14446b30b0d5'::uuid)
Great find @javirln, could you point out to the code that's making these calls? I'd also like to review that those queries are properly formed. wdyt?
btw, since foreign keys do not mean indexes, we should probably look into some other potential indexes.
Anything related to organizations should probably be indexed, especially the tables that have a lot of entries, like casbackend_mappings
make -C app/controlplane visualize-data-model
should we spend some time evaluating the relationships we have here and creating indexes accordingly?
Totally, I would go through the queries and try to identify low hanging fruits to improve