camunda-bpm-platform
camunda-bpm-platform copied to clipboard
Additional Indexes for external task and activity instances
Acceptance Criteria (Required on creation)
Hints
Background: We observed some slowness during our performance testing of camunda in our ecosystem and further analysing one of the area was queries running slower upon calling camunda APIs, to improve this query performance a little bit we added additional indexes which I want to contribute back to help others.
Below are the index I wanted to introduce.
ACT_RU_EXT_TASK(PROC_INST_ID_)
ACT_HI_JOB_LOG(DEPLOYMENT_ID_)
ACT_HI_ACTINST(PROC_DEF_ID_)
Attached details around some improvements we achieved with additional indexes. index_verification.docx
Links
Breakdown
### Pull Requests
- [ ] https://github.com/camunda/camunda-bpm-platform/pull/4181
Dev2QA handover
- [ ] Does this ticket need a QA test and the testing goals are not clear from the description? Add a Dev2QA handover comment
Hi @sumankumarpani,
Thank you for raising this issue and for proposing those indexes.
In general, adding indexes to the product is nothing we can do thoughtless since they can also have a negative performance impact on everyone using PostgreSQL + Camunda 7. While indexes might improve the performance of certain queries, other SQL statements like INSERT, DELETE, and UPDATE might perform worse since, in addition, the respective indexes need to be maintained.
Before adding new indexes to the product, we need to be assured that these indexes positively impact queries that most of our users benefit from (e.g., because the performance of a core feature is improved).
To understand better if adding these indexes makes sense, could you please answer the following questions:
- What are the queries that perform better after applying the indexes?
- Can you share query execution plans before and after applying these indexes that prove the performance is improved?
Thanks a lot for your attention and participation. 🙂 👍
Best, Tassilo
Hi @sumankumarpani,
Any updates here?
Best, Tassilo
Closed due to inactivity.
Hello @tasso94 , sorry for the late reply , I did some analysis and found below index is making more sense to add. While the cost saving doesn't look to be too promising, once we add this , our asynchronous delete of process instances (POST /process-instance/delete) is way more faster. Let me know what are your thoughts.
ACT_HI_ACTINST(PROC_DEF_ID_)
explain select distinct RES.* from ACT_HI_ACTINST RES leftjoin ( select A.* from ACT_RU_AUTHORIZATION A where A.TYPE_< 9 and( A.USER_ID_in('xyz', 'abc','def') or A.GROUP_ID_in (select distinct A.GROUP_ID_ from ACT_RU_AUTHORIZATION A) ) and( ( A.RESOURCE_TYPE_= 14and A.PERMS_& 1=1) ) ) AUTH ON( AUTH.RESOURCE_ID_in( RES.PROC_DEF_KEY_, 'Process_1da4zcw') ) where RES.PROC_DEF_ID_= 'bffda88b-93ce-11eb-b38c-029c8faa5e3f'and RES.START_TIME_>= '2021-06-25 17:57:07.145' and RES.END_TIME_<= '2025-06-25 17:57:07.145' and AUTH.RESOURCE_ID_isnull and ( RES.TENANT_ID_isnull) orderby RES.END_TIME_desc limit 100
QUERY PLAN (WITHOUT INDEX) | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ Limit (cost=20840285.07..20840290.32 rows=100 width=751) | -> Unique (cost=20840285.07..20982301.04 rows=2705066 width=751) | -> Sort (cost=20840285.07..20847047.74 rows=2705066 width=751) | Sort Key: res.end_time_ DESC, res.id_, res.parent_act_inst_id_, res.proc_def_key_, res.root_proc_inst_id_, res.proc_inst_id_, res.execution_id_, res.act_id_, res.task_id_, res.call_proc_inst_id_, res.call_case_inst_id_, res.act_name_, res.ac| -> Nested Loop Anti Join (cost=7832627.99..17860745.20 rows=2705066 width=751) | Join Filter: (((a.resource_id_)::text = (res.proc_def_key_)::text) OR ((a.resource_id_)::text = 'Process_1da4zcw'::text)) | -> Gather (cost=7831572.54..17811688.97 rows=2705920 width=751) | Workers Planned: 2 | -> Parallel Bitmap Heap Scan on act_hi_actinst res (cost=7830572.54..17540096.97 rows=1127467 width=751) | Recheck Cond: (((proc_def_id_)::text = 'bffda88b-93ce-11eb-b38c-029c8faa5e3f'::text) AND (end_time_ <= '2025-06-25 17:57:07.145'::timestamp without time zone) AND (start_time_ >= '2021-06-25 17:57:07.145'::timestamp without| Filter: (tenant_id_ IS NULL) | -> BitmapAnd (cost=7830572.54..7830572.54 rows=2705920 width=0) | -> Bitmap Index Scan on act_idx_hi_ai_pdefid_end_time (cost=0.00..2747820.49 rows=8219980 width=0) | Index Cond: (((proc_def_id_)::text = 'bffda88b-93ce-11eb-b38c-029c8faa5e3f'::text) AND (end_time_ <= '2025-06-25 17:57:07.145'::timestamp without time zone)) | -> Bitmap Index Scan on act_idx_hi_act_inst_start (cost=0.00..5081398.84 rows=55111769 width=0) | Index Cond: (start_time_ >= '2021-06-25 17:57:07.145'::timestamp without time zone) | -> Materialize (cost=1055.45..1702.64 rows=1 width=33) | -> Seq Scan on act_ru_authorization a (cost=1055.45..1702.64 rows=1 width=33) | Filter: ((type_ < 9) AND (resource_type_ = 14) AND ((perms_ & 1) = 1) AND (((user_id_)::text = ANY ('{xyz,abc,def}'::text[])) OR (hash| SubPlan 1 | -> Unique (cost=0.28..1055.18 rows=107 width=33) | -> Index Only Scan using act_idx_auth_group_id on act_ru_authorization a_1 (cost=0.28..1039.34 rows=6337 width=33) |
ADDED INDEX QUERY PLAN | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ Limit (cost=18447748.60..18447753.85 rows=100 width=751) | -> Unique (cost=18447748.60..18589764.57 rows=2705066 width=751) | -> Sort (cost=18447748.60..18454511.27 rows=2705066 width=751) | Sort Key: res.end_time_ DESC, res.id_, res.parent_act_inst_id_, res.proc_def_key_, res.root_proc_inst_id_, res.proc_inst_id_, res.execution_id_, res.act_id_, res.task_id_, res.call_proc_inst_id_, res.call_case_inst_id_, res.act_name_, res.ac| -> Nested Loop Anti Join (cost=5420115.02..15468208.73 rows=2705066 width=751) | Join Filter: (((a.resource_id_)::text = (res.proc_def_key_)::text) OR ((a.resource_id_)::text = 'Process_1da4zcw'::text)) | -> Gather (cost=5419059.57..15419152.49 rows=2705920 width=751) | Workers Planned: 2 | -> Parallel Bitmap Heap Scan on act_hi_actinst res (cost=5418059.57..15147560.49 rows=1127467 width=751) | Recheck Cond: (((proc_def_id_)::text = 'bffda88b-93ce-11eb-b38c-029c8faa5e3f'::text) AND (start_time_ >= '2021-06-25 17:57:07.145'::timestamp without time zone)) | Filter: ((tenant_id_ IS NULL) AND (end_time_ <= '2025-06-25 17:57:07.145'::timestamp without time zone)) | -> BitmapAnd (cost=5418059.57..5418059.57 rows=2713336 width=0) | -> Bitmap Index Scan on act_idx_hi_act_inst_proc_def_id (cost=0.00..335307.52 rows=8242510 width=0) | Index Cond: ((proc_def_id_)::text = 'bffda88b-93ce-11eb-b38c-029c8faa5e3f'::text) | -> Bitmap Index Scan on act_idx_hi_act_inst_start (cost=0.00..5081398.84 rows=55111769 width=0) | Index Cond: (start_time_ >= '2021-06-25 17:57:07.145'::timestamp without time zone) | -> Materialize (cost=1055.45..1702.64 rows=1 width=33) | -> Seq Scan on act_ru_authorization a (cost=1055.45..1702.64 rows=1 width=33) | Filter: ((type_ < 9) AND (resource_type_ = 14) AND ((perms_ & 1) = 1) AND (((user_id_)::text = ANY ('{xyz,abc,def}'::text[])) OR (hash| SubPlan 1 | -> Unique (cost=0.28..1055.18 rows=107 width=33) | -> Index Only Scan using act_idx_auth_group_id on act_ru_authorization a_1 (cost=0.28..1039.34 rows=6337 width=33) | |