gpdb
[6X] Use postgres query optimizer instead of ORCA to run resource group pipeline tests
This is the backport of #13974.
The resource group pipeline uses ORCA as the optimizer by default. But resource groups are a resource management tool, so which optimizer we use is unimportant to these tests.
So use the postgres query optimizer instead of ORCA to run the resource group pipeline tests. After that, we can remove the files resgroup_bypass_optimizer.source and resgroup_bypass_optimizer_1.source.
I have run the full pipeline locally, and all resource group tests passed.
Thus we can push this directly since it only modifies test cases.
I am confused by the rationale. If it is unimportant which optimizer is used, why change it?
I have the same question as @vraghavan78. (If the goal is to avoid maintaining multiple answer files, the Planner answer files could be eliminated instead.) Since ORCA is the default optimizer, it seems safer to validate with the default optimizer. Is the rationale for this change to make running tests locally during development easier and faster, based on community feedback (as everyone surely always builds with Planner)?
Hi @ashwinstar @vraghavan78
> If it is unimportant which optimizer is used, why change it?
a. Some cases have different output (memory usage) under different optimizers.
b. Enabling or disabling cassert also leads to some different memory usage numbers.
c. We do not have a pipeline job that runs RG under both optimizers.
d. Engineers will therefore often forget to update the planner's answer file.
e. At least for me, when coding RG, I often compile without ORCA.
Based on all of the above, there are two options:
1. Add a job to the pipeline that runs RG under planner, so engineers will never forget.
2. Run RG under only one specific optimizer.
For master, we opened a PR to do option 2: https://github.com/greenplum-db/gpdb/pull/13974. We waited several days, but the team seemed busy with other things and there were no other comments, so we pushed it. This PR is the backport.
No problem, let's re-discuss here what the best solution is:
- add a new job to run rg under planner
- or what?
If the optimizer doesn't matter at all for RG testing, and developers (community) find it easy to develop and test with Planner, then it is fine to have the pipeline run with just that. I was just curious, and it would be good to reflect that rationale in the commit message for future reference.