gpdb [6X] Using postgres query optimizer instead of ORCA to run resgroup group pipeline tests

Open SmartKeyerror opened this issue 2 years ago • 6 comments

This is the backport of #13974.

The resource group pipeline uses ORCA as its optimizer by default. But resource groups are a resource management feature, so it is unimportant which optimizer the tests run under.

So use the Postgres query optimizer instead of ORCA to run the resource group pipeline tests. After that, we can remove the files resgroup_bypass_optimizer.source and resgroup_bypass_optimizer_1.source.
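
For readers unfamiliar with how a test run can be pointed at the Postgres planner, here is a minimal sketch. It assumes the switch is made through Greenplum's `optimizer` setting passed via the libpq `PGOPTIONS` environment variable. The helper name `with_postgres_planner` is hypothetical and is not taken from this PR, which may configure the pipeline differently.

```python
import os


def with_postgres_planner(env=None):
    """Return a copy of env whose PGOPTIONS asks new sessions to run with optimizer=off.

    Hypothetical helper for illustration only; not part of the gpdb repository.
    """
    # Copy so the caller's environment mapping is never modified.
    env = dict(os.environ if env is None else env)
    existing = env.get("PGOPTIONS", "").strip()
    # Assumption: "optimizer=off" selects the Postgres planner instead of ORCA.
    setting = "-c optimizer=off"
    env["PGOPTIONS"] = f"{existing} {setting}".strip()
    return env


if __name__ == "__main__":
    # Show the PGOPTIONS value a test runner would inherit.
    print(with_postgres_planner({"PGPORT": "5432"})["PGOPTIONS"])
```

A test runner could pass the returned mapping as the `env` argument of `subprocess.run` so that every session it opens inherits the setting.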

Aug 24 '22 05:08 SmartKeyerror

I have run the full pipeline locally, and all resource group tests passed.

Aug 24 '22 05:08 SmartKeyerror

> I have run the full pipeline locally, and all resource group tests passed.

Thus we can push this directly since it only modifies test cases.

Aug 24 '22 05:08 kainwen

I am confused by the rationale. If it is unimportant which optimizer is used, why change it?

Aug 24 '22 16:08 vraghavan78

I have the same question as @vraghavan78. (If the goal is to avoid maintaining multiple answer files, the Planner answer files could be eliminated instead.) Since ORCA is the default optimizer, it seems safer to validate with it. Is the rationale for this change to make running tests locally during development easier and faster, based on community feedback, given that everyone surely always builds with Planner?

Aug 24 '22 17:08 ashwinstar

Hi @ashwinstar @vraghavan78

> If it is unimportant which optimizer is used, why change it?

a. Some cases produce different output (memory usage) under different optimizers.
b. Enabling or disabling cassert also changes some of the memory usage numbers.
c. We do not have a pipeline job that runs RG under both optimizers.
d. As a result, engineers often forget to update the planner's answer file.
e. At least for me, when working on RG, I often compile without ORCA.

Based on all of the above, there are two options:

1. Add a job to the pipeline that runs RG under planner, so engineers will never forget.
2. Run RG under only one specific optimizer.

For master we opened a PR that takes option 2, https://github.com/greenplum-db/gpdb/pull/13974. We waited several days; the team seemed busy with other things and there were no other comments, so we pushed it. This is the backport.

No problem, let's re-discuss here what the best solution is:

add a new job to run rg under planner
or what?

Aug 24 '22 23:08 kainwen

If the optimizer doesn't matter at all for RG testing, and developers (the community) find it easier to develop and test with Planner, then it is fine to have the pipeline run with just that. I was just curious, and it would be good to reflect that rationale in the commit message for future reference.

Aug 24 '22 23:08 ashwinstar
