gpdb icon indicating copy to clipboard operation
gpdb copied to clipboard

Allow certain functions to be safely executed on replicated slices in Orca

Open chrishajas opened this issue 2 years ago • 0 comments

Previously, Orca disallowed all aggregate functions from being executed on replicated slices. This meant that the results were broadcasted or gathered on a single segment to ensure consistency and correct results. This is necessary because some functions such as array_agg and custom user-created functions are sensitive to the order of data. This can cause wrong results in some cases.

However, many functions, especially commonly used ones such as sum, avg, count, min, and max, are not sensitive to the order of data and can be safely executed. We now make an exception for these common cases, currently the above agg functions on ints.

See https://github.com/greenplum-db/gpdb/pull/10978 for previous discussion.

Some additional questions:

  1. We can put the logic to determine whether an agg is "safe" in the translator if we want. Should it be there instead? We would still need the threading through Logical/PhysicalGb.

  2. Are there additional types/functions we want to allow at this time? Currently I only have ints and count(any), but we can also allow floats/others if they may commonly be used. There is a case where we could conceivably get different results for floats, which is why I didn't initially include it here. See https://github.com/greenplum-db/gpdb/pull/10505#issuecomment-668183894 for details. Basically, the order that we add floats can lead to different results.

chrishajas avatar Jul 28 '22 22:07 chrishajas

@dgkimura Catalog is a good place to store the info. 7X is fine. How do we deal with 6X given that it requires binary compatible?

gpopt avatar Aug 23 '22 23:08 gpopt

On Tuesday, August 23, 2022 4:08 PM gpopt wrote:

@dgkimura Catalog is a good place to store the info. 7X is fine. How do we deal with 6X given that it requires binary compatible?

I think you're right that it can't be done in 6X (current approach is probably best we can do). But maybe it's still worth trying for 7X.

dgkimura avatar Aug 23 '22 23:08 dgkimura

On Tuesday, August 23, 2022 4:08 PM gpopt wrote:

@dgkimura Catalog is a good place to store the info. 7X is fine. How do we deal with 6X given that it requires binary compatible?

I think you're right that it can't be done in 6X (current approach is probably best we can do). But maybe it's still worth trying for 7X.

I agree that this is something we should do for 7X and can't for 6X as it would require a catalog change. Additionally, since users can create custom aggs, we would also need to add some syntax for them to indicate whether it can be safely executed on a replicated slice (if they wanted that performance benefit).

This would be more effort, so it may still be good to get this PR into 7X and 6X and then prioritize getting it into the catalog on 7X as needed.

chrishajas avatar Sep 07 '22 17:09 chrishajas

This would be more effort, so it may still be good to get this PR into 7X and 6X and then prioritize getting it into the catalog on 7X as needed.

Agree.

gpopt avatar Sep 07 '22 17:09 gpopt