velox icon indicating copy to clipboard operation
velox copied to clipboard

Compute decimal average by storing overflow

Open karteekmurthys opened this issue 2 years ago • 1 comments

This technique is similar to JAVA approach here:https://github.com/prestodb/presto/blob/master/presto-main/src/main/java/com/facebook/presto/operator/aggregation/DecimalAverageAggregation.java#L222.

The accumulator is of the form:

struct DecimalAccumulator {
   int128_t sum;
   int64_t count;
   int64_t overflow;
};

We track overflow and use the overflow to compute the final average:

Average = overflow ( overflow_multiplier /  count    +      sum / (count * overflow))

Here we try to minimize integer multiplication that is likely to cause int128 bit overflows.

The division operations used above return integer quotient and an integer remainder. The remainder part is further divided by count to get a double value. So, the accuracy of the final result depends the (double)remainder/count value.

karteekmurthys avatar Oct 19 '22 04:10 karteekmurthys

Deploy Preview for meta-velox canceled.

Name Link
Latest commit ef3fc03bc04b4b266c8567a23f38bd8e371d24b0
Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/635ffd53b17d6100088f884a

netlify[bot] avatar Oct 19 '22 04:10 netlify[bot]

@Yuhta Would you please review the PR. Now we are covering partial and intermediate aggregations.

karteekmurthys avatar Oct 26 '22 17:10 karteekmurthys

@Yuhta has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot avatar Oct 31 '22 17:10 facebook-github-bot

@Yuhta I see that build tests and linter workflows have failed. Is there anything I can help with?

karteekmurthys avatar Oct 31 '22 21:10 karteekmurthys

@karteekmurthys The errors I saw are trivial ones so far, no worry for now

Yuhta avatar Oct 31 '22 21:10 Yuhta