velox icon indicating copy to clipboard operation
velox copied to clipboard

Refactor greatest and least Presto functions using simple function API

Open Real-Chen-Happy opened this issue 11 months ago β€’ 13 comments

This PR is trying to resolve https://github.com/facebookincubator/velox/issues/3728

Refactor the greatest/least functions using simple function API so that it can be automatically vectorized. The following changes are made: (1) rename GreatestLeast.cpp to GreatestLeast.h since we are using simple functions API (2) GreatestLeast Functions will support the following CPP Type: bool int8_t int16_t int32_t int64_t int128_t float double Varchar LongDecimal ShortDecimal Date Timestamp, which is the same as the previous GreatestLeast Functions before refactoring (3) Now, GreatestLeast Function will also support NaN comparisons for DOUBLE and REAL data types. NaN is considered as the biggest according to https://github.com/prestodb/presto/issues/22391

As https://github.com/facebookincubator/velox/issues/3728#issuecomment-2027075925 points out we don't need benchmark for this change.

Thank you for helping review!

Real-Chen-Happy avatar Mar 29 '24 20:03 Real-Chen-Happy

Hi @Real-Chen-Happy!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

facebook-github-bot avatar Mar 29 '24 20:03 facebook-github-bot

Deploy Preview for meta-velox canceled.

Name Link
Latest commit 7f769579eadb39b57d262564661f18caa33cf77a
Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/6610402589bcf10008dc4320

netlify[bot] avatar Mar 29 '24 20:03 netlify[bot]

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

facebook-github-bot avatar Mar 29 '24 22:03 facebook-github-bot

Aware of the decimal compatibility errors. Fix in progress

Real-Chen-Happy avatar Mar 30 '24 00:03 Real-Chen-Happy

The error happens in one of the circleci check The previous signature is least(DECIMAL(precision,scale)...) -> DECIMAL(precision,scale) and greatest(DECIMAL(precision,scale)...) -> DECIMAL(precision,scale) But current signature changed to least(decimal(i1,i5)...) -> decimal(i1,i5) and greatest(decimal(i1,i5)...) -> decimal(i1,i5) I just investigated the errors, and I am not sure what's going wrong. I also check the previous PR https://github.com/facebookincubator/velox/pull/9096, and it seems like the error also happens there. @mbasmanova May I know if the error can be ignored? If not, do you have any recommended way for fixing it? Thanks

Real-Chen-Happy avatar Mar 30 '24 07:03 Real-Chen-Happy

@Real-Chen-Happy The error you are seeing is a limitation of the CI check: https://github.com/facebookincubator/velox/issues/9240

CC @kgpai

mbasmanova avatar Apr 01 '24 11:04 mbasmanova

@Real-Chen-Happy Thank you for the refactoring. Looks great overall. Some comments.

In Presto, greatest and least functions allow array and struct inputs as well. It would be nice to add support for these in a follow-up.

Sure! For clarification, do you mean adding these support in this PR or in a separate PR?

Real-Chen-Happy avatar Apr 01 '24 17:04 Real-Chen-Happy

do you mean adding these support in this PR or in a separate PR?

Separate PR would be better, I think.

mbasmanova avatar Apr 01 '24 17:04 mbasmanova

@mbasmanova Could you help review the updated code again? For this version, I make the following changes

  1. Resolve nit issues mentioning in the code reviews (removing unused includes, rewriting the comments, etc)
  2. Use call for the function instead of callNullFree
  3. Assert the size of input is greater than 0 by changing registration code
  4. Add support NaN comparisons for DOUBLE and REAL data types
  5. Add unit test for NaN comparisons

Thank you!

edit: It seems like we need some additional clarifications on NaN https://github.com/prestodb/presto/issues/22391 before proceeding.

Real-Chen-Happy avatar Apr 02 '24 08:04 Real-Chen-Happy

@mbasmanova Could you help review the updated code again? For the third version, I make the following changes:

  1. Fix the issues mentioned in the code review
  2. Remove the VELOX_USER_CHECK

Just confirmed from https://github.com/prestodb/presto/issues/22391, NaN in DOUBLE and REAL will be considered as the biggest element in the greatest/least function.

Thank you!

Real-Chen-Happy avatar Apr 02 '24 21:04 Real-Chen-Happy

Circle Check seems to throw another error. The signature is changed from greatest(type...) -> type to greatest(type type...) -> type presto_pr_signatures.json

I think it's because the registration code is modified from registerFunction<ParameterBinder<GreatestFunction, T>, T, Variadic<T>>(...) to registerFunction<ParameterBinder<GreatestFunction, T>, T, T, Variadic<T>>(...)

May I know if the compatibility error can be ignored? I am not sure whether the new signature will impact on the existing users. If it needs to be fixed, I think we can

  1. use original registration code
  2. use FOLLY_ALWAYS_INLINE bool call(...), and return false for any 0 size Variadic

Real-Chen-Happy avatar Apr 03 '24 04:04 Real-Chen-Happy

Circle Check seems to throw another error. The signature is changed from greatest(type...) -> type to greatest(type type...) -> type presto_pr_signatures.json

I think it's because the registration code is modified from registerFunction<ParameterBinder<GreatestFunction, T>, T, Variadic<T>>(...) to registerFunction<ParameterBinder<GreatestFunction, T>, T, T, Variadic<T>>(...)

May I know if the compatibility error can be ignored? I am not sure whether the new signature will impact on the existing users. If it needs to be fixed, I think we can

  1. use original registration code
  2. use FOLLY_ALWAYS_INLINE bool call(...), and return false for any 0 size Variadic

@mbasmanova Thanks for reviewing. May I know if the compatibility error might be a concern this time? Or it’s fine?

Real-Chen-Happy avatar Apr 03 '24 15:04 Real-Chen-Happy

May I know if the compatibility error might be a concern this time? Or it’s fine?

I think this is fine. See my earlier reply:

https://github.com/facebookincubator/velox/pull/9308#issuecomment-2029619013

mbasmanova avatar Apr 03 '24 15:04 mbasmanova

@mbasmanova Could you help review the updated code again? For the new version, I make the following changes:

  1. Refactor the tests

Thank you!

Real-Chen-Happy avatar Apr 05 '24 05:04 Real-Chen-Happy

@mbasmanova has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot avatar Apr 05 '24 10:04 facebook-github-bot

@mbasmanova Just updated the code to fix the nit issues. Thank you for helping review!

Also, may I know how to get invited to Velox Slack channel? I sent my request a week ago to [email protected], but still not hear back anything yet. Let me know if there is anything I need to do prior to joining. Thanks!

Real-Chen-Happy avatar Apr 05 '24 17:04 Real-Chen-Happy

@mbasmanova has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot avatar Apr 05 '24 17:04 facebook-github-bot

@mbasmanova has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Thank you for the reminder @mbasmanova . @Real-Chen-Happy I just sent you an invite. Can you check if you can access slack now?

sanumandla avatar Apr 05 '24 17:04 sanumandla

@mbasmanova has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Thank you for the reminder @mbasmanova . @Real-Chen-Happy I just sent you an invite. Can you check if you can access slack now?

Yes I have the access now! Thank you so much! @sanumandla

Real-Chen-Happy avatar Apr 05 '24 17:04 Real-Chen-Happy

@mbasmanova Just updated the code to rename the registration function. Thank you again for helping review!

Real-Chen-Happy avatar Apr 05 '24 18:04 Real-Chen-Happy

@mbasmanova has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot avatar Apr 05 '24 18:04 facebook-github-bot

@mbasmanova merged this pull request in facebookincubator/velox@e29cde7b220ac507f7c55e3101f47836a67c51f1.

facebook-github-bot avatar Apr 08 '24 15:04 facebook-github-bot

Conbench analyzed the 1 benchmark run on commit e29cde7b.

There were no benchmark performance regressions. πŸŽ‰

The full Conbench report has more details.

conbench-facebook[bot] avatar Apr 08 '24 16:04 conbench-facebook[bot]

@mbasmanova Thank you again for helping review and fix errors. I am very excited to submit my first commit to Velox!

Real-Chen-Happy avatar Apr 08 '24 17:04 Real-Chen-Happy

@Real-Chen-Happy Thank you for the contribution.

mbasmanova avatar Apr 08 '24 17:04 mbasmanova

This pull request has been reverted by fd5643aad261d809021b701555b0bfd5dd2870d0.

facebook-github-bot avatar Apr 16 '24 19:04 facebook-github-bot